R is a programming language and software environment for statistical computing and graphics. The R language has become a de facto standard among statisticians for developing statistical software and is widely used for data analysis.
Now that that's out of the way.... here's my problem.
I have a data frame. Let's call the data frame "object." When I type str(object) I get
> str(object)
'data.frame': 251 obs. of 18 variables:
$ X : int 1 2 3 4 5 6 7 8 9 10 ...
$ Utility : Factor w/ 251 levels "ALLIED UTILITIES INC WOODARD MANOR SUBDIVISION",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Mean : num 0 0 0 0 0 0 0 0 0 0 ...
$ Median : num 0 0 0 0 0 0 0 0 0 0 ...
$ St.Dev : num 0 0 0 0 0 0 0 0 0 0 ...
$ PopulationWhite : int 89 95 95 91 91 77 86 80 95 98 ...
$ Population.Black : int 7 1 1 4 4 12 7 9 1 1 ...
$ Population.Hispanic : int 5 5 5 5 7 20 17 15 5 3 ...
$ Population.65.and.Older : int 15 34 34 23 38 22 33 15 34 13 ...
$ Head.of.Household...25 : int 4 3 3 3 2 4 3 6 3 1 ...
$ Head.of.Household.25.to.64 : int 72 49 49 65 46 59 48 69 49 80 ...
$ Head.of.Household.65.and.Older: int 24 48 48 32 52 37 49 25 48 19 ...
$ Homeowner.Occupied : int 82 79 79 78 83 77 83 66 79 94 ...
$ Population.Density....sqmi. : int 446 2244 2670 324 251 63 50 820 2670 103 ...
$ Region : Factor w/ 3 levels "C","N","S": 1 1 1 1 3 3 3 1 1 3 ...
$ Have.Data : Factor w/ 2 levels "N","Y": 1 1 1 1 1 1 1 1 1 1 ...
$ Use.Level : Factor w/ 4 levels "“H”","“L”","“M”",..: 4 4 4 4 4 4 4 4 4 4 ...
$ code : Factor w/ 6 levels "1","2","3","4",..: 4 4 4 4 6 6 6 4 4 6 ...
So far everything looks good. I want to isolate specific rows of my data frame according to the levels of the Region, Have.Data and Use.Level variables. For some reason I can do this with, say, two of those variables, but not the third. For example, I can do something like this:
> Region[58]
[1] C
Levels: C N S
> (Region[58]=="C")
[1] TRUE
> (Region[58]=="C")&(Have.Data[58]=="Y")
[1] TRUE
So clearly at observation 58, I can see that the Region is labelled C, and it returns TRUE if I ask if Region[58] is C. Similarly I can do that when I ask if both are at certain levels simultaneously.
For some reason this falls apart when I try to do it for the Use.Level variable.
> Use.Level[58]
[1] “M”
Levels: “H” “L” “M” “N”
> (Use.Level[58]=="M")
[1] FALSE
I'm not sure why any of this happens. Is there anyone here with experience using R by any chance?
Posts
Could it be that the values for Region are simply C, N and S, while the values of Use.Level are "H", "L", "M"? (Note the added quotation marks, which show up in the str() output and in the Levels line as part of the labels themselves.)
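If that's what is going on, a quick check along these lines might confirm it. This is only a sketch: use_level is a made-up stand-in for the Use.Level column, and I'm assuming the quotes are the curly characters U+201C and U+201D, since that's how they look in the pasted output.

```r
# Hypothetical stand-in for the Use.Level column: the labels themselves
# contain curly quotation marks (U+201C and U+201D).
use_level <- factor(c("\u201cM\u201d", "\u201cN\u201d", "\u201cH\u201d", "\u201cL\u201d"))

# Each label is longer than one character because the quotes are part of it.
nchar(levels(use_level))

# Comparing against the bare letter does not match...
bare_match <- use_level[1] == "M"

# ...while comparing against the full label, quotes included, does.
full_match <- use_level[1] == "\u201cM\u201d"

print(bare_match)
print(full_match)
```

On the real data, running nchar(levels(Use.Level)) and seeing values greater than 1 would point the same way.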
Not sure why it's behaving differently. I guess I'll try playing around with it.
Thanks H/A!
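For anyone who lands here with the same problem: assuming the Use.Level labels really do contain curly quotation marks, one possible fix is to strip them and rebuild the factor, after which the three-way filter should work like the two-way one. The small data frame below is invented for illustration and only mimics the three factor columns from the str() output.

```r
# Hypothetical miniature of the data frame, with the same three factor columns.
object <- data.frame(
  Region    = factor(c("C", "C", "S")),
  Have.Data = factor(c("Y", "N", "Y")),
  Use.Level = factor(c("\u201cM\u201d", "\u201cN\u201d", "\u201cM\u201d"))
)

# Remove the opening and closing curly quotes from the labels,
# then rebuild the factor from the cleaned strings.
cleaned <- as.character(object$Use.Level)
cleaned <- gsub("\u201c", "", cleaned, fixed = TRUE)
cleaned <- gsub("\u201d", "", cleaned, fixed = TRUE)
object$Use.Level <- factor(cleaned)

# Filter rows on all three variables at once.
subset_rows <- object[object$Region == "C" &
                      object$Have.Data == "Y" &
                      object$Use.Level == "M", ]
print(subset_rows)
```

Referring to the columns as object$Region and so on also avoids relying on attach(), which the original console session appears to use.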