
R programming

Folken Fanel Registered User regular
edited May 2011 in Help / Advice Forum
Wikipedia wrote:
R is a programming language and software environment for statistical computing and graphics. The R language has become a de facto standard among statisticians for developing statistical software, and is widely used for statistical software development and data analysis.

Now that that's out of the way.... here's my problem.

I have a data frame. Let's call it "object". When I type str(object) I get:
> str(object)
'data.frame':	251 obs. of  18 variables:
 $ X                             : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Utility                       : Factor w/ 251 levels "ALLIED UTILITIES INC   WOODARD MANOR SUBDIVISION",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Mean                          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Median                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ St.Dev                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ PopulationWhite               : int  89 95 95 91 91 77 86 80 95 98 ...
 $ Population.Black              : int  7 1 1 4 4 12 7 9 1 1 ...
 $ Population.Hispanic           : int  5 5 5 5 7 20 17 15 5 3 ...
 $ Population.65.and.Older       : int  15 34 34 23 38 22 33 15 34 13 ...
 $ Head.of.Household...25        : int  4 3 3 3 2 4 3 6 3 1 ...
 $ Head.of.Household.25.to.64    : int  72 49 49 65 46 59 48 69 49 80 ...
 $ Head.of.Household.65.and.Older: int  24 48 48 32 52 37 49 25 48 19 ...
 $ Homeowner.Occupied            : int  82 79 79 78 83 77 83 66 79 94 ...
 $ Population.Density....sqmi.   : int  446 2244 2670 324 251 63 50 820 2670 103 ...
 $ Region                        : Factor w/ 3 levels "C","N","S": 1 1 1 1 3 3 3 1 1 3 ...
 $ Have.Data                     : Factor w/ 2 levels "N","Y": 1 1 1 1 1 1 1 1 1 1 ...
 $ Use.Level                     : Factor w/ 4 levels "“H”","“L”","“M”",..: 4 4 4 4 4 4 4 4 4 4 ...
 $ code                          : Factor w/ 6 levels "1","2","3","4",..: 4 4 4 4 6 6 6 4 4 6 ...

So far everything looks good. I want to isolate specific rows of my data frame according to the levels of the Region, Have.Data and Use.Level variables. For some reason I can do this with, say, two of those variables, but not the third. For example, I can do something like this:
> Region[58]
[1] C
Levels: C N S
> (Region[58]=="C")
[1] TRUE
> (Region[58]=="C")&(Have.Data[58]=="Y")
[1] TRUE

So clearly at observation 58, I can see that the Region is labelled C, and it returns TRUE if I ask if Region[58] is C. Similarly I can do that when I ask if both are at certain levels simultaneously.

For some reason this falls apart when I try to do it for the Use.Level variable.
> Use.Level[58]
[1] “M”
Levels: “H” “L” “M” “N”
> (Use.Level[58]=="M")
[1] FALSE

I'm not sure why any of this happens. Is there anyone here with experience using R by any chance?


Posts

  • Baron Dirigible Registered User regular
    edited May 2011
    Just from looking at it, it seems you're omitting the quotes?
    > Use.Level[58]
    [1] “M”
    Levels: “H” “L” “M” “N”
    > (Use.Level[58]=="M")

  • Tzyr Registered User regular
    edited May 2011
    From the output of the object:
    $ Region                        : Factor w/ 3 levels "C","N","S": 1 1 1 1 3 3 3 1 1 3 ...
    $ Use.Level                    : Factor w/ 4 levels "“H”","“L”","“M”",..: 4 4 4 4 4 4 4 4 4 4 ...
    

    Could it be that the values for Region are simply C, N and S, while the values of Use.Level are “H”, “L”, “M”? (Note the added quotation marks.)
  • Daenris Registered User regular
    edited May 2011
    Yeah, for whatever reason, the level labels in Use.Level include quotation marks as part of the level name, while those for your other factors don't. Either compare against the quoted label, as in (Use.Level[58]=="“M”"), or figure out why the quotes are there in the first place and remove them.
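
Both options can be sketched in a few lines of R. This is only an illustration on a toy factor (the name use_level and its four values are made up here, standing in for the Use.Level column), and it assumes the stray characters are the curly quotes U+201C and U+201D shown in the str() output:

```r
# Toy factor standing in for object$Use.Level (hypothetical data).
# "\u201C" and "\u201D" are the curly opening and closing double quotes.
use_level <- factor(c("\u201CN\u201D", "\u201CM\u201D",
                      "\u201CH\u201D", "\u201CL\u201D"))

# The plain label does not match, because the quotes are part of the level.
use_level[2] == "M"               # FALSE

# Option 1: compare against the label including its curly quotes.
use_level[2] == "\u201CM\u201D"   # TRUE

# Option 2: strip the curly quotes from the level names once.
levels(use_level) <- gsub("[\u201C\u201D]", "", levels(use_level))
use_level[2] == "M"               # TRUE
```

Renaming the levels leaves the underlying integer codes alone, so the rows keep their values and only the labels change.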

  • Folken Fanel Registered User regular
    edited May 2011
    Hm... that's really strange. Looking at the original .csv it looks like this:
    "Region"	"Have.Data"	"Use.Level"
    "C"	"N"	“N”
    "C"	"N"	“N”
    "C"	"N"	“N”
    "C"	"N"	“N”
    "S"	"N"	“N”
    "S"	"N"	“N”
    "S"	"N"	“N”
    "C"	"N"	“N”
    "C"	"N"	“N”
    

    Not sure why it's behaving differently. I guess I'll try playing around with it.

  • Folken Fanel Registered User regular
    edited May 2011
    I think I figured it out. I just opened up the .csv and deleted all the quotes and that seemed to fix everything. So weird.

    Thanks H/A!
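
For anyone who would rather not edit the file by hand, the same cleanup could probably be done while reading it in. A rough sketch, assuming a tab-separated file like the excerpt above; the three lines in raw are invented stand-ins for the real file, which would normally come from readLines() on the actual path:

```r
# Hypothetical stand-in for the lines of the original file (tab-separated).
raw <- c('"Region"\t"Have.Data"\t"Use.Level"',
         '"C"\t"N"\t\u201CN\u201D',
         '"S"\t"Y"\t\u201CM\u201D')

# Drop the curly quotes before parsing; the straight quotes are left in
# place because read.delim already treats them as quote delimiters.
clean <- gsub("[\u201C\u201D]", "", raw)

# Parse the cleaned text as a tab-separated table with a header row.
object <- read.delim(text = clean, stringsAsFactors = TRUE)

levels(object$Use.Level)      # "M" "N"
object$Use.Level[2] == "M"    # TRUE
```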

  • zilo Registered User
    edited May 2011
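    Yup, the quote marks around the first two columns are different characters from the ones around the third column. The first two use the straight ASCII double quote ("), while the third uses curly "smart" quotes (“ and ”), which aren't ASCII at all and aren't treated as quote delimiters when the file is read in, so they end up inside the values.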
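
One way to confirm this from inside R is to look at the code points directly. A small sketch; has_curly is a hypothetical helper written for this example, not a built-in:

```r
# Code points of the two kinds of quote mark.
utf8ToInt("\"")       # 34, the straight ASCII double quote
utf8ToInt("\u201C")   # 8220, the curly opening quote
utf8ToInt("\u201D")   # 8221, the curly closing quote

# Hypothetical helper: flag values that still contain a curly quote.
has_curly <- function(x) grepl("[\u201C\u201D]", x)

has_curly(c("C", "\u201CM\u201D"))   # FALSE TRUE
```

Running a check like has_curly(levels(object$Use.Level)) on a factor should show whether any labels picked up smart quotes, which often happens when a file passes through a word processor or spreadsheet.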
