
R programming

Folken Fanel anime af When's KoF Registered User regular
edited May 2011 in Help / Advice Forum
Wikipedia wrote:
R is a programming language and software environment for statistical computing and graphics. The R language has become a de facto standard among statisticians for developing statistical software, and is widely used for statistical software development and data analysis.

Now that that's out of the way.... here's my problem.

I have a data frame. Let's call the data frame "object." When I type str(object) I get:
> str(object)
'data.frame':	251 obs. of  18 variables:
 $ X                             : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Utility                       : Factor w/ 251 levels "ALLIED UTILITIES INC   WOODARD MANOR SUBDIVISION",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Mean                          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Median                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ St.Dev                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ PopulationWhite               : int  89 95 95 91 91 77 86 80 95 98 ...
 $ Population.Black              : int  7 1 1 4 4 12 7 9 1 1 ...
 $ Population.Hispanic           : int  5 5 5 5 7 20 17 15 5 3 ...
 $ Population.65.and.Older       : int  15 34 34 23 38 22 33 15 34 13 ...
 $ Head.of.Household...25        : int  4 3 3 3 2 4 3 6 3 1 ...
 $ Head.of.Household.25.to.64    : int  72 49 49 65 46 59 48 69 49 80 ...
 $ Head.of.Household.65.and.Older: int  24 48 48 32 52 37 49 25 48 19 ...
 $ Homeowner.Occupied            : int  82 79 79 78 83 77 83 66 79 94 ...
 $ Population.Density....sqmi.   : int  446 2244 2670 324 251 63 50 820 2670 103 ...
 $ Region                        : Factor w/ 3 levels "C","N","S": 1 1 1 1 3 3 3 1 1 3 ...
 $ Have.Data                     : Factor w/ 2 levels "N","Y": 1 1 1 1 1 1 1 1 1 1 ...
 $ Use.Level                     : Factor w/ 4 levels "“H”","“L”","“M”",..: 4 4 4 4 4 4 4 4 4 4 ...
 $ code                          : Factor w/ 6 levels "1","2","3","4",..: 4 4 4 4 6 6 6 4 4 6 ...

So far everything looks good. I want to isolate specific rows of my data frame according to the levels of the Region, Have.Data and Use.Level variables. For some reason I can do this with, say, two of those variables, but not the third. For example, I can do something like this:
> Region[58]
[1] C
Levels: C N S
> (Region[58]=="C")
[1] TRUE
> (Region[58]=="C")&(Have.Data[58]=="Y")
[1] TRUE

So clearly at observation 58, I can see that the Region is labelled C, and it returns TRUE if I ask if Region[58] is C. Similarly I can do that when I ask if both are at certain levels simultaneously.
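
For anyone following along, here is a minimal sketch of the row selection described above. The data frame is a made-up four-row stand-in for the real one (the values and the short Utility names are assumptions), and it uses object$Region rather than bare Region, so it does not rely on the data frame being attached.

```r
# Toy stand-in for the real data frame; all values are invented for illustration.
object <- data.frame(
  Utility   = c("A", "B", "C", "D"),
  Region    = factor(c("C", "C", "S", "N")),
  Have.Data = factor(c("Y", "N", "Y", "Y"))
)

# Logical vector: TRUE for rows where both conditions hold.
keep <- object$Region == "C" & object$Have.Data == "Y"

# Row subset; leaving the slot after the comma empty keeps every column.
selected <- object[keep, ]
print(selected)
```

A third condition on Use.Level would be added to keep the same way, with another & term.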

For some reason this falls apart when I try to do it for the Use.Level variable.
> Use.Level[58]
[1] “M”
Levels: “H” “L” “M” “N”
> (Use.Level[58]=="M")
[1] FALSE

I'm not sure why any of this happens. Is there anyone here with experience using R by any chance?

Twitter: Folken_fgc Steam: folken_ XBL: flashg03 PSN: folken_PA SFV: folken_
Dyvim Tvar wrote: »
Characters I hate:

Everybody @Folken Fanel plays as.

Posts

  • Baron Dirigible Registered User regular
    edited May 2011
    Just from looking at it, it seems you're omitting the quotes?
    > Use.Level[58]
    [1] “M”
    Levels: “H” “L” “M” “N”
    > (Use.Level[58]=="M")

  • Tzyr Registered User regular
    edited May 2011
    From the output of the object:
    $ Region                        : Factor w/ 3 levels "C","N","S": 1 1 1 1 3 3 3 1 1 3 ...
    $ Use.Level                    : Factor w/ 4 levels "“H”","“L”","“M”",..: 4 4 4 4 4 4 4 4 4 4 ...
    

    Could it be that the values for Region are simply C, N and S while the values of Use.Level are "H", "L", "M" ? (Note the added quotation marks).

  • Daenris Registered User regular
    edited May 2011
    Yeah, for whatever reason, your level labels in Use.Level include quotation marks in the level name, while those for your other factors don't. Either compare it to “M” ( like (Use.Level[58]=="“M”") ) or try to figure out why the quotes are there in the first place and remove them.
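
A rough sketch of the second option, removing the quotes from the level labels of a factor that is already loaded. The toy Use.Level vector below is an assumption standing in for the real column; the curly quotes are written as the escapes \u201C and \u201D so they cannot be confused with R's own string delimiters.

```r
# Toy factor whose labels contain curly quotes, mimicking the real Use.Level.
Use.Level <- factor(c("\u201CM\u201D", "\u201CN\u201D", "\u201CH\u201D", "\u201CM\u201D"))

# The comparison fails because the label is really “M”, not M.
before <- Use.Level[1] == "M"   # FALSE

# Strip the left and right curly quotes from the level labels only;
# the underlying integer codes of the factor are left untouched.
levels(Use.Level) <- gsub("\u201C|\u201D", "", levels(Use.Level))

after <- Use.Level[1] == "M"    # TRUE
print(levels(Use.Level))
```

Editing levels() rather than the values keeps the column a factor, so existing code that depends on it should not need to change.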

  • Folken Fanel anime af When's KoF Registered User regular
    edited May 2011
    Hm... that's really strange. Looking at the original .csv it looks like this:
    "Region"	"Have.Data"	"Use.Level"
    "C"	"N"	“N”
    "C"	"N"	“N”
    "C"	"N"	“N”
    "C"	"N"	“N”
    "S"	"N"	“N”
    "S"	"N"	“N”
    "S"	"N"	“N”
    "C"	"N"	“N”
    "C"	"N"	“N”
    

    Not sure why it's behaving differently... I guess I'll try playing around with it.

  • Folken Fanel anime af When's KoF Registered User regular
    edited May 2011
    I think I figured it out. I just opened up the .csv and deleted all the quotes and that seemed to fix everything. So weird.
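
If hand-editing the file is not practical, the same cleanup can probably be done at read time. This is only a sketch: the raw vector below is a three-line stand-in for the file, and with a real file it would come from something like readLines() on the actual path with encoding = "UTF-8", which is an assumption about how the file was saved.

```r
# Stand-in for the file contents: tab-separated, with curly quotes in the
# third column, as in the excerpt posted above. The data values are invented.
raw <- c(
  "\"Region\"\t\"Have.Data\"\t\"Use.Level\"",
  "\"C\"\t\"N\"\t\u201CN\u201D",
  "\"S\"\t\"Y\"\t\u201CM\u201D"
)

# Replace curly quotes with the straight quote that the parser treats as a
# quoting character.
clean <- gsub("\u201C|\u201D", "\"", raw)

# Parse the cleaned lines as tab-separated data; read.delim passes `text`
# through to read.table.
object <- read.delim(text = clean, stringsAsFactors = TRUE)
str(object)
```

After this, Use.Level has plain labels, so a comparison such as object$Use.Level == "M" behaves like the ones on Region and Have.Data.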

    Thanks H/A!

  • zilo Registered User regular
    edited May 2011
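    Yup, the quote marks around the first two columns are a different character from the quote marks around the third column. The first two columns use the plain ASCII double quote ("), which R strips when it reads the file. The third uses curly quotes (“ and ”), which are not ASCII at all, so R treats them as part of the value.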
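
One quick way to confirm this in R is to look at the code points of the characters. A small sketch (the variable names are just for illustration):

```r
straight    <- "\""      # the double quote typed on a keyboard
curly_open  <- "\u201C"  # left curly quote
curly_close <- "\u201D"  # right curly quote

# utf8ToInt() returns the Unicode code point of a single character.
utf8ToInt(straight)      # 34
utf8ToInt(curly_open)    # 8220
utf8ToInt(curly_close)   # 8221

# The characters look alike on screen but are not equal as strings.
same <- identical(straight, curly_open)
print(same)
```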
