
Generating an Empirical(?) Formula

b0bd0d Registered User regular
edited June 2010 in Help / Advice Forum
I'm not too sure how to do this. It's not for a class; somebody was talking about it at work. Say I have some sample data for an area that spans several months, and I also have the actual data for that same area. How do I come up with a formula that, given some sample data, will give an estimate of the actual data? Or what's the process called? Curve fitting? Regression?


Posts

  • VeritasVR Registered User regular
    edited June 2010
    Extrapolation?

    You base a formula on data points (linear, exponential, etc.), which typically gives you an equation relating x and y.

  • Metalbourne Inside a cluster b personality Registered User regular
    edited June 2010
    VeritasVR wrote: »
    Extrapolation?

    You base a formula on data points (linear, exponential, etc.), which typically gives you an equation relating x and y.

    You can plug all the X,Y data into Excel and then create a chart from that. Excel can also generate an equation from the data if you add a trendline to the chart. It's not perfect, but it's close enough for engineering work.

  • Demerdar Registered User regular
    edited June 2010
    b0bd0d wrote: »
    I'm not too sure how to do this. It's not for a class; somebody was talking about it at work. Say I have some sample data for an area that spans several months, and I also have the actual data for that same area. How do I come up with a formula that, given some sample data, will give an estimate of the actual data? Or what's the process called? Curve fitting? Regression?

    Can you plug the data into Excel, or do you have to do it by hand?

    Doing it by hand blows.

  • Gdiguy San Diego, CA Registered User regular
    edited June 2010
    It's regression, and doing it by hand only blows because you have to calculate the inverse of a matrix... but it's fairly easy in Excel, or in any statistics language (R, Matlab, etc.).
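
    To show what the "inverse of a matrix" step looks like for the simplest case, here is a minimal sketch of a straight-line least-squares fit. The function name `fit_line` and the numbers in the usage note are made up for illustration; for one predictor the matrix is only 2x2, so the inverse can be written out directly.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept.

    Solves the 2x2 normal equations
        [[sum(x^2), sum(x)], [sum(x), n]] @ [slope, intercept] = [sum(x*y), sum(y)]
    by writing out the inverse of the 2x2 matrix explicitly.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # determinant of the 2x2 matrix
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = (n * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    return slope, intercept
```

    For example, `fit_line([1, 2, 3, 4], [3, 5, 7, 9])` recovers a slope of 2 and an intercept of 1. With more predictors the matrix grows, which is where doing it by hand stops being fun.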

  • GoodOmens Registered User regular
    edited June 2010
    The real tricky part about regression is figuring out what type of curve is best. In other words, does the data best fit a linear model, a quadratic model, a logarithmic model, etc. Hopefully you have enough knowledge about the situation to make an educated decision about that. As others have said, Excel will handle the actual calculations.
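
    One rough way to compare curve types is to fit each candidate and look at R-squared. The sketch below is only an illustration in plain Python: `solve`, `r_squared`, `MODELS`, and `compare_models` are hypothetical names, and the logarithmic model assumes every x is positive.

```python
import math

def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def r_squared(xs, ys, basis):
    """Fit y = sum(c_k * basis[k](x)) by least squares and return R^2."""
    rows = [[f(x) for f in basis] for x in xs]
    k = len(basis)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coef = solve(ata, aty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Candidate model families, each written as a list of basis functions.
MODELS = {
    "linear": [lambda x: 1.0, lambda x: x],
    "quadratic": [lambda x: 1.0, lambda x: x, lambda x: x * x],
    "logarithmic": [lambda x: 1.0, lambda x: math.log(x)],  # assumes x > 0
}

def compare_models(xs, ys):
    """Return a dict mapping model name to its R^2 on the given data."""
    return {name: r_squared(xs, ys, basis) for name, basis in MODELS.items()}
```

    Keep in mind that adding terms can never lower R-squared on the data you fit, so the quadratic will always score at least as well as the line. A higher number alone does not mean the curve type is right, which is why knowledge of the situation still matters.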

  • b0bd0d Registered User regular
    edited June 2010
    Yeah, I've been reading some stuff about it. I plugged in the sample data and the actual data. The only trouble is that I'm getting really low R-squared values. Even using a 6th-order polynomial it's still around 0.36. Does that mean the data is not that correlated? Are there any manipulations I can do to try and fix it? I've tried removing outliers to get to a 90%-95% confidence interval, but it really didn't change anything. I plotted it on a log scale, but that didn't do anything either. Should I just keep reducing the confidence level until I get a good correlation? There seems to be a trend between the sample and actual data unless there isn't one. I mean, it follows a trend unless it doesn't follow a trend. Can I use probability to say that if the sample is a certain value, there are percentages for what the actual value will be? I.e., if the sample is 4, then there is a 50% chance the actual data is 4, 25% it's 3, 20% it's 5, and 5% it's 6.

    My statistics book is about 1200 miles away. Oh yeah, I'm doing this in Excel. Trying to draw a correlation between 30 sample points and 30 actual data points.
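
    The "if the sample is 4, then 50% it's 4, 25% it's 3..." idea can be sketched as a table of conditional frequencies. This is a minimal illustration, assuming the values are discrete (or have been rounded into bins first); `conditional_frequencies` is a hypothetical name and the data in the usage note is invented.

```python
from collections import Counter, defaultdict

def conditional_frequencies(samples, actuals):
    """For each observed sample value, return the share of each actual value seen with it."""
    counts = defaultdict(Counter)
    for s, a in zip(samples, actuals):
        counts[s][a] += 1
    table = {}
    for s, counter in counts.items():
        total = sum(counter.values())
        table[s] = {a: c / total for a, c in counter.items()}
    return table
```

    For instance, `conditional_frequencies([4, 4, 4, 4, 5], [4, 4, 3, 5, 5])` gives 50% / 25% / 25% for actual values 4, 3, and 5 when the sample is 4. With only 30 pairs, each sample value will have just a handful of observations behind it, so the percentages are rough estimates at best.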

  • VeritasVR Registered User regular
    edited June 2010
    You might want to just plot one set of data (actual or sample) and get an equation from that. Your R-squared value might be higher. Then graph that line and determine the average deviation from your other set of data. Not sure what information you can get from it, but it should tell you something.

  • Plutonium Registered User regular
    edited June 2010
    It's been a while since my statistics class, but if I am interpreting the situation correctly, what you want is called a chi-square test or a t-test, depending on what type of data you're working with.

  • Demerdar Registered User regular
    edited June 2010
    b0bd0d wrote: »
    Yeah, I've been reading some stuff about it. I plugged in the sample data and the actual data. The only trouble is that I'm getting really low R-squared values. Even using a 6th-order polynomial it's still around 0.36. Does that mean the data is not that correlated? Are there any manipulations I can do to try and fix it? I've tried removing outliers to get to a 90%-95% confidence interval, but it really didn't change anything. I plotted it on a log scale, but that didn't do anything either. Should I just keep reducing the confidence level until I get a good correlation? There seems to be a trend between the sample and actual data unless there isn't one. I mean, it follows a trend unless it doesn't follow a trend. Can I use probability to say that if the sample is a certain value, there are percentages for what the actual value will be? I.e., if the sample is 4, then there is a 50% chance the actual data is 4, 25% it's 3, 20% it's 5, and 5% it's 6.

    My statistics book is about 1200 miles away. Oh yeah, I'm doing this in Excel. Trying to draw a correlation between 30 sample points and 30 actual data points.

    If you are comparing sample data to actual data, ideally you would want a linear regression, right? If their relationship is linear such that the average slope of the line is 1 then you have a perfect fit.
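
    A quick way to check this is to regress the actual values on the sample values and look at the slope, the intercept, and R-squared together. The sketch below is illustrative only; `agreement` is a hypothetical name and the numbers in the usage note are invented.

```python
def agreement(sample, actual):
    """Regress actual on sample; return (slope, intercept, r_squared)."""
    n = len(sample)
    if n != len(actual) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_s = sum(sample) / n
    mean_a = sum(actual) / n
    ss_s = sum((s - mean_s) ** 2 for s in sample)
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_sa = sum((s - mean_s) * (a - mean_a) for s, a in zip(sample, actual))
    if ss_s == 0 or ss_a == 0:
        raise ValueError("one of the series is constant")
    slope = ss_sa / ss_s
    intercept = mean_a - slope * mean_s
    r_squared = ss_sa * ss_sa / (ss_s * ss_a)  # squared Pearson correlation
    return slope, intercept, r_squared
```

    `agreement([1, 2, 3, 4], [1, 2, 3, 4])` returns a slope of 1, an intercept of 0, and an R-squared of 1, which is the one-to-one match described above. By contrast, `agreement([1, 2, 3, 4], [3, 5, 7, 9])` also has an R-squared of 1 but a slope of 2 and an intercept of 1: the sample predicts the actual data perfectly, yet only after a correction, so it is worth looking at all three numbers.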

  • Fuzzy Cumulonimbus Cloud Registered User regular
    edited June 2010
    If you're doing observed versus expected, you need to do a t-test (with a null hypothesis) or a chi-square test to tell you if anything is out of the ordinary.

    Linear regression only works within a data set. You want to compare two data sets.
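
    As a rough sketch of the t-test route, assuming the 30 sample and 30 actual values are matched pairs (same place and time), a paired t statistic can be computed with the standard library alone. `paired_t` is a hypothetical name; the function returns the statistic and the degrees of freedom and leaves the p-value lookup to a table or a stats package.

```python
import math
import statistics

def paired_t(sample, actual):
    """Paired t statistic for the null hypothesis that the mean of (actual - sample) is 0.

    Returns (t, degrees_of_freedom).
    """
    if len(sample) != len(actual) or len(sample) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    diffs = [a - s for s, a in zip(sample, actual)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1
```

    With 30 pairs there are 29 degrees of freedom, and the two-sided 5% critical value is about 2.045. Note that this test asks whether the sample is biased high or low on average; it does not say how well an individual sample value predicts the matching actual value.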

  • Fuzzy Cumulonimbus Cloud Registered User regular
    edited June 2010
    You can also track coefficients of variation.
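
    For reference, the coefficient of variation is the standard deviation divided by the mean. A minimal sketch, with `coefficient_of_variation` as a hypothetical name:

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless spread)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean
```

    Computing it separately for the sample series and the actual series shows whether one is relatively noisier than the other.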
