So I'm a grad student studying mathematics and I need to analyze a set of data and run some tests to decide what class of distribution the data belongs to. I'm interested in finding out the number of days that pass between no-hitters thrown in Major League Baseball. However, the biggest problem I'm having is finding said information. Wikipedia lists every no-hitter ever thrown and the date of each, but calculating the number of days between each event by hand would be a giant pain in the ass. Is there any chance that someone knows of a site that will give me the number of days between no-hitters?
Or else there's a website here: http://www.timeanddate.com/date/duration.html that does the same thing. Unless you can cleverly get something to parse the table from Wikipedia, I don't see any faster way of doing things. I can't imagine any website keeps track of such an arcane statistic (or so it sounds to me - I have no clue how baseball works).
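If you'd rather not click through a date calculator for every pair, the gaps can be computed in a few lines once the dates are in a list. This is only a minimal sketch: the `days_between` name is made up for illustration, and the dates are placeholders, not real no-hitter dates.

```python
from datetime import date

def days_between(dates):
    """Return the day counts between each consecutive pair of dates."""
    ordered = sorted(dates)  # tolerate input that is not in date order
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]

# Placeholder dates for illustration only, not actual no-hitter dates.
sample = [date(2007, 3, 24), date(2007, 1, 1), date(2007, 4, 1)]
gaps = days_between(sample)
print(gaps)
```

The resulting list of gaps is what you would then feed into whatever distribution test you're running.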
Ugh, this is giving me bad flashbacks, but you can extract the HTML for that table into a new, really basic HTML doc (just put <HTML> <BODY> before the table and </BODY> </HTML> after it, literally nothing else) and tell Excel to import the data from that file. It'll work. It might be slow depending on how much data there is (expect about 1 full minute of delay per 1 MB of file size on a 3 GHz Intel dual core CPU with 1 GB RAM), but if you let it sit for a while it'll eventually push out a result.
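The same pull-the-table-out-of-the-HTML idea can be sketched without Excel, using only Python's standard library. This is a rough sketch, assuming the table is plain `<tr>`/`<td>`/`<th>` markup with no nested tables; the `TableRows` class, the `parse_table` helper, and the sample markup are all made up for illustration.

```python
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed

def parse_table(html):
    """Return the table contents as a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows

# Made-up markup standing in for the saved page; not real data.
html = ("<html><body><table>"
        "<tr><th>Date</th><th>Pitcher</th></tr>"
        "<tr><td>2007-01-01</td><td>Example A</td></tr>"
        "</table></body></html>")
rows = parse_table(html)
print(rows)
```

A real Wikipedia table has links, footnote markers and merged cells, so expect to clean up the cell text before treating it as dates.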
There's even a nifty function in Excel for calculating the difference in days between two dates, and I'm pretty sure there's a working function for calculating the difference in weekdays, and you can give it a list of holidays to skip. I say I'm pretty sure and "working" because if you've got a bit of experience in the subject using Access, you'll note that Access hasn't been able to calculate business days properly since at least 1997. I don't know if Access 97 does it right either. In fact, I strongly suspect the function was broken back then, too.
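For anyone curious what a weekday count with a holiday list boils down to, here is a minimal sketch. The `business_days_between` name and the counting convention (start date excluded, end date included) are assumptions chosen for this example; it is not meant to reproduce Excel's or Access's exact behavior.

```python
from datetime import date, timedelta

def business_days_between(start, end, holidays=()):
    """Count weekdays after `start` up to and including `end`, skipping holidays."""
    skip = set(holidays)
    count = 0
    day = start
    while day < end:
        day += timedelta(days=1)
        # weekday() is 0 for Monday through 6 for Sunday
        if day.weekday() < 5 and day not in skip:
            count += 1
    return count

# Monday 19 March 2007 to Monday 26 March 2007, with one made-up holiday.
print(business_days_between(date(2007, 3, 19), date(2007, 3, 26)))
print(business_days_between(date(2007, 3, 19), date(2007, 3, 26),
                            holidays=[date(2007, 3, 21)]))
```

For the no-hitter question itself plain calendar days are what matter, so this is mostly useful as a sanity check on whatever the spreadsheet reports.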
Pheezer on
As an alternative to Pheezer's way: if you can copy the data straight into a Notepad file, you can save it, open that file in Excel, split the data into columns as comma-delimited text, and then do the fancy functions.
That's the way I learned how to get data into Excel, but I have no idea if it's easier or harder than Pheezer's HTML method.
Actually, Excel stores dates as serial numbers. 1/1/1900 is stored in Excel as "1" and 3/24/2007 is stored as "39165", so any date from the last several decades comes out as a five-digit number. One catch: Excel's default date system starts at 1/1/1900, so anything earlier than that is treated as text rather than as a date.
Because Excel does this conversion accurately, you can just type the dates into Excel and subtract the previous no-hitter date from each one. Make sure the cell that you're doing the subtraction in has its format set to "number" rather than "date", and the cell will display the number of days between those two dates.
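To make the serial-number idea concrete, here is a small sketch that reproduces the numbers outside of Excel. The `excel_serial` helper is hypothetical, and it assumes Excel's default 1900 date system. It only handles dates from 1 March 1900 onward, because Excel counts a 29 February 1900 that never existed, which shifts the earlier serials by one.

```python
from datetime import date

# For dates on or after 1 March 1900, the serial number in Excel's 1900 date
# system equals the number of days since 30 December 1899.
EXCEL_EPOCH = date(1899, 12, 30)

def excel_serial(d):
    """Return the serial number Excel's 1900 date system assigns to date d."""
    if d < date(1900, 3, 1):
        raise ValueError("this sketch only covers dates from 1 March 1900 onward")
    return (d - EXCEL_EPOCH).days

first = excel_serial(date(2007, 1, 1))
second = excel_serial(date(2007, 3, 24))
print(second)          # the serial number for 3/24/2007
print(second - first)  # days between the two dates
```

Subtracting two serials is all the spreadsheet does when you subtract one date cell from another.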
Why aren't you using a stats package like SPSS? All you need is one column listing all dates, and another one coded as no-hitter (yes/no).
This seems pretty damn basic for postgrad study...
Well, I gave up on this idea in favor of something easier so I could get this done for Monday and start on homework for other classes.
I decided to see if the total number of home runs each team hits in a given year follows some distribution (turns out it's normally distributed) using the Kolmogorov-Smirnov test.
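For reference, the Kolmogorov-Smirnov statistic against a fitted normal can be sketched with nothing but the standard library. This is an illustration under stated assumptions, not the analysis actually run here: the `ks_statistic_normal` name is made up, the sample numbers are placeholders rather than real team totals, and the mean and standard deviation are estimated from the same sample, which makes the textbook critical values only a rough guide.

```python
from statistics import NormalDist, mean, stdev

def ks_statistic_normal(sample):
    """One-sample KS statistic D against a normal fitted to the sample itself."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

# Placeholder numbers for illustration only, not real team home run totals.
totals = [121, 135, 142, 150, 153, 158, 161, 166, 170, 174, 181, 189, 197, 210]
d = ks_statistic_normal(totals)

# Large-sample 5% critical value for a fully specified distribution. Because the
# parameters were estimated from the data, this threshold is conservative.
critical = 1.36 / len(totals) ** 0.5
print(d, critical, d < critical)
```

A D below the critical value means the test does not reject normality, which is weaker than showing the data are normal.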
Thanks for everyone's help though. I'll probably look into these suggestions when I don't have time constraints. Lock please.