As was foretold, we've added advertisements to the forums! If you have questions, or if you encounter any bugs, please visit this thread: https://forums.penny-arcade.com/discussion/240191/forum-advertisement-faq-and-reports-thread/
Options

Need some pro help with Excel

RamiRami Registered User regular
edited July 2017 in Help / Advice Forum
So for reasons not interesting enough to explain, I find myself needing to deal with a bunch of data on a bunch of excel spreadsheets. About 30, with ~20,000 entries each.

I've never been that big on excel because it hasn't really been necessary tool for anything I've done, but I know enough to shove the whole lot into a pivot table and sort out what data I actually need displayed.

The problem I have is one of the data points I'm most in need of is times, and these all come in the form of '1 h 23 m' or '5 h 12 m' etc.

All that matters to me is the hour intervals, set to the next highest hour interval. So I don't care about '1 h 23 m' and '1 h 55 m', they're both just '2 hours' to me. I can't however, figure out how to tell excel to sort these into ranges of 0-1, 1.01-2, 2.01-3, etc.

I suspect part of the problem here is that entering times as 'X h Y m' is not an excel valid format, unfortunately that's how they are presented and I can't do anything about it.

Anyone know how to sort this mess out?

Steam / Xbox Live: WSDX NNID: W-S-D-X 3DS FC: 2637-9461-8549
sig.gif
Rami on

Posts

  • Options
    LeptonLepton Registered User regular
    So, I would recommend using something like Python or VBA to parse the data before throwing it all in the pivot table. However, if you don't want to do that, I have a formula that does what you ask, but it does assume that your minutes take the form mm (08 min, for example).

    vpaooiydcd0a.png

  • Options
    OrogogusOrogogus San DiegoRegistered User regular
    edited July 2017
    A horrifically messy alternative that deals with minutes that show up in both single and double digits (8 m and 18 m) -- change the cell reference from A1 as needed:

    =LEFT(A1, FIND(" h", A1))+IF(VALUE(MID(A1, FIND("h", A1)+1, FIND("m", A1)-FIND("h", A1)-1))>0, 1, 0)

    EDIT: That is to say, you would add a column/row, copy this formula in and fix the reference as needed, then paste it all the way down the column or across the row.

    Orogogus on
  • Options
    LeptonLepton Registered User regular
    Yeah, it occurred to me that you could also use IFERROR to account for single digit minutes. Here's the updated formula for A1:

    =IFERROR(IFERROR(IF(INT(MID(A1,LEN(A1)-3,2))>0,INT(LEFT(A1,LEN(A1)-7))+1,INT(LEFT(A1,LEN(A1)-7))),IF(INT(MID(A1,LEN(A1)-2,1))>0,INT(LEFT(A1,LEN(A1)-6))+1,INT(LEFT(A1,LEN(A1)-6)))),"Something Wrong")

    I like IFERROR.

    Thanks, Orogogus, for getting me to think a little deeper about this.

  • Options
    HefflingHeffling No Pic EverRegistered User regular
    Another way is to highlight the column and then use the "Convert Text To Columns" function from the Data tab. You can tell it to delineate by spaces and it will create four columns of:

    1 | h | 00 | m
    3 | h | 45 | m
    etc format.

  • Options
    LeptonLepton Registered User regular
    Heffling wrote: »
    Another way is to highlight the column and then use the "Convert Text To Columns" function from the Data tab. You can tell it to delineate by spaces and it will create four columns of:

    1 | h | 00 | m
    3 | h | 45 | m
    etc format.

    Well now, that is a far more elegant solution. Good thinking.

  • Options
    Bliss 101Bliss 101 Registered User regular
    If all you care about is the completed hours, as implied by your post (1h 23 mins being the same as 1 h 55 mins; both would mean 2h), then here's an easy trick. Select the column, do a Search>Replace, type in " *" without the quotation marks but with the space. Your 1 h 22 min and 1 h 55 min will both be just 1. Then just add another column that adds a +1 to everything in the previous column. Done.

    MSL59.jpg
  • Options
    RamiRami Registered User regular
    Thanks guys, I'll see if I can get something useable out of these suggestions.

    Steam / Xbox Live: WSDX NNID: W-S-D-X 3DS FC: 2637-9461-8549
    sig.gif
  • Options
    RamiRami Registered User regular
    edited July 2017
    Hey guys, so I tried out the IFERROR:

    =IFERROR(IFERROR(IF(INT(MID(A1,LEN(A1)-3,2))>0,INT(LEFT(A1,LEN(A1)-7))+1,INT(LEFT(A1,LEN(A1)-7))),IF(INT(MID(A1,LEN(A1)-2,1))>0,INT(LEFT(A1,LEN(A1)-6))+1,INT(LEFT(A1,LEN(A1)-6)))),"Something Wrong")

    And it almost worked perfectly.

    zyi4c1r.png

    There seems to be some inconsistencies though, '9 h 20 m' appears (incorrectly) as 9, but '9 h 18 m' correctly appears as 10.

    I also failed to mention times sub 1 hour just appear as XX m, which is also throwing it off. The '-' segments are fine because they aren't included in the count anyway.


    Any ideas on fixing this? It's really close right now.

    Rami on
    Steam / Xbox Live: WSDX NNID: W-S-D-X 3DS FC: 2637-9461-8549
    sig.gif
  • Options
    PailryderPailryder Registered User regular
    edited July 2017
    you could also build a string of IF for finding that 'h' if that is a good representative of your data. i wasn't sure what was font and what was spaces so adjust the numbers below as needed. I like to keep things as short and simple as possible :)
    =IF(MID(A1,3,1)="h",LEFT(A1,2),"")

    Pailryder on
  • Options
    RamiRami Registered User regular
    This may be a good time to point out that I don't understand the IFERROR formula at all. I wasn't kidding when I said I didn't really know much excel. So I'm not really sure what to do with that amended one.


    The backstory behind this, btw, is that one of the departments where I work has been shitting the bed for 3 years and it just came to light in a big way. So a lot of people are getting fired and some people from other departments (like me) are being brought in to sort out 3 years of data that hasn't been processed. If I can get this conversion working though it will literally save me ~2 weeks of tedious data handling.

    Steam / Xbox Live: WSDX NNID: W-S-D-X 3DS FC: 2637-9461-8549
    sig.gif
  • Options
    HefflingHeffling No Pic EverRegistered User regular
    The problem you're running into is that the initial formula is written in such a way as that the hour term has to be present. You can change the "Something Wrong" portion to a "1" if you know that the only time this will occur is when you have less than 1 hour present.

    The iferror function looks for an excel result that gives a #N/A error due to a formula problem and returns a specific result if that error occurs. It's of the form of =iferror(function,result) where result only displays if the function errors out.

    Also, a trick to make your spreadsheets look nice is to recognize that Excel looks at blanks as a double quote, so "" means blank. You can use that to write a function such as:

    =if(A1="","",function)

    This will generate a blank if the cell is blank, and the function otherwise.

  • Options
    Inquisitor77Inquisitor77 2 x Penny Arcade Fight Club Champion A fixed point in space and timeRegistered User regular
    I would avoid formulae altogether (in this case, per an earlier post, it would take forever and you don't know how to troubleshoot it yourself).

    Maybe just combine Heffling's solution with some basic math?

    1) Use text-to-columns to sort out the data in hours and minutes.

    2) Do a quick filter or pivot table of each column to ensure that there were no errors in the conversion process. This is highly likely given the errors you encountered with the formula - I'm betting the formatting was inconsistent or there are typos somewhere along the line that need to be resolved. Based on the errors you find, I'm betting you can do Find-Replace to get rid of the vast majority of them, and then deal with the rest by hand.

    3) In another column, use Excel to multiply the hours column by 60 to convert them into minutes.

    4) Add a new column again, and use Excel to add the converted hours column and the minutes column in to a single consolidated minutes value.

    I feel like this process is important. It gives you a clean data value. As you can see, managing custom text strings in Excel is a pain in the ass. What you want is a single value that gives you everything you need that can then be easily converted into something else. If this is something your company needs to manage over the long term, they should find a way to stick with minutes as the underlying unit, if at all possible.

    You can then use this consolidated minutes value in any number of ways that don't require you to parse out what someone meant when they put in "8 h 022 m" instead of "8h 22m" or "8 h 22m" within the formula itself.


    An alternative thought: If you are intent on continuing the progress you've made and don't care to clean up the existing data and just want to move forward with the interval values, is to filter the column by errors so that you can look for their patterns. Then you can do a Find/Replace for each error pattern on the original data to clean it up so that it no longer returns errors.

    For example, just replacing all the columns with no hour values with "1 h 00 m" so that they are read correctly.

  • Options
    LeptonLepton Registered User regular
    Check for trailing spaces on your data. I think that is why the IFERROR formula is giving back just 9 on say, "9 h 20 m " (note the extra space after the "m"). I was able to replicate your problem in seconds when I added an extra space at the end.

    If you have trailing spaces, you can use the TRIM function to get rid of the trailing spaces. E.g., =TRIM(A1)

    IFERROR(A,B) means that if A gives you an error in excel, it returns B. So for a simple example, a formula "=A1/B1" will give you an error if B1 is 0. The formula "=IFERROR(A1/B1,"Cannot divide by 0")" will return A1/B1 so long as B1 isn't zero. If B1 is zero, the formula will return "Cannot divide by 0".

    I didn't consider that you could just have minutes, but that makes sense. That can be solved by working in additional IF statements, perhaps.

    =IFERROR(IFERROR(IFERROR(IF(INT(MID(A1,LEN(A1)-3,2))>0,INT(LEFT(A1,LEN(A1)-7))+1,INT(LEFT(A1,LEN(A1)-7))),IF(INT(MID(A1,LEN(A1)-2,1))>0,INT(LEFT(A1,LEN(A1)-6))+1,INT(LEFT(A1,LEN(A1)-6)))),IF(INT(LEFT(A1,LEN(A1)-2))>0,1,0)),"Something Wrong")

    My official recommendation is to parse the data with Python first and then put it in Excel.

  • Options
    RamiRami Registered User regular
    It works!

    The TRIM fixed the times rounding down, and that revised IFERROR works on the sub 1 hour formats

    0ShX0Fl.png

    Thanks so much guys, this is going to save me a lot of time that I can spend sorting out this dumpster fire instead of manually processing 3 years of data.

    Steam / Xbox Live: WSDX NNID: W-S-D-X 3DS FC: 2637-9461-8549
    sig.gif
Sign In or Register to comment.