Excel Question: Garbage in, garbage out

DeebaserDeebaser on my way to work in a suit and a tieAhhhh...come on fucking guyRegistered User regular
edited February 2011 in Help / Advice Forum
Okay, so I have a wierd dataset I need to manipulate that has been manhandled, mangled, manually updated and misplaced for so long we're trying to figure out if it is anything other than garbage data.

Anyway, here's the situation:

We have two columns: CUSTOMER_NAME, CUSTOMER_TYPE with a shitton of varchar entries entries for each customer. Alegedly the last entry for each customer name is the most current customer type we should be using.


Example:

'Bob's Dildo Shack' - 'Small business'
'Bob's Dildo Shack' - 'Internet Enterprise'
'Bob's Dildo Shack' - 'Mom and Pop Joint'
'Bob's Dildo Shack' - 'Adult Store'
'Dildopolis' - 'Adult Store'
'Dildopolis' - 'Rubber Dick Outlet'


I want to seperate the lime entries from the non-lime. I'm thinking I can use an if statement, but I'm feeling excel tarded at the moment.

Please help, ye masters of Office.

Deebaser on

Posts

  • LaOsLaOs SaskatoonRegistered User regular
    edited February 2011
    I have an idea, but am about to leave work. Will check in when I get home.

    [Edit]
    When you say you want to separate the lime entries from the non-lime, what do you mean? You want to pull that information somewhere else? You want to mark that line for sorting or something, or what?

    LaOs on
  • DeebaserDeebaser on my way to work in a suit and a tie Ahhhh...come on fucking guyRegistered User regular
    edited February 2011
    Thanks in advance.

    I want to flag the non-lime for deletion somehow, and then delete them. I'm futzing around with an IF, but it isn't working.

    I just may throw the whole goddamn thing into access.

    Deebaser on
  • LaOsLaOs SaskatoonRegistered User regular
    edited February 2011
    Oh, hah, that might be even cleaner than I had thought. Anyway, out the door now so catch you a bit later tonight.

    LaOs on
  • BobbleBobble Registered User regular
    edited February 2011
    If you're looking for the last case of each customer (in the first column), then to the if like this:

    In cell C2, put this: If(A2=A3,0,1).

    This way, when there's a 1 in column C, it indicates that the customer's changing on the next row. Paste the values over the formula, then you can sort on column C to just get the 1's. Pretty sure that should do it?

    Bobble on
  • schussschuss Registered User regular
    edited February 2011
    Bobble wrote: »
    If you're looking for the last case of each customer (in the first column), then to the if like this:

    In cell C2, put this: If(A2=A3,0,1).

    This way, when there's a 1 in column C, it indicates that the customer's changing on the next row. Paste the values over the formula, then you can sort on column C to just get the 1's. Pretty sure that should do it?

    Yes, this will do it. This is my #1 most used formula. Excel 2007+ also has an "eliminate dupes" I believe.

    schuss on
  • DeebaserDeebaser on my way to work in a suit and a tie Ahhhh...come on fucking guyRegistered User regular
    edited February 2011
    Bobble, that is perfect bro! I feel like such an idiot, I was looking up, not down [If(A1=A2,0,1)] and that's not right at all.

    I wound up throwing it into access, maxing out the primary key/"customer name" and joining to pull the appropriate "customer type", but it was driving me nuts that I couldn't instantly do it with excel.

    Deebaser on
  • LaOsLaOs SaskatoonRegistered User regular
    edited February 2011
    Bobble's formula will work so long as your names are all in one section each. Like, all Bob's Dildo Shop together then all Dildopolis together, then all Ye Olde Dildo Shoppe together, etc. It won't work if the latest name/entry for Bob's Dildo Shop appears after Ye Olde Dildo Shoppe or something like that.

    Basically, my solution was going to do the same thing, but it was going to check to see if a specific entry was definitely the last entry for that name.
    Using this example table:
    Customer Name - Customer Type
    Bob's Dildo Shack - Small Business
    Bob's Dildo Shack - Internet Enterprise
    Bob's Dildo Shack - Mom & Pop Joint
    Dildopolis - Adult Store
    Dildopolis - Rubber Dick Outlet
    Bob's Dildo Shack - Adult Store

    This formula would be pasted into C2 and filled down to the end of the rows.

    =IF(CONCATENATE($A2,COUNTIF($A:$A,$A2))=CONCATENATE($A2,COUNTIF($A:$A,$A2)-COUNTIF($A3:$A$100,$A2)),"","x")

    This formula will look at the Customer Name and basically result in a blank cell on the last entry of that customer's name, regardless of how far down it appears (even away from the initial section). Anything that is not the latest entry for that exact customer name results in a x which you can use to sort by and delete all that have an x in that column. It works the same way but does extra checking, basically, to ensure it's the lowest/latest entry for that customer name.

    The first CONCATENATE combines the Customer Name on this row with the total number of times this exact Customer Name has been entered in this column. ("Bob's Dildo Shack4")

    The second CONCATENATE combines the Customer Name on this row with the total number of times this exact Customer Name has been entered in this column minus the total number of entries of this Customer Name in this column below this entry. ("Bob's Dildo Shack1")

    The whole thing is an IF statement comparing the two results of the CONCATENATE formulas. If the first result is equal to the second result, a blank cell is generated. If they do not equal, an x is generated. Sort by the rows with x in this column and you can safely delete them.

    What you will want to make sure you do is make sure that the range you use in the last COUNTIF formula extends to at least one row lower than the end of your data. (You need to specify a range for this formula here, though--simply using $C:$C for the the whole column will not work. I'd just try to pick an arbitrarily large number of rows down that you are unlikely to hit.)

    Anyway, like I said, if your Names are always bunched, then this is overkill, although this works if they're bunched, as well--it's just more complex.

    LaOs on
  • DeebaserDeebaser on my way to work in a suit and a tie Ahhhh...come on fucking guyRegistered User regular
    edited February 2011
    LaOs,

    That is just beautiful. I am saving it, and naming a recurring crystal report in your honor.

    Deebaser on
  • LaOsLaOs SaskatoonRegistered User regular
    edited February 2011
    I do what I can. I really love playing around in Excel. (I should really brush up on Pivot Tables and learn some VBA--I never really use either though.)

    Glad I could help.

    LaOs on
  • schussschuss Registered User regular
    edited February 2011
    I never really do super-advanced things in excel, as before I was limited to 65k rows (most of my queries are in the hundreds of thousands to millions of rows). I just got 2010 though, so mayhaps it's time to brush up.
    I also get Hyperion Explorer to play with, which is great when you're a data nerd who sucks at programming.

    schuss on
Sign In or Register to comment.