SQL assistance

Deebaser · May 2011

So I have this code that has an error in here somewhere and I can't for the life of me find it (element names have been changed, obv).

The problem I'm seeing is that when I select #updown2, I get one event date (expected), but the output select below is giving me 5 rows for COMPANY A, each with different event dates. Does anyone have any thoughts on what glaring error Im not seeing?

select DISTINCT
customer_id,
max(event_dt) as maxevent
into #updown2
from #updown
group by _id,
go

select distinct
u3.customer_id,
u3.State,
u3.Customer_Name,
u3.VARCHAR_THING,
u3.event_dt,
from
#updown u3
, #updown2 u4
where
u4.maxsd = u3.event_dt
AND u3.customer_id = u4.customer_id
group by u3.customer_id
having other_event_date = max(other_event_date)

Deebaser · May 2011

I resolved the error through black magic I don't rightly understand. The bad shit goes away if I take out the HAVING clause and just create another bullshit hash table for the other_event_date.

New Question: Why the hell does is a Having clause fucking up my shit.
Bonus Points: Who the fuck even uses HAVING clauses anyway

(Sometimes I really hate having to modify other people's code)

1ddqd · May 2011

My understanding is that HAVING = WHERE for aggregate functions like GROUP.

My favorite thing to do is re-write things and use UPDATE instead of creating temp tables like that. It's still a temp table, but it makes working with multiple entry tables (like date of events for records) much easier.

All you do is set the 'home' temp table to be pulled from a unique PK recordset. Once you know that dataset is solid, update the accounts using your max( functions to get the latest event.

One thing I've found is that in recordsets that have multiple events/day, they'll also have a sequence number. Use that whenever you can, since you won't always have datetime (could just be a short date)

JHunz · May 2011

It's a bit hard to tell what was going wrong because I think you obfuscated things either a bit too much, or incompletely.

But, if my assumption is correct that event_dt and other_event_date are two separate date columns in #updown, your problem was as follows:

The having clause was restricting your results to those where the date column other_event_date matched the maximum value of other_event_date for that customer. However, you were outputting event_dt, that being a separate date.
So if you had two records in #updown with different event dates, but the same other_event_date, both would get selected out. Your join to u4 isn't filtering that out because there will always be a record in u4 that matches on both of the join conditions (and you aren't joining based on other_event_date).
Your distinct isn't filtering it out because other_event_date isn't in your select statement.

If you never use HAVING statements for anything else, at the minimum know how to use them to find duplicate records:
select
-- All the columns that you want/need to be distinct
, count(*)
from <table>
having count(*) > 1

Penny Arcade

Quick Links

SQL assistance

Posts