The new forums will be named Coin Return (based on the most recent vote)! You can check on the status and timeline of the transition to the new forums here.
The Guiding Principles and New Rules document is now in effect.
I'm working on a problem relating to adding user data fields to a off the shelf CMS (Joomla). The solution has already been developed by someone else, and basically involves storing all the different data for all the different users in one table. Since this supports totally dynamic data fields its probably the best solution, so I've not thought of readdressing that part of the project.
However, some jerk somewhere has now decided to request a feature where I need to do a search for users that possess two pieces of data from these added fields. I'm trying to determine the retarded SQL or PHP trick that will have the least impact on performance.
I've probably done a poor job of explaining, so let me provide this example. Extra fields are stored in a table called ext_data that looks like this
ID USERID FIELDNAME DATA
1 17 FavoriteCheese Cheddar
2 17 CarBrand Honda
3 18 FavoriteCheese Havarti
4 18 CarBrand Mazda
5 19 FavoriteCheese Munster
6 19 CarBrand Chevrolet
So, what is the best way to find out the users who both drive Mazdas and eat Havarti?
Taco Bell does win the franchise war according to the tome of knowledge that is Demolition Man. However, I've watched Demolition Man more then a few times and never once did I see WoW. In conclusion Taco Bell has more lasting power then WoW.
But the fields in question are both in this table in different rows. There is no where clause that can cover that contingency that I know of (can't say I've ever attempted something this stupid before)
SELECT ext_data_a.USERID,
ext_data_a.FIELDNAME AS FIELDNAME_A, ext_data_a.DATA AS DATA_A,
ext_data_b.FIELDNAME AS FIELDNAME_B, ext_data_b.DATA AS DATA_B
FROM ext_data AS ext_data_a, ext_data AS ext_data_b
WHERE ext_data_a.USERID == ext_data_b.USERID
AND ext_data_b.ID > ext_data_a.ID
AND FIELDNAME_A != FIELDNAME_B
AND (FIELDNAME_B != "CarBrand" OR DATA_B == "Mazda")
AND (FIELDNAME_A != "FavoriteCheese" OR DATA_A == "Havarti")
Oh, and if there's an uneven number of fields in the table, that query won't work. If the two fields to be queried are invariant, you might be able to create a secondary, well-ordered table, and then populate it every hour or so with a script. No idea how to do it in PHP, but it would be straightforward. Select from the table for FIELDNAME="FavoriteCheese", another select for FIELDNAME="CarBrand", and then re-insert into the new table. If the fields can change per query, haha good luck you'll need it.
It will be a pair of hardcoded queries. Thank god. Client may go further insane, fingers crossed.
AND (FIELDNAME_B != "CarBrand" OR DATA_B == "Mazda")
AND (FIELDNAME_A != "FavoriteCheese" OR DATA_A == "Havarti")
I don't quite get what you were up to here.
It's an encoding of implication into boolean logic. An implication is something like: (A = "Foo") => (B = "Bar"), or if A is "Foo", B must be "Bar". This is equivalent to (A != "Foo") OR (B = "Bar"). The two lines you quoted mean: only return tuples when (CarBrand is the first field and "Mazda" is the data) and (FavoriteCheese is the second field and "Havarti" is the data).
select distinct USERID from ext_data ED1
where data = 'Havarti'
and exists (select 'Y' from ext_data ED2 where data='Mazda' and ED1.userid = ED2.userid)
?
If you wanted it a bit more easy to autogenerate
select distinct USERID from ext_data ED1
where exists (select 'Y' from ext_data ED where data='Havarti' and ED1.userid = ED.userid)
and exists (select 'Y' from ext_data ED where data='Mazda' and ED1.userid = ED.userid)
Would an INTERSECT work?
SELECT userid FROM ext_data WHERE fieldname='carbrand' AND data='mazda'
INTERSECT
SELECT userid FROM ext_data WHERE fieldname='favoritecheese' AND data='havarti'
SELECT table1.USERID from ext_data AS table1 JOIN ext_data AS table2 ON table2.DATA = 'Havarti' AND table2.FIELDNAME = 'FavoriteCheese' AND table2.USERID = table1.USERID WHERE table1.FIELDNAME = 'CarBrand' AND table1.DATA = 'Mazda';
I'm far from a SQL expert, so this may end up being hellish on the DB if the tables are huge, but I think this is as limited as you're going to get it. Try it with the word EXPLAIN before the query to be sure you're not going to lock like 5 million rows or something.
SELECT DISTINCT a.USERID
FROM ext_data AS a, ext_data AS b
WHERE a.FIELDNAME = 'FavoriteCheese' AND a.DATA = 'Havarti'
AND b.FIELDNAME = 'CarBrand' AND b.DATA = 'Mazda'
AND a.USERID = b.USERID
When I replicated your table, this is the result I got:
Posts
You should really select specific fields as well but this is just quick and dirty.
猿も木から落ちる
EDIT: adapt this syntax to your DBMS as needed.
猿も木から落ちる
Try it again, I badly munged up the first copy of the query. I tested it in SQLite, so MySQL shouldn't have any problems.
I don't quite get what you were up to here.
猿も木から落ちる
It's an encoding of implication into boolean logic. An implication is something like: (A = "Foo") => (B = "Bar"), or if A is "Foo", B must be "Bar". This is equivalent to (A != "Foo") OR (B = "Bar"). The two lines you quoted mean: only return tuples when (CarBrand is the first field and "Mazda" is the data) and (FavoriteCheese is the second field and "Havarti" is the data).
Surely this is as simple as
?
If you wanted it a bit more easy to autogenerate
Am I missing something obvious here?
Unless he's using MySQL 3, which I really doubt!
SELECT userid FROM ext_data WHERE fieldname='carbrand' AND data='mazda'
INTERSECT
SELECT userid FROM ext_data WHERE fieldname='favoritecheese' AND data='havarti'
SELECT table1.USERID from ext_data AS table1 JOIN ext_data AS table2 ON table2.DATA = 'Havarti' AND table2.FIELDNAME = 'FavoriteCheese' AND table2.USERID = table1.USERID WHERE table1.FIELDNAME = 'CarBrand' AND table1.DATA = 'Mazda';
I'm far from a SQL expert, so this may end up being hellish on the DB if the tables are huge, but I think this is as limited as you're going to get it. Try it with the word EXPLAIN before the query to be sure you're not going to lock like 5 million rows or something.
When I replicated your table, this is the result I got: