tags:

views:

313

answers:

2

I have a table that, due to the third party system we are using, sometimes has duplicate data. Since the model uses an EAV method there's no way to filter this the "right" way, so I am aggregating the data into a View - I know this is a data collection problem but it's easier for me to fix it on the display end than go through this system and potentially break existing data and forms. I need to check one of two fields to see if one or both are entered, but only pick one (otherwise the name displays twice like this: "John,John" instead of just "John"). Here's my code for the relevant part:

group_concat(
(
  case when (`s`.`fieldid` = 2) then `s`.`data` 
  else
    case when (`s`.`fieldid` = 35) then `s`.`data` 
    else NULL end 
  end
) separator ','),_utf8'') as first_name

If both fieldid 2 and fieldid 35 are entered, I would expect this to just return the value from fieldid = 2 and not the value from fieldid = 35, since the Else clause shouldn't execute when the original case when is true. However it's grabbing that, and then still executing the case when inside of the else clause?

How can I fix this code to give me either fieldid = 2 OR fieldid = 35, but avoid globbing them both together which results in the name being duplicated?

EDIT

Here is the table structure:

table: subscribers_data
subscriberid (int)   fieldid (int)   data (text)

It uses an E-A-V structure so a sample record might be:

subscriberid          fieldid         data
1                     2               John
1                     3               Smith
1                     35              John
1                     36              Smith

with fieldid 2 and 35 being the custom field "First Name" (defined in a separate table) and fieldid 3 and 36 being "Last Name".

Here is the full view that I'm using:

select `ls`.`subscriberid` AS `id`,
left(`l`.`name`,(locate(_utf8'_',`l`.`name`) - 1)) AS `user_id`,
ifnull(group_concat((
 case when (`s`.`fieldid` = 2) then `s`.`data` 
 when (`s`.`fieldid` = 35) then `s`.`data` 
 else NULL end) separator ','),_utf8'') AS `first_name`,
ifnull(group_concat((
 case when (`s`.`fieldid` = 3) then `s`.`data` 
 when (`s`.`fieldid` = 36) then `s`.`data` 
 else NULL end) separator ','),_utf8'') AS `last_name`,
ifnull(`ls`.`emailaddress`,_utf8'') AS `email_address`,
ifnull(group_concat((
 case when (`s`.`fieldid` = 81) then `s`.`data` 
 else NULL end) separator ','),_utf8'') AS `mobile_phone`,
ifnull(group_concat((
 case when (`s`.`fieldid` = 100) then `s`.`data` 
 else NULL end) separator ','),_utf8'') AS `sms_only` 
from ((`list_subscribers` `ls` 
join `lists` `l` on((`ls`.`listid` = `l`.`listid`))) 
left join `subscribers_data` `s` on((`ls`.`subscriberid` = `s`.`subscriberid`))) 
where (left(`l`.`name`,(locate(_utf8'_',`l`.`name`) - 1)) regexp _utf8'[[:digit:]]+') 
group by `ls`.`subscriberid`,`l`.`name`,`ls`.`emailaddress`

The view is being used as the Model for a Ruby on Rails application, so I'm using some creative hacking to fake out a "user_id" that Rails expects (we name the field list.name in the Lists table using a numeric ID that our front-end Rails app generates when we add a new user, so I'm extracting just this number to make the view look like a Rails-convention database table)

+1  A: 

I am not a mysql guy, but in a sql server case statement, you could do it without the first 'else'

case
  when fieldid = 2 then data
  when fieldid = 35 then data
  else null
end

Also, you seem to be returning the same 'data' field in both cases

Ray
The "data" field is the EAV thing I was talking about. Fieldid links to a list of custom fields (first name, etc but it can be and is defined mroe than once - both 2 and 35 are "first name") and data is the actual value (so in this case both contain "John"). The thing is, that still gives me both fields - I only want one or the other.
Wayne M
Basically I had the code above at first: It returns "John,John" but I want it to just be "John"
Wayne M
You're going to have to show some more code, then, WayneM. What's going on around this `group_concat()`?
Jonathan Leffler
+1  A: 

Anything inside group_concat() doesn't have a way to see the context in which it's running. So, you have have two rows in a single group, one with fieldid=2 and second with fieldid=35, it will do the following:

  • processing row with fieldid=2...
    • s.fieldid = 2 is true, return s.data
  • processing row with fieldid=35...
    • s.fieldid = 2 is false, try the else part
    • s.fieldid = 35 is true, return s.data

This explains why is "John" returned multiple times. The only way to fix it is to run a different query outside of group_concat().

EDIT:

Ih you really have to do it this way, use something like this instead:

SELECT ...
   min(CASE WHEN s.fieldid IN (2,35) THEN s.data ELSE NULL END) AS first_name
...

Alternatively you can do group_concat(DISTINCT ...) if the two values can't be different (otherwise you would get e.g. "John,Johny"). Why do you have two values for first_name/last_name though?

Lukáš Lalinský
Any idea how to handle it? I can remove group_concat from the sample I showed above and it seems to work, but doing it to another one (with different IDs) doesn't give me either result even when they're both there, but keeping group_concat still gives it to me twice.
Wayne M
Post the actual query you are running and the table structure.
Lukáš Lalinský
Using it without group_concat also seems to not give me data for some records where the data exists. It's very weird and confusing.
Wayne M
See my updated answer. Using for example `min` instead of `group_concat` should work, but the schema you have is weird (even for EAV). I don't see a reason to have multiple fields representing the same information.
Lukáš Lalinský
There are two values because we're using a third party tool that lets us create forms with custom fields, and some clients require first name while others don't. Our data guys created two different fields (one required, one not) but we didn't find out until recently that it meant the fields are duplicated; we updated some clients with new fields but it left them with dupes since we can't remove the old fields without potentially breaking the existing forms which are on customer sites.
Wayne M