How do I write a SELECT query to find records having an item in a complex field that appears in another field

dwoolley3
dwoolley3 used Ask the Experts™
on
I would like to have a SELECT query (or statements) in SQL Server 2005 that selects records where one of the items in a Complex field (items separated by commas) matches another field in a view. When the Complex field contains only one item, it is trivial:

SELECT *
  FROM table
where Complex IN
(SELECT SMS
   FROM ComplexView)

Let's say that SMS could be potentially be some of these 4 values: A, B, ABC, BD
And let's say that Complex could contain any of those 4 values or a combination of them, separated by commas:
A
B
ABC
BD
A,B
A,ABC
A,BD
B,ABC
B,BD
ABC,BD
A,B,ABC
B,ABC,BD
A,B,ABC,BD

Thus, I want a record having Complex field of "A,BD" to be displayed if one of the SMS values is "A" or "BD"; however, the record would not display if SMS is only the value of "B".
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Top Expert 2011
Commented:
Please try the following:

SELECT *
  FROM YourTable t
  where exists (select 1 from ComplexView c where (','+replace(t.Complex,' ','')+',') LIKE ('%,'+c.SMS+',%'))

Open in new window


I hope this helps.
Christopher GordonSenior Developer Analyst

Commented:
Does your base table (SELECT * FROM table) in your example have some kind of identity field or unique key?

If so, here is another approach.  It is dependent on a function called dbo.CSVToTable which I've attached.  The first recommendation may work fine, I just wanted to have some fun with this one :)


(see the EE Testing Code for more details after you've created the CSV To Table Function)

select      *
from      mytable

where      myUniqueId in
(

      select      distinct t1.myUniqueId

      from      myTable t1

      --parse the csv values in the base table to it's own table
      outer apply master.dbo.CSVToTable(t1.myInput, ',') matches

      --compare the parsed values from base table to acceptable values in complex.
      inner join MyComplexView complex on
            complex.sms = rtrim(ltrim(matches.code))
)

 CSVToTable.sql EE-Testing-Code.sql
You need to first split the Complex column. Use the technique described here: http://blogs.microsoft.co.il/blogs/itai/archive/2009/02/01/t-sql-split-function.aspx and leverage the CROSS APPLY statemet. Then you can do:
select * from myTable CROSS APPLY [Split](Complex, ',') 
where Data IN (SELECT SMS FROM ComplexView)
order by myTable.Complex 

Open in new window

Introduction to R

R is considered the predominant language for data scientist and statisticians. Learn how to use R for your own data science projects.

Great job in creating the query, wdosanjos! In my data, I do not have embedded spaces, so the query could become:

SELECT *
  FROM YourTable t
  where exists (select 1 from ComplexView c where (','+t.Complex+',') LIKE ('%,'+c.SMS+',%'))

alternatively, I was using the PATINDEX and combined it with your SELECT as follows:

where exists (select 1 from  ComplexView c where PATINDEX('%,'+c.SMS+',%', ','+t.Complex+',') > 0)

Both queries, however, take a long time to go through 50,000 records. I will also experiment with the other two suggestions that are a bit more involved to see if they would improve the performance.
Top Expert 2012

Commented:
>>Both queries, however, take a long time to go through 50,000 records.<<
Right, that is because it is a lousy design and you should seriously consider normalizing that data, if you expect any improvement in speed.

Author

Commented:
Since the query (that resolved my initial request) does take longer to execute than what we need, I have implemented a different design, which has given me good performance.

While I am satisfied with the solution of the first response and I have a good performing "new" design, I am open to suggestions on a design that normalizes the data. The customer record has an identity key and a field called Complex that has multiple SMS items in it, Would you suggest having the 20+ SMS fields in the customer record? though I was wanting to avoid having so many extra fields when most records will use just one SMS or two, but some records will use many of them. Alternatively, I could create a new table having pairs of the identity key (from the customer table) with each different SMS value (thus 1 to 20 or more pairs for each customer record).
Top Expert 2011

Commented:
create a table with 1 value per row ...

and query using EXISTS

select distinct Customerid
  from yourtable as A
 where exists (select customer from yourtable as x
                         where category='B'
                       and a.customerid=x.customerid)
 and exists (select customer from yourtable as x
                         where category='ABC'
                       and a.customerid=x.customerid)
 and exists (select customer from yourtable as x
                         where category='BD'
                       and a.customerid=x.customerid)



yourtable
customer  int not null,
category  char(10) not null,
primary key clustered  (customer,category)



 
Top Expert 2012

Commented:
>>Alternatively, I could create a new table having pairs of the identity key (from the customer table) with each different SMS value (thus 1 to 20 or more pairs for each customer record). <<
That is the way to go.  And make sure you create a Primary Key on those two columns.

Author

Commented:
Okay. I am open to the possibility of having an additional table as mentioned by Lowfatspread and acperkins. I also appreciate the details provided by Lowfatspread.

Perhaps we can integrate the query idea of Lowfatspread with my initial query.

SELECT *
  FROM table
where Complex IN
(SELECT SMS
   FROM ComplexView)

The bottom part, (SELECT SMS FROM ComplexView) is a subset (maybe 5 or 10) of the 20+ valid SMS values. I'd like to select all fields of all records from my "table" where the customer's record has an SMS (or category according to the table made by Lowfatspread) that appears within the subset. Can someone help me finish this query and replace the "where Complex IN" with something that would do the job?

Author

Commented:
Since my original question was answered, I awarded points accordingly, yet I appreciated the alternative suggestions of creating another table to normalize the data. In the end, I created a totally different way to achieve my desired results.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial