Solved

How can I use LIKE in a SQL query to find any string containing a space when CHARINDEX is conflicting with it?

Posted on 2013-12-27
24
447 Views
Last Modified: 2013-12-30
So I have a field which I'm filtering with in a query.  I'm using the LIKE keyword which is intended to find any string containing at least one space (this field holds full names -- no, it's not a normalized table).  The problem is this: when I use '% %' for the LIKE condition I get the following error: "Invalid length parameter passed to the SUBSTRING function."  After goggling a bit, I found that there may be an incompatibility between CHARINDEX and LIKE (I'm using both...I have CHARINDEX finding spaces for me).  Note also, that PATINDEX apparently has the same issue.  I'd like to not use any WHERE clause at all, but I get the same error if I do that.  A LIKE condition that does not get an error is this one that looks for a space and something else (ex. '[a-z] %'...note the space...but this is too restrictive for my purposes).


CREATE TABLE #tblPlanNames (planned_officiant_name varChar(100), planned_officiant_LastName varChar(100))
BEGIN
  INSERT INTO #tblPlanNames 
  SELECT planned_officiant_name, ltrim(rtrim(replace(replace(replace(replace(planned_officiant_name,'  ',' '),'  ',' '),'801) ','801)'),'435) ','435)'))) planned_officiant_LastName FROM marriage WHERE planned_officiant_name is not NULL
END

SELECT DISTINCT  
CASE 
when charindex(' ', planned_officiant_LastName) >  0 AND patindex('%[0-9]%', reverse(left(reverse(planned_officiant_LastName), charindex(' ', reverse(planned_officiant_LastName))-1))) = 0 then reverse(left(reverse(planned_officiant_LastName), charindex(' ', reverse(planned_officiant_LastName))-1)) 
when charindex(' ', planned_officiant_LastName) >  0 AND patindex('%[0-9]%', reverse(left(reverse(planned_officiant_LastName), charindex(' ', reverse(planned_officiant_LastName))-1))) > 0 then reverse(left(right(reverse(planned_officiant_LastName), len(planned_officiant_LastName) - charindex(' ', reverse(planned_officiant_LastName))), charindex(' ', right(reverse(planned_officiant_LastName), len(planned_officiant_LastName) - charindex(' ', reverse(planned_officiant_LastName)))) - 1))
ELSE planned_officiant_LastName  
END lastName, planned_officiant_name 
FROM #tblPlanNames 
WHERE planned_officiant_LastName is not NULL AND ltrim(rtrim(planned_officiant_LastName)) <> '' 
  AND planned_officiant_LastName LIKE '% %') 
ORDER BY lastName

DROP TABLE #tblPlanNames

Open in new window


Thoughts?
0
Comment
Question by:David L. Hansen
  • 11
  • 7
  • 3
  • +3
24 Comments
 
LVL 40

Expert Comment

by:Sharath
ID: 39742322
Can you post some sample data from your table and the expected result
0
 
LVL 59

Expert Comment

by:Kevin Cross
ID: 39742371
Try: WHERE LTRIM(RTRIM(planned_officiant_LastName)) LIKE '% %'
Or: WHERE planned_officiant_LastName LIKE '[^ ]%[ ]%[^ ]'

One issue could be that you are matching on ' ' at the beginning or at the end.  Another solution is to use the CHARINDEX(' ', ...) > 0 in the WHERE also.
0
 
LVL 15

Author Comment

by:David L. Hansen
ID: 39742377
Sure.

id     |     planned_officiant_name     |     clerk
10    |         John Doe 867-5309        |        Clerk A
11    |         Judge Doe 777-0001      |        Clerk B
12    |         Thomas Payne               |        ClerkA

Note that the name often has a phone number in-line with it (same string) but not always...again, this table is not at all normalized --legacy stuff).  I'm trying to get just the last name out of the string (out of planned_officiant_name).  My patindex/charindex stuff (see the case statement) is written for the sole purpose of getting just the last name.
0
 
LVL 59

Expert Comment

by:Kevin Cross
ID: 39742403
Or flip the wildcards to the outside, then you will match anything (even records with spaces on the outside since you have LTRIM/RTRIM for use later).

SELECT CASE when charindex(' ', planned_officiant_LastName) >  0 AND patindex('%[0-9]%', reverse(left(reverse(planned_officiant_LastName), charindex(' ', reverse(planned_officiant_LastName))-1))) = 0 then reverse(left(reverse(planned_officiant_LastName), charindex(' ', reverse(planned_officiant_LastName))-1)) 
when charindex(' ', planned_officiant_LastName) >  0 AND patindex('%[0-9]%', reverse(left(reverse(planned_officiant_LastName), charindex(' ', reverse(planned_officiant_LastName))-1))) > 0 then reverse(left(right(reverse(planned_officiant_LastName), len(planned_officiant_LastName) - charindex(' ', reverse(planned_officiant_LastName))), charindex(' ', right(reverse(planned_officiant_LastName), len(planned_officiant_LastName) - charindex(' ', reverse(planned_officiant_LastName)))) - 1))
ELSE planned_officiant_LastName  
END lastName, planned_officiant_lastname
FROM (
    VALUES('John Doe 867-5309'),('Judge Doe 777-0001'),('Thomas Payne'),(' Not_Valid ')
) t(planned_officiant_LastName)
WHERE planned_officiant_LastName LIKE '%[^ ][ ][^ ]%'
;

Open in new window

0
 
LVL 16

Expert Comment

by:Surendra Nath
ID: 39742406
The issue is coming from your Left Function or the right function, these internally uses the substring function, hence they are throwing the error as substring error.

Check the lenght's in the LEFT and RIGHT that you have provided, in the select statement and see if they are negative.

If they are negative or NULL you will get this error, correct it out and it should be sorted.
0
 
LVL 31

Expert Comment

by:awking00
ID: 39742433
There two statements are precisely the same thing -

when charindex(' ', planned_officiant_LastName) >  0

where ... planned_officiant_LastName LIKE '% %'
0
 
LVL 15

Author Comment

by:David L. Hansen
ID: 39742443
Kevin,
Try: WHERE LTRIM(RTRIM(planned_officiant_LastName)) LIKE '% %'
Or: WHERE planned_officiant_LastName LIKE '[^ ]%[ ]%[^ ]'
gives the exact same error (as stated).

Neo_jarvis,
Shouldn't the ELSE case take care of any negatives or nulls occurring in the first two cases?
0
 
LVL 15

Author Comment

by:David L. Hansen
ID: 39742453
awking00,

I see what you mean logically, but I'm not sure what solution you are pointing to.
0
 
LVL 16

Expert Comment

by:Surendra Nath
ID: 39742476
No, Not neccssarily

Because let us say the string has a space within it


when charindex(' ', planned_officiant_LastName) >  0 AND patindex('%[0-9]%', reverse(left(reverse(planned_officiant_LastName), charindex(' ', reverse(planned_officiant_LastName))-1))) > 0 then reverse(left(right(reverse(planned_officiant_LastName), len(planned_officiant_LastName) - charindex(' ', reverse(planned_officiant_LastName))), charindex(' ', right(reverse(planned_officiant_LastName), len(planned_officiant_LastName) - charindex(' ', reverse(planned_officiant_LastName)))) - 1))


in the above statement, if the right function already eliminates the spaces, then the left function will have nothing to eliminate right... So in that place the charindex will return zero and the charindex - 1 will result in a length of -1 ( essentially you are doing left(X,-1) _ which is an error...
0
 
LVL 59

Expert Comment

by:Kevin Cross
ID: 39742493
Exactly as Neo_jarvis said.
e.g., 'Judge 777-0001'
0
 
LVL 59

Expert Comment

by:Kevin Cross
ID: 39742525
Something like this could work:

SELECT SUBSTRING(planned_officiant_lastname, CHARINDEX(' ', planned_officiant_LastName)+1, CHARINDEX(' ', planned_officiant_LastName+' ', CHARINDEX(' ', planned_officiant_LastName)+1)-CHARINDEX(' ', planned_officiant_LastName))
FROM (
    VALUES('John Doe 867-5309'),('Judge 777-0001'),('Thomas Payne'),(' Not_Valid ')
) t(planned_officiant_LastName)
WHERE planned_officiant_LastName LIKE '%[^ ][ ][a-z]%'
;

Open in new window


Note the updated LIKE grabs values with a letter after the space.  If you want to include records with phone numbers also, you can add that to a CASE statement to show last name or NULL.  Then you can create a different column to show the phone number.
0
 
LVL 15

Author Comment

by:David L. Hansen
ID: 39742550
I see...yes.  Not sure how to fix it though, perhaps putting another WHEN in between the existing two and test for a "< 0" scenario??
0
Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

 
LVL 15

Author Comment

by:David L. Hansen
ID: 39742569
I just added a third WHEN (and I placed it in between the original two)...here it is:

when patindex('% %', planned_officiant_LastName) >  0 AND patindex('%[0-9]%', reverse(left(reverse(planned_officiant_LastName), patindex('% %', reverse(planned_officiant_LastName))-1))) < 0 then planned_officiant_LastName

Guess what though?  After adding this and changing the LIKE to use '% %' I still get the same error.  :-(
0
 
LVL 59

Expert Comment

by:Kevin Cross
ID: 39742580
Yes, it will not work simply to add a WHEN (at least not the one you think).  The challenge is that the scenario meets the criteria of having a number pattern.  The problem is when you take off the number, there is only one name left (i.e., no space): 'Judge 777-0001'.

Did you not like the way I showed here:
http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/Q_28326433.html#a39742525
0
 
LVL 15

Author Comment

by:David L. Hansen
ID: 39742618
Sorry Kevin, I totally missed that one.  So when I use that solution it looks like this:
--Handle poor data
CREATE TABLE #tblPlanNames (planned_officiant_name varChar(100), planned_officiant_LastName varChar(100))
BEGIN
  INSERT INTO #tblPlanNames 
  SELECT planned_officiant_name, ltrim(rtrim(replace(replace(replace(replace(planned_officiant_name,'  ',' '),'  ',' '),'801) ','801)'),'435) ','435)'))) planned_officiant_LastName FROM marriage WHERE planned_officiant_name is not NULL
END

SELECT DISTINCT SUBSTRING(planned_officiant_lastname, CHARINDEX(' ', planned_officiant_LastName)+1, CHARINDEX(' ', planned_officiant_LastName+' ', CHARINDEX(' ', planned_officiant_LastName)+1)-CHARINDEX(' ', planned_officiant_LastName)) lastName 
, planned_officiant_name 
from #tblPlanNames 
where planned_officiant_LastName is not NULL AND ltrim(rtrim(planned_officiant_LastName)) <> '' 
  --AND planned_officiant_LastName like '% %'order by lastName
DROP TABLE #tblPlanNames

Open in new window

and it seems to be working (I don't even have to use a LIKE anymore -- nice)...however, for something like "(B ) Lindsay Jones" instead of returning "Jones" it returns ")".  I need to better understand your condition.

The logic I'm after is this:   Return the whole name if no spaces exist in the name field, although if the name contains at least one space and the last word in that string contains numerics, then return the second-to-the-last word (whatever that might be).  Finally, if a space exists and the last word does not contain numerics then return that last word.
0
 
LVL 59

Expert Comment

by:Kevin Cross
ID: 39742635
It seems like you need something like I have here '%[^ ][ ][a-z]%' that determines that the space is before another word versus a special character or number.  You could then put this in a CASE that matches your intent.

CASE
  WHEN planned_officiant_LastName LIKE '%[^ ][ ][a-z]%' THEN ...
  WHEN planned_officiant_LastName LIKE '%[^ ][ ][^a-z]%' THEN ...
  ELSE planned_officiant_LastName
END

Although, in writing this out, I think using PATINDEX instead of CHARINDEX or LIKE may be better for you as what you are after is finding where you have two words connected by a space versus just the existence of a space.
0
 
LVL 15

Author Comment

by:David L. Hansen
ID: 39742638
Also, "(B) Adam Doe 111-302-7758" returns "Adam" instead of "Doe"....I think that's because we got rid of the Reverses.  Your solution returns the second word going from left-to-right...I am so sorry....when I say I need the second word (because a phone number is at the end) I mean the second word going from right-to-left.   Last Name is what I need.
0
 
LVL 15

Author Comment

by:David L. Hansen
ID: 39742648
Just wanted to say thanks in advance of the final solution....this is very helpful!
0
 
LVL 16

Expert Comment

by:Surendra Nath
ID: 39742649
can you post some data for the table planNames here preferably insert statements, so that we can test it out...

We need one for each of the cases above...
0
 
LVL 15

Author Comment

by:David L. Hansen
ID: 39742667
Ok:

('John Doe 867-5309'), ('Judge 777-0001'), ('Thomas Payne'), ('(Bishop) John D. Doe (111)222-4444'), ('(B) John Ethan Doe 111-444-5555')
0
 
LVL 15

Author Comment

by:David L. Hansen
ID: 39742670
I guess ('Doe') exists too and should just return 'Doe'
0
 
LVL 59

Assisted Solution

by:Kevin Cross
Kevin Cross earned 250 total points
ID: 39746133
Given the cases above, this seems to work.  Note: I used a common table expression (you can use a derived table if you cannot use CTEs) to simplify the final expression as it eliminates the need to repeat the split of numbers, prefixes, et cetera.

WITH cte AS (
SELECT planned_officiant_LastName
     , parsed_officiant_LastName = 
           SUBSTRING(planned_officiant_LastName, 
                     PATINDEX('%[ ][a-z]%', ' '+planned_officiant_LastName), 
                     LEN(planned_officiant_LastName) -
                         PATINDEX('%[ ][a-z]%', ' '+planned_officiant_LastName) -
                         PATINDEX('%[ ][a-z]%', ' '+REVERSE(planned_officiant_LastName)) +2)
FROM (
    VALUES('John Doe 867-5309'), 
          ('Judge 777-0001'), 
          ('Thomas Payne'), 
          ('(Bishop) John D. Doe (111)222-4444'), 
          ('(B) John Ethan Doe 111-444-5555'), 
          ('Doe')
) tblPlanNames(planned_officiant_LastName)
)
SELECT lastName = RIGHT(parsed_officiant_LastName, CHARINDEX(' ', REVERSE(parsed_officiant_LastName)+' ')-1)
FROM cte
;

Open in new window

0
 
LVL 69

Accepted Solution

by:
ScottPletcher earned 250 total points
ID: 39746958
Please try this instead:

SELECT DISTINCT ca2.LastName
FROM (
    VALUES('John Doe 867-5309'),('Judge Doe 777-0001'),('Thomas Payne'),(' Not_Valid ')
) t(planned_officiant_LastName)
CROSS APPLY (
    SELECT RTRIM(LTRIM(CASE WHEN PATINDEX('%[0-9]%', t.planned_officiant_LastName) > 0
        THEN LEFT(t.planned_officiant_LastName, PATINDEX('%[0-9]%', t.planned_officiant_LastName) - 1)
        ELSE t.planned_officiant_LastName END)) AS LastName_Minus_Numbers
) AS ca1
CROSS APPLY (
    SELECT LTRIM(RTRIM(RIGHT(ca1.LastName_Minus_Numbers, CHARINDEX(' ', REVERSE(' ' + ca1.LastName_Minus_Numbers))))) AS LastName
) AS ca2
WHERE
    planned_officiant_LastName > ''
--ORDER BY LastName
0
 
LVL 15

Author Closing Comment

by:David L. Hansen
ID: 39747164
That did it...Thanks guys...appreciate it.
0

Featured Post

Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

Join & Write a Comment

Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

758 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

23 Experts available now in Live!

Get 1:1 Help Now