Solved

How to find the last word in a string in SQLITE

Posted on 2016-10-29
14
59 Views
Last Modified: 2016-11-01
I have an SQLite database with a Buyers_table which has a column:  Buyer.name.
The Buyer.name value has been entered as John Smith, John Q. Smith, John and Betty Smith, etc.
...basically many different "name strings" but with the last_name always the last word, with a preceding " " blank space.

I need to pull out just the lastName, the last word in the string.  I can do this successfully  ONLY WHEN the Buyer.name field is simply two words such as John Smith.
such as:

SELECT substr(Buyer.name, instr(Buyer.name, ' ') + 1) AS last_name
FROM Buyer;

Can anyone help me see how to extract just the last word, the last_name?  Yes I can do this in SQL, but this project, database, is SQLite so things like CHARINDEX() cannot be used.

Thanks so much,

Dan K
0
Comment
Question by:Dan Kaufman
  • 6
  • 3
  • 3
  • +1
14 Comments
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
work with a derived table usig a case expression, e.g.

select
      case when pos1 > 0 then substr(name, pos1 + 1)
           else name
      end
from (
      SELECT instr(Buyer.name, ' ') pos1, name
      FROM Buyer
     )
;

you can then branch according to the pos1 value returned by the derived table.
0
 

Author Comment

by:Dan Kaufman
Comment Utility
Thank you PortletPaul...an excellent suggestion to use a derived table...I was stuck on the path of "simple" string manipulation.

However, it does appear to me at least that I'll need multiple nested loops to break down the reality of the "name" field, the reality of which is the folks entering the data to the Buyer_table can and do get creative on what a person's name is.  An example of the data I'm dealing with:
John Smith.
John G. Smith,
John and Betty Smith,
John G. and Betty Smith,
Mr. and Mrs. John G. Smith
etc.

I'm dealing with "names" with 1 space, 2 spaces, 3 spaces, 4 spaces, and even 5 spaces.   But all I need is the last word in the string.

which is why I have been trying to "simply" identify either the first space from the right instead of the left, and/or the last word in the "name string".

My initial query works just fine as long as the "name" is simply a first_name and a last_name.  It gives me very similar results to your derived table routine.  This basically just splits the string at the first, from the left, space.  However this does not directly give me the position number of the first space as your derived table routine does.

SELECT
Buyer.name,
substr(Buyer.name, 1, instr(Buyer.name, ' ') - 1) AS first_name,
substr(Buyer.name,    instr(Buyer.name, ' ') + 1) AS last_name
FROM
Buyer
;

The short of it is I'm not sure I am even capable of writing a branched-nested-derived table CASE expression that will deal with names having 3, 4, or even 5 spaces in them.  So...I'm hoping to find a "simple solution" that identifies either the last space from the left OR first space from the right, from which I can pull the last word in the string.

Thanks again!
0
 
LVL 2

Expert Comment

by:Jimy
Comment Utility
Hi Dan,

Maybe you can try REVERSE function

Reverse(substr(reverse(Buyer.name), 1 ,   instr(reverse(Buyer.name), ' ') ) ) AS last_name

Or in combination with len(Buyer.name) - instr(reverse(Buyer.name), ' ') To get last space position.

Regards,
Jimy
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
If reverse existed in SQLite it would work nicely.You can create your own functions in SQLite but I don't know which ones are good or bad as I have never really used SQLite in anger.

A more generic approach without a reverse function would be to use a technique derived from "splitstring" approaches. Dynamically build-up a "tally table" (a table of integers) using CTEs, then expand the data using a cross join, and through that find the maximum value of a space in that string. GROUP BY reduces the data back to single rows, and we can now locate the last name using that max(position of space in original string).

nb the example below creates 100 rows in tally, if you need more see note at base
;WITH
Digits AS (
          SELECT 0 AS digit UNION ALL 
          SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
          SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
          )
, Tally AS (
          SELECT tens.digit * 10 + ones.digit AS number
          FROM Digits ones
          CROSS JOIN Digits tens
          )
SELECT
        substr(name, 1, pos1 - 1) AS first_name
      , substr(name, pos1 + 1)    AS last_name
FROM (SELECT
                buyer.name
              , MAX(tally.number) AS pos1
        FROM buyer
        CROSS JOIN tally
        WHERE substr(buyer.name, tally.number, 1) = ' '
        GROUP BY
                buyer.name
     ) AS d
ORDER BY last_name, first_name

Open in new window


|           first_name | last_name |
|----------------------|-----------|
| Mr. and Mrs. John G. |     Smith |
|              John G. |    Smith, |
|    John G. and Betty |    Smith, |
|       John and Betty |    Smith, |
|                 John |    Smith. |

[SQL Fiddle][1]

CREATE TABLE Buyer
    ("name" TEXT(100))
;

INSERT INTO Buyer
    ("name")
VALUES
     ('John Smith.')
    ,('John G. Smith,')
    ,('John and Betty Smith,')
    ,('John G. and Betty Smith,')
    ,('Mr. and Mrs. John G. Smith')
;

Open in new window


see http://sqlfiddle.com/#!7/724c5/3

-- this is sql server syntax, but hopefully it conveys the required message
      SELECT 
              [1s].digit 
            + [10s].digit * 10
            + [100s].digit * 100  /* add more like this as needed */
            AS num
      FROM cteDigits [1s]
      CROSS JOIN cteDigits [10s]
      CROSS JOIN cteDigits [100s] /* add more like this as needed */

Open in new window

0
 
LVL 2

Assisted Solution

by:Jimy
Jimy earned 250 total points
Comment Utility
This remove all characters from right, up to first space and then take this first part from whole string, so there will left only last name

select replace(Buyer.name, rtrim(Buyer.name, '1234567890qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM!@#$%^&*()_+-=`~[]\/{}|;:,.<>?'), '') AS last_name
FROM Buyer;

Open in new window

1
 

Author Comment

by:Dan Kaufman
Comment Utility
Many thanks! to PortletPaul and Jimy, and Expert Exchange "at large", for your insight and expert help.

Jimy's solution, essentially "one line of code" was the simplest and most elegant for my purposes. Though I didn't mention it in my Question, the initial need and use of the solution "to pull the last word from a text string" was in an ORDER BY clause of the Buyer_table VIEW.  As I pointed out in my question, the folks entering the data to the Buyer Name field would do so in many creative ways...but as I think is most common we tend to think of a list of names as alphabetized by the person's last name.  This is why I was so frustrated that I couldn't write what seemed (in my little pea-brain) like simple code for a common need.  But Experts Exchange to the rescue!
0
 
LVL 31

Accepted Solution

by:
awking00 earned 250 total points
Comment Utility
select substr(name,length(rtrim(lower(name),'abcdefghijklmnopqrstuvwxyz'))+1)
from buyer;
0
Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

 

Author Comment

by:Dan Kaufman
Comment Utility
Excellent awking00, ..."There's more than one way to skin a cat."

The only edit or addition I can see to be made would be to add the "-" dash character to the rtrim character list.  This will cover, include, those last_names that are hyphenated, such as Fitz-Patrick.

SELECT substr(name,length(rtrim(lower(name),'abcdefghijklmnopqrstuvwxyz-'))+1) AS last_name
from Buyer;

Thanks so much.  I've learned a lot.
0
 

Author Closing Comment

by:Dan Kaufman
Comment Utility
Having a "Name" field/column in a database seems almost as common as a sunny day.  And having that Name added to the database_table as John Smith, John G. Smith, John and Betty Smith, etc. seems almost as common as a cloudy day.   AND...on a practical basis in using the Name_list, it would seem to me  that having the list in alphabetic order by a person's last_name would be pretty common too.

However for me at least it became a challenge with my feeble SQLite skills writing a simple ORDER BY clause to have my BUYERS VIEW in A-to-Z order by the Buyer's last names. The range of solutions was surprising to me too.  In the end though Albert Einstein was proven right again: “Make things as simple as possible, but not simpler.”

Thanks again first and foremost to Expert Exchange and in particular awking00 and Jimy and PortletPaul for your expert help.  I couldn't have done without you.
0
 
LVL 31

Expert Comment

by:awking00
Comment Utility
To be sure, having names like your examples in one field is not very safe. I actually like to use five fields to store name, qualifier (e.g. Mr. Mrs, etc.), firstname, middle, lastname, extension (e.g.Jr., III, etc.). It prevents the issue of having names like Otto von Bismarck or Jill St. John retrieve false information.
0
 

Author Comment

by:Dan Kaufman
Comment Utility
Excellent point awking00.

I didn't design the original "intake form" with the singular name field, but I will have considerable influence in improving it.  I can see by your two additional name examples that even "our" solution above would have problems since it is essentially keyed on the first-space-from-the-right to define what is the "last name".  We don't have too many von Bismarks in the US Southeast so it might not be a problem, but it'd be an interesting conumdrum to code around vs. eliminating the problem to begin with separate intake fields.

Keep up the good work!
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
Mc. Donald .....

Any Scottish descendents in the Southeast?
0
 
LVL 31

Expert Comment

by:awking00
Comment Utility
There may not be too many von Bismarcks, but I know there are a bunch of van Burens in Georgia :-)
0
 

Author Comment

by:Dan Kaufman
Comment Utility
Awright...PortletPaul and awking00,

You got me there.  If any Mc. Donalds or van Burens show up as Buyers I'll whip up a CASE loop just for them.  Until then I'll keep pushing for a rewrite of the intake form, name_table and whatever else trouble y'all got me in.

Keep up the good work!
1

Featured Post

Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

Join & Write a Comment

'Between' is such a common word we rarely think about it but in SQL it has a very specific definition we should be aware of. While most database vendors will have their own unique phrases to describe it (see references at end) the concept in common …
Confronted with some SQL you don't know can be a daunting task. It can be even more daunting if that SQL carries some of the old secret codes used in the Ye Olde query syntax, such as: (+)     as used in Oracle;     *=     =*    as used in Sybase …
It is a freely distributed piece of software for such tasks as photo retouching, image composition and image authoring. It works on many operating systems, in many languages.
This video shows how to remove a single email address from the Outlook 2010 Auto Suggestion memory. NOTE: For Outlook 2016 and 2013 perform the exact same steps. Open a new email: Click the New email button in Outlook. Start typing the address: …

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now