Link to home
Start Free TrialLog in
Avatar of Dan Kaufman
Dan KaufmanFlag for United States of America

asked on

How to find the last word in a string in SQLITE

I have an SQLite database with a Buyers_table which has a column:  Buyer.name.
The Buyer.name value has been entered as John Smith, John Q. Smith, John and Betty Smith, etc.
...basically many different "name strings" but with the last_name always the last word, with a preceding " " blank space.

I need to pull out just the lastName, the last word in the string.  I can do this successfully  ONLY WHEN the Buyer.name field is simply two words such as John Smith.
such as:

SELECT substr(Buyer.name, instr(Buyer.name, ' ') + 1) AS last_name
FROM Buyer;

Can anyone help me see how to extract just the last word, the last_name?  Yes I can do this in SQL, but this project, database, is SQLite so things like CHARINDEX() cannot be used.

Thanks so much,

Dan K
Avatar of PortletPaul
PortletPaul
Flag of Australia image

work with a derived table usig a case expression, e.g.

select
      case when pos1 > 0 then substr(name, pos1 + 1)
           else name
      end
from (
      SELECT instr(Buyer.name, ' ') pos1, name
      FROM Buyer
     )
;

you can then branch according to the pos1 value returned by the derived table.
Avatar of Dan Kaufman

ASKER

Thank you PortletPaul...an excellent suggestion to use a derived table...I was stuck on the path of "simple" string manipulation.

However, it does appear to me at least that I'll need multiple nested loops to break down the reality of the "name" field, the reality of which is the folks entering the data to the Buyer_table can and do get creative on what a person's name is.  An example of the data I'm dealing with:
John Smith.
John G. Smith,
John and Betty Smith,
John G. and Betty Smith,
Mr. and Mrs. John G. Smith
etc.

I'm dealing with "names" with 1 space, 2 spaces, 3 spaces, 4 spaces, and even 5 spaces.   But all I need is the last word in the string.

which is why I have been trying to "simply" identify either the first space from the right instead of the left, and/or the last word in the "name string".

My initial query works just fine as long as the "name" is simply a first_name and a last_name.  It gives me very similar results to your derived table routine.  This basically just splits the string at the first, from the left, space.  However this does not directly give me the position number of the first space as your derived table routine does.

SELECT
Buyer.name,
substr(Buyer.name, 1, instr(Buyer.name, ' ') - 1) AS first_name,
substr(Buyer.name,    instr(Buyer.name, ' ') + 1) AS last_name
FROM
Buyer
;

The short of it is I'm not sure I am even capable of writing a branched-nested-derived table CASE expression that will deal with names having 3, 4, or even 5 spaces in them.  So...I'm hoping to find a "simple solution" that identifies either the last space from the left OR first space from the right, from which I can pull the last word in the string.

Thanks again!
Avatar of Jimy
Jimy

Hi Dan,

Maybe you can try REVERSE function

Reverse(substr(reverse(Buyer.name), 1 ,   instr(reverse(Buyer.name), ' ') ) ) AS last_name

Or in combination with len(Buyer.name) - instr(reverse(Buyer.name), ' ') To get last space position.

Regards,
Jimy
If reverse existed in SQLite it would work nicely.You can create your own functions in SQLite but I don't know which ones are good or bad as I have never really used SQLite in anger.

A more generic approach without a reverse function would be to use a technique derived from "splitstring" approaches. Dynamically build-up a "tally table" (a table of integers) using CTEs, then expand the data using a cross join, and through that find the maximum value of a space in that string. GROUP BY reduces the data back to single rows, and we can now locate the last name using that max(position of space in original string).

nb the example below creates 100 rows in tally, if you need more see note at base
;WITH
Digits AS (
          SELECT 0 AS digit UNION ALL 
          SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
          SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
          )
, Tally AS (
          SELECT tens.digit * 10 + ones.digit AS number
          FROM Digits ones
          CROSS JOIN Digits tens
          )
SELECT
        substr(name, 1, pos1 - 1) AS first_name
      , substr(name, pos1 + 1)    AS last_name
FROM (SELECT
                buyer.name
              , MAX(tally.number) AS pos1
        FROM buyer
        CROSS JOIN tally
        WHERE substr(buyer.name, tally.number, 1) = ' '
        GROUP BY
                buyer.name
     ) AS d
ORDER BY last_name, first_name

Open in new window


|           first_name | last_name |
|----------------------|-----------|
| Mr. and Mrs. John G. |     Smith |
|              John G. |    Smith, |
|    John G. and Betty |    Smith, |
|       John and Betty |    Smith, |
|                 John |    Smith. |

[SQL Fiddle][1]

CREATE TABLE Buyer
    ("name" TEXT(100))
;

INSERT INTO Buyer
    ("name")
VALUES
     ('John Smith.')
    ,('John G. Smith,')
    ,('John and Betty Smith,')
    ,('John G. and Betty Smith,')
    ,('Mr. and Mrs. John G. Smith')
;

Open in new window


see http://sqlfiddle.com/#!7/724c5/3

-- this is sql server syntax, but hopefully it conveys the required message
      SELECT 
              [1s].digit 
            + [10s].digit * 10
            + [100s].digit * 100  /* add more like this as needed */
            AS num
      FROM cteDigits [1s]
      CROSS JOIN cteDigits [10s]
      CROSS JOIN cteDigits [100s] /* add more like this as needed */

Open in new window

SOLUTION
Avatar of Jimy
Jimy

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Many thanks! to PortletPaul and Jimy, and Expert Exchange "at large", for your insight and expert help.

Jimy's solution, essentially "one line of code" was the simplest and most elegant for my purposes. Though I didn't mention it in my Question, the initial need and use of the solution "to pull the last word from a text string" was in an ORDER BY clause of the Buyer_table VIEW.  As I pointed out in my question, the folks entering the data to the Buyer Name field would do so in many creative ways...but as I think is most common we tend to think of a list of names as alphabetized by the person's last name.  This is why I was so frustrated that I couldn't write what seemed (in my little pea-brain) like simple code for a common need.  But Experts Exchange to the rescue!
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Excellent awking00, ..."There's more than one way to skin a cat."

The only edit or addition I can see to be made would be to add the "-" dash character to the rtrim character list.  This will cover, include, those last_names that are hyphenated, such as Fitz-Patrick.

SELECT substr(name,length(rtrim(lower(name),'abcdefghijklmnopqrstuvwxyz-'))+1) AS last_name
from Buyer;

Thanks so much.  I've learned a lot.
Having a "Name" field/column in a database seems almost as common as a sunny day.  And having that Name added to the database_table as John Smith, John G. Smith, John and Betty Smith, etc. seems almost as common as a cloudy day.   AND...on a practical basis in using the Name_list, it would seem to me  that having the list in alphabetic order by a person's last_name would be pretty common too.

However for me at least it became a challenge with my feeble SQLite skills writing a simple ORDER BY clause to have my BUYERS VIEW in A-to-Z order by the Buyer's last names. The range of solutions was surprising to me too.  In the end though Albert Einstein was proven right again: “Make things as simple as possible, but not simpler.”

Thanks again first and foremost to Expert Exchange and in particular awking00 and Jimy and PortletPaul for your expert help.  I couldn't have done without you.
To be sure, having names like your examples in one field is not very safe. I actually like to use five fields to store name, qualifier (e.g. Mr. Mrs, etc.), firstname, middle, lastname, extension (e.g.Jr., III, etc.). It prevents the issue of having names like Otto von Bismarck or Jill St. John retrieve false information.
Excellent point awking00.

I didn't design the original "intake form" with the singular name field, but I will have considerable influence in improving it.  I can see by your two additional name examples that even "our" solution above would have problems since it is essentially keyed on the first-space-from-the-right to define what is the "last name".  We don't have too many von Bismarks in the US Southeast so it might not be a problem, but it'd be an interesting conumdrum to code around vs. eliminating the problem to begin with separate intake fields.

Keep up the good work!
Mc. Donald .....

Any Scottish descendents in the Southeast?
There may not be too many von Bismarcks, but I know there are a bunch of van Burens in Georgia :-)
Awright...PortletPaul and awking00,

You got me there.  If any Mc. Donalds or van Burens show up as Buyers I'll whip up a CASE loop just for them.  Until then I'll keep pushing for a rewrite of the intake form, name_table and whatever else trouble y'all got me in.

Keep up the good work!