Link to home
Start Free TrialLog in
Avatar of enrique_aeo
enrique_aeo

asked on

separate column

Hi experts

i have this column: [Nombre_empleado]
Segundo Alfredo Mendoza Ponce

i need separate in three columns
col1: Segundo
col2: Alfredo
col3: Mendoza Ponce

the data is:
Segundo Alfredo Mendoza Ponce
Nelson Eduardo Alvarado Gomez
Ana Lucia Cordoba Gomez
Avatar of Jim Horn
Jim Horn
Flag of United States of America image

Curiosity overwhelms me ... What would be the logic to know to keep 'Mendoza Ponce' in one column vs. the separate 'Segundo' and 'Alfredo' in separate columns?
Avatar of enrique_aeo
enrique_aeo

ASKER

col3: Mendoza Ponce = lastname
Okay, but this is a common problem with coding where first and last names coexist in the same column, and some names are two words.  How would SQL Server know which is which?

I have friends with these names ... Can you program T-SQL to tell the difference between first name and last name every time?
  • Hans Jurgen Von Heteren-Freeze
  • Clarence Der Schleickmeister
  • Jean Claude Van Damme
  • Rush Hudson Limbaugh III
  • Wendy Sue Pokipsee
  • Bill S. Preston, Esquire
  • Ted Theodore Logan
  • Buddy Holly  (both can be an acceptable first name, so if in a list it'll be tough to know how to assign it correctly)

I bring this up because many times the answer is 'You can't code SQL Server well enough to know every time, which means you're going to have make some on-the-fly assumptions.

Setting that aside, check out T-SQL:  Normalized data to a single comma delineated string and back, scroll down to the 'Comma delineated string to normalized data' section, which after you substitute space for comma will at least get you to the point where all names are in separate columns.
Also, if your company is willing to throw a little money at this problem, I know there are third-party services which will have a big list of common first and last names, which will allow you to run your dataset through the service and get you about 99.??% of the way there.

I don't remember the names of them though, sorry, but I'm guessing they can be found via Google using a search like 'Data Quality Names'
The data comes from SAP and according to them the names and surnames have it in a single field
>according to them the names and surnames have it in a single field
In that case you have my permission to locate whomever designed that so that first and last names are in the same column, thus giving you this problem, and give them a swift slap in the back of the head.

>The data comes from SAP
Surprising.  They should know better.
i mystake

i have this column: [Nombre_empleado]
 Segundo Alfredo Mendoza Ponce

 i need separate in three columns
 col1: Mendoza Ponce
 col2: Segundo
 col3: Alfredo
please help
Same answer.  Also I'm going to go with the correct answer of 'You can't do that' here, other than the >80% solution I posted above with the article link, and the 99.??% solution of the third-party service.
this article
https://www.experts-exchange.com/questions/28983290/separate-column.html
It does not help me because I have the data separated by a space
I am sure that the first 2 values should go in the first column

[Nombre_empleado]
Bardales Apolo Cyndi Yolanda

  i need separate in three columns
  col1: Bardales Apolo
  col2: Cyndi
  col3: Yolanda
for example
Bardales Apolo Cyndi Yolanda
Alvarez Calderon Sattui Gonzalo
Martinez Ore Gabriel Daniel
Lopez Arroyo Eli Egberto

then col1:
Bardales Apolo
Alvarez Calderon
Martinez Ore
Lopez Arroyo
@Author -

Jim Sir is right.

How we can decide whether we need pick first word or 2 words for the first Name. ? In T-SQL we have build logic based on the requirement.Here we dont know how many words we need to pick for first name
That only works for Latino names. Is typical for a Latino to have 4 names (2 first names and 2 last names) and in some occasions even more.
So Enrique is it possible to have a name like this: Antonio Eduardo Banderas Gil y Gil ? If affirmative the result should be like:
col1: Banderas Gil y Gil
col2: Antonio
col3: Eduardo


Is this assumption right?
I am 100% sure that the first 2 values should go in the first column

 [Nombre_empleado]
 Bardales Apolo Cyndi Yolanda

   i need separate in three columns
   col1: Bardales Apolo
   col2: Cyndi
   col3: Yolanda
What about last 2 column, how many words should go in col2 and col3 ?
Enrique, your examples always shows 4 names but what we want to know is how to deal with exceptions (for example people with 3 or 5 names).
always shows 4 names
I am 100% sure
Can you give us 2 rows and the expected output.
I've used CROSS APPLYs to remove the "find the spaces" logic from the main SELECT, for less "clutter" in the main SELECT.

If/when you run into exceptions, let me know and we can adjust the code as needed then.


SELECT Nombre_empleado,
    LEFT(Nombre_empleado, col1_end - 1) AS col1,
    SUBSTRING(Nombre_empleado, col1_end + 1, col3_start - col1_end - 2) AS col2,
    SUBSTRING(Nombre_empleado, col3_start, 100) AS col3
FROM (
    VALUES('Segundo Alfredo Mendoza Ponce'),('Nelson Eduardo Alvarado Gomez'),('Ana Lucia Cordoba Gomez'),('Segundo Alfredo Mendoza Ponce')
) AS test_data(Nombre_empleado)
CROSS APPLY (
    SELECT CHARINDEX(' ', Nombre_empleado, CHARINDEX(' ', Nombre_empleado) + 1) AS col1_end
) AS ca1
CROSS APPLY (
    SELECT CHARINDEX(' ', Nombre_empleado, col1_end + 1) + 1 AS col3_start
) AS ca2
this is the results

Nombre_empleado                             col1                             col2                     col3
Segundo Alfredo Mendoza Ponce      Segundo Alfredo      Mendoza      Ponce
Nelson Eduardo Alvarado Gomez      Nelson Eduardo      Alvarado      Gomez
Ana Lucia Cordoba Gomez      Ana Lucia      Cordoba      Gomez
Segundo Alfredo Mendoza Ponce      Segundo Alfredo      Mendoza      Ponce

this line be replaced by the table?
VALUES('Segundo Alfredo Mendoza Ponce'),('Nelson Eduardo Alvarado Gomez'),('Ana Lucia Cordoba Gomez'),('Segundo Alfredo Mendoza Ponce')
Yes.  That was just to show the results on the sample data you posted.

SELECT ...
FROM dbo.your_table_name
CROSS APPLY ...
SELECT Nombre_empleado,
     LEFT(Nombre_empleado, col1_end - 1) AS col1,
     SUBSTRING(Nombre_empleado, col1_end + 1, col3_start - col1_end - 2) AS col2,
     SUBSTRING(Nombre_empleado, col3_start, 100) AS col3
 FROM [dbo].[empleados]
 CROSS APPLY (
     SELECT CHARINDEX(' ', Nombre_empleado, CHARINDEX(' ', Nombre_empleado) + 1) AS col1_end
 ) AS ca1
 CROSS APPLY (
     SELECT CHARINDEX(' ', Nombre_empleado, col1_end + 1) + 1 AS col3_start
 ) AS ca2

i have this error
Msg 537, Level 16, State 2, Line 1
Invalid length parameter passed to the LEFT or SUBSTRING function.
ASKER CERTIFIED SOLUTION
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
it's works
SELECT Nombre_empleado,
     LEFT(Nombre_empleado, col1_end - 1) AS col1,
     CASE WHEN col3_start = 1 THEN '' ELSE SUBSTRING(Nombre_empleado, col1_end + 1, col3_start - col1_end - 2) END AS col2,
     CASE WHEN col3_start = 1 THEN '' ELSE SUBSTRING(Nombre_empleado, col3_start, 100) END AS col3
 FROM [dbo].[empleados]
 CROSS APPLY (
     SELECT CHARINDEX(' ', Nombre_empleado + ' ', CHARINDEX(' ', Nombre_empleado) + 1) AS col1_end
 ) AS ca1
 CROSS APPLY (
     SELECT CHARINDEX(' ', Nombre_empleado + ' ', col1_end + 1) + 1 AS col3_start
 ) AS ca2