Link to home
Start Free TrialLog in
Avatar of Dianna Chua
Dianna Chua

asked on

Parsing data from 1 field to multiple fields in Oracle Database

I have a column containing data as below

data
aaaaa.2
bbbbb.4/5
aaaaa.4/6, ccccc.6/10
aaaaa.1/2/10, bbbbb.7

I need to parse the data into separate columns per below

data1         data2        data3        data4      
aaaaa.2
bbbbb.4    bbbbb.5
aaaaa.4     aaaaa.6     ccccc.6       ccccc.10
aaaaa.1     aaaaa.2     aaaaa.10   bbbbb.7

Can anyone help?
Avatar of Sean Stuber
Sean Stuber

SELECT MAX(CASE WHEN rn = 1 THEN s3 END) data1,
         MAX(CASE WHEN rn = 2 THEN s3 END) data2,
         MAX(CASE WHEN rn = 3 THEN s3 END) data3,
         MAX(CASE WHEN rn = 4 THEN s3 END) data4
    FROM (SELECT data, s2 || x s3, ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m) rn
            FROM (SELECT data,
                         s,
                         n,
                         m,
                         REGEXP_SUBSTR(s, '^.+\.')
                             s2,
                         TO_NUMBER(REGEXP_SUBSTR(s,
                                                 '[0-9]+',
                                                 INSTR(s, '.'),
                                                 m))
                             x
                    FROM (SELECT data,
                                 n,
                                 TRIM(REGEXP_SUBSTR(data,
                                                    '[^,]+',
                                                    1,
                                                    n))
                                     s
                            FROM yourtable,
                                 (    SELECT LEVEL n
                                        FROM DUAL
                                  CONNECT BY LEVEL <= 4)),
                         (    SELECT LEVEL m
                                FROM DUAL
                          CONNECT BY LEVEL <= 4)
                   WHERE s IS NOT NULL)
           WHERE x IS NOT NULL)
GROUP BY data;

Open in new window


or, using PIVOT syntax

SELECT data1,
       data2,
       data3,
       data4
  FROM (SELECT data,
               s,
               x,
               ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m) rn
          FROM (SELECT data,
                       n,
                       m,
                       REGEXP_SUBSTR(s, '^.+\.')
                           s,
                       TO_NUMBER(REGEXP_SUBSTR(s,
                                               '[0-9]+',
                                               INSTR(s, '.'),
                                               m))
                           x
                  FROM (SELECT data,
                               n,
                               TRIM(REGEXP_SUBSTR(data,
                                                  '[^,]+',
                                                  1,
                                                  n))
                                   s
                          FROM yourtable,
                               (    SELECT LEVEL n
                                      FROM DUAL
                                CONNECT BY LEVEL <= 4)),
                       (    SELECT LEVEL m
                              FROM DUAL
                        CONNECT BY LEVEL <= 4)
                 WHERE s IS NOT NULL)
         WHERE x IS NOT NULL)
       PIVOT (MAX(s || x) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4))

Open in new window



you didn't specify so I made the following assumptions
You are using 11gR2 or higher
Your main field delimiters will always be commas
Your sub-field delimiters will always be forward slashes
Your numeric portion will always exist and be delimited by a period from the text portion.
You will never have more than 4 main (text) fields
You will never have more then 4 sub-fields (numeric) within a main field
You will never have more than 4 total data elements (columns) after all subfields have been extracted from all main fields.

If any of these assumptions are not correct, please provide the correct rule to clarify, plus sample data and expected results
Avatar of Dianna Chua

ASKER

Thanks for the quick response.

I am using Oracle 11.2.0.3.0.
The rest of the assumptions are correct.

But I get the following error when I execute the command
ORA-01428: argument '0' is out of range

Please advise.  Thanks,
That probably means one of the assumptions is NOT correct.
Most likely this one
   ---- Your numeric portion will always exist and be delimited by a period from the text portion.

Which would cause this INSTR(s, '.')  to return 0, which would make the regexp function fail with the error you noted.

Can you provide some sample data where the error occurs as well as what you expect the result to be
I'm just taking a guess at what your data might look like, try this...

SELECT data1,
       data2,
       data3,
       data4
  FROM (SELECT data,
                  REGEXP_SUBSTR(s, '[^.]+\.?')
               || REGEXP_SUBSTR(s,
                                '[0-9]+',
                                GREATEST(INSTR(s, '.'), 1),
                                m.COLUMN_VALUE)
                   s,
               ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m.COLUMN_VALUE)
                   rn
          FROM (SELECT data,
                       n.COLUMN_VALUE
                           n,
                       TRIM(REGEXP_SUBSTR(data,
                                          '[^,]+',
                                          1,
                                          n.COLUMN_VALUE))
                           s
                  FROM yourtable,
                       TABLE(    SELECT COLLECT(LEVEL)
                                   FROM DUAL
                             CONNECT BY LEVEL <= REGEXP_COUNT(data, '[^,]+')) n),
               TABLE(    SELECT COLLECT(LEVEL)
                           FROM DUAL
                     CONNECT BY LEVEL <= REGEXP_COUNT(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1))) m)
       PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4))

Open in new window


same assumptions as earlier except I've removed this one:
--- Your numeric portion will always exist and be delimited by a period from the text portion.

So data like this 'abcd,xyz.1/2,mnop'
would be parsed into 4 columns
abcd      xyz.1      xyz.2      mnop

Also, I removed the arbitrary limits of 4 on the text and numeric parsing, but I did keep the total column limit at 4 because the columns must be defined in the sql, they can't be derived on the fly.

But, if you want more columns all you would have to change are these 2 sections to add more columns, the rest of the sql will generate the "rn" counter of each field and subfield regardless of size.

SELECT data1, data2, data3, data4

PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4))




Here is the sample data I used in my tests of the query above

WITH
    yourtable(data)
    AS
        (SELECT 'aaaaa.2' FROM DUAL
         UNION ALL
         SELECT 'bbbbb.4/5' FROM DUAL
         UNION ALL
         SELECT 'aaaaa.4/6, ccccc.6/10' FROM DUAL
         UNION ALL
         SELECT 'aaaaa.1/2/10, bbbbb.7' FROM DUAL
         UNION ALL
         SELECT 'ddddd.1/2/3/4' FROM DUAL
         UNION ALL
         SELECT 'aaa.1,bbb.2,ccc.3,ddd.4' FROM DUAL
         UNION ALL
         SELECT 'abcd,xyz.1/2,mnop' FROM DUAL
         UNION ALL
         SELECT NULL
           FROM DUAL)

Open in new window

Oops, you are right.  I didn't understand your question earlier but you are correct, the numeric portion does not always exist.

The query runs now, but when the data is as below
abcd,efgh

the result i get is
abcdabc      efghefgh

Please help. Thanks.
I get results like this from 'abcd,efgh'

   abcd      efgh


I got these results using the query from post 42418913

did you make any changes to the query?
Hi

No changes made to the query,  I just replaced yourtable with my query for extracting the data.
Hi

I encountered the following error when I included the query in a procedure.

PL/SQL: ORA-22905: cannot access rows from a non-nested table item

with the following line highlighted

 TABLE(    SELECT COLLECT(LEVEL)
The query works when I tested it.  Please post your entire query.  So we can compare what might be different.  Even something that appears minor might have caused the problem.
The query is per below.  It works when I execute it in PLSQL with line 2 marked off, but when I include it into my procedure, the procedure fails to compile with the ORA-22905 error


    SELECT data1, data2, data3, data4
      into m_cartlot1, m_cartlot2, m_cartlot3, m_cartlot4
    FROM (SELECT data, REGEXP_SUBSTR(s, '[^.]+\.?') || REGEXP_SUBSTR(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1), m.COLUMN_VALUE) s,
    ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m.COLUMN_VALUE) rn
    FROM (SELECT data, n.COLUMN_VALUE n, TRIM(REGEXP_SUBSTR(data, '[^,]+', 1, n.COLUMN_VALUE)) s
      FROM (select charactervalue as data from tres_rawdata_lot
      where testopno='PAC02.02'
      and itemprompt = 'Welded Cartridge Lot #'
      and lotid ='964291.2'),
        TABLE(    SELECT COLLECT(LEVEL)
        FROM DUAL
        CONNECT BY LEVEL <= REGEXP_COUNT(data, '[^,]+')) n),
        TABLE(    SELECT COLLECT(LEVEL)
        FROM DUAL
        CONNECT BY LEVEL <= REGEXP_COUNT(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1))) m)
        PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4));
Hi

Is the information provided sufficient? Do you need me to provide the full procedure?

Thanks.
ASKER CERTIFIED SOLUTION
Avatar of Sean Stuber
Sean Stuber

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thanks!  The query works fine within the procedure now.
great, glad to hear it

please remember to close the question then
Problem solved.