Parsing data from 1 field to multiple fields in Oracle Database

I have a column containing data as below

data
aaaaa.2
bbbbb.4/5
aaaaa.4/6, ccccc.6/10
aaaaa.1/2/10, bbbbb.7

I need to parse the data into separate columns per below

data1         data2        data3        data4      
aaaaa.2
bbbbb.4    bbbbb.5
aaaaa.4     aaaaa.6     ccccc.6       ccccc.10
aaaaa.1     aaaaa.2     aaaaa.10   bbbbb.7

Can anyone help?
Dianna ChuaAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

sdstuberCommented:
SELECT MAX(CASE WHEN rn = 1 THEN s3 END) data1,
         MAX(CASE WHEN rn = 2 THEN s3 END) data2,
         MAX(CASE WHEN rn = 3 THEN s3 END) data3,
         MAX(CASE WHEN rn = 4 THEN s3 END) data4
    FROM (SELECT data, s2 || x s3, ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m) rn
            FROM (SELECT data,
                         s,
                         n,
                         m,
                         REGEXP_SUBSTR(s, '^.+\.')
                             s2,
                         TO_NUMBER(REGEXP_SUBSTR(s,
                                                 '[0-9]+',
                                                 INSTR(s, '.'),
                                                 m))
                             x
                    FROM (SELECT data,
                                 n,
                                 TRIM(REGEXP_SUBSTR(data,
                                                    '[^,]+',
                                                    1,
                                                    n))
                                     s
                            FROM yourtable,
                                 (    SELECT LEVEL n
                                        FROM DUAL
                                  CONNECT BY LEVEL <= 4)),
                         (    SELECT LEVEL m
                                FROM DUAL
                          CONNECT BY LEVEL <= 4)
                   WHERE s IS NOT NULL)
           WHERE x IS NOT NULL)
GROUP BY data;

Open in new window


or, using PIVOT syntax

SELECT data1,
       data2,
       data3,
       data4
  FROM (SELECT data,
               s,
               x,
               ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m) rn
          FROM (SELECT data,
                       n,
                       m,
                       REGEXP_SUBSTR(s, '^.+\.')
                           s,
                       TO_NUMBER(REGEXP_SUBSTR(s,
                                               '[0-9]+',
                                               INSTR(s, '.'),
                                               m))
                           x
                  FROM (SELECT data,
                               n,
                               TRIM(REGEXP_SUBSTR(data,
                                                  '[^,]+',
                                                  1,
                                                  n))
                                   s
                          FROM yourtable,
                               (    SELECT LEVEL n
                                      FROM DUAL
                                CONNECT BY LEVEL <= 4)),
                       (    SELECT LEVEL m
                              FROM DUAL
                        CONNECT BY LEVEL <= 4)
                 WHERE s IS NOT NULL)
         WHERE x IS NOT NULL)
       PIVOT (MAX(s || x) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4))

Open in new window



you didn't specify so I made the following assumptions
You are using 11gR2 or higher
Your main field delimiters will always be commas
Your sub-field delimiters will always be forward slashes
Your numeric portion will always exist and be delimited by a period from the text portion.
You will never have more than 4 main (text) fields
You will never have more then 4 sub-fields (numeric) within a main field
You will never have more than 4 total data elements (columns) after all subfields have been extracted from all main fields.

If any of these assumptions are not correct, please provide the correct rule to clarify, plus sample data and expected results
Dianna ChuaAuthor Commented:
Thanks for the quick response.

I am using Oracle 11.2.0.3.0.
The rest of the assumptions are correct.

But I get the following error when I execute the command
ORA-01428: argument '0' is out of range

Please advise.  Thanks,
sdstuberCommented:
That probably means one of the assumptions is NOT correct.
Most likely this one
   ---- Your numeric portion will always exist and be delimited by a period from the text portion.

Which would cause this INSTR(s, '.')  to return 0, which would make the regexp function fail with the error you noted.

Can you provide some sample data where the error occurs as well as what you expect the result to be
Determine the Perfect Price for Your IT Services

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

sdstuberCommented:
I'm just taking a guess at what your data might look like, try this...

SELECT data1,
       data2,
       data3,
       data4
  FROM (SELECT data,
                  REGEXP_SUBSTR(s, '[^.]+\.?')
               || REGEXP_SUBSTR(s,
                                '[0-9]+',
                                GREATEST(INSTR(s, '.'), 1),
                                m.COLUMN_VALUE)
                   s,
               ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m.COLUMN_VALUE)
                   rn
          FROM (SELECT data,
                       n.COLUMN_VALUE
                           n,
                       TRIM(REGEXP_SUBSTR(data,
                                          '[^,]+',
                                          1,
                                          n.COLUMN_VALUE))
                           s
                  FROM yourtable,
                       TABLE(    SELECT COLLECT(LEVEL)
                                   FROM DUAL
                             CONNECT BY LEVEL <= REGEXP_COUNT(data, '[^,]+')) n),
               TABLE(    SELECT COLLECT(LEVEL)
                           FROM DUAL
                     CONNECT BY LEVEL <= REGEXP_COUNT(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1))) m)
       PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4))

Open in new window


same assumptions as earlier except I've removed this one:
--- Your numeric portion will always exist and be delimited by a period from the text portion.

So data like this 'abcd,xyz.1/2,mnop'
would be parsed into 4 columns
abcd      xyz.1      xyz.2      mnop

Also, I removed the arbitrary limits of 4 on the text and numeric parsing, but I did keep the total column limit at 4 because the columns must be defined in the sql, they can't be derived on the fly.

But, if you want more columns all you would have to change are these 2 sections to add more columns, the rest of the sql will generate the "rn" counter of each field and subfield regardless of size.

SELECT data1, data2, data3, data4

PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4))




Here is the sample data I used in my tests of the query above

WITH
    yourtable(data)
    AS
        (SELECT 'aaaaa.2' FROM DUAL
         UNION ALL
         SELECT 'bbbbb.4/5' FROM DUAL
         UNION ALL
         SELECT 'aaaaa.4/6, ccccc.6/10' FROM DUAL
         UNION ALL
         SELECT 'aaaaa.1/2/10, bbbbb.7' FROM DUAL
         UNION ALL
         SELECT 'ddddd.1/2/3/4' FROM DUAL
         UNION ALL
         SELECT 'aaa.1,bbb.2,ccc.3,ddd.4' FROM DUAL
         UNION ALL
         SELECT 'abcd,xyz.1/2,mnop' FROM DUAL
         UNION ALL
         SELECT NULL
           FROM DUAL)

Open in new window

Dianna ChuaAuthor Commented:
Oops, you are right.  I didn't understand your question earlier but you are correct, the numeric portion does not always exist.

The query runs now, but when the data is as below
abcd,efgh

the result i get is
abcdabc      efghefgh

Please help. Thanks.
sdstuberCommented:
I get results like this from 'abcd,efgh'

   abcd      efgh


I got these results using the query from post 42418913

did you make any changes to the query?
Dianna ChuaAuthor Commented:
Hi

No changes made to the query,  I just replaced yourtable with my query for extracting the data.
Dianna ChuaAuthor Commented:
Hi

I encountered the following error when I included the query in a procedure.

PL/SQL: ORA-22905: cannot access rows from a non-nested table item

with the following line highlighted

 TABLE(    SELECT COLLECT(LEVEL)
sdstuberCommented:
The query works when I tested it.  Please post your entire query.  So we can compare what might be different.  Even something that appears minor might have caused the problem.
Dianna ChuaAuthor Commented:
The query is per below.  It works when I execute it in PLSQL with line 2 marked off, but when I include it into my procedure, the procedure fails to compile with the ORA-22905 error


    SELECT data1, data2, data3, data4
      into m_cartlot1, m_cartlot2, m_cartlot3, m_cartlot4
    FROM (SELECT data, REGEXP_SUBSTR(s, '[^.]+\.?') || REGEXP_SUBSTR(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1), m.COLUMN_VALUE) s,
    ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m.COLUMN_VALUE) rn
    FROM (SELECT data, n.COLUMN_VALUE n, TRIM(REGEXP_SUBSTR(data, '[^,]+', 1, n.COLUMN_VALUE)) s
      FROM (select charactervalue as data from tres_rawdata_lot
      where testopno='PAC02.02'
      and itemprompt = 'Welded Cartridge Lot #'
      and lotid ='964291.2'),
        TABLE(    SELECT COLLECT(LEVEL)
        FROM DUAL
        CONNECT BY LEVEL <= REGEXP_COUNT(data, '[^,]+')) n),
        TABLE(    SELECT COLLECT(LEVEL)
        FROM DUAL
        CONNECT BY LEVEL <= REGEXP_COUNT(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1))) m)
        PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4));
Dianna ChuaAuthor Commented:
Hi

Is the information provided sufficient? Do you need me to provide the full procedure?

Thanks.
sdstuberCommented:
Will this subquery always produce at most one row?


 (select charactervalue as data from tres_rawdata_lot
      where testopno='PAC02.02'
      and itemprompt = 'Welded Cartridge Lot #'
      and lotid ='964291.2')

if it will, then try this...

The extra "smarts" I put in to limit the looping without hardcoding the "4" aren't supported within the context of pl/sql in 11.2.0.3.
So, this takes the simpler approach of always searching for 4 parts.

    SELECT data1,
           data2,
           data3,
           data4
      INTO m_cartlot1,
           m_cartlot2,
           m_cartlot3,
           m_cartlot4
      FROM (SELECT data,
                   s2,
                   x,
                   ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m) rn
              FROM (SELECT data,
                           n,
                           m,
                           s,
                           REGEXP_SUBSTR(s, '[^.]+\.?')
                               s2,
                           TO_NUMBER(
                               REGEXP_SUBSTR(
                                   s,
                                   '[0-9]+',
                                   GREATEST(INSTR(s, '.'), 1),
                                   m
                               )
                           )
                               x
                      FROM (SELECT data,
                                   n,
                                   TRIM(
                                       REGEXP_SUBSTR(
                                           data,
                                           '[^,]+',
                                           1,
                                           n
                                       )
                                   )
                                       s
                              FROM (SELECT charactervalue AS data
                                      FROM tres_rawdata_lot
                                     WHERE     testopno = 'PAC02.02'
                                           AND itemprompt = 'Welded Cartridge Lot #'
                                           AND lotid = '964291.2'),
                                   (    SELECT LEVEL n
                                          FROM DUAL
                                    CONNECT BY LEVEL <= 4)),
                           (    SELECT LEVEL m
                                  FROM DUAL
                            CONNECT BY LEVEL <= 4)
                     WHERE n <= NVL(REGEXP_COUNT(data, '[^,]+'), 1)) z
             WHERE m <= GREATEST(NVL(REGEXP_COUNT(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1)), 1), 1))
           PIVOT (MAX(s2 || x) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4));

Open in new window



if that subquery might return more than one row, then you'll get an error like this:

ORA-01422: exact fetch returns more than requested number of rows

And you'll have to either iterate through multiple results, or bulk collect into collection variables, or change the subquery to make sure it only returns one row

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Dianna ChuaAuthor Commented:
Thanks!  The query works fine within the procedure now.
sdstuberCommented:
great, glad to hear it

please remember to close the question then
Dianna ChuaAuthor Commented:
Problem solved.
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Databases

From novice to tech pro — start learning today.