Dianna Chua
asked on
Parsing data from 1 field to multiple fields in Oracle Database
I have a column containing data as below
data
aaaaa.2
bbbbb.4/5
aaaaa.4/6, ccccc.6/10
aaaaa.1/2/10, bbbbb.7
I need to parse the data into separate columns per below
data1 data2 data3 data4
aaaaa.2
bbbbb.4 bbbbb.5
aaaaa.4 aaaaa.6 ccccc.6 ccccc.10
aaaaa.1 aaaaa.2 aaaaa.10 bbbbb.7
Can anyone help?
data
aaaaa.2
bbbbb.4/5
aaaaa.4/6, ccccc.6/10
aaaaa.1/2/10, bbbbb.7
I need to parse the data into separate columns per below
data1 data2 data3 data4
aaaaa.2
bbbbb.4 bbbbb.5
aaaaa.4 aaaaa.6 ccccc.6 ccccc.10
aaaaa.1 aaaaa.2 aaaaa.10 bbbbb.7
Can anyone help?
ASKER
Thanks for the quick response.
I am using Oracle 11.2.0.3.0.
The rest of the assumptions are correct.
But I get the following error when I execute the command
ORA-01428: argument '0' is out of range
Please advise. Thanks,
I am using Oracle 11.2.0.3.0.
The rest of the assumptions are correct.
But I get the following error when I execute the command
ORA-01428: argument '0' is out of range
Please advise. Thanks,
That probably means one of the assumptions is NOT correct.
Most likely this one
---- Your numeric portion will always exist and be delimited by a period from the text portion.
Which would cause this INSTR(s, '.') to return 0, which would make the regexp function fail with the error you noted.
Can you provide some sample data where the error occurs as well as what you expect the result to be
Most likely this one
---- Your numeric portion will always exist and be delimited by a period from the text portion.
Which would cause this INSTR(s, '.') to return 0, which would make the regexp function fail with the error you noted.
Can you provide some sample data where the error occurs as well as what you expect the result to be
I'm just taking a guess at what your data might look like, try this...
same assumptions as earlier except I've removed this one:
--- Your numeric portion will always exist and be delimited by a period from the text portion.
So data like this 'abcd,xyz.1/2,mnop'
would be parsed into 4 columns
abcd xyz.1 xyz.2 mnop
Also, I removed the arbitrary limits of 4 on the text and numeric parsing, but I did keep the total column limit at 4 because the columns must be defined in the sql, they can't be derived on the fly.
But, if you want more columns all you would have to change are these 2 sections to add more columns, the rest of the sql will generate the "rn" counter of each field and subfield regardless of size.
SELECT data1, data2, data3, data4
PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4))
Here is the sample data I used in my tests of the query above
SELECT data1,
data2,
data3,
data4
FROM (SELECT data,
REGEXP_SUBSTR(s, '[^.]+\.?')
|| REGEXP_SUBSTR(s,
'[0-9]+',
GREATEST(INSTR(s, '.'), 1),
m.COLUMN_VALUE)
s,
ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m.COLUMN_VALUE)
rn
FROM (SELECT data,
n.COLUMN_VALUE
n,
TRIM(REGEXP_SUBSTR(data,
'[^,]+',
1,
n.COLUMN_VALUE))
s
FROM yourtable,
TABLE( SELECT COLLECT(LEVEL)
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT(data, '[^,]+')) n),
TABLE( SELECT COLLECT(LEVEL)
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1))) m)
PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4))
same assumptions as earlier except I've removed this one:
--- Your numeric portion will always exist and be delimited by a period from the text portion.
So data like this 'abcd,xyz.1/2,mnop'
would be parsed into 4 columns
abcd xyz.1 xyz.2 mnop
Also, I removed the arbitrary limits of 4 on the text and numeric parsing, but I did keep the total column limit at 4 because the columns must be defined in the sql, they can't be derived on the fly.
But, if you want more columns all you would have to change are these 2 sections to add more columns, the rest of the sql will generate the "rn" counter of each field and subfield regardless of size.
SELECT data1, data2, data3, data4
PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4))
Here is the sample data I used in my tests of the query above
WITH
yourtable(data)
AS
(SELECT 'aaaaa.2' FROM DUAL
UNION ALL
SELECT 'bbbbb.4/5' FROM DUAL
UNION ALL
SELECT 'aaaaa.4/6, ccccc.6/10' FROM DUAL
UNION ALL
SELECT 'aaaaa.1/2/10, bbbbb.7' FROM DUAL
UNION ALL
SELECT 'ddddd.1/2/3/4' FROM DUAL
UNION ALL
SELECT 'aaa.1,bbb.2,ccc.3,ddd.4' FROM DUAL
UNION ALL
SELECT 'abcd,xyz.1/2,mnop' FROM DUAL
UNION ALL
SELECT NULL
FROM DUAL)
ASKER
Oops, you are right. I didn't understand your question earlier but you are correct, the numeric portion does not always exist.
The query runs now, but when the data is as below
abcd,efgh
the result i get is
abcdabc efghefgh
Please help. Thanks.
The query runs now, but when the data is as below
abcd,efgh
the result i get is
abcdabc efghefgh
Please help. Thanks.
I get results like this from 'abcd,efgh'
abcd efgh
I got these results using the query from post 42418913
did you make any changes to the query?
abcd efgh
I got these results using the query from post 42418913
did you make any changes to the query?
ASKER
Hi
No changes made to the query, I just replaced yourtable with my query for extracting the data.
No changes made to the query, I just replaced yourtable with my query for extracting the data.
ASKER
Hi
I encountered the following error when I included the query in a procedure.
PL/SQL: ORA-22905: cannot access rows from a non-nested table item
with the following line highlighted
TABLE( SELECT COLLECT(LEVEL)
I encountered the following error when I included the query in a procedure.
PL/SQL: ORA-22905: cannot access rows from a non-nested table item
with the following line highlighted
TABLE( SELECT COLLECT(LEVEL)
The query works when I tested it. Please post your entire query. So we can compare what might be different. Even something that appears minor might have caused the problem.
ASKER
The query is per below. It works when I execute it in PLSQL with line 2 marked off, but when I include it into my procedure, the procedure fails to compile with the ORA-22905 error
SELECT data1, data2, data3, data4
into m_cartlot1, m_cartlot2, m_cartlot3, m_cartlot4
FROM (SELECT data, REGEXP_SUBSTR(s, '[^.]+\.?') || REGEXP_SUBSTR(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1), m.COLUMN_VALUE) s,
ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m.COLUMN_VALUE) rn
FROM (SELECT data, n.COLUMN_VALUE n, TRIM(REGEXP_SUBSTR(data, '[^,]+', 1, n.COLUMN_VALUE)) s
FROM (select charactervalue as data from tres_rawdata_lot
where testopno='PAC02.02'
and itemprompt = 'Welded Cartridge Lot #'
and lotid ='964291.2'),
TABLE( SELECT COLLECT(LEVEL)
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT(data, '[^,]+')) n),
TABLE( SELECT COLLECT(LEVEL)
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1))) m)
PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4));
SELECT data1, data2, data3, data4
into m_cartlot1, m_cartlot2, m_cartlot3, m_cartlot4
FROM (SELECT data, REGEXP_SUBSTR(s, '[^.]+\.?') || REGEXP_SUBSTR(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1), m.COLUMN_VALUE) s,
ROW_NUMBER() OVER(PARTITION BY data ORDER BY n, m.COLUMN_VALUE) rn
FROM (SELECT data, n.COLUMN_VALUE n, TRIM(REGEXP_SUBSTR(data, '[^,]+', 1, n.COLUMN_VALUE)) s
FROM (select charactervalue as data from tres_rawdata_lot
where testopno='PAC02.02'
and itemprompt = 'Welded Cartridge Lot #'
and lotid ='964291.2'),
TABLE( SELECT COLLECT(LEVEL)
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT(data, '[^,]+')) n),
TABLE( SELECT COLLECT(LEVEL)
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT(s, '[0-9]+', GREATEST(INSTR(s, '.'), 1))) m)
PIVOT (MAX(s) FOR rn IN (1 AS data1, 2 AS data2, 3 AS data3, 4 AS data4));
ASKER
Hi
Is the information provided sufficient? Do you need me to provide the full procedure?
Thanks.
Is the information provided sufficient? Do you need me to provide the full procedure?
Thanks.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks! The query works fine within the procedure now.
great, glad to hear it
please remember to close the question then
please remember to close the question then
ASKER
Problem solved.
Open in new window
or, using PIVOT syntax
Open in new window
you didn't specify so I made the following assumptions
You are using 11gR2 or higher
Your main field delimiters will always be commas
Your sub-field delimiters will always be forward slashes
Your numeric portion will always exist and be delimited by a period from the text portion.
You will never have more than 4 main (text) fields
You will never have more then 4 sub-fields (numeric) within a main field
You will never have more than 4 total data elements (columns) after all subfields have been extracted from all main fields.
If any of these assumptions are not correct, please provide the correct rule to clarify, plus sample data and expected results