Solved

# need to understand regexp_substr and connect by level Oracle

Posted on 2014-10-14
2,352 Views
followup from a previous post:

I'm studying the solution (posted below), some questions . . .

so this
``````TABLE(
SELECT COLLECT(LEVEL)
FROM DUAL
CONNECT BY LEVEL <= LENGTH(x.stripped_mtg_time) - LENGTH(REPLACE(x.stripped_mtg_time, ',')) + 1
)
``````
returns
1
2
for T1,TR567?

I kind of see that from this simple example:
``````select level,t.*
from
( select 'one' AS VALUE from dual
union all
select 'two'from dual
) t
connect by level<=2
ORDER BY  VALUE, LEVEL
``````

But where does COLUMN_VALUE come from (see solution below) ? Is it looking for the pattern in the COLUMN_VALUE occurrence ? so it evaluates 1,2 looking for the pattern (F1 inside of M1,F12) ?

previous post solution:
`````` SELECT a,
b,
MAX(
CASE
WHEN INSTR(sch, b_day) = 0 THEN 'bad'
WHEN b_period = '1' AND NOT REGEXP_LIKE(sch, '1([^0]|\$)') THEN 'bad'
WHEN INSTR(sch, b_period) > 0 THEN 'good'
END
) AS new_method
FROM (SELECT a,
b,
REGEXP_SUBSTR(
a,
'[^,]+',
1,
COLUMN_VALUE
)
sch,
REGEXP_SUBSTR(b, '^[MTWRFS]') b_day,
REGEXP_SUBSTR(b, '[0-9]+\$') b_period
FROM tab1 x,
TABLE(
SELECT COLLECT(LEVEL)
FROM DUAL
CONNECT BY LEVEL <= LENGTH(x.a) - LENGTH(REPLACE(x.a, ',')) + 1
))
WHERE a IN ('M1,F12','MW34,T56,R12,F1234') -- testing !!!
GROUP BY a, b
-- ORDER BY a, b
ORDER BY a, CASE SUBSTR(b,1,1) WHEN 'M' THEN 1 WHEN 'T' THEN 2 WHEN 'W' THEN 3 WHEN 'R' THEN 4 WHEN 'F' THEN 5 WHEN 'S' THEN 6 END, SUBSTR(b,2)
``````
which produces this
``````M1,F12	M1	good
M1,F12	F1	good
M1,F12	F2	good
MW34,T56,R12,F1234	M3	good
MW34,T56,R12,F1234	M4	good
MW34,T56,R12,F1234	T5	good
MW34,T56,R12,F1234	T6	good
MW34,T56,R12,F1234	W3	good
MW34,T56,R12,F1234	W4	good
MW34,T56,R12,F1234	R1	good
MW34,T56,R12,F1234	R2	good
MW34,T56,R12,F1234	F1	good
MW34,T56,R12,F1234	F2	good
MW34,T56,R12,F1234	F3	good
MW34,T56,R12,F1234	F4	good
``````
0
• 8
• 6
• 2

LVL 73

Assisted Solution

sdstuber earned 500 total points
ID: 40380810
the TABLE() function creates COLUMN_VALUE.

It has to give a name to values pulled from the collection.

If the collection were to be cast as some named collection type, the TABLE function would return the names of the elements in those types.   Since it's just a generic collection, it generates its own name.

It has nothing to do with the regexps or the connect by, it's solely an artifact of the TABLE() function
0

LVL 73

Expert Comment

ID: 40380855
Here's a simpler example to show the column_value generation

select * from table(select collect(level) from dual connect by level < 10)
0

Author Comment

ID: 40381179
>>the TABLE() function creates COLUMN_VALUE, etc.
-- ok, that helps and makes sense.

>>select * from table(select collect(level) from dual connect by level < 10)
-- ok, I'll check this tomorrow and probably have some additional questions.

Thx.
0

Author Comment

ID: 40381985
>>the TABLE() function creates COLUMN_VALUE, etc.
>>Here's a simpler example to show the column_value generation: select * from table(select collect(level) from dual connect by level < 10)
-- ok, I see both of these, thx.

-----------------
for the 1st REGEXP_SUBSTR here:
``````SELECT a,
b,
REGEXP_SUBSTR(
a,
'[^,]+',
1,
COLUMN_VALUE
) sch,
REGEXP_SUBSTR(b, '^[MTWRFS]') b_day,
REGEXP_SUBSTR(b, '[0-9]+\$') b_period
FROM tab1 x,
TABLE(
SELECT COLLECT(LEVEL)
FROM DUAL
CONNECT BY LEVEL <= LENGTH(x.a) - LENGTH(REPLACE(x.a, ',')) + 1
)
``````

- I see the results, but not quite getting the REGEXP_SUBSTR . . .
REGEXP_SUBSTR(
a,
'[^,]+',
1,
COLUMN_VALUE
)

so this is saying
- look at "a" (M1,R12),
- for pattern "not comma, then everything after the comma"
- start in position 1,
- looking for the COLUMN_VALUE occurrence

but I'm just not seeing how this results in M1 for the 1st occurrence and R12 for the second occurence - I don't see the mechanics . . .

``````A	B	SCH	Dy	Pd	COLUMN_VALUE
M1,R12	M1	M1	M	1	1
M1,R12	R1	M1	R	1	1
M1,R12	R2	M1	R	2	1
M1,R12	M1	R12	M	1	2
M1,R12	R1	R12	R	1	2
M1,R12	R2	R12	R	2	2
MWF4	M4	MWF4	M	4	1
MWF4	W4	MWF4	W	4	1
MWF4	F4	MWF4	F	4	1
``````
0

LVL 73

Accepted Solution

sdstuber earned 500 total points
ID: 40382077
>>  for pattern "not comma, then everything after the comma"

no

it's saying look for substrings consisting of 1 or more not-commas

The "+" means 1 or more of whatever expression came immediately before it,
which was  not-commas
0

Author Comment

ID: 40382129
>>it's saying look for substrings consisting of 1 or more not-commas
>>The "+" means 1 or more of whatever expression came immediately before it, which was  not-commas
-- so it must be looking at it as "M1 R12" and says, "oh look, there's two strings there, the first occurrence being M1 and the second being R12"
0

LVL 73

Expert Comment

ID: 40382163
yes exactly, that's why the COLUMN_VALUE is generated so we can iterate through chunks of non-comma substrings.
0

Author Comment

ID: 40382188
Ok, I think I see it all now, thanks a lot.

For
TABLE(SELECT COLLECT(LEVEL) FROM DUAL CONNECT BY LEVEL <= LENGTH(x.a) - LENGTH(REPLACE(x.a, ',')) + 1

does this all fall under the heading "Collections" and so I need to study up on that ?
0

LVL 73

Expert Comment

ID: 40382232
yes

COLLECT() as the name implies, creates a collection of NUMBERs
0

Author Comment

ID: 40382304
sorry, one more quick question . . .

slightwv posted this book for REGEXP. The reviews look good, but 544 pages for REGEXP ? I already have a 1,300 page PL/SQL book by Feuerstein to slog through . . .
0

LVL 76

Expert Comment

ID: 40382354
>> but 544 pages for REGEXP ?

Powerful doesn't come easy...  That book will take you places with regex that most people outside of Perl programmers never dare to go.

The regex covered in this question was pretty simple compared to some expressions that are out there.

Remember that that book isn't 100% compatible with Oracle's implementation of regex.

A lot of regex (and SQL/programming/etc... for that matter) is just understanding the concepts of "what is possible" and not memorization.
0

LVL 73

Expert Comment

ID: 40382366
Read the oracle sql guide.  It's appendix is much smaller than 544 and is oracle specific.

Then, simply practice. - this is more important than any resource either of us mentioned.

Try a little regexp golf for fun.  Note Oracle's implementation might not be able to solve patterns that others can.
0

Author Comment

ID: 40382372
slightwv,

ok, thanks. At the moment I am focused on Oracle, so the book is probably out of scope for me. I did run into this in some JavaScript, but still 544 pages is probably more than I need.

I have a 11g Sql book by Jason Price. It doesn't have detailed explanations of REGEXP, so I kind of need something between that and 544 pages of not really Oracle stuff.
0

Author Comment

ID: 40382382
sdstuber,

-- ok, sounds good.

>>Then, simply practice
-- I broke everything down in the excellent solution I got, so It's starting to make sense, and I'm sure I'll have other opportunities to excel . . .
0

LVL 76

Expert Comment

ID: 40382400
>>At the moment I am focused on Oracle, so the book is probably out of scope for me.

If I had to guess I would say the book is 80-90% accurate for Oracle's implementation but I last touched my copy over 15 years ago.
0

Author Comment

ID: 40382423
slightwv, ok, I will take that into consideration, thx.
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question