• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 501
  • Last Modified:

Oracle SQL -- REGEX / substring / etc ?

Below gets 10,000+ records.

How can I change so it merges
distinct NOTE_ID's into one field,
removing duplicate numbers so only
the below two SUMMARY rows display ?
-------------------------------------
14670      03,04,27,33,34,77,99
14672      11,33,27,99
-------------------------------------
SELECT NOTE_ID, NOTE_TEXT
FROM DOCUMENT_TEXT_TAB
WHERE OUTPUT_TYPE LIKE '%_N'
order by NOTE_ID
s001.jpg
0
finance_teacher
Asked:
finance_teacher
6 Solutions
 
sdstuberCommented:
if 11gR2 search for str2tbl function on EE and try this...


SELECT   note_id, LISTAGG(s, ',') WITHIN GROUP (ORDER BY s)
    FROM (SELECT DISTINCT note_id, COLUMN_VALUE s
            FROM document_text_tab, TABLE(str2tbl(note_text))
           WHERE output_type LIKE '%_N')
GROUP BY note_id
ORDER BY note_id
0
 
slightwv (䄆 Netminder) Commented:
I was already working on this and it is pretty much the same but it doesn't use str2tbl.

with mydata as
(
	select distinct note_id, rtrim(regexp_substr(note_text,'([[:alnum:]]*)(,)?',1,column_value),',') note_text_value
	from tab1,
 	table(
  	cast(
  	multiset(select level from dual connect by level <= ((length(note_text)-length(replace(note_text,','))))+1)
  	as sys.odcivarchar2list
  	)
  	)
  	where output_type like '%_N'
)
select note_id, listagg(note_text_value,',') within group (order by note_text_value) from mydata
group by note_id
/

Open in new window

0
 
sdstuberCommented:
or, without using str2tbl...

set the connect by level <= 6  to a number sufficiently high that it is at least as big as maximum number of elements in a string.  With your sample data, it's 6 but change as needed

SELECT   note_id, LISTAGG(s, ',') WITHIN GROUP (ORDER BY s)
    FROM (SELECT DISTINCT note_id, REGEXP_SUBSTR(note_text, '[^,]+', 1, n) s
            FROM document_text_tab,
                 (SELECT     LEVEL n
                        FROM DUAL
                  CONNECT BY LEVEL <= 6)
           WHERE output_type LIKE '%_N')
   WHERE s IS NOT NULL
GROUP BY note_id
ORDER BY note_id;
0
Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
sdstuberCommented:
and a 10g/ 11gR1 version if you don't have 11gR2

SELECT   note_id,
         RTRIM(EXTRACT(XMLAGG(XMLELEMENT("x", s || ',') ORDER BY s), '/x/text()').getstringval(),
               ','
              )
    FROM (SELECT DISTINCT note_id, REGEXP_SUBSTR(note_text, '[^,]+', 1, n) s
            FROM document_text_tab,
                 (SELECT     LEVEL n
                        FROM DUAL
                  CONNECT BY LEVEL <= 6)
           WHERE output_type LIKE '%_N')
   WHERE s IS NOT NULL
GROUP BY note_id
ORDER BY note_id;
0
 
flow01Commented:
and if note_text containts single  'numbers'   instead of a string of num

with DOCUMENT_TEXT_TAB
as
(
SELECT 14670 NOTE_ID, '03' NOTE_TEXT from dual union all
SELECT 14670 NOTE_ID, '03' NOTE_TEXT from dual union all
SELECT 14670 NOTE_ID, '04' NOTE_TEXT from dual union all
SELECT 14670 NOTE_ID, '33' NOTE_TEXT from dual union all
SELECT 14670 NOTE_ID, '34' NOTE_TEXT from dual union all
SELECT 14670 NOTE_ID, '77' NOTE_TEXT from dual union all
SELECT 14670 NOTE_ID, '99' NOTE_TEXT from dual union all
SELECT 14670 NOTE_ID, '03' NOTE_TEXT from dual union all
SELECT 14670 NOTE_ID, '27' NOTE_TEXT from dual union all
SELECT 14672 NOTE_ID, '11' NOTE_TEXT from dual union all
SELECT 14672 NOTE_ID, '33' NOTE_TEXT from dual union all
SELECT 14672 NOTE_ID, '27' NOTE_TEXT from dual union all
SELECT 14672 NOTE_ID, '99' NOTE_TEXT from dual
)
select note_id, listagg(note_text,',') within group (order by note_text)
from (select distinct note_id, note_text from DOCUMENT_TEXT_TAB
--  WHERE output_type LIKE '%_N' )
group by note_id
order by note_id
/
0
 
awking00Commented:
Just note that to ensure the output tpe ends in '_N' you should escape the underscore to items ending in '1N', for example, are not included.
where output_type like '%\_N' escape '\'
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now