Solved

Oracle - Replace repeated characters in string

Posted on 2015-02-16
5
471 Views
Last Modified: 2015-02-17
Hi Experts,

I've a requirement to take a string from a CSV file, non qualified. I'm loading this into a staging table using UTL_FILE, and then splitting out the line into columns using the 'replace' function shown as column 'Existing' below.

The issue is that multiple columns can be empty, meaning that the line can have consecutive commas in it, and the replace function cannot deal with this. While I can nest numerous replace functions, I'm after advice on if this is the best approach.

I could write a function to keep applying replaces to a variable if it finds consecutive commas in place, however this seems incredibly inefficient.

select a, 
       replace(replace(replace(x.a,',,',', ,'),',,', ', ,'),',,', ', ,') "Existing",
       regexp_replace(x.a,',,',', ,') "RegEx1",
       regexp_replace(x.a, '(,,)', ', ,') "RegEx2",
       regexp_replace(x.a, '(,){2}', ', , ') "RegEx3"
from   (select '1,2,3,4,5,6,7,8,9,10,11' a from dual union all
        select ',2,,4,,6,,8,,10,' a from dual union all
        select ',,3,,,6,,,9,,' a from dual union all
        select ',,,4,,,,8,,,' a from dual union all
        select '1,,,4,5,,,,,,11' a from dual union all
        select '1,,,,,,,,,,11' a from dual union all
        select ',,' from dual
       ) x

Open in new window


The most promising is the 'RegEx3' column, however this hasn't put a space at the end, but if '11' is populated, it's prefixed with a space.

The end goal for this is to take this string, split it into columns (Example code below), whilst replacing 'null' or empty instances with space.

select LineNumber,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 1)), chr(13), '')  a1, 
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 2)), chr(13), '')  a2,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 3)), chr(13), '')  a3,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 4)), chr(13), '')  a4,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 5)), chr(13), '')  a5,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 6)), chr(13), '')  a6,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 7)), chr(13), '')  a7,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 8)), chr(13), '')  a8,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 9)), chr(13), '')  a9,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 10)), chr(13), '') a10,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 11)), chr(13), '') a11,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 12)), chr(13), '') a12

Open in new window


Note the above also replaces {CR} with nothing as well.
0
Comment
Question by:UBB
  • 3
  • 2
5 Comments
 
LVL 73

Expert Comment

by:sdstuber
ID: 40612536
just one replace is sufficient and then simple regexp_substr to extract.
you do still need to cleanup the added spaces though when done.



SELECT linenumber,
trim(regexp_substr(linecontent,'[^,]+',1,1))a1,
trim(regexp_substr(linecontent,'[^,]+',1,2))a2,
trim(regexp_substr(linecontent,'[^,]+',1,3))a3,
trim(regexp_substr(linecontent,'[^,]+',1,4))a4,
trim(regexp_substr(linecontent,'[^,]+',1,5))a5,
trim(regexp_substr(linecontent,'[^,]+',1,6))a6,
trim(regexp_substr(linecontent,'[^,]+',1,7))a7,
trim(regexp_substr(linecontent,'[^,]+',1,8))a8,
trim(regexp_substr(linecontent,'[^,]+',1,9))a9,
trim(regexp_substr(linecontent,'[^,]+',1,10))a10,
trim(regexp_substr(linecontent,'[^,]+',1,11))a11,
trim(regexp_substr(linecontent,'[^,]+',1,12))a12
from (select linenumber,replace(linecontent,',',' ,') linecontent      
  FROM yourdata)
0
 
LVL 73

Accepted Solution

by:
sdstuber earned 500 total points
ID: 40612550
or, with no replace at all

SELECT linenumber,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,1),',')a1,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,2),',')a2,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,3),',')a3,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,4),',')a4,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,5),',')a5,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,6),',')a6,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,7),',')a7,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,8),',')a8,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,9),',')a9,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,10),',')a10,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,11),',')a11,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,12),',')a12
from yourdata
0
 

Author Comment

by:UBB
ID: 40613972
Thanks ststubber,

That works perfectly. Could you walk me through the regex pattern please - I can see that we're looking for zero or 1 comma, followed by either a comma or end of line - how does the regex know to pick up between the first and proceeding comma.

I shouldn't question too much but i'm nervous about deploying code I don't 100% understand!

Thanks
0
 

Author Closing Comment

by:UBB
ID: 40613973
Perfect - Understood the context of the question and provided exactly what was asked!
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 40614364
>>> can see that we're looking for zero or 1 comma,

no, it's looking for 0 or more non-commas followed by a comma or end of line.

that's why it also requires the RTRIM when done to strip off the trailing comma from each field read
0

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
Oracle SQL Select Statement 19 58
update set column values in oracle 3 43
PL/SQL Two statements 6 50
dates - loop 12 41
Working with Network Access Control Lists in Oracle 11g (part 1) Part 2: http://www.e-e.com/A_9074.html So, you upgraded to a shiny new 11g database and all of a sudden every program that used UTL_MAIL, UTL_SMTP, UTL_TCP, UTL_HTTP or any oth…
This post first appeared at Oracleinaction  (http://oracleinaction.com/undo-and-redo-in-oracle/)by Anju Garg (Myself). I  will demonstrate that undo for DML’s is stored both in undo tablespace and online redo logs. Then, we will analyze the reaso…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
This video shows how to Export data from an Oracle database using the Datapump Export Utility.  The corresponding Datapump Import utility is also discussed and demonstrated.

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now