Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Oracle - Replace repeated characters in string

Posted on 2015-02-16
5
Medium Priority
?
617 Views
Last Modified: 2015-02-17
Hi Experts,

I've a requirement to take a string from a CSV file, non qualified. I'm loading this into a staging table using UTL_FILE, and then splitting out the line into columns using the 'replace' function shown as column 'Existing' below.

The issue is that multiple columns can be empty, meaning that the line can have consecutive commas in it, and the replace function cannot deal with this. While I can nest numerous replace functions, I'm after advice on if this is the best approach.

I could write a function to keep applying replaces to a variable if it finds consecutive commas in place, however this seems incredibly inefficient.

select a, 
       replace(replace(replace(x.a,',,',', ,'),',,', ', ,'),',,', ', ,') "Existing",
       regexp_replace(x.a,',,',', ,') "RegEx1",
       regexp_replace(x.a, '(,,)', ', ,') "RegEx2",
       regexp_replace(x.a, '(,){2}', ', , ') "RegEx3"
from   (select '1,2,3,4,5,6,7,8,9,10,11' a from dual union all
        select ',2,,4,,6,,8,,10,' a from dual union all
        select ',,3,,,6,,,9,,' a from dual union all
        select ',,,4,,,,8,,,' a from dual union all
        select '1,,,4,5,,,,,,11' a from dual union all
        select '1,,,,,,,,,,11' a from dual union all
        select ',,' from dual
       ) x

Open in new window


The most promising is the 'RegEx3' column, however this hasn't put a space at the end, but if '11' is populated, it's prefixed with a space.

The end goal for this is to take this string, split it into columns (Example code below), whilst replacing 'null' or empty instances with space.

select LineNumber,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 1)), chr(13), '')  a1, 
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 2)), chr(13), '')  a2,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 3)), chr(13), '')  a3,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 4)), chr(13), '')  a4,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 5)), chr(13), '')  a5,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 6)), chr(13), '')  a6,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 7)), chr(13), '')  a7,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 8)), chr(13), '')  a8,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 9)), chr(13), '')  a9,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 10)), chr(13), '') a10,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 11)), chr(13), '') a11,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 12)), chr(13), '') a12

Open in new window


Note the above also replaces {CR} with nothing as well.
0
Comment
Question by:UBB
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
5 Comments
 
LVL 74

Expert Comment

by:sdstuber
ID: 40612536
just one replace is sufficient and then simple regexp_substr to extract.
you do still need to cleanup the added spaces though when done.



SELECT linenumber,
trim(regexp_substr(linecontent,'[^,]+',1,1))a1,
trim(regexp_substr(linecontent,'[^,]+',1,2))a2,
trim(regexp_substr(linecontent,'[^,]+',1,3))a3,
trim(regexp_substr(linecontent,'[^,]+',1,4))a4,
trim(regexp_substr(linecontent,'[^,]+',1,5))a5,
trim(regexp_substr(linecontent,'[^,]+',1,6))a6,
trim(regexp_substr(linecontent,'[^,]+',1,7))a7,
trim(regexp_substr(linecontent,'[^,]+',1,8))a8,
trim(regexp_substr(linecontent,'[^,]+',1,9))a9,
trim(regexp_substr(linecontent,'[^,]+',1,10))a10,
trim(regexp_substr(linecontent,'[^,]+',1,11))a11,
trim(regexp_substr(linecontent,'[^,]+',1,12))a12
from (select linenumber,replace(linecontent,',',' ,') linecontent      
  FROM yourdata)
0
 
LVL 74

Accepted Solution

by:
sdstuber earned 2000 total points
ID: 40612550
or, with no replace at all

SELECT linenumber,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,1),',')a1,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,2),',')a2,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,3),',')a3,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,4),',')a4,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,5),',')a5,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,6),',')a6,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,7),',')a7,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,8),',')a8,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,9),',')a9,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,10),',')a10,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,11),',')a11,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,12),',')a12
from yourdata
0
 

Author Comment

by:UBB
ID: 40613972
Thanks ststubber,

That works perfectly. Could you walk me through the regex pattern please - I can see that we're looking for zero or 1 comma, followed by either a comma or end of line - how does the regex know to pick up between the first and proceeding comma.

I shouldn't question too much but i'm nervous about deploying code I don't 100% understand!

Thanks
0
 

Author Closing Comment

by:UBB
ID: 40613973
Perfect - Understood the context of the question and provided exactly what was asked!
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 40614364
>>> can see that we're looking for zero or 1 comma,

no, it's looking for 0 or more non-commas followed by a comma or end of line.

that's why it also requires the RTRIM when done to strip off the trailing comma from each field read
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article started out as an Experts-Exchange question, which then grew into a quick tip to go along with an IOUG presentation for the Collaborate confernce and then later grew again into a full blown article with expanded functionality and legacy…
When it comes to protecting Oracle Database servers and systems, there are a ton of myths out there. Here are the most common.
This video explains at a high level with the mandatory Oracle Memory processes are as well as touching on some of the more common optional ones.
This video shows how to configure and send email from and Oracle database using both UTL_SMTP and UTL_MAIL, as well as comparing UTL_SMTP to a manual SMTP conversation with a mail server.

609 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question