Solved

Oracle - Replace repeated characters in string

Posted on 2015-02-16
5
551 Views
Last Modified: 2015-02-17
Hi Experts,

I've a requirement to take a string from a CSV file, non qualified. I'm loading this into a staging table using UTL_FILE, and then splitting out the line into columns using the 'replace' function shown as column 'Existing' below.

The issue is that multiple columns can be empty, meaning that the line can have consecutive commas in it, and the replace function cannot deal with this. While I can nest numerous replace functions, I'm after advice on if this is the best approach.

I could write a function to keep applying replaces to a variable if it finds consecutive commas in place, however this seems incredibly inefficient.

select a, 
       replace(replace(replace(x.a,',,',', ,'),',,', ', ,'),',,', ', ,') "Existing",
       regexp_replace(x.a,',,',', ,') "RegEx1",
       regexp_replace(x.a, '(,,)', ', ,') "RegEx2",
       regexp_replace(x.a, '(,){2}', ', , ') "RegEx3"
from   (select '1,2,3,4,5,6,7,8,9,10,11' a from dual union all
        select ',2,,4,,6,,8,,10,' a from dual union all
        select ',,3,,,6,,,9,,' a from dual union all
        select ',,,4,,,,8,,,' a from dual union all
        select '1,,,4,5,,,,,,11' a from dual union all
        select '1,,,,,,,,,,11' a from dual union all
        select ',,' from dual
       ) x

Open in new window


The most promising is the 'RegEx3' column, however this hasn't put a space at the end, but if '11' is populated, it's prefixed with a space.

The end goal for this is to take this string, split it into columns (Example code below), whilst replacing 'null' or empty instances with space.

select LineNumber,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 1)), chr(13), '')  a1, 
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 2)), chr(13), '')  a2,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 3)), chr(13), '')  a3,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 4)), chr(13), '')  a4,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 5)), chr(13), '')  a5,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 6)), chr(13), '')  a6,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 7)), chr(13), '')  a7,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 8)), chr(13), '')  a8,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 9)), chr(13), '')  a9,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 10)), chr(13), '') a10,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 11)), chr(13), '') a11,
              replace(trim(REGEXP_SUBSTR(replace(replace(LineContent,',,',', ,'),',,',', ,'), '[^,]+', 1, 12)), chr(13), '') a12

Open in new window


Note the above also replaces {CR} with nothing as well.
0
Comment
Question by:UBB
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
5 Comments
 
LVL 74

Expert Comment

by:sdstuber
ID: 40612536
just one replace is sufficient and then simple regexp_substr to extract.
you do still need to cleanup the added spaces though when done.



SELECT linenumber,
trim(regexp_substr(linecontent,'[^,]+',1,1))a1,
trim(regexp_substr(linecontent,'[^,]+',1,2))a2,
trim(regexp_substr(linecontent,'[^,]+',1,3))a3,
trim(regexp_substr(linecontent,'[^,]+',1,4))a4,
trim(regexp_substr(linecontent,'[^,]+',1,5))a5,
trim(regexp_substr(linecontent,'[^,]+',1,6))a6,
trim(regexp_substr(linecontent,'[^,]+',1,7))a7,
trim(regexp_substr(linecontent,'[^,]+',1,8))a8,
trim(regexp_substr(linecontent,'[^,]+',1,9))a9,
trim(regexp_substr(linecontent,'[^,]+',1,10))a10,
trim(regexp_substr(linecontent,'[^,]+',1,11))a11,
trim(regexp_substr(linecontent,'[^,]+',1,12))a12
from (select linenumber,replace(linecontent,',',' ,') linecontent      
  FROM yourdata)
0
 
LVL 74

Accepted Solution

by:
sdstuber earned 500 total points
ID: 40612550
or, with no replace at all

SELECT linenumber,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,1),',')a1,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,2),',')a2,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,3),',')a3,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,4),',')a4,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,5),',')a5,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,6),',')a6,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,7),',')a7,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,8),',')a8,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,9),',')a9,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,10),',')a10,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,11),',')a11,
rtrim(regexp_substr(linecontent,'[^,]*(,|$)',1,12),',')a12
from yourdata
0
 

Author Comment

by:UBB
ID: 40613972
Thanks ststubber,

That works perfectly. Could you walk me through the regex pattern please - I can see that we're looking for zero or 1 comma, followed by either a comma or end of line - how does the regex know to pick up between the first and proceeding comma.

I shouldn't question too much but i'm nervous about deploying code I don't 100% understand!

Thanks
0
 

Author Closing Comment

by:UBB
ID: 40613973
Perfect - Understood the context of the question and provided exactly what was asked!
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 40614364
>>> can see that we're looking for zero or 1 comma,

no, it's looking for 0 or more non-commas followed by a comma or end of line.

that's why it also requires the RTRIM when done to strip off the trailing comma from each field read
0

Featured Post

[Webinar] Learn How Hackers Steal Your Credentials

Do You Know How Hackers Steal Your Credentials? Join us and Skyport Systems to learn how hackers steal your credentials and why Active Directory must be secure to stop them. Thursday, July 13, 2017 10:00 A.M. PDT

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Working with Network Access Control Lists in Oracle 11g (part 2) Part 1: http://www.e-e.com/A_8429.html Previously, I introduced the basics of network ACL's including how to create, delete and modify entries to allow and deny access.  For many‚Ķ
Using SQL Scripts we can save all the SQL queries as files that we use very frequently on our database later point of time. This is one of the feature present under SQL Workshop in Oracle Application Express.
This video shows setup options and the basic steps and syntax for duplicating (cloning) a database from one instance to another. Examples are given for duplicating to the same machine and to different machines
This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question