Solved

Oracle question related to question ID: 27979543

Posted on 2012-12-28
30
355 Views
Last Modified: 2013-01-02
I posted a question to get the text out of a string without special characters included. I got the answer but upon looking at what I actually needed its not that simple. I had an example that looked something like the following

'~(|_|*|football)~' and I wanted football returned and I am able to do that

I then look at the data in the field and it has the following

blablablablablablabla '~(|_|*|football)~' and then more text blablablablabla

so the part of the string with the special characters is only a portion of the text in the field

I would need the following returned
blablablablablablabla football and then more text blablablablabla

I tried figuring it out but the ticks(') are really screwing me up and whenever I run it the variable screen comes up..

Any thoughts?
0
Comment
Question by:jknj72
  • 11
  • 10
  • 8
  • +1
30 Comments
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 38728282
If 'valid' characters are letters, numbers and a space try this:

select regexp_replace('blablablablablablabla ''~(|_|*|football)~'' and then more text blablablablabla','[^a-zA-Z0-9 ]') from dual;

What '[^a-zA-Z0-9 ]' does is says if it is NOT '^' a lowercase letter, a-z, an upper case letter, A-Z, a number, 0-9 or a space (just inside the closing ']'), replace it with nothing.
0
 
LVL 23

Expert Comment

by:David
ID: 38728580
You say you are seeing the variable screen -- meaning you are prompted to replace some placeholder with a value?  A simple example is to code &zzz -- the prompt is then to replace "zzz".

If I am on track, then your session is set with some substibution value other than the default "&" -- snapshot the prompt window, please.  Secondly, if you are not substituting a value, then make sure to begin with, SET DEFINE OFF .
0
 
LVL 31

Expert Comment

by:awking00
ID: 38729553
This query takes all characters before the '~ (whether or not they exist) and concatenates them with the function sdstuber provided against all of the characters between the '~ and ~' and concatenates them with all of the characters after the ~'  (whether or not they exist), adding a space between the three parts then trimming any outside spaces -

select trim(substr(text,instr(text,'~'||chr(39)) + 2)||' '||
REPLACE(SUBSTR(REGEXP_SUBSTR(text, '~[^~]+~'), INSTR(REGEXP_SUBSTR(text, '~[^~]+~'), '|', -1) + 1), ')~')||' '||
substr(text,instr(text,'~'||chr(39)) + 2)) mystring
from txt;

MYSTRING
-------------------------------------------------------
football
Mobile, Alabama,
blahblahblah football blahblahblah
0
 
LVL 31

Expert Comment

by:awking00
ID: 38732677
To show the beginning strings -
SQL> select * from txt;
TEXT
-------------------------------------------
'~(`|¤|football)~'
'~(`|¤|{link:10098}|Mobile, Alabama,)~'
blahblah'~(`|¤|football)~'blahblahblah
0
 

Author Comment

by:jknj72
ID: 38732870
awking00 I get an invalid character error when trying to run. Please note that I am very new to Oracle, but learning more every day, and somehow I cant figure out how to declare a variable 'text' and set it and then run the select statement using the variable in TOAD. So I just replaced 'text' everywhere with the actual string so Im not sure that if I figure out how to do that, that it will work but I will keep trying.
Anyway, I have attached a screenshot of the Select statement and you will notice that it is giving me an invalid character when it reaches the football portion of the string. It actually has the open paran highlighted for the error.

I appreciate the help!!
Screenshot-Error.jpg
0
 
LVL 31

Expert Comment

by:awking00
ID: 38732998
Note how I showed the strings in my txt table. They already included the apostrophes. The problem is when you use the command substr('blabla... '~...',..., it's expecting the string to end after the second apostrophe followed by a comma, then the second parameter, etc. To get around this, you would need to issue the command as substr('blabla...''~... with an additional apostrophe. This is what I had to do in order to get the "actual" string (that conatined apostrophes) inserted into the table.
0
 
LVL 31

Expert Comment

by:awking00
ID: 38733015
In your original question, you stated that you had a text field, presumably in a table, that you needed to manipulate. You should be able to manipulate that field with my query by substituting that field name for "text" and that tablename for "txt."
0
 

Author Comment

by:jknj72
ID: 38733078
ok Im sorry but Im confused. the string will be in the table with just a single quote around that portion of the string? If I put the extra tic mark around the '~(`|¤|football)~' to be
''~(`|¤|football)~'' then it will work but the string is in the table as
'blablablablablablabla '~(|_|*|football)~' and then more text blablablablabla'
How would I get the additional tics in there in order to get this to work. FYI, I really dont wanna do that but I dont know if it can be done otherwise. Also, if I get a list from the front end developer of the characters being used in the football portion of the string would that help? Maybe we can do a replace or something? I know we are close though!!!!
0
 
LVL 31

Expert Comment

by:awking00
ID: 38733508
If it's already in the table with the single quotes, then just apply the query to that field.
Select ... substr(thatfield,...)
as opposed to
Select ... substr('blablablablabla'~(|_|*|football)~' and then more text blablablablabla', ...
since this requires the apostrophes to identify the first argument as a string expression, but "thatfield" is already identified as a string expression. It worked before using sdstuber's method. It's just that now the argument for the middle portion that uses that method is not the text field itself but the substring of the text field between the '~ and the ~'. You can still insert the test field in the substring used to get the first and last portions.
By the way, if there already a space before and after the blabla... , there's no need to concatenate the space and use the trim function. I also see I made a mistake copying and pasting

Breaking it up might make it more understandable -

First part of text field is
select substr(text,1, instr(text,'~'||chr(39)) - 1)
Second part is
REPLACE(SUBSTR(REGEXP_SUBSTR(text, '~[^~]+~'), INSTR(REGEXP_SUBSTR(text, '~[^~]+~'), '|', -1) + 1), ')~') Where the part in bold replaces what sdstuber called "yrcol" in the accepted answer to your previous question.
And the third part is
substr(text,instr(text,'~'||chr(39)) + 2)) mystring
0
 

Author Comment

by:jknj72
ID: 38736432
ok I ran the following query without the logic

select rownum, para_text
FROM omni_ent_paragraph where rownum < 3 and para_text like '%|%'
order by para_text ;

And I get these results

Morris and Essex Line  to Liberty State Park - '~(`|¤|6:00am-10:00am)~'
Convent Station
Madison Station
Chatham Station


Then I run this with the logic

select rownum, trim(substr(para_text,instr(para_text,'~'||chr(39)) + 2)||' '||
REPLACE(SUBSTR(REGEXP_SUBSTR(para_text, '~[^~]+~'), INSTR(REGEXP_SUBSTR(para_text, '~[^~]+~'), '|', -1) + 1), ')~')||' '||
substr(para_text,instr(para_text,'~'||chr(39)) + 2)) mystring
from omni_ent_paragraph where rownum < 3 and para_text like '%|%'
order by para_text;

and got these results

Convent Station
Madison Station
Chatham Station
 6:00am-10:00am  
Convent Station
Madison Station
Chatham Station

Here is another result from each of the queries

With logic

Gladstone Station
Peapack Station
Far Hills Station
Millington Station
Gillette Station
Murray Hill Station
 6:00am-10:00am
Gladstone Station
Peapack Station
Far Hills Station
Millington Station
Gillette Station
Murray Hill Station

Without logic

Gladstone Branch to Summit  Station -  '~(`|¤|6:00am-10:00am)~'
Gladstone Station
Peapack Station
Far Hills Station
Millington Station
Gillette Station
Murray Hill Station

It seemns as if the portion after the parsing is being duplicated and replacing the front of the string somehow?

Sorry so long just wanted to make sure Im explaining this correct....

Thanks
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 38736444
For this row:
Morris and Essex Line  to Liberty State Park - '~(`|¤|6:00am-10:00am)~'

What are your expected results?
Do you just want the text or do you want specific text on their own individual rows?
0
 
LVL 31

Expert Comment

by:awking00
ID: 38736504
>>Then I run this with the logic
select rownum, trim(substr(para_text,instr(para_text,'~'||chr(39)) + 2)||' '||<<
Did you notice that I had made a copy and paste error and the first part should have been -
select [rownum],substr(text,1, instr(text,'~'||chr(39)) - 1) ...
0
 

Author Comment

by:jknj72
ID: 38736556
I didnt see the error. I added the line you provided and got these results

Gladstone Branch  to Liberty State Park -  '~(`|¤|6:00am-10:00am) 6:00am-10:00am  
Bernardsville Station
Basking Ridge Station
Lyons Station
Berkeley Heights Station
New Providence Station

Im looking for this

Gladstone Branch  to Liberty State Park -  6:00am-10:00am  
Bernardsville Station
Basking Ridge Station
Lyons Station
Berkeley Heights Station
New Providence Station


So we are really close.....
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 38736738
Did you ever look at what I posted in my first post?

Check the following simple test case.  I did add some additional characters to not be replaced based on your latest data.

drop table tab1 purge;
create table tab1(col1 varchar2(100));

insert into tab1 values('Gladstone Branch  to Liberty State Park -  ''~(`|¤|6:00am-10:00am) 6:00am-10:00am');

select regexp_replace(col1,'[^a-zA-Z0-9: -]') from tab1;

Open in new window

0
 
LVL 31

Expert Comment

by:awking00
ID: 38736803
Two issues to resolve. One, I actually need to correct my correction.
This -
select substr(text,1, instr(text,'~'||chr(39)) - 1)
should have been this -
select substr(text,1, instr(text,chr(39)||'~') - 1)

Two, previously your examples would have shown this -
Gladstone Branch  to Liberty State Park -  '~(`|¤|6:00am-10:00am) 6:00am-10:00am
like this -
Gladstone Branch  to Liberty State Park -  '~(`|¤|6:00am-10:00am)~' 6:00am-10:00am
with the ~' occurring after the parenthesis
So the incorrect first part would return nothing. Because the ~' did not exist, it was looking for a substring starting at position 1 for a length of -1.
The second part would return nothing because there is no substring beginning with '~ and ending with~'.
And the third part should have returned the entire string starting at the second position Is there possibly a space at the beginning of that string or are you sure it didn't return a string starting with "ladstone" ... instead of "Gladstone" ...?

Given all of that explanation, my next question is in order to get 'Gladstone Branch  to Liberty State Park -  6:00am-10:00am' as your desired result, "Is that because the '6:00am - 10:00 am' comes after the closing parenthesis or because it comes after the last pipe ("|") and before the closing parenthesis?"
0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 

Author Comment

by:jknj72
ID: 38737245
awking
>>Given all of that explanation, my next question is in order to get 'Gladstone Branch  to Liberty State Park -  6:00am-10:00am' as your desired result, "Is that because the '6:00am - 10:00 am' comes after the closing parenthesis or because it comes after the last pipe ("|") and before the closing parenthesis?"

I need everything before the '~(
and whatever text after the last pipe and before the ~)' without the special characters

Then everything after the ~)'

all concatenated

slight
Im going to try it again now

Thanks to both of you!!
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 38737274
>>I need everything before the '~( and whatever text after the last pipe and before the ~)' without the special characters

Then wouldn't:
Gladstone Branch  to Liberty State Park -  '~(`|¤|6:00am-10:00am) 6:00am-10:00am  

Become:
Gladstone Branch  to Liberty State Park -  6:00am-10:00am 6:00am-10:00am  

and not what you posted as desired?
0
 

Author Comment

by:jknj72
ID: 38737300
>>Then wouldn't:
Gladstone Branch  to Liberty State Park -  '~(`|¤|6:00am-10:00am) 6:00am-10:00am  

Become:
Gladstone Branch  to Liberty State Park -  6:00am-10:00am 6:00am-10:00am  

It never had '~(`|¤|6:00am-10:00am) 6:00am-10:00am
It was always just '~(`|¤|6:00am-10:00am)~'

Bottom line
ABC '~(`|¤|6:00am-10:00am)~' DEF

I need
ABC 6:00am-10:00am DEF

Sorry if Ive gotten off my orig post somewhat....

Also, slight yours is real close I just tried it and I think I may be able to tweak it a little to give me the desried results. I will keep trying
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 38737308
>>Also, slight yours is real close

Please post where it is failing and I'll tweak it with you.
0
 
LVL 31

Expert Comment

by:awking00
ID: 38737357
>>It never had '~(`|¤|6:00am-10:00am) 6:00am-10:00am
It was always just '~(`|¤|6:00am-10:00am)~' <<
So the actual data was
Gladstone Branch  to Liberty State Park -  '~(`|¤|6:00am-10:00am)~' ?
0
 

Author Comment

by:jknj72
ID: 38737390
>>So the actual data was
Gladstone Branch  to Liberty State Park -  '~(`|¤|6:00am-10:00am)~' ?

The actualy data for this record is:
Gladstone Branch  to Liberty State Park -  '~(`|¤|6:00am-10:00am)~'  
Bernardsville Station
Basking Ridge Station
Lyons Station
Berkeley Heights Station
New Providence Station
0
 
LVL 31

Expert Comment

by:awking00
ID: 38737441
Can we try again? This seems to work -
SQL> select * from txt;
TEXT
---------------------------------------------------------------------
'~(`|¤|football)~'
'~(`|¤|{link:10098}|Mobile, Alabama,)~'
blahblah'~(`|¤|football)~'blahblahblah
Gladstone Branch  to Liberty State Park -  '~(`|¤|6:00am-10:00am)~'

SQL> select substr(text,1, instr(text,chr(39)||'~') - 1)||
  2  REPLACE(SUBSTR(REGEXP_SUBSTR(text, '~[^~]+~'), INSTR(REGEXP_SUBSTR(text, '~
[^~]+~'), '|', -1) + 1), ')~')||
  3  substr(text,instr(text,'~'||chr(39)) + 2) mystr
  4  from txt;
MYSTR
--------------------------------------------------------------------------------
football
Mobile, Alabama,
blahblahfootballblahblahblah
Gladstone Branch  to Liberty State Park -  6:00am-10:00am
0
 
LVL 31

Expert Comment

by:awking00
ID: 38737455
Are there any other cases such as no '~' and no '|' or just '~' without '|' or just '|' without '~'? If so, what is the selection criteria?
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 38737603
>>This seems to work -

Seems like a lot of extra calls?

The sample below seems to replicate the output above with a single call.

More accurate requirements would help greatly.  for example do you ALWAYS have the closing )~' that will ALWAYS match a starting '~( ?

drop table tab1 purge;
create table tab1(col1 varchar2(100));

insert into tab1 values('''~(`|¤|football)~''');
insert into tab1 values('''~(`|¤|{link:10098}|Mobile, Alabama,)~''');
insert into tab1 values('blahblah''~(`|¤|football)~''blahblahblah');
insert into tab1 values('Gladstone Branch  to Liberty State Park -  ''~(`|¤|6:00am-10:00am)~''');
commit;

select regexp_replace(col1,'([^'']*)(.*\|)(.*)(\).*[''])(.*)','\1\3\5') from tab1; 

Open in new window

0
 

Author Comment

by:jknj72
ID: 38737792
yes awkin there are times when there are no special characters and those records are not coming back correct. Furthermore there are instances where there are multiple cases where there are the special characters.

FYI,
the premise behind what we are doing is we are giving users the ability to create web pages and in this paragraph section they can make any words they want italicized, bold, underlined, etc... Thats where the special characters come in, and they can do this multiple times throughout their paragraph or they dont have to do it at all.

slight, your initial replace is working but I guess my next problem is if there is a crlf it is not keeping it. I think if I can get that figured out then I will be good. I have an example if youd like. I also tried to run what you just posted and it runs for a while so I stop it before it executes.

I appreciate all the work you both have been putting in to help me. I hopefully can pass it on one day!!
0
 
LVL 76

Accepted Solution

by:
slightwv (䄆 Netminder) earned 500 total points
ID: 38737827
>>next problem is if there is a crlf it is not keeping it

Just add any character you want to 'keep' inside the square braces.

In Oracle just use the CHR sql function.

select regexp_replace(col1,'[^a-zA-Z0-9: ' || chr(10) || chr(13)|| '-]') from tab1;

>>Thats where the special characters come in, and they can do this multiple times throughout their paragraph or they dont have to do it at all.

Can you not just remove the ability of the users to do this in the first place?

If you remove the special characters after the fact, you lose being able to display it properly anyway.  Just reomve the ability in the app and you don't have this issue.
0
 

Author Comment

by:jknj72
ID: 38737876
they are being put there on purpose in the database and they are being displayed without the speical characters in another portion so its not being saved without the characters its just how I am displaying them.

I will try now..Thanks
0
 

Author Comment

by:jknj72
ID: 38737894
That worked great. Very impressive....THANK YOU!!!
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 38737907
>>That worked great

Be careful of edge cases!  That is data that might slip through the cracks that you didn't expect.
0
 

Author Closing Comment

by:jknj72
ID: 38737908
great job
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
Oracle Finace 3 43
Oracle Syntax 8 39
File generation using utl_file 4 28
SQL query 4 27
Introduction A previously published article on Experts Exchange ("Joins in Oracle", http://www.experts-exchange.com/Database/Oracle/A_8249-Joins-in-Oracle.html) makes a statement about "Oracle proprietary" joins and mixes the join syntax with gen…
Configuring and using Oracle Database Gateway for ODBC Introduction First, a brief summary of what a Database Gateway is.  A Gateway is a set of driver agents and configurations that allow an Oracle database to communicate with other platforms…
Via a live example, show how to take different types of Oracle backups using RMAN.
This video shows how to recover a database from a user managed backup

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now