I don't understand your response. I already created a control file and loaded the data into an oracle table. I am trying to remove whitespace and the numbers.
Main Topics
Browse All TopicsI have some data in an oracle table that I need to clean up.
Here is a sample of the data from a text file before it was loaded into an oracle table via sql loader (I have 10,774 records in the table):
|(1) NT-53556 NT53556|(10) Radar Signal Interpretation Using Neural Network for Defect Detection in Concrete,|(20) Shoukry, Samir N. Martinelli, David Varadarajan, Srinivas T. Halabe, Udaya B. |(11) 01 Mar 1996|(15) 5 Page(s) |(22) N--(U) ISSN 0025-5327, BRIDGES, CONCRETE, GROUND PENETRATING RADAR, INFRASTRUCTURE, NEURAL NETWORKS.; |(27) na|(14) JOURNAL ARTICLES |(12) NTIAC |(3) Unclassified
As you can see The columns are pipe delimited.
Here is what I need to do:
1. There are 19 columns. I need to delete the numbers that appear on the left of the text for example: (1) from each column. Each record of a column for example author column has the same number. So for every record of that column at the beginning the data in that column will begin with (20). Everything after the open parens, number and close parens I want to remain in that column.
2. I also need to delete all the white spaces that appear after the open parens, number and close parens.
3. In the first field delete duplicate information (NT-53556 NT53556). As you can see the data appears twice. The only difference is the data on the left has a hyphen in it. I want to delete the data on the right (i.e. data without hyphen)
Could someone please help me? Thanks!
This Question has been solved and asker verified All Experts Exchange premium technology solutions are available to subscription members.
Experts Exchange has been collecting answers to technology questions since 1996…3 million and counting! If you have a question, chances are we already have your answer.
If you can't find the exact answer you're looking for, ask our exclusive community of 50,000 experts. You’ll get a personalized answer from a trusted professional.
Thousands of free tech tips, tricks, how-to’s and tutorials are available in our peer reviewed articles section. See for yourself how smart our experts are, no login required.
Access the answers to your technology questions today.
30-day free trial. Register in 60 seconds.
Members of the expert community talk about why the experience at Experts Exchange is different than what you will find anywhere else.

Try it out and discover for yourself.
30-day free trial. Register in 60 seconds.
Join the community of experts here and help other tech pros by answering question in your area of expertise. You can earn FREE access to all Experts Exchange's premium features and resources.
sikyala,
You need to do a little parsing, but it's pretty easy:
update yourtable
set col1 = substr(col1, instr(col1,' ')+1, instr(col1, ' ', 1, 2) - instr(col1, ' ')), -- this looks for the first space, then grabs everything from there to the next space
col2 = substr(col1, instr(col1,' ')+1), -- grab everything after the first space
col3= <same as #2>
That should do it.
Good luck!
DrSQL
Hi,
For case 1
you can very well use -
Update <tablename>
SET field1 = SUBSTR(field1,INSTR(field1
For case 2
you can use
Update <tablename>
SET field1 = TRIM(field1) -- This will remove white spaces from starting and ending of
the string
For case 3
I am not clear for the requirement , What is the base of identifying duplication cause in example there is no consistency in the data, few record have no space in between and few has many.
I hope this will solve the problem
Cheers!!
JB
For case 3 a sample of the data in ACCENS_NUM field (Primary Key of the table) is as follows:
(1) NT-53575 NT53575
(1) NT-53574 NT53574
(1) NT-53573 NT53573
(1) NT-53572 NT53572
- I want to essentially delete (1)
- delete the white space after (1)
- delete the data that appears on right (for example: NT53575, NT53574 and so on) and the white space that appears before this information
Basically if I did a query of this field I want to see:
NT-53575
NT-53574
NT-53573
NT-53572
Thanks!
In that case I would advice you to use following function,
--------------------------
-- Author - JB
-- Date - 23-Aug-2005
-- Modified -N.A.
-- Purpose -
-- Remarks - I have assumed that we have fixed string Primary key i.e. of 8 Characters
-- Try using following function in SELECT statement first and passing argument as col1
--------------------------
CREATE OR REPLACE FUNCTION my_val (p_string VARCHAR2)
RETURN VARCHAR2
IS
v_var VARCHAR2(100);
BEGIN
IF SUBSTR(REPLACE(p_string,'-
THEN
v_var := SUBSTR(p_string,1, 8) ;
ELSE
v_var := p_string;
END IF;
RETURN v_var;
END my_val;
/
For Col 1 and Col2 related queries should be executed first so extra space will be removed and then try this function OTHERWISE what I would suggest is user my previous comments for col1 and col2 in the above function and manipulate the data.
Cheers !!
JB
JB
I have found your instructions very helpful. I already used the code from case 1 and 2 to remove the number parens and white space on the left side of the data of the ACCENS_NUM field. So I just need to remove the duplicate information (i.e. the whitespace and data that appears on the right side of each row.
This is how the rows look now:
NT-53575 NT53575
I want to remove the white space and information that follows it.
example:
NT53575
Looking at the responses I have received it isn't clear how I should do this. Thanks for your help.
sikyala
sikyala,
After your clarification, I posted the following, please try this instead:
update yourtable
set col1 = substr(ltrim(substr(col1, instr(col1,')')+1)),1,inst
Good luck!
DrSQL
P.S. Here's a test I ran:
SQL> select substr(ltrim(substr(col1, instr(col1,')')+1)),1,inst
SQL> 2
SWL> ed
Wrote file afiedt.buf
1* select substr(ltrim(substr(&&col1
SQL> /
Enter value for col1: '(1) NT-53575 NT53575'
old 1: select substr(ltrim(substr(&&col1
new 1: select substr(ltrim(substr('(1) NT-53575 NT53575', instr('(1) NT-53575 NT53575',')
SUBSTR(L
--------
NT-53575
sikyala,
You are welcome. Please try to be careful in your selection of the accepted versus the assisted answer. I'm not by any means complaining about the point split - JB's contribution seems deserving. I just like to see the answer that gets used as the one that is accepted. It makes it clearer for those who read the thread later. And, if I'm mis-reading your last comment, I apologize. Thanks.
Good luck!
DrSQL
The reason for the split is because it was a 3 part question. basically I asked for help with 3 solutions to modify the data (i.e. I numbered each item (see original message)). I used The accepted answer as solutions to 2 of the questions (as stated in my last message to JB ) and I used the assisted answer as a solution to the 3rd question (as stated in my last comment). I hope this makes sense.
Business Accounts
Answer for Membership
by: leoahmadPosted on 2005-08-11 at 18:10:21ID: 14657426
Your contrl file be:
LOAD DATA
INFILE *
INTO TABLE load_delimited_data
FIELDS TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
( data1 "SUBSTR(DATA1,6,13)",
data2 "SUBSTR(DATA2,6)",
data3 "SUBSTR(DATA3,6)",
data4 "SUBSTR(DATA3,6)",
.............
.............
)
|(1) NT-53556 NT53556|(10) Radar Signal Interpretation Using Neural Network for Defect Detection in Concrete,|(20) Shoukry, Samir N. Martinelli, David Varadarajan, Srinivas T. Halabe, Udaya B. |(11) 01 Mar 1996|(15) 5 Page(s) |(22) N--(U) ISSN 0025-5327, BRIDGES, CONCRETE, GROUND PENETRATING RADAR, INFRASTRUCTURE, NEURAL NETWORKS.; |(27) na|(14) JOURNAL ARTICLES |(12) NTIAC |(3) Unclassified