Removing extra white spaces from .csv and replace with "|"

I have a .csv file to import into a table in a MSSQL database.  The problem is the .csv file is just one columns with about 4 different fields all separted with a series of spaces.
Example
Jon Doe                          NUll                           12333           S1548
Mary Jane                          Miii                  15483                E22566

and so on.
I am trying to remove the white spaces but when I use the trim feature in the DTS package I lose some data,  IS there anyway to do this and get all the data at the same time?
krbnldyAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

manav_mathurCommented:


But remember, things get dicey when some of your fields are null.
perl -i -pe 's/\s+/,/g/' yourfilename.csv

Open in new window

0
manav_mathurCommented:
I typoed an extra "/" at the end of the code
perl -i -pe 's/\s+/,/g' yourfilename.csv

Open in new window

0
manav_mathurCommented:
Ah....I didnt see the spaces between the first name and the last name. So that simple one-liner goes doesnt work. Also, this is only one column that we are talking about here and not the whole file. The first column can be obtained using Text::CSV.

Is there always only one space between first name and last name, and always more than one space between fields?? Is there any special characters there??
In other words, how do we differentiate betwen the space in the Name, and the space between the fields.
0
Determine the Perfect Price for Your IT Services

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

Adam314Commented:
You can use this:
    perl -i.bak -pe 's/\s{3,}/\|/g'  <filename>

Replace <filename> with the actual filename you want changed.  This will create a backup file, with the same name as <filename> but with .bak added.
The 3 means at least three spaces will be considered the seperator.  This works fine on your example, but if your actual data is different, you can change the 3 to another number.
0
Adam314Commented:
One more thing..... If you are on windows, replace the single quotes with double quotes:
    perl -i.bak -pe "s/\s{3,}/\|/g" <filename>
0
Anthony PerkinsCommented:
>>I am trying to remove the white spaces but when I use the trim feature in the DTS package I lose some data<<
Post your code.  There is no way the Trim function can truncate non-blank data.
0
krbnldyAuthor Commented:
Is the perl in the code in reference to the language,  If so I have never used that kind of programming
0
Adam314Commented:
It is a command line perl program.  Instead of the perl code being placed in a file, you run it directly from the command line.

What it means:
    perl               Use the perl interpreter to run the code
    -i.bak            Edit the file in-place (use the named file as both input and output)
                         and create a backup with the extension .bak
    -p                  assume "while (<>) { ... }" loop around program
                         print line at end of while loop
    -e                  The actual perl code.  It is placed in quotes so the shell doesn't do anything
                          unix shells use single-quotes, windows shell use double-quote
    filename        The filename to edit


So, in long form, it would be this:
#!/usr/bin/perl
use File::Copy;
my $filename=shift;
copy($filename, "$filename.bak");
open(IN, "<$filename.bak") or die "input: $!\n";
open(OUT, ">$filename") or die "output: $!\n";
while(my $Line = <IN>) {
    $Line =~ s/\s{3,}/\|/g;
    print OUT $Line;
}
close(OUT);
close(IN);

Open in new window

0
nmcdermaidCommented:
You need to find the exact specification for this file.

Is it delimited (columns seperated by known token(s) )
or is it fixed width (columns appear at fixed positions)?

Once you know that you can write some bulletproof code to parse it.
0
krbnldyAuthor Commented:
thanks for your suggestion, but i was not able to implement it.  Have never done anything in that language and had to take this project to someone else
0
krbnldyAuthor Commented:
not able to use the solution
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
nmcdermaidCommented:
Without the exact specification or more information, there isn't anything else taht we can do, sorry.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.