How to remove blank spaces and DOS line break in a DAT file from UNIX script

How to remove blank spaces and DOS line break in a DAT file from UNIX script unattended (through sh file)

Would prefer NOT to create new file. Using some script to clean-up (blank spaces & ^M) the same file

If you could please provide sample script which will accept file names and clean this mess&that would be a great help

Scenario:
Windows user will create files with | separated columns and upload them onto UNIX server
UNIX script parses these files for further processing
Load them into Oracle db using sql loader

Issue:
UNIX script & sql loader fails to parse each line since it has got blank spaces and ^M (visible from vi editor)

Please check the attached sample file

Thanks in Advance
grind67Asked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

TintinCommented:
You don't mention what Unix flavour you have, but try

sed -i -e ''s/ //g' -e 's/.$//' file
peter991Commented:
$> dos2unix file 2>&-
grind67Author Commented:
Sorry, here it is
Machine manufacturer and class: HP RP7420
Operating System                : HPUX 11.11

Thanks for your quick response...I tried the command > sed -i -e ''s/ //g' -e 's/.$//' NOTES.DAT
it didnt do anything to the file and It didnt give me the prompt back it went to > prompt....I had to press Ctrl + C to exit and return back to normal prompt

Please advise
OWASP: Forgery and Phishing

Learn the techniques to avoid forgery and phishing attacks and the types of attacks an application or network may face.

ozoCommented:
sed -i -e ''s/ //g' -e 's/.$//' file
should have been
sed -i -e 's/ //g' -e 's/.$//' file
grind67Author Commented:
thanks ozo,

I tried this > sed -i -e 's/ //g' -e 's/.$//' NOTES.DAT

received below error
sed: illegal option -- i
Usage: sed [-n] [-e script] [-f source_file] [file...]

Please avise?
grind67Author Commented:
HI peter991

could you please suggest where can I download and how to install/use in UNIX please?
peter991Commented:
In HP the name is dos2ux

/usr/bin/dos2ux

My suggestion don't work on my HP. Only on Solaris.
I know you prefer not to create a new file..

$> /usr/bin/dos2ux dirty_file > clean_file
grind67Author Commented:
Hi Pete991, thanks for your suggestion anyway..it is good to know & learn :)
peter991Commented:
To remove blank spaces try:

$> sed -e '/^$/d' file > new file
peter991Commented:
should be new_file ... sorry!
grind67Author Commented:
Hi Pete991, tried this
sed -e '/^$/d' NOTES.DAT > NOTES_NOSPACE.DAT

Not sure it removed the blank spaces or It couldnt remove the spaces because of dos line break ^M end of line

Please check the attached Pete991_01.zip for your reference
Pete991-01.zip
TintinCommented:
HPUX sed doesn't have the -i option, so you'll have to do

sed  -e 's/ //g' -e 's/.$//' NOTES.DAT >/tmp/$$ && mv /tmp/$$ NOTES.DAT
peter991Commented:
This worked for me:

The ^M should be typed with ctrl+v ctrl+m
# sed 's/$$//' peter_tst.lst | sed 's/^M$//' | sed -e '/^$/d' > new_file



grind67Author Commented:
Hi Tintin, thats awesome...worked like a treat

we have slight problem (my fault I didnt highlight this, sorry)

Source line
<QLD SERVICED BHW PROJECT AS PER INTRANET          BRIEFING NOTE RELEASED 29/11/07. (SUN RETAIL      VACANT CONSUMPTION SITE).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          > upto here

After executing above command
QLDSERVICEDBHWPROJECTASPERINTRANETBRIEFINGNOTERELEASED29/11/07.(SUNRETAILVACANTCONSUMPTIONSITE).

As we can see it removed the spaces in between words (from all columns) as well...could you please advise how to tackle this
grind67Author Commented:
HI TinTin, If we could remove just the space after the last char on each column that would be great help

OR
could you please advise how to remove just the ^M from the file? whithout creating new file.

Thanks,
sjm_eeCommented:
sed -e 's/[ ]*^M$//'

where ^M is entered as c-V c-M.
TintinCommented:
To remove two or more spaces (which I'm assuming should be suitable for your data), do

sed 's/  *//g'

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
grind67Author Commented:
thank you all

Took TinTin approach...using below cmd to remove double spaces...it did remove all double space...which is awesome

But still only subset of the row get loaded into oracle table via sqllodr....below are the steps taken
sed -e 's/  //g' UATQLD_BI_NOTES_20081119.DAT >/tmp/$$ && mv /tmp/$$ UATQLD_BI_NOTES_20081119.DAT

Step 1
From Putty issed below cmd
sed -e 's/  //g' UATQLD_BI_NOTES_20081119.DAT >/tmp/$$ && mv /tmp/$$ UATQLD_BI_NOTES_20081119.DAT

Step 2
Opened the file from winscp to remove the header & Footer row manually and saved

Step 3
Moved the file to a windows server where oracle sqlldr is available

Step 4
executed the loader

Step 5
Only subset of rows get loaded (please check attached source DAT, BAD, and LOG files

please adivse
dat-bad-log-files.zip
grind67Author Commented:
Quick Update

After making control file change "  note_description CHAR(400) "substr(:NOTE_DESCRIPTION,1,400)"" based on another question I raised in EXchange

Very interesting thing after loading the data with your above suggestion

WORKED_CISOV_BI_NOTE_200811180931.DAT filed loaded perfectly
FAILED_UATQLD_BI_NOTES_20081119_DAT...failed and loaded only subset of rows

Please check the attached source file and pleae advice if you see something different

once thing I notice when I tried to FAILED_UATQLD_BI_NOTES_20081119_DAT using UltraEdit...that it displayed a warning msg saying "file is probably not a DOS format", do you want to open with dos format?"

This warning never appears when opening the worked file "WORKED_CISOV_BI_NOTE_200811180931.DAT"

Please advise
WORKED-FAILED-DAT.zip
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Unix OS

From novice to tech pro — start learning today.