How to remove blank spaces and DOS line break in a DAT file from UNIX script

How to remove blank spaces and DOS line break in a DAT file from UNIX script unattended (through sh file)

Would prefer NOT to create new file. Using some script to clean-up (blank spaces & ^M) the same file

If you could please provide sample script which will accept file names and clean this mess&that would be a great help

Scenario:
Windows user will create files with | separated columns and upload them onto UNIX server
UNIX script parses these files for further processing
Load them into Oracle db using sql loader

Issue:
UNIX script & sql loader fails to parse each line since it has got blank spaces and ^M (visible from vi editor)

Please check the attached sample file

Thanks in Advance
grind67Asked:
Who is Participating?
 
TintinCommented:
To remove two or more spaces (which I'm assuming should be suitable for your data), do

sed 's/  *//g'
0
 
TintinCommented:
You don't mention what Unix flavour you have, but try

sed -i -e ''s/ //g' -e 's/.$//' file
0
 
peter991Commented:
$> dos2unix file 2>&-
0
Cloud Class® Course: Microsoft Windows 7 Basic

This introductory course to Windows 7 environment will teach you about working with the Windows operating system. You will learn about basic functions including start menu; the desktop; managing files, folders, and libraries.

 
grind67Author Commented:
Sorry, here it is
Machine manufacturer and class: HP RP7420
Operating System                : HPUX 11.11

Thanks for your quick response...I tried the command > sed -i -e ''s/ //g' -e 's/.$//' NOTES.DAT
it didnt do anything to the file and It didnt give me the prompt back it went to > prompt....I had to press Ctrl + C to exit and return back to normal prompt

Please advise
0
 
ozoCommented:
sed -i -e ''s/ //g' -e 's/.$//' file
should have been
sed -i -e 's/ //g' -e 's/.$//' file
0
 
grind67Author Commented:
thanks ozo,

I tried this > sed -i -e 's/ //g' -e 's/.$//' NOTES.DAT

received below error
sed: illegal option -- i
Usage: sed [-n] [-e script] [-f source_file] [file...]

Please avise?
0
 
grind67Author Commented:
HI peter991

could you please suggest where can I download and how to install/use in UNIX please?
0
 
peter991Commented:
In HP the name is dos2ux

/usr/bin/dos2ux

My suggestion don't work on my HP. Only on Solaris.
I know you prefer not to create a new file..

$> /usr/bin/dos2ux dirty_file > clean_file
0
 
grind67Author Commented:
Hi Pete991, thanks for your suggestion anyway..it is good to know & learn :)
0
 
peter991Commented:
To remove blank spaces try:

$> sed -e '/^$/d' file > new file
0
 
peter991Commented:
should be new_file ... sorry!
0
 
grind67Author Commented:
Hi Pete991, tried this
sed -e '/^$/d' NOTES.DAT > NOTES_NOSPACE.DAT

Not sure it removed the blank spaces or It couldnt remove the spaces because of dos line break ^M end of line

Please check the attached Pete991_01.zip for your reference
Pete991-01.zip
0
 
TintinCommented:
HPUX sed doesn't have the -i option, so you'll have to do

sed  -e 's/ //g' -e 's/.$//' NOTES.DAT >/tmp/$$ && mv /tmp/$$ NOTES.DAT
0
 
peter991Commented:
This worked for me:

The ^M should be typed with ctrl+v ctrl+m
# sed 's/$$//' peter_tst.lst | sed 's/^M$//' | sed -e '/^$/d' > new_file



0
 
grind67Author Commented:
Hi Tintin, thats awesome...worked like a treat

we have slight problem (my fault I didnt highlight this, sorry)

Source line
<QLD SERVICED BHW PROJECT AS PER INTRANET          BRIEFING NOTE RELEASED 29/11/07. (SUN RETAIL      VACANT CONSUMPTION SITE).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          > upto here

After executing above command
QLDSERVICEDBHWPROJECTASPERINTRANETBRIEFINGNOTERELEASED29/11/07.(SUNRETAILVACANTCONSUMPTIONSITE).

As we can see it removed the spaces in between words (from all columns) as well...could you please advise how to tackle this
0
 
grind67Author Commented:
HI TinTin, If we could remove just the space after the last char on each column that would be great help

OR
could you please advise how to remove just the ^M from the file? whithout creating new file.

Thanks,
0
 
sjm_eeCommented:
sed -e 's/[ ]*^M$//'

where ^M is entered as c-V c-M.
0
 
grind67Author Commented:
thank you all

Took TinTin approach...using below cmd to remove double spaces...it did remove all double space...which is awesome

But still only subset of the row get loaded into oracle table via sqllodr....below are the steps taken
sed -e 's/  //g' UATQLD_BI_NOTES_20081119.DAT >/tmp/$$ && mv /tmp/$$ UATQLD_BI_NOTES_20081119.DAT

Step 1
From Putty issed below cmd
sed -e 's/  //g' UATQLD_BI_NOTES_20081119.DAT >/tmp/$$ && mv /tmp/$$ UATQLD_BI_NOTES_20081119.DAT

Step 2
Opened the file from winscp to remove the header & Footer row manually and saved

Step 3
Moved the file to a windows server where oracle sqlldr is available

Step 4
executed the loader

Step 5
Only subset of rows get loaded (please check attached source DAT, BAD, and LOG files

please adivse
dat-bad-log-files.zip
0
 
grind67Author Commented:
Quick Update

After making control file change "  note_description CHAR(400) "substr(:NOTE_DESCRIPTION,1,400)"" based on another question I raised in EXchange

Very interesting thing after loading the data with your above suggestion

WORKED_CISOV_BI_NOTE_200811180931.DAT filed loaded perfectly
FAILED_UATQLD_BI_NOTES_20081119_DAT...failed and loaded only subset of rows

Please check the attached source file and pleae advice if you see something different

once thing I notice when I tried to FAILED_UATQLD_BI_NOTES_20081119_DAT using UltraEdit...that it displayed a warning msg saying "file is probably not a DOS format", do you want to open with dos format?"

This warning never appears when opening the worked file "WORKED_CISOV_BI_NOTE_200811180931.DAT"

Please advise
WORKED-FAILED-DAT.zip
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.