Solved

File manipulation in a Unix shell script

Posted on 2009-04-06
Last Modified: 2013-11-17
Hi,
I have two files, Gmf_Incr_1.XML and Gmf_Incr_2.XML. I need to merge them and generate a new file named in the format Gmf_Incr.<MMDDYYYY>.XML.

Here is an example.
Gmf_Incr_1.XML will have

<Group><Key>...</Key><CommentSeg> ...
  ... </CommentSeg>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg>

Gmf_Incr_2.XML will have

<RateSeg>...</RateSeg></Group>
<RateSeg>...</RateSeg></Group>
<RateSeg>...</RateSeg></Group>

 
And the final output file Gmf_Incr.04062009.XML should have

<Gmf_Incr>
<Group><Key>...</Key><CommentSeg> ...
                                   </CommentSeg> <RateSeg>...</RateSeg></Group>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg><RateSeg>...</RateSeg></Group>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg><RateSeg>...</RateSeg></Group>
</Gmf_Incr>






In my current shell script I am using the lines below:

# concat of GMF files
paste -d~ $1/$3_1.XML $1/$3_2.XML > $1/Gmf_Incr_Bulk1.XML

sed -e "s/~//g" $1/Gmf_Incr_Bulk1.XML > $1/Gmf_Incr_Bulk.XML


With the above code, the final file comes out correctly only when the records contain no newline characters. If a record contains a newline, the merged line breaks in the middle.

Can anybody help me out?

Thanks
Deepak
#!/bin/sh
####################################################################################
# Parameters
# 1 - Folder where the extract files are present
# 2 - Parent folder of the individual plan folders
# 3 - Name of the Extract - Gmf_Incr/Gmf_Bulk
# Author : Deepak
#####################################################################################
#get the plan for which extract should be done
plan=`sed q $1/$3_Request.txt | sed 's/"//g' | cut -d',' -f 2`
echo 'This is the plan ==> '$plan
 
tmpfilename=`date '+%m%d%Y.XML'`
filename="${3}.${tmpfilename}"
echo 'This is the filename ==> '$filename
 
# concat of GMF files
paste -d~ $1/$3_1.XML $1/$3_2.XML > $1/Gmf_Incr_Bulk1.XML
 
sed -e "s/~//g" $1/Gmf_Incr_Bulk1.XML > $1/Gmf_Incr_Bulk.XML
 
 
 
cp $1/Gmf_Incr_Bulk.XML $2/$plan/$filename
rm -f  $1/Gmf_Incr_Bulk1.XML
rm -f $1/Gmf_Incr_Bulk.XML
rm -f $1/$3_2.XML
rm -f $1/$3_1.XML
 
#Adding of tag
 
 
Myfile="$2/$plan/Test.xml"
 
mytag1="<${3}>"
mytag2="</${3}>"
 
echo "$mytag1" >$Myfile
 
#----- Loop to read file data content
 
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r Line
do
 printf '%s\n' "$Line" >> "$Myfile"
 
done < "$2/$plan/$filename"
 
 
echo "$mytag2" >>$Myfile
# move file to proper requested plan folder
mv $2/$plan/Test.xml $2/$plan/$filename
 
echo " gmf  $filename file generated now copying to NDM location " >> /home/dsadm/DataStageShellScripts/smf_gmf_incr.log
 
# copy file to desired directory for NDM transmission
cp $2/$plan/$filename /apps/IBM/DataStage/Projects/extract/Plans/710MI/
 
echo " copying file to NDM location over " >> /home/dsadm/DataStageShellScripts/smf_gmf_incr.log
 
echo "gmf file generation process is over " >> /home/dsadm/DataStageShellScripts/smf_gmf_incr.log

Question by:deepak_tyco
8 Comments
 
LVL 20

Expert Comment

by:flow01
ID: 24076469
Try removing the line breaks with the translate command tr (\n = newline):

sed -e "s/~//g" $1/Gmf_Incr_Bulk1.XML | tr -d '\n' > $1/Gmf_Incr_Bulk.XML
 

Author Comment

by:deepak_tyco
ID: 24076549
Hi ,

Basically, the file is currently generated as below:

<Report>
<Project>
<Proj_Name>ABC Enhancement</Proj_Name>
<Proj_Type>Mechanical</Proj_Type>
<Proj_Description>Project started on 01/03/2006.
However, it is running behind due to unavailable
Resources</Proj_Description>
<Proj_Hours>123.00</Proj_Hours>
</Project>
</Report>


I want the file to be

<Report>
<Project>
<Proj_Name>ABC Enhancement</Proj_Name>
<Proj_Type>Mechanical</Proj_Type>
<Proj_Description>Project started on 01/03/2006 However, it is running behind due to unavailable Resources</Proj_Description>
<Proj_Hours>123.00</Proj_Hours>
</Project>
</Report>






 
LVL 19

Expert Comment

by:simon3270
ID: 24096731
The following awk script will take an XML file and will join together any lines which don't end with a tag (i.e. if the last character in the line isn't ">"):

awk 'BEGIN{n="";}
{if (n == "") {n=$0} else {n=n " " $0}}
/>$/{print n;n="";}' input_file.xml > output_file.xml

Just use that to manipulate your two XML files, then paste the resulting files together.
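
As a rough sketch of how that suggestion might fit into the original script (the helper names join_tags and merge_xml are hypothetical, and mktemp is assumed to be available, which POSIX does not guarantee): each half is first normalised so that every record sits on one line, then the halves are pasted together with an empty delimiter and wrapped in the root tag.

```shell
#!/bin/sh
# join_tags: join physical lines until one ends in ">", inserting a single
# space where each newline was removed (same logic as the awk script above).
join_tags() {
    awk '{ n = (n == "") ? $0 : n " " $0 }
         />$/ { print n; n = "" }
         END  { if (n != "") print n }' "$1"
}

# merge_xml (hypothetical helper): normalise both halves, paste them side by
# side with no delimiter, and wrap the result in <root> ... </root>.
merge_xml() {
    part1=$1 part2=$2 root=$3
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    join_tags "$part1" > "$tmp1"
    join_tags "$part2" > "$tmp2"
    echo "<$root>"
    # '\0' in the paste delimiter list stands for the empty string.
    paste -d '\0' "$tmp1" "$tmp2"
    echo "</$root>"
    rm -f "$tmp1" "$tmp2"
}
```

In the question's script this could be called as merge_xml "$1/$3_1.XML" "$1/$3_2.XML" "$3" > "$2/$plan/$filename", which would also make the separate sed step and the tag-adding loop unnecessary.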
 
LVL 5

Expert Comment

by:vikaskhoria
ID: 24105587
Use the paste command with head and tail to get this done.
Read about paste here:
http://www.softpanorama.org/Tools/paste.shtml

Basically, paste simply merges data from two files horizontally, i.e. side by side.
So you may need head/tail or awk operations to skip the headers, footers, etc.
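
Because paste pairs lines purely by position, one extra physical line in the first file shifts every later pair. A minimal guard, assuming the two files are meant to have the same number of lines (check_and_paste is a hypothetical helper name), might look like this:

```shell
#!/bin/sh
# check_and_paste (hypothetical helper): refuse to paste when the two files
# have different line counts, since paste pairs lines by position only.
check_and_paste() {
    # $(( ... )) strips the padding some wc implementations add.
    n1=$(( $(wc -l < "$1") ))
    n2=$(( $(wc -l < "$2") ))
    if [ "$n1" -ne "$n2" ]; then
        echo "line count mismatch: $1 has $n1, $2 has $n2" >&2
        return 1
    fi
    # '\0' in the delimiter list means "no delimiter", so the "~" delimiter
    # and the follow-up sed that removes it are not needed.
    paste -d '\0' "$1" "$2"
}
```

A non-zero return here would be a sign that one of the inputs still has records split across lines and needs joining first.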
 

Accepted Solution

by:
deepak_tyco earned 0 total points
ID: 24105886
Hi ,

Thanks for all of your suggestions.
I was able to fix this issue with a Perl script.

#!/usr/bin/perl
use strict;
use warnings;

# Copy $in to $out, writing a newline only after lines that end in ">".
# Other lines are joined directly to the next line (no space is added).
sub join_lines {
    my ($in, $out) = @_;
    open my $src, '<', $in  or die "Cannot open $in: $!";
    open my $dst, '>', $out or die "Cannot open $out: $!";
    while (<$src>) {
        chomp;
        if ( />$/ ) {
            print {$dst} "$_\n";
        } else {
            print {$dst} $_;
        }
    }
    close $src;
    close $dst or die "Cannot close $out: $!";
}

my $dir = "/apps/IBM/DataStage/Projects/datastage/extracts";
join_lines($ARGV[0], "$dir/Gmf_Incr_3.XML");
join_lines($ARGV[1], "$dir/Gmf_Incr_4.XML");

 
LVL 19

Expert Comment

by:simon3270
ID: 24106190
Your fix will join lines together without a space between them. For example:

<hello>
<longtag>without end
end</longtag>
</hello>

will be rearranged as:

<hello>
<longtag>without endend</longtag>
</hello>

(note the "endend"). My awk script would instead produce the preferable:

<hello>
<longtag>without end end</longtag>
</hello>
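
To compare the two behaviours side by side, here is a small sketch in which the separator is a parameter (join_tags_sep is a hypothetical name, not part of either posted script): passing "" reproduces the Perl result, passing " " reproduces the awk result.

```shell
#!/bin/sh
# join_tags_sep (hypothetical helper)
#   $1 = separator inserted where a newline is removed ("" or " ")
#   $2 = input file
join_tags_sep() {
    awk -v sep="$1" '
        { n = (started ? n sep $0 : $0); started = 1 }   # accumulate the record
        />$/ { print n; n = ""; started = 0 }            # line ends in ">": flush
        END  { if (started) print n }                    # flush any unfinished record
    ' "$2"
}
```

For the four-line example above, join_tags_sep "" gives "without endend" and join_tags_sep " " gives "without end end".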
 

Author Comment

by:deepak_tyco
ID: 24106226
Hi,

Thanks for the info.
But I don't want the extra space; otherwise my XML will not parse in the other system, because the size increases.

Thanks
Deepak
 
LVL 19

Expert Comment

by:simon3270
ID: 24106254
The size does not increase: you are replacing a single newline (in the original file) with a single space (in the modified file), which is what an XML viewer would do anyway.
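
One way to check this on a real file is to compare byte counts before and after the join. The sketch below assumes Unix line endings and a trailing newline on the last line; size_before_after is a hypothetical helper name.

```shell
#!/bin/sh
# size_before_after (hypothetical helper): print the byte count of a file
# before and after joining continuation lines with a single space.
size_before_after() {
    before=$(( $(wc -c < "$1") ))
    # Each line is followed by "\n" if it ends in ">", otherwise by one space,
    # so every removed newline is replaced by exactly one byte.
    after=$(( $(awk '{ sep = (/>$/) ? "\n" : " "; printf "%s%s", $0, sep }' "$1" | wc -c) ))
    echo "$before $after"
}
```

If the two numbers printed differ, the input probably has CRLF line endings or trailing whitespace after a closing tag, which would be worth checking separately.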