Solved

File manipulation in Unix shell script

Posted on 2009-04-06
Last Modified: 2013-11-17
Hi,
I have two files, Gmf_Incr_1.XML and Gmf_Incr_2.XML. I need to merge them and generate a new file named in the format Gmf_Incr.<MMDDYYYY>.XML.

Here is an example:
Gmf_Incr_1.XML will have

<Group><Key>...</Key><CommentSeg> ...
  ... </CommentSeg>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg>

Gmf_Incr_2.XML will have

<RateSeg>...</RateSeg></Group>
<RateSeg>...</RateSeg></Group>
<RateSeg>...</RateSeg></Group>

 
And the final output file Gmf_Incr.04062009.XML should have

<Gmf_Incr>
<Group><Key>...</Key><CommentSeg> ...
                                   </CommentSeg> <RateSeg>...</RateSeg></Group>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg><RateSeg>...</RateSeg></Group>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg><RateSeg>...</RateSeg></Group>
</Gmf_Incr>






In my current shell script I am using the lines below:

# concat of GMF files
paste -d~ $1/$3_1.XML $1/$3_2.XML > $1/Gmf_Incr_Bulk1.XML

sed -e "s/~//g" $1/Gmf_Incr_Bulk1.XML > $1/Gmf_Incr_Bulk.XML


With the above code, the final file comes out correctly only when the records contain no newline characters. If a record contains a newline, the merged line breaks in the middle.

Can anybody help me out?

Thanks
Deepak
####################################################################################
#!/bin/sh
# Parameters
# 1 - Folder where the extract files are present
# 2 - Parent folder of the individual plan folders
# 3 - Name of the Extract - Gmf_Incr/Gmf_Bulk
# Author : Deepak
#####################################################################################
#get the plan for which extract should be done
plan=`sed q $1/$3_Request.txt | sed 's/"//g' | cut -d',' -f 2`
echo 'This is the plan ==> '$plan
 
tmpfilename=`date '+%m%d%Y.XML'`
filename="${3}.${tmpfilename}"
echo 'This is the filename ==> '$filename
 
# concat of GMF files
paste -d~ $1/$3_1.XML $1/$3_2.XML > $1/Gmf_Incr_Bulk1.XML
 
sed -e "s/~//g" $1/Gmf_Incr_Bulk1.XML > $1/Gmf_Incr_Bulk.XML
 
 
 
cp $1/Gmf_Incr_Bulk.XML $2/$plan/$filename
rm -f  $1/Gmf_Incr_Bulk1.XML
rm -f $1/Gmf_Incr_Bulk.XML
rm -f $1/$3_2.XML
rm -f $1/$3_1.XML
 
#Adding of tag
 
 
Myfile="$2/$plan/Test.xml"
 
mytag1="<${3}>"
mytag2="</${3}>"
 
echo "$mytag1" >$Myfile
 
#----- Loop to read file data content
 
while read Line
do
 echo "$Line"  >> $Myfile
 
done < $2/$plan/$filename
 
 
echo "$mytag2" >>$Myfile
# move file to proper requested plan folder
mv $2/$plan/Test.xml $2/$plan/$filename
 
echo " gmf  $filename file generated now copying to NDM location " >> /home/dsadm/DataStageShellScripts/smf_gmf_incr.log
 
# copy file to desired directory for NDM transmission
cp $2/$plan/$filename /apps/IBM/DataStage/Projects/extract/Plans/710MI/
 
echo " copying file to NDM location over " >> /home/dsadm/DataStageShellScripts/smf_gmf_incr.log
 
echo "gmf file generation process is over " >> /home/dsadm/DataStageShellScripts/smf_gmf_incr.log
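
The tag-wrapping step near the end of the script can probably be done without a read loop. A minimal sketch, assuming the merged file already exists; `wrap_with_tag` is a hypothetical helper name, not something from the original script. One reason to avoid `while read Line` here is that, without `-r` and an empty `IFS`, `read` strips leading whitespace and interprets backslashes, so the copied lines may not match the input exactly.

```shell
#!/bin/sh
# wrap_with_tag TAG INFILE OUTFILE  (hypothetical helper, not in the original script)
# Writes <TAG>, then the unmodified contents of INFILE, then </TAG> to OUTFILE.
# INFILE and OUTFILE must be different files.
wrap_with_tag() {
    tag=$1
    infile=$2
    outfile=$3
    {
        printf '<%s>\n' "$tag"
        cat "$infile"
        printf '</%s>\n' "$tag"
    } > "$outfile"
}
```

In the script above this would be called roughly as `wrap_with_tag "$3" "$2/$plan/$filename" "$2/$plan/Test.xml"`, followed by the existing `mv`.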

Question by:deepak_tyco
8 Comments
 
LVL 20

Expert Comment

by:flow01
ID: 24076469
Try removing the line breaks with the translate utility tr (\n = newline):

sed -e "s/~//g" $1/Gmf_Incr_Bulk1.XML | tr -d '\n' > $1/Gmf_Incr_Bulk.XML
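
For reference, a small sketch of what deleting newlines with `tr -d '\n'` does to a file. The sample data and the `flatten` name are made up for illustration. Every newline is removed, including the ones between records, so the whole input collapses into a single line; that may or may not be what is wanted here.

```shell
#!/bin/sh
# flatten: delete every newline from standard input.
# Note that this also removes the newlines that separate records.
flatten() {
    tr -d '\n'
}

# Made-up sample: the first record spans two lines, the second is on one line.
printf '<a>one\ntwo</a>\n<b>three</b>\n' | flatten
# prints: <a>onetwo</a><b>three</b>   (with no trailing newline)
```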
 

Author Comment

by:deepak_tyco
ID: 24076549
Hi ,

Basically, the current file is generated as below:

<Report>
<Project>
<Proj_Name>ABC Enhancement</Proj_Name>
<Proj_Type>Mechanical</Proj_Type>
<Proj_Description>Project started on 01/03/2006.
However, it is running behind due to unavailable
Resources</Proj_Description>
<Proj_Hours>123.00</Proj_Hours>
</Project>
</Report>


I want the file to be

<Report>
<Project>
<Proj_Name>ABC Enhancement</Proj_Name>
<Proj_Type>Mechanical</Proj_Type>
<Proj_Description>Project started on 01/03/2006 However, it is running behind due to unavailable Resources</Proj_Description>
<Proj_Hours>123.00</Proj_Hours>
</Project>
</Report>






 
LVL 19

Expert Comment

by:simon3270
ID: 24096731
The following awk script will take an XML file and will join together any lines which don't end with a tag (i.e. if the last character in the line isn't ">"):

awk 'BEGIN{n="";}
{if (n == "") {n=$0} else {n=n " " $0}}
/>$/{print n;n="";}' input_file.xml > output_file.xml

Just use that to manipulate your two XML files, then paste the resulting files together.
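
Putting that together, a rough sketch of the join-then-paste approach. The function names `join_xml_lines` and `merge_gmf` are assumptions for illustration, and `mktemp` is assumed to be available (it is common but not strictly POSIX). `paste -d '\0'` is the POSIX way to ask for an empty delimiter, which avoids the `~` placeholder and the extra sed pass.

```shell
#!/bin/sh
# join_xml_lines FILE  (hypothetical helper)
# Prints FILE with every line that does not end in ">" joined to the next
# line, separated by a single space. Same logic as the awk script above,
# plus an END rule that flushes a trailing unfinished line.
join_xml_lines() {
    awk '{ n = (n == "") ? $0 : n " " $0 }
         />$/ { print n; n = "" }
         END { if (n != "") print n }' "$1"
}

# merge_gmf FILE1 FILE2 OUTFILE  (hypothetical helper)
# Joins broken lines in both inputs, then pastes them side by side
# with no delimiter between the two halves of each record.
merge_gmf() {
    tmp1=$(mktemp) || return 1
    tmp2=$(mktemp) || return 1
    join_xml_lines "$1" > "$tmp1"
    join_xml_lines "$2" > "$tmp2"
    paste -d '\0' "$tmp1" "$tmp2" > "$3"
    rm -f "$tmp1" "$tmp2"
}
```

A call might look like `merge_gmf "$1/$3_1.XML" "$1/$3_2.XML" "$1/Gmf_Incr_Bulk.XML"`. This only lines up if both files contain the same number of records after joining.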

 
LVL 5

Expert Comment

by:vikaskhoria
ID: 24105587
Use the paste command with head and tail to get this done.
Read about paste here:
http://www.softpanorama.org/Tools/paste.shtml

Basically, paste simply merges data from two files horizontally, i.e. side by side.
So you may need head/tail or awk operations to skip the headers, footers, etc.
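
To illustrate the side-by-side behaviour, and why a stray line break throws the merge off, here is a small sketch with made-up data (`demo_paste` is a hypothetical name, and `mktemp` is assumed to be available). `paste` pairs the files by physical line number, so when one record in the first file spans two lines, every later pairing shifts.

```shell
#!/bin/sh
# demo_paste: show how paste pairs physical lines (sample data is made up).
demo_paste() {
    dir=$(mktemp -d) || return 1
    # Record A is broken across two lines; record B is on one line.
    printf '<A>first\nhalf</A>\n<B>whole</B>\n' > "$dir/left.txt"
    printf '<R1/>\n<R2/>\n' > "$dir/right.txt"
    # -d '\0' means "no delimiter" (POSIX); lines are simply concatenated.
    paste -d '\0' "$dir/left.txt" "$dir/right.txt"
    rm -rf "$dir"
}

demo_paste
# Output:
#   <A>first<R1/>
#   half</A><R2/>
#   <B>whole</B>
```

The second rate segment ends up attached to the tail of record A instead of to record B, which matches the symptom described in the question.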
 

Accepted Solution

by:
deepak_tyco earned 0 total points
ID: 24105886
Hi ,

Thanks for all of your suggestions.
I was able to fix this issue with a Perl script.

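#!/usr/bin/perl
use strict;
use warnings;

# Join continuation lines: a line keeps its newline only when it ends in ">";
# otherwise the next line is appended directly, with no space in between.
sub join_lines {
    my ($in_path, $out_path) = @_;
    open(my $in,  '<', $in_path)  or die "Cannot read $in_path: $!";
    open(my $out, '>', $out_path) or die "Cannot write $out_path: $!";
    while (my $line = <$in>) {
        chomp $line;
        print {$out} $line, ($line =~ />$/ ? "\n" : "");
    }
    close($in);
    close($out) or die "Cannot close $out_path: $!";
}

# First argument is written to Gmf_Incr_3.XML, second to Gmf_Incr_4.XML.
my $dir = "/apps/IBM/DataStage/Projects/datastage/extracts";
join_lines($ARGV[0], "$dir/Gmf_Incr_3.XML");
join_lines($ARGV[1], "$dir/Gmf_Incr_4.XML");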

 
LVL 19

Expert Comment

by:simon3270
ID: 24106190
Your fix will join lines together without a space between them.  For example:

<hello>
<longtag>without end
end</longtag>
</hello>

will be rearranged as:

<hello>
<longtag>without endend</longtag>
</hello>

(note the "endend"). My awk script would instead produce the better:

<hello>
<longtag>without end end</longtag>
</hello>
 

Author Comment

by:deepak_tyco
ID: 24106226
Hi,

Thanks for the info.
But I don't want an extra space; otherwise my XML will not parse in the other system, as the size increases.

Thanks
Deepak
 
LVL 19

Expert Comment

by:simon3270
ID: 24106254
The size does not increase - you are replacing a single newline (in the original file) with a single space (in the modified file), which is what an XML viewer would do anyway.
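
This is easy to check with `wc -c`. A quick sketch, where `joined_sizes` is a hypothetical helper and the awk join is the same logic as the script posted earlier in the thread. It assumes the file's last line ends in ">", so nothing is left unflushed.

```shell
#!/bin/sh
# joined_sizes FILE  (hypothetical helper)
# Prints the byte count of FILE before and after joining broken lines
# with a single space, as "BEFORE AFTER".
joined_sizes() {
    before=$(wc -c < "$1")
    after=$(awk '{ n = (n == "") ? $0 : n " " $0 }
                 />$/ { print n; n = "" }' "$1" | wc -c)
    # Left unquoted on purpose: some wc implementations pad the count
    # with spaces, and word splitting strips that padding.
    echo $before $after
}
```

Running `joined_sizes` on a file containing the `<hello>` example above should print the same number twice, since each removed newline is replaced by exactly one space.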