Solved

File manipulation in Unix shell script

Posted on 2009-04-06
Last Modified: 2013-11-17
Hi,
I have two files, Gmf_Incr_1.XML and Gmf_Incr_2.XML. I need to merge these two files and generate a new file named in the format Gmf_Incr.<MMDDYYYY>.XML.

Here is the example:
Gmf_Incr_1.XML will have

<Group><Key>...</Key><CommentSeg> ...
  ... </CommentSeg>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg>

Gmf_Incr_2.XML will have

<RateSeg>...</RateSeg></Group>
<RateSeg>...</RateSeg></Group>
<RateSeg>...</RateSeg></Group>

 
And the final output file Gmf_Incr.04062009.XML should have

<Gmf_Incr>
<Group><Key>...</Key><CommentSeg> ...
                                   </CommentSeg> <RateSeg>...</RateSeg></Group>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg><RateSeg>...</RateSeg></Group>
<Group><Key>...</Key><CommentSeg> ... </CommentSeg><RateSeg>...</RateSeg></Group>
</Gmf_Incr>






In my current shell script I am using the lines below:

# concat of GMF files
paste -d~ $1/$3_1.XML $1/$3_2.XML > $1/Gmf_Incr_Bulk1.XML

sed -e "s/~//g" $1/Gmf_Incr_Bulk1.XML > $1/Gmf_Incr_Bulk.XML


With the above code, the final file comes out correctly only when the records contain no embedded newline characters. If a record contains a newline, the merged output breaks in the middle of the line.

Can anybody help me out?

Thanks
Deepak
####################################################################################
#!/bin/sh
# Parameters
# 1 - Folder where the extract files are present
# 2 - Parent folder of the individual plan folders
# 3 - Name of the Extract - Gmf_Incr/Gmf_Bulk
# Author : Deepak
#####################################################################################
#get the plan for which extract should be done
plan=`sed q $1/$3_Request.txt | sed 's/"//g' | cut -d',' -f 2`
echo 'This is the plan ==> '$plan
 
tmpfilename=`date '+%m%d%Y.XML'`
filename="${3}.${tmpfilename}"
echo 'This is the filename ==> '$filename
 
# concat of GMF files
paste -d~ $1/$3_1.XML $1/$3_2.XML > $1/Gmf_Incr_Bulk1.XML
 
sed -e "s/~//g" $1/Gmf_Incr_Bulk1.XML > $1/Gmf_Incr_Bulk.XML
 
 
 
cp $1/Gmf_Incr_Bulk.XML $2/$plan/$filename
rm -f  $1/Gmf_Incr_Bulk1.XML
rm -f $1/Gmf_Incr_Bulk.XML
rm -f $1/$3_2.XML
rm -f $1/$3_1.XML
 
#Adding of tag
 
 
Myfile="$2/$plan/Test.xml"
 
mytag1="<${3}>"
mytag2="</${3}>"
 
echo "$mytag1" >$Myfile
 
#----- Loop to read file data content
 
while read Line
do
 echo "$Line"  >> $Myfile
 
done < $2/$plan/$filename
 
 
echo "$mytag2" >>$Myfile
# move file to proper requested plan folder
mv $2/$plan/Test.xml $2/$plan/$filename
 
echo " gmf  $filename file generated now copying to NDM location " >> /home/dsadm/DataStageShellScripts/smf_gmf_incr.log
 
# copy file to desired directory for NDM transmission
cp $2/$plan/$filename /apps/IBM/DataStage/Projects/extract/Plans/710MI/
 
echo " copying file to NDM location over " >> /home/dsadm/DataStageShellScripts/smf_gmf_incr.log
 
echo "gmf file generation process is over " >> /home/dsadm/DataStageShellScripts/smf_gmf_incr.log

Question by:deepak_tyco
8 Comments
 
LVL 20

Expert Comment

by:flow01
ID: 24076469
Try removing the line breaks with the translate utility tr (\n = newline):

sed -e "s/~//g" $1/Gmf_Incr_Bulk1.XML | tr -d '\n' > $1/Gmf_Incr_Bulk.XML
 

Author Comment

by:deepak_tyco
ID: 24076549
Hi ,

Basically, the current file is generated as below:

<Report>
<Project>
<Proj_Name>ABC Enhancement</Proj_Name>
<Proj_Type>Mechanical</Proj_Type>
<Proj_Description>Project started on 01/03/2006.
However, it is running behind due to unavailable
Resources</Proj_Description>
<Proj_Hours>123.00</Proj_Hours>
</Project>
</Report>


I want the file to be

<Report>
<Project>
<Proj_Name>ABC Enhancement</Proj_Name>
<Proj_Type>Mechanical</Proj_Type>
<Proj_Description>Project started on 01/03/2006 However, it is running behind due to unavailable Resources</Proj_Description>
<Proj_Hours>123.00</Proj_Hours>
</Project>
</Report>






 
LVL 19

Expert Comment

by:simon3270
ID: 24096731
The following awk script will take an XML file and will join together any lines which don't end with a tag (i.e. if the last character in the line isn't ">"):

awk 'BEGIN{n="";}
{if (n == "") {n=$0} else {n=n " " $0}}
/>$/{print n;n="";}' input_file.xml > output_file.xml

Just use that to manipulate your two XML files, then paste the resulting files together.
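The whole approach (join broken lines in each file, paste the results side by side, wrap them in the root tag) could look roughly like the sketch below. The function names join_lines and merge_gmf, their argument order, and the use of mktemp are assumptions for illustration and are not part of the original script. The optional separator argument defaults to a single space; pass "" to join with nothing in between. paste -d '\0' is the POSIX way to ask for an empty delimiter, which avoids the separate sed step that strips "~".

```shell
#!/bin/sh
# Sketch only: join_lines and merge_gmf are hypothetical helper names.

# join_lines FILE [SEP]
# Glue physical lines together until one ends in ">", putting SEP
# (default: one space) between the glued pieces.
join_lines() {
    sep="${2- }"
    awk -v sep="$sep" '
        { if (pending) buf = buf sep $0; else buf = $0; pending = 1 }
        />$/ { print buf; pending = 0 }
        END { if (pending) print buf }   # flush an unterminated last record
    ' "$1"
}

# merge_gmf FILE1 FILE2 ROOTTAG [SEP]
# Write the merged document to standard output.
merge_gmf() {
    tmp1=$(mktemp) || return 1
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 1; }
    join_lines "$1" "${4- }" > "$tmp1"
    join_lines "$2" "${4- }" > "$tmp2"
    printf '<%s>\n' "$3"
    paste -d '\0' "$tmp1" "$tmp2"    # \0 means "no delimiter" to paste
    printf '</%s>\n' "$3"
    rm -f "$tmp1" "$tmp2"
}
```

In the original script this might be called as merge_gmf "$1/$3_1.XML" "$1/$3_2.XML" "$3" > "$2/$plan/$filename", which would also replace the separate tag-adding loop.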
 
LVL 5

Expert Comment

by:vikaskhoria
ID: 24105587
Use the paste command with head and tail to get this done.
Read about paste here:
http://www.softpanorama.org/Tools/paste.shtml

Basically, paste merges data from two files horizontally, i.e. side by side.
So you may need head/tail or awk operations to skip the headers, footers, etc.
 

Accepted Solution

by:
deepak_tyco earned 0 total points
ID: 24105886
Hi ,

Thanks for all of your suggestions.
I was able to fix this issue with a Perl script.

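#!/usr/bin/perl
# Joins every line that does not end in ">" onto the following line
# (no separator is inserted). $ARGV[0] and $ARGV[1] are the two input files.

# First input file -> Gmf_Incr_3.XML
open FILE, ">/apps/IBM/DataStage/Projects/datastage/extracts/Gmf_Incr_3.XML" or die $!;
open (MYFILE, $ARGV[0]) or die $!;
while (<MYFILE>) {
    chomp;
    if ( /.*\>$/ ) {
        print FILE "$_\n";   # line is complete: keep the newline
    } else {
        print FILE "$_";     # line continues: drop the newline
    }
}
close (MYFILE);
close (FILE);

# Second input file -> Gmf_Incr_4.XML
open FILE1, ">/apps/IBM/DataStage/Projects/datastage/extracts/Gmf_Incr_4.XML" or die $!;
open (MYFILE1, $ARGV[1]) or die $!;
while (<MYFILE1>) {
    chomp;
    if ( /.*\>$/ ) {
        print FILE1 "$_\n";
    } else {
        print FILE1 "$_";
    }
}
close (MYFILE1);
close (FILE1);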

 
LVL 19

Expert Comment

by:simon3270
ID: 24106190
Your fix will join lines together without a space between them.  For example:

<hello>
<longtag>without end
end</longtag>
</hello>

will be rearranged as:

<hello>
<longtag>without endend</longtag>
</hello>

(note the "endend"). My awk script would instead produce the better:

<hello>
<longtag>without end end</longtag>
</hello>
 

Author Comment

by:deepak_tyco
ID: 24106226
Hi,

Thanks for the info.
But I don't want the extra space; otherwise my XML will not parse in the other system because the size increases.

Thanks
Deepak
 
LVL 19

Expert Comment

by:simon3270
ID: 24106254
The size does not increase - you are replacing a single newline (in the original file) with a single space (in the modified file), which is what an XML viewer would do anyway.
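A quick way to check this on a real file is to compare byte counts before and after the substitution. This is a minimal sketch; size_after_join is a hypothetical helper name, and for simplicity it replaces every newline with a space (not only those inside a tag), which gives the same byte count either way.

```shell
#!/bin/sh
# Sketch only: size_after_join is a hypothetical helper name.

# size_after_join FILE
# Print "<bytes before> <bytes after>" where "after" is the size once
# every newline has been replaced by a single space.
size_after_join() {
    before=$(wc -c < "$1")
    after=$(tr '\n' ' ' < "$1" | wc -c)
    # $((...)) strips the padding that some wc implementations print
    echo "$((before)) $((after))"
}
```

If the two numbers match, the substitution has not changed the file size. The Perl version, which drops the newline without adding a space, makes the file slightly smaller instead.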