07592161981m
Large file splitting into smaller files
Hi Experts,
I have a large Java dump file that bundles 6 days of data into a single file.
How can I split it into smaller files, one per day (6 files for the 6 days of data)? How can I do this with a script?
ASKER
I've requested that this question be deleted for the following reason:
Wrong Question.
Assuming that the first column of your file contains a datestamp (not timestamp!) without embedded spaces:
awk '{print $0 > $1".out"}' inputfile
ASKER
awk '{print $0 > $1".out"}' verbosegc.20120911.170447.37451.txt.001
awk: (FILENAME=verbosegc.20120911.170447.37451.txt.001 FNR=14) fatal: can't redirect to `</initialized>.out' (No such file or directory)
I'm getting this error.
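The error is consistent with line 14 of the file starting with `</initialized>`: that first field contains a slash, so awk tries to create `initialized>.out` inside a directory named `<`, which does not exist. A minimal sketch of a more defensive first-column split follows. It assumes a datestamp of exactly eight digits (YYYYMMDD); the sample file `sample.txt` and the catch-all name `unmatched.out` are hypothetical.

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical sample: two dated lines and one line without a datestamp.
cat > sample.txt <<'EOF'
20120911 first record
</initialized>
20120912 second record
EOF

awk '{
    # Assumed datestamp format: exactly eight digits in the first column.
    if ($1 ~ /^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/)
        out = $1 ".out"
    else
        out = "unmatched.out"   # hypothetical catch-all for all other lines
    print $0 > out
}' sample.txt
```

Lines whose first field is not a datestamp end up in `unmatched.out` instead of aborting the run.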
ASKER
<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">.
Based on lines like the one above, I need separate files for Sep 11, 12, 13 and 14.
These timestamp lines appear at irregular points throughout the file.
ASKER
<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">
<af type="nursery" id="193" timestamp="Sep 12 00:00:03 2012" intervalms="1370951.683">
<af type="nursery" id="491" timestamp="Sep 13 00:00:05 2012" intervalms="802938.834">
<af type="nursery" id="757" timestamp="Sep 14 00:02:49 2012" intervalms="1115834.953">
I need to split the file at the start of each day.
ASKER CERTIFIED SOLUTION
ASKER
You are the Master. It worked great.
Thx for the points!
You're fast - I eventually planned to enhance the command so that the output filenames would contain the input filename:
awk -F"timestamp=\"" '{D=substr($2,1,6); gsub(" ","",D); print $0 > FILENAME "-" D ".out"}' inputfile
Better?
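For anyone reading along, here is a small sketch of how the date key in that command is derived. It only prints the key instead of writing files; the sample file name `sample.txt` is an assumption, and the two lines are taken from the examples posted above.

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)"

cat > sample.txt <<'EOF'
<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">
<af type="nursery" id="193" timestamp="Sep 12 00:00:03 2012" intervalms="1370951.683">
EOF

# -F'timestamp="' splits each line at the literal text timestamp=" so that
# $2 begins with the date. substr($2, 1, 6) keeps "Sep 11", and gsub()
# removes the space, leaving a key that is safe to use in a file name.
keys=$(awk -F'timestamp="' '{D = substr($2, 1, 6); gsub(" ", "", D); print D}' sample.txt)
printf '%s\n' "$keys"
```

On these two lines the keys come out as Sep11 and Sep12, which is what ends up in the output file names.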
ASKER
After each timestamp line there is more data that needs to be copied into the same file. It is not being copied into the newly created files.
<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">
<minimum requested_bytes="184" />
<time exclusiveaccessms="0.008" meanexclusiveaccessms="0.008" threads="0" lastthreadtid="0x0000000000011100" />
<refs soft="2390" weak="11402" phantom="0" dynamicSoftReferenceThreshold="32" maxSoftReferenceThreshold="32" />
<nursery freebytes="0" totalbytes="167772160" percent="0" />
<tenured freebytes="1808907656" totalbytes="1811939328" percent="99" >
<soa freebytes="1718311304" totalbytes="1721342976" percent="99" />
<loa freebytes="90596352" totalbytes="90596352" percent="100" />
</tenured>
<gc type="scavenger" id="1" totalid="1" intervalms="0.000">
<flipped objectcount="257542" bytes="12485696" />
<tenured objectcount="0" bytes="0" />
<finalization objectsqueued="421" />
<scavenger tiltratio="50" />
<nursery freebytes="155140736" totalbytes="167772160" percent="92" tenureage="10" />
<tenured freebytes="1808907656" totalbytes="1811939328" percent="99" >
<soa freebytes="1718311304" totalbytes="1721342976" percent="99" />
<loa freebytes="90596352" totalbytes="90596352" percent="100" />
</tenured>
<time totalms="115.926" />
</gc>
<nursery freebytes="155075200" totalbytes="167772160" percent="92" />
<tenured freebytes="1808907656" totalbytes="1811939328" percent="99" >
<soa freebytes="1718311304" totalbytes="1721342976" percent="99" />
<loa freebytes="90596352" totalbytes="90596352" percent="100" />
</tenured>
<refs soft="2383" weak="11274" phantom="0" dynamicSoftReferenceThreshold="32" maxSoftReferenceThreshold="32" />
<time totalms="116.058" />
</af>
<af type="nursery" id="2" timestamp="Sep 11 17:04:56 2012" intervalms="4104.046">
<minimum requested_bytes="568" />
<time exclusiveaccessms="0.016" meanexclusiveaccessms="0.016" threads="0" lastthreadtid="0x0000000000011100" />
<refs soft="3883" weak="11844" phantom="0" dynamicSoftReferenceThreshold="31" maxSoftReferenceThreshold="32" />
<nursery freebytes="0" totalbytes="167772160" percent="0" />
<tenured freebytes="1806034848" totalbytes="1811939328" percent="99" >
<soa freebytes="1715438496" totalbytes="1721342976" percent="99" />
<loa freebytes="90596352" totalbytes="90596352" percent="100" />
</tenured>
<gc type="scavenger" id="2" totalid="3" intervalms="4104.164">
<flipped objectcount="503380" bytes="25876296" />
<tenured objectcount="0" bytes="0" />
<finalization objectsqueued="2509" />
<scavenger tiltratio="50" />
<nursery freebytes="141327832" totalbytes="167772160" percent="84" tenureage="11" />
<tenured freebytes="1806034848" totalbytes="1811939328" percent="99" >
<soa freebytes="1715438496" totalbytes="1721342976" percent="99" />
<loa freebytes="90596352" totalbytes="90596352" percent="100" />
</tenured>
<time totalms="190.079" />
</gc>
<nursery freebytes="141262296" totalbytes="167772160" percent="84" />
<tenured freebytes="1806034848" totalbytes="1811939328" percent="99" >
<soa freebytes="1715438496" totalbytes="1721342976" percent="99" />
<loa freebytes="90596352" totalbytes="90596352" percent="100" />
</tenured>
<refs soft="2779" weak="11428" phantom="0" dynamicSoftReferenceThreshold="31" maxSoftReferenceThreshold="32" />
<time totalms="190.319" />
</af>
So not every line contains "timestamp"? Didn't know that.
Anyway, here you go:
awk -F"timestamp=\"" 'BEGIN {D="NODATE"} {if($0~"timestamp") {D=substr($2,1,6); gsub(" ","",D)}; print $0 > FILENAME "-" D ".out"}' inputfile
If the first line of inputfile does not contain "timestamp", that line (and all following lines up to, but not including, the first line containing "timestamp") will go to a file named "inputfile-NODATE.out".
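The same idea can be laid out as a multi-line script, which may be easier to follow. This is only a sketch under assumptions: the sample file `gc.txt` and its contents are made up for illustration, and the parentheses around the output name are an addition, since some awk implementations parse an unparenthesized concatenation after `>` differently.

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical sample: one line before any timestamp, then two short records.
cat > gc.txt <<'EOF'
</initialized>
<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">
<minimum requested_bytes="184" />
</af>
<af type="nursery" id="193" timestamp="Sep 12 00:00:03 2012" intervalms="1370951.683">
<minimum requested_bytes="568" />
</af>
EOF

awk -F'timestamp="' '
BEGIN { D = "NODATE" }          # key used until the first timestamp is seen
/timestamp="/ {                 # record header: switch to the new date key
    D = substr($2, 1, 6)
    gsub(" ", "", D)
}
{ print $0 > (FILENAME "-" D ".out") }   # every line follows the current key
' gc.txt
```

Each line is written to the file for the most recent timestamp seen, so the lines between `<af ...>` and `</af>` stay with their record. If a file covers many days, some awk implementations may run out of open files; closing the previous output with close() when D changes and appending with `>>` is one possible way around that.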
ASKER
Thank you very much, WoolMilkPorc. It worked without issues.
http://www.techiecorner.com/107/how-to-split-large-file-into-several-smaller-files-linux/