• Status: Solved

Large file splitting into smaller files

Hi Experts,

I have a large Java dump file that bundles 6 days of data into a single file.
How can I split it into smaller files, one per day (6 files for the 6 days of data)? How can I do this with a script?
Asked:
07592161981m
 
07592161981m (Author) commented:
I've requested that this question be deleted for the following reason:

Wrong Question.
 
woolmilkporc commented:
Assuming that the first column of your file contains a datestamp (not timestamp!) without embedded spaces:

awk '{print $0 > $1".out"}' inputfile
 
07592161981m (Author) commented:
awk '{print $0 > $1".out"}' verbosegc.20120911.170447.37451.txt.001
awk: (FILENAME=verbosegc.20120911.170447.37451.txt.001 FNR=14) fatal: can't redirect to `</initialized>.out' (No such file or directory)

I am getting the error above.
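
The message follows from how the output name is built: at line 14 (FNR=14) the first field is `</initialized>`, so awk tries to open `</initialized>.out`, and the slash turns that into a path inside a directory named `<`, which does not exist. One way to check which file names the first field would produce before splitting is to count the distinct values of `$1`. This is only a sketch: the file name `sample.txt` and its contents are made up for illustration, and only the `</initialized>` tag comes from the error message.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample; only the </initialized> tag is taken from the error message.
cat > sample.txt <<'EOF'
<initialized>
</initialized>
<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">
</af>
EOF

# Count how often each first field occurs. Every distinct value would
# become an output file name with the first command in this thread.
awk '{ count[$1]++ } END { for (k in count) print count[k], k }' sample.txt > first_fields.txt
cat first_fields.txt
```

Any value in that list that contains a slash, or that is not a date, shows that the first column is not a usable datestamp.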
 
07592161981m (Author) commented:
<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">

Based on the line above, I need separate files for Sep 11, 12, 13 and 14.

Lines with these timestamps appear at irregular positions in the file.
 
07592161981m (Author) commented:
<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">
<af type="nursery" id="193" timestamp="Sep 12 00:00:03 2012" intervalms="1370951.683">
<af type="nursery" id="491" timestamp="Sep 13 00:00:05 2012" intervalms="802938.834">
<af type="nursery" id="757" timestamp="Sep 14 00:02:49 2012" intervalms="1115834.953">

I need to split the file at the point where each day starts.
 
woolmilkporc commented:
awk -F"timestamp=\""  '{D=substr($2,1,6); gsub(" ","",D); print $0 > D ".out"}' inputfile
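
For readers less familiar with awk, the one-liner above can be read step by step. The sketch below spreads the same logic over several lines with comments. The file name `sample.txt` is made up, its contents are the four header lines quoted earlier, and the parentheses around the output name are an addition for portability rather than part of the original command.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input file holding the four header lines quoted in this thread.
cat > sample.txt <<'EOF'
<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">
<af type="nursery" id="193" timestamp="Sep 12 00:00:03 2012" intervalms="1370951.683">
<af type="nursery" id="491" timestamp="Sep 13 00:00:05 2012" intervalms="802938.834">
<af type="nursery" id="757" timestamp="Sep 14 00:02:49 2012" intervalms="1115834.953">
EOF

# -F makes the text timestamp=" the field separator, so $2 starts
# with the date, e.g.: Sep 11 17:04:52 2012" intervalms="0.000">
awk -F'timestamp="' '{
    D = substr($2, 1, 6)     # first six characters of $2: "Sep 11"
    gsub(" ", "", D)         # remove the space: "Sep11"
    print $0 > (D ".out")    # write the whole line to Sep11.out
}' sample.txt

ls
```

Within one awk run, `>` truncates a file only when it is first opened. Later lines with the same date are added to the same file.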
 
07592161981m (Author) commented:
You are the master. It worked great.
 
woolmilkporc commented:
Thx for the points!

You're fast. I had planned to enhance the command so that the output filenames contain the input filename:

awk -F"timestamp=\""  '{D=substr($2,1,6); gsub(" ","",D); print $0 > FILENAME "-" D ".out"}' inputfile

Better?
 
07592161981m (Author) commented:
After each timestamp line there is more data that also needs to be copied into the output files. It is not being copied into the newly created files.

<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">
  <minimum requested_bytes="184" />
  <time exclusiveaccessms="0.008" meanexclusiveaccessms="0.008" threads="0" lastthreadtid="0x0000000000011100" />
  <refs soft="2390" weak="11402" phantom="0" dynamicSoftReferenceThreshold="32" maxSoftReferenceThreshold="32" />
  <nursery freebytes="0" totalbytes="167772160" percent="0" />
  <tenured freebytes="1808907656" totalbytes="1811939328" percent="99" >
    <soa freebytes="1718311304" totalbytes="1721342976" percent="99" />
    <loa freebytes="90596352" totalbytes="90596352" percent="100" />
  </tenured>
  <gc type="scavenger" id="1" totalid="1" intervalms="0.000">
    <flipped objectcount="257542" bytes="12485696" />
    <tenured objectcount="0" bytes="0" />
    <finalization objectsqueued="421" />
    <scavenger tiltratio="50" />
    <nursery freebytes="155140736" totalbytes="167772160" percent="92" tenureage="10" />
    <tenured freebytes="1808907656" totalbytes="1811939328" percent="99" >
      <soa freebytes="1718311304" totalbytes="1721342976" percent="99" />
      <loa freebytes="90596352" totalbytes="90596352" percent="100" />
    </tenured>
    <time totalms="115.926" />
  </gc>
  <nursery freebytes="155075200" totalbytes="167772160" percent="92" />
  <tenured freebytes="1808907656" totalbytes="1811939328" percent="99" >
    <soa freebytes="1718311304" totalbytes="1721342976" percent="99" />
    <loa freebytes="90596352" totalbytes="90596352" percent="100" />
  </tenured>
  <refs soft="2383" weak="11274" phantom="0" dynamicSoftReferenceThreshold="32" maxSoftReferenceThreshold="32" />
  <time totalms="116.058" />
</af>


<af type="nursery" id="2" timestamp="Sep 11 17:04:56 2012" intervalms="4104.046">
  <minimum requested_bytes="568" />
  <time exclusiveaccessms="0.016" meanexclusiveaccessms="0.016" threads="0" lastthreadtid="0x0000000000011100" />
  <refs soft="3883" weak="11844" phantom="0" dynamicSoftReferenceThreshold="31" maxSoftReferenceThreshold="32" />
  <nursery freebytes="0" totalbytes="167772160" percent="0" />
  <tenured freebytes="1806034848" totalbytes="1811939328" percent="99" >
    <soa freebytes="1715438496" totalbytes="1721342976" percent="99" />
    <loa freebytes="90596352" totalbytes="90596352" percent="100" />
  </tenured>
  <gc type="scavenger" id="2" totalid="3" intervalms="4104.164">
    <flipped objectcount="503380" bytes="25876296" />
    <tenured objectcount="0" bytes="0" />
    <finalization objectsqueued="2509" />
    <scavenger tiltratio="50" />
    <nursery freebytes="141327832" totalbytes="167772160" percent="84" tenureage="11" />
    <tenured freebytes="1806034848" totalbytes="1811939328" percent="99" >
      <soa freebytes="1715438496" totalbytes="1721342976" percent="99" />
      <loa freebytes="90596352" totalbytes="90596352" percent="100" />
    </tenured>
    <time totalms="190.079" />
  </gc>
  <nursery freebytes="141262296" totalbytes="167772160" percent="84" />
  <tenured freebytes="1806034848" totalbytes="1811939328" percent="99" >
    <soa freebytes="1715438496" totalbytes="1721342976" percent="99" />
    <loa freebytes="90596352" totalbytes="90596352" percent="100" />
  </tenured>
  <refs soft="2779" weak="11428" phantom="0" dynamicSoftReferenceThreshold="31" maxSoftReferenceThreshold="32" />
  <time totalms="190.319" />
</af>
 
woolmilkporc commented:
So not every line contains "timestamp"? Didn't know that.

Anyway, here you go:

awk -F"timestamp=\""  'BEGIN {D="NODATE"} {if($0~"timestamp") {D=substr($2,1,6); gsub(" ","",D)}; print $0 > FILENAME "-" D ".out"}' inputfile

Should the first line of inputfile not contain "timestamp", then this line (and all following lines up to, but not including, the first line containing "timestamp") will go to a file named "inputfile-NODATE.out".
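
One possible refinement of the command above, offered as a sketch and not as a tested replacement for the accepted answer: it puts the output name in parentheses, since some awk implementations parse an unparenthesized concatenation after `>` differently, and it closes the previous output file whenever the date changes, so only one output file is open at a time. Because a closed file may be reopened if a date shows up again, it appends with `>>`. This assumes that no `.out` files are left over from an earlier run. The file name `gc.txt` and its contents are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical, shortened input: one line before the first timestamp,
# then one <af> block for each of two days.
cat > gc.txt <<'EOF'
preamble line without a date
<af type="nursery" id="1" timestamp="Sep 11 17:04:52 2012" intervalms="0.000">
  <minimum requested_bytes="184" />
</af>
<af type="nursery" id="193" timestamp="Sep 12 00:00:03 2012" intervalms="1370951.683">
  <minimum requested_bytes="568" />
</af>
EOF

awk -F'timestamp="' '
BEGIN { D = "NODATE" }            # used until the first timestamp is seen
/timestamp="/ {
    N = substr($2, 1, 6)          # e.g. "Sep 11"
    gsub(" ", "", N)              # e.g. "Sep11"
    if (N != D) {
        close(FILENAME "-" D ".out")   # done with the previous day
        D = N
    }
}
{ print $0 >> (FILENAME "-" D ".out") }   # every line goes to the current day
' gc.txt

ls
```

With this sample, the preamble line ends up in `gc.txt-NODATE.out` and each `<af>` block ends up in the file for its day.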
 
07592161981m (Author) commented:
Thank you very much, woolmilkporc. It worked without issues.