Solved

Splitting a file into parts

Posted on 2007-03-27
Last Modified: 2010-04-20
I have a file which I would like to split into a given number of parts. I
wrote a script using the split command. For example, if there are 19 lines
and I pass in a parameter of n=4, I would like the file to be split into
4 parts of 5, 5, 5, and 4 lines. I have looked into csplit, but it doesn't
seem to do what I want.
Question by:soccerplayer
 
LVL 58

Accepted Solution

by:
amit_g earned 500 total points
ID: 18803713
Try this ...

n=4
split -l $((($(cat YourFileName | wc -l) + $n) / $n))
 

Author Comment

by:soccerplayer
ID: 18816277
This does not work. I started it last night and it was still running this morning.
 
LVL 58

Expert Comment

by:amit_g
ID: 18818074
What do you mean it has been running since last night? Is the file you are splitting huge? Even then, a whole night is way too long. In any case, you should have tested it on a smaller file first.

 
LVL 1

Expert Comment

by:crovaxy
ID: 18822383
Use the 'dd' utility.

For a file with 100 bytes use:

dd if=/path/to/file_to_split of=/path/to/file_part1 count=25 bs=1
dd if=/path/to/file_to_split of=/path/to/file_part2 count=25 bs=1 skip=25
dd if=/path/to/file_to_split of=/path/to/file_part3 count=25 bs=1 skip=50
dd if=/path/to/file_to_split of=/path/to/file_part4 count=25 bs=1 skip=75

Check out the 'dd' man page for more information.
 
LVL 1

Expert Comment

by:crovaxy
ID: 18822419
Oops, you want to split by lines... sorry!

Then you can use something like this:

filename=test_part2.txt
lines=4                                 # lines per part
total_lines=$(wc -l < "$filename")
offset=$lines                           # last line number of the current part
part=1

while :; do
        if [ "$offset" -ge "$total_lines" ]; then
                # Last part: everything after the previous full parts
                echo "PART #$part"
                tail -n +"$((offset - lines + 1))" "$filename"
                break
        else
                echo "PART #$part"
                head -n "$offset" "$filename" | tail -n "$lines"
                offset=$((offset + lines))
                part=$((part + 1))
        fi
done
 

Author Comment

by:soccerplayer
ID: 18831162
crovaxy, thank you for your response. I know that it can be done in a script, but I was wondering if there was any way to do it using the split or csplit commands.
 
LVL 58

Expert Comment

by:amit_g
ID: 18831200
It can be. The command I posted is tested. You need to explain what did not work and how you used it.
 
LVL 1

Expert Comment

by:crovaxy
ID: 18833779
amit_g's answer is correct. You can split the file using that command. Your problem, probably, is that you're not passing the file name as an argument to the split command itself, so split waits for input on stdin and never finishes. The YourFileName in amit_g's answer is only the argument to 'cat', whose output is piped to 'wc' to get the total number of lines in the file... it is not the filename argument to the split command itself.

Eg:

n=4
split -l $((($(cat YourFileName | wc -l) + $n) / $n)) YourFileName
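
For reference, here is a hedged sketch of the same idea wrapped in a function. The function name split_into_parts and the output prefix part_ are made up for illustration and do not come from this thread. It also swaps (total + n) / n for a true ceiling division, (total + n - 1) / n, on the assumption that evenly sized parts are wanted: with 20 lines and n=4 the original formula gives 6,6,6,2, while the ceiling gives 5,5,5,5. With 19 lines both give 5,5,5,4.

```shell
#!/bin/sh
# Split a file into at most N parts with (nearly) equal line counts.
# NOTE: split_into_parts and the default "part_" prefix are hypothetical
# names chosen for this sketch.
split_into_parts() {
    file=$1
    n=$2
    prefix=${3:-part_}

    # Read via redirection so wc prints only the number, not the file name.
    total=$(wc -l < "$file")

    # Ceiling division: the smallest per-part line count that needs
    # no more than n parts.
    per_part=$(( (total + n - 1) / n ))

    # An empty input would give a line count of 0; bail out instead.
    [ "$per_part" -gt 0 ] || return 1

    # The file name must be passed to split itself; otherwise split
    # reads from stdin and appears to hang.
    split -l "$per_part" "$file" "$prefix"
}
```

Calling split_into_parts YourFileName 4 should write part_aa through part_ad. Note that a fixed per-part line count cannot always produce exactly n parts (9 lines with n=4 gives 3,3,3). GNU split also offers -n l/N, which always produces N chunks without breaking lines, but as far as I know it balances the chunks by byte size rather than by line count.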

Question has a verified solution.
