Output Control

Hello,

I have a tab-delimited file containing records like the following example (no specific characters or row length):

word word word word word word
word word
word word word word
word word word word word
word word word
word

I'm looking for a way to output only 3 words per line. If a line contains more than 3 words, the rest should move to a new line. If a line contains 5 words and 2 words are moved to a new line, then any one-word line from the original file should be appended at the end or beginning of that line to make a 3-word record. If only 1 word is moved, then any line from the original file containing 2 words should be appended. If there are not enough 1- or 2-word records left, then ANY 1- or 2-word combination from either the original or the new file can be appended.

Thank you!
faithless1 Asked:

TvMpt Commented:
Hi.
I'm not familiar with Perl, but why not read all the lines into one string, split it into an array of words, and then print three array elements at a time, one group per line?
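A minimal Perl sketch of this idea, assuming the words can simply be regrouped in reading order (it makes no attempt to pair short lines with each other). The helper name `reflow_words` is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): flatten all lines into one
# word list, then regroup the words $n at a time.
sub reflow_words {
    my ($n, @lines) = @_;
    # split ' ' ignores leading whitespace and splits on any whitespace run,
    # so it handles both tabs and spaces
    my @words = map { split ' ' } @lines;
    my @out;
    # splice removes up to $n words from the front; the last group may be short
    push @out, join(' ', splice(@words, 0, $n)) while @words;
    return @out;
}

my @sample = (
    "word1 word2 word3 word4 word5",
    "word6 word7",
    "word8",
);
print "$_\n" for reflow_words(3, @sample);
```

In a real script the lines would come from `<>` rather than a hard-coded list.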

johanntagle Commented:
To clarify, if your input is like this:

one two three four five
six seven
eight nine ten eleven
twelve thirteen fourteen fifteen sixteen
seventeeen eighteen nineteen
twenty

Will the output be like:
one two three
four five six
seven eight nine
ten eleven twelve
thirteen fourteen fifteen
sixteen seventeen eighteen
nineteen twenty

OR will it be like
one two three
four five twenty
six seven eight
nine ten eleven
twelve thirteen fourteen
fifteen sixteen seventeeen
eighteen nineteen

In the second example "twenty" got placed to the front because you said "In case a line contains 5 words and 2 words are moved to a new line, then any line with one word in the original file should be appended at the end or beginning of the line to make a 3 word record. "

The first output option should be easy. The second could get complicated, because you will need to read the whole file first and then find lines that "fit the puzzle".

Am curious - can you give me an idea what you need this for?
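As a starting point for the second option, the lines can be grouped by how many words are left over once full groups of three are taken off: a line that leaves two words needs a partner that leaves one, and vice versa. This is only a sketch. `leftover_buckets` is a hypothetical helper, and it reports candidates without doing the matching itself:

```perl
use strict;
use warnings;

# Hypothetical helper: group line indexes (0-based) by the number of words
# left over after full groups of three are removed from the line.
sub leftover_buckets {
    my @lines = @_;
    my %bucket = (0 => [], 1 => [], 2 => []);
    for my $i (0 .. $#lines) {
        my @words = split ' ', $lines[$i];
        next unless @words;            # ignore blank lines
        my $left = @words % 3;         # 0, 1 or 2 leftover words
        push @{ $bucket{$left} }, $i;
    }
    return \%bucket;
}

my @input = (
    "one two three four five",
    "six seven",
    "eight nine ten eleven",
    "twelve thirteen fourteen fifteen sixteen",
    "seventeeen eighteen nineteen",
    "twenty",
);
my $buckets = leftover_buckets(@input);
for my $left (1, 2) {
    # report 1-based line numbers for readability
    my @nums = map { $_ + 1 } @{ $buckets->{$left} };
    print "leftover $left: lines ", join(', ', @nums), "\n";
}
```

For the sample input this lists lines 3 and 6 as leaving one word and lines 1, 2 and 4 as leaving two.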

wilcoxon Commented:
I think this will do what you want. Look for comments with XXX in them for things you can change to alter the behavior. Mostly these relate to tabs: your question says the input is tab-delimited, but the example is space-delimited, and you don't say how you want the output delimited.

Given the input specified in johanntagle's comment, it will produce the following output (this is how I interpreted your question):

one two three
four five twenty
six seven eleven
eight nine ten
twelve thirteen fourteen
fifteen sixteen
seventeeen eighteen nineteen

Note that "fifteen sixteen" is only two words even though it is not the last record, because you did not specify what to do when no one- or two-word records are left in either file to append.

The script is callable as:

script.pl input_file

#!/usr/bin/perl

use strict;
use warnings;

# XXX - output separator - to use tab instead of space, change " " to "\t"
my $sep = " ";

my @lines;
my @base = (undef, [], []);
my @xtra = (undef, [], []);

# XXX - for testing, uncomment "while (<DATA>)" and comment out "while (<>)"
#while (<DATA>) {
while (<>) {
    chomp;
    # XXX - if you want only tab delimiter, change \s+ to \t+
    my @words = split /\s+/;
    next unless @words; # skip lines with no words at all
    push @lines, [@words];
    my $cnt = scalar @words;
    if ($cnt == 1) {
        push @{$base[1]}, @lines-1;
    } elsif ($cnt == 2) {
        push @{$base[2]}, @lines-1;
    } elsif ($cnt % 3 == 1) {
        push @{$xtra[1]}, @lines-1;
    } elsif ($cnt % 3 == 2) {
        push @{$xtra[2]}, @lines-1;
    }
}

for my $i (0 .. @lines-1) {
    my @words = @{$lines[$i]};
    next unless @words; # skip lines where we removed all words
    # >= (not >) so an exact group of 3 is printed as-is and never padded to 4 words
    while (@words >= 3) {
        print join($sep, splice(@words, 0, 3)), "\n";
    }
    next unless @words;
    # where to look for "extra" words
    my $off = (@words == 1) ? 2 : 1;
    my $line = -1;
    # skip lines we've already seen - check @base first then @xtra
    while (@{$base[$off]} and $line < $i) {
        $line = pop @{$base[$off]};
    }
    while (@{$xtra[$off]} and $line < $i) {
        $line = pop @{$xtra[$off]};
    }
    # only borrow from a later line; earlier lines have already been printed
    if ($line > $i) {
        push @words, splice(@{$lines[$line]}, -$off);
    }
    print join($sep, @words), "\n";
}

__DATA__
one two three four five
six seven
eight nine ten eleven
twelve thirteen fourteen fifteen sixteen
seventeeen eighteen nineteen
twenty

wilcoxon Commented:
To explain why the output is done that way...

one two three - first 3 words of first line
four five twenty - last two words of first line plus the only one-word line in the original file
six seven eleven - line two plus the first one word line in the new file
eight nine ten - first 3 words of third line
twelve thirteen fourteen - first 3 words of fourth line
fifteen sixteen - last two words of fourth line - no more single words so left as two words
seventeeen eighteen nineteen - fifth line
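One way to sanity-check a result like this is to confirm that no output line has more than three words and that every input word appears in the output exactly as often as in the input. A rough sketch, with `check_regroup` as a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical checker: returns 1 if no output line exceeds $max words and
# the output uses exactly the same words (with the same counts) as the input,
# otherwise 0. $in and $out are array refs of lines.
sub check_regroup {
    my ($max, $in, $out) = @_;
    my %count;
    $count{$_}++ for map { split ' ' } @$in;
    for my $line (@$out) {
        my @words = split ' ', $line;
        return 0 if @words > $max;
        $count{$_}-- for @words;
    }
    # any nonzero tally means a word was lost, duplicated or invented
    my $mismatches = grep { $_ != 0 } values %count;
    return $mismatches == 0 ? 1 : 0;
}

my @in = (
    "one two three four five",
    "six seven",
    "eight nine ten eleven",
    "twelve thirteen fourteen fifteen sixteen",
    "seventeeen eighteen nineteen",
    "twenty",
);
my @out = (
    "one two three",
    "four five twenty",
    "six seven eleven",
    "eight nine ten",
    "twelve thirteen fourteen",
    "fifteen sixteen",
    "seventeeen eighteen nineteen",
);
print check_regroup(3, \@in, \@out) ? "ok\n" : "mismatch\n";
```

The check says nothing about whether the pairing rules were followed, only that words were neither lost nor duplicated.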

faithless1 (Author) Commented:
Excellent, does exactly what I need. Thank you!!!