I need a script, preferably in Perl, that:
- reads its input from stdin. The input is a huge volume of records with tab-separated fields. The number of fields is not known in advance, but every record has the same number of fields.
- accepts column numbers as command-line arguments
- outputs all unique values seen in the input for the specified columns
For example, with this input file:
A 22 78 rest
E 22 90 best
A 32 55 lest
./myscript.pl 1 4
i.e., output all unique values in column 1 and column 4. The output would look something like:
COLUMN 1
A
E
COLUMN 4
rest
best
lest
In most cases the unique values should fit in memory, but in some cases there may be too many to fit. If such cases can be handled, well and good. If they cannot, it would be good enough to display a message saying "too many values in column n".
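
To make the request concrete, here is a rough sketch of the kind of script I have in mind. It is only one possible approach, not a finished solution. The names unique_columns, run and MAX_UNIQUE, and the cap of 1,000,000 distinct values per column, are my own assumptions for illustration. The sketch does not spill to disk: once a column exceeds the cap, it drops that column's values and prints the "too many values" message instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed cap on distinct values kept per column (not part of the requirements).
use constant MAX_UNIQUE => 1_000_000;

# Read tab-separated records from $in and collect the distinct values of each
# requested 1-based column, in order of first appearance.
# Returns a hash ref: column number => array ref of values, or undef for a
# column that had more than $max distinct values.
sub unique_columns {
    my ($in, $max, @cols) = @_;
    my (%seen, %values);
    $values{$_} = [] for @cols;
    while (my $line = <$in>) {
        chomp $line;
        my @fields = split /\t/, $line, -1;    # -1 keeps trailing empty fields
        for my $c (keys %values) {
            my $list = $values{$c};
            next unless defined $list;         # this column already overflowed
            my $v = $fields[$c - 1];
            next unless defined $v;            # record is shorter than expected
            next if $seen{$c}{$v}++;           # value already recorded
            if (@$list >= $max) {
                $values{$c} = undef;           # give up on this column
                delete $seen{$c};              # and release its memory
                next;
            }
            push @$list, $v;
        }
    }
    return \%values;
}

# Validate the column arguments, scan $in once, and print the report to $out.
sub run {
    my ($in, $out, $max, @cols) = @_;
    for my $c (@cols) {
        die "column numbers must be positive integers, got '$c'\n"
            unless $c =~ /^[1-9]\d*$/;
    }
    my $values = unique_columns($in, $max, @cols);
    for my $c (@cols) {
        if (defined $values->{$c}) {
            print {$out} "COLUMN $c\n";
            print {$out} "$_\n" for @{ $values->{$c} };
        }
        else {
            print {$out} "too many values in column $c\n";
        }
    }
    return 0;
}

# Run as a script only when column numbers were given on the command line.
if (!caller) {
    if (@ARGV) { exit run(\*STDIN, \*STDOUT, MAX_UNIQUE, @ARGV) }
    warn "usage: $0 COLUMN [COLUMN ...] < input\n";
}
```

Run as ./myscript.pl 1 4 < inputfile, with the sample records above separated by real tabs, this should print the COLUMN 1 and COLUMN 4 blocks shown in the example. The sketch keeps each value twice (once in a hash for lookups, once in a list to preserve order), so memory use is roughly double the size of the distinct values. A version that truly handles oversized columns would probably need an external sort or an on-disk hash.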