Unix Shell script to sort & keep only one occurrence of repeated field ( sed awk Perl grep )

sunhux

File1 has the following lines (columns delimited by space(s)):

997818 found in 3498
1004060 found in 3499
1214451 found in 3498
879730 found in 3499
8029032 found in 3515
8054065 found in 3515
8056462 found in 3515
8138803 found in 3517
8135802 found in 3516
8135803 found in 3516
. . .

I need a script that sorts by the 4th column and, where a 4th-column value repeats across lines,
lists that value only once. So the output, file2, will be:
3498
3499
3515
3516
3517
...


Then another script will read file2 and compare it against file3. For example, file3 has the lines below:
3515
3517

The final output will be those lines / values in file2 that are not found in file3, so final output:
3498
3499
3516

I'm OK if you combine the 2 scripts into 1 script, or even a one-liner.
Commented:
awk '{print $4}' file1 | sort -u | comm -13 file3 -
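As a rough walk-through of that pipeline, the sketch below rebuilds the sample data from the question in a throwaway directory (the `workdir` and `result` names are only for illustration) and runs the one-liner against it. It assumes file3 is already sorted, which `comm` requires.

```shell
# Throwaway directory so the sample files do not clobber anything real.
workdir=$(mktemp -d)

# Sample rows copied from the question.
cat > "$workdir/file1" <<'EOF'
997818 found in 3498
1004060 found in 3499
1214451 found in 3498
879730 found in 3499
8029032 found in 3515
8054065 found in 3515
8056462 found in 3515
8138803 found in 3517
8135802 found in 3516
8135803 found in 3516
EOF

# file3 from the question (already sorted).
printf '%s\n' 3515 3517 > "$workdir/file3"

# awk extracts column 4, sort -u orders it and drops duplicates,
# comm -13 keeps only the lines that appear in the piped list but not in file3.
result=$(awk '{print $4}' "$workdir/file1" | sort -u | comm -13 "$workdir/file3" -)
printf '%s\n' "$result"
```

With the question's data this should print 3498, 3499 and 3516, one per line.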
Commented:
Instead of asking for separate scripts to perform specific tasks on the same set of data, could you describe what your overall goal is with this data?

You may end up with 10 scripts processing the same set of data to extract different things, where a single multipurpose script would do.
i.e. get the first column and do something, use the fourth column and do something else, etc.
read in files and ...etc.
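To give an idea of what a single script could look like for the task in this question, here is a minimal sketch in which one awk program reads both files. The `skip` and `seen` array names and the `workdir` directory are made up for illustration, and the sample rows are a subset of those in the question.

```shell
workdir=$(mktemp -d)

# A few rows from the question, plus the file3 exclusion list.
cat > "$workdir/file1" <<'EOF'
997818 found in 3498
1004060 found in 3499
1214451 found in 3498
8029032 found in 3515
8138803 found in 3517
8135802 found in 3516
8135803 found in 3516
EOF
printf '%s\n' 3515 3517 > "$workdir/file3"

remaining=$(awk '
    # First argument (file3): remember every value that must be excluded.
    FILENAME == ARGV[1] { skip[$1]; next }
    # Remaining input (file1): print each 4th-column value once, unless excluded.
    !($4 in skip) && !seen[$4]++ { print $4 }
' "$workdir/file3" "$workdir/file1" | sort -n)
printf '%s\n' "$remaining"
```

Because awk does the lookup in memory, file3 does not need to be sorted here; the final `sort -n` only orders the output.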
Commented:
Try:

 awk '{ print $4 }' file1 | sort -u | grep -v -x -F -f file3
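One thing to watch with `grep -v -f`: each line of file3 is treated as a pattern that may match anywhere in a line unless `-x` (whole-line match) is given, and `-F` makes the patterns fixed strings rather than regular expressions. The sketch below shows the difference; the value 35150 is invented purely to trigger the substring case, and the variable names are illustrative.

```shell
workdir=$(mktemp -d)

# 35150 is a hypothetical value that merely contains "3515".
printf '%s\n' 3498 3515 35150 3517 > "$workdir/file2"
printf '%s\n' 3515 3517 > "$workdir/file3"

# Substring matching: 35150 is dropped too, because it contains 3515.
loose=$(grep -v -f "$workdir/file3" "$workdir/file2")

# Whole-line, fixed-string matching: only exact matches are dropped.
strict=$(grep -v -x -F -f "$workdir/file3" "$workdir/file2")

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

Unlike `comm`, this approach does not need file3 to be sorted.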

The result goes to standard output, which can be redirected to a file if required. A file named 'file2' is created for the intermediate result.
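# Usage: script file1 file3   (file3 must already be sorted for comm)
if [ $# -lt 2 ]
then
    echo "Usage: $0 file1 file3" >&2
    exit 1
fi
# Unique 4th-column values are saved to file2; comm -23 keeps only those absent from file3
awk '{print $4}' "$1" | sort -u | tee file2 | comm -23 - "$2"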
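For reference, `comm` writes three columns: lines only in the first input, lines only in the second, and lines in both. The options `-1`, `-2` and `-3` each suppress the corresponding column. A small sketch of how that plays out with this question's values follows; 3600 is a hypothetical extra entry in file3, added only to make the second column visible.

```shell
workdir=$(mktemp -d)

printf '%s\n' 3498 3499 3515 3516 3517 > "$workdir/file2"
printf '%s\n' 3515 3517 3600 > "$workdir/file3"   # 3600 is hypothetical

# -23 hides columns 2 and 3: only values that are in file2 but not in file3.
only_in_file2=$(comm -23 "$workdir/file2" "$workdir/file3")

# -3 hides only the common lines: values unique to file3 still appear,
# indented by a tab in the second column.
differences=$(comm -3 "$workdir/file2" "$workdir/file3")

printf '%s\n' "$only_in_file2"
printf '%s\n' "$differences"
```

So `-23` is the combination that gives "in file2 but not in file3" when file2 is the first input, and `-13` gives the same result when file2 is the second input.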
