AsifMughal
asked on
Using the Sort in Unix
Hello All
I am trying to sort a 70 MB text file and also remove any duplicates. Below is some sample data from the file.
15601584DSP4453 003508KIG235700
15692047DSP4453 003508DIG254701
11201584DSP4453 004508TIG254700
12392047DSP4453 004508UIG254701
10341928DSP1035 003508JCD265801
A duplicate is defined by three fields: the first is the first 8 characters (e.g. 15601584 in line 1), then the 7 characters at position 22 (e.g. KIG2357 in line 1), and then the 2 characters at position 29 (e.g. 00 in line 1).
So any record with the same combination of the three fields is a duplicate and needs to be omitted. I have tried the sort command with the -u switch, but it uses the whole line as the record when searching for duplicates.
Is it possible to specify which fields are used to search for duplicates by giving the start and end positions of the text that mark each field? You can specify a field to sort by this way using the -k switch. Is there anything similar for the -u switch?
I look forward to a reply.
Thanks in advance
Asif Mughal
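One possible approach, offered only as a sketch and not as the accepted solution from this thread, is to build the duplicate key with awk and keep the first record seen for each key. It assumes the positions 22 and 29 in the question are zero-based offsets, which matches the sample lines as displayed, so they become 23 and 30 in awk's one-based substr. The function name dedupe_by_key is made up for illustration.

```shell
# dedupe_by_key (hypothetical name): print the first record seen for each
# key, preserving input order. Reads the named files, or stdin if none.
# Assumed key layout (1-based): chars 1-8, chars 23-29, chars 30-31.
dedupe_by_key() {
    awk '{
        # Join the three fields with SUBSEP so they cannot run together.
        key = substr($0, 1, 8) SUBSEP substr($0, 23, 7) SUBSEP substr($0, 30, 2)
        # Print a record only the first time its key appears.
        if (!(key in seen)) { seen[key] = 1; print }
    }' "$@"
}
```

A call such as `dedupe_by_key input.txt | sort > output.txt` (file names are placeholders) would then sort the surviving records. awk holds every distinct key in memory, which should be manageable for a 70 MB file on most systems, but that is worth checking on the machine in question.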
https://www.experts-exchange.com/jsp/qShow.jsp?ta=win2k&qid=20139269
https://www.experts-exchange.com/jsp/qShow.jsp?ta=unix&qid=20097742
https://www.experts-exchange.com/jsp/qShow.jsp?ta=mssql&qid=20168039
https://www.experts-exchange.com/jsp/qShow.jsp?ta=oracle&qid=20099953
https://www.experts-exchange.com/jsp/qShow.jsp?ta=progsoftgen&qid=20092371
https://www.experts-exchange.com/jsp/qShow.jsp?ta=cplusprog&qid=20096424
https://www.experts-exchange.com/jsp/qShow.jsp?ta=javascript&qid=20173722
https://www.experts-exchange.com/jsp/qShow.jsp?ta=mfc&qid=20118706
https://www.experts-exchange.com/jsp/qShow.jsp?ta=visualbasic&qid=20179047
https://www.experts-exchange.com/jsp/qShow.jsp?ta=visualbasic&qid=20174222
Please clean them up before asking additional questions. You lose the points from your account as soon as you post a question, so it won't cost anything for you to finish up these old questions.
cjswimmer (not a moderator)