  • Status: Solved
  • Priority: Medium
  • Security: Public

Remove duplicate lines using sed in Oracle Linux 6.5 environment

I have a sample file (sample.txt) that looks like this:

ddddddddddddddddddddddd
ddddddddddddddddddddddd
aaaaaaaaaaaaaaaaaaaaaaa
avvvvvvvvvvvvvvvvvvvvvv
bbbbbbbbbbbbbbbbbbbbbbb
ddddddddddddddddddddddd

What I wish to do is remove all duplicate lines, keeping only the first occurrence of each line:

ddddddddddddddddddddddd
aaaaaaaaaaaaaaaaaaaaaaa
avvvvvvvvvvvvvvvvvvvvvv
bbbbbbbbbbbbbbbbbbbbbbb


Constraints:
1) I know I can use this command:
cat sample.txt | sort | uniq > newfile.txt

but I would prefer not to sort the file, as I wish to leave it in its original order.
2) I would prefer to make the change in place (not writing to a new file, etc.).

I found this thread: http://www.linuxquestions.org/questions/programming-9/removing-duplicate-lines-with-sed-276169/

It offered the following solution, which I modified as follows:
# delete duplicate, consecutive lines from a file (emulates "uniq").
# First line in a set of duplicate lines is kept, rest are deleted.
sed -i '$!N; /^\(.*\)\n\1$/!P; D' sample.txt

# delete duplicate, nonconsecutive lines from a file. Beware not to
# overflow the buffer size of the hold space, or else use GNU sed.
sed -i 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' sample.txt

The first sed command (consecutive lines) worked.
The second sed command (nonconsecutive lines) did not work: lines were in fact duplicated in the file, and I do not understand sed well enough to fix it.

Any help would be greatly appreciated. If this requirement cannot be fulfilled using sed, or sed alone, please supply an alternative.
Asked by: klyles95

ozo commented:
perl -i -ne 'print unless $s{$_}++' sample.txt
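
How the one-liner works: -n loops over the input lines, -i rewrites the file in place, and $s{$_}++ returns the count of times the current line has been seen before (0, i.e. false, the first time), so each line is printed only on its first occurrence and the original order is kept.

For anyone who wants to stay with awk or sed, here is a rough sketch of two alternatives, run on a throwaway copy of the sample data. The temp directory and file names are only illustrative. The sed line assumes GNU sed (for -i), and it assumes the duplicated output in the question came from dropping -n: the one-liner as commonly published is invoked as sed -n, and without -n sed also auto-prints the whole pattern space (current line plus the saved lines) at the end of every cycle.

```shell
#!/bin/sh
# Build a throwaway copy of the sample data (paths are illustrative).
tmpdir=$(mktemp -d)
printf '%s\n' \
  ddddddddddddddddddddddd \
  ddddddddddddddddddddddd \
  aaaaaaaaaaaaaaaaaaaaaaa \
  avvvvvvvvvvvvvvvvvvvvvv \
  bbbbbbbbbbbbbbbbbbbbbbb \
  ddddddddddddddddddddddd > "$tmpdir/sample.txt"
cp "$tmpdir/sample.txt" "$tmpdir/sample_sed.txt"

# awk: same idea as the perl one-liner. seen[$0]++ is 0 (false) the first
# time a line appears, so !seen[$0]++ is true only for first occurrences.
# Plain awk has no in-place mode, so write to a temp file and move it back.
awk '!seen[$0]++' "$tmpdir/sample.txt" > "$tmpdir/sample.tmp" &&
  mv "$tmpdir/sample.tmp" "$tmpdir/sample.txt"

# sed: the script from the question, with -n restored next to -i so that
# only the explicit P command produces output (GNU sed assumed).
sed -i -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' "$tmpdir/sample_sed.txt"

# Show both results.
cat "$tmpdir/sample.txt"
cat "$tmpdir/sample_sed.txt"
```

Note that the sed version keeps every unique line in the hold space and only matches lines made of printable ASCII characters (the [ -~] class), so the perl or awk form is likely the safer choice for large files or files containing tabs.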