Solved

Remove duplicate lines using sed in Oracle Linux 6.5 environment

Posted on 2014-04-02
1
526 Views
Last Modified: 2014-04-03
I have a sample file (sample.txt) that looks like this:

ddddddddddddddddddddddd
ddddddddddddddddddddddd
aaaaaaaaaaaaaaaaaaaaaaa
avvvvvvvvvvvvvvvvvvvvvv
bbbbbbbbbbbbbbbbbbbbbbb
ddddddddddddddddddddddd

What I wish to do is remove all duplicate lines leaving only unique lines:

ddddddddddddddddddddddd
aaaaaaaaaaaaaaaaaaaaaaa
avvvvvvvvvvvvvvvvvvvvvv
bbbbbbbbbbbbbbbbbbbbbbb


Constraints:
1) I know I can use this command:
cat  sample.txt | sort | uniq > newfile.txt 

Open in new window

but would prefer to not sort the file as I wish to leave it in its original order.
2) Would prefer to make the change inline (not writing to a newfile etc.)

I found this link http://www.linuxquestions.org/questions/programming-9/removing-duplicate-lines-with-sed-276169/

which offered the following solution (which I modified as follows)
# delete duplicate, consecutive lines from a file (emulates "uniq").
# First line in a set of duplicate lines is kept, rest are deleted.
sed -i '$!N; /^\(.*\)\n\1$/!P; D' sample.txt

# delete duplicate, nonconsecutive lines from a file. Beware not to
# overflow the buffer size of the hold space, or else use GNU sed.
sed -i 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' sample.txt

Open in new window

The 1st sed command (consecutive lines) worked.
The 2nd sed command (non-consecutive lines) did not work (i.e. lines were in fact replicated in the file) and I do not understand sed enough to fix it.

Any help would be greatly appreciated.  If this requirement can not be fulfilled using sed or sed alone please supply alternative.
0
Comment
Question by:klyles95
1 Comment
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 39974196
perl -i -ne 'print unless $s{$_}++' sample.txt
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

The purpose of this article is to demonstrate how we can use conditional statements using Python.
It’s 2016. Password authentication should be dead — or at least close to dying. But, unfortunately, it has not traversed Quagga stage yet. Using password authentication is like laundering hotel guest linens with a washboard — it’s Passé.
Learn how to navigate the file tree with the shell. Use pwd to print the current working directory: Use ls to list a directory's contents: Use cd to change to a new directory: Use wildcards instead of typing out long directory names: Use ../ to move…
Get a first impression of how PRTG looks and learn how it works.   This video is a short introduction to PRTG, as an initial overview or as a quick start for new PRTG users.

867 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

26 Experts available now in Live!

Get 1:1 Help Now