Solved

Remove duplicate lines using sed in Oracle Linux 6.5 environment

Posted on 2014-04-02
1
532 Views
Last Modified: 2014-04-03
I have a sample file (sample.txt) that looks like this:

ddddddddddddddddddddddd
ddddddddddddddddddddddd
aaaaaaaaaaaaaaaaaaaaaaa
avvvvvvvvvvvvvvvvvvvvvv
bbbbbbbbbbbbbbbbbbbbbbb
ddddddddddddddddddddddd

What I wish to do is remove all duplicate lines leaving only unique lines:

ddddddddddddddddddddddd
aaaaaaaaaaaaaaaaaaaaaaa
avvvvvvvvvvvvvvvvvvvvvv
bbbbbbbbbbbbbbbbbbbbbbb


Constraints:
1) I know I can use this command:
cat  sample.txt | sort | uniq > newfile.txt 

Open in new window

but would prefer to not sort the file as I wish to leave it in its original order.
2) Would prefer to make the change inline (not writing to a newfile etc.)

I found this link http://www.linuxquestions.org/questions/programming-9/removing-duplicate-lines-with-sed-276169/

which offered the following solution (which I modified as follows)
# delete duplicate, consecutive lines from a file (emulates "uniq").
# First line in a set of duplicate lines is kept, rest are deleted.
sed -i '$!N; /^\(.*\)\n\1$/!P; D' sample.txt

# delete duplicate, nonconsecutive lines from a file. Beware not to
# overflow the buffer size of the hold space, or else use GNU sed.
sed -i 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' sample.txt

Open in new window

The 1st sed command (consecutive lines) worked.
The 2nd sed command (non-consecutive lines) did not work (i.e. lines were in fact replicated in the file) and I do not understand sed enough to fix it.

Any help would be greatly appreciated.  If this requirement can not be fulfilled using sed or sed alone please supply alternative.
0
Comment
Question by:klyles95
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
1 Comment
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 39974196
perl -i -ne 'print unless $s{$_}++' sample.txt
0

Featured Post

Three Reasons Why Backup is Strategic

Backup is strategic to your business because your data is strategic to your business. Without backup, your business will fail. This white paper explains why it is vital for you to design and immediately execute a backup strategy to protect 100 percent of your data.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Utilizing an array to gracefully append to a list of EmailAddresses
It’s 2016. Password authentication should be dead — or at least close to dying. But, unfortunately, it has not traversed Quagga stage yet. Using password authentication is like laundering hotel guest linens with a washboard — it’s Passé.
Learn how to find files with the shell using the find and locate commands. Use locate to find a needle in a haystack.: With locate, check if the file still exists.: Use find to get the actual location of the file.:
Connecting to an Amazon Linux EC2 Instance from Windows Using PuTTY.
Suggested Courses

739 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question