Remove duplicate lines using sed in Oracle Linux 6.5 environment

Posted on 2014-04-02
Last Modified: 2014-04-03
I have a sample file (sample.txt) that looks like this:

ddddddddddddddddddddddd
ddddddddddddddddddddddd
aaaaaaaaaaaaaaaaaaaaaaa
avvvvvvvvvvvvvvvvvvvvvv
bbbbbbbbbbbbbbbbbbbbbbb
ddddddddddddddddddddddd

I want to remove all duplicate lines, keeping only the first occurrence of each line:

ddddddddddddddddddddddd
aaaaaaaaaaaaaaaaaaaaaaa
avvvvvvvvvvvvvvvvvvvvvv
bbbbbbbbbbbbbbbbbbbbbbb


Constraints:
1) I know I can use this command:
sort sample.txt | uniq > newfile.txt

but I would prefer not to sort the file, since I want to keep the lines in their original order.
2) I would prefer to make the change in place (without writing to a new file).

I found this thread: http://www.linuxquestions.org/questions/programming-9/removing-duplicate-lines-with-sed-276169/

It offered the following solutions, which I modified as follows:
# delete duplicate, consecutive lines from a file (emulates "uniq").
# First line in a set of duplicate lines is kept, rest are deleted.
sed -i '$!N; /^\(.*\)\n\1$/!P; D' sample.txt

# delete duplicate, nonconsecutive lines from a file. Beware not to
# overflow the buffer size of the hold space, or else use GNU sed.
sed -i 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' sample.txt

The 1st sed command (consecutive lines) worked.
The 2nd sed command (non-consecutive lines) did not work (lines were duplicated in the file instead of removed), and I do not understand sed well enough to fix it.

Any help would be greatly appreciated. If this cannot be done with sed alone, please suggest an alternative.
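
Note on the second command: a plausible cause of the replicated lines is that this one-liner is usually given with -n, and replacing -n with -i turns sed's automatic printing back on. Each cycle then writes the whole pattern space (the current line plus everything accumulated in the hold space) in addition to the line printed by P. A minimal sketch of the fix, assuming GNU sed; the `sample` variable and the temporary file are illustrative stand-ins for sample.txt, not part of the original post:

```shell
# Scratch copy of the sample data from the question (hypothetical temp file).
sample=$(mktemp)
printf '%s\n' \
  ddddddddddddddddddddddd \
  ddddddddddddddddddddddd \
  aaaaaaaaaaaaaaaaaaaaaaa \
  avvvvvvvvvvvvvvvvvvvvvv \
  bbbbbbbbbbbbbbbbbbbbbbb \
  ddddddddddddddddddddddd > "$sample"

# -n turns off automatic printing, so only the explicit P writes a line.
# Keep -n and -i as separate options: GNU sed reads "-in" as -i with backup suffix "n".
# LC_ALL=C keeps the [ -~] range meaning "printable ASCII".
LC_ALL=C sed -n -i 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' "$sample"

# Show the de-duplicated file; first occurrences stay in their original order.
cat "$sample"
```

Two caveats with this approach: the [ -~] class only matches printable ASCII, so lines containing tabs or non-ASCII characters are not recognized as duplicates, and the hold space grows with every unique line, which may become slow on large files.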
Question by:klyles95
Accepted Solution

by: ozo
perl -i -ne 'print unless $s{$_}++' sample.txt