
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 992

Easy-peasy: Script to delete duplicate lines in a file

This should be really simple, I know, but I'm quite new to scripting and I just don't have any idea how to do it.

Imagine a process that returns a list of paths, with the form:

aaa/bbb/ccc
aaa/bbb/ccc
aaa/bbb/ccc
ddd/eee/fff
ddd/eee/fff
ddd/eee/fff
bbb/ccc/ddd
bbb/ccc/ddd
.
.
.

what script will reduce that to a file that just has:

aaa/bbb/ccc
ddd/eee/fff
bbb/ccc/ddd

and nothing else?

Currently I've tried this:

# FileList is the file that's being created.

: > FileList
<first process> |
while read -r b
        do
        c=`dirname "$b"`
        # grep -qxF: quiet, match the whole line, treat the pattern as a fixed string
        # (the earlier sed version, sed -n '/"$c"/p', never expanded $c because of the single quotes)
        if grep -qxF "$c" FileList
                then
                        continue
                else
                        echo "$c" >> FileList
                fi
        done

I've tried a few other variations, all based around this theme, but can't seem to get it to work.  Any and all ideas are welcome, and wordy explanations are preferred.  ;-)
Asked by: kyle_in_taiwan
1 Solution
 
ravenpl commented:
try: uniq original > uniqued.txt
Note: uniq only deletes duplicates on adjacent lines, i.e. if your file has
aaa/bbb/aaa
aaa/ccc/aaa
aaa/bbb/aaa

uniq will do nothing.
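For duplicates that are not adjacent, two standard approaches (not spelled out in the comment above, but common shell practice) are sort -u, which drops all duplicates but reorders the lines, and awk, which keeps the first occurrence of each line in its original position. The filenames original and uniqued.txt are just the example names from the comment:

```shell
# Assumed example input: the three-line file from the note above.
printf '%s\n' 'aaa/bbb/aaa' 'aaa/ccc/aaa' 'aaa/bbb/aaa' > original

# Option 1: sort -u removes all duplicates but sorts the output.
sort -u original > uniqued.txt

# Option 2: awk prints a line only the first time it is seen,
# preserving the original order (seen[$0]++ counts prior sightings).
awk '!seen[$0]++' original > uniqued.txt
cat uniqued.txt
# prints:
# aaa/bbb/aaa
# aaa/ccc/aaa
```

The awk form fits the question best, since the paths should come out in the order the first process emits them.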
 
kyle_in_taiwan (Author) commented:
Cool.  I've been able to wrangle a solution out of that one already.  Thanks.
