Solved

How to check for duplicate data records using Perl?

Posted on 2011-03-04
Last Modified: 2012-05-11
How can I check if data records are duplicated using Perl?

For example, I have duplicated data records in the following format and need to detect them.

__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1
2. loop 10 - used data path 2
3. loop 10 - used data path 3

__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1
2. loop 10 - used data path 2
3. loop 10 - used data path 3

Question by:areyouready344
LVL 16

Expert Comment

by:sjklein42
ID: 35037423
(1) What output do you want - the input echoed, but with the duplicate records removed?

(2) I think this is right, but to confirm: are we looking for entire "sets" of data that match on both the timestamp and the detail lines that follow it?

Please provide more complete input test data so we can be sure it works the way you want.

Author Comment

by:areyouready344
ID: 35039728
I'm looking for complete duplicates: every line within a record must match, including the timestamp and the detail lines. The output should say either "no duplicates" or "duplicate records", and in the latter case display which records are duplicated.

Thanks,
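
The requirement above (whole-record comparison, plus a report of which records repeat) can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name find_duplicate_records is hypothetical, records are assumed to start with a line containing only __Data__, and the third sample record is made up so that there is something that does not match.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split the text on the __Data__
# marker and map each record's text to the list of positions where it occurs.
sub find_duplicate_records {
    my ($text) = @_;
    my %positions;
    my $n = 0;
    for my $record (split /^__Data__[ \t]*\r?\n/m, $text) {
        $record =~ s/\s+\z//;          # drop trailing blank lines
        next unless length $record;    # skip the empty chunk before the first marker
        push @{ $positions{$record} }, ++$n;
    }
    # Keep only the records that occur more than once.
    my %dupes;
    for my $record (keys %positions) {
        $dupes{$record} = $positions{$record} if @{ $positions{$record} } > 1;
    }
    return \%dupes;
}

# Sample input: the two records from the question plus one made-up record.
my $sample = <<'END';
__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1
2. loop 10 - used data path 2
3. loop 10 - used data path 3

__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1
2. loop 10 - used data path 2
3. loop 10 - used data path 3

__Data__
@scsi_test
time: 3:45 p.m.
1. loop 10 - used data path 1
END

my $dupes = find_duplicate_records($sample);
if (%$dupes) {
    for my $record (sort keys %$dupes) {
        print "duplicate record, seen at positions @{ $dupes->{$record} }:\n$record\n";
    }
} else {
    print "no duplicates\n";
}
```

Keeping the positions (rather than just a count) is one way to answer "which ones are duplicated"; whether record numbers or the record text itself is the more useful report depends on the real data.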
0
 
LVL 26

Accepted Solution

by:
wilcoxon earned 2000 total points
ID: 35040444
This should work...
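#!/usr/local/bin/perl

use strict;
use warnings;

# Read the input one record at a time, using the marker as the separator.
$/ = '__Data__';
my (%seen, %dupe);
while (<>) {
    chomp;                      # remove the trailing '__Data__' separator
    s{\n+$}{};                  # remove trailing blank lines
    next unless length;         # skip the empty chunk before the first marker
    $dupe{$_}++ if $seen{$_};   # seen before: remember it as a duplicate
    $seen{$_}++;
}

# Report each duplicated record once, or say that none were found.
if (%dupe) {
    print "duplicate records:\n";
    foreach my $set (sort keys %dupe) {
        print "$set\n";
    }
} else {
    print "no duplicates\n";
}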
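
One caveat: the script above compares record text exactly, so two records that differ only in line endings or trailing spaces would not be reported as duplicates. The thread does not say whether such differences occur in the real data. If they do, a minimal sketch of a normalizing step might look like this (normalize_record is a hypothetical helper, not part of the accepted solution):

```perl
use strict;
use warnings;

# Hypothetical helper: build a comparison key that ignores CRLF vs LF line
# endings, trailing spaces/tabs on each line, and leading/trailing blank lines.
sub normalize_record {
    my ($record) = @_;
    $record =~ s/\r\n?/\n/g;        # CRLF or lone CR -> LF
    $record =~ s/[ \t]+$//mg;       # trailing spaces and tabs on every line
    $record =~ s/\A\n+//;           # leading blank lines
    $record =~ s/\n+\z//;           # trailing blank lines
    return $record;
}

# Two copies of the same record: one with Unix line endings, one with
# Windows line endings and stray trailing spaces.
my $unix = "\n\@scsi_test\ntime: 2:30 p.m.\n1. loop 10 - used data path 1\n\n";
my $dos  = "\r\n\@scsi_test  \r\ntime: 2:30 p.m.\r\n1. loop 10 - used data path 1\r\n";

if (normalize_record($unix) eq normalize_record($dos)) {
    print "same record after normalization\n";
} else {
    print "records differ\n";
}
```

In the script above, the normalized text could then serve as the hash key in place of $_, for example $seen{ normalize_record($_) }.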


Author Comment

by:areyouready344
ID: 35193912
thanks again Wilcoxo...