Solved

How to check for duplicate data records using Perl?

Posted on 2011-03-04
Last Modified: 2012-05-11
How can I check whether data records are duplicated using Perl?

For example, I have duplicated data records in the following format and need to detect them.

__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1
2. loop 10 - used data path 2
3. loop 10 - used data path 3

__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1
2. loop 10 - used data path 2
3. loop 10 - used data path 3

Question by:areyouready344
4 Comments
 
LVL 16

Expert Comment

by:sjklein42
ID: 35037423
(1) What output do you want - the input echoed, but with the duplicate records removed?

(2) I think this is right, but are we looking for entire "sets" of data that match based on the timestamp and the detail lines following it?

Please provide more complete input test data so we can be sure it works the way you want.
 

Author Comment

by:areyouready344
ID: 35039728
I'm looking for complete duplicates: every line within a record (including the timestamp and the detail lines) must match. The output should say whether or not there are duplicate records, and display which ones are duplicated.

Thanks,
 
LVL 27

Accepted Solution

by:
wilcoxon earned 2000 total points
ID: 35040444
This should work...
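#!/usr/local/bin/perl

use strict;
use warnings;

# Read one record at a time, using the '__Data__' marker as the separator.
$/ = '__Data__';
my (%seen, %dupe);
while (<>) {
    chomp;                      # remove the trailing '__Data__' marker
    s{^\s+|\s+$}{}g;            # trim surrounding blank lines and whitespace
    next unless length;         # skip the empty chunk before the first marker
    $dupe{$_}++ if $seen{$_};   # seen before: remember it as a duplicate
    $seen{$_}++;
}

if (%dupe) {
    print "duplicate records:\n";
    foreach my $set (sort keys %dupe) {
        print "$set\n";
    }
} else {
    print "no duplicates\n";
}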
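
If you also want to know where the duplicates are, here is a rough variant of the same idea that remembers the position of each record (1 = first record in the input). This is only a sketch: the find_duplicates helper and the inline sample are my own names for illustration, and it assumes each '__Data__' marker sits on a line by itself.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper (not part of the script above): split the text on
# '__Data__' marker lines and return a hash that maps each repeated record
# to the list of 1-based positions at which it occurs.
sub find_duplicates {
    my ($text) = @_;
    my %positions;
    my $n = 0;
    for my $record (split /^__Data__[ \t]*\n/m, $text) {
        $record =~ s/\A\s+|\s+\z//g;    # trim surrounding whitespace
        next unless length $record;     # skip the empty chunk before the first marker
        push @{ $positions{$record} }, ++$n;
    }
    # Keep only the records that were seen more than once.
    return map  { $_ => $positions{$_} }
           grep { @{ $positions{$_} } > 1 }
           keys %positions;
}

# Sample input taken from the question (single-quoted heredoc, so
# '@scsi_test' is not interpolated).
my $sample = <<'END';
__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1
2. loop 10 - used data path 2
3. loop 10 - used data path 3

__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1
2. loop 10 - used data path 2
3. loop 10 - used data path 3
END

my %dupes = find_duplicates($sample);
if (%dupes) {
    for my $record (sort keys %dupes) {
        print "duplicate record at positions @{ $dupes{$record} }:\n$record\n\n";
    }
} else {
    print "no duplicates\n";
}
```

To run it against a file instead of the inline sample, you could slurp the file into a string first (for example with `local $/;` inside a `do` block) and pass that string to find_duplicates.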

 

Author Comment

by:areyouready344
ID: 35193912
thanks again Wilcoxo...