Solved

How to check for duplicate data records using Perl?

Posted on 2011-03-04
Last Modified: 2012-05-11
How can I check if data records are duplicated using Perl?

For example, my data records have the following format, and I need to detect when a record appears more than once.

__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1
2. loop 10 - used data path 2
3. loop 10 - used data path 3

__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1
2. loop 10 - used data path 2
3. loop 10 - used data path 3

Question by:areyouready344

Expert Comment

by:sjklein42
(1) What output do you want - the input echoed, but with the duplicate records removed?

(2) I think this is right, but are we looking for entire "sets" of data that match based on the timestamp and the detail lines following it?

Please provide more complete test input so we can be sure it works the way you want.

Author Comment

by:areyouready344
I am looking for complete duplicates, where every line within a record matches (including the timestamp and the detail lines). The output should say whether there are duplicate records and, if so, display which ones are duplicated.

Thanks,

Accepted Solution

by:wilcoxon
This should work...
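#!/usr/local/bin/perl

use strict;
use warnings;

# Read the input one record at a time, using '__Data__' as the record separator.
$/ = '__Data__';
my (%seen, %dupe);
while (<>) {
    chomp;               # remove the trailing '__Data__' separator
    s{^\n+}{};           # drop the newlines that follow the marker
    s{\n+$}{};           # drop the newlines before the next marker
    next unless length;  # skip the empty chunk before the first marker
    $dupe{$_}++ if $seen{$_};
    $seen{$_}++;
}

# Report every record that was seen more than once.
if (%dupe) {
    print "duplicate records:\n";
    foreach my $set (sort keys %dupe) {
        print "$set\n";
    }
} else {
    print "no duplicates\n";
}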
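
The question also asks to display which records are duplicated. As a rough sketch only, the same idea can be written so that it remembers the position of each record in the input. The helper name find_duplicates and the sample text are made up for illustration and are not part of the accepted answer; it assumes the whole input fits in memory and that every record starts with the __Data__ marker.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split the text on the '__Data__' marker and return a
# hash that maps each duplicated record to the 1-based positions where it occurs.
sub find_duplicates {
    my ($text) = @_;
    my %positions;
    my $n = 0;
    for my $record (split /__Data__/, $text) {
        $record =~ s/^\n+//;           # drop newlines right after the marker
        $record =~ s/\n+$//;           # drop newlines before the next marker
        next unless length $record;    # skip the empty chunk before the first marker
        push @{ $positions{$record} }, ++$n;
    }
    # Keep only the records that occurred more than once.
    my %dupes;
    for my $record (keys %positions) {
        $dupes{$record} = $positions{$record} if @{ $positions{$record} } > 1;
    }
    return %dupes;
}

# Made-up sample input: records 1 and 2 are identical, record 3 differs.
my $sample = <<'END';
__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1

__Data__
@scsi_test
time: 2:30 p.m.
1. loop 10 - used data path 1

__Data__
@scsi_test
time: 3:45 p.m.
1. loop 10 - used data path 1
END

my %dupes = find_duplicates($sample);
if (%dupes) {
    for my $record (sort keys %dupes) {
        print "duplicate record at positions @{ $dupes{$record} }:\n$record\n";
    }
} else {
    print "no duplicates\n";
}
```

With the sample above, the first two records should be reported together as positions 1 and 2. To run it against a real file, the sample text could be replaced by the slurped file contents.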

 

Author Comment

by:areyouready344
Thanks again, wilcoxon.