
Remove duplicates from the CSV file

Asked by shragi (India)
Topics: Programming, Perl, Scripting Languages
Hi - I wrote the Perl script below to extract each employee ID and date of joining.
The script writes its output to a CSV file in the following format:

100000001,20150601
100000002,20150101
100000231,20150101
100002431,20150101
100023545,20150301
100000021,20150101
100000031,20150101
100000051,20150418
102044353,20150401
100000054,20150601
104560071,20150301
100045671,20150301
100045671,20150301
100045671,20150601
215645567,20150301


As you can see, employee ID 100045671 is repeated three times, but I want it to appear only once, keeping just the first occurrence.
So from the three lines below:
100045671,20150301
100045671,20150301
100045671,20150601
I want only the first one:
100045671,20150301

How can I modify the script to achieve that?
I don't want any duplicate employee IDs in the file. If there are duplicates, I want to keep just the first occurrence and remove the rest from the CSV.

use strict;
use warnings;
use Pod::Usage;
use Getopt::Long;
use Time::Piece;

my $helpme = 0;
my $man = 0;

my $outputFileName = 'C:\\temp\\test_v1.csv';
my $inputFileName =  'C:\\temp\\test.txt';

my $errorcode = 0;
my $DEBUG=0;

if(exists $ENV{DEBUG}) {
	$DEBUG = ($ENV{DEBUG} eq "") ? 0 : $ENV{DEBUG};
}

GetOptions('help' => \$helpme, 'man' => \$man, 'infile=s' => \$inputFileName, 'outfile=s' => \$outputFileName) or pod2usage(2);

pod2usage(1) if $helpme;
pod2usage(-verbose => 2) if $man;

die 'No input file name specified!' unless $inputFileName;
die 'No output file name specified!' unless $outputFileName;

open(INFILE, '<', $inputFileName) or die "Could not open input file: $!";
open(OUTFILE, '>', $outputFileName) or die "Could not open/create output file: $!";

while(<INFILE>) {
	chomp;
	
	if ((/^50\|([^|]+)\|/) || (/^51\|([^|]+)\|/)) {
		my $empID = $1;
		print OUTFILE $empID, ",";		
	}
	if ((/^90\|([^|]+)\|/) || (/^91\|([^|]+)\|/)) {
		my $eDate = $1;
		my $dt = Time::Piece->strptime($eDate, '%m/%d/%Y');
		print OUTFILE $dt->strftime('%Y%m%d'), "\n";
	}
}

close INFILE;
close OUTFILE;

Thanks,
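
A minimal sketch of one common way to keep only the first occurrence: remember each employee ID in a hash and skip any ID that has already been written. It assumes every 50/51 line is followed by its matching 90/91 line, so the ID can be held in a variable until the date arrives. The sub name first_occurrences and the sample input lines are made up for illustration; only the regex patterns and the date format come from the script above.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper: takes raw input lines and returns "empID,YYYYMMDD"
# rows, keeping only the first row seen for each employee ID.
sub first_occurrences {
    my @lines = @_;
    my (%seen, @rows, $empID);

    for my $line (@lines) {
        chomp $line;

        if ($line =~ /^5[01]\|([^|]+)\|/) {
            # Hold the ID until its date line arrives (assumed to follow).
            $empID = $1;
        }
        elsif (defined $empID && $line =~ /^9[01]\|([^|]+)\|/) {
            my $eDate = $1;
            my $dt = Time::Piece->strptime($eDate, '%m/%d/%Y');

            # $seen{$empID}++ is false only the first time an ID appears,
            # so later duplicates are skipped.
            push @rows, join(',', $empID, $dt->strftime('%Y%m%d'))
                unless $seen{$empID}++;

            undef $empID;
        }
    }
    return @rows;
}

# Made-up sample input in the pipe-delimited layout the regexes expect.
my @sample = (
    "50|100045671|", "90|03/01/2015|",
    "51|100045671|", "91|06/01/2015|",
    "50|215645567|", "90|03/01/2015|",
);

print "$_\n" for first_occurrences(@sample);
# prints:
# 100045671,20150301
# 215645567,20150301
```

In the original script the same idea would mean buffering the ID instead of printing it immediately, then printing the ID and date together only when the ID has not been seen before. Note that this changes one behavior: an ID line with no date line after it is dropped rather than written as a partial row.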
ASKER CERTIFIED SOLUTION
wilcoxon