
02.19.2008 at 12:29AM PST, ID: 23173702

Remove duplicate string in txt file using perl

Tags: Perl
Hi all,

I need to remove lines that repeat the same string from a text file using Perl. How do I do it?

Could someone please help?
Testing.txt
==========================================================
 
AAA\BBB\CCC	1st
111\222\333	2nd
444\555\666	3rd
rrr\ggg\vvv	3rd  //Remove this line away as it is the same
777\888\999	4th
Question Stats
Zone: Programming
Question Asked By: KenTan85
Solution Provided By: ozo
Participating Experts: 2
Solution Grade: A
Views: 93
 
02.19.2008 at 12:42AM PST, ID: 20926634

Rank: Sage

Is there a specific reason to use Perl?

You can simply do

uniq -f1 Testing.txt
 
02.19.2008 at 12:47AM PST, ID: 20926662

Rank: Genius

Are the duplicate lines always adjacent?
 
02.19.2008 at 12:49AM PST, ID: 20926670
Tintin, I need to use Perl because I have to incorporate this into another application that is built around Perl. Your solution does not work for me because I am not working on a Unix platform.

ozo, yes, as far as I can tell the duplicates are always adjacent.
 
02.19.2008 at 01:07AM PST, ID: 20926739

Rank: Genius

my %seen=();
$seen{(split)[1]}++ or print while <>;
#should work even if the duplicates are not adjacent
 
02.19.2008 at 01:11AM PST, ID: 20926759

Rank: Genius

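# If the duplicates are always adjacent, this can use less memory
my $prev = '';                 # second field of the previous line
while( <> ){
   my $next = (split)[1];      # second whitespace-separated field of this line
   print if $prev ne $next;
   $prev = $next;
}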
 
02.19.2008 at 01:20AM PST, ID: 20926799
Hi ozo, I am new to Perl. Would you mind giving a more detailed explanation of the code?
 
02.19.2008 at 01:29AM PST, ID: 20926824

Rank: Genius

In general, see
perldoc -q duplicate

split by default splits $_ on whitespace; see
perldoc -f split

(split)[1] takes the element at index 1 of the list that split returns, i.e. the second field.

$seen{(split)[1]} is the element of the hash %seen that has that field as its key.

$seen{(split)[1]}++ returns the old value stored at that key and then increments it.
If the old value is non-zero (true), we have incremented it before, which means we have already seen that field, so the line is not printed.
Accepted Solution
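
The explanation above can be spelled out step by step. What follows is only a sketch of the same idea written longhand, not ozo's code: the sub name unique_by_second_field is made up for illustration, and keeping lines that have no second field is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines whose second
# whitespace-separated field has not been seen on an earlier line.
sub unique_by_second_field {
    my @lines = @_;
    my %seen;    # field value => how many times it has appeared so far
    my @kept;
    for my $line (@lines) {
        # split ' ' behaves like the bare split in the one-liner,
        # but works on $line instead of $_
        my $key = (split ' ', $line)[1];

        # Assumption: a line without a second field is kept as it is
        if (!defined($key)) {
            push @kept, $line;
            next;
        }

        # Post-increment returns the old count: 0 (false) the first time,
        # so only the first line carrying a given key is kept
        push @kept, $line unless $seen{$key}++;
    }
    return @kept;
}

# Sample data from the question, without the "//Remove ..." note
my @sample = (
    "AAA\\BBB\\CCC\t1st\n",
    "111\\222\\333\t2nd\n",
    "444\\555\\666\t3rd\n",
    "rrr\\ggg\\vvv\t3rd\n",
    "777\\888\\999\t4th\n",
);
print unique_by_second_field(@sample);
```

Run on the sample, this drops the rrr\ggg\vvv line because its second field, 3rd, was already seen on the line before it.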
 
02.19.2008 at 01:36AM PST, ID: 20926854
When I try to do it this way, it only ever prints the first line. Why is that?
open(IN, "Testing.txt")  || die "\nCan't open file $! $input_file \n";
my %seen=();
$seen{(split)[1]}++ or print while <IN>;
close IN;
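
For comparison, here is a rough sketch of the same logic using a lexical filehandle and the three-argument form of open. The thread as captured does not say why only the first line was printed, so this is not a diagnosis. The sub name dedupe_file and the temporary file are hypothetical, and the die message names the path that was actually opened, since $input_file is not defined in the lines shown above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read the file at $path and return the lines whose
# second whitespace-separated field has not appeared earlier in the file.
sub dedupe_file {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Can't open $path: $!\n";
    my %seen;
    my @kept;
    while (my $line = <$in>) {
        my $key = (split ' ', $line)[1];
        # Assumption: lines without a second field are kept as they are
        push @kept, $line if !defined($key) || !$seen{$key}++;
    }
    close $in;
    return @kept;
}

# Demo on a temporary copy of the sample data from the question
my ($fh, $tmp) = tempfile(UNLINK => 1);
print {$fh} "AAA\\BBB\\CCC\t1st\n",
            "111\\222\\333\t2nd\n",
            "444\\555\\666\t3rd\n",
            "rrr\\ggg\\vvv\t3rd\n",
            "777\\888\\999\t4th\n";
close $fh;
print dedupe_file($tmp);
```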
 
02.19.2008 at 01:46AM PST, ID: 20926897
OK ozo, your code is good, but I realised that sometimes the first field contains a space, and then the wrong value gets compared, as in the example below. The first field in the text file is supposed to be a path, and a path can sometimes have a space in it. How do we solve this?
Testing.txt
==========================================================
 
AAA\BB B\CCC	1st
111\22 2\333	2nd
444\555\666	3rd
rrr\ggg\vvv	3rd  //Remove this line away as it is the same
777\888\999	4th
 
 
ozo
02.19.2008 at 02:56AM PST, ID: 20927148
What determines which is the part you want to compare for duplication if not whitespace?
I thought about taking the last whitespace separated field instead of the second, but then
"same" would be the last field of
rrr\ggg\vvv     3rd  //Remove this line away as it is the same
Or do we require multiple spaces rather than just one to separate fields?
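
The question above is left unanswered in the thread as captured, so the following is only one possible sketch. It assumes that a single tab separates the path from the label, as the sample listing appears to show. Under that assumption the line can be split on the first tab only, so spaces inside the path no longer matter, and the first whitespace-delimited token after the tab serves as the key, so a trailing note such as "//Remove ..." is ignored. The sub names label_key and unique_by_label are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the comparison key from one line.
# Assumption: a tab separates the path from the label, and the path
# itself may contain spaces but no tab.
sub label_key {
    my ($line) = @_;
    my (undef, $rest) = split /\t/, $line, 2;   # split on the first tab only
    return unless defined $rest;                # no tab: no key
    my ($key) = $rest =~ /^\s*(\S+)/;           # first token after the tab
    return $key;
}

# Keep only the first line for each key; lines without a key are kept
# unchanged (assumption).
sub unique_by_label {
    my %seen;
    my @kept;
    for my $line (@_) {
        my $key = label_key($line);
        push @kept, $line if !defined($key) || !$seen{$key}++;
    }
    return @kept;
}

my @sample = (
    "AAA\\BB B\\CCC\t1st\n",
    "111\\22 2\\333\t2nd\n",
    "444\\555\\666\t3rd\n",
    "rrr\\ggg\\vvv\t3rd  //Remove this line away as it is the same\n",
    "777\\888\\999\t4th\n",
);
print unique_by_label(@sample);
```

If the real separator turns out to be something else, for example two or more spaces, only the pattern in label_key would need to change.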
 
 
 