Solved

Regexp help with capturing (round 3)

Posted on 2004-09-01
7
181 Views
Last Modified: 2010-03-05
Once again, I am stuck on this!  I have asked a similar question before but now I'm trying to do something a little different.  I just can't seem to hack this to where I'm confident with it.  What I'm trying to do is this:

$an1 = 'Album Name';
$an2 = 'The Album Name';
$an3 = 'Album Name (Disc 1)';
$an4 = 'The Album Name (Disc 1)';
$an5 = 'Album Name: Get Some';
$an6 = 'The Album Name: Get Some';
$an7 = '(2004) The Album Name (Disc 1)';
$an8 = '(2004) The Album Name (Disc 1).ext'
$an9 = '(Album Name) Get Some';

for ( all $an's )
{
      if ( $type eq 'FileName' ) {
           s/^(\(\d+\)\s+)?(?:(The|A)\s+)([^(:]+)((?:(\s+\(.*)?)|(?:(\:\s+.*)?))(\..*)?$/$1$3, $2$4$5/i;
        } else {
           # The reverse
           s/^(.*),\s+(The|A)(.*)$/$2 $1$3/i;
        }
}

desired results:

an1 = Album Name
an2 = Album Name, The
an3 = Album Name (Disc 1)
an4 = Album Name, The (Disc 1)
an5 = Album Name: Get Some
an6 = Album Name, The: Get Some
an7 = (2004) Album Name, The (Disc 1)
an8 = (2004) Album Name, The (Disc 1).ext
an9 = (Album Name) Get Some

Basically, I want the word 'The' or 'A' to be appended to the end of the line but before any ':' or '('.  I also want any optional '(2004)\s+' entries at the front of the line to remain (and possibly the file extension).  My regexp above does work .. but I get 'use of uninitialized value' warnings when using 'use warnings'.  I'm also not sure I'm gonna catch all the possibilities with this.  Is there a more efficient / safer way to do this?  I also need to reverse the change .. which does seem to work with the second regexp above.  Thanks once again!
0
Comment
Question by:Verbatim
  • 5
7 Comments
 

Author Comment

by:Verbatim
ID: 11950947
Well, this also works .. but seems too complicated:

s/^(\(\d+\)\s+)?(?:(The|A)\s+)([^(:]+)((?:(?:\s+\(.*)?)(?:(?:\:\s+.*)?))$/${\($1?$1:'')}$3, $2${\($4?$4:'')}/i;

Also, I can't figure out how to capture the options '.ext' ..

thanks!
0
 
LVL 7

Expert Comment

by:rugdog
ID: 11958082
s/(.*)(The|A)\s+([\w| ]+)\b(\W)/\1\3, \2 \4/i
0
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 11960559
s/(^|(?<=\))\s*)(The|A)\s+([^(:]*\b)/$1$3, $2/;


0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 

Author Comment

by:Verbatim
ID: 11960845
ozo - that definitely works!  Is there a way to optionally grab the '.ext' off the end into $4 or something?  Also, what would be the reverse of this?  Thanks!
0
 

Author Comment

by:Verbatim
ID: 11963541
So .. I think the reverse would be this:

s/(^|(?<=\)\s))([^(:]*\b),\s(The|A)/$3 $2$1/;

It seems to work, does that look ok?
0
 

Author Comment

by:Verbatim
ID: 11963618
better yet (for the reverse)?

s/(^|(?<=\)\s))([^(:]*\b),\s(The|A)/$1$3 $2/;
0
 

Author Comment

by:Verbatim
ID: 11980901
Thanks ozo .. can you have a look at this question also?

http://www.experts-exchange.com/Programming/Programming_Languages/Perl/Q_21119221.html
0

Featured Post

Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

Join & Write a Comment

Suggested Solutions

Many time we need to work with multiple files all together. If its windows system then we can use some GUI based editor to accomplish our task. But what if you are on putty or have only CLI(Command Line Interface) as an option to  edit your files. I…
Checking the Alert Log in AWS RDS Oracle can be a pain through their user interface.  I made a script to download the Alert Log, look for errors, and email me the trace files.  In this article I'll describe what I did and share my script.
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
Here's a very brief overview of the methods PRTG Network Monitor (https://www.paessler.com/prtg) offers for monitoring bandwidth, to help you decide which methods you´d like to investigate in more detail.  The methods are covered in more detail in o…

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now