Solved

Regexp help with capturing (round 3)

Posted on 2004-09-01
Medium Priority
Last Modified: 2010-03-05
Once again, I am stuck on this!  I have asked a similar question before but now I'm trying to do something a little different.  I just can't seem to hack this to where I'm confident with it.  What I'm trying to do is this:

$an1 = 'Album Name';
$an2 = 'The Album Name';
$an3 = 'Album Name (Disc 1)';
$an4 = 'The Album Name (Disc 1)';
$an5 = 'Album Name: Get Some';
$an6 = 'The Album Name: Get Some';
$an7 = '(2004) The Album Name (Disc 1)';
$an8 = '(2004) The Album Name (Disc 1).ext';
$an9 = '(Album Name) Get Some';

for ( $an1, $an2, $an3, $an4, $an5, $an6, $an7, $an8, $an9 )
{
    if ( $type eq 'FileName' ) {
        s/^(\(\d+\)\s+)?(?:(The|A)\s+)([^(:]+)((?:(\s+\(.*)?)|(?:(\:\s+.*)?))(\..*)?$/$1$3, $2$4$5/i;
    } else {
        # The reverse
        s/^(.*),\s+(The|A)(.*)$/$2 $1$3/i;
    }
}

desired results:

an1 = Album Name
an2 = Album Name, The
an3 = Album Name (Disc 1)
an4 = Album Name, The (Disc 1)
an5 = Album Name: Get Some
an6 = Album Name, The: Get Some
an7 = (2004) Album Name, The (Disc 1)
an8 = (2004) Album Name, The (Disc 1).ext
an9 = (Album Name) Get Some

Basically, I want a leading 'The' or 'A' moved to the end of the title, after a comma, but before any ':' or '('.  Any optional '(2004)\s+' prefix at the front of the line should stay where it is (and possibly the file extension at the end, too).  My regexp above does work, but I get 'Use of uninitialized value' warnings under 'use warnings'.  I'm also not sure it will catch all the possibilities.  Is there a more efficient / safer way to do this?  I also need to reverse the change, which does seem to work with the second regexp above.  Thanks once again!
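
The warnings most likely come from optional capture groups: when a group such as `(\(\d+\)\s+)?` does not take part in the match, its variable is undef, and interpolating it in the replacement warns. The sketch below uses a simplified pattern (not the one from the question) to show the difference between an optional group and a group with optional content. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be counted.
my @warned;
local $SIG{__WARN__} = sub { push @warned, $_[0] };

# Optional *group*: with no "(2004) " prefix, group 1 does not take part
# in the match, so $1 is undef and interpolating it triggers a warning.
my $loose = 'The Album Name';
$loose =~ s/^(\(\d+\)\s+)?(The|A)\s+(.+)$/$1$3, $2/;
my $warnings_loose = @warned;

# Optional *content*: the group always takes part and captures '' when
# the prefix is absent, so $1 is defined and nothing warns.
@warned = ();
my $tight = 'The Album Name';
$tight =~ s/^((?:\(\d+\)\s+)?)(The|A)\s+(.+)$/$1$3, $2/;
my $warnings_tight = @warned;

print "loose: $loose ($warnings_loose warning(s))\n";
print "tight: $tight ($warnings_tight warning(s))\n";
```

Both substitutions produce the same string; only the second does so without touching an undefined capture.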
Question by:Verbatim

Author Comment

by:Verbatim
ID: 11950947
Well, this also works .. but seems too complicated:

s/^(\(\d+\)\s+)?(?:(The|A)\s+)([^(:]+)((?:(?:\s+\(.*)?)(?:(?:\:\s+.*)?))$/${\($1?$1:'')}$3, $2${\($4?$4:'')}/i;

Also, I can't figure out how to capture the optional '.ext' ..

thanks!
LVL 7

Expert Comment

by:rugdog
ID: 11958082
s/(.*)(The|A)\s+([\w| ]+)\b(\W)/\1\3, \2 \4/i
LVL 84

Accepted Solution

by:ozo (earned 2000 total points)
ID: 11960559
s/(^|(?<=\))\s*)(The|A)\s+([^(:]*\b)/$1$3, $2/;
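
A quick way to check this substitution against the nine sample names from the question is a small table-driven harness. This is only a sketch; the helper name `to_filename` is made up for illustration and does not appear in the thread.

```perl
use strict;
use warnings;

# Input => desired output, copied from the question.
my @cases = (
    [ 'Album Name',                         'Album Name' ],
    [ 'The Album Name',                     'Album Name, The' ],
    [ 'Album Name (Disc 1)',                'Album Name (Disc 1)' ],
    [ 'The Album Name (Disc 1)',            'Album Name, The (Disc 1)' ],
    [ 'Album Name: Get Some',               'Album Name: Get Some' ],
    [ 'The Album Name: Get Some',           'Album Name, The: Get Some' ],
    [ '(2004) The Album Name (Disc 1)',     '(2004) Album Name, The (Disc 1)' ],
    [ '(2004) The Album Name (Disc 1).ext', '(2004) Album Name, The (Disc 1).ext' ],
    [ '(Album Name) Get Some',              '(Album Name) Get Some' ],
);

# Hypothetical wrapper around the substitution from the accepted solution.
sub to_filename {
    my ($name) = @_;
    $name =~ s/(^|(?<=\))\s*)(The|A)\s+([^(:]*\b)/$1$3, $2/;
    return $name;
}

my $failures = 0;
for my $case (@cases) {
    my ($in, $want) = @$case;
    my $got = to_filename($in);
    my $ok  = $got eq $want;
    $failures++ unless $ok;
    printf "%-4s %-36s => %s\n", ($ok ? 'ok' : 'FAIL'), $in, $got;
}
print "$failures failure(s)\n";
```

Group 1 always takes part in the match (it captures either the empty string or the whitespace after a closing parenthesis), so `$1` is never undef here. Note that the pattern has no `/i` flag, so it is case-sensitive; a lowercase 'the' would be left alone unless `/i` is added.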



Author Comment

by:Verbatim
ID: 11960845
ozo - that definitely works!  Is there a way to optionally grab the '.ext' off the end into $4 or something?  Also, what would be the reverse of this?  Thanks!
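
The thread never answers the '.ext' part directly. One possible approach, offered here only as a sketch, is to peel the extension off in a separate step instead of adding another capture group to the main pattern. The helper names `split_ext` and `to_filename` are hypothetical, and the definition of an extension (a final dot followed by one to five word characters) is an assumption that will misfire on titles that really end that way.

```perl
use strict;
use warnings;

# Hypothetical helper. Assumption: an extension is a final '.' followed
# by 1-5 word characters at the very end of the string.
sub split_ext {
    my ($name) = @_;
    my $ext = $name =~ s/(\.\w{1,5})\z// ? $1 : '';
    return ($name, $ext);
}

# Hypothetical wrapper around the substitution from the accepted solution.
sub to_filename {
    my ($name) = @_;
    $name =~ s/(^|(?<=\))\s*)(The|A)\s+([^(:]*\b)/$1$3, $2/;
    return $name;
}

my ($base, $ext) = split_ext('(2004) The Album Name (Disc 1).ext');
my $renamed = to_filename($base) . $ext;

print "base:    $base\n";
print "ext:     $ext\n";
print "renamed: $renamed\n";
```

Because `$ext` is set to the empty string when nothing matches, it can be concatenated back without triggering an uninitialized-value warning.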

Author Comment

by:Verbatim
ID: 11963541
So .. I think the reverse would be this:

s/(^|(?<=\)\s))([^(:]*\b),\s(The|A)/$3 $2$1/;

It seems to work, does that look ok?

Author Comment

by:Verbatim
ID: 11963618
better yet (for the reverse)?

s/(^|(?<=\)\s))([^(:]*\b),\s(The|A)/$1$3 $2/;
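
To see whether this reverse really undoes the forward substitution, a round-trip check over the sample names from the question is one option. The sketch below assumes the forward pattern from the accepted solution; the helper names `to_filename` and `from_filename` are invented for illustration.

```perl
use strict;
use warnings;

# Forward: substitution from the accepted solution.
sub to_filename {
    my ($name) = @_;
    $name =~ s/(^|(?<=\))\s*)(The|A)\s+([^(:]*\b)/$1$3, $2/;
    return $name;
}

# Reverse: the second version proposed by the question author.
sub from_filename {
    my ($name) = @_;
    $name =~ s/(^|(?<=\)\s))([^(:]*\b),\s(The|A)/$1$3 $2/;
    return $name;
}

# The nine sample names from the question.
my @names = (
    'Album Name',
    'The Album Name',
    'Album Name (Disc 1)',
    'The Album Name (Disc 1)',
    'Album Name: Get Some',
    'The Album Name: Get Some',
    '(2004) The Album Name (Disc 1)',
    '(2004) The Album Name (Disc 1).ext',
    '(Album Name) Get Some',
);

my $mismatches = 0;
for my $name (@names) {
    my $back = from_filename(to_filename($name));
    $mismatches++ if $back ne $name;
    printf "%-36s -> %-36s -> %s\n", $name, to_filename($name), $back;
}
print "$mismatches mismatch(es)\n";
```

One caveat: the reverse cannot tell a moved article from a title that naturally contains ', A' or ', The'. For example, it would turn 'Me, A Name' into 'A Me Name', so it is safest to apply it only to names that went through the forward substitution.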

Author Comment

by:Verbatim
ID: 11980901
Thanks ozo .. can you have a look at this question also?

http://www.experts-exchange.com/Programming/Programming_Languages/Perl/Q_21119221.html