Tom Knowlton

URGENT: Please help with Regular Expression modification

afilename=Regex.Match(decodedUrl, @"(?<=\w+\:\s+)\w\:\S+(?=\s|\.eml)", RegexOptions.ExplicitCapture|RegexOptions.IgnoreCase).Value;

used to work just fine when the Subject line of the e-mail file was:

"Failure:  C:\12341234123432.doc"


But now the Subject line of the e-mail has changed to this:

"Failure:  \\platinum\FaxCOMGeneratedFaxes\12341234123432.doc"


What I am after is JUST this part:


12341234123432.doc



The rest of the subject line can be discarded.



Thank you,


Tom



tomasX2

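Maybe not what you need, but here is a solution without regex...
    // The @ makes this a verbatim string, so the backslashes are taken literally.
    string url = @"\\platinum\FaxCOMGeneratedFaxes\12341234123432.doc";
    // The result still includes the leading backslash; add 1 to the index to drop it.
    string whatINeed = url.Substring(url.LastIndexOf(@"\"));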
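
If only the bare file name is wanted, the same idea can be sketched with the cut moved one character past the last backslash. The class and method names below are made up for illustration and are not from the original post.

```csharp
using System;

static class FileNameSketch
{
    // Hypothetical helper: returns everything after the last backslash,
    // or the whole input when it contains no backslash at all.
    public static string AfterLastBackslash(string path)
    {
        // LastIndexOf returns -1 when nothing is found, so index + 1 is 0
        // in that case and Substring(0) hands back the full string.
        int index = path.LastIndexOf('\\');
        return path.Substring(index + 1);
    }
}
```

Called with @"\\platinum\FaxCOMGeneratedFaxes\12341234123432.doc", this should give back 12341234123432.doc without the leading backslash.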
Tom Knowlton (ASKER)

Expert tomasX2:

This looks promising.

The problem is I have not had a chance to study Regular Expressions or WebDAV  (YET)   so I am lost whenever modifications are required!!!!!

Thanks...I'll let you know how this works out.

Tom
Expert tomasX2:

Your proposed solution was right on the money.....I just need you to help me tweak it some more.

I found out that the URL line looks like THIS:

http://www.buyersfund.com/exchange/faxcom/FaxMakerFaxes/Success: \\platinum\FaxCOMGeneratedFaxes\040909160209703.doc (Fax sent to 917135326577).EML


Which when I run LAST INDEX OF on the above line I get:


\040909155232735.doc (Fax sent to 918165240579).EML


NOTE:  I was wrong about what I said I needed earlier (sorry about that).

What I need is to parse the line in such a way that I can extract the following from it:

\\platinum\FaxCOMGeneratedFaxes\040909160209703.doc
Here is the regex...

Regex regex = new Regex(
    @"(?<doc>\\\\.*\.doc)\s+(?<eml>.+\.eml)",
    RegexOptions.IgnoreCase
    | RegexOptions.Multiline
    | RegexOptions.IgnorePatternWhitespace
    | RegexOptions.Compiled
    );

then you can run:

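MatchCollection matches = regex.Matches( decodedUrl );
// Read the named group "doc"; matches[0].Value would return the whole match, including the .eml part.
afilename = matches[0].Groups["doc"].Value;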

Give it a try...
Couldn't you use something real simple like:

  afilename = Regex.Match(decodedUrl, @"(?<=[^:]*\s)[^\s]*", RegexOptions.ExplicitCapture|RegexOptions.IgnoreCase).Value;
Actually if you just need the doc you could do...

afilename = Regex.Match(decodedUrl, @"(\\\\.*\.doc)",
RegexOptions.ExplicitCapture|RegexOptions.IgnoreCase).Value;
I would suggest that you use the same regular expression you were using,
but before that do what is shown in the first comment from tomasX2:

      string url = @"http://www.buyersfund.com/exchange/faxcom/FaxMakerFaxes/Success: \\platinum\FaxCOMGeneratedFaxes\040909160209703.doc (Fax sent to 917135326577).EML";
      // Keep everything from the first backslash onward.
      string filePathAndSomeExtraInfo = url.Substring(url.IndexOf(@"\"));
      // Cut at the opening parenthesis and trim the trailing space.
      string whatINeed = filePathAndSomeExtraInfo.Substring(0, filePathAndSomeExtraInfo.IndexOf("(")).Trim();
      System.Diagnostics.Debug.WriteLine(whatINeed);
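
The two-step cut above throws if the input has no backslash or no parenthesis. A minimal sketch of a more forgiving version follows; ExtractUncPath and its class are hypothetical names, and returning an empty string for unusable input is an assumption rather than anything the thread specifies.

```csharp
using System;

static class UncPathSketch
{
    // Hypothetical helper: returns the text from the first backslash up to
    // (but not including) the first "(" that follows it, trimmed.
    // Returns an empty string when the input contains no backslash.
    public static string ExtractUncPath(string decodedUrl)
    {
        int start = decodedUrl.IndexOf('\\');
        if (start < 0)
        {
            return string.Empty;
        }

        // Search for "(" only after the backslash; keep the rest of the
        // string when there is no parenthesis.
        int end = decodedUrl.IndexOf('(', start);
        string slice = end < 0
            ? decodedUrl.Substring(start)
            : decodedUrl.Substring(start, end - start);

        return slice.Trim();
    }
}
```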
Expert tomasX2's alternative to RegEx worked just fine.  At least I can understand what is happening now.

Regular Expression syntax is hard to understand for the uninitiated (me).
I agree... Regular Expressions can be frustrating....

Nonetheless, here is a neat tool that can be used to create and test the expressions.

http://www.codeproject.com/dotnet/Expresso.asp
I went to this link:

http://www.codeproject.com/dotnet/Expresso.asp

Very very cool.  I will have to download and try it out!!!

I can see that Regular Expressions are very powerful and cool.  Sort of a parsing "short hand" of sorts??????

I just find statements like this so intimidating:

@"(?<doc>\\\\.*\.doc)\s+(?<eml>.+\.eml)",


It just looks like a bunch of nonsense to me!!!!!!

I know you have to pick it apart and look at each character sequence one by one...and then it makes sense.....but taken as a whole I don't see how you could possibly understand what the above statement is supposed to do.  :)
@"(?<doc>\\\\.*\.doc)\s+(?<eml>.+\.eml)", looks like gibberish or some kind of math syntax ;-)
... very unintuitive

 but you can do things with regular expressions that are just not possible or extremely difficult with regular string manipulations.
But they are definitely not always needed.

Agreed on all counts!

Thanks again!

Tom
True, regular expressions take a while to get used to... but they are extremely powerful!  And with a little "math syntax" ;-) you can do a great deal with a string.  The Expresso tool is good; it is what I use to test my regexes... It will also give you the C# or VB.NET syntax once you have the regex built the way you want...

Let me break down  @"(?<doc>\\\\.*\.doc)\s+(?<eml>.+\.eml)" ...

the @ before the quote is C# syntax for a verbatim string literal, which says to take the backslashes in the string literally.

the ( ) parens in a regex mean to remember (capture) what was found
the ? right after the ( means that the name of the group will follow in < >
the \ means to escape the next special regex character, which is another \ in this case, so \\\\ matches a literal \\ when the regex is processed.
      also the \. matches a literal .
the . means match any character (except a newline)
the * means match 0 or more times
the + means match 1 or more times
what this leaves is the \s, which is a special way of saying any white space in regex.


So (?<doc>\\\\.*\.doc) will match \\platinum\FaxCOMGeneratedFaxes\040909160209703.doc putting it into a group called doc
     (?<eml>.+\.eml) will match (Fax sent to 917135326577).EML putting it into a group called eml

So while a regex can look like a math equation it can be every bit as powerful for manipulating a string.
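
To see the two named groups in action, here is a small sketch that runs the same pattern and returns both captures. The class name, the method name, and the choice to return an empty array when nothing matches are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexGroupsSketch
{
    // Same pattern as discussed above; IgnoreCase lets ".eml" match ".EML".
    static readonly Regex Pattern = new Regex(
        @"(?<doc>\\\\.*\.doc)\s+(?<eml>.+\.eml)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns { doc, eml } on a match,
    // or an empty array when the input does not match.
    public static string[] SplitDocAndEml(string decodedUrl)
    {
        Match match = Pattern.Match(decodedUrl);
        if (!match.Success)
        {
            return new string[0];
        }

        return new[]
        {
            match.Groups["doc"].Value,
            match.Groups["eml"].Value
        };
    }
}
```

Reading match.Groups["doc"].Value rather than match.Value is what separates the document path from the rest of the line; match.Value on its own would contain both parts.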
The great appeal of RegEx is the brevity, to be sure.

I just have to practice, right?  :)

Getting a good book on Regular Expressions might help a lot also... O'REILLY puts out a good one that I would recommend.
NipNFiar:

Can you give the exact title / author of the book and I will check it out!

I do have the "camel" book on Perl, btw.  Would you still recommend the other book?

I will be working in C# quite a bit this next year....is there a C# specific book on RegEx?
Here is a link to the book:  http://www.oreilly.com/catalog/regex2/index.html

Also, yes, I would still recommend the book... The reason is that this book deals with a number of subtleties in regular expressions, whereas most other books deal only with syntax.

I cut my teeth on regular expressions with the "camel" book while I was developing heavily with Perl.  It is a decent resource...

The two books that I have used the most for .Net development are:

"Applied Microsoft .NET Framework Programming" by Jeffrey Richter and "Inside C#" by Tom Archer and Andrew Whitechapel, both published by Microsoft Press.

Please tell me a little bit about these books, especially "Inside C#".  What is the intended purpose of the "Inside C#" book?
The main thrust of Inside C# is as an Architectural Reference.  Some of the advanced programming topics covered include UnManaged Code, Remoting, Threading, Security, Error Handling and more.

Applied Microsoft .NET Framework Programming covers .NET topics that are part of the core framework and is language agnostic.  Some of the topics that I found useful here include Reflection, Delegates and how the .NET Garbage Collection works.

While I come from more of a Java background and was able to apply 90+ percent of that knowledge to C#, these books have filled in the gaps nicely.
Cool.

I come from the following background:

C++
Delphi
Visual Basic / MS Access / VBA / Office Automation

I started out at the company I now work for programming in MS Access.

We are moving from MS Access to a WEB-based application.  I am doing less and less in MS Access and more and more in C# as time progresses.  The intention is for me to write the middle tier of our WEB-based application, between the Web GUI and SQL Server database.

I have it as my personal goal to learn C# inside and out.....to master the language.  I am just trying to figure out the best way to do that.  Obviously....just by writing applications in the language I can learn it over time.....but my boss has approved ONE hour a day to spend on *just* learning the C# language.....so I am trying to figure out the BEST way to spend that hour.