Solved

java regex question

Posted on 2005-04-26
Last Modified: 2008-02-01
This is more of a regex question, but I couldn't locate a regex forum.

During runtime, the following statement:

startTimePattern = Pattern.compile("(?=\\s+START DATE:)(\\d{1,2}[\\]\\d{1,2}[\\]\\d{4} [(]\\d{1,2}:\\d{1,2} AM|PM[)])");


Throws the following exception:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 71
(?=\s+START DATE:)(\d{1,2}[\]\d{1,2}[\]\d{4} [(]\d{1,2}\d{1,2} AM|PM[)])
                                                                                                             ^


The expression looks correct to me, but I must be missing something.  Does anyone know what is going on here?
Question by:rnicholus
39 Comments
 
LVL 86

Expert Comment

by:CEHJ
ID: 13867077
Try

startTimePattern = Pattern.compile("(?=\\s+START DATE:)(\\d{1,2}[\\\\]\\d{1,2}[\\\\]\\d{4} [(]\\d{1,2}:\\d{1,2} AM|PM[)])");
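The fix comes down to two layers of escaping: the Java compiler first turns "\\" in a string literal into a single backslash, and the regex engine then treats that backslash as an escape of its own. So "[\\]" reaches the regex engine as [\], where the backslash escapes the closing bracket and the class is never closed. A minimal sketch of the difference follows; the class and method names are only illustrative and do not come from the question.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Illustrative helper, not part of the original code.
class EscapeLayersDemo {
    // Returns true if the regex compiles, false if the engine rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Java source "[\\]" is the regex [\] : the backslash escapes ']',
        // so the character class is left open.
        System.out.println(compiles("[\\]"));

        // Java source "[\\\\]" is the regex [\\] : a class holding one literal backslash.
        System.out.println(compiles("[\\\\]"));

        // The string 3\17 (written "3\\17" in source) matches digit, backslash, two digits.
        System.out.println("3\\17".matches("\\d[\\\\]\\d{2}"));
    }
}
```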
0
 

Author Comment

by:rnicholus
ID: 13867253
Ok, your method worked, but I realized that I was asking the wrong question anyways.  I really need to check for "/" in place of the "\".  So, I constructed the following statement:

startTimePattern = Pattern.compile("(?=\\s*START DATE:)(\\d{1,2}[/]\\d{1,2}[/]\\d{4} [(]\\d{1,2}:\\d{1,2} AM|PM[)])");

However, when attempting to match using the above pattern against, say, "START DATE: 3/17/2005 (12:00 AM)" I don't get a match.  I can't figure out why.  Let me step through the logic of my regex statement as I see it...

If the phrase "START DATE:" preceded by 0 or more space characters is found, attempt to find a section of the string that starts out with 1 or 2 digits followed by a "/", followed by 1 or 2 digits, followed by a "/", followed by 4 digits, followed by a "(", followed by 1 or 2 digits, followed by a ":", followed by 1 or 2 digits, followed by a space, followed by either "AM" or "PM", and, finally, followed by a ")".
0
 

Author Comment

by:rnicholus
ID: 13867275
I should add that a space must follow the 4 digits in the middle of my statement.
0

 
LVL 37

Expert Comment

by:zzynx
ID: 13867426
>> I should add that a space must follow the 4 digits in the middle of my statement.
Try:

... \d{4}\s[(]...
0
 

Author Comment

by:rnicholus
ID: 13867514
Arrgh!  Still no dice.  My statement now looks like this:

startTimePattern = Pattern.compile("(?=\\s*START DATE:)(\\d{1,2}[/]\\d{1,2}[/]\\d{4}\\s[(]\\d{1,2}:\\d{1,2}\\sAM|PM[)])");

Shouldn't this work?  Why isn't this matching?
0
 
LVL 37

Expert Comment

by:zzynx
ID: 13867517
This works for me:

        if ( "START DATE: 3/17/2005 (12:00 AM)".matches("(\\s)*+START DATE:(\\s)*+\\d{1,2}[/]\\d{1,2}[/]\\d{4}\\s[(]\\d{1,2}:\\d{1,2}\\s(AM|PM)[)]") )
            System.out.println("matches");
        else
            System.out.println("doesn't match");

It prints out "matches".
0
 
LVL 37

Expert Comment

by:zzynx
ID: 13867526
>> Why isn't this matching?
Because nothing in your pattern allows for it: you put \\d{1,2} directly after "START DATE:", so the pattern does not accept the space before the 3 in "START DATE: 3".
0
 
LVL 37

Expert Comment

by:zzynx
ID: 13867536
Having one digit instead of two does not make the pattern allow a space.
0
 

Author Comment

by:rnicholus
ID: 13867541
A few things...

1.) Why do you have a "+" sign after group 1 and 2?

2.) I need to capture the actual date and time.  It needs to be in a group so I can parse it further.  If I turn the date and time pattern into, in this case, group 3, will this work as well?
0
 

Author Comment

by:rnicholus
ID: 13867551
zzynx:

I'm not sure I understand your explanation.  Could you clarify for me?
0
 
LVL 37

Expert Comment

by:zzynx
ID: 13867558
Explanation of

"(\\s)*+START DATE:(\\s)*+\\d{1,2}[/]\\d{1,2}[/]\\d{4}\\s[(]\\d{1,2}:\\d{1,2}\\s(AM|PM)[)]"

- 0 or more spaces
- the literal "START DATE:"
- 0 or more spaces
- 1 or 2 digits
- /
- 1 or 2 digits
- /
- 4 digits
- one space
- (
- 1 or 2 digits
- :
- 1 or 2 digits
- one space
- "AM" or "PM"
- )
0
 
LVL 37

Expert Comment

by:zzynx
ID: 13867572
>>Could you clarify for me?
If your day would be the 13th of the month, you would probably have "START DATE:13"
If it's 3 you would have "START DATE: 3"
Well to match, you need to specify

- START DATE:
- 0 or more spaces
- 1 or 2 digits
0
 

Author Comment

by:rnicholus
ID: 13867581
Ok, your statement works, but why doesn't mine?  I want to use a lookahead to check for "START DATE:" before continuing to look for a match.  Then if the date/time pattern matches, I want to capture it as a group for further parsing...
0
 

Author Comment

by:rnicholus
ID: 13867586
ohhhhh.... i see now.  Hold on, let me try this...
0
 
LVL 37

Expert Comment

by:zzynx
ID: 13867592
>> 1.) Why do you have a "+" sign after group 1 and 2?
You are right those +'s are not needed. Remove them.

>> 2.) I need to capture the actual date and time
Sorry, I wasn't aware of that
0
 

Author Comment

by:rnicholus
ID: 13867599
Wait a minute, no.  This still doesn't make sense.  I'm not attempting to match anything between the ":" in start date and the first digit.  I'm using "START DATE:" as a positive lookahead.  Do you see what I'm saying?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 13867600
As far as i can see, your pattern should be:


Pattern startTimePattern = Pattern.compile("START DATE: \\d{1,2}/\\d{1,2}/\\d{4} \\(\\d{1,2}:\\d{1,2} (AM|PM)\\)");
0
 

Author Comment

by:rnicholus
ID: 13867630
I still need to capture the date and time as a group so I can parse it further.  I only want 1 group, the matched date and time, and I want to use a positive lookahead to check for the existence of "START DATE:" to save time.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 13867641
Or rather:

Pattern startTimePattern = Pattern.compile("START DATE: \\d{1,2}/\\d{1,2}/\\d{4} \\(\\d{1,2}:\\d{1,2} (?:AM|PM)\\)");

thus leaving open the question of what should be captured, if anything
0
 

Author Comment

by:rnicholus
ID: 13867651
...your example groups AM|PM as group 1, and everything as group 0 (of course) with no positive lookahead...
0
 

Author Comment

by:rnicholus
ID: 13867665
Ok, still, if this matches, I only have the option of returning group 1 (AM|PM) or group 0 (the entire match - which will force me to parse the match the old fashioned way, something I wanted to avoid in the first place).
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 13867668
In that case:

                  Pattern startTimePattern = Pattern.compile("START DATE: (\\d{1,2}/\\d{1,2}/\\d{4} \\(\\d{1,2}:\\d{1,2} (?:AM|PM)\\))");
0
 

Author Comment

by:rnicholus
ID: 13867670
oops, ok, group 1 is noncapturing, I see.  However, I still have only group 0 if it matches, which still requires some further parsing.
0
 

Author Comment

by:rnicholus
ID: 13867687
hehe.  sorry, AM|PM must be part of the same group as the date/time.  I need the am|pm to determine the time of day when I save the time.  So, the full group should only include the date, time, and am or pm.  In your example, am|pm is noncapturing.  I still don't understand why my original statement doesn't match...
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 13867721
My last code captures the following as its one group:

3/17/2005 (12:00 AM)
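
For reference, here is a small sketch of how that single group can be read back with Matcher.group(1). The pattern is the one from the previous comment; the class name, the method name, and the choice to return null when nothing matches are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative wrapper; only the regex itself comes from the thread.
class StartDateCapture {
    // One capturing group around the date, the time and AM/PM.
    // (?:AM|PM) groups the alternation without adding a second capture.
    static final Pattern START_TIME = Pattern.compile(
        "START DATE: (\\d{1,2}/\\d{1,2}/\\d{4} \\(\\d{1,2}:\\d{1,2} (?:AM|PM)\\))");

    // Returns the captured date/time text, or null when the input does not match.
    static String extract(String input) {
        Matcher m = START_TIME.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Group 0 would be the whole match including "START DATE: ";
        // group 1 is only the part inside the outer parentheses.
        System.out.println(extract("START DATE: 3/17/2005 (12:00 AM)"));
    }
}
```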
0
 

Author Comment

by:rnicholus
ID: 13867776
You're right, it does.  I'm fairly new to regex.  I really want to know why my original (the one with the "/"s) did not work.  It seems like it should have.
0
 
LVL 37

Expert Comment

by:zzynx
ID: 13867778
>> I'm fairly new to regex
Then you'll find http://www.regular-expressions.org a great site.
0
 

Author Comment

by:rnicholus
ID: 13867827
Actually, I have read quite a large book on regex in java, and I feel I have a good understanding of it.  Though I have only been working with regex for a little while now, and it's possible I missed something in my original statement.  However,  as I have said before, my original statement seems perfectly valid, just as I have explained it in my first post (after my question).  Can anyone tell me why it didn't?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 13867829
//  I really want to know why my original (the one with the "/"s) did not work

Well we've been into that partly have we not? There only needs to be one error...
Decide which one you mean and then perhaps we can discuss it

Of course this approach also works:

String s = "START DATE: 3/17/2005 (12:00 AM)";
DateFormat df = new SimpleDateFormat("'START DATE: 'M/dd/yyyy '('h:mm a')'");
Date d = df.parse(s);
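
A self-contained version of the same idea might look like the sketch below. The imports and the ParseException handling are what the three lines above need in order to compile; the class name, the explicit Locale.US (so that "AM"/"PM" parse regardless of the machine's default locale) and returning null on failure are assumptions added for the example.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Illustrative wrapper around the SimpleDateFormat approach.
class StartDateParse {
    // Returns the parsed Date, or null if the text does not fit the format.
    static Date parseStartDate(String s) {
        // Text inside single quotes is literal; M, dd, yyyy, h, mm and a are date fields.
        // Locale.US is an assumption, chosen so the AM/PM marker parses predictably.
        DateFormat df = new SimpleDateFormat("'START DATE: 'M/dd/yyyy '('h:mm a')'", Locale.US);
        try {
            return df.parse(s);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Date d = parseStartDate("START DATE: 3/17/2005 (12:00 AM)");
        // Print in a fixed 24-hour format to make the AM/PM handling visible.
        System.out.println(new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.US).format(d));
    }
}
```

This skips the regex entirely, which may or may not suit the case where "START DATE:" appears in the middle of a longer string.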
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 13867902
I mean, your very first one (allowing for the wrong direction of the slash) is already not a match at this point:

>>"(?=\\s+

since the String does not begin with whitespace
0
 

Author Comment

by:rnicholus
ID: 13868779
Let us call this my original pattern:

startTimePattern = Pattern.compile("(?=\\s*START DATE:)(\\d{1,2}[/]\\d{1,2}[/]\\d{4}[(]\\d{1,2}:\\d{1,2}\\s*AM|PM[)])");

The String may not begin with a whitespace, but that is covered by the "*".  
0
 
LVL 86

Accepted Solution

by:
CEHJ earned 500 total points
ID: 13868946
There are really 3 errors

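a. You don't want to be using a lookaround at the start, since you need to match the second part of the string. A lookahead consumes no characters, so the group that follows is tried at the same position as "START DATE:" and never reaches the digits. See
http://www.regular-expressions.info/lookaround.html
b. There is a space missing after the four consecutive digits.
c. AM and PM are not grouped correctly for the OR (see mine). Ungrouped, the | splits the whole group into "everything up to AM" or "PM)".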
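
To make the three points concrete, here is a rough side-by-side sketch. ORIGINAL is the pattern the asker called the original; FIXED is one possible repair that keeps the optional whitespace the asker wanted, so it is a variation on the patterns posted earlier rather than a quote of them. The class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative comparison of the failing pattern and one repaired version.
class ThreeFixesDemo {
    static final String INPUT = "START DATE: 3/17/2005 (12:00 AM)";

    // The asker's pattern: lookahead at the start, no space after the year,
    // and an ungrouped AM|PM alternation.
    static final Pattern ORIGINAL = Pattern.compile(
        "(?=\\s*START DATE:)(\\d{1,2}[/]\\d{1,2}[/]\\d{4}[(]\\d{1,2}:\\d{1,2}\\s*AM|PM[)])");

    // a. the prefix is matched (consumed) instead of looked ahead at
    // b. a space is required between the year and "("
    // c. AM|PM sits in its own non-capturing group
    static final Pattern FIXED = Pattern.compile(
        "\\s*START DATE:\\s*(\\d{1,2}/\\d{1,2}/\\d{4} \\(\\d{1,2}:\\d{1,2} (?:AM|PM)\\))");

    // True if the pattern matches anywhere in the input.
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(found(ORIGINAL, INPUT));
        System.out.println(found(FIXED, INPUT));
        Matcher m = FIXED.matcher(INPUT);
        if (m.find()) {
            System.out.println(m.group(1)); // only the date/time part, without the prefix
        }
    }
}
```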
0
 

Author Comment

by:rnicholus
ID: 13869415
a.) I figured a lookaround was necessary since I don't even want to bother matching the string if it doesn't start with "START DATE:".  Isn't this correct?

b.) You're right about this.  My mistake.

c.) You have grouped AM/PM as (?:AM|PM).  This is a noncapturing group within the capturing group.  What is the effect of this and why is it important?
0
 

Author Comment

by:rnicholus
ID: 13869427
I should also mention that I don't care about returning the "START DATE:" portion of the string.  I just want to make sure it's there.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 13869464
a) The regex engine will give up as soon as 'START DATE' is not matched anyway
c) >>This is a noncapturing group within the capturing group.

Yes. It has to be bracketed AFAIK for it to work at all, whether capturing or not

>>I don't care about returning the "START DATE:"

Yes, so then you don't capture it
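
On point c), the reason the brackets matter is that | has the lowest precedence in a regex, so it splits everything around it within the enclosing group. A small sketch with a shortened, made-up pattern (a digit, a space, then AM or PM) shows the effect; the class and method names are illustrative only.

```java
// Illustrative only: a cut-down pattern to show how far the | operator reaches.
class AlternationScopeDemo {
    // Ungrouped: means "a digit, a space and AM" OR just "PM".
    static final String UNGROUPED = "\\d AM|PM";
    // Grouped: means "a digit, a space, then AM or PM".
    static final String GROUPED = "\\d (?:AM|PM)";

    // String.matches requires the whole input to match the regex.
    static boolean wholeMatch(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("5 PM", UNGROUPED)); // neither alternative covers "5 PM"
        System.out.println(wholeMatch("PM", UNGROUPED));   // the bare "PM" alternative matches
        System.out.println(wholeMatch("5 PM", GROUPED));
        System.out.println(wholeMatch("PM", GROUPED));
    }
}
```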
0
 

Author Comment

by:rnicholus
ID: 13869631
one more question and then I'm set...

>> a) The regex engine will give up as soon as 'START DATE' is not matched anyway

Ok, so this is true, which means that a lookaround is not needed.  However, is there a reason why I couldn't or shouldn't use a lookaround in this case?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 13869792
The point of a lookaround is really not to consume characters, but you need it to, as you're interested in the end of the string
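
A quick way to see the "does not consume" part is to measure how many characters a match covers. The sketch below is only an illustration; the helper name and the -1 return value for "no match" are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper showing that a lookahead matches a position, not text.
class LookaheadDemo {
    // Length of the first match of regex in input, or -1 if there is none.
    static int consumedLength(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.end() - m.start() : -1;
    }

    public static void main(String[] args) {
        String input = "START DATE: 3/17/2005 (12:00 AM)";

        // The lookahead succeeds at position 0 but covers no characters.
        System.out.println(consumedLength("(?=START DATE:)", input));

        // After the lookahead the engine is still at 'S', so a digit cannot match there.
        System.out.println(consumedLength("(?=START DATE:)\\d", input));

        // Matching the prefix normally moves past it.
        System.out.println(consumedLength("START DATE:", input));
    }
}
```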
0
 

Author Comment

by:rnicholus
ID: 13869820
Thanks!
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 13869835
:-)
0
