
Regular Expression on string with braces in Java

Posted on 2009-05-12
Last Modified: 2013-12-17
Hi guys,

I hope you can help me with this. I need a regular expression that splits the string below

{adfsadfd}{sdads}{asdf{asdf}{asdf}}{asdfsdf{dads}}

into

{adfsadfd}
{sdads}
{asdf{asdf}{asdf}}
{asdfsdf{dads}}



Currently I have "\\{([.*]*)[^}]*}", but it doesn't work. It also picks up the groups nested inside, which I don't want.


Thanks in advance
Question by:xanbi
3 Comments
 

Accepted Solution

by:ozo (earned 1500 total points)
ID: 24371754
([{](?:[^{]|[{][^{}]*})*})
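
For anyone wanting to try the pattern above from Java, here is a minimal sketch of one way it might be applied with java.util.regex. The class name BraceGroups and the helper split are made up for illustration and are not part of the original answer. Note that the pattern only allows for one level of nesting inside each top-level group, which is all the sample string needs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class BraceGroups {
    // The accepted pattern, copied verbatim into a Java string literal.
    private static final Pattern GROUP =
            Pattern.compile("([{](?:[^{]|[{][^{}]*})*})");

    // Collects each successive match of the pattern, scanning left to right.
    static List<String> split(String input) {
        List<String> groups = new ArrayList<String>();
        Matcher m = GROUP.matcher(input);
        while (m.find()) {
            groups.add(m.group(1));
        }
        return groups;
    }

    public static void main(String[] args) {
        // Prints the four top-level groups from the question, one per line.
        for (String g : split("{adfsadfd}{sdads}{asdf{asdf}{asdf}}{asdfsdf{dads}}")) {
            System.out.println(g);
        }
    }
}
```

Because [^{] also matches a closing brace, the pattern relies on backtracking to stop at the right place, so it is best suited to input that is already known to be well formed.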

Expert Comment

by:mrjoltcola
ID: 24375825
I would suggest that regular expressions are usually not adequate for recursive grammars. You can usually write a "good enough" solution, but it will not be bulletproof for the whole grammar. Regular expressions are a non-recursive approach, and there is a limit to how far lookahead and lookbehind can take you.
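
To make that limit concrete, here is a small sketch that runs a pattern written for one level of nesting (the one from the accepted solution) against input nested two levels deep. The class name NestingLimit, the helper matches, and the test string are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named here only for illustration.
class NestingLimit {
    // A pattern that spells out exactly one level of nested braces.
    private static final Pattern ONE_LEVEL =
            Pattern.compile("([{](?:[^{]|[{][^{}]*})*})");

    // Returns every match of the pattern in the input, left to right.
    static List<String> matches(String input) {
        List<String> found = new ArrayList<String>();
        Matcher m = ONE_LEVEL.matcher(input);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // One level of nesting: the whole group comes back intact.
        System.out.println(matches("{a{b}}{d}"));
        // Two levels of nesting: the outermost group {a{b{c}}} is not
        // returned as a single match, because the pattern has no way to
        // describe a brace inside a brace inside a brace.
        System.out.println(matches("{a{b{c}}}{d}"));
    }
}
```

Each extra level of nesting would need another hand-written layer in the pattern, which is why a fixed regular expression cannot cover arbitrary depth.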

Recursive descent parsers can be written for special cases like this one. I have provided a sample that, as you asked, does not split out the nested {} groups. In a real parser we would usually return tree nodes and keep parsing until there is no nesting left, but since you want the nesting preserved below the top level, here is a sample that works. Maybe you can use it.

// Simple recursive descent parser for matching nested tokens
public class Main {
    public static void main(String[] args) {
        String s = "{adfsadfd}{sdads}{asdf{asdf}{asdf}}{asdfsdf{dads}}";
        String token;
        int beginIndex = 0;
        // Peel off one top-level {...} group at a time until nothing matches
        while ((token = match(s.substring(beginIndex))) != null) {
            System.out.println(token);
            beginIndex += token.length();
        }
    }

    // Returns the balanced {...} group at the start of s, or null if s does
    // not start with '{' or the group is never closed
    public static String match(String s) {
        if (s == null || s.length() == 0 || s.charAt(0) != '{') // not a valid start pattern
            return null;
        StringBuilder token = new StringBuilder("{");
        int i = 1;
        while (i < s.length()) {
            char ch = s.charAt(i);
            if (ch == '}') { // end of the current {} group
                return token.append(ch).toString();
            } else if (ch != '{') { // non-brace character, add to the token
                token.append(ch);
                i++;
            } else { // nested {, so recursively match and concat sub-strings
                String subtoken = match(s.substring(i));
                if (subtoken == null)
                    return null; // nested group is never closed
                token.append(subtoken);
                i += subtoken.length();
            }
        }
        return null; // ran out of input before the closing }
    }
}
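
For comparison, here is a rough sketch of the "tree nodes" variant mentioned above, where each {...} group becomes a node holding its own text and its nested groups. The names BraceTree and Node are invented for this sketch, and as a simplification the text of a node is simply concatenated, so the position of text relative to child groups is not recorded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree-building parser, named here only for illustration.
class BraceTree {
    // One {...} group: the characters directly inside it plus its nested groups.
    static final class Node {
        final StringBuilder text = new StringBuilder();
        final List<Node> children = new ArrayList<Node>();
    }

    private final String input;
    private int pos = 0;

    private BraceTree(String input) {
        this.input = input;
    }

    // Parses a sequence of top-level groups; throws on unbalanced input.
    static List<Node> parse(String input) {
        BraceTree parser = new BraceTree(input);
        List<Node> roots = new ArrayList<Node>();
        while (parser.pos < input.length()) {
            roots.add(parser.group());
        }
        return roots;
    }

    // group := '{' (plain character | group)* '}'
    private Node group() {
        expect('{');
        Node node = new Node();
        while (pos < input.length() && input.charAt(pos) != '}') {
            if (input.charAt(pos) == '{') {
                node.children.add(group()); // descend into the nested group
            } else {
                node.text.append(input.charAt(pos++));
            }
        }
        expect('}');
        return node;
    }

    private void expect(char c) {
        if (pos >= input.length() || input.charAt(pos) != c) {
            throw new IllegalArgumentException("expected '" + c + "' at index " + pos);
        }
        pos++;
    }

    public static void main(String[] args) {
        List<Node> roots = parse("{adfsadfd}{sdads}{asdf{asdf}{asdf}}{asdfsdf{dads}}");
        for (Node n : roots) {
            System.out.println(n.text + " has " + n.children.size() + " nested group(s)");
        }
    }
}
```

Unlike the string-returning sample, this version rejects unbalanced input with an exception instead of returning null, which is a design choice of the sketch rather than a requirement.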


Author Closing Comment

by:xanbi
ID: 31580846
Thank you very much. I accepted ozo's solution because I want to try a regular expression before resorting to a manual scan through the string.

Regards