Solved

Regular Expression on string with braces in Java

Posted on 2009-05-12
Last Modified: 2013-12-17
Hi guys,

I hope you can help me with this. I need a regular expression that splits the string below

{adfsadfd}{sdads}{asdf{asdf}{asdf}}{asdfsdf{dads}}

into

{adfsadfd}
{sdads}
{asdf{asdf}{asdf}}
{asdfsdf{dads}}



Currently I have "\\{([.*]*)[^}]*}", but it doesn't work. It also includes the inner groups, which I don't want.


Thanks in advance
Question by:xanbi

Accepted Solution

by:
ozo earned 500 total points
ID: 24371754
([{](?:[^{]|[{][^{}]*})*})
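
A minimal sketch of how this pattern could be applied in Java, assuming `java.util.regex`. The class name `BraceSplitter` and the method `split` are hypothetical and not part of the original answer. `Matcher.find()` walks the input and collects each successive match. As written, the pattern describes groups nested at most one level deep, which is enough for the sample string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original answer
final class BraceSplitter {
    // The accepted pattern, copied verbatim into a Java string literal.
    // The braces outside character classes are closing braces only,
    // which java.util.regex treats as literals.
    private static final Pattern GROUP =
            Pattern.compile("([{](?:[^{]|[{][^{}]*})*})");

    // Collects every successive match of the pattern, left to right
    static List<String> split(String input) {
        List<String> groups = new ArrayList<>();
        Matcher m = GROUP.matcher(input);
        while (m.find()) {
            groups.add(m.group(1));
        }
        return groups;
    }

    public static void main(String[] args) {
        String s = "{adfsadfd}{sdads}{asdf{asdf}{asdf}}{asdfsdf{dads}}";
        for (String g : split(s)) {
            System.out.println(g);
        }
    }
}
```

For the sample string this prints the four top-level groups, one per line. Input with deeper nesting is outside what the pattern describes, so it should be checked separately before relying on it.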

Expert Comment

by:mrjoltcola
ID: 24375825
I would suggest that regular expressions are usually not adequate for recursive grammars. You can usually write a "good enough" solution, but it will not be bulletproof for the whole grammar. Regular expressions are a non-recursive approach, and there is a limit to what lookahead and lookbehind can do within one.

Recursive descent parsers can be written for special cases, and I have provided a sample that does not split out the nested {}, as you asked. In a real parser we would usually return tree nodes and parse until there is no nesting left, but since you want the nesting below the top level to be preserved, here is a sample that works. Maybe you can use it.

// Simple recursive descent parser for matching nested tokens
public class Main {
    public static void main(String[] args) {
        String s = "{adfsadfd}{sdads}{asdf{asdf}{asdf}}{asdfsdf{dads}}";
        String token;
        int beginIndex = 0;
        // Peel off one top-level {...} group at a time until none is left
        while ((token = match(s.substring(beginIndex))) != null) {
            System.out.println(token);
            beginIndex += token.length();
        }
    }

    // Returns the balanced {...} group at the start of s, or null if s
    // does not start with '{' or the group is never closed
    public static String match(String s) {
        if (s == null || s.isEmpty() || s.charAt(0) != '{') // not a valid start pattern
            return null;
        StringBuilder token = new StringBuilder("{");
        int i = 1;
        while (i < s.length()) {
            char ch = s.charAt(i);
            if (ch == '}') { // end of this {} group
                return token.append(ch).toString();
            } else if (ch != '{') { // non-brace character, add to the token
                token.append(ch);
                i++;
            } else { // nested {, so recursively match and append the sub-token
                String subtoken = match(s.substring(i));
                if (subtoken == null)
                    return null; // nested group was never closed
                token.append(subtoken);
                i += subtoken.length();
            }
        }
        return null; // ran out of input before the closing }
    }
}
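
For this particular task, recursion is not strictly required: a single left-to-right scan with a depth counter also finds the top-level groups at any nesting depth. The sketch below is an alternative to the parser above, not the method from the comment itself. The class name `DepthScanner` and the method `topLevelGroups` are hypothetical, and it assumes that text outside any braces, stray closing braces, and unclosed groups should simply be skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical alternative: depth counting instead of recursion
final class DepthScanner {
    static List<String> topLevelGroups(String s) {
        List<String> groups = new ArrayList<>();
        int depth = 0;   // how many '{' are currently open
        int start = -1;  // index of the '{' that opened the current top-level group
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == '{') {
                if (depth == 0) {
                    start = i; // a new top-level group begins here
                }
                depth++;
            } else if (ch == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    // the top-level group just closed; keep it whole
                    groups.add(s.substring(start, i + 1));
                }
            }
            // assumption: a '}' at depth 0 is stray and is ignored
        }
        return groups; // assumption: an unclosed trailing group is dropped
    }

    public static void main(String[] args) {
        String s = "{adfsadfd}{sdads}{asdf{asdf}{asdf}}{asdfsdf{dads}}";
        for (String g : topLevelGroups(s)) {
            System.out.println(g);
        }
    }
}
```

Because it only tracks a counter, this version avoids the repeated `substring` calls of the recursive sample, at the cost of not building any structure for the nested groups.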


Author Closing Comment

by:xanbi
ID: 31580846
Thank you very much. I accepted ozo's solution because I should use a regular expression in preference to scanning through the string.

Regards
