• Status: Solved

Regular expression to split a string

I have a string in this format:
KEYWORD [int][int][name&surname&int&cell_number][String]

I want to strip out the keyword (leaving me with: '[int][int][name&surname&int&cell_number][String]' ) and then split the rest up into:
int
int
name&surname&int&cell_number
String

I thought something like:
String[] msg = message.replaceFirst("CT_FEEDBACK ", "").split( "\\[.*?\\]" );  
would work, but it doesn't...

Any ideas?
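
A minimal sketch of why this attempt comes back empty, assuming a sample message whose field values are made up for illustration. The pattern `\[.*?\]` matches each bracketed group in full, so the groups themselves become the delimiters and nothing is left between them:

```java
class SplitOnWholeGroups {
    // Applies the pattern from the question, which matches each "[...]" group in full.
    static String[] splitOnGroups(String message) {
        return message.replaceFirst("CT_FEEDBACK ", "").split("\\[.*?\\]");
    }

    public static void main(String[] args) {
        // Hypothetical sample values, not taken from the thread.
        String message = "CT_FEEDBACK [1][2][name&surname&3&0821234567][hello]";
        String[] msg = splitOnGroups(message);
        // Every group is consumed as a delimiter, so only empty strings remain,
        // and split() discards trailing empty strings.
        System.out.println(msg.length); // prints 0
    }
}
```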
Asked by: riaancornelius
 
CEHJ Commented:
split( "\\[|\\]" );  
 
riaancornelius (Author) Commented:
OK, why exactly? What is |
 
objects Commented:
try:

String[] tokens = s.substring(s.indexOf("[")).split("\\]{0,1}\\[");
 
objects Commented:
Whoops, should be:

String[] tokens = s.substring(s.indexOf("[")).split("\\]{0,1}\\[{0,1}");

 
CEHJ Commented:
>> OK, why exactly? What is |

Or. The pattern "\\[|\\]" matches either a [ or a ].
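
A small sketch of the alternation in action, assuming the keyword has already been stripped from the string. Because each bracket is a delimiter on its own, an empty string appears before the first [ and between every ] and [:

```java
import java.util.Arrays;

class AlternationSplit {
    // Splits on either a single '[' or a single ']'.
    static String[] splitOnBrackets(String s) {
        return s.split("\\[|\\]");
    }

    public static void main(String[] args) {
        String rest = "[int][int][name&surname&int&cell_number][String]";
        // Prints: [, int, , int, , name&surname&int&cell_number, , String]
        System.out.println(Arrays.toString(splitOnBrackets(rest)));
    }
}
```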
 
riaancornelius (Author) Commented:
>> String[] tokens = s.substring(s.indexOf("[")).split("\\]{0,1}\\[{0,1}");
Explain that please?
 
objects Commented:
Starting from the first [,
split on zero or one ] followed by zero or one [.
 
riaancornelius (Author) Commented:
I see CEHJ. Only problem is that it puts 4 blank elements as well.
 
girionis Commented:
String s = "KEYWORD [int][int][name&surname&int&cell_number][String]";
String[] msg = s.substring(s.indexOf("[")).split( "\\[|\\]" );
 
riaancornelius (Author) Commented:
objects, that just splits it at every single character and leaves an empty element everywhere there was a [ or ][.
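
A sketch that reproduces this, assuming the same sample string used elsewhere in the thread. Both quantifiers in `\]{0,1}\[{0,1}` allow zero occurrences, so the pattern also matches the empty string between any two characters:

```java
class EmptyMatchSplit {
    static String[] splitOptional(String s) {
        // "\\]{0,1}\\[{0,1}" can match "][", "]", "[" or nothing at all,
        // so a split point exists at every position in the input.
        return s.substring(s.indexOf("[")).split("\\]{0,1}\\[{0,1}");
    }

    public static void main(String[] args) {
        String s = "KEYWORD [int][int][name&surname&int&cell_number][String]";
        for (String token : splitOptional(s)) {
            // Each token is either empty or exactly one character long.
            System.out.println("'" + token + "'");
        }
    }
}
```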
 
CEHJ Commented:
>>I see CEHJ. Only problem is that it puts 4 blank elements as well.

Yes. That's not easy to avoid. Just ignore elements in the array that are blank
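
One way to ignore the blank elements, sketched under the assumption that Java 8 streams are available and that the string contains at least one [. The class and method names are hypothetical:

```java
import java.util.Arrays;

class DropBlankTokens {
    static String[] nonBlankTokens(String s) {
        // Split on every bracket, then keep only the non-empty pieces.
        return Arrays.stream(s.substring(s.indexOf("[")).split("\\[|\\]"))
                .filter(token -> !token.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String s = "KEYWORD [int][int][name&surname&int&cell_number][String]";
        // Prints: [int, int, name&surname&int&cell_number, String]
        System.out.println(Arrays.toString(nonBlankTokens(s)));
    }
}
```

Note that this also drops a field that is genuinely empty, such as [].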
 
riaancornelius (Author) Commented:
girionis: Isn't that the same as CEHJ's first post?
 
girionis Commented:
> girionis: Isn't that the same as CEHJ's first post?

Not exactly, I am getting rid of the "KEYWORD " as well.
 
riaancornelius (Author) Commented:
>> Yes. That's not easy to avoid. Just ignore elements in the array that are blank
Was just wondering whether there was an easy solution (in the regex) to do that.

Thanks CEHJ
 
objects Commented:
It will too :) Try:

String[] msg = s.substring(s.indexOf("[")+1, s.length()-1).split( "\\]\\[" );
 
objects Commented:
> Was just wondering whether there was an easy solution (in the regex) to do that.

try my last post
 
riaancornelius (Author) Commented:
>> Not exactly, I am getting rid of the "KEYWORD " as well.
My original solution does get rid of it as well.

As a matter of interest, does anybody know which is more efficient: substring() or replaceFirst()?
 
objects Commented:
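substring(), I'd reckon: replaceFirst() has to compile and run a regular expression, while substring() only needs the indices.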
 
girionis Commented:
> >> Not exactly, I am getting rid of the "KEYWORD " as well.
> My original solution does get rid of it as well.

Yes you are right, but my comment will replace *any* keyword, not just the "CT_FEEDBACK"
 
CEHJ Commented:
>> Was just wondering whether there was an easy solution (in the regex) to do that.

Can't think of one without substringing at the moment
 
riaancornelius (Author) Commented:
Good plan, objects.

String[] msg = s.substring(s.indexOf("[")+1, s.length()-1).split( "\\]\\[" );

This works perfectly.
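
For reference, a runnable sketch of this approach. It assumes the string contains at least one [ and ends with ]; the class and method names are hypothetical:

```java
import java.util.Arrays;

class BracketFields {
    static String[] fields(String s) {
        // Drop everything up to and including the first '[' and the final ']',
        // then split on the "][" pairs that separate the fields.
        return s.substring(s.indexOf("[") + 1, s.length() - 1).split("\\]\\[");
    }

    public static void main(String[] args) {
        String s = "KEYWORD [int][int][name&surname&int&cell_number][String]";
        // Prints: [int, int, name&surname&int&cell_number, String]
        System.out.println(Arrays.toString(fields(s)));
    }
}
```

Since the outer brackets are removed before splitting, no blank elements are produced and any keyword is skipped, not just one fixed value.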
 
objects Commented:
excellent :)
 
riaancornelius (Author) Commented:
>> Yes you are right, but my comment will replace *any* keyword, not just the "CT_FEEDBACK"
This is true. I didn't pick that up, because in this class the keyword will always be "CT_FEEDBACK".

Interesting thought here though. If it's such a simple solution, it should actually be more efficient to use:
StringTokenizer st = new StringTokenizer( message.substring(message.indexOf("[")+1, message.length()-1), "][" );

StringTokenizer does not go through the regex engine that String.split() uses for this pattern, so it should be faster.
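
A sketch of the tokenizer variant, with a hypothetical helper that collects the tokens into a list. No timing is measured here, so the speed difference remains an assumption. The JDK documentation describes StringTokenizer as a legacy class and recommends split() or java.util.regex for new code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerFields {
    static List<String> fields(String message) {
        String inner = message.substring(message.indexOf("[") + 1, message.length() - 1);
        // Each character of "][" is a delimiter in its own right,
        // and runs of adjacent delimiters never yield empty tokens.
        StringTokenizer st = new StringTokenizer(inner, "][");
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String message = "CT_FEEDBACK [int][int][name&surname&int&cell_number][String]";
        // Prints: [int, int, name&surname&int&cell_number, String]
        System.out.println(fields(message));
    }
}
```

One behavioural difference: an empty field such as [] is silently skipped by the tokenizer, whereas split("\\]\\[") keeps it as an empty string unless it is the last field.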
 
girionis Commented:
That would work too. As you see there are several solutions to a problem :)
 
riaancornelius (Author) Commented:
I was just making it too complex...
 
CEHJ Commented:
>>Was just wondering whether there was an easy solution (in the regex) to do that.

You can do it like this actually:

String[] tokens = s.replaceFirst("KEYWORD\\s*\\[", "").split("\\[|\\]\\[|\\]");
 
CEHJ Commented:
This is more efficient, since replaceFirst() has already removed the leading [:

String[] tokens = s.replaceFirst("KEYWORD\\s*\\[", "").split("\\]\\[|\\]");
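
A runnable sketch of this regex-only version, assuming the literal keyword "KEYWORD" as in the line above. The class and method names are hypothetical:

```java
import java.util.Arrays;

class KeywordStripSplit {
    static String[] fields(String s) {
        // replaceFirst() removes the keyword, any whitespace and the first '[';
        // split() then breaks on "][" between fields and on the final ']'.
        return s.replaceFirst("KEYWORD\\s*\\[", "").split("\\]\\[|\\]");
    }

    public static void main(String[] args) {
        String s = "KEYWORD [int][int][name&surname&int&cell_number][String]";
        // Prints: [int, int, name&surname&int&cell_number, String]
        System.out.println(Arrays.toString(fields(s)));
    }
}
```

The final ] leaves an empty string at the end of the input, which split() discards because trailing empty strings are removed by default.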