• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 311

String matching

Hi,
   This is a snippet of code from an application that receives a small amount of text from a UDP datagram. I can print the contents of the String and it does contain what I think it should. However, I can't seem to match it against another string. For example, when it receives 'die' I want it to quit. Although it prints 'die' to the screen, it doesn't quit. Here's the code snippet:



byte[] buffer = new byte[1024];
packet = new DatagramPacket(buffer, buffer.length);
ds.receive(packet);

String message = new String(packet.getData());
System.out.println(message);
if (message.equalsIgnoreCase("die")) {
    System.out.println("Is this being called?");
    System.exit(0);
}



Any suggestions or am I doing something stupidly obviously wrong? :)


0
petepalmer
Asked:
petepalmer
2 Solutions
 
CEHJCommented:
Try

String message = new String(packet.getData()).trim();
0
 
petepalmerAuthor Commented:
I was just messing around with it and the length of the data passed to the String is 1024 bytes, regardless of the length of the data received. I imagine that would screw things up, and I imagine "trim()" will solve that :)
0
 
objectsCommented:
String message = new String(packet.getData(), packet.getOffset(), packet.getLength());
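A minimal sketch of the difference, assuming an ASCII payload. To keep it runnable without a network, simulateReceive is a hypothetical stand-in for ds.receive(packet): it copies the payload into the packet's buffer and sets the packet length, which is what a real receive does for the received message. The class and method names are illustrative only and are not from the original post.

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;

public class PacketDecodeDemo {

    // Decode only the bytes the packet reports as received.
    static String decode(DatagramPacket packet) {
        return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                StandardCharsets.US_ASCII);
    }

    // Hypothetical stand-in for ds.receive(packet): fills the start of the
    // buffer and sets the packet length to the payload size.
    static void simulateReceive(DatagramPacket packet, String payload) {
        byte[] bytes = payload.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(bytes, 0, packet.getData(), packet.getOffset(), bytes.length);
        packet.setLength(bytes.length);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[1024];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        simulateReceive(packet, "die");

        // Whole buffer: "die" followed by 1021 zero bytes.
        String whole = new String(packet.getData(), StandardCharsets.US_ASCII);
        // Received part only.
        String received = decode(packet);

        System.out.println(whole.length());                    // 1024
        System.out.println(received.length());                 // 3
        System.out.println(whole.equalsIgnoreCase("die"));     // false
        System.out.println(received.equalsIgnoreCase("die"));  // true
    }
}
```

Passing an explicit charset is an extra choice made here so the result does not depend on the platform default; the one-liner above works the same way with the default charset.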
0
 
petepalmerAuthor Commented:
Yup it did. Points are yours :)
0
 
objectsCommented:
> I imagine that would screw things up - and I imagine "trim()" will solve that :)

Not recommended; use the code I posted above instead.
0
 
objectsCommented:
> Yup it did. Points are yours :)

It's not guaranteed to work.
0
 
petepalmerAuthor Commented:
Sorry objects - didn't see your reply till I'd already assigned the points......
0
 
girionisCommented:
You can also use indexOf("die"), although this is a bit risky since the user might actually send a sentence with the word "die" in it.
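A small sketch of that risk, using a made-up message for illustration. The class and method names are hypothetical.

```java
public class ContainsVsEqualsDemo {

    // Substring test: true whenever "die" appears anywhere in the message.
    static boolean containsDie(String message) {
        return message.indexOf("die") >= 0;
    }

    // Exact test: true only when the whole message is "die" (ignoring case).
    static boolean isDie(String message) {
        return message.equalsIgnoreCase("die");
    }

    public static void main(String[] args) {
        String chat = "the soldier said hello"; // illustrative message only
        System.out.println(containsDie(chat)); // true: "soldier" contains "die"
        System.out.println(isDie(chat));       // false
        System.out.println(isDie("die"));      // true
    }
}
```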
0
 
petepalmerAuthor Commented:
Which one isn't guaranteed to work? :S
0
 
objectsCommented:
> Which one isn't guaranteed to work?

the one you accepted
0
 
CEHJCommented:
>>It's not guaranteed to work.

Please explain why
0
 
petepalmerAuthor Commented:
I'm now using  :


String message = new String(packet.getData(),0,packet.getLength());


Seems to work okay.....
0
 
objectsCommented:
Using trim() would introduce a nasty bug to track down in your app.
I'll get the question reopened so you can accept the correct comment.


0
 
CEHJCommented:
>>using trim() would introduce a nasty bug to track down in your app.

That's still not an adequate explanation
0
 
girionisCommented:
> using trim() would introduce a nasty bug to track down in your app.

Can you elaborate on that?
0
 
objectsCommented:
I'm a little confused myself, actually, as to why CEHJ thinks that the message has been padded with whitespace. Under that circumstance you'd use trim(), but that's not happening here.
The fact that it works sometimes is only a result of the unused portion of the buffer containing whitespace, and not something you should or even need to code for.

0
 
CEHJCommented:
>>I'm a little confused myself, actually, as to why CEHJ thinks that the message has been padded with whitespace

Quite simply because petepalmer was seeing that the String he expected was not what it seemed. It seemed a good bet.

>>but that's not happening here.

That seems to be exactly what's happening
0
 
petepalmerAuthor Commented:
Perhaps when you create a string from a byte array, the empty bytes are made into whitespace?

Regardless the trim() did actually work so that must have been happening :)
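One way to check that guess, as a sketch rather than a statement about the original program: a freshly allocated Java byte array is zero-filled, and with an ASCII decode each zero byte becomes the character U+0000. That character is not whitespace according to Character.isWhitespace, but String.trim() removes every leading and trailing character whose code is U+0020 or lower, so it gets stripped anyway. The 16-byte buffer and the class name are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;

public class TrimNulDemo {

    // Decode every byte of the buffer, used or not.
    static String decodeWhole(byte[] buffer) {
        return new String(buffer, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[16]; // new arrays are zero-filled
        byte[] payload = "die".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(payload, 0, buffer, 0, payload.length);

        String whole = decodeWhole(buffer);
        System.out.println((int) whole.charAt(3));                   // 0
        System.out.println(Character.isWhitespace(whole.charAt(3))); // false
        System.out.println(whole.equals("die"));                     // false
        System.out.println(whole.trim().equals("die"));              // true
    }
}
```

So trim() works here only because the leftover bytes happen to be zero, not because they were converted to spaces.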
0
 
objectsCommented:
> the empty bytes are made into whitespace?

Not necessarily, that's the danger. It may work on some occasions and not on others, often making it hard to track down the problem.
It's far safer to explicitly use the valid part of the buffer when creating your string.
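A sketch of one way this can go wrong, assuming the same buffer is reused for several packets (the original snippet does not show whether it is). simulateReceive is a hypothetical stand-in for ds.receive(packet) so the example runs without a network; the messages are made up.

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;

public class StaleBufferDemo {

    // Hypothetical stand-in for ds.receive(packet): overwrites the start of
    // the buffer and sets the packet length, leaving later bytes untouched.
    static void simulateReceive(DatagramPacket packet, String payload) {
        byte[] bytes = payload.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(bytes, 0, packet.getData(), packet.getOffset(), bytes.length);
        packet.setLength(bytes.length);
    }

    // Whole-buffer decode followed by trim().
    static String wholeBufferTrimmed(DatagramPacket packet) {
        return new String(packet.getData(), StandardCharsets.US_ASCII).trim();
    }

    // Decode of the received part only.
    static String receivedOnly(DatagramPacket packet) {
        return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[1024];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);

        simulateReceive(packet, "goodbye");
        // A real receive loop would call packet.setLength(buffer.length) here
        // before reusing the packet, so the next datagram is not truncated.
        simulateReceive(packet, "die");

        System.out.println(wholeBufferTrimmed(packet)); // diedbye
        System.out.println(receivedOnly(packet));       // die
    }
}
```

The bytes left over from the earlier, longer message are neither zero nor whitespace, so trim() cannot remove them.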
0
 
girionisCommented:
objects, is there any chance the empty bytes will be converted to something else (if they are converted at all) apart from whitespace? I am curious as to why that could happen.
0
 
petepalmerAuthor Commented:
Regardless, if you don't use the "spare" bytes - you'll never have to deal with that issue :)
0
 
CEHJCommented:
>>Far safer to explicitly use the valid part of the buffer when creating your string.

There's no way of knowing what is 'valid'

Empty bytes could be padded either with ' ' or 0. The latter is preferable. Either way, String.trim() will work
0
 
CEHJCommented:
>>Empty bytes could be padded either with ' ' or 0

I should say 'are likely to be padded'. Certainly no non-whitespace bytes will be used for padding
0
 
objectsCommented:
> is there any chance the empty bytes will be converted to something else

problem is the bytes aren't necessarily empty, they are simply unused.

> Regardless, if you don't use the "spare" bytes - you'll never have to deal with that issue :)

if you use them then they aren't spare :)

> Empty bytes could be padded either with ' ' or 0.

no padding is required, as shown with my suggested solution.
0
 
petepalmerAuthor Commented:
likely but not certainly.....

If they're not my app would get one hell of a shock ;)


0
 
objectsCommented:
> likely but not certainly.....

If you just use the defined part of the data buffer then there's a lot more certainty about the result :)
0
 
CEHJCommented:
>>no padding is required, as shown with my suggested solution.

?
0
 
CEHJCommented:
>>If you just use the defined part of the data buffer

There's no certainty, from the receiving end alone, what 'defined' means. You're making an assumption - i.e. that petepalmer has control over how the message was set in the first place.
0
 
petepalmerAuthor Commented:
So what if I just used the "set" bytes.... and then trim that string?

Apart from potentially being slightly slower, would it give any other problems?
0
 
petepalmerAuthor Commented:
That might be a good idea. Both answers work - but there is great discussion whether either will work 100% of the time.  Despite the fact it seems to be working for me - I think it would be useful for other users with the same problem if we allowed this debate to continue....
0
 
objectsCommented:
The received packet's buffer holds 1024 bytes but only a handful of these bytes contain data (the rest of the buffer is undefined).
What CEHJ suggests is to create a string from the entire 1024 bytes, and then trim any whitespace from the resulting string. Even if you could be absolutely sure that the unused parts of the buffer only contained whitespace, it is still an inefficient approach and entirely unnecessary.
0
 
CEHJCommented:
>>The received packet contains 1024 bytes ... entirely unnecessary.

You're *still* making the assumption that petepalmer has control over how the message is set in the first place. That's the only way in which your suggestion would make any sense. Unless the length is explicitly set to only use part of the buffer

>>String message = new String(packet.getData(),0,packet.getLength());

would make no difference whatsoever. A 1024 byte String that needs trimming would be returned anyway.

In any case, it's probably a good idea to call trim() anyway.
0
 
objectsCommented:
> You're *still* making the assumption that petepalmer has control over how the message is set in the first place.

Not at all, I'm making no assumptions.
(Your suggestion, however, does.)

> >>String message = new String(packet.getData(),0,packet.getLength());
> would make no difference whatsoever.

Again incorrect, if that was the case then it wouldn't work :-D

> A 1024 byte String that needs trimming would be returned anyway.

not sure why you'd think that


0
 
CEHJCommented:
>>Not at all, I'm making no assumptions.

You are, I'm afraid, as your approach won't work in certain circumstances, depending on how the message was set. It could have been set as follows:

     final int BUF_SIZE = 1 << 10; // 1K buffer
     byte[] message = new byte[BUF_SIZE];
     byte[] messageStringBytes = "die".getBytes();
     System.arraycopy(messageStringBytes, 0, message, 0, messageStringBytes.length);
     // The packet length is set to the whole buffer, not to the 3 bytes of "die"
     DatagramPacket dp = new DatagramPacket(message, BUF_SIZE);
     String dataString = new String(dp.getData());
     System.out.println(dataString.equals("die")); // false
     dataString = new String(dp.getData(), dp.getOffset(), dp.getLength());
     System.out.println(dataString.equals("die")); // STILL false
     System.out.println(dataString.trim().equals("die")); // true

>>Again incorrect, if that was the case then it wouldn't work :-D

Not incorrect (see above)

>>(Your suggestion, however, does.)

No, it's not an assumption, but a deduction based on the behaviour exhibited by petepalmer's program

0
 
objectsCommented:
You seem to be missing the point somewhat :)
Your example doesn't really show anything except how *not* to create a datagram :)

And you still haven't addressed why you feel it necessary to create a String including the unused portion of the buffer, or provided proof that any unused buffer is *always* guaranteed to contain whitespace. Though even if it does, it's still an inefficient approach.
0
 
CEHJCommented:
>>Your example doesn't really show anything except how *not* to create a datagram

It's certainly not the best way, but since it uses one of the available ctors it's *one* way of creating it

>>And you still haven't addressed why you feel it necessary to create a String including the unused portion of the buffer

It's you who's missing the point - there's no way of knowing what portion is 'unused' if the datagram has been created like that as the code shows

>>provide proof that any unused buffer is *always* guaranteed to contain whitespace.

If you were using the datagram like this it would have to, or you could only obtain garbage. getOffset and getLength are not guaranteed to work, as I showed above, although they should do if the DatagramPacket has been created in an optimal way.
0
 
objectsCommented:
petepalmer,

I think I've demonstrated that it is not necessary or desirable to create a 1024 byte string from every packet. Let me know if you need further clarification.
0
 
petepalmerAuthor Commented:
Well, from what I can tell, both of you have very valid points. If the array is not fully used, there will be empty bytes, which you can get around by using the packet's length (packet.getLength()). However, if the array was padded, i.e. packets sent were always made up to 1024 bytes by, say, using white space, the packet length would always be 1024 and you'd have a string with a lot of white space.

Therefore I believe the best solution is to first copy only the "used" bytes from the array and then trim the result to make sure that no padding was used.


Does anyone disagree with this assessment?
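A sketch of that combined approach, under two stated assumptions: the payload is ASCII, and leading or trailing whitespace never carries meaning in the messages (trim() would strip it too). simulateReceive is a hypothetical stand-in for receiving a datagram, and the 64-byte zero-padded payload imitates a sender that pads its messages; neither comes from the original code.

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;

public class LengthThenTrimDemo {

    // Decode only the received bytes, then strip any sender-side padding.
    static String decode(DatagramPacket packet) {
        String received = new String(packet.getData(), packet.getOffset(),
                packet.getLength(), StandardCharsets.US_ASCII);
        return received.trim();
    }

    // Hypothetical stand-in for receiving a datagram into a 1024-byte buffer.
    static DatagramPacket simulateReceive(byte[] payload) {
        byte[] buffer = new byte[1024];
        System.arraycopy(payload, 0, buffer, 0, payload.length);
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        packet.setLength(payload.length);
        return packet;
    }

    public static void main(String[] args) {
        // Sender that transmits exactly the 3 bytes of "die".
        byte[] exact = "die".getBytes(StandardCharsets.US_ASCII);

        // Sender that pads every message to 64 bytes with zeros.
        byte[] padded = new byte[64];
        System.arraycopy(exact, 0, padded, 0, exact.length);

        System.out.println(decode(simulateReceive(exact)).equals("die"));  // true
        System.out.println(decode(simulateReceive(padded)).equals("die")); // true
    }
}
```

Using the packet length first keeps stale or unused buffer bytes out of the string; the trim() afterwards only matters when the sender itself pads.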
0
 
CEHJCommented:
The best solution is to set the offset and length in the ctor then the number of bytes of the buffer used and where they're used is clear. The trouble is, objects was assuming that this had been done, which can't be assumed, since there is a ctor where this is *not* done
0
 
objectsCommented:
> then trim it to make sure that no padding was used.

Whether trim is used or not is dependent on the protocol being used (which you should know anyway), which is not the problem in your case. It should definitely *not* be used to trim unused bytes as CEHJ is suggesting.
0
 
CEHJCommented:
>>It should definitely *not* be used to trim unused bytes as CEHJ is suggesting.

As i'm almost getting tired of saying, if setting the packet has been implemented in a certain way, such as the one i mentioned (effectively as a C string) then you can't possibly *know* what's unused unless you scan (possibly slightly faster) or (easier) trim the String
0
 
objectsCommented:
> such as the one i mentioned (effectively as a C string) then you can't possibly *know* what's
> unused unless you scan (possibly slightly faster) or (easier) trim the String

That's not the case here, and would be specified in the protocol as I mentioned above.
0
 
CEHJCommented:
>>That's not the case here

The point is that you didn't know that earlier.

>>and would be specified in the protocol

There may not be any 'protocol', especially with datagrams, other than 'i'm sending you a packet with a message in it'
0
 
objectsCommented:
> The point is that you didn't know that earlier.

Is that the closest you can come to saying I'm right :-D

> There may not be any 'protocol'

There is *always* a protocol.
0
 
CEHJCommented:
>>Is that the closest you can come to saying I'm right

Well you certainly *guessed* right here, but your suggestion won't always work as my code shows ;-)

>>There is *always* a protocol.

If you think 'i'm sending you a packet of data to parse' is a protocol, then you're right
0
 
objectsCommented:
> but your suggestion won't always work as my code shows ;-)

irrelevant as already discussed.
0
 
petepalmerAuthor Commented:
Hi,
   The problem is the two highest rated experts both have different views about which answer is best. If the experts can't decide which is the right way, I really don't feel that I can make the judgement simply because I may get it wrong. The thread itself will be of use to someone who has the same problem as me - but I honestly don't see how I can pick which answer is correct.
0
 
petepalmerAuthor Commented:
As suggested, I've split the points because in this case both did work. However, there is disagreement about whether they will work in all circumstances, and I'm certainly not qualified to say who is right. Both worked for me.... but you never know lol
0
 
VenabiliCommented:
It is your question - you are supposed to accept what worked for you :) Thanks for closing :)
0
 
objectsCommented:
> however there is disagreement if they will work in all circumstances

And also that it is unnecessary to create a 1024 byte string from an array where only some of the bytes are valid.
If you have a 1024 byte array and only, say, 5 bytes are actually used (and defined), then why would you want to create a 1024 byte string and then trim it? Answer ... you wouldn't :)
0
 
CEHJCommented:
>>And also that it is unnecessary to create a 1024 byte string from an array where only some of the bytes are valid.

Still missing the point, i'm afraid. You wouldn't necessarily *know* how many bytes are used
0
 
CEHJCommented:
Thanks petepalmer ;-)
0
 
objectsCommented:
> Still missing the point, i'm afraid. You wouldn't necessarily *know* how many bytes are used

Seems you have actually missed the point, have a look at the answer I posted. It shows you how to determine exactly how many bytes are used.
0
Question has a verified solution.
