Solved

String matching

Posted on 2004-08-17
Last Modified: 2010-03-31
Hi,
   This is a snippet of code from an application that receives a small amount of text in a UDP datagram. I can print the contents of the String and it does contain what I think it should. However, I can't seem to match it against another string. For example, in this case I want it to quit when it receives 'die'. Although it prints 'die' to the screen, it doesn't quit. Here's the code snippet:



byte[] buffer = new byte[1024];
packet = new DatagramPacket(buffer, buffer.length);
ds.receive(packet);

String message = new String(packet.getData());
System.out.println(message);
if (message.equalsIgnoreCase("die")) {
    System.out.println("Is this being called?");
    System.exit(0);
}



Any suggestions or am I doing something stupidly obviously wrong? :)


Question by:petepalmer
56 Comments
 
LVL 86

Expert Comment

by:CEHJ
ID: 11818618
Try

String message = new String(packet.getData()).trim();
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11818627
I was just messing around with it and the length of the data passed to the String is 1024 bytes, regardless of the length of the data received. I imagine that would screw things up - and I imagine "trim()" will solve that :)
0
 
LVL 92

Expert Comment

by:objects
ID: 11818632
String message = new String(packet.getData(), packet.getOffset(), packet.getLength());
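
A minimal sketch of why this differs from decoding the whole array. It assumes an ASCII payload and imitates the state receive() leaves behind by calling setLength() on a locally built packet; the class and method names (WholeBufferVsLength, simulatedReceive and so on) are made up for illustration and are not from petepalmer's application:

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;

class WholeBufferVsLength {
    // Decodes the entire backing array, including bytes the datagram never wrote.
    static String decodeWholeBuffer(DatagramPacket p) {
        return new String(p.getData(), StandardCharsets.US_ASCII);
    }

    // Decodes only the bytes that belong to the received datagram.
    static String decodeReceivedBytes(DatagramPacket p) {
        return new String(p.getData(), p.getOffset(), p.getLength(), StandardCharsets.US_ASCII);
    }

    // Stand-in for ds.receive(packet): receive() stores the payload in the buffer
    // and sets the packet length to the length of the received message.
    static DatagramPacket simulatedReceive(String payload) {
        byte[] buffer = new byte[1024];
        byte[] bytes = payload.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(bytes, 0, buffer, 0, bytes.length);
        DatagramPacket p = new DatagramPacket(buffer, buffer.length);
        p.setLength(bytes.length);
        return p;
    }

    public static void main(String[] args) {
        DatagramPacket p = simulatedReceive("die");
        System.out.println(decodeWholeBuffer(p).length());        // 1024
        System.out.println(decodeWholeBuffer(p).equals("die"));   // false
        System.out.println(decodeReceivedBytes(p).equals("die")); // true
    }
}
```

The whole-buffer String is always as long as the array, so equalsIgnoreCase("die") can never match it, whatever the println output looks like on screen.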
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11818636
Yup it did. Points are yours :)
0
 
LVL 92

Expert Comment

by:objects
ID: 11818637
> I imagine that would screw things up - and I imagine "trim()" will solve that :)

Not recommended; use the code I posted above instead.
0
 
LVL 92

Expert Comment

by:objects
ID: 11818642
> Yup it did. Points are yours :)

It's not guaranteed to work.
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11818643
Sorry objects - didn't see your reply till I'd already assigned the points......
0
 
LVL 35

Expert Comment

by:girionis
ID: 11818648
You can also use the indexOf("die") although this is a bit risky since the user might actually send a sentence with the word die in it.
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11818649
Which one isn't guaranteed to work? :S
0
 
LVL 92

Expert Comment

by:objects
ID: 11818653
> Which one isn't guaranteed to work?

the one you accepted
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11818657
>>It's not guaranteed to work.

Please explain why
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11818660
I'm now using  :


String message = new String(packet.getData(),0,packet.getLength());


Seems to work okay.....
0
 
LVL 92

Expert Comment

by:objects
ID: 11818684
using trim() would introduce a nasty bug to track down in your app.
i'll get the question reopened so you can accept the correct comment.


0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11818701
>>using trim() would introduce a nasty bug to track down in your app.

That's still not an adequate explanation
0
 
LVL 35

Expert Comment

by:girionis
ID: 11818707
> using trim() would introduce a nasty bug to track down in your app.

Can you elaborate on that?
0
 
LVL 92

Expert Comment

by:objects
ID: 11818727
I'm a little confused myself, actually, as to why CEHJ thinks that the message has been padded with whitespace. Under that circumstance you'd use trim(), but that's not happening here.
The fact that it works sometimes is only a result of the unused portion of the buffer happening to contain whitespace, and is not something you should or even need to code for.

0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11818777
>>I'm a little confused myself, actually, as to why CEHJ thinks that the message has been padded with whitespace

Quite simply because the String petepalmer was seeing was not what it seemed. It seemed a good bet.

>>but that's not happening here.

That seems to be exactly what's happening
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11819005
Perhaps when you create a string from a byte array, the empty bytes are made into whitespace?

Regardless the trim() did actually work so that must have been happening :)
0
 
LVL 92

Expert Comment

by:objects
ID: 11819075
> the empty bytes are made into whitespace?

Not necessarily, that's the danger. It may work on some occasions and not on others,
often making it hard to track down the problem.
Far safer to explicitly use the valid part of the buffer when creating your string.
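
One case where trim() alone would give the wrong answer, sketched under the assumption that the same byte[] is reused for successive datagrams and the payload is ASCII. The simulatedReceive helper is a hypothetical stand-in for ds.receive(packet), so nothing here touches the network:

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;

class ReusedBufferDemo {
    // Stand-in for ds.receive(packet): copies the payload to the start of the
    // packet's buffer and records its length, leaving later bytes untouched.
    static void simulatedReceive(DatagramPacket packet, String payload) {
        byte[] bytes = payload.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(bytes, 0, packet.getData(), 0, bytes.length);
        packet.setLength(bytes.length);
    }

    // Decodes all 1024 bytes, then trims.
    static String wholeBufferTrimmed(DatagramPacket packet) {
        return new String(packet.getData(), StandardCharsets.US_ASCII).trim();
    }

    // Decodes only the bytes reported by the packet.
    static String receivedBytesOnly(DatagramPacket packet) {
        return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[1024]; // one buffer, reused for every datagram
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);

        simulatedReceive(packet, "goodbye");
        simulatedReceive(packet, "die"); // overwrites only the first 3 bytes

        System.out.println(wholeBufferTrimmed(packet)); // diedbye
        System.out.println(receivedBytesOnly(packet));  // die
    }
}
```

The leftover "dbye" from the earlier datagram is not whitespace, so trim() keeps it.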
0
 
LVL 35

Expert Comment

by:girionis
ID: 11819120
objects, is there any chance the empty bytes will be converted to something else (if they are converted at all) apart from whitespace? I am curious as to why that could happen.
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11819136
Regardless, if you don't use the "spare" bytes - you'll never have to deal with that issue :)
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11819139
>>Far safer to explicitly use the valid part of the buffer when creating your string.

There's no way of knowing what is 'valid'

Empty bytes could be padded either with ' ' or 0. The latter is preferable. Either way, String.trim() will work
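
For reference, String.trim() strips leading and trailing characters whose code is at or below U+0020, which covers both the space and the zero byte decoded as U+0000. A small sketch (the class name TrimSemanticsDemo and the helper trimsTo are invented for this example):

```java
class TrimSemanticsDemo {
    // True when trimming raw yields exactly expected.
    static boolean trimsTo(String raw, String expected) {
        return raw.trim().equals(expected);
    }

    public static void main(String[] args) {
        System.out.println(trimsTo("die\0\0\0", "die"));   // true: NUL is at or below U+0020
        System.out.println(trimsTo("die   ", "die"));      // true: trailing spaces
        System.out.println(trimsTo("die\0\0old", "die"));  // false: trailing text is kept
    }
}
```

So trim() copes with space or zero padding, but not with any other bytes that happen to follow the message.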
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11819172
>>Empty bytes could be padded either with ' ' or 0

I should say 'are likely to be padded'. Certainly no non-whitespace bytes will be used for padding
0
 
LVL 92

Expert Comment

by:objects
ID: 11819178
> is there any chance the empty bytes will be converted to something else

problem is the bytes aren't necessarily empty, they are simply unused.

> Regardless, if you don't use the "spare" bytes - you'll never have to deal with that issue :)

if you use them then they aren't spare :)

> Empty bytes could be padded either with ' ' or 0.

no padding is required, as shown with my suggested solution.
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11819179
likely but not certainly.....

If they're not my app would get one hell of a shock ;)


0
 
LVL 92

Expert Comment

by:objects
ID: 11819185
> likely but not certainly.....

If you just use the defined part of the data buffer then there's a lot more certainty about the result :)
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11819198
>>no padding is required, as shown with my suggested solution.

?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11819249
>>If you just use the defined part of the data buffer

There's no certainty, from the receiving end alone, what 'defined' means. You're making an assumption - i.e. that petepalmer has control over how the message was set in the first place.
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11819279
So what if I just used the "set" bytes.... and then trim that string?

Apart from potentially being slightly slower, would it give any other problems?
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11823166
That might be a good idea. Both answers work - but there is great discussion whether either will work 100% of the time.  Despite the fact it seems to be working for me - I think it would be useful for other users with the same problem if we allowed this debate to continue....
0
 
LVL 92

Expert Comment

by:objects
ID: 11827055
The received packet's buffer contains 1024 bytes, but only a handful of these bytes contain data (the rest of the buffer is undefined).
What CEHJ suggests is to create a string from the entire 1024 bytes, and then trim any whitespace from the resulting string. Even if you could be absolutely sure that the unused parts of the buffer only contained whitespace, it is still an inefficient approach and entirely unnecessary.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11829021
>>The received packet contains 1024 bytes ... entirely unnecessary.

You're *still* making the assumption that petepalmer has control over how the message is set in the first place. That's the only way in which your suggestion would make any sense. Unless the length is explicitly set to only use part of the buffer

>>String message = new String(packet.getData(),0,packet.getLength());

would make no difference whatsoever. A 1024 byte String that needs trimming would be returned anyway.

In any case, it's probably a good idea to call trim() anyway.
0
 
LVL 92

Expert Comment

by:objects
ID: 11837162
> You're *still* making the assumption that petepalmer has control over how the message is set in the first place.

Not at all, I'm making no assumptions.
(Your suggestion, however, does.)

> >>String message = new String(packet.getData(),0,packet.getLength());
> would make no difference whatsoever.

Again incorrect, if that was the case then it wouldn't work :-D

> A 1024 byte String that needs trimming would be returned anyway.

not sure why you'd think that


0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11840258
>>Not at all, I'm making no assumptions.

You are, I'm afraid, as your approach won't work in certain circumstances, depending on how the message was set. It could have been set as follows:

     final int BUF_SIZE = 1 << 10; // 1K buffer
     byte[] message = new byte[BUF_SIZE];
     byte[] messageStringBytes = "die".getBytes();
     System.arraycopy(messageStringBytes, 0, message, 0, messageStringBytes.length);
     DatagramPacket dp = new DatagramPacket(message, BUF_SIZE);
     String dataString = new String(dp.getData());
     System.out.println(dataString.equals("die")); // false
     dataString = new String(dp.getData(), dp.getOffset(), dp.getLength());
     System.out.println(dataString.equals("die")); // STILL false
     System.out.println(dataString.trim().equals("die")); // true
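
What the receiver's getLength() reports depends on how many bytes the sender actually put on the wire. A rough loopback sketch, assuming UDP on the local machine is available; the class name LoopbackLengthDemo, the helper receivedLength and the 2 second timeout are choices made for this example only:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LoopbackLengthDemo {
    // Sends payload to a socket on the loopback interface and returns the
    // length that receive() reports for it in a 1024-byte receive buffer.
    static int receivedLength(byte[] payload) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // fail instead of blocking forever
            sender.send(new DatagramPacket(payload, payload.length,
                    loopback, receiver.getLocalPort()));
            DatagramPacket in = new DatagramPacket(new byte[1024], 1024);
            receiver.receive(in);
            return in.getLength();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] exact = "die".getBytes(StandardCharsets.US_ASCII);
        byte[] padded = Arrays.copyOf(exact, 1024); // "die" followed by 1021 zero bytes
        System.out.println(receivedLength(exact));  // 3
        System.out.println(receivedLength(padded)); // 1024
    }
}
```

If the sender transmits only the message bytes, getLength() is enough on the receiving side. If the sender transmits a padded 1K buffer as in the snippet above, the padding arrives as data and the receiver has to strip it some other way.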

>>Again incorrect, if that was the case then it wouldn't work :-D

Not incorrect (see above)

>>(Your suggestion, however, does.)

No, it's not an assumption, but a deduction based on the behaviour exhibited by petepalmer's program

0
 
LVL 92

Expert Comment

by:objects
ID: 11847097
You seem to be missing the point somewhat :)
Your example doesn't really show anything except how *not* to create a datagram :)

And you still haven't addressed why you feel it necessary to create a String including the unused portion of the buffer, or provide proof that any unused buffer is *always* guaranteed to contain whitespace. Though even if it does, it's still an inefficient approach.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11850433
>>Your example doesn't really show anything except how *not* to create a datagram

It's certainly not the best way, but since it uses one of the available ctors it's *one* way of creating it

>>And you still haven't addressed why you feel it necessary to create a String including the unused portion of the buffer

It's you who's missing the point - there's no way of knowing what portion is 'unused' if the datagram has been created like that as the code shows

>>provide proof that any unused buffer is *always* guaranteed to contain whitespace.

If you were using the datagram like this it would have to or you could only obtain garbage. getOffset and getLength are not guaranteed to work as i showed above although they should do if the DatagramPacket has been created in an optimal way.
0
 
LVL 92

Expert Comment

by:objects
ID: 11857112
petepalmer,

I think I've demonstrated that it is not necessary or desirable to create a 1024-byte string from every packet. Let me know if you need further clarification.
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11858242
Well, from what I can tell, both of you have very valid points. If the array is not fully used, there will be empty bytes, which you can get around by using packet.getLength(). However, if the array was padded, i.e. packets sent were always made up to 1024 bytes by, say, using white space, the packet length would always be 1024 and you'd have a string with a lot of white space.

Therefore I believe the best solution is to first copy only the "used" bytes from the array and then trim it to make sure that no padding was used.
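
A sketch of that combined approach, assuming ASCII text and that leading or trailing whitespace is never a meaningful part of a message. PacketText.decode is a name invented here, not something from the original application:

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PacketText {
    // Step 1: take only the bytes the packet says it contains.
    // Step 2: trim, in case the sender padded the message to a fixed size.
    static String decode(DatagramPacket packet) {
        String text = new String(packet.getData(), packet.getOffset(), packet.getLength(),
                StandardCharsets.US_ASCII);
        return text.trim();
    }

    public static void main(String[] args) {
        byte[] die = "die".getBytes(StandardCharsets.US_ASCII);

        // Case 1: sender padded the message to 1024 bytes with spaces.
        byte[] padded = new byte[1024];
        Arrays.fill(padded, (byte) ' ');
        System.arraycopy(die, 0, padded, 0, die.length);
        DatagramPacket fullLength = new DatagramPacket(padded, padded.length);
        System.out.println(decode(fullLength).equalsIgnoreCase("die")); // true

        // Case 2: only 3 bytes belong to the packet; the rest of the buffer is unused.
        byte[] buffer = new byte[1024];
        System.arraycopy(die, 0, buffer, 0, die.length);
        DatagramPacket shortPacket = new DatagramPacket(buffer, die.length);
        System.out.println(decode(shortPacket).equalsIgnoreCase("die")); // true
    }
}
```

In the second case the trim() has nothing to do, so the extra cost is small; whether trimming is appropriate at all depends on what the sender is allowed to send.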


Does anyone disagree with this assessment?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11858256
The best solution is to set the offset and length in the ctor then the number of bytes of the buffer used and where they're used is clear. The trouble is, objects was assuming that this had been done, which can't be assumed, since there is a ctor where this is *not* done
0
 
LVL 92

Expert Comment

by:objects
ID: 11861876
> then trim it to make sure that no padding was used.

Whether trim is used or not is dependent on the protocol being used (which you should know anyway), which is not the problem in your case. It should definitely *not* be used to trim unused bytes as CEHJ is suggesting.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11863559
>>It should definitely *not* be used to trim unused bytes as CEHJ is suggesting.

As i'm almost getting tired of saying, if setting the packet has been implemented in a certain way, such as the one i mentioned (effectively as a C string) then you can't possibly *know* what's unused unless you scan (possibly slightly faster) or (easier) trim the String
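
The scanning option could look roughly like this, assuming the sender follows the C-string convention (message bytes, then a zero terminator) and uses ASCII. CStringDecoder is a hypothetical helper written for this example:

```java
import java.nio.charset.StandardCharsets;

class CStringDecoder {
    // Decodes bytes from offset up to the first zero byte, or up to
    // offset + length if no terminator is found in that range.
    static String decode(byte[] data, int offset, int length) {
        int end = offset;
        int limit = offset + length;
        while (end < limit && data[end] != 0) {
            end++;
        }
        return new String(data, offset, end - offset, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] message = new byte[1024];
        byte[] text = "die".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(text, 0, message, 0, text.length);
        System.out.println(decode(message, 0, message.length).equals("die")); // true
    }
}
```

Unlike trim(), the scan stops at the first terminator, so anything after it in the buffer is ignored, and it never builds the full 1024-character String first.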
0
 
LVL 92

Expert Comment

by:objects
ID: 11865835
> such as the one i mentioned (effectively as a C string) then you can't possibly *know* what's
> unused unless you scan (possibly slightly faster) or (easier) trim the String

That's not the case here, and would be specified in the protocol as I mentioned above.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11867759
>>That's not the case here

The point is that you didn't know that earlier.

>>and would be specified in the protocol

There may not be any 'protocol', especially with datagrams, other than 'i'm sending you a packet with a message in it'
0
 
LVL 92

Expert Comment

by:objects
ID: 11867780
> The point is that you didn't know that earlier.

Is that the closest you can come to saying I'm right :-D

> There may not be any 'protocol'

There is *always* a protocol.
0
 
LVL 86

Accepted Solution

by:
CEHJ earned 125 total points
ID: 11867811
>>Is that the closest you can come to saying I'm right

Well you certainly *guessed* right here, but your suggestion won't always work as my code shows ;-)

>>There is *always* a protocol.

If you think 'i'm sending you a packet of data to parse' is a protocol, then you're right
0
 
LVL 92

Assisted Solution

by:objects
objects earned 125 total points
ID: 11867821
> but your suggestion won't always work as my code shows ;-)

irrelevant as already discussed.
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11951318
Hi,
   The problem is the two highest rated experts both have different views about which answer is best. If the experts can't decide which is the right way, I really don't feel that I can make the judgement simply because I may get it wrong. The thread itself will be of use to someone who has the same problem as me - but I honestly don't see how I can pick which answer is correct.
0
 
LVL 1

Author Comment

by:petepalmer
ID: 11951602
As suggested, I've split the points because in this case both did work - however there is disagreement if they will work in all circumstances - and I'm certainly not qualified to say who is right. Both worked for me.... but you never know lol
0
 
LVL 20

Expert Comment

by:Venabili
ID: 11951635
It is your question - you are supposed to accept what worked for you :) Thanks for closing :)
0
 
LVL 92

Expert Comment

by:objects
ID: 11958577
> however there is disagreement if they will work in all circumstances

And also that it is unnecessary to create a 1024-byte string from an array where only some of the bytes are valid.
If you have a 1024-byte array and only, say, 5 bytes are actually used (and defined), then why would you want to create a 1024-byte string and then trim it? Answer ... you wouldn't :)
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11968622
>>And also that it is unnecessary to create a 1024-byte string from an array where only some of the bytes are valid.

Still missing the point, i'm afraid. You wouldn't necessarily *know* how many bytes are used
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11968631
Thanks petepalmer ;-)
0
 
LVL 92

Expert Comment

by:objects
ID: 11968928
> Still missing the point, i'm afraid. You wouldn't necessarily *know* how many bytes are used

Seems you have actually missed the point, have a look at the answer I posted. It shows you how to determine exactly how many bytes are used.
0
