Improve company productivity with a Business Account.Sign Up

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 232
  • Last Modified:

Searching for strings in a byte array

I want to do some fairly crude processing on data read by an InputStreamReader. I have performance issues with character processing and the text is ASCII, so I am working with bytes.

In this quick'n'dirty processing, I'd like to be able to search for strings in a character array read by the InputStreamReader. Google enticed me with http://www.goodserver.com/imap4-ssdk/apidocs/com/goodserver/util/ByteArrayHelper.html#indexOf%28byte%5b%5c,%20byte%5b%5c%29 but that would require buying into a library with a load of whistles and bells that aren't needed.

Can anyone suggest a  more suitable jar I could use for searching for strings in a byte array?
0
rstaveley
Asked:
rstaveley
1 Solution
 
hoomanvCommented:
you can write it yourself, by learning how it is implemented in String class using char, you can change it to use byte
java.lang.String source code: http://www.docjar.com/html/api/java/lang/String.java.html
0
 
hoomanvCommented:
> I'd like to be able to search for strings in a character array
does it really decreases performance if u use    new String(charArray).indexOf()   ?
0
 
Tommy BraasCommented:
My suggestion would be to use the BufferedReader to increase stream performance. Secondly, as far as the algorithm goes, my recommendation depends on the type search you're doing. Which strings are you searching for? Whole words? Partial words? Across line endings or not?
0
Upgrade your Question Security!

Your question, your audience. Choose who sees your identity—and your question—with question security.

 
rstaveleyAuthor Commented:
> BufferedReader

This code needs to process large text files quickly. I've taken note of the comparisons at: http://www.javaworld.com/jw-11-2000/jw-1117-performance.html and because I don't need characters so I'd rather work with bytes and buffer locally in a byte array read into by InputStreamReader.

Here's the sum total of the processing I need to do:

(1) Junk any UUEncoded blocks
   (a) Look for \nbegin
   (b) Ignore characters until \nend
(2) Junk any lines which are longer than (say) 512 bytes
   (a) Look for \n
   (b) If it comes after > 512 bytes
(3) Copy all unjunked text into a temporary file.

The processing in (2) is very simple and I guess I could look for individual bytes for "begin" and "end" to handle (1), but I'd prefer not to write the code, if I can avoid it :-)
0
 
objectsCommented:
> I want to do some fairly crude processing on data read by an InputStreamReader. I

if you are only interested in the bytes then don't use InputStreamReader and instead read from the InputStream directly.

You search would basically loop thru the bytes looking for the first byte in the string(s) to match.
When you find it check next matches 2ns char etc.
When you find a mismatch go back to looking for 1st char.
0
 
rstaveleyAuthor Commented:
Thanks, objects. I suspected as much though I had a mental block about using the reader.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now