Solved

Reading/Writing binary files

Posted on 2002-03-06
Last Modified: 2012-06-21
Hey all,
I've been fighting with this problem for a few days and, to be honest, I'm quite confused as to what's
going on.  Here's the scenario: I'm taking a binary file, reading it into a char[] buffer, looking
for a particular sequence of characters, changing that sequence to a new sequence (of the same length),
and writing the buffer out to a new file.  I then open the two files in a hex editor to see
if my changes worked properly.

Here's where it gets interesting.  Upon looking through the file, I find that the sequence I want to
change did get changed properly.  However, there are some other changes that were made that shouldn't
have been.

The following hex values are being incorrectly changed to 3F (every instance of them, not
just one or two):
81
8D
8F
90
9D

Now, when printing out those characters, I see that they are some of the "unprintable characters"
(i.e. they display as '?').  However, a number of other unprintable characters don't get changed to 3F.
For example, the following are left alone:
82
80
83
84
85
93
94
95

Also, I know for a fact that this isn't an off by one error, as I'm reading in 8192 bytes at a time,
and writing them out.  Since the test file I'm working on only takes 2 reads and writes, and there are
significantly more changes than that, I'm pretty sure that I'm not just losing the last character in
my buffer or anything like that.

Does anybody have any idea what's going on here?  I'm using BufferedReader and BufferedWriter for the
read/write operations (also tried FileReader/Writer).  I suspect there is some kind of bad conversion
happening between read and write, but not sure what.

Oh yeah, one other thing.  I've tried just reading and immediately writing without changing anything
in the buffer and got the same results, so I'm fairly sure it's not in my conversion method.

After fighting with this whole problem for a while, I've tried using RandomAccessFile instead and reading in bytes.  Haven't fully gotten that to work, but I'm hopeful.  However, I'm still really curious as to why the aforementioned problem is happening.  Thanks for any help.
Question by:mzimmer74
5 Comments
 
LVL 4

Expert Comment

by:jerch
ID: 6844657
Can you post your code?
 
LVL 4

Accepted Solution

by:m_onkey_boy (earned 75 total points)
ID: 6844713
You have to read your file with FileInputStream and manipulate the bytes instead of characters - it's the character handling that is causing your errors.
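
A minimal sketch of that byte-based approach, in case it helps. The class name, the method names and the OLD/NEW byte sequences are made up for illustration; they are not from the original code. It reads the whole file into a byte[] rather than 8192 bytes at a time, so a match can't be split across two reads.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BinaryPatch {

    // Overwrites every occurrence of 'from' with 'to' in place and returns
    // the number of replacements. Both sequences must have the same length.
    static int replaceAll(byte[] data, byte[] from, byte[] to) {
        if (from.length == 0 || from.length != to.length) {
            throw new IllegalArgumentException("sequences must be non-empty and equal in length");
        }
        int count = 0;
        int i = 0;
        while (i <= data.length - from.length) {
            if (matchesAt(data, i, from)) {
                System.arraycopy(to, 0, data, i, to.length);
                i += from.length;
                count++;
            } else {
                i++;
            }
        }
        return count;
    }

    // True if 'pattern' occurs in 'data' starting at index 'pos'.
    static boolean matchesAt(byte[] data, int pos, byte[] pattern) {
        for (int j = 0; j < pattern.length; j++) {
            if (data[pos + j] != pattern[j]) {
                return false;
            }
        }
        return true;
    }

    // Reads the stream to the end as raw bytes; no Reader, so no charset is involved.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: java BinaryPatch <input> <output>");
            return;
        }
        byte[] from = {0x4F, 0x4C, 0x44}; // hypothetical sequence to find ("OLD")
        byte[] to = {0x4E, 0x45, 0x57};   // hypothetical replacement ("NEW")
        try (InputStream in = new FileInputStream(args[0]);
             OutputStream out = new FileOutputStream(args[1])) {
            byte[] data = readAll(in);
            int replaced = replaceAll(data, from, to);
            out.write(data);
            System.out.println("replaced " + replaced + " occurrence(s)");
        }
    }
}
```

For files too large to hold in memory you would need to process them in chunks and handle a match that straddles a chunk boundary, which this sketch deliberately avoids.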

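This is happening because Readers convert bytes into characters, and Writers convert characters back into bytes.  When you don't specify a charset, they use the platform default charset, which on a Western-locale Windows machine is usually windows-1252 (Cp1252), not ISO-8859-1.  Example (hex values):

bytes in file: 35, 36, 37, 81
goes through Reader (Java chars are 16 bits, shown here as two bytes each): (00-35), (00-36), (00-37), (FF-FD).

81 is not assigned in windows-1252 (neither are 8D, 8F, 90 and 9D), so the Reader decodes it to the replacement character U+FFFD.  When you write back out, the Writer can't encode U+FFFD in windows-1252 either, so you get 3F (?).  Bytes such as 80, 82-85 and 93-95 are assigned in windows-1252, which is why they survive the round trip.  ISO-8859-1 maps every byte value to a character, so it would not cause this.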
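
You can check which charset produces the 3F substitution by round-tripping the problem bytes through a String, which is roughly what a Reader followed by a Writer does. This is only a sketch: the class and method names are invented, and it assumes windows-1252 (a common default on Western-locale Windows installs) is the charset in play, with ISO-8859-1 shown for comparison.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Decodes the bytes to chars and encodes them back with the same charset.
    static byte[] roundTrip(byte[] input, Charset charset) {
        return new String(input, charset).getBytes(charset);
    }

    // Formats bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The five bytes reported as changed, plus two reported as left alone.
        byte[] input = {
            0x35, (byte) 0x81, (byte) 0x8D, (byte) 0x8F, (byte) 0x90, (byte) 0x9D,
            (byte) 0x80, (byte) 0x93
        };
        Charset cp1252 = Charset.forName("windows-1252"); // assumed platform default
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("input:           " + hex(input));
        System.out.println("windows-1252:    " + hex(roundTrip(input, cp1252)));
        System.out.println("ISO-8859-1:      " + hex(roundTrip(input, StandardCharsets.ISO_8859_1)));
    }
}
```

With windows-1252 the five unassigned bytes should come back as 3F while 80 and 93 are preserved; with ISO-8859-1 every byte should come back unchanged. Even so, a byte stream is the safer choice for binary data than picking a "lossless" charset.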

 
LVL 2

Author Comment

by:mzimmer74
ID: 6844748
Thanks.  I wondered if this might be the case, and am glad to see that I wasn't just losing my mind.
 
LVL 1

Expert Comment

by:stefarg
ID: 9963482
I'm having a very similar problem but I'm really stuck at the moment because I can't see a way around it.
The problem I'm having is that I'm receiving data in DatagramPackets that I write to a file using a FileOutputStream, but I think that the DatagramPacket object I use to receive the packet is causing the problem.  The payload of the packet seems to have the conversion already applied.  Is there any way I can stop this from happening?
Thanks,
Stef
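
For what it's worth, DatagramPacket only hands back a byte[], so it should not apply any charset conversion by itself. The usual suspects are a `new String(packet.getData())` somewhere on the receiving side, or the sender having built the payload from a String. A minimal sketch of writing the payload untouched follows; the class and method names are invented, and the packet in main is simulated rather than received from a socket.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.DatagramPacket;
import java.util.Arrays;

public class PacketPayload {

    // Returns exactly the received bytes: getData() is the whole backing
    // buffer, so it has to be trimmed with getOffset() and getLength().
    static byte[] payload(DatagramPacket packet) {
        int start = packet.getOffset();
        return Arrays.copyOfRange(packet.getData(), start, start + packet.getLength());
    }

    // Writes the payload as raw bytes. Avoid new String(packet.getData()),
    // which decodes with the default charset and can corrupt binary data.
    static void writePayload(DatagramPacket packet, OutputStream out) throws IOException {
        out.write(packet.getData(), packet.getOffset(), packet.getLength());
    }

    public static void main(String[] args) throws IOException {
        // Simulated packet: a 1024-byte buffer of which 3 bytes are "received".
        byte[] buf = new byte[1024];
        byte[] sample = {0x35, (byte) 0x81, (byte) 0x9D};
        System.arraycopy(sample, 0, buf, 0, sample.length);
        DatagramPacket packet = new DatagramPacket(buf, sample.length);

        // A FileOutputStream would be used the same way as this in-memory stream.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePayload(packet, out);
        System.out.println(Arrays.equals(out.toByteArray(), payload(packet)));
    }
}
```

If the bytes are already 3F when inspected straight out of getData(), the conversion most likely happened before the packet was sent.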
 

Expert Comment

by:cyberscan
ID: 13166983
I think there may be another problem with crap OSes like Windows.  I noticed that whenever a \r is written to a file Windows also writes a \n.  This is very annoying.  I know that in C, one can open a file with an additional flag (O_BINARY) to keep this from happening.  I've been trying to figure out how to do this in Java.

I wrote a webserver in Java, and it works great for transferring binary formats in every operating system except those designed by Microsoft.
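
As far as I know, Java's byte streams have no text mode, so there is no O_BINARY equivalent to set: FileOutputStream writes exactly the bytes it is given on every OS. Extra line-ending bytes normally come from println() or BufferedWriter.newLine(), which emit the platform line separator. A small sketch to verify this on a given machine (class and method names are invented for illustration):

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class RawBytesToFile {

    // Writes the bytes to a temporary file through a FileOutputStream,
    // reads the file back, and returns what was actually stored on disk.
    static byte[] writeAndReadBack(byte[] data) {
        try {
            Path tmp = Files.createTempFile("raw-bytes", ".bin");
            try {
                try (OutputStream out = new FileOutputStream(tmp.toFile())) {
                    out.write(data);
                }
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Lone CR, lone LF and a CRLF pair mixed with ordinary bytes.
        byte[] data = {'a', '\r', 'b', '\n', 'c', '\r', '\n'};
        byte[] back = writeAndReadBack(data);
        System.out.println("line separator length: " + System.lineSeparator().length());
        System.out.println("bytes unchanged: " + Arrays.equals(data, back));
    }
}
```

For a web server that means sending file contents through the socket's OutputStream directly, and writing header line endings as explicit "\r\n" rather than through println().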
