Solved

Reading/Writing binary files

Posted on 2002-03-06
Last Modified: 2012-06-21
Hey all,
I've been fighting with this problem for a few days and to be honest I'm quite confused as to what's
going on.  Here's the scenario...I'm taking a binary file, reading it in to a char[] buffer, looking
for a particular sequence of characters, changing that sequence to a new sequence (of the same length),
and writing the buffer out to a new file.  I then am opening up the two files in a hex editor to see
if my changes worked properly.  

Here's where it gets interesting.  Upon looking through the file, I find that the sequence I want to
change did get changed properly.  However, there are some other changes that were made that shouldn't
have been.

The following hex values are being incorrectly changed to "3F" (every instance of them, not
just one or two):
81
8D
8F
90
9D

Now, when I print those values as characters, I see that they are some of the "unprintable characters"
(i.e. they show up as '?').  However, a number of other unprintable characters don't get changed to 3F.
For example, the following are left alone:
82
80
83
84
85
93
94
95

Also, I know for a fact that this isn't an off-by-one error, as I'm reading in 8192 bytes at a time,
and writing them out.  Since the test file I'm working on only takes 2 reads and writes, and there are
significantly more changes than that, I'm pretty sure that I'm not just losing the last character in
my buffer or anything like that.

Does anybody have any idea what's going on here?  I'm using BufferedReader and BufferedWriter for the
read/write operations (also tried FileReader/Writer).  I suspect there is some kind of bad conversion
happening between read and write, but not sure what.

Oh yeah, one other thing.  I've tried just reading and immediately writing without changing anything
in the buffer and got the same results, so I'm fairly sure it's not in my conversion method.

After fighting with this whole problem for a while, I've tried using RandomAccessFile instead and reading in bytes.  Haven't fully gotten that to work, but I'm hopeful.  However, I'm still really curious as to why the aforementioned problem is happening.  Thanks for any help.
Question by:mzimmer74
5 Comments
 

Expert Comment

by:jerch
ID: 6844657
Can you post your code?

Accepted Solution

by:
m_onkey_boy earned 75 total points
ID: 6844713
You have to read your file with FileInputStream and manipulate the bytes instead of characters - it's the character handling that is causing your errors.
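
A minimal sketch of that byte-based approach, in case it helps.  The class name BinaryPatch, its method names, and the sample byte values are made up for illustration; it also assumes the file fits in memory, which avoids missing a sequence that straddles two 8192-byte reads.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical helper: patches a byte sequence without ever using char, Reader, or Writer.
class BinaryPatch {

    // Overwrites every occurrence of `find` with `replacement` (same length), in place.
    // Returns the number of occurrences replaced.
    public static int replaceAll(byte[] data, byte[] find, byte[] replacement) {
        if (find.length == 0 || find.length != replacement.length) {
            throw new IllegalArgumentException("sequences must be non-empty and of equal length");
        }
        int count = 0;
        int i = 0;
        while (i <= data.length - find.length) {
            if (matchesAt(data, i, find)) {
                System.arraycopy(replacement, 0, data, i, replacement.length);
                i += find.length;   // skip past the bytes just written
                count++;
            } else {
                i++;
            }
        }
        return count;
    }

    private static boolean matchesAt(byte[] data, int pos, byte[] find) {
        for (int j = 0; j < find.length; j++) {
            if (data[pos + j] != find[j]) {
                return false;
            }
        }
        return true;
    }

    // Reads the whole file as raw bytes, 8192 at a time.
    public static byte[] readFully(String path) throws IOException {
        try (FileInputStream in = new FileInputStream(path);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    // Reads inPath, patches the bytes, and writes the result to outPath.
    public static void patchFile(String inPath, String outPath,
                                 byte[] find, byte[] replacement) throws IOException {
        byte[] data = readFully(inPath);
        replaceAll(data, find, replacement);
        try (FileOutputStream out = new FileOutputStream(outPath)) {
            out.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage (hypothetical): java BinaryPatch input.bin output.bin
        byte[] find = {0x41, 0x42};          // "AB", sample values only
        byte[] replacement = {0x58, 0x59};   // "XY"
        patchFile(args[0], args[1], find, replacement);
    }
}
```

Because nothing here is ever decoded into characters, bytes such as 81 or 8D pass through untouched.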

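This is happening because Readers convert bytes into characters, and Writers convert characters into bytes.  When you don't specify a charset, they use the platform default, which on a Western-locale Windows machine is Cp1252 (windows-1252) rather than ISO-8859-1.  Example (hex values):

bytes in file: 35, 36, 37, 81
goes through Reader (Java chars are 16 bits): U+0035, U+0036, U+0037, U+FFFD.

The bytes 81, 8D, 8F, 90 and 9D are exactly the five values that Cp1252 leaves unassigned, so the Reader turns each of them into the replacement character U+FFFD.  When you write back out, the Writer has no Cp1252 byte for U+FFFD, so you get 3F (?).  The values you saw left alone (80, 82 through 85, 93 through 95) are all assigned in Cp1252, which is why they survive the round trip.  ISO-8859-1 itself would not cause this, because it maps every byte value to a character and back.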
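
If you want to see the effect in isolation, here is a small sketch that pushes a few bytes through the same decode/encode round trip a Reader and Writer perform.  The class name CharsetRoundTrip and the sample bytes are made up for illustration, and it assumes your JDK ships the windows-1252 charset (the standard JDKs I know of do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: bytes -> chars -> bytes, as a Reader/Writer pair would do.
class CharsetRoundTrip {

    // Decodes the bytes with the given charset and encodes the result again.
    // Unmappable input is replaced rather than rejected, as with Reader/Writer.
    public static byte[] roundTrip(byte[] input, Charset charset) {
        return new String(input, charset).getBytes(charset);
    }

    // Formats bytes as space-separated two-digit hex values.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] original = {0x35, 0x36, 0x37, (byte) 0x81, (byte) 0x8D,
                           (byte) 0x8F, (byte) 0x90, (byte) 0x9D,
                           (byte) 0x80, (byte) 0x93};
        Charset cp1252 = Charset.forName("windows-1252");

        System.out.println("original:   " + hex(original));
        // The five bytes unassigned in Cp1252 should come back as 3F.
        System.out.println("Cp1252:     " + hex(roundTrip(original, cp1252)));
        // ISO-8859-1 maps all 256 byte values, so this should match the original.
        System.out.println("ISO-8859-1: " + hex(roundTrip(original, StandardCharsets.ISO_8859_1)));
    }
}
```

Naming ISO-8859-1 explicitly on both the Reader and the Writer would get the bytes through intact, but working on bytes directly is still the cleaner fix for binary data.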


Author Comment

by:mzimmer74
ID: 6844748
Thanks.  I wondered if this might be the case, and am glad to see that I wasn't just losing my mind.

Expert Comment

by:stefarg
ID: 9963482
I'm having a very similar problem but I'm really stuck at the moment because I can't see a way around it.
The problem I'm having is that I'm receiving data in DatagramPackets that I write to a file using a FileOutputStream, but I think that the DatagramPacket object I use to receive the packet is causing the problem.  The payload of the packet seems to have the conversion already applied.  Is there any way I can stop this from happening?
Thanks,
Stef
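
For what it's worth, DatagramPacket.getData() hands back the raw byte array and does no charset conversion itself, so the change is more likely coming from a String or Writer somewhere on the sending or receiving side, or from writing the whole receive buffer instead of just the received bytes.  A minimal sketch of writing only the payload, assuming that is the situation; the class name PacketDump and the sample bytes are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.DatagramPacket;

// Hypothetical helper: copies a packet's payload to a stream as raw bytes.
class PacketDump {

    // Writes exactly the received bytes: getData() is the whole backing buffer,
    // so getOffset() and getLength() are needed to pick out the payload.
    public static void writePayload(DatagramPacket packet, OutputStream out) throws IOException {
        out.write(packet.getData(), packet.getOffset(), packet.getLength());
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a received packet: a 6-byte buffer holding a 3-byte payload.
        byte[] buffer = {0x00, (byte) 0x81, (byte) 0x8D, (byte) 0x90, 0x00, 0x00};
        DatagramPacket packet = new DatagramPacket(buffer, 1, 3);

        // A FileOutputStream would be used the same way as this in-memory stream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writePayload(packet, sink);
        System.out.println(sink.size() + " bytes written");
    }
}
```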

Expert Comment

by:cyberscan
ID: 13166983
I think there may be another problem with crap OS's like Windows.  I noticed that whenever a \r is written to a file, Windows also writes a \n.  This is very annoying.  I know that in C, one can open a file with an additional flag (O_BINARY) to keep this from happening.  I've been trying to figure out how to do this in Java.

I wrote a webserver in Java, and it works great for transferring binary formats in every operating system except those designed by Microsoft.
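
As far as I know Java has no O_BINARY equivalent because its byte streams have no text mode: FileOutputStream writes the bytes it is given on every platform.  Extra line terminators usually come from println() or from code that uses the line.separator property.  A quick sketch to check this on your own machine; the class name RawNewlines and the sample payload are made up for illustration.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical check: does a byte stream alter \r or \n on this platform?
class RawNewlines {

    // Writes the payload to a temporary file through a FileOutputStream
    // and returns the bytes that actually ended up on disk.
    public static byte[] writeAndReadBack(byte[] payload) throws IOException {
        Path tmp = Files.createTempFile("raw", ".bin");
        try {
            try (FileOutputStream out = new FileOutputStream(tmp.toFile())) {
                out.write(payload);
            }
            return Files.readAllBytes(tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = {'a', '\r', 'b', '\n', 'c'};
        byte[] onDisk = writeAndReadBack(payload);
        System.out.println("wrote " + payload.length + " bytes, file holds " + onDisk.length);
    }
}
```

If the two counts match, the stream is not adding anything, and the place to look is whatever builds the data before it reaches the stream.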