Reading/Writing binary files

Posted on 2002-03-06
Last Modified: 2012-06-21
Hey all,
I've been fighting with this problem for a few days and to be honest I'm quite confused as to what's
going on.  Here's the scenario...I'm taking a binary file, reading it in to a char[] buffer, looking
for a particular sequence of characters, changing that sequence to a new sequence (of the same length),
and writing the buffer out to a new file.  I then am opening up the two files in a hex editor to see
if my changes worked properly.  

Here's where it gets interesting.  Upon looking through the file, I find that the sequence I want to
change did get changed properly.  However, there are some other changes that were made that shouldn't
have been.

The following hex values are being incorrectly changed to "3F" (every instance of them, not
just one or two):
81
8D
8F
90
9D

Now, when printing out what characters those are, I see that they are some of the "unprintable characters"
(i.e. they show up as '?').  However, a number of other unprintable characters don't get changed to 3F.
For example, the following are left alone:
82
80
83
84
85
93
94
95

Also, I know for a fact that this isn't an off by one error, as I'm reading in 8192 bytes at a time,
and writing them out.  Since the test file I'm working on only takes 2 reads and writes, and there are
significantly more changes than that, I'm pretty sure that I'm not just losing the last character in
my buffer or anything like that.

Does anybody have any idea what's going on here?  I'm using BufferedReader and BufferedWriter for the
read/write operations (also tried FileReader/Writer).  I suspect there is some kind of bad conversion
happening between read and write, but not sure what.

Oh yeah, one other thing.  I've tried just reading and immediately writing without changing anything
in the buffer and got the same results, so I'm fairly sure it's not in my conversion method.

After fighting with this whole problem for a while, I've tried using RandomAccessFile instead and reading in bytes.  Haven't fully gotten that to work, but I'm hopeful.  However, I'm still really curious as to why the aforementioned problem is happening.  Thanks for any help.
Question by:mzimmer74

Expert Comment

by:jerch
ID: 6844657
Can you post your code?

Accepted Solution

by:m_onkey_boy (earned 75 total points)
ID: 6844713
You have to read your file with FileInputStream and manipulate the bytes instead of characters - it's the character handling that is causing your errors.
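
A minimal sketch of that byte-based approach, under stated assumptions: the class name `BinaryPatch`, the helper names, and the patterns `OLD1`/`NEW1` are hypothetical placeholders, not taken from the original poster's code. It reads the whole file into memory (assuming the file is small enough), which also avoids missing a match that straddles two 8192-byte reads.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class BinaryPatch {
    // Overwrites every non-overlapping occurrence of `from` with `to`, in place.
    // Returns the number of replacements made.
    static int replaceAll(byte[] data, byte[] from, byte[] to) {
        if (from.length == 0 || from.length != to.length) {
            throw new IllegalArgumentException("patterns must be non-empty and of equal length");
        }
        int count = 0;
        int i = 0;
        while (i <= data.length - from.length) {
            if (matchesAt(data, i, from)) {
                System.arraycopy(to, 0, data, i, to.length);
                i += from.length;
                count++;
            } else {
                i++;
            }
        }
        return count;
    }

    static boolean matchesAt(byte[] data, int offset, byte[] pattern) {
        for (int j = 0; j < pattern.length; j++) {
            if (data[offset + j] != pattern[j]) {
                return false;
            }
        }
        return true;
    }

    // Reads the file as raw bytes; no Reader, so no charset decoding takes place.
    static byte[] readFully(String path) throws IOException {
        try (InputStream in = new FileInputStream(path);
             ByteArrayOutputStream buf = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java BinaryPatch <input> <output>");
            return;
        }
        // Hypothetical patterns, for illustration only.
        byte[] from = "OLD1".getBytes(StandardCharsets.US_ASCII);
        byte[] to = "NEW1".getBytes(StandardCharsets.US_ASCII);

        byte[] data = readFully(args[0]);
        int replaced = replaceAll(data, from, to);
        try (OutputStream out = new FileOutputStream(args[1])) {
            out.write(data);
        }
        System.out.println("replaced " + replaced + " occurrence(s)");
    }
}
```

Because nothing is ever turned into a `char`, bytes such as 81 or 9D should pass through untouched.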

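The corruption happens because Readers convert bytes into characters, and Writers convert characters back into bytes.  When you don't specify a charset, Java uses the platform default.  Judging by the list of affected bytes, that default is almost certainly windows-1252 (Cp1252), the usual default on Western-locale Windows, rather than ISO-8859-1.  Example (hex values):

bytes in file: 35, 36, 37, 81
after the Reader (Java chars are 16 bits each): U+0035, U+0036, U+0037, U+FFFD

The bytes 81, 8D, 8F, 90 and 9D are exactly the five values that windows-1252 leaves undefined, so the Reader decodes each of them to the replacement character U+FFFD.  Bytes such as 80, 82 or 93 are defined (80 is the euro sign, for example), so they survive the trip.  When you write back out, the Writer cannot encode U+FFFD in windows-1252 and substitutes 3F (?).  ISO-8859-1, by contrast, maps all 256 byte values, so it would have round-tripped the data unchanged.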
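
The decode/encode round trip can be reproduced without any files. The following is a small sketch, not the original poster's code: the class name `CharsetRoundTrip` and its helpers are made up for illustration, and it assumes the JVM provides the `windows-1252` charset (common, but not one of the charsets every JVM is required to support).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    // Decodes bytes to a String and encodes them straight back,
    // which is what a Reader followed by a Writer effectively does.
    static byte[] roundTrip(byte[] input, Charset cs) {
        String decoded = new String(input, cs);
        return decoded.getBytes(cs);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] original = {
            0x35, (byte) 0x80, (byte) 0x81, (byte) 0x8D,
            (byte) 0x8F, (byte) 0x90, (byte) 0x9D, (byte) 0x93
        };
        // Assumption: windows-1252 is available in this JVM.
        Charset cp1252 = Charset.forName("windows-1252");

        System.out.println("platform default: " + Charset.defaultCharset());
        System.out.println("original:         " + hex(original));
        System.out.println("via windows-1252: " + hex(roundTrip(original, cp1252)));
        System.out.println("via ISO-8859-1:   " + hex(roundTrip(original, StandardCharsets.ISO_8859_1)));
    }
}
```

If binary data really has to pass through a `Reader`/`Writer` pair, explicitly choosing ISO-8859-1 on both sides keeps it byte-exact, though working with streams and `byte[]` directly is the less fragile choice.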


Author Comment

by:mzimmer74
ID: 6844748
Thanks.  I wondered if this might be the case, and am glad to see that I wasn't just losing my mind.

Expert Comment

by:stefarg
ID: 9963482
I'm having a very similar problem but I'm really stuck at the moment because I can't see a way around it.
The problem I'm having is that I'm receiving data in DatagramPackets that I write to a file using a FileOutputStream, but I think that the DatagramPacket object I use to receive the packet is causing the problem.  The payload of the packet seems to have the conversion already applied.  Is there any way I can stop this from happening?
Thanks,
Stef
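
For what it's worth, `DatagramPacket` only carries a `byte[]` and does no charset conversion itself, so the conversion is more likely introduced elsewhere, for example by building a `String` from the payload on the sending or receiving side. A hedged sketch of writing the payload out untouched follows; the class name `DatagramToFile` and the helper `writePayload` are hypothetical, and the port and file name come from the command line rather than from this thread.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.DatagramPacket;
import java.net.DatagramSocket;

class DatagramToFile {
    // Writes exactly the received payload: honours offset and length,
    // and never turns the bytes into chars or a String.
    static void writePayload(DatagramPacket packet, OutputStream out) throws IOException {
        out.write(packet.getData(), packet.getOffset(), packet.getLength());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java DatagramToFile <port> <output-file>");
            return;
        }
        int port = Integer.parseInt(args[0]);
        try (DatagramSocket socket = new DatagramSocket(port);
             OutputStream out = new FileOutputStream(args[1])) {
            byte[] buf = new byte[65507]; // largest possible UDP payload over IPv4
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            socket.receive(packet);       // blocks until one datagram arrives
            writePayload(packet, out);
        }
    }
}
```

Writing `packet.getData()` without the length is another common source of damage, since it appends whatever is left in the rest of the buffer.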

Expert Comment

by:cyberscan
ID: 13166983
I think there may be another problem with crap OSes like Windows.  I noticed that whenever a \r is written to a file, Windows also writes a \n.  This is very annoying.  I know that in C, one can open a file with an additional flag (O_BINARY) to keep this from happening.  I've been trying to figure out how to do this in Java.

I wrote a webserver in Java, and it works great for transferring binary formats in every operating system except those designed by Microsoft.
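
As far as I know, Java has no counterpart to O_BINARY because its byte streams have no text mode: `FileOutputStream` writes the bytes it is given, and line-separator handling is confined to methods such as `println` and `BufferedWriter.newLine()`. A quick way to check this on any OS is sketched below; the class name `RawNewlines` and its helper are made up for the example.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class RawNewlines {
    // Writes the bytes to a temporary file through FileOutputStream,
    // then reads the file back as raw bytes.
    static byte[] writeAndReadBack(byte[] data) throws IOException {
        Path tmp = Files.createTempFile("raw-newlines", ".bin");
        try {
            try (OutputStream out = new FileOutputStream(tmp.toFile())) {
                out.write(data);
            }
            return Files.readAllBytes(tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {'a', '\r', 'b', '\n', 'c', '\r', '\n'};
        byte[] back = writeAndReadBack(data);
        System.out.println("identical: " + Arrays.equals(data, back));
    }
}
```

If a lone \r still gains a \n somewhere, the place to look is any `Writer`, `PrintStream` or `println` call in the path, or the tool used to transfer or inspect the file.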