alpha-lemming

Mac OS X Lion: Problems with Umlauts and ISO8859

Hello Experts,

I have a tar archive created on an old Unixware machine using ISO8859-1 encoding. When I try to extract it under Mac OS X Lion, German umlauts in file names come out wrong. For instance:

I open a Terminal.app window with the encoding set to "ISO8859-1" and set my locale like this:
export LANG=de_DE.ISO8859-1
export LC_ALL=de_DE.ISO8859-1

Then I take a look at the archive:

dhcp202:Downloads frank$ tar tvf backup.tar home/frank
x home/frank/
x home/frank/Präferenzen/

Note that the "ä" is displayed correctly.

I then unpack the archive with:

tar xvf backup.tar home/frank

The "ä" is also displayed correctly in the output from tar, but when I list the directory contents, I see:

dhcp202:Downloads frank$ ls home/frank
Pr%E4ferenzen 

I've unpacked the archive on Linux and Unixware machines with no problems. Could it be something with HFS+?

Thanks!
ASKER CERTIFIED SOLUTION
gheist

alpha-lemming

ASKER

I tried several other utilities, including 7zip and GNU tar built from source; none of them helped.
LANG=C pax -r -f file.tar
Hmm, when pax encounters one of the files in question, it prints:

pax: Invalid header, starting valid header search

and skips extracting the file.
Does setting the encoding to win1252 help?

You might need GNU tar and/or pax to extract non-POSIX tar files (those with non-ASCII characters in file names).
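
A note on the "%E4" in the listing above: E4 is the hexadecimal value of the byte that encodes "ä" in ISO8859-1. HFS+ expects file names to be valid UTF-8, and a lone 0xE4 byte is not, so the escaped name is most likely the file system's way of storing a byte it could not interpret. The following is only a minimal sketch of the re-encoding involved, assuming iconv is installed; latin1_to_utf8 is a hypothetical helper name, not part of tar or pax.

```shell
# latin1_to_utf8 is a hypothetical helper, not part of tar or pax.
# It re-encodes its argument from ISO8859-1 to UTF-8 using iconv.
latin1_to_utf8() {
    printf '%s' "$1" | iconv -f ISO-8859-1 -t UTF-8
}

# Build the directory name as it is stored in the archive:
# octal 344 is byte 0xE4, which is "ä" in ISO8859-1.
latin1_name=$(printf 'Pr\344ferenzen')

# Re-encode it; in UTF-8 the same letter is the two bytes 0xC3 0xA4.
utf8_name=$(latin1_to_utf8 "$latin1_name")

# Dump the bytes before and after conversion for comparison.
printf '%s' "$latin1_name" | od -An -tx1
printf '%s' "$utf8_name" | od -An -tx1
```

The converted name could then serve as the target of a plain mv for a directory that was extracted under its escaped name. Whether that is practical for a whole tree depends on how many names are affected.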