Solved: Decimal Back To Packed Decimal

Hi scogger1974. The question does not quite make sense to me. It makes sense to convert an integer into packed decimal. It also makes sense to interpret a byte as a character and to store the character with that Unicode value using the UTF-8 encoding. But it does not make sense to do both.

The reason is that the digit pairs 80 to 99, stored as packed decimal in one byte (0x80 to 0x99), use the 8th bit of the byte. A Unicode character with such a value must be stored in 2 bytes when using UTF-8. So an integer that has 4 digits (for example) can occupy 2, 3, or 4 bytes in your UTF-8 file. This is not practical.
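A minimal sketch of the size problem described above, written for Python 3 (the thread itself uses Python 2). The helper name utf8_length_of_packed is made up for this illustration; it treats each packed byte as the Unicode character with the same ordinal and measures the UTF-8 result.

```python
def utf8_length_of_packed(packed: bytes) -> int:
    """Return the UTF-8 size of packed bytes treated as code points 0..255."""
    text = packed.decode("latin-1")  # one character per byte, same ordinal
    return len(text.encode("utf-8"))

low = bytes([0x12, 0x34])    # digits 1234: both bytes are below 0x80
mixed = bytes([0x12, 0x99])  # digits 1299: 0x99 has the high bit set
high = bytes([0x98, 0x99])   # digits 9899: both bytes have the high bit set

print(utf8_length_of_packed(low))    # 2
print(utf8_length_of_packed(mixed))  # 3
print(utf8_length_of_packed(high))   # 4
```

The same four digits take a different number of bytes depending on their values, which is why a fixed-width record cannot be stored this way.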

I guess that you want the text to be part of an XML file (XML uses UTF-8 as the default encoding unless stated otherwise explicitly). In that case, you should stick with the string representation of the number using str(n).

Anyway, the integer to packed decimal as 8bit string could be obtained by the following code:

w = 4 # required width
i = 3 # let's store it as 0003 (i.e. 4 positions for w == 4)
fmt = '%%0%dd' % w # formatting string like '%04d'
s = fmt % i # so, render it as a string '0003'

buf = [] # used as a buffer of resulting bytes represented as characters

for pos in xrange(0, w, 2): # pos is the index of the upper digit of each pair

    d1 = s[pos] # upper digit
    d2 = s[pos+1] # lower digit of the byte

    i = int(d1) << 4 # upper half of the byte
    i += int(d2) # lower half of the resulting byte

    c = chr(i) # interpreted as an 8-bit character
    buf.append(c) # next byte for the result

result = ''.join(buf) # join the list of bytes represented as character into one string

print repr(result) # print representation of the string

Well, you could use unichr(i) instead of chr(i) and result = u''.join(buf) then store the unicode string using UTF-8. But again, it probably does not make sense.

Chatable

I do not fully understand your question but it seems that the struct module should be useful to you.
For example:
>>> import struct
>>> spam = "\x24\x03" # 804 on a little-endian machine (just an example)
>>> eggs = struct.unpack("=H",spam)[0]
>>> eggs
804
>>> foo = struct.pack("=H",eggs)
>>> foo # '$' is the character with code \x24
'$\x03'

pepr

Correction: The UTF-8 format for encoding Unicode characters may be a bit more complicated than stated above. See http://en.wikipedia.org/wiki/UTF-8

scogger1974

ASKER

The reason is to store it back to the file in the same form, with potential changes. That's all. So (and tell me if I'm wrong) it would make sense to convert it back into the same format I originally read it from.

But with the file access maybe it does make more sense to write it as a byte back to the file .. I'm not too sure

I just want it to be still readable to my unix Software....

Thanks.. I will look at the above code and try it out..

scogger1974

ASKER

Chatable.

I'm not sure that works. If you look at the link to the old question, it seems that .pack is not what I'm looking for.

scogger1974

ASKER

pepr, it is not XML. It is a C-ISAM file from Informix, and the rendering in a hex editor is, I guess, what I'm trying to get at. If you render the DB file as a hex page, you will see the numbers as they appear in the database, though without the precision. Fortunately I have the precision information in the data dictionary, so I only need to know the numbers and I can apply the precision to them.

Thanks.

pepr

OK. But the hex editor reads binary information and only displays it on the screen in hex form. So you probably only want to write the result of the previous code into the file in binary mode.

f = open('filename', 'wb')
f.write(result)
f.close()

Try it and have a look at the hex editor.
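If no hex editor is at hand, the write can also be checked from Python itself. The following is a rough Python 3 sketch, assuming result holds the two packed bytes of 0003 from the earlier code; the temporary file name packed.bin is only a placeholder.

```python
import os
import tempfile

result = bytes([0x00, 0x03])  # packed form of 0003 (assumed input)

path = os.path.join(tempfile.mkdtemp(), "packed.bin")  # placeholder file name
with open(path, "wb") as f:   # "wb": the bytes are written unchanged
    f.write(result)

with open(path, "rb") as f:   # "rb": read them back without any decoding
    data = f.read()

print(data.hex())  # 0003
```

The hex() output shows the same digit pairs that a hex editor would display for the file.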

pepr

Of course, the code could be written in the more dense form. Try the following function

def intToPackedDecimal(i, width=4):
    assert width % 2 == 0
    fmt = '%%0%dd' % width # formatting string like '%04d'
    s = fmt % i # so, render the number as a string '0003'
    buf = [] # used as a buffer of resulting bytes represented as characters
    for pos in xrange(0, width, 2): # pos is the index of the upper digit of each pair
        buf.append(chr((int(s[pos]) << 4) + int(s[pos+1])))
    return ''.join(buf)

f = open('filename', 'wb')
f.write(intToPackedDecimal(3210))
f.write(intToPackedDecimal(1))
f.write(intToPackedDecimal(9999))
f.write(intToPackedDecimal(8888, 20)) # here use 20 digits instead of 4
f.close()
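For readers on Python 3, where chr() no longer produces byte strings, the same idea might look like the sketch below. The name int_to_packed and the extra range checks are additions for illustration, not part of the original function.

```python
def int_to_packed(value: int, width: int = 4) -> bytes:
    """Pack a non-negative integer into width/2 bytes, two decimal digits per byte."""
    if width % 2 != 0:
        raise ValueError("width must be even")
    digits = "%0*d" % (width, value)   # zero-padded, e.g. '0003'
    if value < 0 or len(digits) != width:
        raise ValueError("value does not fit in the requested width")
    # Upper digit goes into the high nibble, lower digit into the low nibble.
    return bytes((int(digits[k]) << 4) | int(digits[k + 1])
                 for k in range(0, width, 2))

print(int_to_packed(3).hex())      # 0003
print(int_to_packed(3210).hex())   # 3210
print(int_to_packed(8888, 20).hex())
```

Returning bytes means the result can be passed straight to a file opened in 'wb' mode.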

Chatable

I don't see anyone mentioning .pack in the other question.
If you're trying to convert in big endian (i.e. 3 becomes "\x00\x03" rather than "\x03\x00") try struct.pack(">H",foo) (instead of "=H").
Also, if you just want to save a Python object to a file so you can load it later, take a look at the marshal and pickle modules.
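To make the byte-order point concrete, here is a small Python 3 sketch using the explicit "<" and ">" prefixes of the struct module. Note that struct writes the binary value of the integer, which is a different thing from packed decimal: for 804 the binary bytes are 0x03 and 0x24, while packed decimal digits 0804 would be the bytes 0x08 and 0x04.

```python
import struct

value = 804  # 0x0324

little = struct.pack("<H", value)  # least significant byte first
big = struct.pack(">H", value)     # most significant byte first

print(little)  # b'$\x03'
print(big)     # b'\x03$'
print(struct.unpack("<H", little)[0])  # 804
print(struct.unpack(">H", big)[0])     # 804
```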

scogger1974

ASKER

2Y0466 2Y0466 ~CORDOBA SWIVEL CHAIR ASCP0 WODPP

Here is a single record from the flat text. Most of the items in this record are numbers, and most of the actual characters cannot be rendered on the screen. I get the precision and the decimal place from the data dictionary stored elsewhere. Currently I use this to convert it to a decimal:

class converPack: # This converts IBM Packed Decimal as it works with a C-ISAM Database
    def __init__(self,packed,precision,decimalplace):
        unpacked=""
        sign=""
        #print "PACKED-->" + packed
        precision=precision-2
        if packed != "":
            for c in packed: # for each byte
                i = ord(c) # convert it to number
                d1 = (i & 0xf0) >> 4 # value of upper 4 bits
                d2 = i & 0x0f # value of lower 4 bits
                unpacked = unpacked + str(d1)+str(d2) # add the upper and lower to the string
            if unpacked[precision+1] == "5": sign="-" # 3 Means Positive and 5 Means Negative
            unpacked=unpacked[0:precision-decimalplace] + "." + unpacked[precision-decimalplace:precision] # place the decimal
            unpacked=sign+unpacked[0:precision+1].lstrip('0')# strip the beginning Zero's
        self.strvalue = unpacked # send the unpacked result back
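For comparison, a compact Python 3 sketch of the same unpacking step. It assumes the layout described in the comment above (the final nibble is the sign, 3 for positive and 5 for negative) and takes the number of decimal places as an argument; the function name unpack_decimal and the use of the decimal module are choices made for this illustration only.

```python
from decimal import Decimal

def unpack_decimal(packed: bytes, decimals: int) -> Decimal:
    """Decode packed digits whose final nibble is the sign (3 = plus, 5 = minus)."""
    nibbles = []
    for byte in packed:
        nibbles.append(byte >> 4)      # upper 4 bits
        nibbles.append(byte & 0x0F)    # lower 4 bits
    *digits, sign = nibbles            # last nibble is the sign code
    if sign not in (3, 5) or any(d > 9 for d in digits):
        raise ValueError("not a packed decimal in the assumed layout")
    number = int("".join(str(d) for d in digits) or "0")
    if sign == 5:
        number = -number
    return Decimal(number).scaleb(-decimals)  # shift the decimal point

print(unpack_decimal(b"\x01\x23\x43", 2))  # 12.34
print(unpack_decimal(b"\x01\x23\x45", 2))  # -12.34
```

Working on bytes directly avoids the ord() call, because iterating over a bytes object in Python 3 already yields integers.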

What I would like is a function that can put a changed number back into the file.

thats the best way I can explain it.

pepr

What is the data structure that you read the record into from the database? Is it a string (I mean the non-unicode Python string)? Is there a part of the string (i.e. index values) where the packed value is to be stored?

Please also show how +12.34 and -12.34 are stored in the packed form. Is it '\x01\x23\x43' and '\x01\x23\x45', with the information about 2 decimal places kept somewhere else?

pepr

Firstly, I do not know the details of encoding of your packed decimal to bytes. My guess is marked by ??? in the following code.

I would recommend slightly rewriting your code. I guess that a PackedDecimal class would be handy for storing the packed value plus the number of decimal places. Converting from the packed decimal is then similar to your original code. Another method is used to set the new value from an int and the number of decimals. As a string cannot be modified, it is often easier to convert it to a list of characters and back. Try the following self-standing script to get the idea.

class PackedDecimal:

    def __init__(self, packedValue, decimals):
        '''Constructor stores the packed value and the number of decimal places.'''
        self.packedValue = packedValue
        self.decimals = decimals

    def __repr__(self):
        '''Returns quoted representation of the tuple with packed value plus decimals.'''
        return "\"(%s, %d)\"" % (repr(self.packedValue), self.decimals)

    def __str__(self):
        '''Returns human-readable string representation.'''
        return self.asReadableString()

    def asReadableString(self):
        '''Converts the packed value into human readable string.'''
        unpacked = []
        for c in self.packedValue: # for each byte
            i = ord(c) # convert it to number
            unpacked.append(str((i & 0xf0) >> 4)) # digit from upper 4 bits
            unpacked.append(str(i & 0x0f)) # digit from lower 4 bits

        # The last value encodes the sign ???
        sign = ''
        if unpacked[-1] == '5': # minus sign
            sign = '-'

        # Chop the last 2 characters that were used to encode the sign. ???
        del unpacked[-2:]

        # Insert the decimal point (counting from the end, so that
        # decimals == 0 puts the point after the last digit).
        unpacked.insert(len(unpacked) - self.decimals, '.')

        return sign + (''.join(unpacked)).lstrip('0')

    def setFromInt(self, intValue, precision, decimals):
        assert precision % 2 == 0 # including one byte for the sign ???

        # Remember the sign and get the absolute value.
        negative = False
        if intValue < 0:
            negative = True
            intValue = -intValue

        fmt = '%%0%dd' % (precision-2) # formatting string like '%04d'
        print fmt # debugging output only
        s = fmt % intValue # so, render the number as a string '0003'
        buf = [] # used as a buffer of resulting bytes represented as characters
        for i in xrange(0, precision-2, 2):
            buf.append(chr((int(s[i]) << 4) + int(s[i+1])))

        # Append the sign. ???
        if negative:
            buf.append('\x05')
        else:
            buf.append('\x03')

        # Store the result into the data member.
        self.packedValue = ''.join(buf)
        self.decimals = decimals

if __name__ == '__main__':
    # test

    packedVal = '\x12\x34\x03' # 12.34 ???

    pd = PackedDecimal(packedVal, 2)
    print pd.asReadableString()

    pd.setFromInt(-1234, 10, 2) # precision including the 2 positions (1 byte) for the sign
    print pd.asReadableString()
    print pd # notice here -- the same as previous "for free"!
    print repr(pd)

    # to simulate your record...
    record = '\x00\x00\x00\x12\x34\x03' + 'some text '
    print repr(record)
    d = PackedDecimal(record[0:6], 2)

    # Do some changes...
    d.setFromInt(-987654, 12, 4) # -98.7654

    # The record in the string must be converted to list of characters
    lst = list(record)
    print lst

    # The part of the list must be replaced by new bytes that represent new value...
    lst[0:6] = list(d.packedValue)
    print lst

    # ... and the list must be converted to resulting string (the modified record)
    record = ''.join(lst)
    print repr(record)
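In Python 3 the list-of-characters detour is not needed, because bytearray is a mutable byte sequence. A rough sketch of the same record update follows; the replacement bytes are the ones the script above would produce for -987654 with precision 12, and the length check is an extra safeguard added here.

```python
record = bytearray(b"\x00\x00\x00\x12\x34\x03" + b"some text ")
new_field = b"\x00\x00\x98\x76\x54\x05"  # packed -987654 with a trailing sign byte

# The field must keep its width, otherwise the rest of the record would shift.
if len(new_field) != 6:
    raise ValueError("replacement must keep the field width")

record[0:6] = new_field   # replaced in place

print(bytes(record))
```

Because the slice and its replacement have the same length, the text that follows the packed field stays at the same offsets.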

scogger1974

ASKER

>What is the data structure that you read the record into from the database. Is it a string (I mean the non-unicode >Python string)? Is there a part of the string (i.e. index values) where the packed value is to be stored?

>Please, show also the case how the +12.34 and -12.34 is stored in the packed form. Is it '\x01\x23\x43' and '\x01\x2\x45' with information about 2 decimal places somewhere else?

That is absolutely correct. That is exactly how it stores it. I figured this out in the hex editor.

I am trying the above script. It is a little advanced for me, since I'm only starting in Python, but I'll give it a look.

scogger1974

ASKER

Okay.. I tried the code above several different ways...

A couple of things.

The repr of the part of the record containing the packed decimal doesn't always seem to have numbers; sometimes it has letters as well.

\x01\x23\x45

could look like

\x01\x2e\x45

and so forth. It seems that it may have to do with the type of UTF encoding, or at least I believe it does.

The code that I use to convert it now uses the ord command to turn each character into a number, then takes the upper and lower bits of that number. Maybe that has something to do with it, but I'm not sure. It seems that your code almost does what I need it to do.

Is there a command that is the opposite of ord, so that I can do the reverse: feed it the upper bits and the lower bits to come up with the character I need?

Thanks for all your help so far.

pepr

Oh, I see. The repr() built-in function returns a string that, when passed to the Python interpreter, would be converted into the same internal representation as the argument. This means that, for example, '\x42' means exactly the same as 'B', because B's ordinal number is 42 in hex or 66 in decimal. The repr() of a long string returns partly escape sequences and partly normal characters. If a character cannot be printed, the escaped hex form is displayed instead of that single character. It is only a human (programmer) readable form of the internal representation that is unambiguous. In the code it was used only to display a string that could not otherwise be displayed. Try the following code and look at repr.txt, including its end.

------------------------------
f = open('repr.txt', 'w')
for i in xrange(256): # from 0 to 255
    c = chr(i) # i.e. one-letter string
    f.write("%3i: %02x '%c' %s\n" % (i, i, c, repr(c))) # dec, hexa, as char, as repr of the char

f.write('-' * 70 + '\n')

lst = [chr(i) for i in xrange(256)]
s = ''.join(lst) # one string with all 8bit characters
f.write(repr(s))

f.close()
------------------------------
Also \x2 means the same as \x02 in hex, the same way as 2 is equal to 02 in decimal (though in a Python string literal you must write both hex digits, i.e. '\x02'). 'e' is equal to '\x65'. Try in interactive Python:

>>> repr('\x65')
"'e'"
>>> repr('e')
"'e'"

The ord() function converts the one-letter string into its ordinal number. The inverse function is chr(), but it works only for arguments in the range 0 to 255 (i.e. not for unicode characters).
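A short Python 3 sketch of the point about repr(): the three bytes below are the packed value from the question, and repr() shows two of them as ordinary characters simply because 0x23 and 0x45 happen to be printable ('#' and 'E'). The variable name field is arbitrary.

```python
field = bytes([0x01, 0x23, 0x45])

print(repr(field))          # b'\x01#E'
print(field.hex())          # 012345
print(ord("E"), chr(0x45))  # 69 E
print(field == b"\x01#E")   # True
```

The data is identical in every line; only the way it is displayed differs, so letters in the repr do not indicate an encoding problem.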

pepr

Correction. The \x2e is the character '.' (dot). If this is the case, then '\x01\x2e\x45' is probably not the form of packed decimal that you expect. But it could be a representation of 1.45 if the format permits explicit use of the dot in the number. I am not an expert on the packed decimal representation.

scogger1974

ASKER

Is there anything you can do with the example below? The idea is to add the correct number of zeros to the beginning of the string and the sign to the end of the string to come up with the answer, then take the numbers and turn them into characters, although the latter is what I don't know how to do yet. I just took a stab in the dark below.

class converPack2:
    def __init__(self,unpacked,precision):
        sign=""
        precision=precision-2
        up = [unpacked.replace('.', '') for s in unpacked]
        buf = []
        lst = []
        thb=''
        if unpacked != ".":
            if unpacked[0]=="-":
                up.append("5")
            else: up.append("3")
            for i in range(precision-len(up)):
                up.insert(0,"0")
            print ''.join(up)
            thb=eval(repr(''.join(up)))
        self.strvalue = thb # send the unpacked result back

thanks

pepr

Does it mean that the packed value can contain the decimal point? Should not then the number of decimals be derived from the number itself? Some notes to your code:

up = [unpacked.replace('.', '') for s in unpacked]

This could be also written as

up = []
for s in unpacked:
    up.append(unpacked.replace('.', ''))

which is wrong. Why? You are taking the characters of unpacked one by one into s. Then you take the whole of unpacked, replace the dot by nothing, and append that value to the list up. You do not use the character s, so the for simply loops as many times as there are characters in unpacked. The body of the loop is unrelated to the character s. The replacement is done in each iteration, which is unnecessary. Finally you get a list of equal strings, each simply equal to unpacked without the dot.

You probably wanted to write something like this:

up = []
for s in unpacked:
    if s != '.':
        up.append(s)

which can be written as

up = [ s for s in unpacked if s != '.' ]

After that, you do not want to work with unpacked but with up (i.e. unpacked without the dot, broken into characters).
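The difference between the two comprehensions is easy to see by running both on a sample value (the input "12.34" is just an example):

```python
unpacked = "12.34"

wrong = [unpacked.replace(".", "") for s in unpacked]  # s is never used
right = [s for s in unpacked if s != "."]              # filters character by character

print(wrong)  # ['1234', '1234', '1234', '1234', '1234']
print(right)  # ['1', '2', '3', '4']
```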

buf = []
lst = []
thb=''

not used later.

You can use up[0] the same way as unpacked[0], but after appending the encoded sign, you have to remove the '-' from the up:

if up[0] == '-':
    up.append('5')
    del up[0] # remove the '-' only when it is present
else:
    up.append('3')

Use xrange() instead of range(). The second for loop seems O.K. otherwise.

The eval(repr(anything)) for anything equal to string is the string itself. The ''.join(up) is the string, so you do not get anything more. So, the command should be left off and the final line should be self.strvalue = ''.join(up)

scogger1974

ASKER

I guess the real goal is this... parsing the String is secondary.

What I really need is a way to do this.

I already can convert ' ' to a decimal.

Lets say the above converts to 123.50. Which I can already do with the routine that I have.
I may get 0000123503 with my converter ... then in my data dictionary it tells me that the precision is 9
So now I know that I am dealing with 9 digits + a Sign .. Then my data dictionary tells me that I have a
decimal place of two . so now I know to put the decimal between the 3 and the 5. Okay.. thats great
but Now I want to take 123.50 and lets say add 20 to it. Okay now I want to take 143.50 and say let me
apply the reverse to it as I did to convert it... so I remove the decimal and put in the sign which gives me
143503 okay but I need to add zero's to the front of this string so it matches the other so now I get
0000143503 ... great.. almost there.. now.. here is where i run into my stumbling block. how can i get
0000143503 to give me the equivalent of the string ' ' don't know... so I try this. I try to see if i can
take 0000123503 and get the same string I got before I converted.. So I try every way possible to make
lets say '@sB' Converts into a number then I convert it to a ########## string then try to convert it back to
the string it was. Cant do it.. I've tried several ways even used the code you gave me but I never can get
that far.

thats my thought process and thats where I'm stuck..

Thanks for your help
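For what it is worth, in Python 3 the last step described above (digit string to bytes) can be sketched with bytes.fromhex(), since every decimal digit is also a valid hex digit and each pair of digits maps to one byte. The digit string is the one from the post; the string must have an even number of digits.

```python
digits = "0000143503"           # nine digits plus the sign digit 3

packed = bytes.fromhex(digits)  # each pair of digits becomes one byte
print(packed)
print(packed.hex())             # 0000143503
print(packed.hex() == digits)   # True
```

The hex() method performs the opposite conversion, so the pair gives a quick round-trip check.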

scogger1974

ASKER

Got It !! .. thanks for all your help.. the below is exactly what I need to convert it back to the string format.
it works like a charm...
I feed it a decimal such as "123.40" or "-123.40" and it converts it back.

class converPack: # This converts a Decimal to IBM Packed Decimal as it works with a C-ISAM Database
    def __init__(self,unpacked,precision):
        unpacked= unpacked.replace(".","")
        if unpacked[0] == '-':
            sign="5"
            unpacked= unpacked[1:len(unpacked)]
        else:
            sign="3"
        precision=precision-1
        packed = ''
        for g in range(1,precision-len(unpacked)):
            unpacked = "0" + unpacked
        unpacked=unpacked+sign
        for i in range(2, len(unpacked)-1):
            byte = (int(unpacked[i]) << 4) + int(unpacked[i+1])
            packed = packed + chr(byte)
        self.strvalue = packed

pepr

You can replace the loop

for g in range(1,precision-len(unpacked)):
    unpacked = "0" + unpacked

by prepending a substring of zeros. Because the range starts at 1, the loop runs (precision-len(unpacked)-1) times, so the equivalent is:

unpacked = ("0" * (precision-len(unpacked)-1)) + unpacked

For example '-' * 70 is a string consisting of 70 dashes.

And use xrange instead of range. In Python 2 it works logically the same way in a for loop and it is more time and memory efficient.
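As an aside, the standard string methods zfill() and rjust() do the same zero padding without computing the count by hand. A small sketch with made-up values (width here stands for the number of digit positions in front of the sign):

```python
unpacked = "12340"
width = 9   # hypothetical number of digit positions before the sign

padded_by_repeat = "0" * (width - len(unpacked)) + unpacked
padded_by_zfill = unpacked.zfill(width)
padded_by_rjust = unpacked.rjust(width, "0")

print(padded_by_repeat)  # 000012340
print(padded_by_zfill)   # 000012340
print(padded_by_rjust)   # 000012340
```

All three leave the string unchanged when it is already at least width characters long.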

scogger1974

ASKER

Okay I will try, also I made a mistake with range it should have read
range(0, len(unpacked)-1,2)

I put the 2 in the wrong place.

didnt realize it until I got the results from the modified data file

Thanks for all your help.
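Putting the pieces of the thread together, a Python 3 sketch of the packing direction might look as follows. It is not the asker's final code: the name pack_decimal, the digits parameter (number of digit nibbles in front of the sign), and the validation are assumptions made for this example. It uses the layout confirmed earlier, with a sign nibble of 3 for positive and 5 for negative, and steps through the string two characters at a time, which is the range correction mentioned above. The caller is expected to pass the text with the right number of decimal places already, since the decimal point is simply dropped.

```python
def pack_decimal(text: str, digits: int) -> bytes:
    """Pack a decimal string such as '-123.40' into digits + 1 nibbles.

    Assumed layout: zero-padded digits followed by a sign nibble
    (3 = plus, 5 = minus), so digits + 1 must be even.
    """
    if (digits + 1) % 2 != 0:
        raise ValueError("digits + 1 must be even to fill whole bytes")
    sign = "5" if text.startswith("-") else "3"
    body = text.lstrip("+-").replace(".", "")
    if not (body.isascii() and body.isdigit()) or len(body) > digits:
        raise ValueError("value does not fit the field")
    nibbles = body.zfill(digits) + sign
    # Step by 2 so that each byte consumes a fresh pair of nibbles.
    return bytes((int(nibbles[k]) << 4) | int(nibbles[k + 1])
                 for k in range(0, len(nibbles), 2))

print(pack_decimal("12.34", 5).hex())    # 012343
print(pack_decimal("-12.34", 5).hex())   # 012345
print(pack_decimal("123.50", 9).hex())   # 0000123503
```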