asked on

Reversing bytes

I need to write a function that, given a byte, will return the byte with bits in reverse order. For example, if it is given
0101 1111
the return value would be
1111 1010
Is there a better way to do this than checking bits in a loop and shifting the source byte and the target byte ?
I know this sounds like homework, but it isn't. The reason I need to do this is that I want to read data from a device through LPT. I have a function that reads a word, but, inside that word, one byte is reversed. I need to un-reverse it QUICKLY, and the shifting loop takes too much time.
I realize that I can do this faster in Assembly using the carry register, but I was looking for a way to do this in plain C, with & | ~ ^ and so on.

Sinclair

ASKER

Edited text of question

Mirkwood

Hmmm, I don't this that you can do it in less than 24 operations:
8 and operations to determine the bits and 8 shift operations to put them in the correct place and 8 to add the parts

Mirkwood

See http://www.specs.de/users/danni/appl/soft/mirror/mirror.lst
for the fastest assembly code to do it.

rbr

The fastest way is to use a unsigned char array[256] were the reverse info is stored

unisgned char reverse[256]=
{0, 128, 64, 192, 32, 160, ....}
I think there will be no faster way.

ozo

You could do it in C with 4 shifts
but an array lookup is probably the fastest on most architectures

rbr

Mirkwood

BTW: Sinclair, if you look at the assembly code, you will see that there is a faster algorithm. The assembly code takes 17 clock cycli.

rbr

But the fastest algorithm in assembly code is the first one which uses a table as I mentioned. A good C compiler will generate a very fast code with the table.

Mirkwood

Yes, but if you use a table you can better use a full table of all 256 values and it saves a lot of code.
reva = convtable[a]

Mirkwood

Sorry rbr, now I understand your solution. That is what you proposed. I'm out of here. Your answer is the best

alexo

https://www.experts-exchange.com/Q.10074236

Sinclair

ASKER

Ok, I am really intrigued now. I agree that a table lookup SEEMS like the fastest idea, but it has to get its values from RAM, while other solutions that people gave (using shifts and masks) can work entirely in the registers, which MAY be faster...

Sinclair

ASKER

Actually, I believe ozo's solution from https://www.experts-exchange.com/Q.10074236 takes the cake:

((input*0x202020202)&0x10884422010)%0x3ff

My "aside" question to ozo is:
How in the world would you come up with that ? Can you explain step-by-step what you did ? I think I understand this conceptually, but I want to be sure... Thanks :-)

rbr

This table needs 256 bytes and it should be no problem to hold them permanent in the RAM to avoid paging. And the access to this array is only one assembler instruction. Look at http://www.specs.de/users/danni/appl/soft/mirror/mirror.lst where the table is the fastest algorithm.

rbr

input*0x202020202)&0x10884422010)%0x3ff needs one integer multiplikation and division. This wouldn't be faster than a table.

Sinclair

ASKER

Sorry to be such a pain in the ass, but:

I assume you are referring to
;C: ACC:
MOV C, ACC.1 ;1 76543210
RLC A ;7 65432101
MOV ACC.2, C ;7 65432701
MOV C, ACC.3 ;2 65432701
RLC A ;6 54327012
MOV ACC.4, C ;6 54367012
MOV C, ACC.5 ;3 54367012
RLC A ;5 43670123
MOV ACC.6, C ;5 45670123
SWAP A ;5 01234567
RET

(quoted from the page).

Will this work on a PC (x86) ? In particular, can you access individual bits like that ("ACC.1"), and is there a SWAP command ? It's been a while since I used Assembly on anything that big...

ASKER CERTIFIED SOLUTION

rbr

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

rbr

Read the all the comments. ozo and mirkwood think that a table would be the fastest solution on the most architectures.

alexo

rbr, what's wrong with the XLAT x86 opcode?

sergelebel

Sinclair.. here is an Assembler function to be called from C
which will also get the job done.. it's not as complicated to understand and should be easy
to incorperate into your code, Assemble it and add
the Obj when you compile your program...

;-- REVERSE: Reverse the bits in AL -----------------------------------
;
;Usage: X = _ReverseByte (Y);
;
;Declarations: unsigned int X,Y;
;
;Note: the value in Y must be between 0 and 255 since we are
; only talking about 1 byte..
;

PUBLIC _ReverseByte
_ReverseByte proc near
push bp ;save BP register
mov bp,sp ;set BP to SP (stack pointer)

mov ax,[bp+4] ;AX = value of passed parameter

cmp ax,0 ;is Prameter is Zero
je reverse_099 ;don't waiste your time

push bx ;save BX
push cx ; CX

xor ah,ah ;clear AH for bit setting
mov cx,8 ;8 bits to reverse, we start at LSB
mov bl,00000001b ;walk threw AL with BL lsb..msb
mov bh,10000000b ;walk threw AH with BH msb..lsb

reverse_001: test bl,al ;is AL (bit) set or not
jz reverse_002 ;no, skip the AH bit setting
or ah,bh ;yes, set the bit in AH with BH
reverse_002: shr bh,1 ;traverse bits from lsb to msb
shl bl,1 ; bits from msb to lsb
loop reverse_001 ;repeat for 8 bits

xchg al,ah ;reversed AL is AH, switch into AL
xor ah,ah ;clear high working register

pop cx ;restore destroyed registers
pop bx ;

reverse_099: pop bp ;restore BP
ret ;leave Result in AX
_ReverseByte endp

any questions...Assemblr@hotmail.com

sergelebel

Sorry...made a mistake...

the Usage should say X = ReverseByte (Y);

Sinclair

ASKER

Ok, it appears that rbr's answer (a lookup table) is still the fastest, so I will give him the points. I wish I could give them to all you people, because your answers are also very interesting, but the system won't allow that.
Anyway, I am STILL wondering about ozo's solution, so, ozo, if you're reading this, please email me at
bugmaster@earthlink.net
thanks.

alexo

>> but the system won't allow that.
posting a polite request (0-point question) to Linda in the customer support area usally takes care of that. Make sure to include names and the Q. number.

alexo

Does this help?
http://support.microsoft.com/support/kb/articles/q129/3/15.asp

(The first access of MSDN-online will require a one-time registration process)

ozo

I'm not sure what there is to explain about the multiplication solution,
other than observing what it does for each bit, and noting that the result for the sum of bits is the sum of the results for each bit.
basically the multiply duplicates the byte at different offsts,
the and selects bits in key positions, and the moduo collects them.

(BTW, I have seen at least one architecture on which the multiply could potentially be faster than the lookup,
but I don't think I've ever seen a web browser written for that architecture,
so if you're reading this, chances should be good that the lookup will be faster on your machine)

alexo

What is this architecture?

Sinclair

ASKER

Ozo, can you maybe explain the thought process you used to arrive at the multiplication solution ? How does the modulo "collect" the bits ?