Can't understand this - bitwise ops
Posted on 2004-04-29
I'm just not making any sense out of this code and that really bothers me :) See if you can understand why it works. If you can please explain it to me!
This is basically the core loop of microsoft's memchr crt function.
the char we're looking for (rememeber it's ASCII so only 7 bits are used) is stored in every byte of the dword N
IE if your looking for 'C' (43 in hex), then N = 0x43434343
read 4 bytes as a dword into A ((7e432e5a, 7e3c2e5a) see below)
I worked this out with hex numbers on a calculator for two values of A, in brackets next to the ops. First value is 4 bytes, one of which is 0x43, the second value is 4 unmatching bytes.
xor A with N. A becomes (3D006D19, 3D7F6D19). The matching byte of the dword is completely cleared now.
Set a second dword, B equal to the constant 7EFEFEFF (why this value?)
Add A into B.
xor B with -1 (0xFFFFFFFF) B becomes (C2FF92E6, C28092E6)
xor B with A, B becomes (C2FF92E6, 7EFEFEFE)
and (&) with 0x81010100. If the result is zero, the character is not in the dword. If it is zero, the character may* be in the dword.
So that's it. Why those values for the constants? I need to modify this code to look for an entire byte (not just the first 7 bits) in 4 bytes of a dword. I suspect it's not just a 'plug and play' thing. It might not even be possible to use this code to find an enitre byte. It might be making use of the fact that the 8th bit will be cleared.
*there is some further processing, which may reject A as containing the char, included in assembler below. ecx is A. bl is 'C', the character we're looking for.
xor cl,bl ; is it byte 0
je short byte_0
xor ch,bl ; is it byte 1
je short byte_1
shr ecx,10h ; is it byte 2
je short byte_2
xor ch,bl ; is it byte 3
je short byte_3
jmp short main_loop ; taken if bits 24-30 are clear and bit
; 31 is set