Solved

bit manipulation in C

Posted on 2006-06-15
17
2,001 Views
Last Modified: 2012-06-27
Hi,
I have a data structure to represent 1024 bits
something like:
unsigned long myarray[16] ;

How do I efficiently iterate through it to find which bits are set.
Any of the bit manipulation macros such as "test_bit" are sequential and hence makes it an o(n) operation.

Is there something I can do like setting a mask every time a bit gets set and then use that mask to find
out later  all the set bits.


0
Comment
Question by:sanjay_thakur
  • 5
  • 4
  • 3
  • +2
17 Comments
 
LVL 15

Expert Comment

by:bpmurray
ID: 16915874
Iterating using >> or << isn't terribly efficient, so I'd add some optimizations - check for 0 and ~0 to get rid of the all zero/one values. You could also use this method in test, to see which values are the most common, and include these in the explicit tests.
0
 
LVL 16

Expert Comment

by:PaulCaswell
ID: 16916266
Hi sanjay_thakur,

Try:

#define byte unsigned char
#define bit(n,a) (((byte *)a) [ n / 8 ] & ( 1 << ( 7 - ( n % 8 ) ) )

Paul
0
 
LVL 41

Expert Comment

by:HonorGod
ID: 16916442
#define LEN(x) ( sizeof( x ) / sizeof( x[ 0 ] ) )


int i,j;
for ( i = 0; i < LEN( myarray ); ++i ) {
  int val = myarray[ i ];
  if (  val != 0 ) {
     for ( j = 0; j < 16; ++j ) {
       if ( bit( j, val ) ) {
         printf( "Byte %d  Bit %d  is on\n", i, j );
       }
     }
  }
}
0
 
LVL 4

Author Comment

by:sanjay_thakur
ID: 16920706
For an int there is something like

int countOnes(int num) {
while(num) {
number = number & (number -1);
numOnes++;
}
return numOnes;
}

The run time of this is o(p) where p is the number of 1's in the int.

I was wondering if I can do something similar for the problem in hand. i.e take each array entry which is an unsigned long and check only the bits that are set.

In the above mentioned solution by HonorGod, checking the array element doesn't help if I have a
situation where a bit is set in each array element.

Also I am not looking for implementation of testing a particular bit. I am looking for how to skip the unset bits;
for example if bit 0 and bit 1024 is set how do I skip checking all the bits in between?
0
 
LVL 15

Expert Comment

by:bpmurray
ID: 16921073
Well, you could use a set of arrays to indicate what you want - the usefullness of this solution depends on the meanings of hte individual bits. Say, for example, you wanted to execute a method for each bit that was set, you could do something like (this is probably full of errors - I just typed it it without any checks_:

typedef struct {
     int (*pFn0)();
     int (*pFn1)();
     int (*pFn2)();
     int (*pFn3)();
     int (*pFn4)();
     int (*pFn5)();
     int (*pFn6)();
     int (*pFn7)();
} t_FN;

t_FN funcs[0][8] = { { 0,0,0,0,0,0,0}, {0,0,0,0,0,0,0,MyFN} ...} ,
                           { { 0,0,0,0,0,0,0}, {0,0,0,0,0,0,0,MyFN} ...},
etc.

ProcessResult(unsigned long *foo) {
   int i;
   t_FN x;
   for (i=0; i<8; i++) {
      x = funcs[i][foo[i]];
      for (j=0; i<8; j++) {
          if (x[j])
             x[j]();
     }
   }
}
0
 
LVL 18

Expert Comment

by:JoseParrot
ID: 16923538
Hi,

Sounds an academic challenge.
I can show a constant time O(1) solution, much much better than O(n) or even O(p) if you tell us this is not your homework.

Jose
0
 
LVL 16

Accepted Solution

by:
PaulCaswell earned 25 total points
ID: 16923554
There's an old bit-twiddlers trick I've always wanted to use.

Look at the function:

y = x xor (x-1)

For example, take the bitmap

0110010110101000001000

apply the function and you get

0110010110101000001000
xor
0000000000000000000111

Now that doesnt actually seem to help much but hold on! Add one to it and subtract it from the original number and you've removed the bottom bit of the pattern! Repeat that until the number becomes zero and you know how many bits there were, and you may actually have looped less than n times!

Pretty good if more than usual are zero (or one for that matter). If bits are equally likely to be in either state then you only loop n times so it should work.

Now this is just a guess, I havent tested this but it would be something like:

unsigned long bits = 0x12345678L;
int count = 0;

for ( count = 0; bits != 0; count++ )
{
 bits -= (bits ^ (bits-1)) + 1;
}

This should be O(n) at maximum I think!

If the bits appear in clumps, like in a one-bit deep image, it may be worth inverting the value each time around the loop.

Paul
0
6 Surprising Benefits of Threat Intelligence

All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

 
LVL 16

Expert Comment

by:PaulCaswell
ID: 16923591
Sorry, that was the same algorithm you posted. :-)

Paul
0
 
LVL 4

Author Comment

by:sanjay_thakur
ID: 16924948
In response to JoseParrot 's comments:
This is not an assigment.
Optimization is not just an acamedic thing. It gets used too, in every day S/W engineering.
Bit manipulation and optimization can have multiple approaches and I am trying to pool in different solutiions by the capable people on the forum.
0
 
LVL 18

Expert Comment

by:JoseParrot
ID: 16925668
Hi,

I agree. My question was due the number of students that post homework here. In these cases, I do preffer to show the way to find the solution by themselves rather than offer an idea or the solution. Not intended to be offensive or rude, of course.

Lets go to the problem.
The naive algorithm to show which bits are set has O(n) time complexity, not satisfactory as stated.
As we want a better time, we must pay something to do that. The price is space. So the algorithm proposed has constant time and O(256n) space. (Actually a naive algorithm too, but on space).

To describe the concept, let suppose we need to identify the bits of a single byte.
Tha data structure will be:

char A              // input data
string B[256]    // output data

//initialization
B[0] = ""
B[1] = "1"
B[2] = "2"
B[3] = "1,2"
B[4] = "3"
B[5] = "1,3"
B[6] = "2,3"
B[7] = "1,2,3"
       :
       :
B[255] = "1,2,3,4,5,6,7"

showBits(byte A)
{
   // A has the byte
   string answer = B[A]
   print "Bits set are: "
   print answer
}

By repeating to 128 bytes, to have 1024 bits, the algorithm is:
showBits(A[0],...,A[127])
{
   // Data structure
   char A[128]             // the input data, a chain of 128 bytes
   string B[128][256]   // the output data for strings
   string answer
   char a

   // initialization
   B[0][0] = ""
   B[0][1] = "1"
       :
       :
   B[127][255] = "1015, 1016, 1017, 1018, ..., 1023"

   print "Bits set are: "
   for i = 0, 127          // fixed number of bytes (note 1)
   {
      a = A[i]
      answer = B[i][a]
      print answer
   }
}

Note 1
Strictly speaking, the time complexity is proportional to the number of bits n, so it is O(n). Actually the naive algoritm has a loop from 1 to 1024, that is O(n). The proposed algorithm is O(1) for 1 byte, O(2) for 2 bytes, then it is O(128) for 1024 bytes, that is O(n/8), which is, as per the O notation, still O(n). If our step is size of unsigned long, we have a loop of only 16, then O(n/64), also O(n) by notation. Actually these algorithms are not O(n), they are theta(n) instead.

The memory size is huge compared to the naive time O(n) algorithm, which is just the entry size.
For the proposed solution, for 1024 bits arranged in a chain of 128 bytes, on need:
128 bytes (for A[128] ) + 128 x 256 strings of up 40 bytes (for B[128][256])  that is around 1.25 MB.
The initialization is time consuming, but need to run once a time.

Resume
Both naive and proposed algorithms are O(n) time complexity.
The difference is that naive loops 1024 times and proposed loops 128 times (for 1 byte modulo size) or 16 times (for 8 bytes modulo size). The disadvantage of 8 bits modulo size is the exponential growth of memory needs, that will be 16 x 2^64!! Anyway, it is a matter of balance of speed versus memory.

If the question is just to know how many bits are set to 1, things are easer.
For example, for just one byte:

initialize()
{
   int B[256]         // output data , now a int type

   //initialization
   B[0] = 0
   B[1] = 1
   B[2] = 1
   B[3] = 2
   B[4] = 1
   B[5] = 2
   B[6] = 2
   B[7] = 3
   B[8] = 1
       :
       :
   B[255] = 8
}

showHowManyBits(byte A)
{
   // A has the byte
   int answer = B[A]
   print "Bits set to 1 are: "
   print answer
}

As in the previous example, this also can be extrapolated to 1024 bits.

Hope it helps.
Jose
0
 
LVL 4

Author Comment

by:sanjay_thakur
ID: 16935513
In response to joseParrot
The solution is quiet neat but expensive as you suggested.
The problem is I need this code in a device driver module and I am not comfortable with memory requirement.
So I will need to fall back to some bit twiddling tricks.
0
 
LVL 16

Expert Comment

by:PaulCaswell
ID: 16936815
You can take a part-way approach. Build only a single 8-bit table with 256 entries and lookup each byte in the source.

Paul
0
 
LVL 18

Assisted Solution

by:JoseParrot
JoseParrot earned 25 total points
ID: 16939451
Paul's idea is an improvement in space domain, at the cost of updating the table 128 times, resulting in 128 x 256 = 2^15 operations. This is more than the 2^10 to improve on.
No pain no gain. The smaller the time, the bigger the space.
Anyway, can be the direction to an intermediate solution.

What sanjay_thakur looks for is a tricky solution.
That is, few time, few space and much ingeniuty.
This is no pure logic solution, it is an inspired solution...
When and if I have the insight... I'll be back.

Good luck.

Jose
0
 
LVL 18

Expert Comment

by:JoseParrot
ID: 16939966
Can't resist...
The code:

// FindSetBits.cpp : find set bits and identify them trickly

#include "stdafx.h"
#include <stdio.h>
#include <math.h>

// definition
#define QTBYTE 128
#define QTBITS 8
typedef unsigned char BYTE;

// prototype
void FindBitsSet(BYTE n, int ref);
void FindBitsSetTrick(BYTE n, int ref);

// global
int LogTable[256];

//////////////////////////////////////////
void main(int argc, char* argv[])
{
      BYTE A[QTBYTE];
      int i,bit;

      bit = 1;
      // prepare the log table, once
      // actually initialized as constant
      for (i=0;i<8;i++)      
      {
            LogTable[bit]=i;
            bit = bit<<1;
      }

      for (i=0;i<QTBYTE;i++) A[i]=0;
      A[0]  = 137;            //10001001
      A[1]  = 255;            //11111111
      A[22] = 1;
      A[31] = 12;            //00001100
      A[QTBYTE-1]= 128;            //10000000
      for (i=0; i<QTBYTE; i++)
      {
            BYTE p = A[i];
            if (p != 0)
            {
                  FindBitsSetTrick(p, QTBITS*i);
                  FindBitsSet(p, QTBITS*i);
            }
      }
}



void FindBitsSetTrick(BYTE n, int ref)
{
      int bit, i;
      BYTE tmp;
      
      printf("\n%u\t",ref);
      while (n > 0)
      {
            tmp = n & (n-1);
            // xor operation shows the discovered bit
            i = n ^ tmp;
            bit = LogTable[i];
            printf("[%d] ",ref+bit);
            n = tmp;
      }
}


// just to compare results
void FindBitsSet(BYTE n, int ref)
{
      int count;
      
      count=0;
      printf("\n%u\t",ref);
      while (n != 0)
      {
            if (n & 1)
                  printf("[%d] ",ref+count);
            n=n>>1;count++;
      }
}


Jose
0
 
LVL 18

Expert Comment

by:JoseParrot
ID: 16954661
Hi,

Still interested?
The above code isn't theory. It is real code, ready to run in any C compiler. I have tested with Visual C++ 6.0 and the results between Dummy (by testing bit by bit) and tricky (dependent on how many bits are one), range from
omega(n/128) for 128 x BYTE or 16 x 64i, up to Theta(n).

That is: best case trick algorithm is as low as 16 iterations. Compared to 1024 Theta(n) of dummy algorithm, it is the best you can wait for.

By the way, the response is absolutely what stated in the question.

You wrote:
>int countOnes(int num) {
>while(num) {
>number = number & (number -1);
>numOnes++;
>}
>return numOnes;
>}

The void FindBitsSetTrick(BYTE n, int ref) I post is absolutely the same complexity. O(p).

You wrote:
>I am looking for how to skip the unset bits;
>for example if bit 0 and bit 1024 is set how do I skip checking
>all the bits in between?
>
>The run time of this is o(p) where p is the number of 1's in the int.

The loop in my posting:
          if (p != 0)
          {
               FindBitsSetTrick(p, QTBITS*i);
               FindBitsSet(p, QTBITS*i);
          }
Skips groups of 8 zero bits (64 if implemented with int64) at once. Only the first and last bytes will be tested. And just the set bits in these will be tested.

You wrote:
>So I will need to fall back to some bit twiddling tricks.
No need for it. The solution I post is exactly a trick on bits. BTW I discoverd the trick on your own example.

If I am wrong, please tell me. I would be glad in continue the hunting for the solution and really glad if it would be useful to you.

Thank you.
Jose
0

Featured Post

Free Trending Threat Insights Every Day

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
wordsWithoutList  challenge 24 73
matchUp  challenge 9 71
sumDigits  challenge 7 60
mapBully challenge 6 88
This article is meant to give a basic understanding of how to use R Sweave as a way to merge LaTeX and R code seamlessly into one presentable document.
Since upgrading to Office 2013 or higher installing the Smart Indenter addin will fail. This article will explain how to install it so it will work regardless of the Office version installed.
In this fifth video of the Xpdf series, we discuss and demonstrate the PDFdetach utility, which is able to list and, more importantly, extract attachments that are embedded in PDF files. It does this via a command line interface, making it suitable …
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now