Community Pick: Many members of our community have endorsed this article.

How to Be a Better Programmer: Bits and Bytes

Does the idea of dealing with bits scare or confuse you? Does it seem like a waste of time in an age where we all have terabytes of storage? If so, you're missing out on one of the core tools in every professional programmer's toolbox. Learn how to do amazing, modern things with bits!
In the past year, I've had two different developers come to me for help with problems that were actually quite simple. The underlying issue in both cases was that the developers had either never learned or had forgotten everything about working with the most basic unit of computing - the bit.

A lot of programming languages have gotten so high-level and easy to use that it's become easy to forget about things like bits. Why worry about those when your boss is asking you to fix a bug with e-mail address validation?

So you fix the bug and then your boss asks you to enforce IP filters so that only people with IP addresses within can hit certain web pages, which prompts you to adopt a confused look. You've seen that kind of IP address notation before and you know it has something to do with masking, so you start looking for an explanation. When you find that people are talking about bitwise operators you get that dread in your stomach because dealing with bits is like a foreign language to you.

Here are three pieces of good news:
#1. Bits are a lot easier to understand than a foreign language.
#2. You probably already know a lot of the concepts but don't realize you know them.
#3. Understanding bits will make you a better programmer.

Let's dive in!

Billions... Wasted!

Imagine you're watching your favorite sport on TV and you look at the scoreboard to see what your team's score is. It looks like your team has 122 points. Then you realize that the number "122" is split onto 3 separate, large LCD screens: "1" "2" and "2". Wow, you think - what a waste of money. Each one of those LCD screens probably costs several hundred dollars, and they could easily just show the full number "122" on a single LCD screen.

Yet, this is precisely the kind of wasteful thing that many programmers do every day, and I'll give a more relevant example. Let's say that an average programmer named Matt is storing the ages of people in a database. Matt creates his database and creates a column that holds up to 3 characters (since most people will not live beyond 999 years old). All of his data seems to be stored perfectly well in the database and he can get it back out easily later on. He thinks everything is fine.

What he doesn't realize is that he's using up to 3 bytes of data for information that can be stored in 1 byte of data. That's up to 2 extra bytes of wasted data. It doesn't seem like much, but let's say that 5 years later, he has 10 million contacts in his database. Now that waste has multiplied into 20,000,000 bytes of wasted data! Now let's say that he has a backup system which stores up to 30 days of full backups. He now is storing 20 million wasted bytes * 30 days = 600,000,000 bytes of wasted data altogether - all because he didn't use the proper data type at the beginning.

Hopefully, most programmers will use an integer for an age column, but I've seen many people create 15-character text fields on their databases to store IP addresses like "" - and that's just IPv4 addresses. We're entering the age of IPv6 addresses like "fe80:abcd:ef01:2345:6789:0abc:def0:1234", and I can already see people creating 39-character text fields to store that data. Programmers are doing this because they don't understand bits well enough to know how an IPv4 address can actually fit into 4 bytes instead of 15.

Imagine Matt creating a database that stores access logs of a web server, and it stores the IP addresses as 15-character strings. Five years later, with 30 days of backups, we're talking about many gigabytes of wasted space and bandwidth.

Bits, Basements, and Bulbs

So let's talk about how Matt can improve his situation. The first step is understanding what a single byte can do - and you might already know some or most of this.

A single byte can hold any number between 0 and 255. This is because a single byte is made up of 8 bits, and each bit is either a 0 (off) or a 1 (on).

An easier way to think about this is to imagine going into a completely dark basement. There are eight light switches on the wall, so you flip the one on the right, and a small lamp in the corner with one tiny lightbulb turns on. It makes the room a little brighter, but you want more brightness, so you flip the next switch (the second one from the right), and this time, a different lamp turns on. This new lamp has two lightbulbs in it, so now you have three lightbulbs turned on in the room.

You flip on the third light switch from the right, and an even bigger lamp turns on - this one has 4 lightbulbs in it, so now you have a total of 7 lightbulbs turned on.

You start realizing that the number of bulbs is doubling on each lamp as you move from the right to the left. So you flip the left-most light switch and suddenly the room gets really bright because a huge chandelier with (yes, you guessed it) 128 lightbulbs suddenly turns on. So now you have a total of 135 lightbulbs turned on.

You turn on all of the eight light switches, and you end up with every lamp and chandelier in the room turned on, with a total of 255 light bulbs (and a really bright basement).

This is precisely how a computer thinks about numbers. A byte is like a room with 8 light switches that goes from a total value of 0 (no switches turned on) all the way up to 255 (all switched turned on). So a value of zero looks like 00000000 while a value of 255 looks like 11111111. Or an easier way to look at it might be:

A byte value of zero has all bits set to off:
 0   0   0   0   0   0   0   0
___ ___ ___ ___ ___ ___ ___ ___
128  64  32  16  8   4   2   1

A byte value of 73 is made up of the bits for 64, 8, and 1 being turned on:
 0   1   0   0   1   0   0   1
___ ___ ___ ___ ___ ___ ___ ___
128  64  32  16  8   4   2   1

A byte value of 255 has all bits turned on:
 1   1   1   1   1   1   1   1
___ ___ ___ ___ ___ ___ ___ ___
128  64  32  16  8   4   2   1

Simple enough, right? So what happens if we need to store the number 256?

We're Gonna Need a Bigger Boat

Great question - I'm glad you asked it, and the simple answer is:  use another byte! The harder question is how to use the two bytes together. After all, each byte can only hold a value that goes up to 255.

The answer is simply a matter of instructing the computer to treat both bytes as one, big number. If you think about it, we already do this every day with numbers. For example, the number 24 is technically two digits - 2 and 4. But when we read them together, the 2 isn't just a 2 - it represents 20. So when a computer needs to deal with numbers that are larger than a single byte, it simply needs to be told that "there are 2 bytes here, and they should be considered as one big number."

Or Maybe Just Bigger Bits....

In programming languages, a single byte has 8 bits, so it's an 8-bit number. A two byte number is a 16-bit number and is commonly referred to as a "short." When a computer is told to treat 2 bytes as a 16-bit number, it keeps the same pattern of doubling numbers, so the values of one byte suddenly become much larger. It looks like this:
_____ _____ _____ _____ _____ _____ _____ _____   ___ ___ ___ ___ ___ ___ ___ ___
32768 16384  8192  4096  2048  1024  512   256    128  64  32  16  8   4   2   1
                  Byte #1                                      Byte #2

So by being told to use two bytes, a computer can handle numbers between 0 and 65,535.

So there's really no difference in how the data is stored - every piece of data out there is just a series of 8-bit bytes. However, with a little bit of programming instructions, a computer can be told to sometimes use more than one byte in order to hold larger numbers.

This same concept extends into 32-bit numbers and 64-bit numbers. A 32-bit number, commonly known as a "long", is made up of four bytes and can hold numbers up to 4,294,967,295. A 64-bit number uses eight bytes and holds values up to 18,446,744,073,709,551,615.

Bazillions: An IP Address Love Story

Now we're entering an age of 128-bit computing, and the first most common appplication of 128-bit numbers is the IPv6 standard. While IPv4, which is actually a 32-bit number (again, a "long"), allowed up to 4.2 billion IP addresses in the address, IPv6 addresses are 128-bit numbers. I'll allow you to think about how many IP addresses that are allowed with IPv6.

Some people might look at an IP address and say, "That's not a 32-bit number." What they are missing is that the whole "###.###.###.###" syntax is simply a visual representation of a long number. Each ### is simply the value of each byte. So "" is actually:

Byte #1 = 72
Byte #2 = 123
Byte #3 = 45
Byte #4 = 67

If we actually take those 4 bytes and tell the computer to treat them as one big number, we get 1,216,032,067. Likewise, you can "convert" 1,216,032,067 back into that IP address at any time. So in reality, Average Matt could store all his IPv4 IP addresses as "long" numbers, taking up only 4 bytes of space each time, instead of up to whopping 15 bytes of text per each one.

An additional benefit to storing them as long numbers is that you can use math with numbers. Need to find out if is between and You can either break up the strings into pieces and compare each set of numbers, or you can simply convert all three to long numbers and do simple less-than and greater-than comparisons.

Just a Little Bit of Negativity

One point of interest - a computer often needs to work with negative numbers. So pretty much any number can be told to be treated as signed or unsigned. So far, we've been talking about unsigned numbers (or numbers that don't have any possibility of having a negative "sign" in front of them), so an 8-bit number goes from 0 to 255, a 16-bit number goes from 0 to 65,535, and so on.

When the computer is told to treat a number as a signed number, the maximum value is basically split in half, and you get half of it as negative, and half as positive. So a signed 8-bit number can go from -128 to 127. A signed 16-bit number can go from -32,768 to 32,767. One bit is used to indicate whether the number is negative or positive.

Some languages, like PHP, will default to using signed numbers all the time. So when you try to tell it to deal with a really large 32-bit number, like 3,000,000,000, it can only go up to 2,147,483,647 before it has nowhere else to go. So it will then "wrap around". So if you tell PHP to convert "" to a long, it'll correctly tell you the result is 2147483647, which is the maximum value for a signed 32-bit number. However, if you just add 1 and tell it to convert "" to a long, it will come back with -2147483648.

So it can be helpful to understand what your programming language is doing for you whenever you ask it to deal with different types of numbers.

Bit Maps and Other Creative Uses for Bits

Moving on, it can often be useful to work with bits. There are numerous applications and creative ways you can use all 8 bits of a byte, and they don't always have to be math-related problems. For example, let's say that you wanted to log a complex, daily schedule of times that you were working, in 5-minute blocks, like this:

My Billable Time
12:05 AM - 12:15 AM
12:30 AM - 12:35 AM
1:00 AM - 1:35 AM

Imagine you had a hundred entries in your log. Now, you could store the start and stop time for every entry. Or you could simply use bits to represent 5-minute blocks throughout the day:

Bit #1 = 12:00 AM - 12:05 AM
Bit #2 = 12:05 AM - 12:10 AM
Bit #3 = 12:10 AM - 12:15 AM

So in the above example's first three entries, you could have a "bit map" that looks like this:

The first bit is 0 because you didn't work from 12:00 AM to 12:05 AM. The next two bits are 1, indicating that you worked from 12:05 AM to 12:10 AM and from 12:10 AM to 12:15 AM, and so on.

So in 24 bits (3 bytes), we've described someone's activity in 5-minute increments for the first 2 hours of the day! You could do the same thing for an entire 24-hour period in only 36 bytes. For the purposes of comparison, that's the same amount of bytes it takes to store the following part of this paragraph: "So in 24 bits (3 bytes), we've descr". So in the same few bits that would be used to store that fragment of a sentence, you could store the entirety of someone’s billable time with 5-minute accuracy!

Of course, the results wouldn't really make any sense as numbers. You'd have all sorts of "random" numbers, but the total values of each byte would have no purpose here. You'd be using each byte to simply record eight 5-minute blocks of time.

This, This, and This, But Not That

Another common way of using bits is to allow someone to mix and match permissions. File permissions on a UNIX system can be pretty simple. You can read a file, write to a file, and/or execute a file. So what if you want someone to read and execute a file, but not write a file? Bits to the rescue!

The common permission set looks like this:

Bit value of 1 = Execute
Bit value of 2 = Write
Bit value of 4 = Read

So if you want full permissions, you end up with 7 (1 + 2 + 4). If you want just read and execute permissions, you'd use a permission of 5 (1 + 4). If you want read and write, but not execute, then you'd have a permission of 6 (2 + 4). You can store any combination of up to 8 different permissions in a single byte!

Bitwise Operators

Because of all the unique applications here, most programming languages provide a set of functions and tools for dealing specifically with bits. These are usually called "bitwise operators" and they tend to have similar syntax between different languages. Usually you'll have tools for moving bits left or right (e.g. shifting a bit to the left look like going from 00000001 to 00000010, which is actually the same thing as multiplying a number by two). You'll also often have tools for flipping the bits (all 0s become 1s and all 1s become 0s), or comparing two different sets of bits (e.g. "show me all the 1 bits from value A that are also set in value B").

Being Bitwise with IP Address Masks

The operators are simple and there's only a handful of them, but they DO take a little bit of extra thinking to use efficiently. For example, bitwise operators turn a complex concept like an IP address netmask (like our earlier CIDR example of "") into a very simple matter of comparing a bunch of 1s and 0s. It's simply a matter of finding out how someone else decided how they wanted to use the bits. In the CIDR example, the "/24" simply means "of the 32-bits that make up this IP address, keep the first 24 bits as they are, and then the remaining 8-bits define the range, from 0 to 255." So "" literally means "keep the 71.23.45" part. Or in other words, " to".

Was this Helpful? Click a Button!

Hopefully this opens up some new ways of thinking about data and how you can make more effective use of every single byte of data. Thanks for reading and be sure to vote this article as helpful if you liked it.

Copyright  ©  2015 - Jonathan Hilgeman. All Rights Reserved. 

Comments (7)

Distinguished Expert 2022


@IainMacb - You're correct. The storage is just the initial facet and can lead to improvements in other, various ways. When small projects expand into bigger ones, suddenly you have valid reasons to go from one all-in-one server to dedicated boxes per task (e.g. dedicated DB server, dedicated web server, etc...), and at that point, all of those bytes transferred back and forth between web and DB are not going over cables and connections within the same hardware, but over a much slower network connection (often with tons more overhead). So every byte counts.
Distinguished Expert 2022


@philkryder - The impact is going to be different depending on the application. For example, let's say that you do have this situation where you're dealing with 20 million 15-byte strings (roughly 286 megabytes).

If you're dealing with a database, then typically you're in a client/server setup, so the client needs the data locally in order to do any viewing or processing (this can be expanded also to a scenario where you've got DB Server -> Web Server -> End User, and data has to be transferred twice - and that's just basic setups).

Typically, the data's going to be transmitted over a TCP/IP network connection, so you're also adding about 8 megs of TCP overhead for 300,000,000 bytes (versus 2 megabytes for 80,000,000 bytes). You also have overhead of the structure containing/defining the data, plus any extra identifying data, but we'll set that aside for now.

So let's say that the database server has all the resources it needs to send all the data across to the client. On a LAN, it might not take too long either way, but you'll still notice a significant difference between transferring 78-ish megs vs 294-ish megs.

If you're dealing with a situation where you transfer that over a broadband connection of some kind, the difference will be even larger.

So the majority of the time, your "extra time" is going to be found in the data transfer, since that's often the slowest point. For the sake of having some example numbers, let's say that we're transferring over a 60Mbps connection, so we're looking at anywhere from 30 seconds - 45 seconds to download a roughly-300 meg payload, and 10-15 seconds to download a roughly 80-meg payload.

We'll say that we can save about 20-30 seconds in transfer time.

Now let's say the client app finally has the data in memory. It probably needs to store it somewhere, so there's some kind of structure involved, which also means an extra % of overhead storage in memory. Let's say you have a C# app using a DataTable structure. A string column in a DataTable takes up a LOT more memory per row than a long column. I don't have hard numbers in front of me, but I did have a project once where I hadn't defined my column type and I was accidentally storing small integers as a string. When I fixed the problem with a byte type definition on the field, memory usage dropped by several hundred megabytes.

So now we've got a client app with the data in memory. What's the next step?

If we're displaying the data on-screen, then chances are that you're probably only converting a handful of records at a time - maybe 200 at most.  The time it takes to convert 200 longs into string is around 0.001 seconds each time (just a rough test on my end).

But let's say you want to convert those 20 million numbers into 20 million entries all at once to be written to a file. Let's say that looping through 20 million records in memory takes a full 3 seconds, with absolutely no processing at all, so it wouldn't matter if the numbers were already converted or not - 3 seconds is the starting baseline. Your question is then - how much ADDITIONAL time would it take to convert 20 million numbers? A rough test on my machine shows about 40 seconds, and of course, that's going to depend on processor power. Many machines might be even slower.

So now we have some example numbers.
You have about 20-30 seconds of savings on the transfer side. If you were to turn around and convert all of those to strings immediately and write them to a file, then you're probably looking at a speed LOSS of roughly 10 seconds. Of course, this also doesn't take anything else into consideration (the value of bandwidth pipes / network saturation / memory / likelihood of this scenario / scalability across multiple clients / etc).

For displaying the data on-screen, you have about 20-30 seconds of savings on the transfer side, and virtually no time spent on conversion since you're not doing them all at once. The client's not going to notice a split-second of extra time every time they move to a new page of results.

Either way, there is a LOT of extra value in more efficient storage. There will always be some scenarios where it doesn't make sense to compact everything, but that's where being a good programmer comes into the picture and understanding the business case and how it translates into data flows and the value of each resource involved, and how that multiplies with concurrent load.
Mike TomlinsonHigh School Computer Science, Computer Applications, Digital Design, and Mathematics Teacher
Top Expert 2009

Excellent article!  The only thing I would have included would be a small section on Endianness (Big Endian vs. Little Endian).
armin sadatiFanyarai

My brain exploded from all this information
I hope I can process all this information

thank you
nociSoftware Engineer
Distinguished Expert 2022

Besides endian-nes there is one-two complement and floating point that might also need some explanation.

View More

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.