
Bit lengths of various things

By: Hector Escobedo


A bit is a binary digit, either 0 or 1, used in the base two number system. Most people are familiar with decimal digits, which are 0 through 9: the building blocks of the base ten number system. In decimal, 10 is ten, and 100 is one hundred. However, in binary, 10 represents two, because appending 0 multiplies the number by the base of that particular number system. Binary 100 is four. Binary 1000 is eight. Every additional 0 increases the value to the next power of two. The number of binary digits in a given number is known as its bit length, or size.
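
To make that concrete, here is a quick C sketch with a homemade bit_length helper that counts binary digits by repeated halving:

    #include <stdio.h>

    /* Count how many binary digits a number occupies.
       By this convention, bit_length(0) is 0. */
    static unsigned bit_length(unsigned long long n) {
        unsigned bits = 0;
        while (n > 0) {
            n >>= 1; /* drop the lowest binary digit */
            bits++;
        }
        return bits;
    }

    int main(void) {
        printf("%u\n", bit_length(2));  /* binary 10   -> 2 bits */
        printf("%u\n", bit_length(4));  /* binary 100  -> 3 bits */
        printf("%u\n", bit_length(8));  /* binary 1000 -> 4 bits */
        return 0;
    }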

Generally, a computer must use defined bit lengths to store and represent numbers or data. Specialized software can be used for calculating numbers of arbitrary size, but for routine tasks, this is a waste of energy and resources. Bit lengths themselves are commonly powers of two, as this makes memory addressing more efficient and easier to remember.
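
In C, for example, the standard stdint.h header provides integer types at exactly these power-of-two bit lengths:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* Fixed-width types guarantee exact bit lengths. */
        printf("uint8_t:  %zu bits\n", 8 * sizeof(uint8_t));
        printf("uint16_t: %zu bits\n", 8 * sizeof(uint16_t));
        printf("uint32_t: %zu bits\n", 8 * sizeof(uint32_t));
        printf("uint64_t: %zu bits\n", 8 * sizeof(uint64_t));
        return 0;
    }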

So far, I have only discussed positive numbers. Since around the nineteenth century AD, mathematicians have been convinced that negative numbers are pretty useful as well. There are a few different ways to represent negative numbers in binary. Of these, I recommend using two's complement or offset binary whenever possible, as ones' complement and sign-magnitude have both a positive and a negative zero. It’s best to avoid such ambiguity.

In any of these representations, the maximum signed value of a number with bit length n is the maximum unsigned value of a bit length n - 1 number. This means that for a given bit length, the maximum signed value always equals half of the maximum unsigned value, rounded down. When the maximum signed value is x, the minimum signed value is -x for ones' complement and -x - 1 for two's complement. For example, when using 8 bits, the maximum signed value is 127 and the minimum signed value is -127 or -128.
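
A small C sketch that derives these ranges for a given bit length n, using the formulas above (maximum unsigned value 2^n - 1, two's complement range -2^(n-1) to 2^(n-1) - 1):

    #include <stdio.h>

    int main(void) {
        for (unsigned n = 2; n <= 16; n++) {
            unsigned long long max_unsigned = (1ULL << n) - 1;
            long long max_signed = (long long)(1ULL << (n - 1)) - 1;
            long long min_signed = -(long long)(1ULL << (n - 1)); /* two's complement */
            printf("%2u bits: unsigned 0..%llu, signed %lld..%lld\n",
                   n, max_unsigned, min_signed, max_signed);
        }
        return 0;
    }

For n = 8, this prints the unsigned range 0..255 and the signed range -128..127, matching the example above.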

0 bits

What does it mean to use no bits whatsoever to represent a value? In programming, this is known as a void type. It is a data type with no values that can be distinguished from one another, used when there is nothing meaningful to return or analyze, yet the programmer must still provide some data type. There is no maximum or minimum value of the empty set.
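
In C, this is literally the void keyword: a void function performs an action but produces no value at all. A minimal sketch:

    #include <stdio.h>

    /* A void function performs an action but returns no value. */
    static void greet(void) {
        puts("hello");
    }

    int main(void) {
        greet(); /* nothing comes back to store or compare */
        return 0;
    }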

1 bit

Key info: maximum unsigned value 1.

A single bit is commonly known as a boolean value or bool, named after the great logician George Boole. It is the fundamental unit of computer science and information theory. A bit length of one can’t really be used for signed numbers, as there is no extra space to hold both the sign and the magnitude. However, it could be interpreted as the sign of a separate value, indicating whether that number is positive or negative.

An image with 1 bit pixels is called a monochrome bitmap or binary image. QR codes, technically a type of matrix barcode, are binary images that are widely used for reliable optical scanning of all sorts of data.

Some basic error detection codes use only a single bit for the checksum, in which case it is known as the parity bit.
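
A quick C sketch of even parity, where the parity bit is chosen so that the total count of 1 bits comes out even (the function name is my own):

    #include <stdint.h>
    #include <stdio.h>

    /* Even parity: the parity bit makes the total number of
       1 bits, including the parity bit itself, even. */
    static uint8_t parity_bit(uint8_t byte) {
        uint8_t ones = 0;
        for (int i = 0; i < 8; i++)
            ones ^= (byte >> i) & 1; /* XOR accumulates the count mod 2 */
        return ones;
    }

    int main(void) {
        printf("%u\n", parity_bit(0x03)); /* two 1 bits   -> parity 0 */
        printf("%u\n", parity_bit(0x07)); /* three 1 bits -> parity 1 */
        return 0;
    }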

By itself, 0 can represent: false, off, no, or a low voltage level.

While 1 can represent: true, on, yes, or a high voltage level.

2 bits

Key info: maximum unsigned value 3; signed range -2 to 1.

3 bits

Key info: maximum unsigned value 7; signed range -4 to 3.

Used as a basis for simple color coding schemes by some printers, because one bit each for red, green, and blue is enough space to enumerate the basic display colors of black, white, red, yellow, blue, magenta, green, and cyan.
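
A quick C sketch of such a scheme; the particular bit order (red as the high bit) is my own assumption:

    #include <stdio.h>

    int main(void) {
        static const char *names[8] = {
            "black", "blue", "green", "cyan",
            "red", "magenta", "yellow", "white"
        };
        /* Assume bit 2 = red, bit 1 = green, bit 0 = blue. */
        for (unsigned code = 0; code < 8; code++) {
            unsigned r = (code >> 2) & 1;
            unsigned g = (code >> 1) & 1;
            unsigned b = code & 1;
            printf("%u%u%u = %s\n", r, g, b, names[code]);
        }
        return 0;
    }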

Could also encode a Chinese trigram (bagua). Each trigram is a symbol composed of three lines, and each line is either broken (Yin) or unbroken (Yang). In East Asian cultures, the bagua are steeped in philosophical and mystical significance, and are commonly used in feng shui and other traditional arts.
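
For fun, here is a C sketch that draws each 3 bit value as a trigram using the Unicode Yang (⚊) and Yin (⚋) monograms; treating the top line as the most significant bit, and Yang as 1, is my own arbitrary convention (output assumes a UTF-8 terminal):

    #include <stdio.h>

    int main(void) {
        /* Print all eight trigrams, one per 3-bit value.
           Convention (an assumption): bit 2 = top line, bit 0 = bottom
           line; 1 = unbroken (Yang), 0 = broken (Yin). */
        for (unsigned v = 0; v < 8; v++) {
            printf("%u%u%u:\n", (v >> 2) & 1, (v >> 1) & 1, v & 1);
            for (int line = 2; line >= 0; line--)
                puts(((v >> line) & 1) ? "\u268A" : "\u268B"); /* ⚊ or ⚋ */
            putchar('\n');
        }
        return 0;
    }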

4 bits

Key info: maximum unsigned value 15; signed range -8 to 7.

Commonly known as a nibble! This bit length is used for binary-coded decimal because it is the shortest bit length that can represent all ten decimal digits. Also used as the word length or instruction size in some extremely limited microprocessors like those in coffee makers and children’s toys.

A convenient way to represent 4 bits is with a hexadecimal digit, commonly used in CSS colors, low-level programming, or viewing binary data. The hexadecimal system is base sixteen. Letters A through F are used to represent values ten through fifteen. It’s known as hex, for short, even though the Greek root word “hex” by itself just means six. Literal hexadecimal values in source code are often prefixed with 0x because “six” is the only decimal number which contains the letter X, in English, Greek, and Latin! Because any byte value can be expressed as a combination of two hex digits, this system is even more convenient than it first appears. Any byte values equal to or higher than one hundred would take three decimal digits to express, so this representation saves space when viewing a large amount of binary data, while also allowing exact alignment between character and byte boundaries. For example, 0x64 is one hundred and 0xff is two hundred and fifty-five.
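
For example, C’s printf can render each byte as exactly two hex digits, which is what makes compact, aligned binary dumps possible:

    #include <stdio.h>

    int main(void) {
        unsigned char data[] = { 100, 255, 0, 16 };
        /* Each byte becomes exactly two hexadecimal characters. */
        for (size_t i = 0; i < sizeof data; i++)
            printf("%02x ", data[i]);
        putchar('\n'); /* prints: 64 ff 00 10 */
        return 0;
    }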

5 bits

Key info: maximum unsigned value 31; signed range -16 to 15.

This is the smallest number of bits that can store a single letter of the basic Latin alphabet, which has twenty-six letters. Just pick lowercase or uppercase, but not both! You’ll even have room left over for null (string terminator character), space, period, comma, question mark, and exclamation point. No digits though, unless you use mode shift control characters. Émile Baudot invented a similar scheme for use as an early telegraph code in 1876. This proved satisfactory for the majority of the industrialized world for almost nine decades!
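
A rough C sketch of the idea, numbering lowercase letters from 1 so that 0 can stay reserved for the null terminator (this particular assignment is mine, not Baudot’s):

    #include <stdio.h>

    int main(void) {
        /* Map 'a'..'z' to codes 1..26, reserving 0 for null;
           codes 27..31 remain free for punctuation. */
        const char *word = "hello";
        for (const char *p = word; *p; p++) {
            unsigned code = (unsigned)(*p - 'a' + 1);
            printf("%c -> %2u (fits in 5 bits, max 31)\n", *p, code);
        }
        return 0;
    }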

6 bits

Key info: maximum unsigned value 63; signed range -32 to 31.

Enough to encode one of the 64 hexagrams of the I Ching, an ancient Chinese classic text used for divination. Each hexagram is composed of an upper and lower trigram (see 3 bits), and has a specific meaning.
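
In bit terms, a hexagram is just two trigrams concatenated, as this tiny C composition shows (putting the upper trigram in the high bits is my own assumption):

    #include <stdio.h>

    int main(void) {
        unsigned upper = 0x7, lower = 0x5;        /* two 3-bit trigrams */
        unsigned hexagram = (upper << 3) | lower; /* 6-bit value, 0..63 */
        printf("%u\n", hexagram);                 /* prints 61 */
        return 0;
    }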

7 bits

Key info: maximum unsigned value 127; signed range -64 to 63.

In 1963, the first edition of the ASCII standard was published, defining a complete Latin character set using only 7 bits for each character. ASCII includes upper- and lowercase letters, decimal digits, and all the common punctuation and special characters found on a US keyboard. It also includes about 30 control characters, only a few of which are actually used anymore because the rest were designed for teletype functions. Hindsight is twenty-twenty.
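
Since every ASCII code fits in 7 bits, the high bit of each byte is always free, as this little C check demonstrates:

    #include <stdio.h>

    int main(void) {
        const char *s = "Hello, ASCII!";
        for (const char *p = s; *p; p++)
            /* Every ASCII code is at most 127, so the high bit is 0. */
            printf("'%c' = %3d (high bit %d)\n", *p, *p, (*p >> 7) & 1);
        return 0;
    }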

8 bits

Key info: maximum unsigned value 255; signed range -128 to 127.

Commonly known as a byte! This bit length is the smallest standard unit of memory for nearly all computer systems in the world today. The vast majority of Internet standards and protocols, modern programming languages, and other software use bytes for data organization and alignment. Any field smaller than 8 bits is treated as a special case, and operating on such fields requires explicit bit manipulation with shifts and masks. Larger amounts of data are usually expressed as multiples of bytes. For example, a kilobyte is one thousand bytes.
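
A C sketch of that special-case handling, extracting a 3 bit field from the middle of a byte with a shift and a mask:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t byte = 0xB5;                /* 1011 0101 in binary */
        /* Extract bits 2..4 (a 3-bit field): shift right, then mask. */
        uint8_t field = (byte >> 2) & 0x07; /* binary 101 -> 5 */
        printf("%u\n", field);
        return 0;
    }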

The earliest practical, commercially available microprocessors, such as the 1972 Intel 8008, used 8 bit registers.

Years ago, programs had to define a code page, or character set, for each text file, indicating how the bytes within it were to be decoded and displayed. Users in different countries could receive garbled messages if the proper metadata or settings were missing, because their systems had different code page defaults. In the early 1990s, several computing standards organizations began working on a Universal Coded Character Set (known today as Unicode) which could encode all characters in every human script and thereby solve this problem, and furthermore would enable different scripts and languages to easily be used within the same document.

Ken Thompson and Rob Pike invented the UTF-8 encoding in 1992, which had the crucial property of being backward compatible with ASCII. In UTF-8, for all bytes beginning with a 0 bit, the remaining bits correspond exactly to an ASCII character. Bytes beginning with a 1 bit are part of a multi-byte sequence that may be up to four bytes (32 bits) long, more than enough space for the rest of Unicode!
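
A sketch of that classification rule in C, reading just the leading bits of each byte:

    #include <stdio.h>

    /* Classify a UTF-8 byte by its leading bits:
       0xxxxxxx -> single-byte ASCII character
       110xxxxx -> leader of a 2-byte sequence
       1110xxxx -> leader of a 3-byte sequence
       11110xxx -> leader of a 4-byte sequence
       10xxxxxx -> continuation byte inside a sequence */
    static int utf8_sequence_length(unsigned char b) {
        if ((b & 0x80) == 0x00) return 1;
        if ((b & 0xE0) == 0xC0) return 2;
        if ((b & 0xF0) == 0xE0) return 3;
        if ((b & 0xF8) == 0xF0) return 4;
        return 0; /* continuation (or invalid) byte */
    }

    int main(void) {
        printf("%d\n", utf8_sequence_length('A'));  /* 1 */
        printf("%d\n", utf8_sequence_length(0xC3)); /* 2: starts e.g. 'é' */
        return 0;
    }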

In C, the char type is defined to be exactly one byte, and on almost all platforms that byte is 8 bits.

An 8 bit color depth works surprisingly well for small images such as icons, thumbnails, or emoji. It is also used for GIFs.

10 bits

Key info: maximum unsigned value 1,023; signed range -512 to 511.

12 bits

Key info: maximum unsigned value 4,095; signed range -2,048 to 2,047.

16 bits

Key info: maximum unsigned value 65,535; signed range -32,768 to 32,767.

Equivalent to two bytes, 16 bits is a compromise between a single byte and a larger type like 32 or 64 bits. This is the basic audio bit depth for CDs and amateur recordings: 16 bit signal amplitude sampling is good enough to enjoy popular music.
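
A rough C sketch of the quantization step, scaling a sample from the range -1.0 to 1.0 into a signed 16 bit integer (the simple clamp-and-scale here is an idealization of what real encoders do):

    #include <stdint.h>
    #include <stdio.h>

    /* Scale a sample from [-1.0, 1.0] to the signed 16-bit range. */
    static int16_t to_pcm16(double sample) {
        if (sample > 1.0)  sample = 1.0;  /* clamp to avoid overflow */
        if (sample < -1.0) sample = -1.0;
        return (int16_t)(sample * 32767.0);
    }

    int main(void) {
        printf("%d\n", to_pcm16(1.0));  /*  32767 */
        printf("%d\n", to_pcm16(-0.5)); /* -16383 (truncating toward zero) */
        return 0;
    }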

The Intel 8086 was a 16 bit microprocessor released in 1978. It marked the start of the personal computing revolution and was widely used in PCs and word processors in the 1980s. The x86 family of instruction set architectures derives its name from this very popular and capable little chip, which contained fewer than 30 thousand transistors.

24 bits

Key info: maximum unsigned value 16,777,215; signed range -8,388,608 to 8,388,607.

Most new displays and common image formats use a 24 bit color depth, which is theoretically enough to display all colors that the human eye can distinguish. However, our eyes are more sensitive to blues and greens, and no existing video hardware can display all of those subtle shades. Maybe future laser displays will do the trick!
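
A C sketch of the usual packing, with 8 bits for each of the red, green, and blue channels:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* 24-bit color: 8 bits per channel, packed as 0xRRGGBB. */
        uint32_t color = 0xFF8000;        /* orange */
        uint8_t r = (color >> 16) & 0xFF; /* 255 */
        uint8_t g = (color >> 8) & 0xFF;  /* 128 */
        uint8_t b = color & 0xFF;         /* 0 */
        printf("r=%u g=%u b=%u\n", r, g, b);
        return 0;
    }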

A 24 bit audio depth covers every perceptible amplitude in the human ear’s hearing range, from a pin drop to a jet engine roar. It’s used for professional quality recording, mixing, and mastering.

32 bits

Key info: maximum unsigned value 4,294,967,295; signed range -2,147,483,648 to 2,147,483,647.

From the 1990s to the 2000s, almost all PCs and consumer CPUs used 32 bit designs. This limited the amount of addressable memory to at most 4 gibibytes. With proper library support, x86-32 programs can be run on x86-64 processors in backward compatibility mode. A float in the C language is normally 32 bits in size.

IPv4, which is sadly still the most widely supported Internet protocol, uses 32 bit addresses. The designers never anticipated that one day, every person on Earth might own a computer. With the rapid expansion of the Internet, the publicly routable IPv4 address space has already been exhausted, and end-user devices are forced to use cumbersome network address translation techniques to connect to each other directly.
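
A C sketch showing that a dotted-quad address is just four bytes packed into one 32 bit integer:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* 192.168.0.1 packed most significant octet first. */
        uint32_t addr = (192u << 24) | (168u << 16) | (0u << 8) | 1u;
        printf("%u\n", addr); /* 3232235521 */
        printf("%u.%u.%u.%u\n",
               (addr >> 24) & 0xFF, (addr >> 16) & 0xFF,
               (addr >> 8) & 0xFF, addr & 0xFF);
        return 0;
    }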

64 bits

Key info: maximum unsigned value 18,446,744,073,709,551,615; signed range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

Present-day CPUs and operating systems nearly all use 64 bit instruction sets, and this will likely be sufficient for many decades to come. A 64 bit floating point value is called a double in the C language.

Sixty-four is a square number, and therefore a 64 bit value can be represented as an eight by eight matrix of bits.
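
Chess programs exploit exactly this shape: a 64 bit bitboard stores one bit per square of the board. A small C sketch (the square numbering convention is my own choice):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* One bit per square: bit index = rank * 8 + file. */
        uint64_t board = 0;
        board |= 1ULL << (0 * 8 + 4); /* mark square e1 (rank 0, file 4) */
        board |= 1ULL << (7 * 8 + 4); /* mark square e8 */

        for (int rank = 7; rank >= 0; rank--) { /* print top rank first */
            for (int file = 0; file < 8; file++)
                putchar((board >> (rank * 8 + file)) & 1 ? 'X' : '.');
            putchar('\n');
        }
        return 0;
    }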

128 bits

Key info: maximum unsigned value 2^128 - 1, which is about 3.4 x 10^38.

IPv6 addresses are 128 bits. The IETF was pretty generous this time: with that many possible addresses, every individual insect on Earth could have a whole IPv4 Internet space to itself, and there would be room left over!

In cryptography, this is considered the smallest secure symmetric key size. If your secret key or password has less than 100 bits of entropy, then it’s not secure against any serious attacker! The insecure MD5 hash function outputs a 128 bit hash. AES and other common block ciphers use a 128 bit block size.

160 bits

SHA-1 hashes are 160 bits in size. This is no longer considered secure, and cryptographers universally recommend replacing it.

224 bits

Not a very cool number. Some older hash functions output hashes of this size.

256 bits

Key info: maximum unsigned value 2^256 - 1, which is about 1.2 x 10^77.

This is another fun square number. A sixteen by sixteen matrix has two hundred and fifty-six elements.

256 bit cryptography is now the norm, and is expected to maintain a good security margin for a while. The SHA-256 hash function is used for Bitcoin’s proof-of-work. The BLAKE2s hash function is a faster, state-of-the-art alternative which has seen significant and increasing adoption in open source software.
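
As an illustration of the sizes involved, here is a C sketch that computes a SHA-256 digest through OpenSSL’s EVP interface (this assumes OpenSSL is installed; link with -lcrypto):

    #include <stdio.h>
    #include <string.h>
    #include <openssl/evp.h>

    int main(void) {
        const char *msg = "hello";
        unsigned char digest[EVP_MAX_MD_SIZE];
        unsigned int len = 0;

        /* One-shot hash: 256 bits = 32 bytes of output. */
        if (!EVP_Digest(msg, strlen(msg), digest, &len, EVP_sha256(), NULL))
            return 1;

        for (unsigned int i = 0; i < len; i++)
            printf("%02x", digest[i]); /* 64 hex digits = 256 bits */
        putchar('\n');
        return 0;
    }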

512 bits

Key info: maximum unsigned value 2^512 - 1, which is about 1.3 x 10^154.

When you really need some extra security margin, and want to sleep soundly at night, a 512 bit hash function comes in handy. SHA-512 and BLAKE2b are the leading options for this purpose. The latter is also specifically designed to be really fast on 64 bit processors.