Competitive/Collaborative Programming Class

ICPC Computer Programming Contest Prep

Problem Solving in Computer Science

Fall 2021 -- CSC 2700 Section 01 (1206 Patrick Taylor, 6:00 PM - 7:50 PM)

Binary Number Storage

Storing numbers in computer memory is a lot more complicated than it might seem. How a computers stores (encodes) Integer and Real is interesting -- then along comes endian.

Endianness

Big and little endian is a neat topic. Endianness specifies the order in which bytes are stored. For example, the 32bit word 0x0A0B0C0D could be stored to different ways in memory:

BiEndianness is the ability to choose Big or Little via a switch/setting (Newer ARM, PowerPC, Alpha, ...)

Integer Binary Format

Wikipedia: Signed number representations and Integer

Most 32-bit binary integers are formatted with one sign bit and 31 bits for data. The sign bit is usually the left most bit. The value of the bits usually are something like this:

bit  value
---  -----
 0   2^0 - 1s
 1   2^1 - 2s
 2   2^2 - 4s   
 3   2^3 - 8s
 4   2^4 - 16s
 5   2^5 - 32s
 6   2^6 - 64s
 7   2^7 - 128s
 8   2^8 - 256s
 9   2^9 - 512s
10  2^10 - 1024s 

This is how positive numbers are typically encoded. For example:

116 is 01110100 (64 + 32 + 16 + 4)

An 8-bit byte can store 256 values (0-255) if only positive values are needed. If negative numbers are needed a byte can store -128 thru 127.

Negative Integers

Negative Integers are stored in Two's Complement. To negate a number, you convert it to 2's complement via the following steps:

  1. Apply 1's complement
  2. Add 1

For instance the value 116 is 01110100. The 2's complement is 10001100 (-116). You would expect that 116 + -116 would be 0. So lets add them together:

 
  01110100
+ 10001100
  --------
 100000000

Cool, so how do you get the string 10001100 from 01110100?

 01110100
 10001000 - 1's complement (flip bits)
 10001100 - add 1

2'S complement is very easy for a computer to do. Flipping bits (NOTing them is easy). Add 1 is also very easy. It turns out that 2's complement is really cool. It means that addition of numbers can be done without concern for the sign of the numbers (unlike for us in base 10). Subtraction turns out to be very easy -- computers do not subtract, they take the 2's complement of the second number and add.

Floating Point

A floating point number in scientific notation looks like this: 3.1415 x 10^7. 3.1415 is known as the mantissa and 7 is the exponent. Both the mantissa and the exponent could be negative.

In binary floating point, the storage is divided into 3 parts: sign bit, exponent and mantissa. The sign bit is like there sign bit of an Integer and is the left-most bit. The exponent is (in a typical 32-bit word) is an 8-bit signed number (integer values -128 to 127). The mantissa is often referred to as the fraction. In a 32-bit word, it usually is 23 bits long (1 for sign and 8 for exponent). The first bit is the 1/2s, then the 1/4s, and so forth. Where integer bits start at 2^0 and go up, real mantissa bits start at 2^-1 and progress to 2^-23.

So a floating point number in binary will be 2^(exp) * ((bit 1 * 1/2) + (bit 2 * 1/4) + (bit 3 * 1/8) + ... + (bit 23 * 1/8388608)). At this point, you should notice an interesting problem. Only numbers that equal a summation of 1/2^n can be represented exactly. 2, 1/4, and 64 can be represented exactly as a real binary number. But since 3 (or 1/3 for that matter) cannot be expressed as a sum of 1/2^n terms, those numbers cannot be exactly represented.

This gets really interesting as math expressions proceed. 1/3 + 1/3 + 1/3 = 1. But in floating point binary 1/3 + 1/3 + 1/3 != 1 (slightly less). Imagine the small error growing over time as more and more expressions are evaluated...

While this binary design is clearly flawed, it can be improved. If we add more bits, we can approximate 1/3 much better. Doubles (double precision floating point) is usually 64 bits. Usually, a 64-bit floating point has a sign bit, a 10-bit exponent and a 53-bit mantissa/fraction.

Reference: What Every Computer Scientist Should Know About Floating-Point Arithmetic (local copy)

Binary Coded Decimal

Binary Coded Decimal is yet another way to encode a number into binary. It has the advantage that the range (how big/small) if the number can vary based on need. It has the disadvantages of using more memory and being slower for arithmetic. In simple BCD, each decimal digit of the number is stored in a separate byte. Of course you also have to either keep a count of used bytes or pre-define a standard number of digits.

Obviously the numbers 0-9 can be represented with 4 bits (1/2 of a byte is known as a nibble). Packed BCD is where each nibble of the byte stores a separate digit. This doubles the number of digits available for a given size but slows arithmetic even more.

A variant of BCD called Zoned Decimal allows the digits to be stored one per byte as there ascii value. In ASCII, a '0' is 0x30 with a '9' being a 0x39. The high nibble (0x3 or 0011) can be added to the value 0 through 9 to attain the ascii character '0' .. '9'. This means that Zoned Decimal Math is fairly fast (and the byte with 0x0F, do the arithmetic and the or with 0x30 to return to ASCII Zoned Decimal).