Binary

Goals

Concepts

Language

Library

Lesson

In one of the first lessons of this course, you discovered that computers use these base 2 or binary numbers as a convenient way to represent values using electrical signals. Computers refer to individual binary digits as bits, and typically group them in sequences of eight bits called bytes. High-level language such as Java groups these bytes into various primitive types such as int.

The value 123 (0b01111011) stored as a 32-bit Java int.
123 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 1
Byte 3 Byte 2 Byte 1 Byte 0

For the most part Java hides the details of binary storage, allowing you to manipulate values in terms of their primitive types. However understanding more about binary representation and binary operations will allow you to accomplish more advanced tasks in Java, as well as avoid certain pitfalls presented by certain idiosyncrasies of the Java language.

Conversion to Decimal

In elementary school you probably learned that the right-hand digit of a number was the ones-column, the next column was the tens-column, followed by the hundreds-column, etc. progressing from right to left. You might have been taught that the number 456 is equivalent to (4 × 100) + (5 × 10) + (6 × 1).

Calculating the value 456 from its decimal digits.
4 × 100
+ 5 × 10
+ 6 × 1
Sum = 456

Keeping in mind that the number 456 is represented in base 10 (decimal), you might notice that the values 100, 10, and 1 are all powers of 10: 102 = 100, 101 = 10, and 100 = 1. Any number to the power of 0 is equal to 1.

Calculating the decimal value of 456 (base 10).
Power of 10 102 101 100 Result
Multiplier 100 10 1
Input 4 5 6
Output 400 + 50 + 6 = 456

You can therefore use the following rule to convert any number in any base to decimal: the multiplier in each column is equal to the number base to the power of the zero-based column from right to left. Let's examine how this would work to convert the base 2 (binary) number 0b01111011 to decimal:

Calculating the decimal value of b01111011 (base 2).
Power of 2 27 26 25 24 23 22 21 20 Result
Multiplier 128 64 32 16 8 4 2 1
Input 0 1 1 1 1 0 1 1
Output 0 + 64 + 32 + 16 + 8 + 0 + 2 + 1 = 123

Bit Operators

Bit Shift Operators

There are three operators for shifting all bits to the left or the right a certain number of positions: << (left shift), >> (right shift), and >>> (unsigned right shift). The difference between the two right shift operators is that unsigned right shift operator will always fill the empty space on the left with binary 0, while the (signed) right shift operator will replicate the first bit that was originally on the left. See Negative Values below for more information on how this relates to signed and unsigned values.

The various bit shift operators applied to the 32-bit Java int value 0xABCD (1010101111001101).
0xABCD 1 0 1 0 1 0 1 1 1 1 0 0 1 1 0 1
0xABCD << 1 0 1 0 1 0 1 1 1 1 0 0 1 1 0 1 0
0xABCD << 2 1 0 1 0 1 1 1 1 0 0 1 1 0 1 0 0
0xABCD >> 1 1 1 0 1 0 1 0 1 1 1 1 0 0 1 1 0
0xABCD >> 2 1 1 1 0 1 0 1 0 1 1 1 1 0 0 1 1
0xABCD >>> 1 0 1 0 1 0 1 0 1 1 1 1 0 0 1 1 0
0xABCD >>> 2 0 0 1 0 1 0 1 0 1 1 1 1 0 0 1 1

Bitwise Boolean Operators

By now working with the logical operators || (OR), && (AND), and ! (NOT) has become second nature to you. Java provides a complete set of bitwise operators that parallel the logical operators, except that instead of working on the values false and true, they work on individual bit values of 0 and 1. That is to say the bitwise operators work like the logical if the bit value 0 represented false and the bit value 1 represented true.

The bitwise operators are && (AND), | (OR), and ^ (XOR). The complement operator ~ performs an analogous function to a logical NOT. You recall that it is possible to create a truth table containing all possible outcomes involving each logical operator. The same can be done the binary operators; you'll note that the results are parallel.

Logical Truth Table
Input Values AND OR XOR NOT
boolean foo boolean bar foo && bar foo || bar foo ^ bar !foo
false false false false false true
false true false true true true
true false false true true false
true true true true false false
Binary Truth Table.
Input Values AND OR XOR NOT (complement)
int foo int bar foo & bar foo | bar foo ^ bar ~foo
0 0 0 0 0 1
0 1 0 1 1 1
1 0 0 1 1 0
1 1 1 1 0 0

The following figure shows the result of performing each of the bitwise operators on two values.

The various bitwise operators applied to the 16-bit Java short values 123 (0b01111011) and 456 (0b111001000).
123 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 1
456 0 0 0 0 0 0 0 1 1 1 0 0 1 0 0 0
123 & 456 72 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0
123 | 456 507 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 1
123 ^ 456 435 0 0 0 0 0 0 0 1 1 0 1 1 0 0 1 1
~123 -124 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0

Masking

The bitwise are often used with special bit patterns to mask a value, turning certain bits on or off; or to toggle certain bits.

Goal Turn Bits On Turn Bits Off Toggle Bits
Operator OR AND XOR
Description 1 turns bit on, 0 keeps existing value 0 turns bit off, 1 keeps existing value 1 toggles bit, 0 keeps existing value
Value 0b01111011 0b01111011 0b01111011
Mask | 0b11110000 & 0b11110000 ^ 0b11110000
Result 0b11111011 0b01110000 0b10001011

Negative Values

There's a peculiarity you may have noticed in the table above. The value resulting from ~123 (~0b01111011) is -124 (0b10000100). Why is the result a negative number? In fact if we fill up all eight bits in a byte to give 0b11111111, using the base conversion calculations from the beginning of this lesson we would expect this to represent the decimal value 255. How therefore would it even be possible to represent a negative number in binary?

The approach computers usually use is to reserve one bit to be the sign bit, indicating if the value is positive or negative. The sign bit is the highest, most-significant bit on the left. But if you take away one bit from a byte, it only leaves seven for the actual value. The highest value stored in the remaining seven bits is 0b1111111, which is 127 decimal. Thus if one bit is considered a sign bit, the highest value a byte can hold is 127, and the lowest value it can hold is -127. (Why not -128?)

Two's Complement

Excerpt of value sequence using two's complement negative numbers.
-3 1 1 1 1 1 1 0 1
-2 1 1 1 1 1 1 1 0
-1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 1
2 0 0 0 0 0 0 1 0
3 0 0 0 0 0 0 1 1
combination discs
Imagining a decimal number as digits on rotating discs.

The way the negative bit is encoded results in 0b11111111 for -1 and 0b11111110 for -2, etc. This encoding is called two's complement. You can visualize two's complement by thinking of all the bits as a series of rotating dials, as you did in the early lesson on values. If we start with all the bits turned off we represent the value 0 (0b00000000). Rotating the dials forward one position gives us 1 (0b00000001). Therefore rotating the dials backwards one position to the setting right before -1 must give us -1 (0b11111111).

Put another way, if you continue rotating the dials until you reach 255 (0b11111111), rotating one more position brings the dials back to 0 (0b00000000). The value before 0 is -1, so 0b11111111 must have been another way to represent -1.

So do eight bits that are turned on, 11111111, represent 255 or -1? Whether a value with its most-significant bit set to 1 is positive or negative depends on whether you are considering the bytes as signed or unsigned.

Converting Two's Complement

Formula for converting two's complement.
twosComplement = ~n + 1

You can convert any positive binary number to its two's complement negative version (or vice-versa) by taking the complement of the bits and then adding 1:

Converting from 123 (0b0000000001111011) to -123 (0b1111111110000101) in two's complement stored as a 16-bit Java short.
n 123 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 1
~123 -124 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0
~123 + 1 -123 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 1

Sign Extension

It should come as no mystery that as you move to larger types used to store a positive integer, the leftmost bits are filled with 0. Thus 123 would be stored as a Java byte using 0b01111011, as a Java short using 0b0000000001111011, and as a Java int using 0b00000000000000000000000001111011. This is no different than with everyday decimal numbers. (Don't forget that Java considers representations starting with 0 as indicating octal, however.)

A close examination of the representation of the 16-bit representation of -123 will show that its leftmost bits have been instead filled with 1. (This follows from the above formula for converting a positive to a negative value using two's complement: the leading zeros will be changed to ones.) Thus -123 would be stored as a Java byte using 0b10000101, as a Java short using 0b1111111110000101, and as a Java int using 0b11111111111111111111111110000101. The unexpected consequence is that when widening a signed value the sign bit (eight 0 or 1) will be extended to fill the remaining space! This makes complete sense, as one would expect the byte value -123 (0b10000101) to remain -123 (0b11111111111111111111111110000101) when widened to an int.

Widening Unsigned Values

This sign extension can cause enormous problems when widening a signed type that is being interpreted as holding an unsigned value. As you saw above, although a Java considers a byte to be signed, in most cases one can ignore the sign and consider the bits as representing an unsigned value. Thus the eight bits 0b10000101 (which Java interprets as -123) can also be interpreted as holding unsigned value 133. The problem arises when attempting to retrieve the unsigned value by casting to a wider type, because Java will extend the sign into the new space of the wider type, as discussed above.

Removing extended sign when widening a byte to an short.
final byte b = …;
final short unsigned = b & 0b0000000011111111;

Therefore when widening a signed type that is being used to hold an unsigned value, you must remove the extended sign bits. This is easily done performing an AND operation with a mask in which the upper bits are all 0.

Converting from an 8-bit byte 133 (0b0000000001111011) (also interpreted as -123) to an unsigned value by widening to a 16-bit short and removing the extended sign bits.
byte b = (byte)133 -123 1 0 0 0 0 1 0 1
short signed = b -123 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 1
short unsigned = signed & 0b0000000011111111 133 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1

Review

Summary

Operator Precedence
Operators Precedence
additive + -
assignment = += -= *= /= %= &= ^= |= <<= >>= >>>=
bitwise AND &
bitwise exclusive OR ^
bitwise inclusive OR |
equality == !=
logical AND &&
logical OR ||
multiplicative * / %
postfix expr++ expr--
relational < > <= >= instanceof
shift << >> >>>
ternary ? :
unary ++expr --expr +expr -expr ~ !

Gotchas

In the Real World

Self Evaluation

Task

Create a method getBinaryByteValues(byte[] bytes) to extract byte values to integers.

See Also

References

Acknowledgments