Binary

Goals

Be able to convert base 2 and base 16 to base 10.
Know how to work with bit operators.
Understand two's complement representation of negative values.

Concepts

binary
bit
bitwise operator
byte
complement operator
least-significant
mask
most-significant
overflow
sign bit
sign extension
truth table
two's complement

Language

&
^
|
<<
>>
>>>
~

Library

Lesson

In one of the first lessons of this course, you discovered that computers use these base 2 or binary numbers as a convenient way to represent values using electrical signals. Computers refer to individual binary digits as bits, and typically group them in sequences of eight bits called bytes. High-level language such as Java groups these bytes into various primitive types such as int.

The value 123 (0b01111011) stored as a 32-bit Java int.

`123`	Byte 3								Byte 2								Byte 1								Byte 0
`123`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`1`	`1`	`1`	`1`	`0`	`1`	`1`

For the most part Java hides the details of binary storage, allowing you to manipulate values in terms of their primitive types. However understanding more about binary representation and binary operations will allow you to accomplish more advanced tasks in Java, as well as avoid certain pitfalls presented by certain idiosyncrasies of the Java language.

Conversion to Decimal

In elementary school you probably learned that the right-hand digit of a number was the ones-column, the next column was the tens-column, followed by the hundreds-column, etc. progressing from right to left. You might have been taught that the number 456 is equivalent to (4 × 100) + (5 × 10) + (6 × 1).

Calculating the value 456 from its decimal digits.

`4 × 100 + 5 × 10 + 6 × 1`
Sum	`= 456`

Keeping in mind that the number 456 is represented in base 10 (decimal), you might notice that the values 100, 10, and 1 are all powers of 10: 10² = 100, 10¹ = 10, and 10⁰ = 1. Any number to the power of 0 is equal to 1.

Calculating the decimal value of 456 (base 10).

Power of 10	10²	10¹	10⁰	Result
Multiplier	`100`	`10`	`1`
Input	`4`	`5`	`6`
Output	`400`	`+ 50`	`+ 6`	`= 456`

You can therefore use the following rule to convert any number in any base to decimal: the multiplier in each column is equal to the number base to the power of the zero-based column from right to left. Let's examine how this would work to convert the base 2 (binary) number 0b01111011 to decimal:

Calculating the decimal value of b01111011 (base 2).

Power of 2	2⁷	2⁶	2⁵	2⁴	2³	2²	2¹	2⁰	Result
Multiplier	`128`	`64`	`32`	`16`	`8`	`4`	`2`	`1`
Input	`0`	`1`	`1`	`1`	`1`	`0`	`1`	`1`
Output	`0`	`+ 64`	`+ 32`	`+ 16`	`+ 8`	`+ 0`	`+ 2`	`+ 1`	`= 123`

The principle works the same with base 16 (hexadecimal), as well as any other number base:

Calculating the decimal value of 0x4D2 (base 16).

Power of 16	16²	16¹	16⁰	Result
Multiplier	`256`	`16`	`1`
Input	`4`	`D`	`2`
Output	`1024`	`+ 208`	`+ 2`	`= 1234`

Bit Operators

Bit Shift Operators

There are three operators for shifting all bits to the left or the right a certain number of positions: << (left shift), >> (right shift), and >>> (unsigned right shift). The difference between the two right shift operators is that unsigned right shift operator will always fill the empty space on the left with binary 0, while the (signed) right shift operator will replicate the first bit that was originally on the left. See Negative Values below for more information on how this relates to signed and unsigned values.

The various bit shift operators applied to the 32-bit Java int value 0xABCD (1010101111001101).

`0xABCD`	`1`	`0`	`1`	`0`	`1`	`0`	`1`	`1`	`1`	`1`	`0`	`0`	`1`	`1`	`0`	`1`
`0xABCD << 1`	`0`	`1`	`0`	`1`	`0`	`1`	`1`	`1`	`1`	`0`	`0`	`1`	`1`	`0`	`1`	`0`
`0xABCD << 2`	`1`	`0`	`1`	`0`	`1`	`1`	`1`	`1`	`0`	`0`	`1`	`1`	`0`	`1`	`0`	`0`
`0xABCD >> 1`	`1`	`1`	`0`	`1`	`0`	`1`	`0`	`1`	`1`	`1`	`1`	`0`	`0`	`1`	`1`	`0`
`0xABCD >> 2`	`1`	`1`	`1`	`0`	`1`	`0`	`1`	`0`	`1`	`1`	`1`	`1`	`0`	`0`	`1`	`1`
`0xABCD >>> 1`	`0`	`1`	`0`	`1`	`0`	`1`	`0`	`1`	`1`	`1`	`1`	`0`	`0`	`1`	`1`	`0`
`0xABCD >>> 2`	`0`	`0`	`1`	`0`	`1`	`0`	`1`	`0`	`1`	`1`	`1`	`1`	`0`	`0`	`1`	`1`

In elementary school you might have learned the trick that you could multiply any number by 10 simply by shifting the number one digit to the left: 123 × 10 = 1230. Similarly dividing by 10 can be achieved by shifting the value one digit to the right: 123 ÷ 10 = 12 (ignoring the remainder). From the above tables, you should see why this is true. It also implies two rules:

You can multiply any number by its base by shifting one digit to the left.
You can dividing any number by its base by shifting one digit to the right.

Thus for example b01111011 << 1 = b11110110 (123 × 2 = 246).

Accomplishing multiplication and division by shifting bits may be especially efficient because CPUs have built-in machine language for shifting bits that may be faster than actually doing multiplication.

As with anything clever, don't get carried away with it. Most of the time a few extra cycles won't make a real difference in speed, and your colleagues will probably understand better what you're trying to do if you just use normal multiplication. Only if you're building a 3D graphics engine, for example, might you have a real excuse for playing tricks like this.

Bitwise Boolean Operators

By now working with the logical operators || (OR), && (AND), and ! (NOT) has become second nature to you. Java provides a complete set of bitwise operators that parallel the logical operators, except that instead of working on the values false and true, they work on individual bit values of 0 and 1. That is to say the bitwise operators work like the logical if the bit value 0 represented false and the bit value 1 represented true.

The bitwise operators are && (AND), | (OR), and ^ (XOR). The complement operator ~ performs an analogous function to a logical NOT. You recall that it is possible to create a truth table containing all possible outcomes involving each logical operator. The same can be done the binary operators; you'll note that the results are parallel.

Logical Truth Table
Input Values		AND	OR	XOR	NOT
`boolean foo`	`boolean bar`	`foo && bar`	`foo \|\| bar`	`foo ^ bar`	`!foo`
`false`	`false`	`false`	`false`	`false`	`true`
`false`	`true`	`false`	`true`	`true`	`true`
`true`	`false`	`false`	`true`	`true`	`false`
`true`	`true`	`true`	`true`	`false`	`false`

Binary Truth Table.
Input Values		AND	OR	XOR	NOT (complement)
`int foo`	`int bar`	`foo & bar`	`foo \| bar`	`foo ^ bar`	`~foo`
`0`	`0`	`0`	`0`	`0`	`1`
`0`	`1`	`0`	`1`	`1`	`1`
`1`	`0`	`0`	`1`	`1`	`0`
`1`	`1`	`1`	`1`	`0`	`0`

The following figure shows the result of performing each of the bitwise operators on two values.

The various bitwise operators applied to the 16-bit Java short values 123 (0b01111011) and 456 (0b111001000).

	`123`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`1`	`1`	`1`	`1`	`0`	`1`	`1`
	`456`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`1`	`1`	`1`	`0`	`0`	`1`	`0`	`0`	`0`
`123 & 456`	`72`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`1`	`0`	`0`	`1`	`0`	`0`	`0`
`123 \| 456`	`507`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`1`	`1`	`1`	`1`	`1`	`1`	`0`	`1`	`1`
`123 ^ 456`	`435`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`1`	`1`	`0`	`1`	`1`	`0`	`0`	`1`	1
`~123`	`-124`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`0`	`0`	`0`	`0`	`1`	`0`	`0`

You can think of the XOR operator as a switch; when the second value is 1, it will flip the first value from 0 to 1 or from 1 to 0; otherwise, it will leave the first value as-is.

Masking

The bitwise are often used with special bit patterns to mask a value, turning certain bits on or off; or to toggle certain bits.

Goal	Turn Bits On	Turn Bits Off	Toggle Bits
Operator	OR	AND	XOR
Description	`1` turns bit on, `0` keeps existing value	`0` turns bit off, `1` keeps existing value	`1` toggles bit, `0` keeps existing value
Value	`0b01111011`	`0b01111011`	`0b01111011`
Mask	`\| 0b11110000`	`& 0b11110000`	`^ 0b11110000`
Result	`0b11111011`	`0b01110000`	`0b10001011`

The complement operator effectively switches all bits to their opposite values. This is equivalent to using XOR operator with the same value and a mask containing all binary 1 bits. In other words, ~0b01111011 is equivalent to 0b01111011 ^ 0b11111111; or in hexadecimal, ~0x7B is equivalent to 0x7B ^ 0xFF.

You can check to see if a value is odd using masking by checking to see if its least-significant bit on the right is set.

final int result = (2 + 70) / (3 * 6);
final boolean isResultEven = !(result & 0b1);  //note negation and parentheses

Negative Values

There's a peculiarity you may have noticed in the table above. The value resulting from ~123 (~0b01111011) is -124 (0b10000100). Why is the result a negative number? In fact if we fill up all eight bits in a byte to give 0b11111111, using the base conversion calculations from the beginning of this lesson we would expect this to represent the decimal value 255. How therefore would it even be possible to represent a negative number in binary?

The approach computers usually use is to reserve one bit to be the sign bit, indicating if the value is positive or negative. The sign bit is the highest, most-significant bit on the left. But if you take away one bit from a byte, it only leaves seven for the actual value. The highest value stored in the remaining seven bits is 0b1111111, which is 127 decimal. Thus if one bit is considered a sign bit, the highest value a byte can hold is 127, and the lowest value it can hold is -127. (Why not -128?)

Two's Complement

Excerpt of value sequence using two's complement negative numbers.

…	…	…	…	…	…	…	…	…
`-3`	`1`	`1`	`1`	`1`	`1`	`1`	`0`	`1`
`-2`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`0`
`-1`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`1`
`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`
`1`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`1`
`2`	`0`	`0`	`0`	`0`	`0`	`0`	`1`	`0`
`3`	`0`	`0`	`0`	`0`	`0`	`0`	`1`	`1`
…	…	…	…	…	…	…	…	…

combination discs — Imagining a decimal number as digits on rotating discs.

The way the negative bit is encoded results in 0b11111111 for -1 and 0b11111110 for -2, etc. This encoding is called two's complement. You can visualize two's complement by thinking of all the bits as a series of rotating dials, as you did in the early lesson on values. If we start with all the bits turned off we represent the value 0 (0b00000000). Rotating the dials forward one position gives us 1 (0b00000001). Therefore rotating the dials backwards one position to the setting right before -1 must give us -1 (0b11111111).

Put another way, if you continue rotating the dials until you reach 255 (0b11111111), rotating one more position brings the dials back to 0 (0b00000000). The value before 0 is -1, so 0b11111111 must have been another way to represent -1.

So do eight bits that are turned on, 11111111, represent 255 or -1? Whether a value with its most-significant bit set to 1 is positive or negative depends on whether you are considering the bytes as signed or unsigned.

All Java integer values, including byte, are considered signed!

Because the value -1 in any of the signed types is represented by all bits set to binary 1 (e.g. 0b11111111), it is not uncommon to see -1 represented using hexadecimal (e.g. 0xFF). The use of hexadecimal is especially common as a short form for indicating masks, such as 0xFF00 to represent 0b1111111100000000.

The interesting thing about two's complement is that much of the time when doing arithmetic it doesn't matter whether the value is considered signed or unsigned, because the math works out either way as far as the bits are concerned. For example, let's say you add binary 0b11111111 and 0b00000001. That will give you 0b100000000 (notice that there are now nine bits). If you consider the value unsigned, then the calculation in decimal 255 + 1 = 256, which is correct (although the value will overflow the eight bits allowed in a byte). If you consider the values unsigned, the calculation in decimal is -1 + 1 = 0 (ignoring the overflow bit). What is important are the bit calculations. You just need to remember at the end whether you are considering them signed or unsigned.

Converting Two's Complement

Formula for converting two's complement.

twosComplement = ~n + 1

You can convert any positive binary number to its two's complement negative version (or vice-versa) by taking the complement of the bits and then adding 1:

Converting from 123 (0b0000000001111011) to -123 (0b1111111110000101) in two's complement stored as a 16-bit Java short.

`n`	`123`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`1`	`1`	`1`	`1`	`0`	`1`	`1`
`~123`	`-124`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`0`	`0`	`0`	`0`	`1`	`0`	`0`
`~123 + 1`	`-123`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`0`	`0`	`0`	`0`	`1`	`0`	`1`

Sign Extension

It should come as no mystery that as you move to larger types used to store a positive integer, the leftmost bits are filled with 0. Thus 123 would be stored as a Java byte using 0b01111011, as a Java short using 0b0000000001111011, and as a Java int using 0b00000000000000000000000001111011. This is no different than with everyday decimal numbers. (Don't forget that Java considers representations starting with 0 as indicating octal, however.)

A close examination of the representation of the 16-bit representation of -123 will show that its leftmost bits have been instead filled with 1. (This follows from the above formula for converting a positive to a negative value using two's complement: the leading zeros will be changed to ones.) Thus -123 would be stored as a Java byte using 0b10000101, as a Java short using 0b1111111110000101, and as a Java int using 0b11111111111111111111111110000101. The unexpected consequence is that when widening a signed value the sign bit (eight 0 or 1) will be extended to fill the remaining space! This makes complete sense, as one would expect the byte value -123 (0b10000101) to remain -123 (0b11111111111111111111111110000101) when widened to an int.

Widening Unsigned Values

This sign extension can cause enormous problems when widening a signed type that is being interpreted as holding an unsigned value. As you saw above, although a Java considers a byte to be signed, in most cases one can ignore the sign and consider the bits as representing an unsigned value. Thus the eight bits 0b10000101 (which Java interprets as -123) can also be interpreted as holding unsigned value 133. The problem arises when attempting to retrieve the unsigned value by casting to a wider type, because Java will extend the sign into the new space of the wider type, as discussed above.

Removing extended sign when widening a byte to an short.

final byte b = …;
final short unsigned = b & 0b0000000011111111;

Therefore when widening a signed type that is being used to hold an unsigned value, you must remove the extended sign bits. This is easily done performing an AND operation with a mask in which the upper bits are all 0.

Converting from an 8-bit byte 133 (0b0000000001111011) (also interpreted as -123) to an unsigned value by widening to a 16-bit short and removing the extended sign bits.

`byte b = (byte)133`	`-123`									`1`	`0`	`0`	`0`	`0`	`1`	`0`	`1`
`short signed = b`	`-123`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`1`	`0`	`0`	`0`	`0`	`1`	`0`	`1`
`short unsigned = signed & 0b0000000011111111`	`133`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`0`	`1`	`0`	`0`	`0`	`0`	`1`	`0`	`1`

Review

Summary

Operator Precedence
Operators	Precedence
additive	`+ -`
assignment	`= += -= *= /= %= &= ^= \|= <<= >>= >>>=`
bitwise AND	`&`
bitwise exclusive OR	^
bitwise inclusive OR	`\|`
equality	`== !=`
logical AND	`&&`
logical OR	`\|\|`
multiplicative	`* / %`
postfix	`expr++ expr--`
relational	`< > <= >= instanceof`
shift	`<< >> >>>`
ternary	`? :`
unary	`++expr --expr +expr -expr ~ !`

Gotchas

The symbols && and || are logical operators and are short-circuiting. The symbols & and | can be used as bitwise operators or logical operators, but they are not short-circuiting.
Remember that the Java integer types are signed!
When widening a signed type you are using to store unsigned values, without taking precautions the sign will be extended and not produce the unsigned value you intended.

In the Real World

You no longer have to do manual unsigned widening conversions, as this functionality is now available in the primitive wrapper classes in methods such as Byte.toUnsignedInt(byte).

Self Evaluation

What is the result of 912873465⁰?
What is the difference between the operator >> and the operator >>>?
What is the difference between the operator || and the operator |?
How could you use the & operator to determine if an int is positive or negative?
How could you replicate the results of the complement operator using the XOR operator?
In two's complement representation, is the sign represented by the least-significant bit or the most-significant bit?
What Java integer type is unsigned?
What Java type would you use to represent the values 0 – 255, that is the full value range of eight bits?

Task

Create a method getBinaryByteValues(byte[] bytes) to extract byte values to integers.

The method will consider each byte to be an unsigned value.
The method will return an int[] containing the same unsigned values as were stored in the byte[].
Test the method extensively. Include the [2, 3, 4, 234, 0, 0, 0, 234] as inputs in one of the tests.

References

Acknowledgments

Combination disc image modified from illustration of a simple combination lock by Wapcaplet and licensed under Creative Commons Attribution-Share Alike 3.0 Unported.