# Binary

## Goals

- Be able to convert base 2 and base 16 to base 10.
- Know how to work with bit operators.
- Understand two's complement representation of negative values.

## Concepts

- binary
- bit
- bitwise operator
- byte
- complement operator
- least-significant
- mask
- most-significant
- overflow
- sign bit
- sign extension
- truth table
- two's complement

### Language

`&`

`^`

`|`

`<<`

`>>`

`>>>`

`~`

### Library

## Lesson

In one of the first lessons of this course, you discovered that computers use these base 2 or binary numbers as a convenient way to represent values using electrical signals. Computers refer to individual binary digits as bits, and typically group them in sequences of eight bits called bytes. High-level language such as Java groups these bytes into various primitive types

such as `int`

.

`123`

(`0b01111011`

) stored as a 32-bit Java `int`

.`123` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `1` | `1` | `1` | `1` | `0` | `1` | `1` |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Byte 3 | Byte 2 | Byte 1 | Byte 0 |

For the most part Java hides the details of binary storage, allowing you to manipulate values in terms of their primitive types. However understanding more about binary representation and binary operations will allow you to accomplish more advanced tasks in Java, as well as avoid certain pitfalls presented by certain idiosyncrasies of the Java language.

### Conversion to Decimal

In elementary school you probably learned that the right-hand digit of a number was the ones-column

, the next column was the tens-column

, followed by the hundreds-column

, etc. progressing from right to left. You might have been taught that the number `456`

is equivalent to `(4 × 100) + (5 × 10) + (6 × 1)`

.

`456`

from its decimal digits.`4 × 100` | |

Sum | `= 456` |
---|

Keeping in mind that the number 456 is represented in base 10 (decimal), you might notice that the values `100`

, `10`

, and `1`

are all powers of 10: 10^{2} = 100, 10^{1} = 10, and 10^{0} = 1. Any number to the power of 0 is equal to 1.

`456`

(base 10).Power of 10 | 10^{2} | 10^{1} | 10^{0} | Result |
---|---|---|---|---|

Multiplier | `100` | `10` | `1` | |

Input | `4` | `5` | `6` | |

Output | `400` | `+ 50` | `+ 6` | `= 456` |

You can therefore use the following rule to convert any number in any base to decimal: **the multiplier in each column is equal to the number base to the power of the zero-based column from right to left**. Let's examine how this would work to convert the base 2 (binary) number `0b01111011`

to decimal:

`b01111011`

(base 2).Power of 2 | 2^{7} | 2^{6} | 2^{5} | 2^{4} | 2^{3} | 2^{2} | 2^{1} | 2^{0} | Result |
---|---|---|---|---|---|---|---|---|---|

Multiplier | `128` | `64` | `32` | `16` | `8` | `4` | `2` | `1` | |

Input | `0` | `1` | `1` | `1` | `1` | `0` | `1` | `1` | |

Output | `0` | `+ 64` | `+ 32` | `+ 16` | `+ 8` | `+ 0` | `+ 2` | `+ 1` | `= 123` |

### Bit Operators

#### Bit Shift Operators

There are three operators for shifting all bits to the left or the right a certain number of positions: `<<`

(left shift), `>>`

(right shift), and `>>>`

(unsigned right shift). The difference between the two right shift operators is that unsigned right shift operator will always fill the empty space on the left with binary `0`

, while the (signed) right shift operator will replicate the first bit that was originally on the left. See Negative Values below for more information on how this relates to signed and unsigned values.

`int`

value `0xABCD`

(`1010101111001101`

).`0xABCD` | `1` | `0` | `1` | `0` | `1` | `0` | `1` | `1` | `1` | `1` | `0` | `0` | `1` | `1` | `0` | `1` |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

`0xABCD << 1` | `0` | `1` | `0` | `1` | `0` | `1` | `1` | `1` | `1` | `0` | `0` | `1` | `1` | `0` | `1` | `0` |

`0xABCD << 2` | `1` | `0` | `1` | `0` | `1` | `1` | `1` | `1` | `0` | `0` | `1` | `1` | `0` | `1` | `0` | `0` |

`0xABCD >> 1` | `1` | `1` | `0` | `1` | `0` | `1` | `0` | `1` | `1` | `1` | `1` | `0` | `0` | `1` | `1` | `0` |

`0xABCD >> 2` | `1` | `1` | `1` | `0` | `1` | `0` | `1` | `0` | `1` | `1` | `1` | `1` | `0` | `0` | `1` | `1` |

`0xABCD >>> 1` | `0` | `1` | `0` | `1` | `0` | `1` | `0` | `1` | `1` | `1` | `1` | `0` | `0` | `1` | `1` | `0` |

`0xABCD >>> 2` | `0` | `0` | `1` | `0` | `1` | `0` | `1` | `0` | `1` | `1` | `1` | `1` | `0` | `0` | `1` | `1` |

#### Bitwise Boolean Operators

By now working with the logical operators `||`

(OR), `&&`

(AND), and `!`

(NOT) has become second nature to you. Java provides a complete set of bitwise operators that parallel the logical operators, except that instead of working on the values `false`

and `true`

, they work on individual bit values of `0`

and `1`

. That is to say the bitwise operators work like the logical if the bit value `0`

represented `false`

and the bit value `1`

represented `true`

.

The bitwise operators are `&&`

(AND), `|`

(OR), and ^ (XOR). The complement operator `~`

performs an analogous function to a logical NOT. You recall that it is possible to create a truth table containing all possible outcomes involving each logical operator. The same can be done the binary operators; you'll note that the results are parallel.

Input Values | AND | OR | XOR | NOT | |
---|---|---|---|---|---|

`boolean foo` | `boolean bar` | `foo && bar` | `foo || bar` | `foo ^ bar` | `!foo` |

`false` | `false` | `false` | `false` | `false` | `true` |

`false` | `true` | `false` | `true` | `true` | `true` |

`true` | `false` | `false` | `true` | `true` | `false` |

`true` | `true` | `true` | `true` | `false` | `false` |

Input Values | AND | OR | XOR | NOT (complement) | |
---|---|---|---|---|---|

`int foo` | `int bar` | `foo & bar` | `foo | bar` | `foo ^ bar` | `~foo` |

`0` | `0` | `0` | `0` | `0` | `1` |

`0` | `1` | `0` | `1` | `1` | `1` |

`1` | `0` | `0` | `1` | `1` | `0` |

`1` | `1` | `1` | `1` | `0` | `0` |

The following figure shows the result of performing each of the bitwise operators on two values.

`short`

values `123`

(`0b01111011`

) and `456`

(`0b111001000`

).`123` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `1` | `1` | `1` | `1` | `0` | `1` | `1` | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

`456` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `1` | `1` | `1` | `0` | `0` | `1` | `0` | `0` | `0` | |

`123 & 456` | `72` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `1` | `0` | `0` | `1` | `0` | `0` | `0` |

`123 | 456` | `507` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `1` | `1` | `1` | `1` | `1` | `1` | `0` | `1` | `1` |

`123 ^ 456` | `435` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `1` | `1` | `0` | `1` | `1` | `0` | `0` | `1` | 1 |

`~123` | `-124` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `0` | `0` | `0` | `0` | `1` | `0` | `0` |

#### Masking

The bitwise are often used with special bit patterns to mask a value, turning certain bits on or off; or to toggle

certain bits.

Goal | Turn Bits On | Turn Bits Off | Toggle Bits |
---|---|---|---|

Operator | OR | AND | XOR |

Description | `1` turns bit on, `0` keeps existing value | `0` turns bit off, `1` keeps existing value | `1` toggles bit, `0` keeps existing value |

Value | `0b01111011` | `0b01111011` | `0b01111011` |

Mask | `| 0b11110000` | `& 0b11110000` | `^ 0b11110000` |

Result | `0b11111011` | `0b01110000` | `0b10001011` |

### Negative Values

There's a peculiarity you may have noticed in the table above. The value resulting from `~123`

(`~0b01111011`

) is `-124`

(`0b10000100`

). Why is the result a *negative* number? In fact if we fill up all eight bits in a byte to give `0b11111111`

, using the base conversion calculations from the beginning of this lesson we would expect this to represent the decimal value `255`

. How therefore would it even be possible to represent a negative number in binary?

The approach computers usually use is to reserve one bit to be the sign bit, indicating if the value is positive or negative. **The sign bit is the highest, most-significant bit on the left.** But if you take away one bit from a byte, it only leaves seven for the actual value. The highest value stored in the remaining seven bits is `0b1111111`

, which is `127`

decimal. Thus if one bit is considered a sign bit, the highest value a byte can hold is `127`

, and the lowest value it can hold is `-127`

. (Why not `-128`

?)

#### Two's Complement

… | … | … | … | … | … | … | … | … |
---|---|---|---|---|---|---|---|---|

`-3` | `1` | `1` | `1` | `1` | `1` | `1` | `0` | `1` |

`-2` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `0` |

`-1` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `1` |

`0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` |

`1` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `1` |

`2` | `0` | `0` | `0` | `0` | `0` | `0` | `1` | `0` |

`3` | `0` | `0` | `0` | `0` | `0` | `0` | `1` | `1` |

… | … | … | … | … | … | … | … | … |

The way the negative bit is encoded results in `0b11111111`

for `-1`

and `0b11111110`

for `-2`

, etc. This encoding is called two's complement. You can visualize two's complement by thinking of all the bits as a series of rotating dials, as you did in the early lesson on values. If we start with all the bits turned off

we represent the value `0`

(`0b00000000`

). Rotating the dials forward one position gives us `1`

(`0b00000001`

). Therefore rotating the dials backwards one position to the setting right before `-1`

must give us `-1`

(`0b11111111`

).

Put another way, if you continue rotating the dials until you reach `255`

(`0b11111111`

), rotating one more position brings the dials back to `0`

(`0b00000000`

). The value before `0`

is `-1`

, so `0b11111111`

must have been another way to represent `-1`

.

So do eight bits that are turned on, `11111111`

, represent `255`

or `-1`

? **Whether a value with its most-significant bit set to 1 is positive or negative depends on whether you are considering the bytes as signed or unsigned.**

#### Converting Two's Complement

You can convert any positive binary number to its two's complement negative version (or vice-versa) by taking the complement of the bits and then adding `1`

:

`123`

(`0b0000000001111011`

) to `-123`

(`0b1111111110000101`

) in two's complement stored as a 16-bit Java `short`

.`n` | `123` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `1` | `1` | `1` | `1` | `0` | `1` | `1` |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

`~123` | `-124` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `0` | `0` | `0` | `0` | `1` | `0` | `0` |

`~123 + 1` | `-123` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `0` | `0` | `0` | `0` | `1` | `0` | `1` |

#### Sign Extension

It should come as no mystery that as you move to larger types used to store a positive integer, the leftmost bits are filled with `0`

. Thus `123`

would be stored as a Java `byte`

using `0b01111011`

, as a Java short using `0b0000000001111011`

, and as a Java `int`

using `0b00000000000000000000000001111011`

. This is no different than with everyday decimal numbers. (Don't forget that Java considers representations starting with 0 as indicating octal, however.)

A close examination of the representation of the 16-bit representation of `-123`

will show that its leftmost bits have been instead filled with `1`

. (This follows from the above formula for converting a positive to a negative value using two's complement: the leading zeros will be changed to ones.) Thus `-123`

would be stored as a Java `byte`

using `0b10000101`

, as a Java short using `0b1111111110000101`

, and as a Java `int`

using `0b11111111111111111111111110000101`

. **The unexpected consequence is that when widening a signed value the sign bit (eight 0 or 1) will be ** This makes complete sense, as one would expect the

extendedto fill the remaining space!

`byte`

value `-123`

(`0b10000101`

) to remain `-123`

(`0b11111111111111111111111110000101`

) when widened to an `int`

.#### Widening Unsigned Values

This sign extension can cause enormous problems when widening a signed type that is being interpreted as holding an unsigned value. As you saw above, although a Java considers a `byte`

to be signed, in most cases one can ignore the sign and consider the bits as representing an unsigned value. Thus the eight bits `0b10000101`

(which Java interprets as `-123`

) can also be interpreted as holding unsigned value `133`

. The problem arises when attempting to retrieve the unsigned value by casting to a wider type, because Java will extend the sign into the new space of the wider type, as discussed above.

Therefore when widening a signed type that is being used to hold an unsigned value, you must remove the extended sign bits. This is easily done performing an AND operation with a mask in which the upper bits are all `0`

.

`byte`

`133`

(`0b0000000001111011`

) (also interpreted as `-123`

) to an unsigned value by widening to a 16-bit `short`

and removing the extended sign bits.`byte b = (byte)133` | `-123` | `1` | `0` | `0` | `0` | `0` | `1` | `0` | `1` | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

`short signed = b ` | `-123` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `1` | `0` | `0` | `0` | `0` | `1` | `0` | `1` |

`short unsigned = signed & 0b0000000011111111` | `133` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `0` | `1` | `0` | `0` | `0` | `0` | `1` | `0` | `1` |

## Review

### Summary

Operators | Precedence |
---|---|

additive | `+ -` |

assignment | `= += -= *= /= %= &= ^= |= <<= >>= >>>=` |

bitwise AND | `&` |

bitwise exclusive OR | ^ |

bitwise inclusive OR | `|` |

equality | `== !=` |

logical AND | `&&` |

logical OR | `||` |

multiplicative | `* / %` |

postfix | `expr++ expr--` |

relational | `< > <= >= instanceof` |

shift | `<< >> >>>` |

ternary | `? :` |

unary | `++expr --expr +expr -expr ~ !` |

### Gotchas

- The symbols
`&&`

and`||`

are*logical*operators and are short-circuiting. The symbols`&`

and`|`

can be used as*bitwise*operators or`logical`

operators, but they are not short-circuiting. - Remember that the Java integer types are signed!
- When widening a signed type you are using to store unsigned values, without taking precautions the sign will be extended and not produce the unsigned value you intended.

### In the Real World

- You no longer have to do manual unsigned widening conversions, as this functionality is now available in the primitive wrapper classes in methods such as
`Byte.toUnsignedInt(byte)`

.

### Self Evaluation

- What is the result of
`912873465`

?^{0} - What is the difference between the operator
`>>`

and the operator`>>>`

? - What is the difference between the operator
`||`

and the operator`|`

? - How could you use the
`&`

operator to determine if an`int`

is positive or negative? - How could you replicate the results of the complement operator using the XOR operator?
- In two's complement representation, is the sign represented by the least-significant bit or the most-significant bit?
- What Java integer type is unsigned?
- What Java type would you use to represent the values
`0`

–`255`

, that is the full value range of eight bits?

## Task

Create a method `getBinaryByteValues(byte[] bytes)`

to extract byte values to integers.

- The method will consider each byte to be an
*unsigned value*. - The method will return an
`int[]`

containing the same unsigned values as were stored in the`byte[]`

. - Test the method extensively. Include the
`[2, 3, 4, 234, 0, 0, 0, 234]`

as inputs in one of the tests.

## See Also

- Bitwise and Bit Shift Operators (Oracle - The Java™ Tutorials)
- Two's Complement (Thomas Finley)
- Java SE 8's New Compact Profiles and Integer APIs: Unsigned Integer API (Jeff Friesen, Informit)

## References

- Elementary arithmetic (Wikipedia)
- Summary of Operators (Oracle - The Java™ Tutorials)
- Operator Precedence (Oracle - The Java™ Tutorials)
- The Java® Language Specification, Java SE 8 Edition: Chapter 15.22. Bitwise and Logical Operators (Oracle)
- Mask (computing) (Wikipedia)
- Two's complement (Wikipedia)

## Acknowledgments

- Combination disc image modified from illustration of a simple combination lock by Wapcaplet and licensed under Creative Commons Attribution-Share Alike 3.0 Unported.