Iterators

Goals

Concepts

Language

Library

Lesson

Abstract Data Types

The name of the LinkedList interface we have been implementing so far has been a little too restrictive. The interface has a getLength() method, and you can access any element by index using get(int index). But nothing about this interface requires that you use a linked list specifically. You'll note that Java arrays also provide a length, and also allow you to look up elements based upon an index. Arrays don't provide a get(int index) method, but you could wrap a class around an array and provide your own get(…) method to get access to the underlying array elements.

Arrays and linked lists are examples of data structures, and their names indicate how information is literally arranged in memory. We can step back and talk about information on a higher level: how the information is logically arranged and accessed. At this level, we could say that arrays and linked lists provide different implementations of how one might store a list of items. In this view list is an abstract data type (ADT) and describes how data is logically stored and accessed. Array and linked list are two data structures that can be used to implement a list abstract data type.

With this more refined view, let's rename the LinkedList interface to simply ListADT for list abstract data type. We'll also remove the getFirstNode() method, because nodes are specific to graph-based data structures such as linked lists, but aren't necessary for other data structures such as arrays.

Interface for list abstract data type.
/**
 * A list abstract data type.
 * @param <E> The type of element in the list.
 */
 public interface ListADT<E> {

  /** @return The length of the list. */
  int getLength();


  /**
   * Retrieves the object at the given index.
   * @param index The index at which to retrieve the object.
   * @return The object at the given index, which may be <code>null</code>.
   * @throws IndexOutOfBoundsException if the given index is less than zero;
   *     or greater than or equal to the number of elements in the list.
   */
  @Nullable E get(int index);

  /**
   * Add an item into the list after the current last item.
   * @param element The item to add.
   */
  default void add(@Nullable E element) {
    insert(getLength(), element);
  }

  /**
   * Inserts an item into the list at a given index.
   * @param index The index at which the object should be inserted,
   *     with a minimum of zero and a maximum of the length of the list.
   * @param element The item to add.
   * @throws IndexOutOfBoundsException if the given index is less than zero;
   *     or greater than the number of elements in the list.
   */
  void insert(int index, @Nullable E element);

}

Array-Based List

To illustrate how the list ADT is distinct from the data structure used in implementation, we can construct a list ADT that is based on an array rather than a series of linked nodes.

Array-based implementation of a list abstract data type.
/**
 * A list ADT based on an array.
 * @param <E> The type of element in the list.
 */
public class ArrayListADT<E> implements ListADT<E> {

  private E[] array;

  public ArrayListADT() {
    //TODO create array
  }

  @Override
  public int getLength() {
    return array.length;
  }

  @Override
  public E get(final int index) {
    return array[index];
  }

  @Override
  public void insert(int index, E element) {
    if(index < 0 || index > array.length) {
      throw new IndexOutOfBoundsException();
    }
    //TODO create a new array with one more element
    //TODO copy over all items, leaving a space at index
    array[index] = element;
  }
}

Iterator<E>

When working with your LinkedListImpl you might have stepped through the elements with a for(…) loop, using an index variable to look up each element using get(int index). You might have realized that this approach is somewhat inefficient, because finding a particular node in a linked list required walking through and counting all the nodes, one at a time, until you get to the requested node. The earlier LinkedList interface exposed a getFirstNode() method that would have allowed a more efficient traversal of the nodes, but we have removed that from ListADT<E>, and for good reason: it wouldn't work for other list ADT implementation such as ArrayListADT<E>.

Java provides an iterator interface java.util.Iterator<E> which allows us to step through or iterate across a series of items. It keeps track of the underlying state for us, shielding the developer from the underlying implementation of the ADT. If the ADT is based on an array, for instance, the iterator will keep track of the current index. If the ADT is based on a linked list, the ADT will keep track of the current node. The Iterator<E> interface is simple; it has only two methods that must be implemented (although it has others with default implementations): Iterator.hasNext() and Iterator.next().

java.util.Iterator<E>
package java.util;

public interface Iterator<E> {

  /**
   * @return <code>true</code> if the iteration has more elements.
   */
  boolean hasNext();

  /**
   * @return The next element in the iteration
   * @throws NoSuchElementException if the iteration has no more elements.
   */
  E next();

  …

}

If you had an iterator to a ListADT, for example, you could use the following pattern without worrying about the underlying implementation.

Outline of iterating an Animal list.
public static void printAnimals(final ListADT<Animal> animals) {
  final Iterator<Animal> animalIterator = //TODO get iterator to ListADT of animals
  while(animalIterator.hasNext()) {
    final Animal animal = animalIterator.next();
    System.out.println(animal);
  }
}

Now we need to get an iterator to our ListADT<E>. This is usually done by asking the ListADT<E> itself for an iterator, as the list must provide an iterator that knows about its internal implementation details. We can thus add a method newIterator() to the ListADT<E> interface.

Adding a method for creating a new iterator to the ListADT<E> interface.
public interface ListADT<E> {

  /** @return The length of the list. */
  int getLength();

  /**
   * Retrieves the object at the given index.
   * @param index The index at which to retrieve the object.
   * @return The object at the given index, which may be <code>null</code>.
   * @throws IndexOutOfBoundsException if the given index is less than zero;
   *     or greater than or equal to the number of elements in the list.
   */
  @Nullable E get(int index);

  …

  /** @return A new iterator for stepping through the items in the list. */
  Iterator<E> newIterator();

}

Because the iterator must know about the specific implementation of the ListADT<E>, this indicates that there is a very close relationship between the iterator and list implementation. It is also unlikely that an iterator for a specific list type could be used generally by other classes. These considerations indicate that it might be a good idea to make our iterator implementation an inner class of the list implementation class. Because the iterator will need to access actual data from the list ADT, we will make the iterator a non-static inner class. The iterator will use an internal index variable to keep track of the state of iteration—in this case the current position in the list.

Iterator implementation as an inner class of ArrayListADT<E>.
public class ArrayListADT<E> implements ListADT<E> {

  private E[] array;

  …

  @Override
  public Iterator<E> newIterator() {
    return new IteratorImpl();
  }

  private class IteratorImpl implements Iterator<E> {

    private int index = 0;

    @Override
    public boolean hasNext() {
      return index < array.length;
    }

    @Override
    public E next() {
      //looking up the item in the array will throw the correct exceptions as needed
      final E next = array[index];
      index++; //next time we'll get the following item
      return next;
    }
  }
}

Now you can use the iterator to iterate over a list.

Using an iterator to iterate an Animal list.
public static void printAnimals(final ListADT<Animal> animals) {
  final Iterator<Animal> animalIterator = animals.newIterator();
  while(animalIterator.hasNext()) {
    final Animal animal = animalIterator.next();
    System.out.println(animal);
  }
}

Removing Items

The Iterable<E> interface also provides an Iterator.remove() method. By default this method throws an UnsupportedOperationException, but if the underlying ADT is mutable and supports removal its iterator will implement Iterable.remove() so that it removes the last retrieved item from the iterator. Here's how it could be used to remove all low miles-per-gallon vehicles from a list.

Removing items during iteration.
final int MIN_MPG = 10;
final ListADT<Vehicle> vehicles = getVehicles();
final Iterator<Vehicle> vehicleIterator = vehicles.newIterator();
while(vehicleIterator.hasNext()) {
  final Vehicle vehicle = vehicleIterator.getNext();
  if(vehicle.getMPG() < MIN_MPG) {
    vehicleIterator.remove();  //remove the vehicle from the list
  }
}

ListIterator<E>

An Iterator<E> is extremely simple, allowing one to walk through any sequence of items in one direction. Java provides an iterator sub-interface called java.util.ListIterator<E> which provides additional functionality for iterating sequences that behave like lists. A ListIterator<E> has methods for going forwards and backwards in a sequence, and can even provide the previous and next indexes in a list.

java.util.ListIterator<E>
package java.util;

public interface ListIterator<E> extends Iterator<E> {

  …

  boolean hasPrevious();

  E previous();

  int nextIndex();

  int previousIndex();

  void set(E e);

  void add(E e);
}

Iterable<T>

Java further enhanced and standardized the way to retrieve iterators from an ADT using an iterable. The java.lang.Iterable<T> interface indicates that this object knows how to return a new iterator to its items and provides one central method for retrieving this iterator: Iterable.iterator(). Java chose to use the simple method name "iterator()" rather than "newIterator()" as we used above, but the method functions the same.

java.lang.Iterable<E>
package java.lang;

public interface Iterable<T> {

  /**
   * @return An iterator over elements of type <code>T</code>.
   */
  Iterator<T> iterator();

} 

We can consequently remove the newIterator() method from our ListADT<E> interface and simply extend the Iterable<T> interface.

Extending Iterable<E> to add access to iterables in the ListADT<E> interface.
public interface ListADT<E> extends Iterable<E> {

  /** @return The length of the list. */
  int getLength();

  /** Retrieves the object at the given index.
   * @param index The index at which to retrieve the object.
   * @return The object at the given index, which may be <code>null</code>.
   * @throws IndexOutOfBoundsException if the given index is less than zero;
   *     or greater than or equal to the number of elements in the list.
   */
  @Nullable E get(int index);

  …

}
public class ArrayListADT<E> implements ListADT<E> {

  …

  @Override
  public Iterator<E> iterator() {
    return new IteratorImpl();
  }

  private class IteratorImpl implements Iterator<E> {

    …

  }
} 

Enhanced for(…) Loop

The Iterable<T> interface does more than simply provide a standard way for retrieving iterators, however. Any Iterable<T> implementation can be used in an enhanced for(…) loop! Because our ListADT<E> interface (and our ArrayListADT<E> implementation) now ultimately implement Iterable<T>, our printAnimals() method becomes a lot simpler.

Using an enhanced for(…) loop to iterate an Animal list.
public static void printAnimals(@Nonnull final ListADT<Animal> animals) {
  for(final Animal animal : animals) {
    System.out.println(animal);
  }
}

Yes, it really is that simple. Behind the scenes Java will invoke animals.iterator() and use similar logic to what we used above, calling hasNext() and next() to step through the elements.

Review

Think About It

How an Iterator<E> works inside may seem a little mysterious at first, but you should remember that an Iterator<E> merely does the same job you would do if you were to iterate over your data structure at a low level. If you were manually iterating over an array (not using an enhanced for(…) loop), you would create an index variable an increment it each time you go to the next item. An Iterator<E> simply internalizes or encapsulates this variable, allowing you to update the variable using the Iterator<E> methods.

If you were iterating over a linked list manually, you would create a variable pointing to a particular node. When it came time to go to the next node, you would check the reference to the next node and then update the node reference to point to the next node, and so on until you run out of nodes. The Iterator<E> of a linked list then will do the same thing. It will not be based upon an index, but on a reference to a next node that is updated.

We say that an Iterator<E> "encapsulates state", or the current conditions of iteration. When we work with an Iterator<E>, it updates some internal information (whether that is an index or a node reference), so that when we come back to it the internal information is in the same state as we left it. An Iterator<E> will keep track of this state, updating it as necessary, until we no longer reference the Iterator<E> (such as when it goes out of scope) and it is discarded by the garbage collector.

Self Evaluation

Task

Create an Iterator<E> implementation for your LinkedListImpl<E> class.

Add two methods getKeys() and getValues() to your hash table. You should be able to use these methods in an enhanced for(…) loop to iterate over the keys and values, respectively, of your hash table. The order of the returned values does not matter.

Acknowledgments