Command-Line Interface

Goal

Understand a program's command-line interface.
Recognize common uses of command-line arguments.
Use the args4j library to parse command-line options.
Learn how to make executable JARs with dependencies.

Concepts

alias
args4j
business logic
business objects
command
command-line argument
command-line interface (CLI)
concern
entry point
exit code
flag
graphical user interface (GUI)
JAR with dependencies
layer
N-tier
manifest
metadata
normalize
option
presentation layer
redirect
standard streams
subcommand
switch
tier
user interface (UI)

Library

Dependencies

Lesson

Until this point in this course you have been learning about the main features of the Java language, exploring useful libraries for added functionality, and using tools to facilitate the development process. This knowledge is not sufficient to make a modern application, however. To move beyond small example programs and code snippets, you need to learn larger patterns of how to group your objects and design ways for them to effectively communicate with each other in a way that is maintainable.

Separation of Concerns

A computer application may have a single purpose, but it likely does many things to meet its purpose. It may save and load things to and from a database. It may interact with the user. It may make interesting calculations. It may communicate with the Internet. In addition it may also log errors to a file. It may check for updates. It may have the ability to show menus in different languages based upon the preferred language of the user.

All of these activities are referred to as concerns of an application. If an application's concerns are all mixed together in one big application class, they become difficult to manage. The bugs get harder to find. And if we want to change some behavior, a conglomerate class makes it harder to switch an implementation out with a better one or one with enhanced functionality.

For these reasons a well-designed application will use separation of concerns (SoC). Many times the primary concerns are conceptualized as layers that communicate with each other, each still addressing a single concern. Here is a typical set of layers or tiers of an application that interacts with the user and saves information in some database.

A three-tier application. Modified from diagram by Bartledan (Wikimedia Commons).

Presentation Layer

The top-most layer, the presentation layer, is the layer responsible for gathering input from the user. After processing the data in the other layers, the program will use the presentation layer to present the output back to the user; this is what gives the layer its name.

Presentation Representation

The program may be somewhat lenient as to the format in which it accepts input from the user. The application may accept a username " jdoe " with leading or trailing spaces. This is not the correct username representation recognized by the other layers of the application, however, and would cause problems when used as map keys, for example. In order to convert the user data to the form expected by the other layers, the application needs to normalize the data, placing it in the normal, expected form—in this example, by trimming the leading and trailing spaces from the username to produce "jdoe".
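The normalization step described above can be sketched in a few lines. This is only an illustration: the class name UsernameNormalizer, the method normalizeUsername(…), and the sample map contents are assumptions made for the example, not part of any library.

```java
import java.util.HashMap;
import java.util.Map;

public class UsernameNormalizer {

  /** Converts user-entered text to the normal form expected by the other layers. */
  static String normalizeUsername(final String input) {
    return input.trim(); //remove leading and trailing whitespace
  }

  public static void main(final String[] args) {
    //hypothetical lookup table keyed by normalized username
    final Map<String, String> fullNames = new HashMap<>();
    fullNames.put("jdoe", "Jane Doe");

    final String entered = " jdoe "; //as typed by the user
    System.out.println(fullNames.get(entered)); //prints null: the raw input is not a key
    System.out.println(fullNames.get(normalizeUsername(entered))); //prints Jane Doe
  }
}
```

Keeping the trimming in the presentation layer means the map, and every other layer, only ever sees "jdoe".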

Likewise the application may format the output in a way especially accessible to the user. As you will learn in a future lesson, an application may present the value 1.23 as "1,23" in countries that use the comma character for separating the fractional portion of a number. The other layers can work with data in normal form without being distracted by these sorts of user-specific representations, as those are the sole responsibility of the presentation layer.
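As a small preview of locale-specific output, the following sketch formats the same value for two locales using the standard java.text.NumberFormat class. The class name LocalizedOutput and the method present(…) are illustrative assumptions.

```java
import java.text.NumberFormat;
import java.util.Locale;

public class LocalizedOutput {

  /** Formats a value for display using the conventions of the given locale. */
  static String present(final double value, final Locale locale) {
    return NumberFormat.getNumberInstance(locale).format(value);
  }

  public static void main(final String[] args) {
    System.out.println(present(1.23, Locale.US)); //prints 1.23
    System.out.println(present(1.23, Locale.GERMANY)); //prints 1,23
  }
}
```

The value itself stays a plain double in the other layers; only the presentation layer decides how it looks.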

User Interface

Typical Windows command-line program invocation.

git commit --all -m "fixed bug"

The presentation layer's approach to interacting with the user is referred to as the user interface (UI) of the program. Many modern applications use a graphical user interface (GUI), complete with windows, dialog boxes, and buttons. Even programs with GUIs can still be invoked from an operating system's command-line interface (CLI) in which the user types input values via the operating system console, and receives output values in response in the same location. Typical command-line interfaces include the Windows command prompt and the Unix Bash shell.

Command-Line Interface

Arguments

In a command-line interface, the user types input values via the operating system console, and receives output values in response in the same location. The initial string indicates the command to run; this usually invokes some installed application. The other strings are referred to as the command-line arguments, and they are passed to the program executable when the program is initially invoked. This is why the array parameter to a Java program's public static main(String[]) method is usually named args.
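To see what actually arrives in args, a minimal sketch such as the following can help. The class name ShowArguments and the describe(…) helper are assumptions for illustration. Note that in Java the args array holds only the strings after the class or JAR name, not the java command itself.

```java
public class ShowArguments {

  /** Returns one descriptive line per command-line argument. */
  static String[] describe(final String[] args) {
    final String[] lines = new String[args.length];
    for (int i = 0; i < args.length; i++) {
      lines[i] = "args[" + i + "] = " + args[i];
    }
    return lines;
  }

  public static void main(final String[] args) {
    for (final String line : describe(args)) {
      System.out.println(line);
    }
  }
}
```

Invoked as java ShowArguments commit --all -m "fixed bug", this would print four lines in most shells. The shell removes the quotation marks, so the last argument arrives as the single string fixed bug.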

Standard Streams

A text terminal; the running program; and the streams standard input stdin, standard output stdout, standard error stderr. — Standard streams of a command-line interface. (Wikimedia Commons)

A program started from the command line is connected to three communication channels called the standard streams, illustrated in the accompanying figure.

stdin: Input from the user, usually typed on a keyboard, is connected via the standard input stream stdin①. Java represents stdin using System.in.
stdout: Console output of the program goes to the standard output stream stdout②, which is usually the display. Java represents stdout using System.out.
stderr: Error-related output such as error messages is routed separately to the standard error stream stderr③. Normally stderr is routed to the same destination as stdout. Java represents stderr using System.err.

The operating system, command line, and/or the program itself usually provide facilities to redirect one or more of the standard streams to another location such as to a file.
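The three streams can be exercised together in a short sketch. The class name StreamDemo and its process(…) method are hypothetical. Passing the streams as parameters is a design choice made here so that the logic is not hard-wired to System.in, System.out, and System.err.

```java
import java.io.InputStream;
import java.io.PrintStream;
import java.util.Locale;
import java.util.Scanner;

public class StreamDemo {

  /** Copies non-blank input lines to out in upper case; reports blank lines on err. */
  static void process(final InputStream in, final PrintStream out, final PrintStream err) {
    final Scanner scanner = new Scanner(in, "UTF-8");
    while (scanner.hasNextLine()) {
      final String line = scanner.nextLine();
      if (line.trim().isEmpty()) {
        err.println("Skipping blank line."); //diagnostics go to stderr
      } else {
        out.println(line.toUpperCase(Locale.ROOT)); //normal output goes to stdout
      }
    }
  }

  public static void main(final String[] args) {
    process(System.in, System.out, System.err);
  }
}
```

In most shells a command such as java StreamDemo < input.txt > output.txt 2> errors.txt would redirect all three streams to files, leaving the program itself unchanged.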

Exit Code

When a command-line program finishes, it usually sends back an exit code indicating its status. By convention an exit code of 0 indicates that the program completed successfully. There are few conventions for non-success exit codes; the significance of any other code will usually be unique to the program itself.

Option Conventions

The syntax and style of program options on the command line have varied widely over the years, especially between Windows and Unix-based platforms. Windows traditionally has used forward slash / delimiters to indicate options, while Unix and similar systems have used the hyphen - character. Some programs use entire words, while others use single letters.

Classification of command-line arguments.

`git commit --all -m "fixed bug"`
	command	flag	parameter	argument
git	commit	--all	-m	"fixed bug"

There is however enough commonality in recent programs to perceive somewhat of a pattern. The conventions presented here, while certainly not rules, would be a good overall approach to follow when accepting command-line arguments in your own program. The figure shows some typical types of command-line arguments and their purpose in relation to the program being invoked.

Switches

Certain arguments are considered special words or commands recognized by the program. We call those switches or simply options, and they are usually prefixed by one or more occurrences of a special character. While Windows has traditionally used the slash / character as a switch delimiter, many modern programs are adopting the Unix-style switches which use the hyphen - character instead.

The switches in the above example are:

--all
-m

See the Summary section for a list of commonly-encountered switches.

Aliases

A program many times will support an alternate, shorter name of a switch called an alias. Either form of the switch will work, but one is usually shorter.

Conventionally the full-word form is prefixed with two hyphen -- characters, while the shorter form usually consists of only one letter and is prefixed with a single hyphen - character.

Here are options for the command above, along with their shorter aliases:

--all: -a
--message: -m
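One way a program might treat both forms identically is to translate each alias to its full-word form before doing anything else. The sketch below assumes a hand-written lookup table; the class name SwitchAliases and the method toLongForm(…) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SwitchAliases {

  //hypothetical alias table for the two switches listed above
  private static final Map<String, String> ALIASES = new HashMap<>();

  static {
    ALIASES.put("-a", "--all");
    ALIASES.put("-m", "--message");
  }

  /** Returns the full-word form of a switch, or the argument unchanged if it is not an alias. */
  static String toLongForm(final String argument) {
    return ALIASES.getOrDefault(argument, argument);
  }

  public static void main(final String[] args) {
    for (final String arg : args) {
      System.out.println(toLongForm(arg));
    }
  }
}
```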

Flags

Even though various terms are used, some switches do not take parameters; we will call those flags. In the example above the --all (-a) switch is a flag that causes all modified and deleted files to be staged automatically before committing.

If the switches are not recognized or otherwise cannot be processed, it is typical for a program to print out a guide to the available switches—the equivalent of the --help switch.

Arguments

While all of the strings passed to a program from the command-line are considered command-line arguments, from the program's perspective, program arguments are usually those options that follow a parameter. There are a couple of special cases where an argument may stand alone:

Sometimes the first option is considered a command for the program, as in git log.
The last option may be considered an argument to the program command rather than the preceding switch, such as in git rm -r directory.

If the argument has spaces, in most command-line environments you will need to place quote " characters around the string, as in "fixed bug" in the example above. Otherwise the quotation marks are optional. If there are characters specially interpreted by the operating system, however, they may need to be quoted or escaped.

Depending on the program, a parameter may support multiple arguments. Sometimes the parameter switch is allowed to be repeated, but oftentimes a single parameter switch is given followed by multiple argument values. Sometimes to ease processing multiple values are combined using some delimiter such as a comma. Here are various ways in which two values might be passed as arguments to a fictitious --when switch:

--when now later
--when now,later
--when now|later
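For the comma-delimited form, the program has to split the single string it receives back into separate values. The following is a minimal sketch; the class name MultiValue and the method splitValues(…) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiValue {

  /** Splits a combined argument such as "now,later" into its individual values. */
  static List<String> splitValues(final String argument) {
    final List<String> values = new ArrayList<>();
    for (final String value : argument.split(",")) {
      final String trimmed = value.trim();
      if (!trimmed.isEmpty()) { //ignore empty entries such as in "now,,later"
        values.add(trimmed);
      }
    }
    return values;
  }

  public static void main(final String[] args) {
    System.out.println(splitValues("now,later")); //prints [now, later]
  }
}
```

The vertical bar form would need quoting in most shells, because an unquoted | is normally interpreted as a pipe between two commands.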

Documentation

Example CLI documentation for do-something program.

Do Something.

Usage:
  do-something --when | -w (now|later) <file>

Options:
  -h --help  Show this help information.
  --version  Show the version identifier.
  -w --when  When something should be done.

It is customary for a program to print out usage documentation to the console if the --help flag is provided. There are likewise no standards in this area, but there are several conventions that have been formalized with the docopt command-line interface description language, which will be used in this lesson and throughout this course.

Here are a few of the docopt conventions for presenting parameter arguments:

<arg>: Name of some argument.
[<arg>]: Optional argument.
<arg>...: Repeating argument.
(foo|bar): Mutually exclusive argument values.

See the docopt Command-line interface description language for more information.

Java Command-Line Arguments

In Java, command-line arguments are provided to your main(…) method for processing.

public static void main(final String[] args) {
  …
}

Within your program, if you support command-line arguments you would need somehow to process the values the JVM provided to you in the args parameter. This usually involves one or more loops, some string comparisons, and variables to store the presence of recognized options and flags.
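To give a feel for that manual processing, here is a sketch that hand-parses the options of the fictitious do-something program shown earlier. The class ManualParser, its nested Parsed class, and the field names are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class ManualParser {

  /** Holds the parsed result; the field names are illustrative. */
  static class Parsed {
    boolean help;
    String when;
    final List<String> files = new ArrayList<>();
  }

  static Parsed parse(final String[] args) {
    final Parsed parsed = new Parsed();
    for (int i = 0; i < args.length; i++) {
      final String arg = args[i];
      if (arg.equals("--help") || arg.equals("-h")) {
        parsed.help = true; //flag: takes no argument
      } else if (arg.equals("--when") || arg.equals("-w")) {
        if (i + 1 >= args.length) {
          throw new IllegalArgumentException("Missing value for " + arg);
        }
        parsed.when = args[++i]; //parameter: consume the following argument
      } else if (arg.startsWith("-")) {
        throw new IllegalArgumentException("Unrecognized switch " + arg);
      } else {
        parsed.files.add(arg); //anything else is treated as a file argument
      }
    }
    return parsed;
  }

  public static void main(final String[] args) {
    final Parsed parsed = parse(args);
    System.out.println("help=" + parsed.help + " when=" + parsed.when + " files=" + parsed.files);
  }
}
```

Even this small example needs index bookkeeping and error handling for every switch, and it does not yet validate the value of --when or generate any help text.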

args4j

Iterating through string arguments and sorting them out is a tedious process. Several third-party libraries exist for assisting in this process; some provide a configurable processor, while others rely on Java annotations. No single library has become the de-facto standard for processing command-line arguments, but one named args4j is relatively popular, simple to use, and available from Maven Central. (Alternative libraries are listed in the Resources section.)

The library requires that you create some sort of class to encapsulate the parsed options and flags. You will place a unique variable in the option class to represent each command-line option.

public class TravelAgent {

  private static class CommandLineOptions {

    //the first argument, at index 0, is used as a command
    @Argument(index = 0, required = true, metaVar = "(purchase|view|refund)",
        usage = "Purchase or refund your ticket.")
    private String command;

    @Option(name = "--itinerary", aliases = "-i", metaVar = "<itinerary>",
        usage = "Itinerary number for viewing or requesting a refund.")
    private Integer itinerary;

    @Option(name = "--destinations", aliases = "-d", metaVar = "<airport>...",
        handler = StringArrayOptionHandler.class,
        usage = "Airport code(s) for flight destinations if purchasing a ticket.")
    private final List<String> airports = new ArrayList<>();

    @Option(name = "--help", aliases = "-h", help = true,
        usage = "Presents information on command-line options.")
    private boolean help;

  }

  …
}

Annotate the option class variables with the appropriate args4j annotations to indicate which switches and their aliases are expected in the command-line arguments. The most commonly used annotation is org.kohsuke.args4j.Option, indicating an option that can appear in any order. The org.kohsuke.args4j.Argument annotation defines a positional option that always appears at some index in the sequence of arguments, such as a program command. You can learn about their details by reading the args4j API documentation.

The metaVar annotation parameter is used to record information to present in the help message regarding arguments of an option.

The org.kohsuke.args4j.spi.StringArrayOptionHandler class, specified as a handler parameter of the @Option annotation, will collect multiple argument strings following a parameter option. Although the class has the word array in its name, it will work with Java collection types such as List<E>.

You should research the args4j documentation to discover other settings to make your options work effectively. For example, the @Option annotation allows additional parameters such as depends and forbids to indicate how the options interrelate.

Once you've defined which arguments you expect using args4j annotations, as soon as you get an opportunity within your program you will want to ask args4j to parse the command-line arguments. Instantiate your options class and pass it to args4j along with the string arguments provided by the JVM. args4j will parse the options, using the annotations on your options class as a guide, and fill in the options class instance with the result automatically so that you won't have to do the hard work of processing the options yourself.

You can use the following as a basic pattern of use:

public static void main(final String[] args) {
  …
  final CommandLineOptions commandLineOptions = new CommandLineOptions();
  final CmdLineParser commandLineParser = new CmdLineParser(commandLineOptions);
  try {
    commandLineParser.parseArgument(args);
  } catch (final CmdLineException cmdLineException) {
    System.err.println(cmdLineException.getMessage());
    commandLineParser.printUsage(System.out);
    System.exit(1);
  }
  …
}

The CommandLineOptions class you instantiate and pass to args4j is the class you created earlier: the definition of the command-line options you are expecting.

The CmdLineParser.printUsage(OutputStream) method can automatically print out command-line documentation for the options class you passed it, based upon the usage information you provided in the args4j annotations. This provides a convenient and easy way to print out usage information in response to e.g. a --help switch.

System.err should be used for presenting error messages to the user via stderr. Only normal program output should be presented using System.out.

The System.exit(int status) method is used to exit the program providing a custom exit code. If System.exit(…) is not used, a Java program will end automatically and return an exit code of 0 when it reaches the end of the main(…) method.

The System.exit(…) method is dangerous, and hijacks the normal flow of our program! Use it sparingly, and localize it to the top-level command-line presentation layer of your program. Never hide a System.exit(…) call inside some method; instead return the appropriate error value or throw an exception, letting the top-level program method deal with exiting the program.
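One way to follow this advice is to have the working code return a status value and to call System.exit(…) in exactly one place. The sketch below assumes a hypothetical ExitCodeDemo program; the constant names and the choice of 1 for a usage error are illustrative, not a standard.

```java
public class ExitCodeDemo {

  static final int EXIT_OK = 0;
  static final int EXIT_USAGE = 1; //hypothetical code for missing input

  /** Does the work and returns an exit code instead of exiting the JVM itself. */
  static int run(final String[] args) {
    if (args.length == 0) {
      System.err.println("Usage: exit-code-demo <name>"); //errors go to stderr
      return EXIT_USAGE;
    }
    System.out.println("Hello, " + args[0] + "!"); //normal output goes to stdout
    return EXIT_OK;
  }

  public static void main(final String[] args) {
    System.exit(run(args)); //the only System.exit call in the program
  }
}
```

After running the program, Bash makes the exit code available as $? and the Windows command prompt as %ERRORLEVEL%. Because run(…) never exits the JVM, it can also be called safely from a test or from another class.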

Executable Application JAR Files

Long ago you learned that you could execute an application through Maven using the Exec Maven Plugin. To run the Hello World program you used mvn exec:java -Dexec.mainClass="com.example.HelloWorld". You need only to tell Maven the class containing the main(…) method. Because Maven knows all your program's dependencies (and their dependencies) listed in the POM, Maven can automatically determine the classpath to use, the same as if you had listed those dependencies using the java -classpath command-line option.

Running the Hello World program from an executable JAR.

java -jar HelloWorld.jar

When you distribute an application with a command-line interface, users will want to execute it without using Maven. They will not want to worry about which class is the “main” class. They will not want to track down the dependencies and list them as a classpath. Java allows you to create a single JAR file that both indicates the main class, and contains all the dependencies inside the JAR file. You can run this sort of JAR file (sometimes called an “executable JAR”) using the java command with the -jar option.

JAR Metadata

Default META-INF/MANIFEST.MF file.

Manifest-Version: 1.0
Created-By: 1.8.0_162 (Oracle Corporation)

For a JAR to be “executable”, it must indicate the main class as part of its metadata, its internal information describing the contents of the JAR. You know that a JAR file is essentially a Zip file with a JAR extension. A JAR also provides metadata in a manifest stored in the Zip archive in the META-INF/MANIFEST.MF file. The manifest file contains pairs of header names and values. For example, the Manifest-Version provides the version of the manifest file format itself; this is always set to 1.0.

You can create a JAR file from scratch using the jar command which comes with the JDK. Normally you wouldn't need to use this command; as you'll see there are other tools that make creating JAR files easier. In fact you've already been using Maven to create a JAR file using mvn package. See Creating a JAR File for more information on using the jar command, as well as the link to the jar manual in References.

Application Entry Point

Java considers the class containing the main(…) method the application entry point, because this is the class and method the JVM first calls to transfer control to the program. You can indicate which class is the entry point using the Main-Class JAR metadata header. For the Hello World program you would add an entry to the manifest containing Main-Class: com.example.HelloWorld.

You can set the entry point manually using the jar command -e option. See Setting an Application's Entry Point.
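The manifest format can be explored from Java itself using the standard java.util.jar.Manifest class. The following sketch parses manifest text held in a string and reads back the Main-Class header; the class name ManifestReader and the method mainClassOf(…) are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestReader {

  /** Returns the Main-Class header of the given manifest text, or null if absent. */
  static String mainClassOf(final String manifestText) {
    try {
      final Manifest manifest = new Manifest(
          new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
      return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    } catch (final IOException ioException) {
      throw new UncheckedIOException(ioException);
    }
  }

  public static void main(final String[] args) {
    //each header line, including the last, ends with a newline
    final String text = "Manifest-Version: 1.0\n"
        + "Main-Class: com.example.HelloWorld\n";
    System.out.println(mainClassOf(text)); //prints com.example.HelloWorld
  }
}
```

This is the same header the java -jar command consults to decide which class to start.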

JAR with Dependencies

Even if you specify the entry point in your application JAR's manifest, you will still need to specify all your application's dependencies using the -classpath option, because those dependencies are found in other JAR files, perhaps on Maven Central or cached locally on your computer. So that your user does not have to worry about dependency JARs, you can bundle all the class files, both from your application JAR and from its dependency JARs, into a single JAR with dependencies. This type of JAR is often called an “uber JAR” or a “fat JAR”.

Apache Maven Assembly Plugin

A simple way to create a JAR with dependencies is to use the Apache Maven Assembly Plugin. The single goal will produce a single application bundle. By default this goal is not associated with any Maven phase, so you must specify an appropriate phase, such as package, in its configuration in order for the plugin to be invoked automatically during the build.

Running the Hello World program from a JAR with dependencies.

java -jar target/helloworld-1.2.3-jar-with-dependencies.jar

How this application bundle is produced depends on an “assembly descriptor”, indicated using the <descriptorRefs><descriptorRef> section of the plugin configuration. The Apache Maven Assembly Plugin has a predefined assembly descriptor named jar-with-dependencies specifically for creating this type of JAR. In addition the <archive><manifest><mainClass> section of the plugin configuration allows you to indicate the main class of your program. The figure shows how to configure the Hello World POM so that the command mvn package will produce a helloworld-1.2.3-jar-with-dependencies.jar (assuming the Hello World version is set to 1.2.3).

Using the Apache Maven Assembly Plugin in the POM to create an executable JAR with dependencies by invoking mvn package.

<project …>
  …
  <build>
    <plugins>
      …
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>3.1.0</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <archive>
            <manifest>
              <mainClass>com.example.HelloWorld</mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>

…

By default the name of the JAR with dependencies will have the assembly ID (-jar-with-dependencies) appended to the default name of the main program artifact. To produce a JAR with dependencies without this suffix, set the appendAssemblyId parameter to false. This will overwrite your main JAR, however, so be sure you don't need both a JAR without dependencies and a JAR with dependencies before using this option.

Although the Maven Assembly Plugin works fine for simple applications, it runs into problems with certain types of dependencies that use java.util.ServiceLoader for auto-discovery of services. These dependencies each contain separate META-INF/services files. Rather than combine them, the Apache Maven Assembly Plugin will discard all but one of these files, which will likely stop your application from finding all its service dependencies, preventing it from functioning correctly. The Apache Maven Assembly Plugin provides a metaInf-services “container descriptor handler” that can correctly process service definitions, but defining this descriptor handler is complicated. See one answer to How can I merge resource files in a Maven assembly? for more in-depth instructions if you want to try. It is probably simpler and more flexible to use the Apache Maven Shade Plugin, below.

Apache Maven Shade Plugin

An alternative and more flexible solution for creating a JAR with dependencies is the Apache Maven Shade Plugin. Its shade goal produces a “shaded JAR”, which is essentially a JAR with dependencies except that some dependencies have been “shaded” by renaming or processing them. The shade goal is automatically bound to the Maven package phase, so merely the goal needs to be indicated in the POM.

Running the Hello World program from a shaded JAR.

java -jar target/helloworld-1.2.3-shaded.jar

The Apache Maven Shade Plugin allows extensive configuration using “transformers”, specified in the <transformers> section of the plugin configuration. Particularly relevant is org.apache.maven.plugins.shade.resource.ManifestResourceTransformer, which allows an entry point to be specified via its <manifestEntries><Main-Class> child elements. If only the main class needs to be specified, you can use the <mainClass> element instead and omit the <manifestEntries> element. The figure shows how to configure the Hello World POM so that the command mvn package will produce a helloworld-1.2.3-shaded.jar (assuming the Hello World version is set to 1.2.3).

Using the Apache Maven Shade Plugin in the POM to create a shaded JAR by invoking mvn package.

<project …>
  …
  <build>
    <plugins>
      …
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.1.0</version>
        <executions>
          <execution>
            <goals>
              <goal>shade</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <shadedArtifactAttached>true</shadedArtifactAttached>
          <transformers>
            <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
              <mainClass>com.example.HelloWorld</mainClass>
            </transformer>
            <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
          </transformers>
        </configuration>
      </plugin>

…

The shadedArtifactAttached parameter ensures that the -shaded suffix is added to the name of the produced JAR with dependencies. To produce a JAR with dependencies without this suffix, set the shadedArtifactAttached parameter to false, its default value, or remove the parameter altogether. This will replace your main JAR; the JAR without dependencies will be given an original- prefix.

You can specify a different suffix for your JAR with dependencies using the shadedClassifierName parameter. For example, to produce a JAR with dependencies with the filename helloworld-1.2.3-exe.jar, specify <shadedClassifierName>exe</shadedClassifierName> in the plugin configuration.

Specifying the org.apache.maven.plugins.shade.resource.ServicesResourceTransformer transformer is crucial if you use dependencies that use java.util.ServiceLoader for auto-discovery of services. These dependencies each contain separate META-INF/services files, and the ServicesResourceTransformer knows how to combine these files rather than discarding all but one of them, as would happen by default (which would likely prevent your application from working correctly).
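To make the idea of combining service files concrete, here is a conceptual sketch of merging the contents of several same-named META-INF/services files into one list of providers. It is not the plugin's actual implementation. The class name ServiceFileMerge, the merge(…) method, and the provider class names are hypothetical, and inline comments after a provider name are not handled.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ServiceFileMerge {

  /** Combines the provider lines of several same-named service files, dropping duplicates. */
  static Set<String> merge(final List<String> fileContents) {
    final Set<String> providers = new LinkedHashSet<>();
    for (final String content : fileContents) {
      for (final String line : content.split("\\R")) { //split on any line break
        final String provider = line.trim();
        if (!provider.isEmpty() && !provider.startsWith("#")) { //skip blanks and comment lines
          providers.add(provider);
        }
      }
    }
    return providers;
  }

  public static void main(final String[] args) {
    final String fromFirstJar = "com.example.a.FooProvider\n";
    final String fromSecondJar = "# providers\ncom.example.b.BarProvider\n";
    //prints [com.example.a.FooProvider, com.example.b.BarProvider]
    System.out.println(merge(Arrays.asList(fromFirstJar, fromSecondJar)));
  }
}
```

If only one of the two files survived the bundling, a service loader would find only one of the two providers, which is the failure described above.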

Review

Summary

Common Switches
Switch	Alias	Description
`--all`	`-a`	Includes everything in the input or output.
`--file`	`-f`	The file that is the target of the operation. Many times, rather than using a switch, the target filename is specified as the last argument.
`--help`	`-h`	Prints out a help summary of available switches. Some programs will also display help information if the provided arguments are invalid.
`--quiet`	`-q`	Produces reduced output or no explanatory output.
`--verbose`	`-v`	Provides extra explanatory information in the output.
`--version`		Indicates the release version of the program.

Gotchas

A System.exit(…) hidden inside a method can cause problems by hijacking the normal shutdown of your program.
If you are editing a JAR META-INF/MANIFEST.MF file by hand, don't forget to add an ending newline or the last line may not be processed correctly.
If you use the Apache Maven Assembly Plugin to produce a JAR with dependencies, and you have dependencies that rely on “service loaders” to discover certain functionality providers, the produced JAR with dependencies will contain an incomplete META-INF/services and your program will probably not function correctly. Choose the Apache Maven Shade Plugin instead.

In the Real World

Use System.exit(…) sparingly, and only at the top level of your program.
Error messages should be sent to System.err.
More and more modern applications use dependencies that use “service loaders”, from logging providers to database drivers. Most real-world applications should therefore err on the side of caution and use the Apache Maven Shade Plugin rather than the Apache Maven Assembly Plugin to produce a JAR with dependencies.

Self Evaluation

What are the different delimiters commonly used for full-word options and their single-letter aliases?

Task

Your Booker class over several lessons has collected various ways to present information. Now you will turn Booker into a proper application, first concentrating on the command-line interface presentation layer.

Application

All your programs until now have simply used the public static void main(final String[] args) entry point method, and in fact all command-line Java programs use main(…) to bootstrap the application. But beyond its special entry point method, your Booker class is no different than any other class and can be instantiated, hold instance variables, etc. Creating an actual instance of the application class facilitates modularization as you add functionality. Supporting application instantiation could theoretically allow multiple instances of the application to run simultaneously in certain contexts.

Create one or more constructors for your application.
1. One of the constructors must allow command-line arguments to be passed to the application for storage.
2. Rather than passing the raw args strings to the application constructor, parse these user-specific representations within the main(…) method and pass them to the constructor using some command line option encapsulation class (see below).
Have your application class implement the java.lang.Runnable interface, which provides a convenient designation for things that can run.
Move your application's business logic from main(…) into its implementation of Runnable.run(). This business logic will use the options you stored inside the constructor.
In main(…) after parsing the command line options, instantiate your application and invoke its run() method.
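The overall shape of such an application class might look like the following sketch. It deliberately uses a hypothetical Greeter program rather than Booker, and its nested Options class is a hand-written stand-in for an args4j options class.

```java
public class Greeter implements Runnable {

  /** Encapsulates already-parsed command-line options; a stand-in for an args4j options class. */
  static class Options {
    final String name;

    Options(final String name) {
      this.name = name;
    }
  }

  private final Options options;

  /** Stores the parsed options for later use by the business logic. */
  Greeter(final Options options) {
    this.options = options;
  }

  String greeting() {
    return "Hello, " + options.name + "!";
  }

  @Override
  public void run() {
    System.out.println(greeting()); //business logic lives here, not in main
  }

  public static void main(final String[] args) {
    //parse and normalize the raw strings here, in the presentation layer
    final Options options = new Options(args.length > 0 ? args[0].trim() : "World");
    new Greeter(options).run();
  }
}
```

Here main(…) is reduced to parsing, instantiating, and running, which keeps the raw args strings out of the rest of the class.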

Alter your Booker application POM so that it produces an executable JAR with dependencies with the filename booker-version-exe.jar when you invoke mvn package. If your Booker version were 1.2.3, you should be able to run Booker from the command line without using Maven, instead using the command java -jar target/booker-1.2.3-exe.jar.

Command-Line Options

The Booker program will now concentrate on a single task: producing a list of publications. It will use the following general command-line format:

booker list
booker -h | --help

Option	Alias	Description
`list`		Lists all available publications.
`--help`	`-h`	Prints out a help summary of available switches.

Remove all the maps and trees that are not necessary for the single task of printing a list of publications.
Remove all the program output used for testing and debugging.
Create a method in Booker that prints out all known publications in a user-friendly format.
Add command-line argument support.

References

Resources

Acknowledgments

Standard stream diagram by ScotXW (Own work) [Public domain], via Wikimedia Commons.
Three-tier application diagram modified from diagram by Bartledan (talk), based on a file by User:Foofy [Public domain], via Wikimedia Commons.
