Chapter 16 -- Command-line Interface with Perl

The Command-line Options to Perl
Reading Input from STDIN
The Getopts Package
Summary

This chapter introduces you to handling the options available from the command-line interface to the Perl interpreter, handling user input, and writing interactive Perl scripts.

By using the command-line options in Perl, you can determine how to best use the Perl interpreter to take care of details such as handling loops on the input, creating and destroying temporary files, and handling multiple files.

Of course, you will want to process the incoming options to your own programs as well. Writing scripts to handle user responses takes an inordinate amount of time and effort, given the nearly unlimited variety of responses you can receive. When writing installation scripts for a software package, it would be nice if the scripts were intelligent enough to filter out most of the incorrect responses. In this chapter, you work with Perl modules that eliminate some of the grunt work.

The Command-line Options to Perl

Perl's command-line options provide many features, such as checking syntax, printing warnings, using the C preprocessor, and modifying the way output is printed in a Perl document. There are two ways to provide options to a Perl program: either by passing them in the command line along with the command you enter to start the Perl program or in the comment header of your Perl program script.

Sending Options via the Command Line

You can always enter options for a Perl program on the command line. The syntax for specifying options on the command line is

perl options programName

where programName is the name of the Perl program to run, and options is the list of options to provide to the program being run. For example, the command

perl -d -w test1

runs the Perl program named test1 and passes it the options -d and -w. You'll learn about the actions of these options in the following sections. Some options require an additional value. For example, the -I option requires a pathname for include files.

perl -I /usr/local/include/special something

The /usr/local/include/special path is also searched for a file if the file is not found via the @INC path. It is not necessary to put a space between the option and its argument.

perl -I/usr/local/include/special something

In all cases, any value associated with an option must always immediately follow the option.

Options that do not require an associated value can be grouped without the use of an additional dash (-) character or space. For example, the following two commands do the same thing:

perl -d -w test1
perl -dw test1

The last option in a group can have additional values. For example, the following two commands do the same thing:

perl -w -I/usr/local/include/special something
perl -wI/usr/local/include/special something

Specifying an Option within the Program

The header comment at the start of a program (a comment beginning with the #! characters) can be used to pass options to Perl. For example, the following line:

#!perl -w

will pass the -w option to Perl. Historically, only one argument could be passed to Perl this way, but now you can pass several options. A word of caution is necessary here: Perl reads the options in the header comment even when you start the program with the perl command, so they are applied in addition to any options you type on the command line. For example, if your header comment is

#!perl -d

and you start your program with the following command, the program will run with both the -w and the -d options in effect:

perl -w test1

Table 16.1 lists some of the command-line options to Perl.

Table 16.1. Command-line options to Perl.

Option	Meaning
`-c`	Do syntax checking only.
`-d`	Start the debugger.
`-e`	Execute a program from the command line.
`-i`	Edit the input files in place.
`-I`	Specify the paths to search for included files.
`-p`	Echo each line of input.
`-P`	Use the C preprocessor.
`-s`	Parse very basic command-line switches to the program.
`-T`	Turn on taint checking, which is used for writing secure programs. With this option, data obtained from outside the program cannot be used in any command that affects your file system. This feature lets you write secure programs for system administration tasks.
`-u`	Generate a core dump.
`-U`	Run in unprotected mode (full access to file system).
`-v`	Print the version number.
`-w`	Print warnings.
`-W`	Enable all warnings.

The following sections cover each option in more detail. The options are presented in the order in which they are most likely to be found in Perl scripts rather than in alphabetical order.

The `-c` and `-w` Syntax Checking and Warning Options

The -c option asks the Perl interpreter to check the syntax of your Perl program without actually running it. All other options except for -v and -w are ignored by the Perl interpreter when it sees the -c option. The -w option prints warnings in addition to errors. An error stops your program; a warning is issued when the interpreter sees a questionable or ambiguous construct but can still carry on. The -c and -w options can be combined in a single flag, like this: -cw.

If the program you provide is syntactically correct, the Perl interpreter will print the message

filename syntax OK

where filename is the name of your program. If any errors are detected, you'll see the following message where filename is the name of your program:

filename had compilation errors

The -w option prints a warning every time the Perl interpreter sees something that might cause a problem. Here are some of the potential problems:

Having more than one subroutine with the same name. The later definition silently replaces the earlier one, so only the last one is ever called, and the program won't crash. Use the -w option to warn about this problem.
Using the value of a variable that has not been defined.
Using the == operator to compare strings instead of the eq operator.

Note

A number is converted to a string when compared with a string using the eq operator. However, a string used with the == operator is converted to a number, and a string that does not begin with a number is converted to the numeric value of 0.
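The difference between the two operators is easy to see in a short test. The following is a minimal sketch; compare_both is a hypothetical helper name, and the numeric warning is switched off inside it only so that the demonstration runs quietly.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two values with == and with eq
# and return both verdicts as 1 (true) or 0 (false).
sub compare_both {
    my ($left, $right) = @_;
    no warnings 'numeric';    # silence the warning that -w prints for ==
    my $numeric = ($left == $right) ? 1 : 0;
    my $string  = ($left eq $right) ? 1 : 0;
    return ($numeric, $string);
}

# Two different words are both treated as 0 by ==, so == calls them equal.
my ($num, $str) = compare_both("apple", "orange");
print "apple vs orange: == gives $num, eq gives $str\n";

# The same number written two ways is equal numerically but not as a string.
($num, $str) = compare_both("10", "10.0");
print "10 vs 10.0: == gives $num, eq gives $str\n";
```

In both cases == reports a match and eq does not, which is why -w flags == when it is used on strings.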

The `-e` Option: Executing a Single-line Program

You can execute Perl statements at the command line with the -e option. Here is an example of a command that prints a string:

$ perl -e 'print ("Kamran Wuz Here\n");'
Kamran Wuz Here

Don't forget to type the semicolon (;) at the end of each statement. You can specify more than one statement by using either semicolons to separate them or using multiple -e options. For example, the following two commands both print Howdy and folks on separate lines:

$ perl -e 'print ("Howdy\n");' -e 'print (" folks\n");'
Howdy
 folks
$ perl -e 'print ("Howdy\n"); print (" folks\n");'
Howdy
 folks

In the case of multiple -e options, the Perl interpreter executes them from left to right. Here's an example:

$ perl -e 'print ("Donald");' -e 'print (" Duck");'
Donald Duck

The `-s` Option to Supply Custom Command-line Options

Generally, a Perl script with execute permissions names the Perl interpreter in the first line of the script file, as follows:

#!/usr/bin/perl

The first line is the complete pathname to the Perl interpreter.

You can run the same script by entering the following at the command line:

perl scriptFile

where scriptFile is the name of the script file. Any command-line options specified before the script file's name will be passed to the Perl interpreter and not to your script file.

To pass options to the script that you run, you have to use the -s option.

perl -s scriptFile -w

This command starts the Perl program scriptFile and passes it the -w option. If you do not specify the -s option, your -w will be sent as part of the @ARGV array to the program being run. For programs that are run from the command line with the Perl command, it's best to include -s as part of your header comment:

#!perl -s

This way the program checks for options whenever it is run, without your having to remember to type -s on the command line.

A scalar variable with the same name as the name of any specified option is created and automatically set to 1 before the Perl interpreter executes a program. For example, if a Perl program named scriptFile is called with the -x option, as in

perl -s scriptFile -x

the scalar variable $x is automatically set to 1. This lets you test the $x variable in a conditional expression to see whether the option has been set. The option itself does not appear in the @ARGV array. Options do not have to be a single character. For example, the following command sets the value of the scalar variable $surge to 1:

perl -s scriptFile -surge

Options can be set to a value other than 1 by simply assigning a value at the command line. For example:

perl -s scriptFile -surge="power"

This command sets the value of $surge to power in the program specified in scriptFile.

The -s option lets you supply both options and command-line arguments based on these rules:

All arguments that start with a dash (-) and immediately follow the program name are assumed to be options.
Any argument not starting with a dash (-) is assumed to be an ordinary argument. All subsequent arguments, even if they start with a - , are then assumed to be ordinary arguments and not options.
A double dash (--) will end Perl's parsing of command-line switches.

This means, for example, that the command

perl -s scriptFile -arg1 arg2 -arg3

treats -arg1 as an option to scriptFile, and arg2 and -arg3 are ordinary arguments that are placed in @ARGV.
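The rules above can be sketched in plain Perl. This is only an emulation written for illustration, not the interpreter's own code; split_switches is a hypothetical name, and it returns the options in a hash rather than creating variables such as $arg1.

```perl
use strict;
use warnings;

# Hypothetical helper that mimics the -s rules: leading arguments that
# start with a dash are options; the first ordinary argument, or "--",
# ends option parsing.
sub split_switches {
    my @args = @_;
    my %switch;
    while (@args && $args[0] =~ /^-(.+)/s) {
        my $name = $1;
        shift @args;
        last if $name eq '-';                # "--" ends option parsing
        if ($name =~ /^([^=]+)=(.*)$/s) {
            $switch{$1} = $2;                # -name=value
        } else {
            $switch{$name} = 1;              # bare -name is set to 1
        }
    }
    return (\%switch, \@args);
}

my ($switch, $rest) = split_switches(qw(-arg1 arg2 -arg3));
print "options: ", join(", ", sort keys %$switch), "\n";
print "arguments: @$rest\n";
```

Only arg1 ends up as an option here; arg2 stops the parsing, so -arg3 is left among the ordinary arguments.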

The `-I` Option to Include Other Files

The -I option is used with the -P option (which is described in the next section). The -I option lets you specify the pathnames to search for include files to be processed by the C preprocessor. For example:

perl -P -I /usr/local/other scriptFile

This command tells the Perl interpreter to search the directory /usr/local/other for include files if the file is not found in its default paths. The default path is the current directory and, if the file is not found, the /usr/local/lib/perl directory. The -I option can be repeated on the same command line to specify more than one include-file path.

Using the -I option also adds the path or paths to the @INC variable. The paths are then made available to the Perl interpreter when it uses the require function to find its modules.

The `-P` Option for Using the C Preprocessor

The -P option is helpful only if you have a C compiler on your system. Although most UNIX systems come with a C compiler, DOS and Windows NT systems don't; you have to purchase your own. The cpp preprocessor is the default C preprocessor in UNIX. The C preprocessor is a program that reads source text before it is compiled and performs basic string substitution based on the macros you define. To enable the use of cpp with the -P option, use the following command to start your Perl program:

perl -P scriptFile

The Perl program scriptFile is first run through the C preprocessor, and then the resulting output is executed by the Perl interpreter. You can also specify the use of the C preprocessor in the header comment like this:

#!perl -P

All C preprocessor statements have the following syntax:

#command value

The hash (#) is interpreted by Perl as a comment, and so any statements intended for the C preprocessor are ignored by Perl even if the -P option is not used. The command is the preprocessor operation to perform, and value, which is optional, is associated with this operation.

The #define operator is the most common preprocessor statement. It tells the preprocessor to replace every occurrence of a particular character string with a specified value. The syntax for #define is

#define item value

This statement replaces all occurrences of the character string item with value. This substitution operation is known as macro substitution. The item being substituted can contain a combination of letters, digits, and underscores. The value specified in a #define statement can be any character string or number. For example, the following statements will replace all occurrences of DOCTOR with QUACK, and Donald with "Duck", including the quotation marks:

#define DOCTOR QUACK
#define Donald "Duck"

Any expressions are treated as strings. For example, the following statement:

#define AREA (3.141 * 2)

replaces AREA with the string (3.141 * 2), including the parentheses and not the value of 6.282.

When using #define with expressions, don't forget to enclose the value in parentheses. For example, consider the following Perl statement:

$result = ITEM * 10;

If the statement you use in your preprocessor command is this:

#define ITEM 1 + 2

the resulting Perl statement is this:

$result = 1 + 2 * 10;

This statement assigns 21 to $result instead of 30, which would be the result if you used this expression instead:

#define ITEM (1 + 2)

to get this statement as a result:

$result = (1 + 2) * 10;
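The two expansions can be checked in plain Perl without the preprocessor by writing out by hand what the macro would produce. This is a small sketch; the variable names are made up for the illustration.

```perl
use strict;
use warnings;

# What "#define ITEM 1 + 2" expands to inside "ITEM * 10".
my $without_parens = 1 + 2 * 10;

# What "#define ITEM (1 + 2)" expands to inside "ITEM * 10".
my $with_parens = (1 + 2) * 10;

print "without parentheses: $without_parens\n";
print "with parentheses: $with_parens\n";
```

Multiplication binds more tightly than addition, so the first form multiplies only the 2 by 10.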

You can even specify parameters with a #define statement, thus enabling you to use a preprocessor command like a simple function that accepts arguments. For example, this preprocessor statement:

#define SQUARE(val) ((val) * (val))

will return the square of a passed value. This statement:

$result = SQUARE(4);

will generate the following statement:

$result = ((4) * (4));

Multiple parameters are specified using a syntax similar to a Perl program. For example, consider the following statement:

#define POW(base, power) ((base) ** (power))
$result = POW(2,3);

It produces this result:

$result = ((2) ** (3));

Macros can be reused. For example,

#define PI 3.141
#define AREA(rad) (PI * (rad) * (rad))
$result = 43 + AREA($radius);

Here, the macro PI is defined first, and the macro AREA then uses PI to return the area of a circle for the radius given in $radius.

Using the `-n` and `-p` Options for Handling Multiple Files

When processing input from multiple files, it's often convenient to put the processing function in a while(<>) loop so that each line in each file is sequentially processed. For example, you'll see code of the following form:

while ($line = <>) { &processMe($line) }

Use the -n option to avoid writing the while loop yourself. This option forces Perl to take your program and execute it once for each line of input in each of the files specified on the command line. Here's an example:

#!perl -n
$line = $_;
chop ($line);
printf ("%d %-52s *\n", $ctr++, $line);

The -n option encloses this program in an invisible while loop. Each line of input is stored in the system variable $_ by the Perl interpreter, which then calls this program. The same program could be rewritten as follows:

#!perl
while (<>) {
    $line = $_;
    chop ($line);
    printf ("%d %-52s *\n", $ctr++, $line);
}

The -n and -e options can be used together to perform a function on each line of input of all input files. For example, the following commands both search for the word param in all files whose names end with .pl (the single quotes keep the shell from interpreting $_ before Perl sees it):

perl -n -e 'print $_ if (/param/);' *.pl
grep "param" *.pl

The print $_ if (/param/); argument supplied with the -e option is a one-line Perl program that prints the current line if the word param is found in it. The -n option executes the one-line program once for each input line that is set into the system variable $_.

The -p option is like the -n option except that it also prints each line after your program has processed it. The -p option is designed for use with the -i option, which is described in the following section. If both the -p and the -n options are specified, the -n option is ignored.
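As a rough sketch of the difference between the two options, the following code runs a small per-line "program" over a string instead of over files. The run_lines helper and its arguments are hypothetical; the body returns whatever it wants printed, and the automatic print of $_ is added only in the -p style.

```perl
use strict;
use warnings;

# Hypothetical helper: run $body once per line of $text.
# $print_each = 0 behaves like -n; $print_each = 1 behaves like -p.
sub run_lines {
    my ($text, $print_each, $body) = @_;
    my $out = '';
    open(my $in, '<', \$text) or die "cannot open in-memory file: $!";
    while (defined(my $line = <$in>)) {
        local $_ = $line;            # the current line, as -n and -p provide it
        $out .= $body->();           # whatever the one-line program emits
        $out .= $_ if $print_each;   # -p prints $_ after the program has run
    }
    close($in);
    return $out;
}

# Like: perl -n -e 'print $_ if (/param/);'
print run_lines("my \$param = 1;\nprint 'hi';\n", 0,
                sub { /param/ ? $_ : '' });

# Like: perl -p -e 's/Costa/Rica/g;'
print run_lines("Costa del Sol\n", 1,
                sub { s/Costa/Rica/g; '' });
```

In the -n style nothing appears unless the program prints it; in the -p style every line appears, changed or not.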

The `-i` Option to Edit Files

Both the -n and -p options read lines from the files whose names are listed on the command line. When the -i option is used with the -p option, the Perl interpreter takes the input lines being read and writes them back out to the files from which they came. For example, consider the following command:

perl -p -i -e "s/Costa/Rica/g;" *.txt

It replaces every instance of Costa with Rica in all the files whose names end with .txt.

Caution

Do not use the -i option with the -n option. The following command:

perl -n -i -e "s/Stock/Option/g;" *.txt

also changes all occurrences of Stock to Option. However, it does not write out the input lines after it changes them! Because the -i option forces the input files to be written to and nothing is printed, you'll erase the contents of all the files with .txt extensions!

The -i option does not have to work in conjunction with the -p option if the program that uses the option contains the <> operator inside a loop. For example, consider the following command, where scriptFile is the name of such a program:

perl -i scriptFile *.txt

For each input file, the Perl interpreter copies the content of the file to a temporary file and opens that copy for reading. The input file is then closed and reopened for writing. This process is repeated for all input files.

Listing 16.1 presents a simple example of a program using both the -i option and the <> operator. This program replaces all occurrences of Wall with Brick.

Listing 16.1. A program that edits files using the -i option.

#!perl -i
while ($line = <>) {
    $line =~ s/Wall/Brick/g;
    print ($line);
}

No output is sent to the screen because the output is redirected to each input file.

The -i option can be used to back up input files, too. By specifying a file extension to the -i option, you can ask that each original file be saved under its own name with that extension appended. The extension must follow -i directly, with no space. For example, the following command, where scriptFile is a program such as the one in Listing 16.1:

perl -i.bak scriptFile dog mouse

will result in two extra files, dog.bak and mouse.bak, being written to disk. The .bak file extension specified with -i will force the Perl interpreter to copy each file to file.bak before overwriting it.
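The same in-place editing with a backup can be requested from inside a program through the $^I variable, which holds the extension given to -i. The following is a minimal sketch under that assumption; edit_in_place is a hypothetical helper, and the file it edits is created in a temporary directory so that nothing of yours is overwritten.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: replace $from with $to in each file,
# keeping a copy of the original with $ext appended to its name.
sub edit_in_place {
    my ($from, $to, $ext, @files) = @_;
    local ($^I, @ARGV) = ($ext, @files);   # $^I is the variable behind -i
    while (my $line = <>) {
        $line =~ s/\Q$from\E/$to/g;
        print $line;                       # written to the file, not the screen
    }
    return;
}

# Create a small sample file to edit.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/dog";
open(my $fh, '>', $file) or die "cannot write $file: $!";
print {$fh} "The Wall is tall\n";
close($fh);

edit_in_place('Wall', 'Brick', '.bak', $file);

# Show the edited file and confirm that the backup exists.
open($fh, '<', $file) or die "cannot read $file: $!";
print "edited: ", <$fh>;
close($fh);
print "backup kept\n" if -e "$file.bak";
```

Because $^I and @ARGV are localized, the in-place behavior ends when the subroutine returns.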

Using the `-a` Option

The -a option is used for extracting words from files. The -a option is designed to be used with the -n or -p option to split incoming lines into a list of items in the @F array. Each item in the @F array is a word derived by applying the split(' ',$_) function to each input line. For example, if your input file contains the following line:

My name is Kamran

the -a option sets the contents of the array @F to the following list:

("My", "name", "is", "Kamran")

Note that extraneous spaces and tabs from the input line have not been added to the @F array.
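The splitting that -a performs can be tried directly with the split function. In this short sketch, autosplit is a hypothetical wrapper name and the sample line is padded with extra spaces and a tab on purpose.

```perl
use strict;
use warnings;

# Split a line the way -a fills @F: split(' ', ...) skips leading
# whitespace and treats any run of spaces, tabs, or newlines as one separator.
sub autosplit {
    my ($line) = @_;
    return split(' ', $line);
}

my @F = autosplit("  My   name\tis  Kamran \n");
print scalar(@F), " words: ", join("|", @F), "\n";
```

The padding and the trailing newline disappear, leaving only the four words.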

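Listing 16.2 shows a sample program that uses the -a option to extract the first numeric value from each input line.

Listing 16.2. Sample use of the -a option.

#!perl -a -n
while ($F[0] =~ /[^\d.]/) {
    shift (@F);
    next LINE if (!defined($F[0]));
}
print ("$F[0] \n");

Note that this program does not print every input line. For each line, it prints only the first word that is made up entirely of digits and . characters, and it moves on to the next line when it runs out of words to examine. The LINE label names the implicit loop that the -n option wraps around the program, so next LINE skips to the next input line.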

Using the `-F` Option

The -F option is designed to be used along with the -a option. It is used to specify the pattern to use when splitting input lines into words. For example, if the input fields on each line that is input to a program are separated by a colon, you would use the following statement:

perl -a -n -F: textfile

In this case, the words in the input file are assumed to be separated by a colon. You can use opening and closing slashes as pattern delimiters. This means that both the following commands do the same thing:

prog -a -n -F: *.txt
prog -a -n -F/:/ *.txt
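Splitting on a custom pattern can be sketched with split as well. The split_on helper and the sample record below are made up for the illustration; the helper removes the newline first, whereas -a on its own would leave it attached to the last field.

```perl
use strict;
use warnings;

# Hypothetical helper: split one input line on a separator pattern,
# as -a combined with -F would. The newline is removed first.
sub split_on {
    my ($pattern, $line) = @_;
    chomp($line);
    return split(/$pattern/, $line);
}

my @fields = split_on(':', "kamran:x:501:100\n");
print "user=$fields[0] uid=$fields[2]\n";
```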

Using the `-0` Option

The default end-of-input for one line of text in Perl is the newline. That is, the Perl interpreter reads a line from an input file or from the keyboard until it sees a newline character. You can specify an end-of-line input character other than the newline character by using the -0ooo option. The 0 here is the digit zero that names the option, and ooo stands for the octal value of the character that replaces the newline. The octal digits follow the option directly, with no space. For example, the following command:

perl -007 program *.bin

will let the named program use the bell character (7 octal) as the end-of-line character when it reads the input files that have a .bin extension.

As another example, the following header comment line will set the end-of-line character to a space (octal 40):

#!perl -0040

To read one paragraph at a time, specify 00 as the input to the -0 option. This will let the Perl interpreter read input until it sees two newlines together, and thus you will be able to read in one paragraph at a time. If you do not specify a value with the -0 option, the Perl interpreter assumes the null character (ASCII 0).

Using the `-l` Option

The -l option lets you use a new output end-of-line character for print statements. Like the -0 option, the -l option takes an octal number instead of an ASCII character for use in place of the newline. The option name is the lowercase letter "el," not the digit one. When the -l option is specified, the Perl interpreter always adds the chosen end-of-line character to the output of each print statement. Also, in the case of the -n or -p options, the end-of-line character is removed after reading the input.

If you do not give a value with the -l option, the Perl interpreter uses the current input end-of-line character, which is the one specified by the -0 option if that option has already been processed. If -0 has not been specified, the end-of-line character is set to the newline character.

When using both the -l and the -0 options, specify the -l option first, then the -0 option. Recall that options are processed from left to right. If the -l option appears first, the output end-of-line character is set to the newline character. If the -0 option appears first, the output end-of-line character (set by -l) becomes the same as the input end-of-line character (set by -0).

Note

It's probably easier to control the input and output end-of-line characters by using the system variables $/ and $\, respectively.
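As a sketch of the variable-based approach, the following code sets $/ to the empty string to read a paragraph at a time (the same setting that -00 makes) and sets $\ so that every print ends with a newline. The read_paragraphs helper is a hypothetical name, and the text is read from a string rather than a file to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: return the paragraphs of $text, one per list item.
sub read_paragraphs {
    my ($text) = @_;
    local $/ = '';               # empty string selects paragraph mode
    open(my $in, '<', \$text) or die "cannot open in-memory file: $!";
    my @paragraphs = <$in>;
    close($in);
    chomp(@paragraphs);          # strips the trailing newlines of each paragraph
    return @paragraphs;
}

my @paragraphs = read_paragraphs("one\ntwo\n\nthree\n");

{
    local $\ = "\n";             # appended to the output of every print
    print "[$_]" for @paragraphs;
}
```

Both settings are made with local so that they revert as soon as the enclosing block ends.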

Using the `-x` Option to Get a Perl Program from Another File

The -x option enables you to process a Perl program that appears in the middle of a file. When the -x option is specified, the Perl interpreter ignores every line in the file until it sees a header comment (a line that begins with #! and contains the word perl). The Perl interpreter then processes the program as usual until the bottom of the program file is reached or the __END__ statement is reached. Everything after the __END__ statement is ignored by the Perl interpreter.

Using the `-S` Option

You need to use -S only if you run your Perl program using the perl command. If you run a script directly from the shell, the -S option is meaningless because the shell will hunt for your program in the directories specified in your PATH environment variable. The -S option simply tells the Perl interpreter to look for your program in the directories specified by your PATH environment variable.

The `-v` Option: Printing the Perl Version Number

You might be curious as to which version of Perl you are running. The -v option prints a string with the version information for the Perl interpreter you are running. The Perl interpreter will not run any scripts, nor will it honor any other options when this -v option is specified. Here is sample output from the -v command:

$ perl -v
This is perl, version 5.002
Copyright 1987-1996, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the Perl 5.0 source kit.

Now that you've learned the command-line options for the Perl interpreter, you're ready to learn how to process input in your Perl applications.

Using Conditional Code with the C Preprocessor

The C preprocessor also provides five statements, #ifdef, #ifndef, #if, #else, and #endif, that conditionally include or exclude parts of your Perl program. The syntax for the #ifdef and #endif statements is

#ifdef cond
...code if cond is defined...
#else
...code if cond is NOT defined...
#endif

The cond is a character string that can be used in a #define statement. If the character string has been defined to a value, the first set of code (above the #else clause) is inserted in your program; otherwise, the second part of code (after the #else and before the #endif clause) is inserted in your program. Because the #else clause is optional, you can also have statements of the form

#ifdef cond
...code if cond is defined...
#endif

The #ifndef statement lets you define code that is to be executed when a particular string is not defined. Thus, #ifndef takes the opposite action of the #ifdef statement. For example, consider the following code, which uses #ifdef:

#ifdef SOMBER
print ("Hello, Cruel world!\n");
#else
print ("Hello, Beautiful world!\n");
#endif

This code prints a sad message (Hello, Cruel world!) if SOMBER was defined earlier, or a happy message (Hello, Beautiful world!) if SOMBER was not defined earlier.

Code enclosed by #ifdef and #endif does not have to be a complete Perl statement. For example, the following code will set the value of $area to different settings based on whether or not METRIC was defined:

$area = $radius * PI * 2
#ifdef METRIC
* 2.54
#endif
;

Here, $area is assigned a value in centimeters if METRIC is defined or in inches if it's not.

Tip

Don't overuse the C preprocessor because it might make your program hard to read, especially by people who are not familiar with the C programming language.

The #if statement in the C preprocessor is similar to the #ifdef statement. The #if statement uses the value of a variable, whereas the #ifdef statement simply checks to see whether a variable is defined. The syntax for the #if statement is as follows:

#if expr
...code...
#endif

The expr is the expression that is evaluated by the C preprocessor, and code is the code to be executed if expr is nonzero. For example, the following statements will set the value of $result to "hello" if the sum of S1 and S2 is nonzero:

#if S1 + S2
$result = "hello";
#endif

If you want to set the value of $result if either S1 or S2 is set to a nonzero value, you can use the following statement:

#if S1 || S2
$result = "hello";
#endif

By specifying 0 to the #if statement, you can easily prevent lines of code from being interpreted without having to put a hash (#) in front of each line:

#if 0
$result = "hello";
print ("I will not be printed if the -P option is used.\n");
#endif

You can also use #else with the #if operator:

#if S1 || S2
$result = "hello";
#else
$result = "goodbye";
#endif

In this case, the value of $result will be "hello" if either S1 or S2 has a nonzero value; otherwise, the value will be "goodbye".

Note

The C preprocessor does not support the exponent operator, so you cannot evaluate (x ** y) with the #if statement.

You can embed #ifdef/#else/#endif constructs inside one another. Just make sure that you match all the #ifdef and #endif statements so that there is one #endif for each #ifdef and #ifndef statement. For example, here is a snippet of code that illustrates how the nesting is done with two #ifdef blocks:

#ifdef S1
#ifdef S2
print ("Both S1 and S2 are defined \n");
#else
print ("S1 yes but not S2\n");
#endif
#else
#ifdef S2
print ("S2 yes but not S1\n");
#else
print ("neither S1 nor S2\n");
#endif
#endif

Normally, you would include other Perl programs and modules with the require and use statements. You can also use the #include directive of the C preprocessor to include the contents of another file. The syntax for the #include command is

#include filename

where filename is the name of the file to be included.

For example, the following command includes the contents of math.h as part of the program:

#include <math.h>

The contents of math.h will also be run through the C preprocessor before it's included. The C preprocessor searches for the included file in the current directory and, if not found, in the /usr/local/lib/perl directory. You can use the -I option to search in other directories for source and include files.

Reading Input from `STDIN`

In a Perl script, you can easily read the standard input for responses with the <STDIN> file handle. The following three lines of code show you how to get a number from a user and return the square root of the number:

print "\nEnter a number:";
$answer = <STDIN>;
print "Sq. root of $answer = ", sqrt($answer), "\n";

This little gem of code works great as long as you are careful enough to enter only positive numbers. Enter a negative number, and the script bombs. Therefore, before taking the square root, you have to check to see if the number is greater than or equal to zero; otherwise, you have to bail out with an error message.

Another annoyance is that reading $answer=<STDIN> also brings along the \n end-of-line character. Therefore, to remove this appendage from $answer, you have to call the function chop($answer), or chomp($answer), which removes the last character only if it is a newline.
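Both fixes can be combined in one small routine. The following is a sketch, not a complete input validator: safe_sqrt is a hypothetical name, the pattern accepts only plain non-negative decimal numbers, and the sample responses are hard-coded so that the example runs without waiting for the keyboard.

```perl
use strict;
use warnings;

# Hypothetical helper: return the square root of a response read from
# the user, or nothing if the response is not a non-negative number.
sub safe_sqrt {
    my ($answer) = @_;
    chomp($answer);                              # drop the trailing newline
    if ($answer =~ /^\s*(\d+(?:\.\d*)?|\.\d+)\s*$/) {
        return sqrt($1);
    }
    return;                                      # bad response, no result
}

# Responses as they would arrive from <STDIN>, newline included.
for my $response ("16\n", "-4\n", "maybe\n") {
    my $root = safe_sqrt($response);
    chomp(my $shown = $response);
    if (defined $root) {
        print "Sq. root of $shown = $root\n";
    } else {
        print "Cannot take the square root of '$shown'\n";
    }
}
```

In a real script you would pass the value of <STDIN> to the routine and ask again whenever it returns nothing.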

The <STDIN> operation reads one line from the STDIN file handle, that is, from standard input. To read the standard input one line at a time, you use a program like this one:

while ($_ = <STDIN>) {
    chop($_);
    print $_;
}

Because $_ is the default storage area for the last line read in a Perl script, any reference to $_ can be omitted where it is implied. For example, the previous excerpt of code could be written as this:

while (<STDIN>) {
    chop;
    print;
}

To read complete files by specifying the filename from the command line, you can use the <> operator. For example, the following code reads and prints the contents of the files specified on the command line:

while (<>) {
    print $_;
}

In this way, you are reading all the files specified on the command line and then processing the contents of the files one line at a time by simply printing the contents of each. Think of this as an equivalent to saying cat file1 file2 file3 … and so on. The <> is equivalent to <ARGV> where ARGV is either STDIN if no files were specified or the contents of all the files in the order they were specified at the command line.

The `Term::Query` Module

The previous example for getting the square root of a number is a very simple example of what you normally run into when getting user responses to questions. Your query expects a response of Y for Yes and N for No, but the user's response might be a firm M for Maybe. If you have twenty questions, the last thing you want to do is to have to verify the responses. This is when it's nice to have modules that do the work for you.

Term::Query is a Perl 5 module written by Alan K. Stebbens. The module is used to provide a set of questions, a default response, a set of expected responses per question, and a help string to assist the end user. Not all of these items have to be specified; only the query is required.

If you do not specify a set of expected return values to a query, the module will accept anything as input. On the other hand, if you do specify a set of parameters, the module will validate the responses for you.

The default response to a query can also be set. The default response is displayed between square brackets. If no default is specified, there will be no such response displayed for the user.

Finally, you can specify a help string for the input question. This string is displayed if the user types ? at the prompt. You can disable the display of the help string if you want ? to be an acceptable response to a query. The help messages can also be based on expected input types. There are built-in help messages for some types of input that are displayed even if you do not explicitly specify a help message. The built-in help strings are quite verbose and may be enough for most general cases.

If at any time during the entry and validation process a bizarre response is given, the module can stop and ask the same question again. This capability to ask the same query again until a correct response is received (or the user types the Ctrl+C key combination) is great for ensuring that the right user responses get into your Perl script.

The module itself contains more details about its internal operations. The documentation is located in the module in the Perl 5 "pod" format. You can convert a pod file into a man page with the following command:

pod2man Query.pm | nroff -man - | less

The pod2man code was developed in version 5.001m and requires at least Perl5.001m. This is because the pod2man code uses references in the Carp.pm module to diagnose itself and in the PrintArray.pm module. (Both modules are written by Alan Stebbens.)

Installing the module is easy. First check to see whether you have the module already in your distribution. Go to your /usr/lib/perl5, /usr/local/lib/perl5/site_perl, or /usr/local/lib/perl5 directory (or wherever you have installed Perl) and look for the file Query.pm. The file will most likely be in the directory /usr/lib/perl5/Term.

If you cannot find the file, you can get it from the ftp sites at hubs.ucsb.edu/pub and ikra.com:/pub/perl/modules. Here's a list of the modules you need:

Term-Query-1.15.tar.gz for the Term module
PrintArray-1.1.tar.gz, a required module for Term

Unzip and untar these files in a place away from the PERLLIBDIR directory.

You have to set the environment variable PERLLIBDIR to either /usr/lib/perl5 or /usr/local/lib/perl5.

Copy the Query.pm file into the $PERLLIBDIR/Term directory, creating the directory if you do not already have it. You have to be superuser to do this. Copy the PrintArray.pm file into $PERLLIBDIR. You can use the Makefiles that come with the modules, but the copying method has proved to work without having to edit any pathnames in the Makefiles. It's worth taking a look at the test target in the Makefile to see how the regression tests are done in the test directory.

There is one primary subroutine, called query, which is called to process one interaction with the user. The subroutine query() is passed a prompt and some flags, optionally followed by additional arguments, depending on the particular flags. Each flag is a single character and indicates the following values:

The input type: integer, real, string, yes/no, keyword, or non-keyword
What default input to use in the absence of user input
An optional help string to be displayed for errors or input of a question mark (?)
Any required input validation, such as regular expression or pattern matching, maximum length, and so on
Any use of chop() or white space removal

I'll cover these options with some samples. The following sections describe how you can use the module.

Using the `Term::Query` Module

Here's the syntax for the call to the query function:

$result = query($prompt, $flags, [optional fields]);

The $prompt string is displayed, and the response entered is interpreted according to the value in $flags. The optional fields may be omitted, but when you use flags that take parameters, you must supply at least as many fields as those flags require.

What are these flags and how are they interpreted by query()? The flags indicate the type or attribute of the value. Each flag may have parameters associated with it. The parameters must be listed in the same order as the flags that use them. Therefore, if you list the flags rdh, then you'll have two more strings in the argument list: a default string followed by a help message string.

There are several flags you can use with the Query package. Some of these you have already seen; the most common ones are described in Table 16.2. The documentation included in the module covers the other, more esoteric flags.

Table 16.2. Flags for the interpretation of input variables.

Flag	Interpretation
`d`	The default response to use if you get no input from the question.
`H`	Ignores the question mark as a request for help. Treats it as a response to a question.
`h`	The help string.
`i`	Accepts Integer input only.
`I`	Specifies a reference to a function to use instead of read `<STDIN>`.
`k`	Specifies a table reference of allowed responses to the question.
`K`	Specifies a table reference of disallowed responses to the question.
`l`	Limits the length of the input.
`m`	Uses the argument as a regular expression for processing responses.
`n`	Accepts Real or Integer.
`N`	Requires a Yes/No response only. The default is N.
`r`	An answer is required at the prompt.
`Y`	Requires a Yes/No response only. The default is Y.

Now, let's see how the flag string ridh is interpreted by the module. The first two flags, r and i, translate to "required, integer value." No extra parameters are needed. The d flag specifies that the next argument (the first one after the flags string) is used as the default value. The h flag specifies that the argument after that is used as the help string for the prompt.
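
To make the pairing of flags and arguments concrete, here is a minimal sketch of the bookkeeping involved. It is an illustration only, not the module's own code: the pair_flags_with_arguments subroutine is hypothetical, and the set of flags that consume an argument is simply read off Table 16.2.

```perl
use strict;
use warnings;

# Illustration only: this is NOT how Term::Query is implemented.
# Flags that consume one argument, as read from Table 16.2.
my %takes_argument = map { $_ => 1 } qw(d h I k K l m);

# Walk the flags left to right and hand each argument-taking flag
# the next unused argument.
sub pair_flags_with_arguments {
    my ($flags, @arguments) = @_;
    my @pairs;
    for my $flag (split //, $flags) {
        my $argument = $takes_argument{$flag} ? shift @arguments : undef;
        push @pairs, [ $flag, $argument ];
    }
    return @pairs;
}

for my $pair (pair_flags_with_arguments('ridh', 5, 'Enter a whole number')) {
    my ($flag, $argument) = @$pair;
    printf "%s => %s\n", $flag,
        defined $argument ? $argument : '(no argument)';
}
```

Running the sketch shows r and i with no argument, d paired with 5, and h paired with the help text, which is the same reading of ridh given above.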

The best way to start is to use an example. A sample script using the query() subroutine is shown in Listing 16.3.

Listing 16.3. Using the query subroutine.

1 #!perl
2 #
3 # A sample usage of the query subroutine.
4 #
5 use Term::Query qw( query query_table query_table_set_defaults );
6 #
7 # Tell him what happened.
8 #
9 sub processReply {
10     my $reply = query @_; # <<<< The call to the query function >>>
11     #
12     # Bail out?
13     #
14     exit if $reply =~ /^\s*(later|bye)\s*$/;
15     printf "You said = [%s]\n",$reply;
16     return $reply;
17 }
18 printf "\n ------------------------------------------ ";
19 printf "\n Application to join da rest of da boys -- ";
20 printf "\n ------------------------------------------ ";
21 #
22 # This will require a response.
23 #
24 $nameh = &processReply("\nWhazza u name-h?",'rh',
25     'Whazza matah, u too stoopid to fouget yo name-h?');
26 printf "\n Okay $nameh, lemme talk to u about it... \n";
27 #
28 # This subroutine will NOT require a response before proceeding.
29 #
30 $liveh = &processReply("\nWheh you live-h");
31 #
32 # This will only accept a response of Y or N, the default being Y
33 #
34 $wannbe = &processReply("\nU wanna be amobstah?",'Y');
35 #
36 # This will only accept a response of Y or N, the default being N
37 #
38 $house = &processReply("\nU bin to da Big Haus?",'N');
39 #
40 # This one requires an integer, with a default reply and
41 # has help text for the question mark.
42 #
43 $iq = &processReply ("\nEnter your IQ:",
44     'ridh',
45     5, # the default IQ
46     'Whazza matah? Give you shoe size-h');
47 #
48 # Use a list of keywords
49 #
50 $gunnh = &processReply("\nWhat weapon you like?",
51     'rkd', ['GUN','38','lugah','mace','BO'],'GUN');
52 printf "\n Okay $nameh, lemme think about it... \n";

The imports in line 5 are more than this simple example needs. Only the query function is called; none of the module's other functions are used, so the line could have been written as follows:

5 use Term::Query qw( query );

Here's the function that processes each reply:

9 sub processReply {
10     my $reply = query @_;
11     #
12     # Bail out?
13     #
14     exit if $reply =~ /^\s*(later|bye)\s*$/;
15     printf "You said = [%s]\n",$reply;
16     return $reply;
17 }

Lines 9 through 17 define a subroutine that processes one reply. It passes all of its arguments to the query function, checks whether the user wants to exit and, if not, prints the reply. The reply is then returned to the caller.

Lines 24 and 25 process replies to the name question. In case the applicant does not know how to answer this one correctly, a help string is provided. The processed reply is echoed in the following print statement:

24 $nameh = &processReply("\nWhazza u name-h?",'rh',
25     'Whazza matah, u too stoopid to fouget yo name-h?');
26 printf "\n Okay $nameh, lemme talk to u about it... \n";

Line 30 requests input and accepts even a bare carriage return. No help string is given, nor is there any default response. If only a carriage return is entered, the reply is set to undef.

30 $liveh = &processReply("\nWheh you live-h");

In line 34, the question behind the $wannbe variable accepts a response such as y, Y, n, or N. The responses in this module are not case-sensitive.

34 $wannbe = &processReply("\nU wanna be amobstah?",'Y');

Line 38 does the reverse of Line 34 in that the default response is No instead of Yes. This is also not case-sensitive. Thus, nO is the same as No is the same as NO.

For case-sensitive comparisons between the response and a known string, set the variable $Query::Case_sensitive to 1. By default, this value is set to 0 for case-insensitive comparisons.

Lines 43 through 46 require an input integer with the ridh flag.

43 $iq = &processReply ("\nEnter your IQ:",
44     'ridh',
45     5, # the default IQ
46     'Whazza matah? Give you shoe size-h');

In line 50, a list of keywords to use is specified. Only the responses listed in the table are allowed:

50 $gunnh = &processReply("\nWhat weapon you like?",
51     'rkd', ['GUN','38','lugah','mace','BO'],'GUN');
52 printf "\n Okay $nameh, lemme think about it... \n";

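The order of the flags passed to query is also important because it sets the order in which input validation is done. All input is validated in the order of the flags. When the first test fails, an error message is displayed and the testing stops. The following call, for example, asks a yes/no question whose default is N and uses the V flag (described later with Listing 16.5) to place the answer in $ans: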

query "Really format the disk? (yn)", 'NV', \$ans;

To add a long help message, you can use the following example:

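$ans = &query("Are you sure?? (yn)",'Nh',<<"MESSAGE");
This is the time to back out. If you answer "y", I will format
the partition in $partitionName, any existing data on the partition
will be lost. If you answer 'no' now, you can back out of the
routine to specify another partition.
MESSAGE

Note the use of the variable $partitionName in the string to print a value. For the variable to be interpolated, the here-document terminator must be written as <<"MESSAGE" (or as a bare <<MESSAGE); with single quotes, as in <<'MESSAGE', the text $partitionName is printed literally.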

You can even use regular expressions to restrict the input that is accepted. Consider Listing 16.4, which uses the m flag to require that the incoming word match a regular expression and the K flag to disallow words that are already in a list.

Listing 16.4. Using regular expressions.

#!perl
#
# A sample usage of the query subroutine.
#
use Term::Query qw( query query_table query_table_set_defaults );

$names = "itsy bitsy bambi";

@fields = split(' ',$names); # existing fields
$newNode = &query('New node name:','rKmh',\@fields,'^\w+$',<<MSG);
Enter a node name to add to the existing list:
$names
MSG

$names .= " " . $newNode;

print "The names are now:\n";
print $names . "\n";

Extending the `Query.pm` Module

There are two other routines in the Query.pm module that allow easier processing of the question/answer sequences:

`query_table()`	This function can be passed an array of arguments that are interactively passed to `query()`. This is an easy way to get all your answers up front if you do not have to do any processing between responses.
`query_table_set_defaults()`	This can be used on a query table array to cause any mentioned variables to be initialized with any mentioned default values. This is handy for having a single table define variables, default values, and validation criteria for setting new values.

Using `query_table()`

The query_table() function is useful when you want to collect all the user input at one time without having to do any processing in between inputs. Basically, you pass in a list of prompts, flags, and optional arguments to the query_table function. The query_table() function calls query() on the list, collects the responses, and returns them in an array.

Here's the way to use query_table:

@array = query_table( $prompt1, $flags, [ $arguments, ... ],
                      $prompt2, $flags, [ $arguments, ... ],
                      ...
                      $promptN, $flags, [ $arguments, ... ] );

There are three items per query: a prompt string, a flags string, and an array of arguments. Note that the square brackets are part of the syntax: each set of arguments is passed as a reference to an anonymous array, which can be of any length. The array can be empty if no arguments are needed for the flags of that entry.

A query-table can be created with a set of variables, their default values, input validation parameters, and help strings. The query_table_set_defaults() subroutine sets the default values in the table. The subroutine query_table() processes each entry in this table to get the responses from the user.

Listing 16.5 contains a sample script using query-table.

Listing 16.5. Using query-table.

1 #!perl
2 #
3 # A sample usage of the query subroutine.
4 #
5 use Term::Query qw( query query_table query_table_set_defaults );
6 #
7 # Snagged straight out of the test module with this package.
8 #
9 sub qa {
10     $ans = query @_;
11     exit if $ans =~ /^\s*(exit|quit|abort)\s*$/;
12     printf "Your Response = \"%s\"\n",(length($ans) ? $ans :
13         defined($ans) ? 'NULL' : 'undef');
14 }
15 @interrogator = ( "What is your name?", 'Vrh',
16     [ 'name', 'Who are you?' ] ,
17     "What is your age?", 'Vrih',
18     [ 'age', 32 , 'Please be honest are you?' ] ,
19     "Do you have carrots?", 'VY',
20     [ 'carrots', 'Y' ] );
21 #------------------------------------------------------------------
22 # The variables $name, $age and $carrots will be set to default
23 # values (if any).
24 #------------------------------------------
25 query_table_set_defaults \@interrogator;
26 #
27 foreach $var ( qw( name age carrots ) ) {
28     $val = $$var;
29     print " \$$var = \"$val\"\n";
30 }
31 #
32 # The variables $name, $age and $carrots will
33 # be set to default (if any) or the response values
34 # from the processing of the query table.
35 $ret = query_table \@interrogator;
36 print "queryTable returned $ret\n";
37 # Echo them out.
38 foreach $var ( qw( name age carrots ) ) {
39     $val = $$var;
40     print " \$$var = \"$val\"\n";
41 }

With typical usage, given $prompt and $flags, query() prints $prompt and then waits for input from the user. The handling of the response depends on the flag characters given in the $flags string.

In Listing 16.5 the table has three prompt strings and three variables to which the received responses are assigned. See Lines 15 through 20. You can just as easily have 50 entries in the table. This modular procedure makes it easy to set up a series of questions when no processing is required between responses. That is, you can collect your information all at once and then parse the collected information.

Note the use of the V flag in every flags string in the table in Listing 16.5 (lines 15, 17, and 19). The V flag takes the name of a variable as its argument and resolves that name one level above the module's own execution level, that is, in the caller. Therefore, $name, $age, and $carrots are defined in the calling module once the query_table_set_defaults or query_table call is made. Not setting the V flag forces the variable to be local to the Query module itself, and any responses in the named variables are lost.

The Term::Query module is an excellent tool for prompting and collecting user responses to commands. If you would like more detailed information, please read the documentation in the Query.pm module. The author Alan K. Stebbens can be reached via e-mail at aks@hub.ucsb.edu at the College of Engineering, University of California, Santa Barbara.

The `Getopts` Package

The Getopts package is designed to help you parse the options passed to your Perl scripts. This package comes standard with the Perl 5 distributions, so you do not have to get it from anywhere.

Options to Perl scripts you write can be sent in one at a time or can be clustered. For example, options x, v, and t can be sent in as -x -v -t, -xvt, -xt -v, and so on. Your script should be able to recognize these options. This kind of option recognition is common enough that more than one module is available for you to work with: the Std.pm module, which is a simple module that recognizes only single-character options, and the Long.pm module, which also recognizes the states and default values of options.

Using `Std.pm`

You use the Std.pm module by including the following line in your Perl script:

use Getopt::Std;

The options that you want the module to recognize are passed in a string of the form xyz. The call to the getopts function then attempts to look for -x, -y, or -z, or a combination of these options, in the command-line string. For each option found, it sets the variable $opt_x, $opt_y, or $opt_z to 1; the variable for an option that was not found stays undef. For example, the following two lines of code set up the command-line options for x, y, or z:

use Getopt::Std;
getopts("xyz");

The values set for x, y, and z do not have to be 1 or undef. Assigned values can be collected by appending a colon to each option with which you expect to pass a parameter. For example, the following line takes arguments for -f and -c, while -v remains a simple switch:

getopts("vf:c:");

A sample usage of this module is shown in Listing 16.6.

Listing 16.6. Using the Std.pm module.

#!perl

use Getopt::Std;

$result = getopts('wx:yz');
print "\n Options:\n";
printf " w :--> $opt_w \n";
printf " x :--> $opt_x \n";
printf " y :--> $opt_y \n";
printf " z :--> $opt_z \n";
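
If you would rather not rely on the $opt_ variables, getopts() also accepts a reference to a hash as its second argument and stores the results there. The following sketch shows that form with a clustered switch. The parse_std_options wrapper and the file names on the simulated command line are made up for the example.

```perl
use strict;
use warnings;
use Getopt::Std;

# Parse a list of arguments with getopts(), storing the results in a
# hash instead of the $opt_* globals. In the option string 'vf:c:',
# -v is a plain switch, while -f and -c each take a value.
sub parse_std_options {
    local @ARGV = @_;              # getopts() reads @ARGV, so localize it
    my %options;
    getopts('vf:c:', \%options) or die "unrecognized option\n";
    return (\%options, [@ARGV]);   # parsed options, then leftover arguments
}

# Simulated command line: -v and -f are clustered, -c has its value attached.
my ($options, $rest) = parse_std_options(qw(-vf notes.txt -c3 report));

print "v = $options->{v}\n";
print "f = $options->{f}\n";
print "c = $options->{c}\n";
print "remaining: @$rest\n";
```

Here -vf notes.txt is treated as -v followed by -f notes.txt, and the word report is left over as an ordinary argument once the options have been consumed.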

Note

The use Getopt::Std; getopts(); call completely replaces the original Perl 4 "require 'getopts.pl'; &Getopts();" statements. The old library is still included for compatibility reasons.

The `Long.pm` module

The Long.pm module is a bit bigger than the Std.pm module, both in size and in functionality. The primary functional interface to this module is via the GetOptions() function, which is basically a souped-up version of the getopt() function found in the C library. Each description of the options your script is looking for should designate a valid Perl identifier, optionally followed by a specification designating the type of option.

Here's the syntax to use when calling GetOptions():

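use Getopt::Long;
$result = GetOptions ("name1<spec1>", "name2<spec2>", ... "nameN<specN>");

Each argument is a quoted string made up of an option name followed directly by its specifier (shown here in angle brackets), as in 'cost=i' or 'vanilla!'. You should specify the option name because this name is used by Perl to set the variable $opt_name to the value specified by the option. Here are the values for the specifiers: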

`<none>`	This option does not take an argument.
`!`	This option does not take an argument and may be negated.
`=s`	This option takes a mandatory string argument.
`:s`	This option takes an optional (`:`) string argument.
`=i`	This option takes a mandatory (`=`) integer argument.
`:i`	This option takes an optional (`:`) integer argument.
`=f`	This option takes a mandatory (`=`) real number argument.
`:f`	This option takes an optional (`:`) real number argument.

For details on these specifiers, read the header documentation of the GetOptions() subroutine in the Long.pm file.

Options that do not take an argument are set to 1 when they appear on the command line and are left unset otherwise. Options that take an optional argument cause the corresponding variable to be defined in the name space of the module they are being called from. If no value is specified at the command prompt, the variable is set to the empty string for a string option (or to 0 for a numeric one).

Boolean options are also possible. Use ! after the option name to indicate that an option can also be negated. The option can then be given either as html or as nohtml. Thus, -html causes the variable $opt_html to be set to 1, and -nohtml causes the variable $opt_html to be set to 0.

Dashes in option names are allowed (ice-cream) but are translated to underscores in the corresponding Perl variable ($opt_ice_cream). A lone dash is translated to the Perl identifier of $opt_. Double dashes (--) by themselves signal the end of the options list to the package. Options that start with -- can have an assignment after them; for example, --topping=nuts.
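
The rules above can be tried out in a short sketch. Instead of the $opt_ variables, it uses the alternative calling style in which each option specification is followed by a reference to the variable that receives the value. The parse_order subroutine, the default chosen for vanilla, and the file name extra.txt are assumptions made for the example.

```perl
use strict;
use warnings;
use Getopt::Long;

# Sketch: parse an argument list into a hash of named settings.
sub parse_order {
    local @ARGV = @_;                   # GetOptions() consumes @ARGV
    my %order = (vanilla => 1);         # assumed default for this example
    GetOptions(
        'flavor=s'  => \$order{flavor},     # mandatory string value
        'vanilla!'  => \$order{vanilla},    # --vanilla or --novanilla
        'topping:s' => \$order{topping},    # the value is optional
        'ice-cream' => \$order{ice_cream},  # switch with a dash in its name
    ) or die "bad options\n";
    $order{leftover} = [@ARGV];         # whatever follows the options
    return \%order;
}

my $order = parse_order(
    qw(--flavor=mint --novanilla --topping --ice-cream -- extra.txt));

print "flavor    = $order->{flavor}\n";
print "vanilla   = $order->{vanilla}\n";
print "topping   = '$order->{topping}'\n";
print "ice_cream = $order->{ice_cream}\n";
print "leftover  = @{ $order->{leftover} }\n";
```

Because --topping is followed by another option rather than a value, it ends up as the empty string. The double dash stops option processing, so extra.txt is left in @ARGV for the script to handle.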

Examples of Options Settings

Listing 16.7 provides a small example of how to use options. In this example, you can specify a string for the value of the variable $opt_flavor. The variable $opt_vanilla is either 1 or 0, depending on how the -vanilla or -novanilla options are set. The value of -cost, if specified, must be an integer and will be set in $opt_cost. The -topping option takes an optional string, so $opt_topping is set to the empty string when -topping is given without a value.

Listing 16.7. Using the Getopt::Long package.

#!perl
use Getopt::Long;

#
# This allows you to specify a string for the flavor,
# either -vanilla or -novanilla
# The value of -cost is a required integer value, but
# -topping value is optional
#
$result = GetOptions ('flavor=s','vanilla!', 'cost=i','topping:s');

printf " flavor :--> $opt_flavor \n";
printf " vanilla :--> $opt_vanilla \n";
printf " cost :--> $opt_cost \n";
printf " topping :--> $opt_topping \n";

Here is sample input and output for this listing.

$ test.pl -flavor weird -novanilla -cost 2 -topping nuts
 flavor :--> weird
 vanilla :--> 0
 cost :--> 2
 topping :--> nuts
$ test.pl -flavor marmalade -vanilla -cost 1 -topping bugs
 flavor :--> marmalade
 vanilla :--> 1
 cost :--> 1
 topping :--> bugs

Some important variables to keep in mind when working with the Getopt:: modules appear in the following list. Check the Long.pm file itself for other variables not listed here if you need more functionality. You can set one or more of the following variables in your script to get the desired result:

`$autoabbrev`	This allows option names to be uniquely abbreviated. So, if no other option begins with the first three letters `che`, the option `chewable` can be referred to as `che`. The default value of this variable is `1`. The value of this variable can be overridden by setting the environment variable `POSIXLY_CORRECT`.
`$option_start`	This is the regular expression of the start of the option identifier character strings. The default value is `(--|-|\+)`; that is, it allows the use of `-`, `--`, and even `+`. If the environment variable `POSIXLY_CORRECT` is set, the value is set to `(--|-)`, thereby not allowing the `+` option.
`$ignorecase`	When set, this variable ignores case when matching options. The default value of this variable is `1`.
`$debug`	This variable enables debugging output. The default is `0`. By setting this value, you can view debugging information.
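
Later releases of Getopt::Long expose the same switches through the Getopt::Long::Configure() function, which is what the following sketch assumes. It turns on abbreviation and case-insensitive matching explicitly, so that the result does not depend on POSIXLY_CORRECT. The parse_snack_options subroutine and its option names are hypothetical.

```perl
use strict;
use warnings;
use Getopt::Long;

# Ask for both behaviours explicitly rather than relying on defaults.
Getopt::Long::Configure('auto_abbrev', 'ignore_case');

# Hypothetical wrapper: reports which of two switches were seen.
sub parse_snack_options {
    local @ARGV = @_;                      # GetOptions() consumes @ARGV
    my %seen = (chewable => 0, debug => 0);
    GetOptions(
        'chewable' => \$seen{chewable},
        'debug'    => \$seen{debug},
    ) or die "bad options\n";
    return \%seen;
}

# --CHE is an abbreviation in the "wrong" case; --deb is an abbreviation.
my $seen = parse_snack_options('--CHE', '--deb');
print "chewable = $seen->{chewable}, debug = $seen->{debug}\n";
```

Both switches are recognized because each abbreviation matches exactly one option name. If a second option beginning with che were added, the abbreviation would become ambiguous and GetOptions() would report an error.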

Summary

By passing command-line options to the Perl interpreter you can control how input is read into a program, whether the code internally can be considered a loop, how input is parsed by changing the end-of-line character, where to look for files, and so on. Perl has several options to control execution. Options can be specified on the command line or in the header comment of the program you are running. An option is simply a dash (-) followed by one or more characters. An option can have parameters too, such as pathnames for where to look for included files, or what to use as the end-of-line character, or what to use in place of a space character, and so on. Options that do not require any parameters can be grouped together behind one dash. The command-line options override values set in the header comment.

Your Perl programs can have their own options as well. For simple options into programs, it's possible to parse incoming arguments manually. To do so, you'd start with the skeletal code shown below and then work your way up, adding recognized switches as you go:

foreach (@ARGV) {
    last if $_ eq "--";
    $opt_x = 1 if /^-x/;
    $opt_y = 1 if /^-y/;
    $opt_z = 1 if /^-z/;
}

Okay, now how about handling options that are clustered, such as -xyz? This is where these packages help you out. Using the Getopts package is far easier and less error-prone than parsing the command-line arguments by hand. Then, after you have read the options, the Query.pm module can help with the user interaction.
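
To see how much work the packages save, here is a rough sketch of what handling clustered switches by hand might look like. The parse_clustered_switches subroutine is hypothetical, and it handles only switches that take no value.

```perl
use strict;
use warnings;

# Hand-rolled parser for argument-less switches; a sketch only.
# $allowed is a string of permitted switch letters, for example 'xyz'.
sub parse_clustered_switches {
    my ($allowed, @arguments) = @_;
    my %switch;
    while (@arguments && $arguments[0] =~ /^-(.+)/s) {
        my $cluster = $1;
        shift @arguments;
        last if $cluster eq '-';           # "--" ends the options
        for my $letter (split //, $cluster) {
            die "unknown option -$letter\n" if index($allowed, $letter) < 0;
            $switch{$letter} = 1;
        }
    }
    return (\%switch, \@arguments);
}

my ($switch, $rest) =
    parse_clustered_switches('xyz', qw(-xz -y -- -notes.txt));
print join(',', sort keys %$switch), "\n";
print "@$rest\n";
```

Even this small version needs care around the double dash and unknown letters, and it still cannot handle switches that take values. A single call to getopts('xyz') covers the same ground.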

This chapter has also been a very quick introduction to using Perl modules to handle interactive user input and command-line arguments. Using the Term::Query module, you can set up one or more query/response prompt and reply strings. You can use tables to automate the interrogations. The code to handle responses from the user can be set to verify or accept the user response. The Getopt::Std and Getopt::Long packages can be used to pick up arguments specified on the command line.
