IEEE P1003.2 Draft 11.2 - September 1991

Copyright (c) 1991 by the Institute of Electrical and Electronics Engineers, Inc.
345 East 47th Street
New York, NY 10017, USA
All rights reserved as an unpublished work.

This is an unapproved and unpublished IEEE Standards Draft, subject to change. The publication, distribution, or copying of this draft, as well as all derivative works based on this draft, is expressly prohibited except as set forth below.

Permission is hereby granted for IEEE Standards Committee participants to reproduce this document for purposes of IEEE standardization activities only, and subject to the restrictions contained herein.

Permission is hereby also granted for member bodies and technical committees of ISO and IEC to reproduce this document for purposes of developing a national position, subject to the restrictions contained herein.

Permission is hereby also granted to the preceding entities to make limited copies of this document in an electronic form only for the stated activities.

The following restrictions apply to reproducing or transmitting the document in any form: 1) all copies or portions thereof must identify the document's IEEE project number and draft number, and must be accompanied by this entire notice in a prominent location; 2) no portion of this document may be redistributed in any modified or abridged form without the prior approval of the IEEE Standards Department.

Other entities seeking permission to reproduce this document, or any portion thereof, for standardization or other activities, must contact the IEEE Standards Department for the appropriate license.

Use of information contained in this unapproved draft is at your own risk.

IEEE Standards Department
Copyright and Permissions
445 Hoes Lane, P.O. Box 1331
Piscataway, NJ 08855-1331, USA
+1 (908) 562-3800
+1 (908) 562-1571 [FAX]

Part 2: SHELL AND UTILITIES P1003.2/D11.2

      if x=$(command)
      then ...
      fi

An example of redirections without a command name being performed in a subshell shows that the here-document does not disrupt the standard input of the while loop:

      IFS=:
      while read a b
      do    echo $a
            <<-eof
            Hello
            eof
      done

      > foo || {
            echo "error: foo cannot be created" >&2
            exit 1
      }

      # set saved if /vmunix.save exists
      test -f /vmunix.save && saved=1

Command substitution and redirections without command names both occur in subshells, but they are not the same ones. For example, in:

      exec 3> file
      var=$(echo foo >&3) 3>&1

it is unspecified whether foo will be echoed to the file or to standard output.

END_RATIONALE

3.9.1.1 Command Search and Execution

If a simple command results in a command name and an optional list of arguments, the following actions shall be performed.

(1) If the command name does not contain any slashes, the first successful step in the following sequence shall occur:

    (a) If the command name matches the name of a special built-in utility, that special built-in utility shall be invoked.

    (b) If the command name matches the name of a function known to this shell, the function shall be invoked as described in 3.9.5. [If the implementation has provided a standard utility in the form of a function, it shall not be recognized at this point. It shall be invoked in conjunction with the path search in step (1)(d).]

    (c) If the command name matches the name of a utility listed in Table 2-2 (see 2.3), that utility shall be invoked.

    (d) Otherwise, the command shall be searched for using the PATH environment variable as described in 2.6:

        [1] If the search is successful:

            [a] If the system has implemented the utility as a regular built-in or as a shell function, it shall be invoked at this point in the path search.

            [b] Otherwise, the shell shall execute the utility in a separate utility environment (see 3.12) with actions equivalent to calling the POSIX.1 {8} execve() function with the path argument set to the pathname resulting from the search, arg0 set to the command name, and the remaining arguments set to the operands, if any. If the execve() function fails due to an error equivalent to the POSIX.1 {8} error [ENOEXEC], the shell shall execute a command equivalent to having a shell invoked with the command name as its first operand, along with any remaining arguments passed along. If the executable file is not a text file, the shell may bypass this command execution, write an error message, and return an exit status of 126.

            Once a utility has been searched for and found (either as a result of this specific search or as part of an unspecified shell startup activity), an implementation may remember its location and need not search for the utility again unless the PATH variable has been the subject of an assignment. If the remembered location fails for a subsequent invocation, the shell shall repeat the search to find the new location for the utility, if any.

        [2] If the search is unsuccessful, the command shall fail with an exit status of 127 and the shell shall write an error message.

(2) If the command name does contain slashes, the shell shall execute the utility in a separate utility environment with actions equivalent to calling the POSIX.1 {8} execve() function with the path and arg0 arguments set to the command name, and the remaining arguments set to the operands, if any.
    If the execve() function fails due to an error equivalent to the POSIX.1 {8} error [ENOEXEC], the shell shall execute a command equivalent to having a shell invoked with the command name as its first operand, along with any remaining arguments passed along. If the executable file is not a text file, the shell may bypass this command execution, write an error message, and return an exit status of 126.

BEGIN_RATIONALE

3.9.1.1.1 Command Search and Execution Rationale. (This subclause is not a part of P1003.2)

This description requires that the shell can execute shell scripts directly, even if the underlying system does not support the common #! interpreter convention. That is, if file foo contains shell commands and is executable, the following will execute foo:

      ./foo

The command search shown here does not match all historical implementations. A more typical sequence has been:

 - Any built-in, special or regular.
 - Functions.
 - Path search for executable files.

But there are problems with this sequence. Since the programmer has no idea in advance which utilities might have been built into the shell, a function cannot be used to portably override a utility of the same name. (For example, a function named cd cannot be written for many historical systems.) Furthermore, the PATH variable is partially ineffective in this case and only a pathname with a slash can be used to ensure a specific executable file is invoked.

The sequence selected for POSIX.2 acknowledges that special built-ins cannot be overridden, but gives the programmer full control over which versions of other utilities are executed.
It provides a means of suppressing function lookup (via the command utility; see 4.12) for the user's own functions and ensures that any regular built-ins or functions provided by the implementation are under the control of the path search. The mechanisms for associating built-ins or functions with executable files in the path are not specified by POSIX.2, but the wording requires that if either is implemented, the application will not be able to distinguish a function or built-in from an executable (other than in terms of performance, presumably). The implementation must ensure that all effects specified by POSIX.2 resulting from the invocation of the regular built-in or function (interaction with the environment, variables, traps, etc.) are identical to those resulting from the invocation of an executable file.

Example: Consider three versions of the ls utility:

 (1) The application includes a shell function named ls.
 (2) The user writes her own utility named ls and puts it in /hsa/bin.
 (3) The example implementation provides ls as a regular shell built-in that will be invoked (either by the shell or directly by exec) when the path search reaches the directory /posix/bin.

If PATH=/posix/bin, various invocations yield different versions of ls:

      Invocation                                       Version of ls
      -----------------------------------------------  ------------------
      ls (from within application script)              (1) function
      command ls (from within application script)      (3) built-in
      ls (from within makefile called by application)  (3) built-in
      system("ls")                                     (3) built-in
      PATH="/hsa/bin:$PATH" ls                         (2) user's version

After the execve() failure described, the shell normally executes the file as a shell script. Some implementations, however, attempt to detect whether the file is actually a script and not an executable from some other architecture. The method used by the KornShell is allowed by the text that indicates nontext files may be bypassed.

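
The first two rows of the ls table can be tried out with a short script. The following is a minimal sketch rather than part of the draft: the function body and the variable names via_function and via_command are assumptions made for illustration, and the listing of / is used only so that the utility has something to print.

```shell
# Hypothetical application function that shadows the ls utility.
ls() {
    printf 'application function\n'
}

# A plain call finds the function before any path search takes place.
via_function=$(ls)

# The command utility suppresses function lookup, so the path search
# (or a regular built-in tied to it) supplies ls instead.
via_command=$(command ls /)

printf 'ls -> %s\n' "$via_function"
if [ "$via_command" != "$via_function" ]; then
    printf 'command ls bypassed the function\n'
fi
```

Which file or built-in finally answers the command ls call still depends on PATH, as the remaining rows of the table describe.
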
END_RATIONALE

3.9.2 Pipelines

A pipeline is a sequence of one or more commands separated by the control operator |. The standard output of all but the last command shall be connected to the standard input of the next command.

The format for a pipeline is:

      [!] command1 [ | command2 ...]

The standard output of command1 shall be connected to the standard input of command2. The standard input, standard output, or both of a command shall be considered to be assigned by the pipeline before any redirection specified by redirection operators that are part of the command (see 3.7).

If the pipeline is not in the background (see 3.9.3.1), the shell shall wait for the last command specified in the pipeline to complete, and may also wait for all commands to complete.

Exit Status

If the reserved word ! does not precede the pipeline, the exit status shall be the exit status of the last command specified in the pipeline. Otherwise, the exit status is the logical NOT of the exit status of the last command. That is, if the last command returns zero, the exit status shall be 1; if the last command returns greater than zero, the exit status is zero.

BEGIN_RATIONALE

3.9.2.1 Pipelines Rationale. (This subclause is not a part of P1003.2)

Because pipeline assignment of standard input or standard output or both takes place before redirection, it can be modified by redirection. For example:

      $ command1 2>&1 | command2

sends both the standard output and standard error of command1 to the standard input of command2.

The reserved word ! was added to allow more flexible testing using AND and OR lists.

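
The exit status rules for pipelines, with and without the reserved word !, can be checked with the true and false utilities. This is an illustrative sketch only; the variable names are assumptions introduced for the example.

```shell
# Without !, the pipeline reports the status of its last command.
false | true
plain=$?              # last command is true, so 0

# With !, a zero status from the last command becomes 1 ...
! true
negated_zero=$?

# ... and any nonzero status becomes 0.
! false
negated_nonzero=$?

# Used in an AND list, ! lets a failure select the next command.
and_list_ran=no
! false && and_list_ran=yes

printf '%s %s %s %s\n' "$plain" "$negated_zero" "$negated_nonzero" "$and_list_ran"
# prints: 0 1 0 yes
```
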
It was suggested that it would be better to return a nonzero value if any command in the pipeline terminates with nonzero status (perhaps the bitwise OR of all return values). However, the last-specified command semantics are historical practice, and changing them would cause application breakage. An example of historical (and POSIX.2) behavior:

      $ sleep 5 | (exit 4)
      $ echo $?
      4
      $ (exit 4) | sleep 5
      $ echo $?
      0

END_RATIONALE

3.9.3 Lists

An AND-OR-list is a sequence of one or more pipelines separated by the operators

      &&    ||

A list is a sequence of one or more AND-OR-lists separated by the operators

      ;    &

and optionally terminated by

      ;    &    <newline>

The operators && and || shall have equal precedence and shall be evaluated from beginning to end.

A ; or <newline> terminator shall cause the preceding AND-OR-list to be executed sequentially; an & shall cause asynchronous execution of the preceding AND-OR-list.

The term compound-list is derived from the grammar in 3.10; it is equivalent to a sequence of lists, separated by <newline>s, that can be preceded or followed by an arbitrary number of <newline>s.

BEGIN_RATIONALE

3.9.3.0.1 Lists Rationale. (This subclause is not a part of P1003.2)

The equal precedence of && and || is historical practice. The developers of the standard evaluated the model used more frequently in high level programming languages, such as C, to allow the shell logical operators to be used for complex expressions in an unambiguous way, but could not in the end allow existing scripts to break in the subtle way unequal precedence might cause.

Some arguments were posed concerning the { } or ( ) groupings that are required historically. There are some disadvantages to these groupings:

 - The ( ) can be expensive, as they spawn other processes on some systems. This performance concern is primarily an implementation issue.
 - The { } braces are not operators (they are reserved words) and require a trailing space after each {, and a semicolon before each }. Most programmers (and certainly interactive users) have avoided braces as grouping constructs because of the irritating syntax required. Braces were not changed to operators because that would generate compatibility issues even greater than the precedence question; braces appear outside the context of a keyword in many shell scripts.

An example reiterates the precedence of the lists as they associate from beginning to end. Both of the following commands write solely bar to standard output:

      false && echo foo || echo bar
      true || echo foo && echo bar

The following is an example that illustrates <newline>s in compound-lists:

      while
              # a couple of newlines

              # a list
              date && who || ls; cat file
              # a couple of newlines

              # another list
              wc file > output & true

      do
              # 2 lists
              ls
              cat file
      done

END_RATIONALE

3.9.3.1 Asynchronous Lists

If a command is terminated by the control operator ampersand (&), the shell shall execute the command asynchronously in a subshell. This means that the shell shall not wait for the command to finish before executing the next command.

The format for running a command in background is:

      command1 & [command2 & ...]

The standard input for an asynchronous list, before any explicit redirections are performed, shall be considered to be assigned to a file that has the same properties as /dev/null. If it is an interactive shell, this need not happen.
In all cases, explicit redirection of standard input shall override this activity.

When an element of an asynchronous list (the portion of the list ended by an ampersand, such as command1, above) is started by the shell, the process ID of the last command in the asynchronous list element shall become known in the current shell execution environment; see 3.12. This process ID shall remain known until:

 - The command terminates and the application waits for the process ID, or
 - Another asynchronous list is invoked before $! (corresponding to the previous asynchronous list) is expanded in the current execution environment.

The implementation need not retain more than the {CHILD_MAX} most recent entries in its list of known process IDs in the current shell execution environment.

Exit Status

The exit status of an asynchronous list shall be zero.

BEGIN_RATIONALE

3.9.3.1.1 Asynchronous Lists Rationale. (This subclause is not a part of P1003.2)

The grammar treats a construct such as

      foo & bar & bam &

as one ``asynchronous list,'' but since the status of each element is tracked by the shell, the term ``element of an asynchronous list'' was introduced to identify just one of the foo, bar, bam portions of the overall list.

Unless the implementation has an internal limit, such as {CHILD_MAX}, on the retained process IDs, it would require unbounded memory for the following example:

      while true
      do      foo & echo $!
      done

The treatment of the signals SIGINT and SIGQUIT with asynchronous lists is described in 3.11.

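
The relationship between the exit status of an asynchronous list, the process ID made known through $!, and the wait utility can be sketched as follows. This is a hedged illustration, not text from the draft; the subshell (exit 3) merely stands in for a real command, and the variable names are assumptions.

```shell
# Start one element of an asynchronous list; the shell does not wait for it.
(exit 3) &
async_status=$?       # exit status of the asynchronous list itself
pid=$!                # process ID of the last command in that element

# Waiting for the remembered process ID yields the command's own status.
child_status=0
wait "$pid" || child_status=$?

printf 'list=%s child=%s\n' "$async_status" "$child_status"
# prints: list=0 child=3
```

Saving $! immediately matters because, as stated above, the process ID need not remain known once another asynchronous list has been started.
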
Since the connection of the input to the equivalent of /dev/null is considered to occur before redirections, the following script would produce no output:

      exec < /etc/passwd
      cat <&0 &
      wait

END_RATIONALE

3.9.3.2 Sequential Lists

Commands that are separated by a semicolon (;) shall be executed sequentially.

The format for executing commands sequentially is:

      command1 [; command2] ...

Each command shall be expanded and executed in the order specified.

Exit Status

The exit status of a sequential list shall be the exit status of the last command in the list.

3.9.3.3 AND Lists

The control operator && shall denote an AND list. The format is:

      command1 [ && command2] ...

First command1 is executed. If its exit status is zero, command2 is executed, and so on until a command has a nonzero exit status or there are no more commands left to execute. The commands shall be expanded only if they are executed.

Exit Status

The exit status of an AND list shall be the exit status of the last command that is executed in the list.

3.9.3.4 OR Lists

The control operator || shall denote an OR List. The format is:

      command1 [ || command2] ...

First, command1 is executed. If its exit status is nonzero, command2 is executed, and so on until a command has a zero exit status or there are no more commands left to execute.

Exit Status

The exit status of an OR list shall be the exit status of the last command that is executed in the list.

3.9.4 Compound Commands

The shell has several programming constructs that are compound commands, which provide control flow for commands.
Each of these compound commands has a reserved word or control operator at the beginning, and a corresponding terminator reserved word or operator at the end. In addition, each can be followed by redirections on the same line as the terminator. Each redirection shall apply to all the commands within the compound command that do not explicitly override that redirection.

3.9.4.1 Grouping Commands

The format for grouping commands is as follows:

      (compound-list)
              Execute compound-list in a subshell environment; see 3.12. Variable assignments and built-in commands that affect the environment shall not remain in effect after the list finishes.

      { compound-list;}
              Execute compound-list in the current process environment.

Exit Status

The exit status of a grouping command shall be the exit status of list.

BEGIN_RATIONALE

3.9.4.1.1 Grouping Commands Rationale. (This subclause is not a part of P1003.2)

The semicolon shown in { compound-list;} is an example of a control operator delimiting the } reserved word. Other delimiters are possible, as shown in 3.10; <newline> is frequently used.

A proposal was made to use the <do-done> construct in all cases where command grouping in the current process environment is performed, identifying it as a construct for the grouping commands, as well as for shell functions. This was not included because the shell already has a grouping construct for this purpose ({ }), and changing it would have been counter-productive.

END_RATIONALE

3.9.4.2 for Loop

The for loop shall execute a sequence of commands for each member in a list of items. The for loop requires that the reserved words do and done be used to delimit the sequence of commands.

The format for the for loop is as follows.

      for name [ in word ... ]
      do
              compound-list
      done

First, the list of words following in shall be expanded to generate a list of items. Then, the variable name shall be set to each item, in turn, and the compound-list executed each time. If no items result from the expansion, the compound-list shall not be executed. Omitting

      in word ...

is equivalent to

      in "$@"

Exit Status

The exit status of a for command shall be the exit status of the last command that executes. If there are no items, the exit status shall be zero.

BEGIN_RATIONALE

3.9.4.2.1 for Loop Rationale. (This subclause is not a part of P1003.2)

The format is shown with generous usage of <newline>s. See the grammar in 3.10 for a precise description of where <newline>s and semicolons can be interchanged.

Some historical implementations support { and } as substitutes for do and done. The working group chose to omit them, even as an obsolescent feature. (Note that these substitutes were only for the for command; the while and until commands could not use them historically, because they are followed by compound-lists that may contain {...} grouping commands themselves.)

The reserved word pair do ... done was selected rather than do ... od (which would have matched the spirit of if ... fi and case ... esac) because od is a commonly-used utility name and this would have been an unacceptable choice.

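
Two points from the for loop description, the default of "$@" when in word ... is omitted and the zero exit status when there are no items, can be seen in a short script. This is a minimal sketch; the function join_args and all variable names are hypothetical and exist only for the illustration.

```shell
# Hypothetical helper: joins its operands with commas.
join_args() {
    joined=
    for arg               # no "in word ...", so this iterates over "$@"
    do
        joined="${joined:+$joined,}$arg"
    done
}

join_args alpha "two words" gamma
printf '%s\n' "$joined"   # prints: alpha,two words,gamma

# An expansion that yields no items: the compound-list never runs,
# and the exit status of the for command is zero.
empty=
for item in $empty
do
    false
done
empty_status=$?
printf '%s\n' "$empty_status"   # prints: 0
```
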
END_RATIONALE

3.9.4.3 case Conditional Construct

The conditional construct case shall execute the compound-list corresponding to the first one of several patterns (see 3.13) that is matched by the string resulting from the tilde expansion, parameter expansion, command substitution, and arithmetic expansion and quote removal of the given word. The reserved word in shall denote the beginning of the patterns to be matched. Multiple patterns with the same compound-list are delimited by the | symbol. The control operator ) terminates a list of patterns corresponding to a given action. The compound-list for each list of patterns is terminated with ;;. The case construct terminates with the reserved word esac (case reversed).

The format for the case construct is as follows.

      case word in
              [(]pattern1)            compound-list;;
              [(]pattern2|pattern3)   compound-list;;
              ...
      esac

The ;; is optional for the last compound-list.

Each pattern in a pattern list shall be expanded and compared against the expansion of word. After the first match, no more patterns shall be expanded, and the compound-list shall be executed. The order of expansion and comparing of patterns in a multiple pattern list is unspecified.

Exit Status

The exit status of case is zero if no patterns are matched. Otherwise, the exit status shall be the exit status of the last command executed in the compound-list.

BEGIN_RATIONALE

3.9.4.3.1 case Conditional Construct Rationale. (This subclause is not a part of P1003.2)

An optional open-parenthesis before pattern was added to allow numerous historical KornShell scripts to conform.
At one time, using the leading parenthesis was required if the case statement were to be embedded within a $( ) command substitution; this is no longer the case with the POSIX shell. Nevertheless, many existing scripts use the open-parenthesis, if only because it makes matching-parenthesis searching easier in vi and other editors. This is a relatively simple implementation change that is fully upward compatible for all scripts.

Consideration was given to requiring break inside the compound-list to prevent falling through to the next pattern action list. This was rejected as being nonexisting practice. An interesting undocumented feature of the KornShell is that using ;& instead of ;; as a terminator causes the exact opposite behavior--the flow of control continues with the next compound-list.

The pattern "*", given as the last pattern in a case construct, is equivalent to the default case in a C-language switch statement.

The grammar shows that reserved words can be used as patterns, even if one is the first word on a line. Obviously, the reserved word esac cannot be used in this manner.

END_RATIONALE

3.9.4.4 if Conditional Construct

The if command shall execute a compound-list and use its exit status to determine whether to execute another compound-list.

The format for the if construct is as follows.

      if compound-list
      then
              compound-list
      [elif compound-list
      then
              compound-list] ...
      [else
              compound-list]
      fi

The if compound-list is executed; if its exit status is zero, the then compound-list is executed and the command shall complete.
Otherwise, each elif compound-list is executed, in turn, and if its exit status is zero, the then compound-list is executed and the command shall complete. Otherwise, the else compound-list is executed.

Exit Status

The exit status of the if command shall be the exit status of the then or else compound-list that was executed, or zero, if none was executed.

BEGIN_RATIONALE

3.9.4.4.1 if Conditional Construct Rationale. (This subclause is not a part of P1003.2)

The precise format for the command syntax is described in 3.10.

END_RATIONALE

3.9.4.5 while Loop

The while loop shall continuously execute one compound-list as long as another compound-list has a zero exit status.

The format of the while loop is as follows:

      while compound-list-1
      do
              compound-list-2
      done

The compound-list-1 shall be executed, and if it has a nonzero exit status, the while command shall complete. Otherwise, the compound-list-2 shall be executed, and the process shall repeat.

Exit Status

The exit status of the while loop shall be the exit status of the last compound-list-2 executed, or zero if none was executed.

BEGIN_RATIONALE

3.9.4.5.1 while Loop Rationale. (This subclause is not a part of P1003.2)

The precise format for the command syntax is described in 3.10.

END_RATIONALE

3.9.4.6 until Loop

The until loop shall continuously execute one compound-list as long as another compound-list has a nonzero exit status.

The format of the until loop is as follows:

      until compound-list-1
      do
              compound-list-2
      done

The compound-list-1 shall be executed, and if it has a zero exit status, the until command shall complete. Otherwise, the compound-list-2 shall be executed, and the process shall repeat.

Exit Status

The exit status of the until loop shall be the exit status of the last compound-list-2 executed, or zero if none was executed.

BEGIN_RATIONALE

3.9.4.6.1 until Loop Rationale. (This subclause is not a part of P1003.2)

The precise format for the command syntax is described in 3.10.

END_RATIONALE

3.9.5 Function Definition Command

A function is a user-defined name that is used as a simple command to call a compound command with new positional parameters. A function is defined with a function definition command.

The format of a function definition command is as follows:

      fname() compound-command [io-redirect ...]

The function is named fname; it shall be a name (see 3.1.5). An implementation may allow other characters in a function name as an extension. The implementation shall maintain separate namespaces for functions and variables.

The argument compound-command represents a compound command, as described in 3.9.4.

When the function is declared, none of the expansions in 3.6 shall be performed on the text in compound-command or io-redirect; all expansions shall be performed as normal each time the function is called.
Similarly, the optional io-redirect redirections and any variable assignments within compound-command shall be performed during the execution of the function itself, not the function definition. See 3.8.1 for the consequences of failures of these operations on interactive and noninteractive shells.

When a function is executed, it shall have the syntax-error and variable-assignment properties described for special built-in utilities, in the enumerated list at the beginning of 3.14.

The compound-command shall be executed whenever the function name is specified as the name of a simple command (see 3.9.1.1). The operands to the command temporarily shall become the positional parameters during the execution of the compound-command; the special parameter # shall also be changed to reflect the number of operands. The special parameter 0 shall be unchanged. When the function completes, the values of the positional parameters and the special parameter # shall be restored to the values they had before the function was executed. If the special built-in return is executed in the compound-command, the function shall complete and execution shall resume with the next command after the function call.

Exit Status

The exit status of a function definition shall be zero if the function was declared successfully; otherwise, it shall be greater than zero. The exit status of a function invocation shall be the exit status of the last command executed by the function.

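
The handling of positional parameters, the return special built-in, and call-time expansion can be observed with a small function. This is a sketch under stated assumptions, not part of the draft: the function show_args, the variable who, and the status value 7 are all hypothetical choices made for the example.

```shell
# Hypothetical function: records its operands, then returns status 7.
show_args() {
    count=$#                  # special parameter # reflects the operands
    first=$1
    greeting="hello, $who"    # expanded at call time, not at definition
    return 7                  # completes the function with this status
}

who=world                     # assigned after the definition above
set -- outer1 outer2 outer3   # the caller's positional parameters

fn_status=0
show_args a b || fn_status=$?

after_count=$#                # restored once the function completes
after_first=$1

printf 'inside: count=%s first=%s greeting=%s\n' "$count" "$first" "$greeting"
printf 'status: %s\n' "$fn_status"
printf 'after:  count=%s first=%s\n' "$after_count" "$after_first"
```

Because the function runs in the current execution environment, the variables it assigns (count, first, greeting) remain visible to the caller, while the positional parameters do not leak in either direction.
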
BEGIN_RATIONALE

3.9.5.1 Function Definition Command Rationale (This subclause is not a part of P1003.2)

The description of functions in Draft 8 was based on the notion that functions should behave like miniature shell scripts; that is, except for sharing variables, most elements of an execution environment should behave as if it were a new execution environment, and changes to these should be local to the function. For example, traps and options should be reset on entry to the function, and any changes to them don't affect the traps or options of the caller. There were numerous objections to this basic idea, and the opponents asserted that functions were intended to be a convenient mechanism for grouping commonly executed commands that were to be executed in the current execution environment, similar to the execution of the dot special built-in. Opponents also pointed out that the functions described in Draft 8 did not scope everything a new shell script would anyway, such as the current working directory, or umask, but instead picked a few select properties. The basic argument was that if one wanted scoping of the execution environment, the mechanism already exists: put the commands in a new shell script and call it. All traditional shells that implemented functions, other than the KornShell, have implemented functions that operate in the current execution environment. Because of this, Draft 9 removed any local scoping of traps or options.

Local variables within a function were considered and included in Draft 9 (controlled by the special built-in local), but were removed because they do not fit the simple model developed for the scoping of functions and there was some opposition to adding yet another new special built-in from outside existing practice.
Implementations should reserve the identifier local (as well as typeset, as used in the KornShell) in case this local variable mechanism is adopted in a future version of POSIX.2.

A separate issue from the execution environment of a function is the availability of that function to child shells. A few objectors, including the author of the original Version 7 UNIX system shell, maintained that just as a variable can be shared with child shells by exporting it, so should a function--and so this capability has been added to the standard. In previous drafts, the export command therefore had a -f flag for exporting functions. Functions that were exported were to be put into the environment as name()=value pairs, and upon invocation, the shell would scan the environment for these, and automatically define these functions. This facility received a lot of balloting opposition and was removed from Draft 11. Some of the arguments against exportable functions were:

- There was little existing practice. The Ninth Edition shell provided them, but there was controversy over how well it worked.

- There are numerous security problems associated with functions appearing in a script's environment and overriding standard utilities or the application's own utilities.

- There was controversy over requiring make to import functions, where it has historically used an exec function for many of its command line executions.

- Functions can be big and the environment is of a limited size. (The counter-argument was that functions are no different than variables in terms of size: there can be big ones, and there can be small ones--and just as one does not export huge variables, one does not export huge functions.
However, this insight might be lost on the average shell-function writer, who typically writes much larger functions than variables.)

As far as can be determined, the functions in POSIX.2 match those in System V. The KornShell has two methods of defining functions:

      function fname { compound-list }

and

      fname() { compound-list }

The latter uses the same definition as POSIX.2, but differs in semantics, as described previously. A future edition of the KornShell is planned to align the latter syntax with POSIX and keep the former as-is.

The name space for functions is limited to that of a name because of historical practice. Complications in defining the syntactic rules for the function definition command and in dealing with known extensions such as the KornShell's @() prevented the name space from being widened to a word, as requested by some balloters. Using functions to support synonyms such as the C-shell's !! and % is thus disallowed to portable applications, but acceptable as an extension. For interactive users, the aliasing facilities in the UPE should be adequate for this purpose. It is recognized that the name space for utilities in the file system is wider than that currently supported for functions, if the portable filename character set guidelines are ignored, but it did not seem useful to mandate extensions in systems for so little benefit to portable applications.

The () in the function definition command consists of two operators. Therefore, intermixing <blank>s with the fname, (, and ) is allowed, but unnecessary.
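Two points from this rationale can be sketched together, purely as an illustration: a definition written with blanks between the fname, (, and ), and an assignment made inside a function that remains visible to the caller because the function runs in the current execution environment. The names set_mode, mode, and result are hypothetical.

```shell
mode=initial

# Blanks between the name, "(" and ")" are accepted, because "(" and ")"
# are two separate operators.
set_mode ( ) {
    mode=$1            # ordinary assignment; it is not local to the function
}

set_mode changed       # executed in the current execution environment
result=$mode           # the caller sees the value assigned by the function
```

By contrast, the same assignment made in a subshell, as in ( mode=other ), would not be visible to the caller once the subshell exits.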
An example of how a function definition can be used wherever a simple command is allowed:

      # If variable i is equal to "yes",
      # define function foo to be ls -l
      #
      [ X$i = Xyes ] && foo() {
              ls -l
      }

END_RATIONALE

3.10 Shell Grammar

The following grammar describes the Shell Command Language. Any discrepancies found between this grammar and the preceding description shall be resolved in favor of this clause.

3.10.1 Shell Grammar Lexical Conventions

The input language to the shell must be first recognized at the character level. The resulting tokens shall be classified by their immediate context according to the following rules (applied in order). These rules are used to determine what a ``token'' that is subject to parsing at the token level is. The rules for token recognition in 3.3 shall apply.

(1) A <newline> shall be returned as the token identifier NEWLINE.

(2) If the token is an operator, the token identifier for that operator shall result.

(3) If the string consists solely of digits and the delimiter character is one of < or >, the token identifier IO_NUMBER shall be returned.

(4) Otherwise, the token identifier TOKEN shall result.

Further distinction on TOKEN is context-dependent. It may be that the same TOKEN yields WORD, a NAME, an ASSIGNMENT, or one of the reserved words below, dependent upon the context. Some of the productions in the grammar below are annotated with a rule number from the following list. When a TOKEN is seen where one of those annotated productions could be used to reduce the symbol, the applicable rule shall be applied to convert the token identifier type of the TOKEN to a token identifier acceptable at that point in the grammar. The reduction shall then proceed based upon the token identifier type yielded by the rule applied. When more than one rule applies, the highest numbered rule shall apply (which in turn may refer to another rule). [Note that except in rule (7), the presence of an = in the token has no effect.]

The WORD tokens shall have the word expansion rules applied to them immediately before the associated command is executed, not at the time the command is parsed.

3.10.2 Shell Grammar Rules

(1) [Command Name] When the TOKEN is exactly a reserved word, the token identifier for that reserved word shall result. Otherwise, the token WORD shall be returned. Also, if the parser is in any state where only a reserved word could be the next correct token, proceed as above.

NOTE: Because at this point quote marks are retained in the token, quoted strings cannot be recognized as reserved words. This rule also implies that reserved words will not be recognized except in certain positions in the input, such as after a <newline> or semicolon; the grammar presumes that if the reserved word is intended, it will be properly delimited by the user, and does not attempt to reflect that requirement directly. Also note that line joining is done before tokenization, as described in 3.2.1, so escaped newlines are already removed at this point.

NOTE: Rule (1) is not directly referenced in the grammar, but is referred to by other rules, or applies globally.

(2) [Redirection to/from filename] The expansions specified in 3.7 shall occur. As specified there, exactly one field can result (or the result is unspecified), and there are additional requirements on pathname expansion.

(3) [Redirection from here-document] Quote removal [3.7.4] shall be applied to the word to determine the delimiter that will be used to find the end of the here-document that begins after the next <newline>.

(4) [Case statement termination] When the TOKEN is exactly the reserved word Esac, the token identifier for Esac shall result.
Otherwise, the token WORD shall be returned.

(5) [NAME in for] When the TOKEN meets the requirements for a name [3.1.5], the