IEEE P1003.2 Draft 11.2 - September 1991 Copyright (c) 1991 by the Institute of Electrical and Electronics Engineers, Inc. 345 East 47th Street New York, NY 10017, USA All rights reserved as an unpublished work. This is an unapproved and unpublished IEEE Standards Draft, subject to change. The publication, distribution, or copying of this draft, as well as all derivative works based on this draft, is expressly prohibited except as set forth below. Permission is hereby granted for IEEE Standards Committee participants to reproduce this document for purposes of IEEE standardization activities only, and subject to the restrictions contained herein. Permission is hereby also granted for member bodies and technical committees of ISO and IEC to reproduce this document for purposes of developing a national position, subject to the restrictions contained herein. Permission is hereby also granted to the preceding entities to make limited copies of this document in an electronic form only for the stated activities. The following restrictions apply to reproducing or transmitting the document in any form: 1) all copies or portions thereof must identify the document's IEEE project number and draft number, and must be accompanied by this entire notice in a prominent location; 2) no portion of this document may be redistributed in any modified or abridged form without the prior approval of the IEEE Standards Department. Other entities seeking permission to reproduce this document, or any portion thereof, for standardization or other activities, must contact the IEEE Standards Department for the appropriate license. Use of information contained in this unapproved draft is at your own risk. IEEE Standards Department Copyright and Permissions 445 Hoes Lane, P.O. 
Box 1331 Piscataway, NJ 08855-1331, USA +1 (908) 562-3800 +1 (908) 562-1571 [FAX]

Part 2: SHELL AND UTILITIES P1003.2/D11.2

2.2.2.75 group ID: A nonnegative integer, which can be contained in an object of type gid_t, that is used to identify a group of system users. Each system user is a member of at least one group. When the identity of a group is associated with a process, a group ID value is referred to as a real group ID, an effective group ID, one of the (optional) supplementary group IDs, or an (optional) saved set-group-ID. [POSIX.1 {8}]

2.2.2.76 hard link: The relationship between two directory entries that represent the same file; the result of an execution of the ln utility or the POSIX.1 {8} link() function.

2.2.2.77 home directory: The current directory associated with a user at the time of login.

2.2.2.78 incomplete line: A sequence of text consisting of one or more non-<newline> characters at the end of the file.

2.2.2.79 invoke: To perform the actions described in 3.9.1.1, except that searching for shell functions and special built-ins is suppressed. See also execute (2.2.2.49).

2.2.2.80 job control: A facility that allows users to selectively stop (suspend) the execution of processes and continue (resume) their execution at a later point. The user typically employs this facility via the interactive interface jointly supplied by the terminal I/O driver and a command interpreter. POSIX.1 {8} conforming implementations may optionally support job control facilities; the presence of this option is indicated to the application at compile time or run time by the definition of the {_POSIX_JOB_CONTROL} symbol; see POSIX.1 {8} 2.9. [POSIX.1 {8}]

2.2.2.81 line: A sequence of text consisting of zero or more non-<newline> characters plus a terminating <newline> character.

2.2.2.82 link: See directory entry in 2.2.2.36.

2.2.2.83 link count: The number of directory entries that refer to a particular file. [POSIX.1 {8}]
2.2.2.84 locale: The definition of the subset of a user's environment that depends on language and cultural conventions; see 2.5.

2.2.2.85 login: The unspecified activity by which a user gains access to the system. Each login shall be associated with exactly one login name. [POSIX.1 {8}]

2.2.2.86 login name: A user name that is associated with a login. [POSIX.1 {8}]

2.2.2.87 mode: A collection of attributes that specifies a file's type and its access permissions. See file access permissions in 2.2.2.55. [POSIX.1 {8}]

2.2.2.88 multicharacter collating element: A sequence of two or more characters that collate as an entity. For example, in some coded character sets, an accented character is represented by a (nonspacing) accent, followed by the letter. Another example is the Spanish elements ``ch'' and ``ll.''

2.2.2.89 negative response: An input string that matches one of the responses acceptable to the LC_MESSAGES category keyword noexpr, matching an extended regular expression in the current locale. See 2.5.

2.2.2.90 <newline>: A character that in the output stream shall indicate that printing should start at the beginning of the next line. The <newline> shall be the character designated by '\n' in the C language binding. It is unspecified whether this character is the exact sequence transmitted to an output device by the system to accomplish the movement to the next line.

2.2.2.91 NUL: A character with all bits set to zero.

2.2.2.92 null string: See empty string in 2.2.2.45.

2.2.2.93 number-sign: The character ``#''.
This standard permits the substitution of the ``pound sign'' graphic defined in ISO/IEC 646 {1} for this symbol when the character set being used has substituted that graphic for the graphic #. The graphic symbol # is always used in this standard.

2.2.2.94 object file: A regular file containing the output of a compiler, formatted as input to a linkage editor for linking with other object files into an executable form. The methods of linking are unspecified and may involve the dynamic linking of objects at run-time. The internal format of an object file is unspecified, but a conforming application shall not assume an object file is a text file.

2.2.2.95 open file: A file that is currently associated with a file descriptor. [POSIX.1 {8}]

2.2.2.96 operand: An argument to a command that is generally used as an object supplying information to a utility necessary to complete its processing. Operands generally follow the options in a command line. See 2.10.1.

2.2.2.97 option: An argument to a command that is generally used to specify changes in the utility's default behavior; see 2.10.1.

2.2.2.98 option-argument: A parameter that follows certain options. In some cases an option-argument is included within the same argument string as the option; in most cases it is the next argument. See 2.10.1.

2.2.2.99 parent directory: (1) When discussing a given directory, the directory that both contains a directory entry for the given directory and is represented by the pathname dot-dot in the given directory. (2) When discussing other types of files, a directory containing a directory entry for the file under discussion. This concept does not apply to dot and dot-dot. [POSIX.1 {8}]

2.2.2.100 parent process: See process in 2.2.2.114.
[POSIX.1 {8}]

2.2.2.101 parent process ID: An attribute of a new process after it is created by a currently active process. The parent process ID of a process is the process ID of its creator, for the lifetime of the creator. After the creator's lifetime has ended, the parent process ID is the process ID of an implementation-defined system process. [POSIX.1 {8}]

2.2.2.102 pathname: A string that is used to identify a file. A pathname consists of, at most, {PATH_MAX} bytes, including the terminating null character. It has an optional beginning slash, followed by zero or more filenames separated by slashes. If the pathname refers to a directory, it may also have one or more trailing slashes. Multiple successive slashes are considered to be the same as one slash. A pathname that begins with two successive slashes may be interpreted in an implementation-defined manner, although more than two leading slashes shall be treated as a single slash. The interpretation of the pathname is described in pathname resolution in 2.2.2.104. [POSIX.1 {8}]

2.2.2.103 pathname component: See filename in 2.2.2.61. [POSIX.1 {8}]

2.2.2.104 pathname resolution: A concept of the underlying system, as follows. [POSIX.1 {8}]

Pathname resolution is performed for a process to resolve a pathname to a particular file in a file hierarchy. There may be multiple pathnames that resolve to the same file. Each filename in the pathname is located in the directory specified by its predecessor (for example, in the pathname fragment ``a/b'', file ``b'' is located in directory ``a''). Pathname resolution fails if this cannot be accomplished.

If the pathname begins with a slash, the predecessor of the first filename in the pathname is taken to be the root directory of the process (such pathnames are referred to as ``absolute pathnames'').
If the pathname does not begin with a slash, the predecessor of the first filename of the pathname is taken to be the current working directory of the process (such pathnames are referred to as ``relative pathnames'').

The interpretation of a pathname component is dependent on the values of {NAME_MAX} and {_POSIX_NO_TRUNC} associated with the path prefix of that component. If any pathname component is longer than {NAME_MAX}, and {_POSIX_NO_TRUNC} is in effect for the path prefix of that component [see pathconf() in POSIX.1 {8} 5.7.1], the implementation shall consider this an error condition. Otherwise, the implementation shall use the first {NAME_MAX} bytes of the pathname component.

The special filename dot refers to the directory specified by its predecessor. The special filename dot-dot refers to the parent directory of its predecessor directory. As a special case, in the root directory, dot-dot may refer to the root directory itself.

A pathname consisting of a single slash resolves to the root directory of the process. A null pathname is invalid.

2.2.2.105 path prefix: A pathname, with an optional ending slash, that refers to a directory. [POSIX.1 {8}]

2.2.2.106 pattern: A sequence of characters used either with regular expression notation (see 2.8) or for pathname expansion (see 3.6.6), as a means of selecting various character strings or pathnames, respectively. The syntaxes of the two patterns are similar, but not identical; this standard always indicates the type of pattern being referred to in the immediate context of the use of the term.

2.2.2.107 period: The character ``.''. The term period is contrasted against dot (2.2.2.38), which is used to describe a specific directory entry.
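
The slash and dot rules of 2.2.2.102 and 2.2.2.104 can be illustrated with a small sketch. It is not part of this standard and only rewrites the pathname string; it never consults a file hierarchy. The function name normalize is hypothetical, and three choices are assumptions of the sketch rather than requirements: a leading double slash is preserved unchanged, dot-dot at the root is dropped (the standard only says it "may" refer to the root), and every named predecessor is assumed to exist and to be a directory. The {NAME_MAX} and {PATH_MAX} limits are not checked.

```python
def normalize(pathname):
    """Lexically simplify a pathname (hypothetical helper, string-only)."""
    if pathname == "":
        # 2.2.2.104: a null pathname is invalid.
        raise ValueError("a null pathname is invalid")

    rest = pathname.lstrip("/")
    leading = len(pathname) - len(rest)
    if leading == 0:
        prefix = ""      # relative pathname
    elif leading == 2:
        prefix = "//"    # implementation-defined; preserved here (assumption)
    else:
        prefix = "/"     # one slash, or more than two treated as one

    components = []
    for name in rest.split("/"):
        if name in ("", "."):
            # Successive slashes collapse; dot names its predecessor.
            continue
        if name == "..":
            if components and components[-1] != "..":
                components.pop()          # step back to the parent directory
            elif prefix == "":
                components.append("..")   # cannot simplify further lexically
            # Otherwise we are at the root: dot-dot is dropped (assumption).
            continue
        components.append(name)

    return (prefix + "/".join(components)) or "."


print(normalize("/a//b/./c/"))   # /a/b/c
print(normalize("///usr/bin"))   # /usr/bin
print(normalize("/.."))          # /
```

An actual implementation resolves each component against the directory named by its predecessor, so it can fail where this string rewrite succeeds.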
2.2.2.108 permissions: See file access permissions in 2.2.2.55.

2.2.2.109 pipe: An object accessed by one of the pair of file descriptors created by the POSIX.1 {8} pipe() function. Once created, the file descriptors can be used to manipulate it, and it behaves identically to a FIFO special file when accessed in this way. It has no name in the file hierarchy. [POSIX.1 {8}]

2.2.2.110 portable character set: The set of characters described in 2.4 that is supported on all conforming systems. This term is contrasted against the smaller portable filename character set; see 2.2.2.111.

2.2.2.111 portable filename character set: The set of characters from which portable filenames are constructed. For a filename to be portable across conforming implementations of this standard, it shall consist only of the following characters:

    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    0 1 2 3 4 5 6 7 8 9 . _ -

The last three characters are the period, underscore, and hyphen characters, respectively. The hyphen shall not be used as the first character of a portable filename. Upper- and lowercase letters shall retain their unique identities between conforming implementations. In the case of a portable pathname, the slash character may also be used. [POSIX.1 {8}]

2.2.2.112 printable character: One of the characters included in the print character classification of the LC_CTYPE category in the current locale; see 2.5.2.1.

2.2.2.113 privilege: See appropriate privileges in 2.2.2.6. [POSIX.1 {8}]

2.2.2.114 process: An address space and single thread of control that executes within that address space, and its required system resources.
A process is created by another process issuing the POSIX.1 {8} fork() function. The process that issues fork() is known as the parent process, and the new process created by the fork() is known as the child process. [POSIX.1 {8}]

The attributes of processes required by POSIX.2 form a subset of those in POSIX.1 {8}; see 2.9.1.

2.2.2.115 process group: A collection of processes that permits the signaling of related processes. Each process in the system is a member of a process group that is identified by a process group ID. A newly created process joins the process group of its creator. [POSIX.1 {8}]

2.2.2.116 process group ID: The unique identifier representing a process group during its lifetime. A process group ID is a positive integer that can be contained in a pid_t. It shall not be reused by the system until the process group lifetime ends. [POSIX.1 {8}]

2.2.2.117 process group leader: A process whose process ID is the same as its process group ID. [POSIX.1 {8}]

2.2.2.118 process ID: The unique identifier representing a process. A process ID is a positive integer that can be contained in a pid_t. A process ID shall not be reused by the system until the process lifetime ends. In addition, if there exists a process group whose process group ID is equal to that process ID, the process ID shall not be reused by the system until the process group lifetime ends. A process that is not a system process shall not have a process ID of 1. [POSIX.1 {8}]

2.2.2.119 program: A prepared sequence of instructions to the system to accomplish a defined task. The term program in POSIX.2 encompasses applications written in the Shell Command Language, complex utility input languages (for example, awk, lex, sed, etc.), and high-level languages.
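
As an illustration of the portable filename character set in 2.2.2.111, the following sketch tests a name against the listed characters and the leading-hyphen rule. It is not part of this standard. The names PORTABLE_FILENAME_CHARS, is_portable_filename, and is_portable_pathname are hypothetical, rejecting an empty filename is an assumption of the sketch, and length limits such as {NAME_MAX} are not checked.

```python
import string

# The letters, digits, period, underscore, and hyphen listed in 2.2.2.111.
PORTABLE_FILENAME_CHARS = frozenset(
    string.ascii_letters + string.digits + "._-"
)


def is_portable_filename(name):
    """Return True if name uses only portable filename characters."""
    if not name:
        return False            # assumption: an empty name is not a filename
    if name.startswith("-"):
        return False            # hyphen shall not be the first character
    return all(ch in PORTABLE_FILENAME_CHARS for ch in name)


def is_portable_pathname(pathname):
    """Return True if every filename in a slash-separated pathname is portable."""
    if not pathname:
        return False            # a null pathname is invalid (2.2.2.104)
    # Empty pieces come from leading, trailing, or successive slashes.
    return all(is_portable_filename(c) for c in pathname.split("/") if c)


print(is_portable_filename("README.txt"))      # True
print(is_portable_filename("-rf"))             # False
print(is_portable_pathname("/usr/local/bin"))  # True
```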
2.2.2.120 read-only file system: A file system that has implementation-defined characteristics restricting modifications. [POSIX.1 {8}]

2.2.2.121 real group ID: The attribute of a process that, at the time of process creation, identifies the group of the user who created the process. See group ID in 2.2.2.75. This value is subject to change during the process lifetime, as described in POSIX.1 {8} 4.2.2 [setgid()]. [POSIX.1 {8}]

2.2.2.122 real user ID: The attribute of a process that, at the time of process creation, identifies the user who created the process. See user ID in 2.2.2.154. This value is subject to change during the process lifetime, as described in POSIX.1 {8} 4.2.2 [setuid()]. [POSIX.1 {8}]

2.2.2.123 regular expression: A pattern (sequence of characters or symbols) constructed according to the rules defined in 2.8.

2.2.2.124 regular file: A file that is a randomly accessible sequence of bytes, with no further structure imposed by the system. [POSIX.1 {8}]

2.2.2.125 relative pathname: See pathname resolution in 2.2.2.104. [POSIX.1 {8}]

2.2.2.126 root directory: A directory, associated with a process, that is used in pathname resolution for pathnames that begin with a slash. [POSIX.1 {8}]

2.2.2.127 saved set-group-ID: An attribute of a process that allows some flexibility in the assignment of the effective group ID attribute, when the saved set-user-ID option is implemented, as described in POSIX.1 {8} 3.1.2 (exec) and 4.2.2 [setgid()]. [POSIX.1 {8}]

2.2.2.128 saved set-user-ID: An attribute of a process that allows some flexibility in the assignment of the effective user ID attribute, when the saved set-user-ID option is implemented, as described in POSIX.1 {8} 3.1.2 and 4.2.2 [setuid()].
[POSIX.1 {8}]

2.2.2.129 seconds since the Epoch: A value to be interpreted as the number of seconds between a specified time and the Epoch. A Coordinated Universal Time name [specified in terms of seconds (tm_sec), minutes (tm_min), hours (tm_hour), days since January 1 of the year (tm_yday), and calendar year minus 1900 (tm_year)] is related to a time represented as seconds since the Epoch, according to the expression below. If the year < 1970 or the value is negative, the relationship is undefined. If the year >= 1970 and the value is nonnegative, the value is related to a Coordinated Universal Time name according to the expression:

    tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 + (tm_year-70)*31536000 + ((tm_year-69)/4)*86400

[POSIX.1 {8}]

2.2.2.130 session: A collection of process groups established for job control purposes. Each process group is a member of a session. A process is considered to be a member of the session of which its process group is a member. A newly created process joins the session of its creator. A process can alter its session membership (see POSIX.1 {8} 4.3.2 [setsid()]). Implementations that support the POSIX.1 {8} setpgid() function (see POSIX.1 {8} 4.3.3) can have multiple process groups in the same session. [POSIX.1 {8}]

2.2.2.131 session leader: A process that has created a session; see POSIX.1 {8} 4.3.2 [setsid()]. [POSIX.1 {8}]

2.2.2.132 session lifetime: The period between when a session is created and the end of the lifetime of all the process groups that remain as members of the session. [POSIX.1 {8}]

2.2.2.133 shell: A program that interprets sequences of text input as commands.
It may operate on an input stream or it may interactively prompt and read commands from a terminal.

2.2.2.134 Shell, The: The Shell Command Language Interpreter (see 4.56), a specific instance of a shell.

2.2.2.135 shell script: A file containing shell commands. If the file is made executable, it can be executed by specifying its name as a simple command (see the description of simple command in 3.9.1). Execution of a shell script causes a shell to execute the commands within the script. Alternately, a shell can be requested to execute the commands in a shell script by specifying the name of the shell script as the operand to the sh utility.

2.2.2.136 signal: A mechanism by which a process may be notified of, or affected by, an event occurring in the system. Examples of such events include hardware exceptions and specific actions by processes. The term signal is also used to refer to the event itself. [POSIX.1 {8}]

2.2.2.137 single-quote: The character ``''', also known as apostrophe.

2.2.2.138 slash: The character ``/'', also known as solidus.

2.2.2.139 source code: When dealing with the Shell Command Language, source code is input to the command language interpreter. The term shell script is synonymous with this meaning.

When dealing with the C Language Bindings Option, source code is input to a C compiler conforming to the C Standard {7}.

When dealing with another ISO/IEC conforming language, source code is input to a compiler conforming to that ISO/IEC standard.

Source code also refers to the input statements prepared for the following standard utilities: awk, bc, ed, lex, localedef, make, sed, and yacc.

Source code can also refer to a collection of sources meeting any or all of these meanings.

2.2.2.140 <space>: The character defined in 2.4 as <space>.
The <space> character is a member of the space character class of the current locale, but represents the single character, and not all of the possible members of the class. (See 2.2.2.158.)

2.2.2.141 standard error: An output stream usually intended to be used for diagnostic messages.

2.2.2.142 standard input: An input stream usually intended to be used for primary data input.

2.2.2.143 standard output: An output stream usually intended to be used for primary data output.

2.2.2.144 standard utilities: The utilities defined by this standard in Sections 4, 5, and 6, Annex A, and Annex C, and in similar sections of utility definitions introduced in future revisions of, and supplements to, this standard.

2.2.2.145 stream: An ordered sequence of characters, as described by the C Standard {7}.

2.2.2.146 supplementary group ID: An attribute of a process used in determining file access permissions. A process has up to {NGROUPS_MAX} supplementary group IDs in addition to the effective group ID. The supplementary group IDs of a process are set to the supplementary group IDs of the parent process when the process is created. Whether a process's effective group ID is included in or omitted from its list of supplementary group IDs is unspecified. [POSIX.1 {8}]

2.2.2.147 system: An implementation of this standard.

2.2.2.148 <tab>: The horizontal tab character.

2.2.2.149 terminal [terminal device]: A character special file that obeys the specifications of the POSIX.1 {8} General Terminal Interface. [POSIX.1 {8}]

2.2.2.150 text column: A roughly rectangular block of characters capable of being laid out side-by-side next to other text columns on an output page or terminal screen. The widths of text columns are measured in column positions.
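
The expression given for seconds since the Epoch in 2.2.2.129 can be evaluated directly. The sketch below is not part of this standard. The function name seconds_since_epoch is hypothetical, the division in the expression is taken to be integer division as in C, and raising an error for years before 1970 is a choice of the sketch (the standard only says the relationship is undefined).

```python
def seconds_since_epoch(tm_sec, tm_min, tm_hour, tm_yday, tm_year):
    """Evaluate the 2.2.2.129 expression for a UTC time name.

    tm_yday is days since January 1 of the year (0-based) and
    tm_year is the calendar year minus 1900.
    """
    if tm_year < 70:
        # The relationship is undefined for years before 1970.
        raise ValueError("relationship is undefined before 1970")
    return (
        tm_sec
        + tm_min * 60
        + tm_hour * 3600
        + tm_yday * 86400
        + (tm_year - 70) * 31536000          # 365-day years since 1970
        + ((tm_year - 69) // 4) * 86400      # one extra day per leap year passed
    )


# 1 September 1991, 00:00:00 UTC: tm_yday is 243, tm_year is 91.
print(seconds_since_epoch(0, 0, 0, 243, 91))  # 683683200
```

The last term adds one day for each leap year completed since 1970: it first becomes nonzero for tm_year 73, once the leap day of 1972 lies in the past.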
2.2.2.151 text file: A file that contains characters organized into one or more lines. The lines shall not contain NUL characters and none shall exceed {LINE_MAX} bytes in length, including the <newline>. Although POSIX.1 {8} does not distinguish between text files and binary files (see the C Standard {7}), many utilities only produce predictable or meaningful output when operating on text files. The standard utilities that have such restrictions always specify text files in their Standard Input or Input Files subclauses.

2.2.2.152 tilde: The character ``~''.

2.2.2.153 user database: See Section 9 in POSIX.1 {8}.

2.2.2.154 user ID: A nonnegative integer, which can be contained in an object of type uid_t, that is used to identify a system user. When the identity of a user is associated with a process, a user ID value is referred to as a real user ID, an effective user ID, or an (optional) saved set-user-ID. [POSIX.1 {8}]

2.2.2.155 user name: A string that is used to identify a user, as described in POSIX.1 {8} 9.1. [POSIX.1 {8}]

2.2.2.156 utility: A program that can be called by name from a shell to perform a specific task, or related set of tasks. This program shall either be an executable file, such as might be produced by a compiler/linker system from computer source code, or a file of shell source code, directly interpreted by the shell. The program may have been produced by the user, provided by the implementor of this standard, or acquired from an independent distributor. The term utility does not apply to the special built-in utilities provided as part of the shell command language; see 3.14.
The system may implement certain utilities as shell functions (see 3.9.5) or built-ins (see 2.3), but only an application that is aware of the command search order described in 3.9.1.1 or of performance characteristics can discern differences between the behavior of such a function or built-in and that of a true executable file.

2.2.2.157 <vertical-tab>: The vertical tab character.

2.2.2.158 white space: A sequence of one or more characters that belong to the space character class as defined via the LC_CTYPE category in the current locale. In the POSIX Locale, white space consists of one or more <blank>s (<space>s and <tab>s), <newline>s, <carriage-return>s, <form-feed>s, and <vertical-tab>s.

2.2.2.159 working directory [current working directory]: A directory, associated with a process, that is used in pathname resolution for pathnames that do not begin with a slash.

2.2.2.160 write: To output characters to a file, such as standard output or standard error. Unless otherwise stated, standard output is the default output destination for all uses of the term write.

BEGIN_RATIONALE

2.2.2.161 General Terms Rationale. (This subclause is not a part of P1003.2)

Many of the terms originated in POSIX.1 {8} and are duplicated in this standard to meet editorial requirements. In some cases, there is supplementary text that presents additional information concerning POSIX.2 aspects of the concept.

This standard uses the term character to mean a sequence of one or more bytes representing a single graphic symbol, as defined in POSIX.1 {8}. The deviation in the exact text of the C Standard {7} definition for byte meets the intent of the C Standard {7} Rationale and the developers of POSIX.1 {8}, but clears up the ambiguity raised by the term basic execution character set, which is not defined in POSIX.1 {8}. It is expected that a future version of POSIX.1 {8} will align with the text
used here. The octet-minimum requirement is merely a reflection of the {CHAR_BIT} value in POSIX.1 {8} and the C Standard {7}.

The POSIX.1 {8} term file mode is a superset of the POSIX.2 file mode bits. POSIX.1 {8} defines the file mode as the entire mode_t object (which includes the file type, historically in the upper four bits, the sticky bit on most implementations, and potentially other nonstandardized attributes), while POSIX.2 file mode bits include only the eleven defined bits.

The terms command and utility are related but have distinct meanings. Command is defined as ``a directive to a shell to perform a specific task.'' The directive can be in the form of a single utility name (for example, ls), or the directive can take the form of a compound command (for example, ls | grep name | pr). A utility is a program that is callable by name from a shell. Issuing only the utility's name to a shell is the equivalent of a one-word command. A utility may be invoked as a separate program that executes in a different process than the command language interpreter, or may be implemented as a part of the command language interpreter. For example, the echo command (the directive to perform a specific task) may be implemented such that the echo utility (the logic that performs the task of echoing) is in a separate program and, therefore, is executed in a process that is different from that of the command language interpreter. Conversely, the logic that performs the echo utility could be built into the command language interpreter and, therefore, execute in the same process as the command language interpreter.

The terms tool and application can be thought of as being synonymous with utility from the perspective of the operating system kernel.
Tools, applications, and utilities have historically run, typically, in processes above the kernel level. Tools and utilities have been historically a part of the operating system nonkernel code, and performed system related functions such as listing directory contents, checking file systems, repairing file systems, or extracting system status information. Applications have not generally been a part of the operating system, and perform nonsystem related functions such as word processing, architectural design, mechanical design, workstation publishing, or financial analysis. Utilities have most frequently been provided by the operating system vendor, applications by third party software vendors or by the users themselves. Nevertheless, the standard does not differentiate between tools, utilities, and applications when it comes to receiving services from the system, a shell, or the standard utilities. (For example, the xargs utility invokes another utility; it would be of fairly limited usefulness if the users couldn't run their own applications in place of the standard utilities.) Utilities are not applications in the sense that they are not themselves subject to the restrictions of this standard or any other standard--there is no requirement for grep, stty, or any of the utilities defined here to be any of the classes of Conforming POSIX.2 Applications.

The term text file does not prevent the inclusion of control or other nonprintable characters (other than NUL). Therefore, standard utilities that list text files as inputs or outputs are either able to process the special characters gracefully or they explicitly describe their limitations within their individual subclauses. The definition of text file has caused a good deal of controversy.
The only difference between text and binary here is that text files have lines of (less than {LINE_MAX}) bytes, with no NUL characters, each terminated by a <newline> character. The definition allows a file with a single <newline>, but not a totally empty file, to be called a text file. If a file ends with an incomplete line it is not strictly a text file by this definition. A related point is that the <newline> character referred to in this standard is not some generic line separator, but a single character; files created on systems where they use multiple characters for ends of lines are not portable to all POSIX systems without some translation process unspecified by this standard.

The term hard link is historically derived. In systems without extensions to ln, it is a synonym for link. The concept of a symbolic link originated with BSD systems and the term hard is used to differentiate between the two types of links.

There are some terms used that are undefined in POSIX.2, POSIX.1 {8}, or the C Standard {7}. The working group believes that these terms have a ``common usage,'' and that a definition in POSIX.2 would not be appropriate. Terms in this category include, but are not limited to, the following: application, character set, login session, user. Good sources for general terms of this type are the ISO/AFNOR Dictionary of Computer Science {B12} and IEEE Dictionary {B18}.

The term file name was defined in previous drafts to be a synonym for pathname. It was removed in the face of objections that it was too close to filename, which means something different (a pathname component).
The general solution to this has been to use the term file in parameter names, rather than file_name, and to make more liberal use of the correct term, pathname; an alternate solution has been to replace file name with the name of the file.

Many character names are included in this subclause. Because of historical usage, some of these names are a bit different than the ones used in international standards for character sets, such as ISO/IEC 646 {1}. It was felt that many more UNIX system people than character set lawyers would be reading and reviewing the standard, so the former group was the one accommodated. On the other hand, the precise definitions of <space>, <blank>, and white space have replaced common usage (where they have been used virtually interchangeably), as the standard attempts to balance readability against precision.

In earlier drafts, the names for the character pairs ( ), [ ], and { } were referred to as ``opening'' and ``closing'' parentheses, brackets, and braces. These were changed to the current ``left'' and ``right.'' When the characters are used to express natural language, the terms ``open'' and ``close'' imply text direction more strongly than ``left'' and ``right.'' By POSIX.2 definition, the <left-parenthesis> character will always be mapped to the glyph '(' regardless of the locale. But when reading right-to-left, the opening punctuation of a parenthesized text segment would be ')'. The ``left'' and ``right'' forms are the correct ones because the punctuation appears on the left and right, respectively, of the parenthesized text regardless of the direction one might be reading the text.

The <backspace> character and the ERASE special character defined in POSIX.1 {8} should not be confused.
The uses of the <backspace> character and of the ERASE special character defined in the POSIX.1 {8} termios clause on special characters (7.1.1.9) are distinct, even though the ERASE special character may be set to <backspace>.

In most one-byte character sets, such as ASCII, the concept of column positions is identical to character positions and to bytes. Therefore, it has been historically acceptable for some implementations to describe line folding or tab stops or table column alignment in terms of bytes or character positions. Other character sets pose complications, as they can have internal representations longer than one octet and they can have displayable characters that have different widths on the terminal screen or printer.

In this standard the term column positions has been defined to mean character--not byte--positions in input files (such as ``column position 7 of the FORTRAN input''). Output files describe the column position in terms of the display width of the narrowest printable character in the character set, adjusted to fit the characteristics of the output device. It is very possible that n column positions will not be able to hold n characters in some character sets, unless all of those characters are of the narrowest width. It is assumed that the implementation is aware of the width of the various characters, deriving this information from the value of LC_CTYPE, and thus can determine how many column positions to allot for each character in those utilities where it is important. This information is not available to the portable application writer because POSIX.2 provides no interface specification to retrieve such information.

The term column position was used instead of the more natural column as the latter is frequently used in the standard in the different contexts of columns of figures, columns of table values, etc. Wherever confusion might result, these latter types of columns are referred to as text columns.
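The distinction between bytes and character positions drawn above can be illustrated with a small shell sketch. It is only an illustration under stated assumptions: byte_count and char_count are hypothetical helper names, and the sketch assumes a wc utility that accepts both -c (bytes) and -m (characters), as found on current systems. It does not attempt to measure display width, since, as noted above, POSIX.2 provides no interface for retrieving that information.

```shell
#!/bin/sh
# Sketch only: byte_count and char_count are hypothetical helper names,
# not interfaces defined by POSIX.2.

# Number of bytes in the argument; this does not depend on the locale.
byte_count() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Number of characters in the argument, as decoded under the current
# LC_CTYPE setting. For a one-byte character set this equals byte_count.
char_count() {
    printf '%s' "$1" | wc -m | tr -d ' '
}

ascii='column 7'
multibyte=$(printf '\303\251')   # two bytes; a single character in UTF-8

echo "ascii:     bytes=$(byte_count "$ascii") chars=$(char_count "$ascii")"
echo "multibyte: bytes=$(byte_count "$multibyte") chars=$(char_count "$multibyte")"
```

For the ASCII string the two counts agree. For the multibyte string the character count depends on the locale in effect, which is why describing columns in terms of bytes stops being adequate outside one-byte character sets.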
The definition of binary file was removed, as the term is not used in the standard.

The ISO/IEC 646 {1} character set standard permits substitution of national currency symbols for the character $ in the ``reference character set'' (which is the same as ASCII). This standard permits the substitution only of the actual characters shown in ISO/IEC 646 {1}: currency sign for the dollar sign and pound sign for the number sign. This document uses the latter names and their symbols, but it is valid for an implementation to accept, for instance, the pound sign (£) as a comment character in the shell, if that is what the locale's character set uses instead of the number sign (#). Other variations of national currency symbols are not allowed, per the request of the WG15 POSIX working group.

The term stream is not related to System V's STREAMS communications facility; it is derived from historical UNIX system usage and has been made official by the C Standard {7}. The POSIX.2 standard makes no differentiation between C's text stream and binary stream.

The formula used in the POSIX.1 {8} definition of seconds since the Epoch is not perfect in all cases. See the related rationale in POSIX.1 {8}.

END_RATIONALE

2.2.3 Abbreviations

For the purposes of this standard, the following abbreviations apply:

2.2.3.1 C Standard: ISO/IEC 9899: ..., Information processing systems--Programming languages--C {7}.

2.2.3.2 ERE: An Extended Regular Expression, as defined in 2.8.4.
2.2.3.3 LC_*: An abbreviation used to represent all of the environment variables named in 2.6 whose names begin with the characters ``LC_''.

2.2.3.4 POSIX.1: ISO/IEC 9945-1: 1990: Information technology--Portable Operating System Interface (POSIX)--Part 1: System Application Program Interface (API) [C Language] {8}.

2.2.3.5 POSIX.2: This standard.

2.2.3.6 RE [BRE]: A Basic Regular Expression, as defined in 2.8.3.

2.3 Built-in Utilities

Any of the standard utilities may be implemented as regular built-in utilities within the command language interpreter. This is usually done to increase the performance of frequently used utilities or to achieve functionality that would be more difficult in a separate environment. The utilities named in Table 2-2 are frequently provided in built-in form. All of the utilities named in the table have special properties in terms of command search order within the shell, as described in 3.9.1.1.
            Table 2-2 - Regular Built-in Utilities
    ______________________________________________________
    cd         false      kill       true       wait
    command    getopts    read       umask
    ______________________________________________________

However, all of the standard utilities, including the regular built-ins in the table, but not the special built-ins described in 3.14, shall be implemented in a manner so that they can be accessed via the POSIX.1 {8} exec family of functions (if the underlying operating system provides the services of such a family to application programs) and can be invoked directly by those standard utilities that require it (env, find, nohup, xargs).

Since versions shall be provided for all utilities except for those listed previously, an application running on a system that conforms to both POSIX.1 {8} and Section 7 of this standard can use the exec family of functions, in addition to the shell command interface in 7.1 [such as the system() and popen() functions in the C binding] defined by this standard, to execute any of these utilities.

BEGIN_RATIONALE

2.3.1 Built-in Utilities Rationale. (This subclause is not a part of P1003.2)

In earlier drafts, the table of built-ins implied two things to a conforming application: these may be built-ins and these need not be executable. The second implication has now been removed and all utilities can be exec-ed. There is no requirement that these be actually built into the shell itself, but many shells will want to do so because 3.9.1.1 requires that they be found prior to the PATH search. The shell could satisfy its requirements by keeping a list of the names and directly accessing the file-system versions regardless of PATH.
Providing all of the required functionality for those such as cd or read would be more difficult.

There were originally three justifications for allowing the omission of exec-able versions:

(1) This would require wasting space in the file system, at the expense of very small systems. However, it has been pointed out that all nine in the table can be provided with nine links to a single-line shell script:

        $0 "$@"

(2) There is no sense in requiring invocation of utilities like cd because they have no value outside the shell environment or cannot be useful in a child process. However, counter-examples always seemed to be available for even the strangest cases:

        find . -type d -exec cd {} \; -exec foo {} \;
            (which invokes foo on accessible directories)

        ps ... | sed ... | xargs kill

        find . -exec true \; -a ...
            (where true is used for temporary debugging)

(3) It is confusing to have something such as kill that can easily be in the file system in the base standard, but requires built-in status for the UPE (for the % job control job ID notation). It was decided that it was more appropriate to describe the required functionality (rather than the implementation) to the system implementors and let them decide how to satisfy it.

On the other hand, there were objections raised during balloting that any distinction like this between utilities was not useful to applications and that the cost to correct it was small. These arguments were ultimately the most effective.

There were varying reasons for including utilities in the table of built-ins:

cd, getopts, read, umask, wait
The functionality of these utilities is performed more simply within the context of the current process. An example can be taken from the usage of the cd utility.
The purpose of the utility is to change the working directory for subsequent operations. The actions of cd affect the process in which cd is executed and all subsequent child processes of that process. Based on the POSIX.1 {8} process model, changes in the process environment of a child process have no effect on the parent process. If the cd utility were executed from a child process, the working directory change would be effective only in the child process. Child processes initiated subsequent to the child process that executed the cd utility would not have a changed working directory relative to the parent process.

command
This utility was placed in the table primarily to protect scripts that are concerned about their PATH being manipulated. The ``secure'' shell script example in 4.12.10 would not be possible if a PATH change retrieved an alien version of command. (An alternative would have been to implement getconf as a built-in, but it was felt that it carried too many changing configuration strings to require in the shell.)

kill
Since common extensions to kill (including the planned User Portability Extension) provide optional job control functionality using shell notation (%1, %2, etc.), some implementations would find it extremely difficult to provide this outside the shell.

true, false
These are in the table as a courtesy to programmers who wish to use the ``while true'' shell construct without protecting true from PATH searches. (It is acknowledged that ``while :'' also works, but the idiom with true is historically pervasive.)

All utilities, including those in the table, are accessible via the functions in 7.1.1 or 7.1.2 [such as system() or popen()]. There are situations where the return functionality of system() and popen() is not desirable.
Applications that require the exit status of the invoked utility will not be able to use system() or popen(), since the exit status returned is that of the command language interpreter rather than that of the invoked utility. The alternative for such applications is the use of the exec family. (The text concerning conformance to POSIX.1 {8} was included because where exec is not provided in the underlying system, there is no way to require that utilities be exec-able.)

END_RATIONALE
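Two points from the built-in utilities rationale can be sketched in shell. The first is that a working directory change made in a child environment does not reach the parent; the sketch uses a subshell as a stand-in for a child process. The second is that a regular built-in such as true or false can also be run through a utility that invokes other utilities directly (env is used here). The function names are hypothetical, and the env lines assume that file-system versions of true and false are installed.

```shell
#!/bin/sh
# Sketch only; the function names are illustrative, not part of POSIX.2.

# Working directory as seen inside a child environment after cd runs there.
dir_in_child_after_cd() {
    ( cd / && pwd )
}

# Working directory of the calling environment after a child has run cd.
dir_in_parent_after_child_cd() {
    ( cd / )
    pwd
}

# Print the exit status obtained when env invokes the file-system
# version of the utility named by the first argument.
status_via_env() {
    if env "$1"; then
        echo 0
    else
        echo "$?"
    fi
}

echo "child:     $(dir_in_child_after_cd)"
echo "parent:    $(dir_in_parent_after_child_cd)"
echo "env true:  $(status_via_env true)"
echo "env false: $(status_via_env false)"
```

The child reports the root directory while the parent still reports the directory it started in, which is why cd is performed more simply within the current process. The env lines report the status of the utility that env invoked, with no command language interpreter in between.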