diff options
Diffstat (limited to 'tcl/doc/string.n')
-rw-r--r-- | tcl/doc/string.n | 267 |
1 files changed, 120 insertions, 147 deletions
diff --git a/tcl/doc/string.n b/tcl/doc/string.n index 2aaec95474b..bc6b9d99aca 100644 --- a/tcl/doc/string.n +++ b/tcl/doc/string.n @@ -21,7 +21,6 @@ string \- Manipulate strings .PP Performs one of several string operations, depending on \fIoption\fR. The legal \fIoption\fRs (which may be abbreviated) are: -.VS 8.1 .TP \fBstring bytelength \fIstring\fR Returns a decimal string giving the number of bytes used to represent @@ -29,40 +28,35 @@ Returns a decimal string giving the number of bytes used to represent represent Unicode characters, the byte length will not be the same as the character length in general. The cases where a script cares about the byte length are rare. In almost all cases, you should use the -\fBstring length\fR operation. Refer to the \fBTcl_NumUtfChars\fR -manual entry for more details on the UTF\-8 representation. +\fBstring length\fR operation (including determining the length of a +Tcl ByteArray object). Refer to the \fBTcl_NumUtfChars\fR manual +entry for more details on the UTF\-8 representation. .TP \fBstring compare\fR ?\fB\-nocase\fR? ?\fB\-length int\fR? \fIstring1 string2\fR -.VE 8.1 -Perform a character-by-character comparison of strings \fIstring1\fR and -\fIstring2\fR. Returns -\-1, 0, or 1, depending on whether \fIstring1\fR is lexicographically -less than, equal to, or greater than \fIstring2\fR. -.VS 8.1 -If \fB\-length\fR is specified, then only the first \fIlength\fR characters -are used in the comparison. If \fB\-length\fR is negative, it is -ignored. If \fB\-nocase\fR is specified, then the strings are -compared in a case-insensitive manner. +Perform a character-by-character comparison of strings \fIstring1\fR +and \fIstring2\fR. Returns \-1, 0, or 1, depending on whether +\fIstring1\fR is lexicographically less than, equal to, or greater +than \fIstring2\fR. If \fB\-length\fR is specified, then only the +first \fIlength\fR characters are used in the comparison. If +\fB\-length\fR is negative, it is ignored. If \fB\-nocase\fR is +specified, then the strings are compared in a case-insensitive manner. .TP \fBstring equal\fR ?\fB\-nocase\fR? ?\fB-length int\fR? \fIstring1 string2\fR -Perform a character-by-character comparison of strings -\fIstring1\fR and \fIstring2\fR. Returns 1 if \fIstring1\fR and -\fIstring2\fR are identical, or 0 when not. If \fB\-length\fR is -specified, then only the first \fIlength\fR characters are used in the -comparison. If \fB\-length\fR is negative, it is ignored. If -\fB\-nocase\fR is specified, then the strings are compared in a -case-insensitive manner. +Perform a character-by-character comparison of strings \fIstring1\fR +and \fIstring2\fR. Returns 1 if \fIstring1\fR and \fIstring2\fR are +identical, or 0 when not. If \fB\-length\fR is specified, then only +the first \fIlength\fR characters are used in the comparison. If +\fB\-length\fR is negative, it is ignored. If \fB\-nocase\fR is +specified, then the strings are compared in a case-insensitive manner. .TP \fBstring first \fIstring1 string2\fR ?\fIstartIndex\fR? -.VE 8.1 Search \fIstring2\fR for a sequence of characters that exactly match the characters in \fIstring1\fR. If found, return the index of the first character in the first such match within \fIstring2\fR. If not -found, return \-1. -.VS 8.1 -If \fIstartIndex\fR is specified (in any of the forms accepted by the -\fBindex\fR method), then the search is constrained to start with the -character in \fIstring2\fR specified by the index. For example, +found, return \-1. If \fIstartIndex\fR is specified (in any of the +forms accepted by the \fBindex\fR method), then the search is +constrained to start with the character in \fIstring2\fR specified by +the index. For example, .RS .CS \fBstring first a 0a23456789abcdef 5\fR @@ -73,29 +67,22 @@ will return \fB10\fR, but .CE will return \fB\-1\fR. .RE -.VE 8.1 .TP \fBstring index \fIstring charIndex\fR -Returns the \fIcharIndex\fR'th character of the \fIstring\fR -argument. A \fIcharIndex\fR of 0 corresponds to the first -character of the string. -.VS 8.1 -\fIcharIndex\fR may be specified as -follows: +Returns the \fIcharIndex\fR'th character of the \fIstring\fR argument. +A \fIcharIndex\fR of 0 corresponds to the first character of the +string. \fIcharIndex\fR may be specified as follows: .RS .IP \fIinteger\fR 10 -The char specified at this integral index +The char specified at this integral index. .IP \fBend\fR 10 The last char of the string. .IP \fBend\-\fIinteger\fR 10 -The last char of the string minus the specified integer -offset (e.g. \fBend\-1\fR would refer to the "c" in "abcd"). +The last char of the string minus the specified integer offset +(e.g. \fBend\-1\fR would refer to the "c" in "abcd"). .PP -.VE 8.1 -If \fIcharIndex\fR is less than 0 or greater than -or equal to the length of the string then an empty string is -returned. -.VS 8.1 +If \fIcharIndex\fR is less than 0 or greater than or equal to the +length of the string then an empty string is returned. .RE .TP \fBstring is \fIclass\fR ?\fB\-strict\fR? ?\fB\-failindex \fIvarname\fR? \fIstring\fR @@ -105,16 +92,16 @@ empty string returns 0, otherwise and empty string will return 1 on any class. If \fB\-failindex\fR is specified, then if the function returns 0, the index in the string where the class was no longer valid will be stored in the variable named \fIvarname\fR. The \fIvarname\fR -will not be set if the function returns 1. The following character classes -are recognized (the class name can be abbreviated): +will not be set if the function returns 1. The following character +classes are recognized (the class name can be abbreviated): .RS .IP \fBalnum\fR 10 Any Unicode alphabet or digit character. .IP \fBalpha\fR 10 Any Unicode alphabet character. .IP \fBascii\fR 10 -Any character with a value less than \\u0080 (those that -are in the 7\-bit ascii range). +Any character with a value less than \\u0080 (those that are in the +7\-bit ascii range). .IP \fBboolean\fR 10 Any of the forms allowed to \fBTcl_GetBoolean\fR. .IP \fBcontrol\fR 10 @@ -124,16 +111,17 @@ Any Unicode digit character. Note that this includes characters outside of the [0\-9] range. .IP \fBdouble\fR 10 Any of the valid forms for a double in Tcl, with optional surrounding -whitespace. In case of under/overflow in the value, 0 is returned -and the \fIvarname\fR will contain \-1. +whitespace. In case of under/overflow in the value, 0 is returned and +the \fIvarname\fR will contain \-1. .IP \fBfalse\fR 10 -Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is false. +Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is +false. .IP \fBgraph\fR 10 Any Unicode printing character, except space. .IP \fBinteger\fR 10 -Any of the valid forms for an integer in Tcl, with optional surrounding -whitespace. In case of under/overflow in the value, 0 is returned -and the \fIvarname\fR will contain \-1. +Any of the valid forms for an integer in Tcl, with optional +surrounding whitespace. In case of under/overflow in the value, 0 is +returned and the \fIvarname\fR will contain \-1. .IP \fBlower\fR 10 Any Unicode lower case alphabet character. .IP \fBprint\fR 10 @@ -143,30 +131,29 @@ Any Unicode punctuation character. .IP \fBspace\fR 10 Any Unicode space character. .IP \fBtrue\fR 10 -Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is true. +Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is +true. .IP \fBupper\fR 10 Any upper case alphabet character in the Unicode character set. .IP \fBwordchar\fR 10 -Any Unicode word character. That is any alphanumeric character, -and any Unicode connector punctuation characters (e.g. underscore). +Any Unicode word character. That is any alphanumeric character, and +any Unicode connector punctuation characters (e.g. underscore). .IP \fBxdigit\fR 10 Any hexadecimal digit character ([0\-9A\-Fa\-f]). .PP In the case of \fBboolean\fR, \fBtrue\fR and \fBfalse\fR, if the -function will return 0, then the \fIvarname\fR will always be set to 0, -due to the varied nature of a valid boolean value. +function will return 0, then the \fIvarname\fR will always be set to +0, due to the varied nature of a valid boolean value. .RE .TP -\fBstring last \fIstring1 string2\fR ?\fIstartIndex\fR? -.VE 8.1 +\fBstring last \fIstring1 string2\fR ?\fIlastIndex\fR? Search \fIstring2\fR for a sequence of characters that exactly match the characters in \fIstring1\fR. If found, return the index of the first character in the last such match within \fIstring2\fR. If there -is no match, then return \-1. -.VS 8.1 -If \fIstartIndex\fR is specified (in any of the forms accepted by the -\fBindex\fR method), then only the characters in \fIstring2\fR at or before the -specified \fIstartIndex\fR will be considered by the search. For example, +is no match, then return \-1. If \fIlastIndex\fR is specified (in any +of the forms accepted by the \fBindex\fR method), then only the +characters in \fIstring2\fR at or before the specified \fIlastIndex\fR +will be considered by the search. For example, .RS .CS \fBstring last a 0a23456789abcdef 15\fR @@ -177,25 +164,25 @@ will return \fB10\fR, but .CE will return \fB1\fR. .RE -.VE 8.1 .TP \fBstring length \fIstring\fR Returns a decimal string giving the number of characters in \fIstring\fR. Note that this is not necessarily the same as the -number of bytes used to store the string. -.VS 8.1 +number of bytes used to store the string. If the object is a +ByteArray object (such as those returned from reading a binary encoded +channel), then this will return the actual byte length of the object. .TP \fBstring map\fR ?\fB\-nocase\fR? \fIcharMap string\fR Replaces characters in \fIstring\fR based on the key-value pairs in -\fIcharMap\fR. \fIcharMap\fR is a list of \fIkey value key value\fR ... +\fIcharMap\fR. \fIcharMap\fR is a list of \fIkey value key value ...\fR as in the form returned by \fBarray get\fR. Each instance of a key in the string will be replaced with its corresponding value. If \fB\-nocase\fR is specified, then matching is done without regard to case differences. Both \fIkey\fR and \fIvalue\fR may be multiple -characters. Replacement is done in an ordered manner, so the key appearing -first in the list will be checked first, and so on. \fIstring\fR is -only iterated over once, so earlier key replacements will have no -affect for later key matches. For example, +characters. Replacement is done in an ordered manner, so the key +appearing first in the list will be checked first, and so on. +\fIstring\fR is only iterated over once, so earlier key replacements +will have no affect for later key matches. For example, .RS .CS \fBstring map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc\fR @@ -204,53 +191,42 @@ will return the string \fB01321221\fR. .RE .TP \fBstring match\fR ?\fB\-nocase\fR? \fIpattern\fR \fIstring\fR -.VE 8.1 -See if \fIpattern\fR matches \fIstring\fR; return 1 if it does, 0 -if it doesn't. -.VS 8.1 -If \fB\-nocase\fR is specified, then the pattern attempts to match -against the string in a case insensitive manner. -.VE 8.1 -For the two strings to match, their contents -must be identical except that the following special sequences -may appear in \fIpattern\fR: +See if \fIpattern\fR matches \fIstring\fR; return 1 if it does, 0 if +it doesn't. If \fB\-nocase\fR is specified, then the pattern attempts +to match against the string in a case insensitive manner. For the two +strings to match, their contents must be identical except that the +following special sequences may appear in \fIpattern\fR: .RS .IP \fB*\fR 10 -Matches any sequence of characters in \fIstring\fR, -including a null string. +Matches any sequence of characters in \fIstring\fR, including a null +string. .IP \fB?\fR 10 Matches any single character in \fIstring\fR. .IP \fB[\fIchars\fB]\fR 10 Matches any character in the set given by \fIchars\fR. If a sequence -of the form -\fIx\fB\-\fIy\fR appears in \fIchars\fR, then any character -between \fIx\fR and \fIy\fR, inclusive, will match. -.VS 8.1 -When used with \fB\-nocase\fR, the end points of the range are converted -to lower case first. Whereas {[A\-z]} matches '_' when matching -case-sensitively ('_' falls between the 'Z' and 'a'), with \fB\-nocase\fR -this is considered like {[A\-Za\-z]} (and probably what was meant in the -first place). -.VE 8.1 +of the form \fIx\fB\-\fIy\fR appears in \fIchars\fR, then any +character between \fIx\fR and \fIy\fR, inclusive, will match. When +used with \fB\-nocase\fR, the end points of the range are converted to +lower case first. Whereas {[A\-z]} matches '_' when matching +case-sensitively ('_' falls between the 'Z' and 'a'), with +\fB\-nocase\fR this is considered like {[A\-Za\-z]} (and probably what +was meant in the first place). .IP \fB\e\fIx\fR 10 -Matches the single character \fIx\fR. This provides a way of -avoiding the special interpretation of the characters -\fB*?[]\e\fR in \fIpattern\fR. +Matches the single character \fIx\fR. This provides a way of avoiding +the special interpretation of the characters \fB*?[]\e\fR in +\fIpattern\fR. .RE .TP \fBstring range \fIstring first last\fR Returns a range of consecutive characters from \fIstring\fR, starting with the character whose index is \fIfirst\fR and ending with the -character whose index is \fIlast\fR. An index of 0 refers to the -.VS 8.1 -first character of the string. \fIfirst\fR and \fIlast\fR may be -specified as for the \fBindex\fR method. -.VE 8.1 -If \fIfirst\fR is less than zero then it is treated as if it were zero, and -if \fIlast\fR is greater than or equal to the length of the string then -it is treated as if it were \fBend\fR. If \fIfirst\fR is greater than -\fIlast\fR then an empty string is returned. -.VS 8.1 +character whose index is \fIlast\fR. An index of 0 refers to the first +character of the string. \fIfirst\fR and \fIlast\fR may be specified +as for the \fBindex\fR method. If \fIfirst\fR is less than zero then +it is treated as if it were zero, and if \fIlast\fR is greater than or +equal to the length of the string then it is treated as if it were +\fBend\fR. If \fIfirst\fR is greater than \fIlast\fR then an empty +string is returned. .TP \fBstring repeat \fIstring count\fR Returns \fIstring\fR repeated \fIcount\fR number of times. @@ -261,61 +237,56 @@ with the character whose index is \fIfirst\fR and ending with the character whose index is \fIlast\fR. An index of 0 refers to the first character of the string. \fIFirst\fR and \fIlast\fR may be specified as for the \fBindex\fR method. If \fInewstring\fR is -specified, then it is placed in the removed character range. -If \fIfirst\fR is less than zero then it is treated as if it were zero, and -if \fIlast\fR is greater than or equal to the length of the string then -it is treated as if it were \fBend\fR. If \fIfirst\fR is greater than -\fIlast\fR or the length of the initial string, or \fIlast\fR is less -than 0, then the initial string is returned untouched. +specified, then it is placed in the removed character range. If +\fIfirst\fR is less than zero then it is treated as if it were zero, +and if \fIlast\fR is greater than or equal to the length of the string +then it is treated as if it were \fBend\fR. If \fIfirst\fR is greater +than \fIlast\fR or the length of the initial string, or \fIlast\fR is +less than 0, then the initial string is returned untouched. .TP \fBstring tolower \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR? -Returns a value equal to \fIstring\fR except that all upper (or title) case -letters have been converted to lower case. If \fIfirst\fR is specified, it -refers to the first char index in the string to start modifying. If -\fIlast\fR is specified, it refers to the char index in the string to stop -at (inclusive). \fIfirst\fR and \fIlast\fR may be +Returns a value equal to \fIstring\fR except that all upper (or title) +case letters have been converted to lower case. If \fIfirst\fR is +specified, it refers to the first char index in the string to start +modifying. If \fIlast\fR is specified, it refers to the char index in +the string to stop at (inclusive). \fIfirst\fR and \fIlast\fR may be specified as for the \fBindex\fR method. .TP \fBstring totitle \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR? Returns a value equal to \fIstring\fR except that the first character -in \fIstring\fR is converted to its Unicode title case variant (or upper -case if there is no title case variant) and the rest of the string is -converted to lower case. If \fIfirst\fR is specified, it +in \fIstring\fR is converted to its Unicode title case variant (or +upper case if there is no title case variant) and the rest of the +string is converted to lower case. If \fIfirst\fR is specified, it refers to the first char index in the string to start modifying. If -\fIlast\fR is specified, it refers to the char index in the string to stop -at (inclusive). \fIfirst\fR and \fIlast\fR may be -specified as for the \fBindex\fR method. +\fIlast\fR is specified, it refers to the char index in the string to +stop at (inclusive). \fIfirst\fR and \fIlast\fR may be specified as +for the \fBindex\fR method. .TP \fBstring toupper \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR? -Returns a value equal to \fIstring\fR except that all lower (or title) case -letters have been converted to upper case. If \fIfirst\fR is specified, it -refers to the first char index in the string to start modifying. If -\fIlast\fR is specified, it refers to the char index in the string to stop -at (inclusive). \fIfirst\fR and \fIlast\fR may be specified as for the -\fBindex\fR method. -.VE 8.1 +Returns a value equal to \fIstring\fR except that all lower (or title) +case letters have been converted to upper case. If \fIfirst\fR is +specified, it refers to the first char index in the string to start +modifying. If \fIlast\fR is specified, it refers to the char index in +the string to stop at (inclusive). \fIfirst\fR and \fIlast\fR may be +specified as for the \fBindex\fR method. .TP \fBstring trim \fIstring\fR ?\fIchars\fR? -Returns a value equal to \fIstring\fR except that any leading -or trailing characters from the set given by \fIchars\fR are -removed. -If \fIchars\fR is not specified then white space is removed -(spaces, tabs, newlines, and carriage returns). +Returns a value equal to \fIstring\fR except that any leading or +trailing characters from the set given by \fIchars\fR are removed. If +\fIchars\fR is not specified then white space is removed (spaces, +tabs, newlines, and carriage returns). .TP \fBstring trimleft \fIstring\fR ?\fIchars\fR? -Returns a value equal to \fIstring\fR except that any -leading characters from the set given by \fIchars\fR are -removed. -If \fIchars\fR is not specified then white space is removed -(spaces, tabs, newlines, and carriage returns). +Returns a value equal to \fIstring\fR except that any leading +characters from the set given by \fIchars\fR are removed. If +\fIchars\fR is not specified then white space is removed (spaces, +tabs, newlines, and carriage returns). .TP \fBstring trimright \fIstring\fR ?\fIchars\fR? -Returns a value equal to \fIstring\fR except that any -trailing characters from the set given by \fIchars\fR are -removed. -If \fIchars\fR is not specified then white space is removed -(spaces, tabs, newlines, and carriage returns). -.VS 8.1 +Returns a value equal to \fIstring\fR except that any trailing +characters from the set given by \fIchars\fR are removed. If +\fIchars\fR is not specified then white space is removed (spaces, +tabs, newlines, and carriage returns). .TP \fBstring wordend \fIstring charIndex\fR Returns the index of the character just after the last one in the word @@ -332,7 +303,9 @@ specified as for the \fBindex\fR method. A word is considered to be any contiguous range of alphanumeric (Unicode letters or decimal digits) or underscore (Unicode connector punctuation) characters, or any single character other than these. -.VE 8.1 + +.SH "SEE ALSO" +expr(n), list(n) .SH KEYWORDS case conversion, compare, index, match, pattern, string, word, equal, ctype |