summaryrefslogtreecommitdiff
path: root/tcl/doc/string.n
diff options
context:
space:
mode:
Diffstat (limited to 'tcl/doc/string.n')
-rw-r--r--tcl/doc/string.n267
1 files changed, 120 insertions, 147 deletions
diff --git a/tcl/doc/string.n b/tcl/doc/string.n
index 2aaec95474b..bc6b9d99aca 100644
--- a/tcl/doc/string.n
+++ b/tcl/doc/string.n
@@ -21,7 +21,6 @@ string \- Manipulate strings
.PP
Performs one of several string operations, depending on \fIoption\fR.
The legal \fIoption\fRs (which may be abbreviated) are:
-.VS 8.1
.TP
\fBstring bytelength \fIstring\fR
Returns a decimal string giving the number of bytes used to represent
@@ -29,40 +28,35 @@ Returns a decimal string giving the number of bytes used to represent
represent Unicode characters, the byte length will not be the same as
the character length in general. The cases where a script cares about
the byte length are rare. In almost all cases, you should use the
-\fBstring length\fR operation. Refer to the \fBTcl_NumUtfChars\fR
-manual entry for more details on the UTF\-8 representation.
+\fBstring length\fR operation (including determining the length of a
+Tcl ByteArray object). Refer to the \fBTcl_NumUtfChars\fR manual
+entry for more details on the UTF\-8 representation.
.TP
\fBstring compare\fR ?\fB\-nocase\fR? ?\fB\-length int\fR? \fIstring1 string2\fR
-.VE 8.1
-Perform a character-by-character comparison of strings \fIstring1\fR and
-\fIstring2\fR. Returns
-\-1, 0, or 1, depending on whether \fIstring1\fR is lexicographically
-less than, equal to, or greater than \fIstring2\fR.
-.VS 8.1
-If \fB\-length\fR is specified, then only the first \fIlength\fR characters
-are used in the comparison. If \fB\-length\fR is negative, it is
-ignored. If \fB\-nocase\fR is specified, then the strings are
-compared in a case-insensitive manner.
+Perform a character-by-character comparison of strings \fIstring1\fR
+and \fIstring2\fR. Returns \-1, 0, or 1, depending on whether
+\fIstring1\fR is lexicographically less than, equal to, or greater
+than \fIstring2\fR. If \fB\-length\fR is specified, then only the
+first \fIlength\fR characters are used in the comparison. If
+\fB\-length\fR is negative, it is ignored. If \fB\-nocase\fR is
+specified, then the strings are compared in a case-insensitive manner.
.TP
\fBstring equal\fR ?\fB\-nocase\fR? ?\fB-length int\fR? \fIstring1 string2\fR
-Perform a character-by-character comparison of strings
-\fIstring1\fR and \fIstring2\fR. Returns 1 if \fIstring1\fR and
-\fIstring2\fR are identical, or 0 when not. If \fB\-length\fR is
-specified, then only the first \fIlength\fR characters are used in the
-comparison. If \fB\-length\fR is negative, it is ignored. If
-\fB\-nocase\fR is specified, then the strings are compared in a
-case-insensitive manner.
+Perform a character-by-character comparison of strings \fIstring1\fR
+and \fIstring2\fR. Returns 1 if \fIstring1\fR and \fIstring2\fR are
+identical, or 0 when not. If \fB\-length\fR is specified, then only
+the first \fIlength\fR characters are used in the comparison. If
+\fB\-length\fR is negative, it is ignored. If \fB\-nocase\fR is
+specified, then the strings are compared in a case-insensitive manner.
.TP
\fBstring first \fIstring1 string2\fR ?\fIstartIndex\fR?
-.VE 8.1
Search \fIstring2\fR for a sequence of characters that exactly match
the characters in \fIstring1\fR. If found, return the index of the
first character in the first such match within \fIstring2\fR. If not
-found, return \-1.
-.VS 8.1
-If \fIstartIndex\fR is specified (in any of the forms accepted by the
-\fBindex\fR method), then the search is constrained to start with the
-character in \fIstring2\fR specified by the index. For example,
+found, return \-1. If \fIstartIndex\fR is specified (in any of the
+forms accepted by the \fBindex\fR method), then the search is
+constrained to start with the character in \fIstring2\fR specified by
+the index. For example,
.RS
.CS
\fBstring first a 0a23456789abcdef 5\fR
@@ -73,29 +67,22 @@ will return \fB10\fR, but
.CE
will return \fB\-1\fR.
.RE
-.VE 8.1
.TP
\fBstring index \fIstring charIndex\fR
-Returns the \fIcharIndex\fR'th character of the \fIstring\fR
-argument. A \fIcharIndex\fR of 0 corresponds to the first
-character of the string.
-.VS 8.1
-\fIcharIndex\fR may be specified as
-follows:
+Returns the \fIcharIndex\fR'th character of the \fIstring\fR argument.
+A \fIcharIndex\fR of 0 corresponds to the first character of the
+string. \fIcharIndex\fR may be specified as follows:
.RS
.IP \fIinteger\fR 10
-The char specified at this integral index
+The char specified at this integral index.
.IP \fBend\fR 10
The last char of the string.
.IP \fBend\-\fIinteger\fR 10
-The last char of the string minus the specified integer
-offset (e.g. \fBend\-1\fR would refer to the "c" in "abcd").
+The last char of the string minus the specified integer offset
+(e.g. \fBend\-1\fR would refer to the "c" in "abcd").
.PP
-.VE 8.1
-If \fIcharIndex\fR is less than 0 or greater than
-or equal to the length of the string then an empty string is
-returned.
-.VS 8.1
+If \fIcharIndex\fR is less than 0 or greater than or equal to the
+length of the string then an empty string is returned.
.RE
.TP
\fBstring is \fIclass\fR ?\fB\-strict\fR? ?\fB\-failindex \fIvarname\fR? \fIstring\fR
@@ -105,16 +92,16 @@ empty string returns 0, otherwise and empty string will return 1 on
any class. If \fB\-failindex\fR is specified, then if the function
returns 0, the index in the string where the class was no longer valid
will be stored in the variable named \fIvarname\fR. The \fIvarname\fR
-will not be set if the function returns 1. The following character classes
-are recognized (the class name can be abbreviated):
+will not be set if the function returns 1. The following character
+classes are recognized (the class name can be abbreviated):
.RS
.IP \fBalnum\fR 10
Any Unicode alphabet or digit character.
.IP \fBalpha\fR 10
Any Unicode alphabet character.
.IP \fBascii\fR 10
-Any character with a value less than \\u0080 (those that
-are in the 7\-bit ascii range).
+Any character with a value less than \\u0080 (those that are in the
+7\-bit ascii range).
.IP \fBboolean\fR 10
Any of the forms allowed to \fBTcl_GetBoolean\fR.
.IP \fBcontrol\fR 10
@@ -124,16 +111,17 @@ Any Unicode digit character. Note that this includes characters
outside of the [0\-9] range.
.IP \fBdouble\fR 10
Any of the valid forms for a double in Tcl, with optional surrounding
-whitespace. In case of under/overflow in the value, 0 is returned
-and the \fIvarname\fR will contain \-1.
+whitespace. In case of under/overflow in the value, 0 is returned and
+the \fIvarname\fR will contain \-1.
.IP \fBfalse\fR 10
-Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is false.
+Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is
+false.
.IP \fBgraph\fR 10
Any Unicode printing character, except space.
.IP \fBinteger\fR 10
-Any of the valid forms for an integer in Tcl, with optional surrounding
-whitespace. In case of under/overflow in the value, 0 is returned
-and the \fIvarname\fR will contain \-1.
+Any of the valid forms for an integer in Tcl, with optional
+surrounding whitespace. In case of under/overflow in the value, 0 is
+returned and the \fIvarname\fR will contain \-1.
.IP \fBlower\fR 10
Any Unicode lower case alphabet character.
.IP \fBprint\fR 10
@@ -143,30 +131,29 @@ Any Unicode punctuation character.
.IP \fBspace\fR 10
Any Unicode space character.
.IP \fBtrue\fR 10
-Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is true.
+Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is
+true.
.IP \fBupper\fR 10
Any upper case alphabet character in the Unicode character set.
.IP \fBwordchar\fR 10
-Any Unicode word character. That is any alphanumeric character,
-and any Unicode connector punctuation characters (e.g. underscore).
+Any Unicode word character. That is any alphanumeric character, and
+any Unicode connector punctuation characters (e.g. underscore).
.IP \fBxdigit\fR 10
Any hexadecimal digit character ([0\-9A\-Fa\-f]).
.PP
In the case of \fBboolean\fR, \fBtrue\fR and \fBfalse\fR, if the
-function will return 0, then the \fIvarname\fR will always be set to 0,
-due to the varied nature of a valid boolean value.
+function will return 0, then the \fIvarname\fR will always be set to
+0, due to the varied nature of a valid boolean value.
.RE
.TP
-\fBstring last \fIstring1 string2\fR ?\fIstartIndex\fR?
-.VE 8.1
+\fBstring last \fIstring1 string2\fR ?\fIlastIndex\fR?
Search \fIstring2\fR for a sequence of characters that exactly match
the characters in \fIstring1\fR. If found, return the index of the
first character in the last such match within \fIstring2\fR. If there
-is no match, then return \-1.
-.VS 8.1
-If \fIstartIndex\fR is specified (in any of the forms accepted by the
-\fBindex\fR method), then only the characters in \fIstring2\fR at or before the
-specified \fIstartIndex\fR will be considered by the search. For example,
+is no match, then return \-1. If \fIlastIndex\fR is specified (in any
+of the forms accepted by the \fBindex\fR method), then only the
+characters in \fIstring2\fR at or before the specified \fIlastIndex\fR
+will be considered by the search. For example,
.RS
.CS
\fBstring last a 0a23456789abcdef 15\fR
@@ -177,25 +164,25 @@ will return \fB10\fR, but
.CE
will return \fB1\fR.
.RE
-.VE 8.1
.TP
\fBstring length \fIstring\fR
Returns a decimal string giving the number of characters in
\fIstring\fR. Note that this is not necessarily the same as the
-number of bytes used to store the string.
-.VS 8.1
+number of bytes used to store the string. If the object is a
+ByteArray object (such as those returned from reading a binary encoded
+channel), then this will return the actual byte length of the object.
.TP
\fBstring map\fR ?\fB\-nocase\fR? \fIcharMap string\fR
Replaces characters in \fIstring\fR based on the key-value pairs in
-\fIcharMap\fR. \fIcharMap\fR is a list of \fIkey value key value\fR ...
+\fIcharMap\fR. \fIcharMap\fR is a list of \fIkey value key value ...\fR
as in the form returned by \fBarray get\fR. Each instance of a
key in the string will be replaced with its corresponding value. If
\fB\-nocase\fR is specified, then matching is done without regard to
case differences. Both \fIkey\fR and \fIvalue\fR may be multiple
-characters. Replacement is done in an ordered manner, so the key appearing
-first in the list will be checked first, and so on. \fIstring\fR is
-only iterated over once, so earlier key replacements will have no
-affect for later key matches. For example,
+characters. Replacement is done in an ordered manner, so the key
+appearing first in the list will be checked first, and so on.
+\fIstring\fR is only iterated over once, so earlier key replacements
+will have no affect for later key matches. For example,
.RS
.CS
\fBstring map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc\fR
@@ -204,53 +191,42 @@ will return the string \fB01321221\fR.
.RE
.TP
\fBstring match\fR ?\fB\-nocase\fR? \fIpattern\fR \fIstring\fR
-.VE 8.1
-See if \fIpattern\fR matches \fIstring\fR; return 1 if it does, 0
-if it doesn't.
-.VS 8.1
-If \fB\-nocase\fR is specified, then the pattern attempts to match
-against the string in a case insensitive manner.
-.VE 8.1
-For the two strings to match, their contents
-must be identical except that the following special sequences
-may appear in \fIpattern\fR:
+See if \fIpattern\fR matches \fIstring\fR; return 1 if it does, 0 if
+it doesn't. If \fB\-nocase\fR is specified, then the pattern attempts
+to match against the string in a case insensitive manner. For the two
+strings to match, their contents must be identical except that the
+following special sequences may appear in \fIpattern\fR:
.RS
.IP \fB*\fR 10
-Matches any sequence of characters in \fIstring\fR,
-including a null string.
+Matches any sequence of characters in \fIstring\fR, including a null
+string.
.IP \fB?\fR 10
Matches any single character in \fIstring\fR.
.IP \fB[\fIchars\fB]\fR 10
Matches any character in the set given by \fIchars\fR. If a sequence
-of the form
-\fIx\fB\-\fIy\fR appears in \fIchars\fR, then any character
-between \fIx\fR and \fIy\fR, inclusive, will match.
-.VS 8.1
-When used with \fB\-nocase\fR, the end points of the range are converted
-to lower case first. Whereas {[A\-z]} matches '_' when matching
-case-sensitively ('_' falls between the 'Z' and 'a'), with \fB\-nocase\fR
-this is considered like {[A\-Za\-z]} (and probably what was meant in the
-first place).
-.VE 8.1
+of the form \fIx\fB\-\fIy\fR appears in \fIchars\fR, then any
+character between \fIx\fR and \fIy\fR, inclusive, will match. When
+used with \fB\-nocase\fR, the end points of the range are converted to
+lower case first. Whereas {[A\-z]} matches '_' when matching
+case-sensitively ('_' falls between the 'Z' and 'a'), with
+\fB\-nocase\fR this is considered like {[A\-Za\-z]} (and probably what
+was meant in the first place).
.IP \fB\e\fIx\fR 10
-Matches the single character \fIx\fR. This provides a way of
-avoiding the special interpretation of the characters
-\fB*?[]\e\fR in \fIpattern\fR.
+Matches the single character \fIx\fR. This provides a way of avoiding
+the special interpretation of the characters \fB*?[]\e\fR in
+\fIpattern\fR.
.RE
.TP
\fBstring range \fIstring first last\fR
Returns a range of consecutive characters from \fIstring\fR, starting
with the character whose index is \fIfirst\fR and ending with the
-character whose index is \fIlast\fR. An index of 0 refers to the
-.VS 8.1
-first character of the string. \fIfirst\fR and \fIlast\fR may be
-specified as for the \fBindex\fR method.
-.VE 8.1
-If \fIfirst\fR is less than zero then it is treated as if it were zero, and
-if \fIlast\fR is greater than or equal to the length of the string then
-it is treated as if it were \fBend\fR. If \fIfirst\fR is greater than
-\fIlast\fR then an empty string is returned.
-.VS 8.1
+character whose index is \fIlast\fR. An index of 0 refers to the first
+character of the string. \fIfirst\fR and \fIlast\fR may be specified
+as for the \fBindex\fR method. If \fIfirst\fR is less than zero then
+it is treated as if it were zero, and if \fIlast\fR is greater than or
+equal to the length of the string then it is treated as if it were
+\fBend\fR. If \fIfirst\fR is greater than \fIlast\fR then an empty
+string is returned.
.TP
\fBstring repeat \fIstring count\fR
Returns \fIstring\fR repeated \fIcount\fR number of times.
@@ -261,61 +237,56 @@ with the character whose index is \fIfirst\fR and ending with the
character whose index is \fIlast\fR. An index of 0 refers to the
first character of the string. \fIFirst\fR and \fIlast\fR may be
specified as for the \fBindex\fR method. If \fInewstring\fR is
-specified, then it is placed in the removed character range.
-If \fIfirst\fR is less than zero then it is treated as if it were zero, and
-if \fIlast\fR is greater than or equal to the length of the string then
-it is treated as if it were \fBend\fR. If \fIfirst\fR is greater than
-\fIlast\fR or the length of the initial string, or \fIlast\fR is less
-than 0, then the initial string is returned untouched.
+specified, then it is placed in the removed character range. If
+\fIfirst\fR is less than zero then it is treated as if it were zero,
+and if \fIlast\fR is greater than or equal to the length of the string
+then it is treated as if it were \fBend\fR. If \fIfirst\fR is greater
+than \fIlast\fR or the length of the initial string, or \fIlast\fR is
+less than 0, then the initial string is returned untouched.
.TP
\fBstring tolower \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR?
-Returns a value equal to \fIstring\fR except that all upper (or title) case
-letters have been converted to lower case. If \fIfirst\fR is specified, it
-refers to the first char index in the string to start modifying. If
-\fIlast\fR is specified, it refers to the char index in the string to stop
-at (inclusive). \fIfirst\fR and \fIlast\fR may be
+Returns a value equal to \fIstring\fR except that all upper (or title)
+case letters have been converted to lower case. If \fIfirst\fR is
+specified, it refers to the first char index in the string to start
+modifying. If \fIlast\fR is specified, it refers to the char index in
+the string to stop at (inclusive). \fIfirst\fR and \fIlast\fR may be
specified as for the \fBindex\fR method.
.TP
\fBstring totitle \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR?
Returns a value equal to \fIstring\fR except that the first character
-in \fIstring\fR is converted to its Unicode title case variant (or upper
-case if there is no title case variant) and the rest of the string is
-converted to lower case. If \fIfirst\fR is specified, it
+in \fIstring\fR is converted to its Unicode title case variant (or
+upper case if there is no title case variant) and the rest of the
+string is converted to lower case. If \fIfirst\fR is specified, it
refers to the first char index in the string to start modifying. If
-\fIlast\fR is specified, it refers to the char index in the string to stop
-at (inclusive). \fIfirst\fR and \fIlast\fR may be
-specified as for the \fBindex\fR method.
+\fIlast\fR is specified, it refers to the char index in the string to
+stop at (inclusive). \fIfirst\fR and \fIlast\fR may be specified as
+for the \fBindex\fR method.
.TP
\fBstring toupper \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR?
-Returns a value equal to \fIstring\fR except that all lower (or title) case
-letters have been converted to upper case. If \fIfirst\fR is specified, it
-refers to the first char index in the string to start modifying. If
-\fIlast\fR is specified, it refers to the char index in the string to stop
-at (inclusive). \fIfirst\fR and \fIlast\fR may be specified as for the
-\fBindex\fR method.
-.VE 8.1
+Returns a value equal to \fIstring\fR except that all lower (or title)
+case letters have been converted to upper case. If \fIfirst\fR is
+specified, it refers to the first char index in the string to start
+modifying. If \fIlast\fR is specified, it refers to the char index in
+the string to stop at (inclusive). \fIfirst\fR and \fIlast\fR may be
+specified as for the \fBindex\fR method.
.TP
\fBstring trim \fIstring\fR ?\fIchars\fR?
-Returns a value equal to \fIstring\fR except that any leading
-or trailing characters from the set given by \fIchars\fR are
-removed.
-If \fIchars\fR is not specified then white space is removed
-(spaces, tabs, newlines, and carriage returns).
+Returns a value equal to \fIstring\fR except that any leading or
+trailing characters from the set given by \fIchars\fR are removed. If
+\fIchars\fR is not specified then white space is removed (spaces,
+tabs, newlines, and carriage returns).
.TP
\fBstring trimleft \fIstring\fR ?\fIchars\fR?
-Returns a value equal to \fIstring\fR except that any
-leading characters from the set given by \fIchars\fR are
-removed.
-If \fIchars\fR is not specified then white space is removed
-(spaces, tabs, newlines, and carriage returns).
+Returns a value equal to \fIstring\fR except that any leading
+characters from the set given by \fIchars\fR are removed. If
+\fIchars\fR is not specified then white space is removed (spaces,
+tabs, newlines, and carriage returns).
.TP
\fBstring trimright \fIstring\fR ?\fIchars\fR?
-Returns a value equal to \fIstring\fR except that any
-trailing characters from the set given by \fIchars\fR are
-removed.
-If \fIchars\fR is not specified then white space is removed
-(spaces, tabs, newlines, and carriage returns).
-.VS 8.1
+Returns a value equal to \fIstring\fR except that any trailing
+characters from the set given by \fIchars\fR are removed. If
+\fIchars\fR is not specified then white space is removed (spaces,
+tabs, newlines, and carriage returns).
.TP
\fBstring wordend \fIstring charIndex\fR
Returns the index of the character just after the last one in the word
@@ -332,7 +303,9 @@ specified as for the \fBindex\fR method. A word is considered to be any
contiguous range of alphanumeric (Unicode letters or decimal digits)
or underscore (Unicode connector punctuation) characters, or any
single character other than these.
-.VE 8.1
+
+.SH "SEE ALSO"
+expr(n), list(n)
.SH KEYWORDS
case conversion, compare, index, match, pattern, string, word, equal, ctype