.. http://www.lassosoft.com/Language-Guide-String-Operations .. _strings: ******* Strings ******* Text in Lasso is stored and manipulated using the :type:`string` type or the ``string_…`` methods. This chapter details the operators and methods that can manipulate string values. .. tip:: The :type:`string` type is often used in conjunction with the :type:`bytes` type to convert binary data between different character encodings, such as UTF-8 and ISO-8859-1. See the :ref:`byte-streams` chapter for more information about the :type:`bytes` type. String Objects ============== Text processing is a central function of Lasso. Many Lasso methods are dedicated to outputting and manipulating text. Lasso is used to format text-based HTML pages or XML data for output. Lasso is also used to process and manipulate text-based HTML form inputs and URLs. Because of this focus on text processing, the :type:`string` type is the primary type of data in Lasso. The result of all expressions are converted to strings before they are output into the HTML page or XML data being served. The following are operations that can be performed directly on strings: #. Operators can be used to perform string calculations:: 'The' + ' ' + 'String' // => The String #. String member methods can manipulate the current string value:: 'the string'->titlecase&; // => The String #. String member methods can return new strings based on the value of the current string:: 'The String'->sub(5, 6) // => String #. String member methods can test the attributes of strings:: 'The String'->contains('the') // => true Each of these methods is described in detail in the sections that follow. This chapter contains a description and examples of using operators and methods to manipulate strings. Unicode Characters ------------------ Lasso supports the processing of Unicode characters in all :type:`string` methods. The escape sequence ``\u…`` can be used with 4 hexadecimal digits (or ``\U…`` with 8 or ``\x…`` with 2) to specify a Unicode character in a string by its code point, e.g. ``\u002F`` represents a "/" character, ``\U00000020`` represents a space, and ``\x42`` represents a capital letter "B". These types of escape sequences can be used for any code point, e.g. ``\u4E26`` represents the Traditional Chinese character |4E26|. .. |4E26| unicode:: U+4E26 Lasso also supports common escape sequences including ``"\r"`` for a return character, ``"\n"`` for a newline character, ``"\r\n"`` for a Windows return/newline, ``"\f"`` for a form-feed character, ``"\t"`` for a tab, and ``"\v"`` for a vertical-tab. See the table :ref:`literals-string-escape` for the full list. Converting Values to Strings ============================ Expressions that produce a value will convert that value to the :type:`string` type automatically, or they can be explicitly converted using the `string` creator method as well as the ``asString`` member method every object has. .. method:: string(obj::any) .. method:: string(obj::bytes, enc::string= ?) Converts a value to type :type:`string`. Requires one value which is the data to be converted. An optional second parameter can be used when converting byte streams in order to specify which character set should be used to translate the byte stream to a string, defaulting to "UTF-8". Automatic String Conversion --------------------------- Integer and decimal values are converted to strings automatically if they are used as a parameter to a string operator. If either of the parameters to the operator is a string then the other parameter is converted to a string automatically. The following example shows how the integer ``123`` is automatically converted to a string because the other parameter of the ``+`` operator is the string ``'String'``:: 'String ' + 123 // => String 123 The following example shows how a variable that contains the integer ``123`` is automatically converted to a string for the expression:: local(number) = 123 'String ' + #number + '\n' + #number->type // => // String 123 // integer Array, map, and pair values are converted to strings automatically when they are output to a web page or included as part of an auto-collect block. The value they return is intended for the developer to be able to see the contents of the complex type and is not intended to be displayed to site visitors. :: array('One', 'Two', 'Three') // => array(One, Two, Three) map('Key1'="Value1", 'Key2'="Value2") // => map(Key1 = Value1, Key2 = Value2) pair('name'='value') // => (name = value) The parameters sent to the ``string_…`` methods are automatically converted to strings. The following example shows the result of calling `string_length` on an integer:: string_length(21) // => 2 Explicitly Convert a Value to a String Object --------------------------------------------- Integer and decimal values can be converted to string objects using the `string` creator method. The value of the new string is the same as the value of the integer or decimal value when it is output using the `~null->toString` method. The following example shows a math calculation where the integer result ``579``. The next line shows the same calculation with string parameters and the result of ``123456``. :: 123 + 456 // => 579 string(123) + string(456) // => 123456 Boolean values can also be converted to a string object using the `string` creator method. The value will always be either the string "true" or the string "false". The following example shows a conditional result converted to type :type:`string`:: string('dog' == 'cat') // => false String member methods can be used on any value by first converting that value to a string using either the `string` creator method or the ``asString`` member method every object has. The following example shows how to use the `string->size` member method on an integer by first converting it to a string object:: 21->asString->size // => 2 string(21)->size // => 2 Byte streams being converted to strings can include the character set to be used to export the data in the byte stream. By default byte streams are assumed to contain UTF-8 character data. The following example code would translate a byte stream contained in a variable named "myByteStream" using the ISO-8859-1 encoding to interpret the character data. This is analogous to using the `bytes->exportString` method which is described in more detail in the :ref:`byte-streams` chapter:: string(#myByteStream, 'ISO-8859-1') String Inspection Methods ========================= The :type:`string` type has many member methods that return information about the value of the string object, which are documented below. (Information about regular expressions and the :type:`regexp` type is found in the :ref:`regular-expressions` chapter.) .. type:: string .. member:: string->size() Returns the number of characters in the string. .. member:: string->length() .. deprecated:: 9.0 Use `string->size` instead. .. member: string->sub(pos::integer) .. member:: string->sub(position::integer, size::integer= ?) .. member: string->substring(start::integer) .. member:: string->substring(start::integer, size::integer= ?) Returns a portion of the string. The starting point is specified by the first parameter and the number of characters to return is specified by the second. If the second parameter is not specified, all characters from the specified starting position to the end of the string are returned. .. member:: string->charName(position::integer) .. member:: string->charType(position::integer) Returns the Unicode name or type for a character in the string. Requires a parameter specifying the position of the character to inspect. .. member: string->integer() .. member:: string->integer(position::integer= ?) Returns the Unicode integer value for a character in the string. Requires a parameter specifying the position of the character to inspect, defaulting to the first character. .. member:: string->digit(position::integer, base::integer) Returns the integer value of a character in the string. Requires a parameter specifying the position of the character to inspect and a parameter specifying the base or radix. If the specified character is a digit for the specified radix, it will return the integer value for that digit, otherwise it returns "-1". (Remember that when integers are converted to strings, they default to displaying in base 10.) The radix or base can be any value from "2" to "36". .. member:: string->charDigitValue(position::integer)::integer Returns the integer value of a character in the string. Requires a parameter specifying the position of the character to inspect. If the specified character is not a digit, it will return "-1". .. member:: string->getNumericValue(position::integer)::decimal Returns the decimal value of a character in the string. Requires a parameter specifying the position of the character to inspect. If the specified character is not a digit, it will return the decimal "-123456789.0". .. member: string->isAlnum() .. member:: string->isAlnum(position::integer= ?) Returns "true" if the character at the specified position is alphanumeric, defaulting to the first character. Otherwise it will return "false". .. member: string->isAlpha() .. member:: string->isAlpha(position::integer= ?) Returns "true" if the character at the specified position is alphabetic, defaulting to the first character. Otherwise it will return "false". .. member: string->isUAlphabetic() .. member:: string->isUAlphabetic(position::integer= ?) Returns "true" if the character at the specified position has the Unicode alphabetic property, defaulting to the first character. Otherwise it will return "false". .. member: string->isBase() .. member:: string->isBase(position::integer= ?) Returns "true" if the character at the specified position is a base Unicode character, defaulting to the first character. Otherwise it will return "false". .. member: string->isBlank() .. member:: string->isBlank(position::integer= ?) Returns "true" if the character at the specified position is a space or tab, defaulting to the first character. Otherwise it will return "false". .. member: string->isCntrl() .. member:: string->isCntrl(position::integer= ?) Returns "true" if the character at the specified position is a control character, defaulting to the first character. Otherwise it will return "false". .. member: string->isDigit() .. member:: string->isDigit(position::integer= ?) Returns "true" if the character at the specified position is a base 10 digit, defaulting to the first character. Otherwise it will return "false". .. member: string->isXDigit() .. member:: string->isXDigit(position::integer= ?) Returns "true" if the character at the specified position is a hexadecimal digit, defaulting to the first character. Otherwise it will return "false". .. member: string->isGraph() .. member:: string->isGraph(position::integer= ?) Returns "true" if the character at the specified position is printable and not whitespace, defaulting to the first character. Otherwise it will return "false". .. member: string->isLower() .. member:: string->isLower(position::integer= ?) Returns "true" if the character at the specified position is lowercase, defaulting to the first character. Otherwise it will return "false". .. member: string->isULowercase() .. member:: string->isULowercase(position::integer= ?) Returns "true" if the character at the specified position has the Unicode lowercase property, defaulting to the first character. Otherwise it will return "false". .. member: string->isPrint() .. member:: string->isPrint(position::integer= ?) Returns "true" if the character at the specified position is printable, defaulting to the first character. Otherwise it will return "false". .. member: string->isPunct() .. member:: string->isPunct(position::integer= ?) Returns "true" if the character at the specified position is punctuation, defaulting to the first character. Otherwise it will return "false". .. member: string->isSpace() .. member:: string->isSpace(position::integer= ?) Returns "true" if the character at the specified position is whitespace, defaulting to the first character. Otherwise it will return "false". .. member: string->isTitle() .. member:: string->isTitle(position::integer= ?) Returns "true" if the character at the specified position is in the Unicode category "Letter, Titlecase", defaulting to the first character. Otherwise it will return "false". .. member: string->isUpper() .. member:: string->isUpper(position::integer= ?) Returns "true" if the character at the specified position is uppercase, defaulting to the first character. Otherwise it will return "false". .. member: string->isUUppercase() .. member:: string->isUUppercase(position::integer= ?) Returns "true" if the character at the specified position has the Unicode uppercase property, defaulting to the first character. Otherwise it will return "false". .. member: string->isWhitespace() .. member:: string->isWhitespace(position::integer= ?) Returns "true" if the character at the specified position is whitespace, defaulting to the first character. Otherwise it will return "false". .. member: string->isUWhitespace() .. member:: string->isUWhitespace(position::integer= ?) Returns "true" if the character at the specified position has the Unicode whitespace property, defaulting to the first character. Otherwise it will return "false". .. member:: string->find(find::string, offset::integer, -case::boolean= ?) .. member:: string->find(find::string, offset::integer, length::integer) .. member:: string->find(find::string, offset::integer, length::integer, \ patOffset::integer, patLength::integer, case::boolean) .. member:: string->find(find::string, \ -offset::integer= ?, \ -length::integer= ?, \ -patOffset::integer= ?, \ -patLength::integer= ?, \ -case::boolean= ?) Searches the base string for the specified string pattern, returning the position where the pattern first begins in the base string or "0" if the pattern cannot be found. The comparison is not case-sensitive unless the ``-case`` parameter is passed. The ``-offset`` and ``-length`` parameters can specify a portion of the base string within which to look for the match, with the former specifying the position to begin the search and the latter specifying the number of characters to search. (If ``-length`` is not specified, the method will search to the end of the base string.) The ``-patOffset`` and ``-patLength`` parameters can specify that only a portion of the pattern should be used for matching; they behave similarly for the string pattern as the ``-offset`` and ``-length`` parameters do for the base string. .. member:: string->findLast(find::string, \ offset::integer= ?, \ -length::integer= ?, \ -patOffset::integer= ?, \ -patLength::integer= ?, \ -case::boolean= ?) Similar to `string->find` except that it returns the starting position of the *last* match found in the base string. .. member:: string->contains(find::string, -case::boolean= ?) .. member:: string->contains(find::regexp, -ignoreCase::boolean= ?) Returns "true" if the specified string pattern or regular expression matches within the base string. Otherwise it will return "false". By default, string matching is not case-sensitive unless an optional ``-case`` parameter is passed to the method, but regular expression matching is case-sensitive unless an optional ``-ignoreCase`` parameter is passed to the method. .. member:: string->get(position::integer) Returns the character at the specified position in the base string. .. member:: string->equals(find::string, case::boolean) .. member:: string->equals(find::string, -case::boolean= ?) Similar to the ``==`` equality operator. Returns "true" if the specified string pattern is equivalent to the base string. The comparison is not case-sensitive unless the ``-case`` parameter is passed. .. member:: string->compare(find::string, -case::boolean= ?) .. member:: string->compare(find::string, offset::integer, \ length::integer= ?, \ patOffset::integer= ?, \ patLength::integer= ?, \ -case::boolean= ?) Compares the specified string pattern to the base string and returns "0" if they are equal, "1" if the characters in the base string are bitwise greater than the parameter, and "-1" if the characters in the base string are bitwise less than the parameter. The comparison is not case-sensitive unless the ``-case`` parameter is passed. Optionally, the comparison can be made on smaller portions of the base string by passing the ``offset`` and ``length`` parameters, and smaller portions of the string pattern by passing the ``patOffset`` and ``patLength`` parameters. .. member:: string->beginsWith(find::string, case::boolean) .. member:: string->beginsWith(find::string, -case::boolean= ?) Returns "true" if the specified string pattern matches the beginning of the base string, otherwise it will return "false". The comparison is not case-sensitive unless the ``-case`` parameter is passed. .. member:: string->endsWith(find::string, case::boolean) .. member:: string->endsWith(find::string, -case::boolean= ?) Returns "true" if the specified string pattern matches the end of the base string, otherwise it will return "false". The comparison is not case-sensitive unless the ``-case`` parameter is passed. .. member:: string->getPropertyValue(position::integer, property::integer)::integer Returns the Unicode property value for the character at the position specified in the first parameter and the Unicode property specified in the second parameter. Unicode properties are defined in the `Unicode Character Database`_ (UCD) and `Unicode Technical Reports`_ (UTR). Lasso defines many methods that return values for these Unicode property names, corresponding to this `list of properties`_ in the ICU sources. All of these methods have the ``UCHAR_`` prefix, e.g. `UCHAR_UPPERCASE`. .. member:: string->hasBinaryProperty(position::integer, property::integer)::boolean Returns "true" if the character at the position specified in the first parameter has the Unicode property specified in the second parameter, otherwise it returns "false". Find the Size of a String ------------------------- The following example returns the number of characters in a string:: 'Ralph is a red rhinoceros'->size // => 25 Check for Lowercase Characters ------------------------------ The following example inspects each character in a string and counts the number of lowercase letters it contains:: local(num_lcase) = 0 local(my_string) = 'Ralph is a red rhinoceros' loop(#my_string->size) => { #my_string->isLower(loop_count) ? #num_lcase++ } #num_lcase // => 20 Check the Beginning of a String ------------------------------- The following example checks to see if a string begins with "https:". If so, it displays "secure", otherwise it displays "insecure":: local(url) = 'https://secure.example.com' #url->beginsWith('https:') ? 'secure' | 'insecure' // => secure Find a Substring ---------------- This example uses the `string->find` method to find and output each position in a string where there is an apostrophe:: local(my_string) = "Don't, it's not worth it!" local(position) = 0 while(#position < #my_string->size) => {^ #position = #my_string->find(`'`, #position + 1) if(0 == #position) => { loop_abort } #position + '\n' ^} // => // 4 // 10 Extract a Substring ------------------- The following example pulls the substring "red" out of the base string:: local(my_string) = 'Ralph is a red rhinoceros' #my_string->sub(12, 3) // => red Extract a Specified Character Position -------------------------------------- The following example uses `string->get` to return the last character in a string:: local(my_string) = 'Ralph is a red rhinoceros' #my_string->get(#my_string->size) // => s String Manipulation Methods =========================== The :type:`string` type includes many member methods that can modify or manipulate a string object in-place, which are documented below. These methods do not return a value, and instead modify the value of the string object. .. member:: string->append(s::string) .. member:: string->append(obj::any) Concatenates a single parameter to the end of the base string, after converting it to a string if necessary. It modifies the string object in-place, not returning any value. .. member:: string->appendChar(i::integer) Concatenates a single character to the end of the base string, specified by its Unicode integer value in base 10. It modifies the string object in-place, not returning any value. .. member: string->remove() .. member: string->remove(position::integer= ?) .. member: string->remove(position::integer, num::integer) .. member:: string->remove(position::integer= ?, num::integer= ?) Removes one or more characters from the base string starting at the specified position, defaulting to the first character. A second parameter can specify the number of characters to remove, defaulting to removing all the characters from the starting position. It modifies the string object in-place, not returning any value. .. member:: string->normalize() .. member:: string->decompose() Transforms the string into either its normalized or decomposed form. It modifies the string object in-place, not returning any value. For more information on normalizing Unicode strings, see the `Unicode Normalization FAQ`_ and `Unicode Standard Annex #15`_. .. member:: string->foldCase() Converts the characters in the string to allow for case-insensitive comparisons. It modifies the string object in-place, not returning any value. .. member:: string->trim() Removes any whitespace from the beginning and end of the string. It modifies the string object in-place, not returning any value. .. member:: string->reverse() Changes the string object to the value of the base string in reverse order. It modifies the string object in-place, not returning any value. .. member:: string->toLower(position::integer) Changes the character at the specified position to lowercase if possible. It modifies the string object in-place, not returning any value. .. member:: string->toUpper(position::integer) Changes the character at the specified position to uppercase if possible. It modifies the string object in-place, not returning any value. .. member:: string->toTitle(position::integer) Changes the character at the specified position to title case if possible. It modifies the string object in-place, not returning any value. .. member:: string->lowercase() Changes every possible character in the string to lowercase. It modifies the string object in-place, not returning any value. .. member:: string->uppercase() Changes every possible character in the string to uppercase. It modifies the string object in-place, not returning any value. .. member:: string->titlecase() .. member:: string->titlecase(language::string, country::string) Changes every possible word in the string to title case. It can be called with a language code for the first parameter and a country code for the second to specify a locale to be used when performing this operation. It modifies the string object in-place, not returning any value. .. member:: string->padLeading(tosize::integer, with::string= ?) If the base string is smaller in size than the first parameter specifying the target size of the string, it changes the base string by prepending a character to its beginning until it reaches the specified size. The character used for prepending defaults to a space, but can be set with an optional second parameter. It modifies the string object in-place, not returning any value. .. member:: string->padTrailing(tosize::integer, with::string= ?) If the base string is smaller in size than the first parameter specifying the target size of the string, it changes the base string by appending a character to its end until it reaches the specified size. The character used for appending defaults to a space, but can be set with an optional second parameter. It modifies the string object in-place, not returning any value. .. member:: string->removeLeading(find::string) .. member:: string->removeLeading(find::regexp) Removes all substrings that match the string pattern or regular expression specified in the parameter from the beginning of the base string. It keeps removing until the beginning of the base string no longer matches the specified pattern. It modifies the string object in-place, not returning any value. .. member:: string->removeTrailing(find::string) Removes all substrings that match the string pattern specified in the parameter from the end of the base string. It keeps removing until the end of the string no longer matches the specified pattern. It modifies the string object in-place, not returning any value. .. member:: string->merge(where::integer, what::string, offset::integer= ?, length::integer= ?) Merges a specified string into the base string. It requires the first parameter to specify the position in the base string for the merge to take place and a second parameter specifying the string to merge into the base string. It modifies the string object in-place, not returning any value. Optionally, a third parameter can specify the starting position of the passed string to be used in the merge and a fourth can specify the number of characters after the offset to be merged from the passed string. .. member:: string->replace(find::string, replace::string, -case::boolean= ?) .. member:: string->replace(find::regexp, replace= ?, ignoreCase= ?) Replaces all substrings found in the base string that match the string pattern or regular expression specified in the first parameter with the replacement string specified in the second parameter. For regular expression matches, the replacement string can optionally be specified as a separate parameter, or it will use the replacement string of the :type:`regexp` object. It modifies the string object in-place, not returning any value. When using a string pattern for matching, the method defaults to case-insensitive matching unless otherwise specified by the third parameter. When using a regular expression, the default is the reverse: it uses case-sensitive matching unless otherwise specified by the third parameter. Append Data to a String ----------------------- This example uses the `string->append` method to add a trailing slash to a directory path if one does not already exist:: local(dir_path) = '/var/lasso/home' if(not #dir_path->endsWith('/')) => { #dir_path->append('/') } #dir_path // => /var/lasso/home/ Remove Whitespace Around a String --------------------------------- This example uses the `string->trim` method to remove whitespace from the beginning and end of a string:: local(my_string) = '\n Ralph the Ringed Rhino \n\n' #my_string->trim #my_string // => Ralph the Ringed Rhino Ensure All Characters are Lowercase ----------------------------------- This example converts all the characters in a string to lowercase:: local(my_string) = 'Ralph the Ringed Rhino races red radishes in THE RINK.' #my_string->lowercase #my_string // => ralph the ringed rhino races red radishes in the rink. Remove a Pattern from the End of a String ----------------------------------------- This example removes all the trailing commas from a string:: local(my_string) = 'First, Second, Fifth,,,' #my_string->removeTrailing(',') #my_string // => First, Second, Fifth String Encoding Methods ======================= .. member:: string->hash() Returns a simple hash of the string object. .. member:: string->unescape() Returns the value of the string object with any escape sequences (a sequence beginning with a backslash) replaced with their literal Unicode equivalents. This is the same escape process used by Lasso for non-ticked string literals. .. member:: string->encodeHtml() .. member:: string->encodeHtml(linebreaks::boolean, ignorechars::boolean) Returns the value of the string object with any reserved, illegal, or extended ASCII characters converted to their equivalent HTML entity. This replacement can be modified by passing two boolean parameters. If the first parameter is set to "true", line breaks are encoded. If the second parameter is set to "true", the following characters are not encoded: ``" & ' < >`` (double quotation mark, ampersand, single quotation mark, less than or left angle bracket, and greater than or right angle bracket, respectively). .. member:: string->decodeHtml() Returns the value of the string object with any HTML entities converted to their Unicode equivalent. This is the opposite of the `string->encodeHtml` method. .. member:: string->encodeXml() Returns the value of the string object with any reserved or illegal XML characters encoded into their equivalent XML entity. .. member:: string->decodeXml() Returns the value of the string object with any XML entities converted to their Unicode equivalent. This is the opposite of the `string->encodeXml` method. .. member:: string->encodeHtmlToXml() Returns the value of the string object with any HTML character entity references converted to their equivalent numeric character reference. .. member: string->asBytes() .. member:: string->asBytes(encoding::string= ?) Returns the value of the string object as a bytes object. By default, UTF-8 encoding is used for this conversion, but any encoding can be specified as a string parameter to this method. .. member:: string->encodeSql() Returns the value of the string object with any illegal characters for MySQL data sources properly escaped. .. member:: string->encodeSql92() Returns the value of the string object with any illegal characters for SQL-92--compliant data sources properly escaped. Not for use with MySQL. .. member:: string->encodeUrl()::bytes Returns a byte stream of the string object with any illegal characters for URLs properly escaped. See `bytes->encodeUrl`. Convert Escape Sequences ------------------------ The following example creates a string with escape sequences using a ticked string literal so that Lasso won't automatically unescape them. It then outputs the string before calling `string->unescape` and then shows the result of calling `string->unescape`:: local(my_string) = `Chinese Character: \u4E26` #my_string + '\n' #my_string->unescape // => // Chinese Character: \u4E26 // Chinese Character: 並 Encode HTML Entities -------------------- The following example uses `string->encodeHtml` to return a string with the HTML reserved characters encoded as entities:: local(my_string) = '<>&' #my_string->encodeHtml // => <>& Encode for Use in MySQL ----------------------- The following example returns a string whose quotes have been encoded for use in a MySQL SQL statement:: local(my_string) = "Don't forget to encode" #my_string->encodeSql // => Don\'t forget to encode String Iteration Methods ======================== .. member:: string->forEachCharacter() Executes a given capture block once for every character in the base string. The character can be accessed in the capture block through the special local variable ``#1``. .. member:: string->forEachWordBreak() Executes a given capture block once for every word in the base string. The word can be accessed in the capture block through the special local variable ``#1``. .. member:: string->forEachLineBreak() Executes a given capture block once for every substring that would be generated by splitting the base string on a line break. Every line break character is recognized: ``"\r"``, ``"\n"``, and ``"\r\n"``. Each of the substrings can be accessed in the capture block through the special local variable ``#1``. .. member:: string->forEachMatch(exp::string) .. member:: string->forEachMatch(exp::regexp) Executes a given capture block once for every match in the base string. Matches can be specified as either :type:`string` or :type:`regexp` objects. The match can be accessed in the capture block through the special local variable ``#1``. .. member:: string->eachCharacter() Returns an ``eacher`` that can be used in conjunction with query expressions to inspect and perform complex operations on every character in the base string. .. member:: string->eachWordBreak() Returns an ``eacher`` that can be used in conjunction with query expressions to inspect and perform complex operations on every word in the base string. .. member:: string->eachLineBreak() Returns an ``eacher`` that can be used in conjunction with query expressions to inspect and perform complex operations on every line in the base string. .. member:: string->eachMatch(exp::string) .. member:: string->eachMatch(exp::regexp) Returns an ``eacher`` that can be used in conjunction with query expressions to inspect and perform complex operations on every specified match in the base string. Matches can be specified as either :type:`string` or :type:`regexp` objects. Iterate Over Lines ------------------ The following example takes a string with multiple lines and runs the lines of the string together with slashes, storing the result in the variable "quoted_poem". It removes the trailing slash at the end and then displays the variable "quoted_poem" in quotes. :: local(poem) = '\ An old silent pond... A frog jumps into the pond, Splash! Silence again.' local(quoted_poem) = '' #poem->forEachLineBreak => { #quoted_poem->append(#1 + '/') } #quoted_poem->removeTrailing('/') '"' + #quoted_poem + '"' // => "An old silent pond.../A frog jumps into the pond,/Splash! Silence again." Iterate Over Words ------------------ The following example takes a string and inspects each word using a query expression. If the word starts with the letter "r" then it will transform it to uppercase. The query expression selects each word, allowing us to create a staticarray of words. :: local(my_string) = 'Ralph is a red rhinoceros.' ( with word in #my_string->eachWordBreak select (#word->beginsWith('r') ? #word->uppercase& | #word) )->asStaticArray // => staticarray(RALPH, is, a, RED, RHINOCEROS.) Iterate Over a Specified Regular Expression Match ------------------------------------------------- The following example uses `string->eachMatch` with a :type:`regexp` object to find every vowel in a string, where the local variable "vowels" is used to count the number of each vowel in the string. :: local(my_string) = 'ralph is a red rhinoceros.' local(vowels) = map('a'=0, 'e'=0, 'i'=0, 'o'=0, 'u'=0) with letter in #my_string->eachMatch(regexp(`[aeiouAEIOU]`)) do #vowels->find(#letter)++ #vowels // => map(a = 2, e = 2, i = 2, o = 2, u = 0) String Export Methods ===================== .. member:: string->split(find::string) Returns an array with elements created by breaking up the base string on the specified string. If an empty string is specified, each element of the array will be a single character from the base string. .. member:: string->values() Returns an array where each element is one character from the base string. .. member:: string->keys() Returns a :type:`generateSeries` from 1 to the number of characters in the base string, or an empty :type:`generateSeries` if the base string is empty. Split a String Into an Array ---------------------------- The following example creates an array by splitting a string on a comma:: local(my_string) = '1,3,9,f,g' #my_string->split(',') // => array(1, 3, 9, f, g) .. _Unicode Character Database: http://www.unicode.org/ucd/ .. _Unicode Technical Reports: http://www.unicode.org/reports/ .. _list of properties: http://icu-project.org/apiref/icu4c-latest/uchar_8h.html#enum-members .. _Unicode Normalization FAQ: http://www.unicode.org/faq/normalization.html .. _Unicode Standard Annex #15: http://www.unicode.org/reports/tr15/