Byte Streams

Binary data in Lasso is stored and manipulated using the bytes type. This chapter details the operators and methods that can manipulate binary data.

Tip

The bytes type is often used in conjunction with the string type to convert binary data between different character encodings, such as UTF-8 and ISO-8859-1. See the Strings chapter for more information about the string type.

Creating Bytes Objects

While string data in Lasso is processed as one- to four-byte Unicode characters, the bytes type can represent raw strings of single bytes, which is often referred to as a byte stream or binary data.

Lasso’s methods return a bytes object in the following situations:

  • The bytes creator method allocates a new bytes object.
  • The web_request->param methods return a bytes object.
  • The field method returns a bytes object from MySQL “BLOB” fields.
  • Other methods that return or require binary data as outlined in their documentation.
type bytes
bytes()
bytes(initial::integer)
bytes(copy::bytes)
bytes(import::string, encoding::string=?)
bytes(doc::pdf_doc)

Allocates a bytes object. Can convert a string or pdf_doc type to a bytes type, or instantiate a new bytes object. Accepts one optional parameter that can specify the initial size in bytes for the stream; or specify the string, pdf_doc, or bytes object to convert to a new bytes object. If converting a string object, it can accept an optional second parameter to specify the encoding of the string.

bytes->reserve(size::integer)

Attempts to preallocate enough memory for the specified number of bytes. Useful for optimization by avoiding memory reallocation if the expected byte stream size is known in advance.

Instantiate a New Bytes Object

Use the bytes creator method. The example below creates an empty bytes object with a size of 1024 bytes:

local(obj) = bytes(1024)

Convert String Data to a Bytes Object

Use the bytes creator method. The following example converts a string to a bytes object:

local(obj) = bytes('This is some text')

Bytes Inspection Methods

Byte streams are similar to strings and support many of the same member methods. Additionally, byte streams support a number of member methods that make it easier to deal with binary data. The most common methods are outlined below.

bytes->size()

Returns the number of bytes contained in the bytes object.

bytes->length()

Deprecated since version 9.0: Use bytes->size instead.

bytes->get(position::integer) → integer

Returns a single byte from the stream. Requires a parameter specifying which byte to fetch.

bytes->getRange(position::integer, num::integer) → bytes

Returns a range of bytes from the byte stream. Requires two parameters: the first specifies the byte position to start from, and the second specifies how many bytes to return.

bytes->find(find::bytes, position::integer=?, length::integer=?, patPosition::integer=?, patLength::integer=?)
bytes->find(find::string, position::integer=?, length::integer=?, patPosition::integer=?, patLength::integer=?)

Searches the bytes object for the byte sequence or string pattern specified in the first parameter, returning the position where the sequence first begins in the bytes object or “0” if the pattern cannot be found.

The second and third parameters can specify a portion of the bytes object within which to look for the match, with the former specifying the position to begin the search and the latter specifying the number of bytes to search. Similarly, the fourth and fifth parameters can specify a portion of the sequence that should be used for matching.

bytes->contains(find::string)
bytes->contains(find::bytes)

Returns “true” if the byte stream contains the specified sequence.

bytes->beginsWith(find::string)
bytes->beginsWith(find::bytes)

Returns “true” if the byte stream begins with the specified sequence.

bytes->endsWith(find::string)
bytes->endsWith(find::bytes)

Returns “true” if the byte stream ends with the specified sequence.

bytes->bestCharset(charset::string)

Checks if the byte stream can be encoded using the specified character set. Returns the either the specified character set name if it can, or an appropriate character set name if not.

bytes->detectCharset()

Checks which character sets could be used to decode the byte stream and returns a staticarray of guesses where each is a staticarray of the character set name, the language covered by the character set (if any), and a confidence value.

Find a Character Set for a Byte Stream

Use the bytes->bestCharset method. The examples below show the result of passing a byte stream containing a character that can’t be encoded with the suggested character set:

bytes('This is a plain ASCII string')->bestCharset('ISO-8859-1')
// => ISO-8859-1

bytes('This isn’t a plain ASCII string')->bestCharset('ISO-8859-1')
// => UTF-8

Bytes Export Methods

Bytes objects keep track of a “marker”, indicating where in the stream export operations will begin from. Newly created bytes objects have their marker set to “0”, and are incremented by the number of exported bytes when any of the export member methods that return bytes objects are called. The marker can also be set manually.

bytes->asString(encoding::string=?)

Returns the entire byte stream as a string using the specified encoding, defaulting to “UTF-8”.

bytes->marker()

Returns the current position at which exports will occur in the byte stream.

bytes->marker=(value::integer)

Sets the byte stream’s marker to the passed value.

bytes->position()
bytes->position=(value::integer)
bytes->setPosition(i::integer)

Deprecated since version 9.0: Use bytes->marker and bytes->marker= instead.

bytes->exportString(encoding::string)

Returns a string representing the byte stream. Requires a single parameter specifying the character encoding (e.g. “ISO-8859-1” or “UTF-8”) for the export. If the byte stream has a marker set, only the bytes following the marker will be returned. The marker is not modified.

bytes->exportBytes(num::integer=?)

Returns the byte stream as a bytes object. Accepts one optional parameter that can specify the number of bytes to return. If the byte stream has a marker set, only the bytes following the marker will be returned. Sets the marker to the end of the stream.

bytes->export8bits()
bytes->export16bits()
bytes->export32bits()
bytes->export64bits()

Returns 1, 2, 4, or 8 bytes of the byte stream starting from the marker as an integer and increments the marker by the same amount.

bytes->exportSigned8bits()
bytes->exportSigned16bits()
bytes->exportSigned32bits()
bytes->exportSigned64bits()

Returns 1, 2, 4, or 8 bytes of the byte stream starting from the marker as a signed (two’s-complement) integer and increments the marker by the same amount.

bytes->split(find::string)
bytes->split(find::bytes)

Returns an array of bytes objects using the specified sequence as the delimiter to split the byte stream. If the delimiter provided is an empty byte stream or string, the byte stream is split on each byte, so the returned array will have each byte as one of its elements.

bytes->sub(position::integer, num::integer=?)

Returns a specified slice of the byte stream. Requires an integer parameter specifying the index into the byte stream to start taking the slice from. An optional second integer parameter can specify the number of bytes to slice out of the bytes object. If the second parameter is not specified, all of the bytes following the index are returned.

Return the Size of a Byte Stream

Use the bytes->size method. The example below returns the size of a bytes object:

local(obj) = bytes('abc…')
#obj->size

// => 6

Return a Single Byte from a Byte Stream

Use the bytes->get method. An integer parameter specifies the index of the byte to return. Note that this method returns an integer, not a fragment of the original data (such as a string character):

local(obj) = bytes('hello world')
#obj->get(2)

// => 101

Find a Value Within a Byte Stream

Use the bytes->find method. The example below returns the starting byte number of the value 'rhino', which is contained within the byte stream:

bytes('running rhinos risk rampage')->find('rhino')
// => 9

Determine If a Byte Stream Contains a Value

Use the bytes->contains method. The example below will return “true” if the value 'Rhino' is contained within the byte stream. Note that in this example it will return “false” because the bytes of 'rhino' are a different sequence than the bytes of 'Rhino'.

bytes('running rhinos risk rampage')->find('Rhino')
// => false

Export a String from a Byte Stream

Use the bytes->exportString method. The following example exports a string using UTF-8 encoding:

local(obj) = bytes('This is a string')
#obj->exportString('UTF-8')

// => This is a string

Bytes Decoding/Encoding Methods

bytes->crc()

Returns the cyclic redundancy check integer value for the byte stream.

bytes->encodeBase64()

Returns a base64-encoded representation of the byte stream as a bytes object.

bytes->decodeBase64()

Returns the binary data of a base64-encoded byte stream as a bytes object. This is the opposite of the bytes->encodeBase64 method.

bytes->encodeHex()

Returns the byte stream in hexadecimal format.

bytes->decodeHex()

Returns the binary data of a byte stream containing hexadecimal ASCII characters by converting each pair of characters to a single byte. This is the opposite of the bytes->encodeHex method.

bytes->encodeMd5()

Returns the MD5 hash value for the byte stream as a bytes object.

bytes->encodeQP()

Returns the byte stream in quoted-printable format.

bytes->decodeQP()

Returns the binary data of a quoted-printable–encoded byte stream as a bytes object. This is the opposite of the bytes->encodeQP method.

bytes->encodeSql()

Returns the byte stream with any illegal characters for MySQL data sources properly escaped.

bytes->encodeSql92()

Returns the byte stream with any illegal characters for SQL-92–compliant data sources properly escaped. Not for use with MySQL.

bytes->encodeUrl()

Returns the byte stream with any illegal characters for URLs properly escaped.

bytes->decodeUrl()

Returns the binary data of a URL-encoded byte stream as a bytes object, with any escaped characters replaced with their ASCII equivalents. This is the opposite of the bytes->encodeUrl method.

Encode a File as Base64

Use the bytes->encodeBase64 method. The example below reads a file into a byte stream and prints its Base64-encoded value:

file('red-dot.png')->readBytes->encodeBase64
// => iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==

Bytes Iteration Methods

bytes->forEachByte()

Executes a given capture block once for every bytes in the byte stream. The byte can be accessed in the capture block through the special local variable #1.

bytes->eachByte()

Returns an eacher that can be used in conjunction with query expressions to inspect and perform complex operations on every byte in the byte stream.

Bytes Manipulation Methods

Calling the following methods will modify the bytes object without returning a value.

bytes->setSize(num::integer)

Sets the byte stream size to the specified number of bytes.

bytes->setRange(what::bytes, where::integer=?, whatStart::integer=?, whatLen::integer=?)

Sets a range of characters within a byte stream. Requires one parameter for the binary data to be inserted. The optional second, third, and fourth parameters specify the integer offset into the byte stream to insert the new data, and the offset and length of the new data to be inserted, respectively.

bytes->padLeading(tosize::integer, with::bytes=?)
bytes->padLeading(tosize::integer, with::string=?)

If the byte stream is smaller in size than the first parameter specifying the target number of bytes, it changes the byte stream by prepending a character to its beginning until it reaches the specified size. The character used for prepending defaults to a space, but can be set with an optional second parameter.

bytes->padTrailing(tosize::integer, with::bytes=?)
bytes->padTrailing(tosize::integer, with::string=?)

If the byte stream is smaller in size than the first parameter specifying the target number of bytes, it changes the byte stream by appending a character to its end until it reaches the specified size. The character used for appending defaults to a space, but can be set with an optional second parameter.

bytes->replace(find::bytes, replace::bytes)

Replaces all instances of a value within a byte stream with a new value. Requires two parameters: the first parameter is the value to find, and the second parameter is the value with which to replace the first parameter.

bytes->remove()
bytes->remove(position::integer, num::integer)

Removes bytes from a byte stream. When passed without a parameter, it removes all bytes, setting the object to an empty bytes object. In its second form, it requires an offset into the byte stream and the number of bytes to remove starting from there.

bytes->removeLeading(find::bytes)

Removes all occurrences of the specified sequence from the beginning of the byte stream. Requires one parameter specifying the data to be removed.

bytes->removeTrailing(find::bytes)

Removes all occurrences of the parameter sequence from the end of the byte stream. Requires one parameter specifying the data to be removed.

bytes->append(rhs::bytes)
bytes->append(rhs::string)

Appends the specified data to the end of the byte stream. Requires one parameter specifying the data to append.

bytes->trim()

Removes all whitespace ASCII characters from the beginning and the end of the byte stream.

bytes->importString(s::string, enc::string=?)

Imports a string parameter into the byte stream. A second parameter can specify the character encoding (e.g. “ISO-8859-1” or “UTF-8”) to use for the import.

bytes->importBytes(b::bytes)

Imports a bytes object parameter into the byte stream.

bytes->import8bits(i::integer)
bytes->import16bits(i::integer)
bytes->import32bits(i::integer)
bytes->import64bits(i::integer)

Imports the first 1, 2, 4, or 8 bytes of an integer parameter.

bytes->swapBytes()

Swaps the position of every pair of bytes, e.g. a byte stream of 'father' becomes 'afhtre'.

Add a String to a Byte Stream

Use the bytes->append method. The following example adds the string 'I am' to the end of a byte stream:

local(obj) = bytes
#obj->append('I am')

Find and Replace Values in a Byte Stream

Use the bytes->replace method. The following example finds the string 'Blue' and replaces it with the string 'Green' within the byte stream:

local(colors) = bytes('Blue Red Yellow')
#colors->replace('Blue', 'Green')

Import a String Into a Byte Stream

Use the bytes->importString method. The following example imports a string using ISO-8859-1 encoding:

local(obj) = bytes('This is a string')
#obj->importString('This is another string', 'ISO-8859-1')