1.38 Class Tokenizer

Simple stream-oriented parser for efficient basic recognition of incoming data.

Class Tokenizer( [seps],[options],[tokLen],[source] )
seps A string representing the separators.
options Tokenization options.
tokLen Maximum length of returned tokens.
source The string to be tokenized, or a stream to be read for tokens.

The Tokenizer class provides simple and efficient logic to parse incoming data, mainly from strings.

The source can also be set later with the Tokenizer.parse method. seps defaults to " " if not given.

The options parameter can be a bitwise combination of the following values:

- Tokenizer.groupsep: Groups consecutive separators into one. If not given, when a separator immediately follows another, an empty field is returned.
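
As a minimal sketch of the groupsep option described above, the following compares two tokenizers over the same input. It uses only the constructor and methods documented on this page; the input string and the variable names are illustrative. If the option behaves as described, the first loop should report an empty token between "a" and "b", while the second should not.

   // Default options: each separator ends a field, so ",," yields an empty one.
   plain = Tokenizer( "," )
   plain.parse( "a,,b" )
   while plain.hasCurrent()
      > "plain: [", plain.token(), "]"
      plain.next()
   end

   // With groupsep: adjacent separators are treated as a single separator.
   grouped = Tokenizer( ",", Tokenizer.groupsep )
   grouped.parse( "a,,b" )
   while grouped.hasCurrent()
      > "grouped: [", grouped.token(), "]"
      grouped.next()
   end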

Methods
hasCurrent Returns true if the tokenizer has a current token.
next Advances the tokenizer up to the next token.
nextToken Returns the next token from the tokenizer.
parse Changes or sets the source data for this tokenizer.
rewind Resets the status of the tokenizer.
token Gets the current token.

Methods

hasCurrent

Returns true if the tokenizer has a current token.

Tokenizer.hasCurrent()
Return True if a token is now available, false otherwise.

Unlike with iterators, Tokenizer.next must be called at least once before calling this method.

See also: Tokenizer.

next

Advances the tokenizer up to the next token.

Tokenizer.next()
Return True if a new token is now available, false otherwise.
Raise
IoError on errors on the underlying stream.
CodeError if called on an unprepared Tokenizer.

For example:


   t = Tokenizer( source|"A string to be tokenized" )
   while t.hasCurrent()
      > "Token: ", t.token()
      t.next()
   end

See also: Tokenizer.

nextToken

Returns the next token from the tokenizer.

Tokenizer.nextToken()
Return A string, or nil at the end of the tokenization.
Raise
IoError on errors on the underlying stream.
CodeError if called on an unprepared Tokenizer.

This method is actually a combination of Tokenizer.next followed by Tokenizer.token.

Sample usage:


   t = Tokenizer( source|"A string to be tokenized" )
   while (token = t.nextToken()) != nil
      > "Token: ", token
   end

Note: When looping, remember to compare the returned token against nil explicitly: empty strings can legally be returned multiple times, and they are considered false in logic checks.

parse

Changes or sets the source data for this tokenizer.

Tokenizer.parse( source )
source A string or a stream to be used as a source for the tokenizer.
Raise
IoError on errors on the underlying stream.

The first token is read immediately and set as the current token. If the source is not empty, that is, if at least one token can be read, Tokenizer.hasCurrent returns true and Tokenizer.token returns that token's value.
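
For instance, a single tokenizer can be reused across several inputs by calling parse on each one. This is a sketch based only on the behavior described above; the separator and the input strings are illustrative.

   t = Tokenizer( ";" )
   for line in [ "one;two", "three;four" ]
      // parse replaces the source and loads the first token as current
      t.parse( line )
      while t.hasCurrent()
         > "Token: ", t.token()
         t.next()
      end
   end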

rewind

Resets the status of the tokenizer.

Tokenizer.rewind()
Raise
IoError if the tokenizer is tokenizing a non-rewindable stream.

token

Get the current token.

Tokenizer.token()
Return The current token, as a string.
Raise
IoError on errors on the underlying stream.
CodeError if called on an unprepared Tokenizer, or before next().

This method returns the current token.
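
As an illustration, the current token can be accumulated into an array while advancing the tokenizer. The helper splitTokens below is hypothetical and not part of the Tokenizer class; the sketch assumes the core function arrayAdd and follows the hasCurrent/token/next pattern shown for Tokenizer.next.

   // splitTokens is a hypothetical helper, not part of the Tokenizer API.
   function splitTokens( text, seps )
      t = Tokenizer( seps )
      t.parse( text )
      result = []
      while t.hasCurrent()
         // token() reads the current token without advancing
         arrayAdd( result, t.token() )
         t.next()
      end
      return result
   end

   for word in splitTokens( "alpha beta gamma", " " )
      > word
   end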

See also: Tokenizer.
