
util::StreamTokenizer Class Reference

#include <StreamTokenizer.h>

Inherits util::Object.

Detailed Description

StreamTokenizer reads an input stream and splits it into separate tokens.

It can work in an unlimited number of user-defined modes, and two modes are predefined: C and XML.


Public Types

enum  Syntax { SYNTAX_C, SYNTAX_XML }
 Possible syntaxes for the tokenizer.


Public Methods

 StreamTokenizer (Syntax syntax=SYNTAX_C)
 Initialize a StreamTokenizer with the given syntax.

 ~StreamTokenizer ()
void setAttribute (unsigned char character, int attr)
 Set the attribute for a character.

int getAttribute (unsigned char character) const
 Get the attribute for a character.

int & attribute (unsigned char character)
 Get a modifiable reference to the attribute for a character.

Token * getNextToken (std::istream &input) throw (io::IOException&)
 Get the next token from an input stream.

void addSection (Section *sec)
 Add a new section.

void removeSection (Section *sec)
 Remove a section.

void setDecimalSeparator (unsigned char separator)
 Set the decimal separator character.

char getDecimalSeparator (void)
 Get the decimal separator character.

void setCSyntax (void)
 A method for setting the reader's syntax to C style.

void setXMLSyntax (void)
 A method for setting the reader's syntax to XML style.

unsigned int getLineNumber (void)
 Get the current line number.

void setLineNumber (unsigned int number)
 Set the current line number.


Member Enumeration Documentation

enum util::StreamTokenizer::Syntax
 

Possible syntaxes for the tokenizer.

  • SYNTAX_C - C syntax
  • SYNTAX_XML - XML syntax


Constructor & Destructor Documentation

util::StreamTokenizer::StreamTokenizer (Syntax syntax = SYNTAX_C)
 

Initialize a StreamTokenizer with the given syntax (SYNTAX_C by default).


Member Function Documentation

int& util::StreamTokenizer::attribute (unsigned char character) [inline]
 

Get a modifiable reference to the attribute for a character.

Parameters:
character  the character whose attribute you want to check
Returns:
a reference to the bit mask of set attributes

int util::StreamTokenizer::getAttribute (unsigned char character) const [inline]
 

Get the attribute for a character.

Parameters:
character  the character whose attribute you want to check
Returns:
a bit mask of set attributes

StreamTokenizer::Token * util::StreamTokenizer::getNextToken (std::istream & input) throw (io::IOException&)
 

Get the next token from an input stream.

The returned token is a newly allocated object that must be destroyed with the delete operator.

Parameters:
input  the stream to read from
Returns:
a token
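
Because the caller owns the returned token, one way to avoid leaks is to hand the raw pointer to a smart pointer straight away. The following is a minimal sketch only: `Token` and `getNextTokenStandIn` are hypothetical stand-ins written for this example (the real Token members are not described here), and returning a null pointer at end of input is an assumption rather than documented behaviour.

```cpp
#include <istream>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical stand-in for StreamTokenizer::Token; the real class's
// members are not described in this reference.
struct Token {
    std::string text;
};

// Hypothetical stand-in for getNextToken(): returns a token allocated
// with new, or a null pointer at end of input (an assumption).
Token* getNextTokenStandIn(std::istream& input) {
    std::string word;
    if (!(input >> word)) return nullptr;
    return new Token{word};
}

// Count tokens while letting unique_ptr call delete on each one.
int countTokens(std::istream& input) {
    int count = 0;
    for (;;) {
        std::unique_ptr<Token> token(getNextTokenStandIn(input));
        if (!token) break;
        ++count;  // token is deleted when it goes out of scope here
    }
    return count;
}
```

With the real class, the same pattern would wrap the pointer returned by getNextToken(); a std::istringstream is enough to exercise the stand-in.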

void util::StreamTokenizer::setAttribute (unsigned char character, int attr)
 

Set the attribute for a character.

A character attribute is a bitwise OR of the predefined constants TOKEN_{NORMAL, WORD, NUMBER, SPACE, SECTION}. When StreamTokenizer encounters a character with an attribute other than TOKEN_NORMAL, it takes special action. Each normal character is a separate token; adjacent word characters form one token; spaces separate tokens and are otherwise ignored; number characters are digits or other parts of a number; and section characters start a user-defined section that is treated as one token. The number and section attributes are not intended for manual adjustment.

Parameters:
character  the character whose attribute is to be altered
attr  the new attribute.
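
The effect of the word, space and normal attributes can be illustrated with a small self-contained model. This is a sketch under stated assumptions, not the library's implementation: `AttributeTable` and `splitByAttributes` are hypothetical names, the numeric values of the TOKEN_* flags are invented for the example, and the number and section attributes are left out.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical flag values; the real TOKEN_* constants may differ.
enum : int {
    TOKEN_NORMAL = 0,
    TOKEN_WORD   = 1 << 0,
    TOKEN_SPACE  = 1 << 1
};

// One attribute entry per possible unsigned char value.
struct AttributeTable {
    int attrs[256];
    AttributeTable() {
        for (int c = 0; c < 256; ++c) {
            if (std::isalnum(c) || c == '_') attrs[c] = TOKEN_WORD;
            else if (std::isspace(c))        attrs[c] = TOKEN_SPACE;
            else                             attrs[c] = TOKEN_NORMAL;
        }
    }
    void setAttribute(unsigned char ch, int attr) { attrs[ch] = attr; }
    int getAttribute(unsigned char ch) const { return attrs[ch]; }
};

// Split text: word characters group together, spaces are dropped,
// and every normal character becomes a token of its own.
std::vector<std::string> splitByAttributes(const AttributeTable& table,
                                           const std::string& text) {
    std::vector<std::string> tokens;
    std::string word;
    for (unsigned char ch : text) {
        int attr = table.getAttribute(ch);
        if (attr & TOKEN_WORD) {
            word += static_cast<char>(ch);
            continue;
        }
        if (!word.empty()) { tokens.push_back(word); word.clear(); }
        if (attr & TOKEN_SPACE) continue;
        tokens.push_back(std::string(1, static_cast<char>(ch)));
    }
    if (!word.empty()) tokens.push_back(word);
    return tokens;
}
```

In this model, "a-b = c" splits into five tokens by default; after marking '-' as a word character with setAttribute('-', TOKEN_WORD), "a-b" stays together as one token.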

void util::StreamTokenizer::setCSyntax (void)
 

A method for setting the reader's syntax to C style.

In C mode, StreamTokenizer recognizes slash-slash (//) and slash-star (/* */) style comments, and string constants delimited by double quotes. It also reads numbers as numerical tokens.

In C mode, unquoted words are of type TOKEN_WORD. Strings enclosed in double quotes are of type TOKEN_SECTION with the section name "string". Comments are of type TOKEN_SECTION with the section name "comment".
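
As a rough illustration of how the "string" and "comment" sections described above could be delimited, the sketch below scans one section from the start of a piece of text. `SectionSpan` and `scanCSection` are hypothetical names; treating a backslash as an escape inside strings and excluding the newline from a // comment are assumptions, since this reference does not say how StreamTokenizer handles either case.

```cpp
#include <string>

// Result of scanning one section: its name and how many characters it spans.
struct SectionSpan {
    std::string name;    // "string", "comment", or "" if no section starts here
    std::size_t length;  // characters consumed, including the delimiters
};

SectionSpan scanCSection(const std::string& text) {
    if (text.compare(0, 2, "//") == 0) {
        // Line comment: runs up to, but not including, the newline.
        std::size_t end = text.find('\n');
        return {"comment", end == std::string::npos ? text.size() : end};
    }
    if (text.compare(0, 2, "/*") == 0) {
        // Block comment: runs through the closing "*/".
        std::size_t end = text.find("*/", 2);
        return {"comment", end == std::string::npos ? text.size() : end + 2};
    }
    if (!text.empty() && text[0] == '"') {
        for (std::size_t i = 1; i < text.size(); ++i) {
            if (text[i] == '\\') { ++i; continue; }  // skip escaped character
            if (text[i] == '"') return {"string", i + 1};
        }
        return {"string", text.size()};  // unterminated string
    }
    return {"", 0};  // no section starts here
}
```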

void util::StreamTokenizer::setXMLSyntax (void)
 

A method for setting the reader's syntax to XML style.

In XML mode, StreamTokenizer recognizes XML tags (including comments), and automatically parses the entities defined in the XML 1.0 specification (such as &amp;).
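
XML 1.0 predefines five entities: &amp;, &lt;, &gt;, &apos; and &quot;. The sketch below shows what replacing them might look like; `decodePredefinedEntities` is a hypothetical helper written for illustration, and whether StreamTokenizer also handles numeric character references is not stated here, so they are left untouched.

```cpp
#include <string>

// Replace the five entities predefined by XML 1.0 with their characters.
// Unknown or unterminated references are copied through unchanged.
std::string decodePredefinedEntities(const std::string& text) {
    static const struct { const char* name; char value; } entities[] = {
        {"&amp;", '&'}, {"&lt;", '<'}, {"&gt;", '>'},
        {"&apos;", '\''}, {"&quot;", '"'},
    };
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        bool matched = false;
        if (text[i] == '&') {
            for (const auto& e : entities) {
                std::string name(e.name);
                if (text.compare(i, name.size(), name) == 0) {
                    out += e.value;
                    i += name.size();
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out += text[i++];
    }
    return out;
}
```

The input is decoded in a single pass, so a replaced ampersand is never re-read as the start of another entity.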

In XML mode, unquoted words are of type TOKEN_WORD. Strings enclosed in single or double quotes are of type TOKEN_SECTION with the section name "string". String sections are normalized according to the attribute value normalization scheme defined in the XML 1.0 recommendation. In addition to strings, the following section types are recognized:

  • <![CDATA[ ]]> - cdata
  • <!-- --> - comment
  • <! > - declaration
  • <? ?> - instruction
  • < > - tag
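
The opening delimiters in the list above overlap: "<![CDATA[" and "<!--" both begin with "<!", which itself begins with "<". A classifier therefore has to test the longer delimiters first. The sketch below shows one way to do that; `xmlSectionName` is a hypothetical helper, not part of the documented interface.

```cpp
#include <string>

// Map the start of a markup construct to the section name listed above.
// Longer opening delimiters are tested first so that "<![CDATA[" is not
// mistaken for a declaration or a plain tag.
std::string xmlSectionName(const std::string& text) {
    static const struct { const char* open; const char* name; } kinds[] = {
        {"<![CDATA[", "cdata"},
        {"<!--",      "comment"},
        {"<!",        "declaration"},
        {"<?",        "instruction"},
        {"<",         "tag"},
    };
    for (const auto& kind : kinds) {
        // rfind with position 0 succeeds only if text starts with kind.open.
        if (text.rfind(kind.open, 0) == 0) return kind.name;
    }
    return "";  // text does not start a section
}
```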


Documentation generated on 11.09.2003 with Doxygen.
The documentation is copyrighted material.
Copyright © Topi Mäenpää 2003. All rights reserved.