libUTL++
utl::RDparser Class Reference

Recursive-descent parser.

#include <RDparser.h>

Public Member Functions

virtual void copy (const Object &rhs)
 Copy another instance.

void addProduction (uint_t id, const String &name, const String &rhs)
 Add a production rule.

void addTerminal (uint_t id, const String &name, const String &regex)
 Add a terminal symbol.

bool compile ()
 Compile the state machine to parse the grammar.

bool ok () const
 Determine whether a properly defined grammar has been compiled.

Graph * parse (Stream &stream, const String *prod=nullptr)
 Parse text from the given stream until EOF.

Graph * parse (const String &str, const String *prod=nullptr)
 Parse the given string.
 
Public Member Functions inherited from utl::Object
void clear ()
 Revert to initial state.

virtual int compare (const Object &rhs) const
 Compare with another object.

virtual void vclone (const Object &rhs)
 Make an exact copy of another instance.

virtual void steal (Object &rhs)
 "Steal" the internal representation from another instance.

virtual void dump (Stream &os, uint_t level=uint_t_max) const
 Dump a human-readable representation of self to the given output stream.

void dumpWithClassName (Stream &os, uint_t indent=4, uint_t level=uint_t_max) const
 Front-end for dump() that prints the object's class name.
 
virtual const ObjectgetKey () const
 Get the key for this object.

bool hasKey () const
 Determine whether or not the object has a key.
 
virtual const ObjectgetProxiedObject () const
 Get the proxied object (= self if none).
 
virtual ObjectgetProxiedObject ()
 Get the proxied object (= self if none).

virtual size_t hash (size_t size) const
 Get the hash code for the object.

bool _isA (const RunTimeClass *runTimeClass) const
 Determine whether self's class is a descendant of the given class.

virtual String toString () const
 Return a string representation of self.

 operator String () const
 Conversion to String.

size_t allocatedSize () const
 Get the total allocated size of this object.

virtual size_t innerAllocatedSize () const
 Get the "inner" allocated size.

virtual void addOwnedIt (const class FwdIt *it) const
 Notify self that it owns the given iterator.

virtual void removeOwnedIt (const class FwdIt *it) const
 Notify self that the given owned iterator has been destroyed.

bool operator< (const Object &rhs) const
 Less-than operator.

bool operator<= (const Object &rhs) const
 Less-than-or-equal-to operator.

bool operator> (const Object &rhs) const
 Greater-than operator.

bool operator>= (const Object &rhs) const
 Greater-than-or-equal-to operator.

bool operator== (const Object &rhs) const
 Equal-to operator.

bool operator!= (const Object &rhs) const
 Unequal-to operator.

void serializeIn (Stream &is, uint_t mode=ser_default)
 Serialize from an input stream.

void serializeOut (Stream &os, uint_t mode=ser_default) const
 Serialize to an output stream.

virtual void serialize (Stream &stream, uint_t io, uint_t mode=ser_default)
 Serialize to or from a stream.

void serializeOutBoxed (Stream &os, uint_t mode=ser_default) const
 Serialize a boxed object to an output stream.
 

Additional Inherited Members

Static Public Member Functions inherited from utl::Object
static ObjectserializeInNullable (Stream &is, uint_t mode=ser_default)
 Serialize a nullptr-able object from an input stream.

static void serializeOutNullable (const Object *object, Stream &os, uint_t mode=ser_default)
 Serialize a nullptr-able object to an output stream.

static void serializeNullable (Object *&object, Stream &stream, uint_t io, uint_t mode=ser_default)
 Serialize a nullptr-able object to or from a stream.
 
static ObjectserializeInBoxed (Stream &is, uint_t mode=ser_default)
 Serialize a boxed object from an input stream.

static void serializeBoxed (Object *&object, Stream &stream, uint_t io, uint_t mode=ser_default)
 Serialize a boxed object to or from a stream.
 
Protected Member Functions inherited from utl::FlagsMI
 FlagsMI ()
 Constructor.

virtual ~FlagsMI ()
 Destructor.

void copyFlags (const FlagsMI &rhs)
 Copy the given flags.

void copyFlags (const FlagsMI &rhs, uint_t lsb, uint_t msb)
 Copy (some of) the given flags.

void copyFlags (uint64_t flags, uint_t lsb, uint_t msb)
 Copy (some of) the given flags.

bool getFlag (uint_t flagNum) const
 Get a user-defined flag.

void setFlag (uint_t flagNum, bool val)
 Set a user-defined flag.

uint64_t getFlagsNumber (uint64_t mask, uint64_t shift=0)
 Get a multi-bit value in the flags data (which is stored as one 64-bit integer).

void setFlagsNumber (uint64_t mask, uint64_t shift, uint64_t num)
 Set a multi-bit value in the flags data (which is stored as one 64-bit integer).

uint64_t getFlags () const
 Get the flags.

void setFlags (uint64_t flags)
 Set the flags.
 

Detailed Description

Recursive-descent parser.

This is a fairly simple parser that should be adequate for many purposes. RDparser allows you to define a grammar and then parse text that satisfies the grammar. A grammar is defined by its productions and terminals.

Productions

A grammar production describes a rule for translating a production symbol into one or more production or terminal symbols. A production symbol may also be referred to as a non-terminal symbol.

The rules for writing a valid grammar production are themselves best described by a grammar, which is explained line by line after it:

PRODUCTION ::= BRANCH { \| BRANCH }
BRANCH     ::= ATOM { ATOM }
ATOM       ::= PRODUCTION
             | TERMINAL
             | \[ PRODUCTION \]
             | \{ PRODUCTION \}

Let's examine each of these three productions:

  • A PRODUCTION consists of a BRANCH followed by zero or more additional BRANCHes, each preceded by a '|' character.
  • A BRANCH consists of an ATOM followed by zero or more additional ATOMs.
  • An ATOM consists of one of the following:
    1. a PRODUCTION
    2. a TERMINAL
    3. open-square-bracket PRODUCTION close-square-bracket
    4. open-curly-bracket PRODUCTION close-curly-bracket

As you can see:

  • Symbols enclosed in square brackets are optional: they may be matched zero or one times.
  • Symbols enclosed in curly brackets may be matched zero or more times.
  • A symbol can be escaped with the '\' character to suppress its normal meaning. The ATOM production contains examples of this: the square and curly brackets had to be escaped because, unescaped, they carry special meaning in the grammar for writing production rules and would give the rule a rather different meaning.
  • The '|' character separates multiple possible translations, any of which can satisfy the production rule.

Left-recursion is not supported. That is, you cannot write a production rule whose first symbol is the name of the production itself.
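
To illustrate why left recursion is a problem for this style of parser, here is a minimal hand-written sketch in plain C++. It does not use RDparser: the grammar SUM ::= NUM { + NUM } and the names SumParser and evalSum are hypothetical, chosen only for this example. With a left-recursive rule such as SUM ::= SUM + NUM, the function for SUM would call itself before consuming any input and never terminate. Expressing the repetition with curly brackets turns it into a loop instead.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hand-written recursive-descent sketch (not RDparser) for the grammar
//   SUM ::= NUM { + NUM }
//   NUM ::= one or more digits
class SumParser {
public:
    explicit SumParser(const std::string& text) : text_(text), pos_(0) {}

    // SUM: one NUM, then zero or more "+ NUM" groups.
    // The while loop plays the role of the { ... } repetition.
    bool parseSum(long& total) {
        if (!parseNum(total)) return false;
        while (pos_ < text_.size() && text_[pos_] == '+') {
            ++pos_;  // consume '+'
            long next = 0;
            if (!parseNum(next)) return false;  // '+' must be followed by a NUM
            total += next;
        }
        return true;
    }

    // True once every input character has been consumed.
    bool atEnd() const { return pos_ == text_.size(); }

private:
    // NUM: consume a run of digits; fail without consuming if there is none.
    bool parseNum(long& value) {
        std::size_t start = pos_;
        value = 0;
        while (pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_]))) {
            value = value * 10 + (text_[pos_] - '0');
            ++pos_;
        }
        return pos_ != start;
    }

    std::string text_;
    std::size_t pos_;
};

// Evaluate a whole string; succeeds only if all input matches SUM.
bool evalSum(const std::string& s, long& result) {
    SumParser p(s);
    return p.parseSum(result) && p.atEnd();
}
```

Calling evalSum("1+2+30", r) succeeds and stores 33 in r, while inputs such as "1+" or "+1" are rejected.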

Terminals

At some point we have to describe the actual characters we expect to see. The "leaves" of the grammar consist of terminal symbols which are specified by regular expressions (Regex). For example, you might have a terminal symbol to represent a number, with a regular expression like "\\d+(\\.\\d*)?".
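
As a quick check of what such a pattern accepts, the sketch below tests the example expression with std::regex from the C++ standard library. This is an assumption made for illustration only: RDparser relies on its own Regex class, whose dialect may differ from the ECMAScript dialect that std::regex uses by default. The helper name matchesNumber is hypothetical.

```cpp
#include <regex>
#include <string>

// Returns true if the whole string matches the number pattern from the text.
// In C++ source each backslash is doubled, so the expression the regex
// engine sees is  \d+(\.\d*)?
// Assumption: std::regex (ECMAScript dialect) stands in for utl's Regex.
bool matchesNumber(const std::string& s) {
    static const std::regex number("\\d+(\\.\\d*)?");
    return std::regex_match(s, number);
}
```

Under this dialect the pattern accepts "42", "3.14" and "3." but rejects ".5", because at least one digit is required before the optional fractional part.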

General Instructions

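After defining the grammar by adding productions (addProduction()) and terminals (addTerminal()), call compile() to create the state machine that will be used to parse the grammar. compile() returns false if compilation fails, and ok() reports whether a properly defined grammar has been compiled. You are then ready to parse text with parse(), which returns the parse tree (a Graph of ParseNode). To see a full example of RDparser in action, look at the example program that parses simple math expressions.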
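
The call order described here (add symbols, compile, check the result) can be sketched with a small stand-in class. GrammarSketch below is hypothetical and is not libUTL++: it copies only the names and parameter order of addProduction(), addTerminal(), compile() and ok(), uses std::string instead of utl::String, and does not implement parse(). The rule that compile() fails when a right-hand side names an undefined symbol is an assumption made for this sketch, not documented RDparser behavior.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in, NOT libUTL++: it mirrors only the documented call
// order so the workflow can be tried in isolation.
class GrammarSketch {
public:
    GrammarSketch() : compiled_(false) {}

    void addProduction(unsigned /*id*/, const std::string& name, const std::string& rhs) {
        productions_[name] = rhs;
        compiled_ = false;  // any change invalidates an earlier compile()
    }

    void addTerminal(unsigned /*id*/, const std::string& name, const std::string& regex) {
        terminals_[name] = regex;
        compiled_ = false;
    }

    // Assumed check for this sketch: there is at least one production and
    // every identifier on a right-hand side names a defined symbol.
    bool compile() {
        compiled_ = !productions_.empty();
        std::map<std::string, std::string>::const_iterator it;
        for (it = productions_.begin(); it != productions_.end(); ++it) {
            if (!allSymbolsDefined(it->second)) compiled_ = false;
        }
        return compiled_;
    }

    bool ok() const { return compiled_; }

private:
    bool isDefined(const std::string& name) const {
        return productions_.count(name) != 0 || terminals_.count(name) != 0;
    }

    bool allSymbolsDefined(const std::string& rhs) const {
        std::size_t i = 0;
        while (i < rhs.size()) {
            unsigned char c = static_cast<unsigned char>(rhs[i]);
            if (c == '\\') { i += 2; continue; }  // skip an escaped character
            if (std::isalpha(c) || c == '_') {
                std::size_t start = i;
                while (i < rhs.size() &&
                       (std::isalnum(static_cast<unsigned char>(rhs[i])) || rhs[i] == '_')) {
                    ++i;
                }
                if (!isDefined(rhs.substr(start, i - start))) return false;
                continue;
            }
            ++i;  // brackets, '|' and whitespace carry no symbol name
        }
        return true;
    }

    std::map<std::string, std::string> productions_;
    std::map<std::string, std::string> terminals_;
    bool compiled_;
};

// Define  SUM ::= NUM { PLUS NUM }  and report whether it compiles.
// The ids 1..3 and the symbol names are illustrative.
bool buildSumGrammar(GrammarSketch& g, bool definePlus) {
    g.addTerminal(1, "NUM", "\\d+");
    if (definePlus) g.addTerminal(2, "PLUS", "\\+");
    g.addProduction(3, "SUM", "NUM { PLUS NUM }");
    return g.compile();
}
```

With the real class, the same sequence would end in a call to parse() once ok() returns true.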

Author
Adam McKee

Definition at line 103 of file RDparser.h.

Member Function Documentation

◆ copy()

virtual void utl::RDparser::copy ( const Object & rhs )
virtual

Copy another instance.

When you override copy(), you should usually call the superclass's copy().

Parameters
rhs  object to copy

Reimplemented from utl::Object.

◆ addProduction()

void utl::RDparser::addProduction ( uint_t id,
                                    const String & name,
                                    const String & rhs
                                  )

Add a production rule.

Parameters
id    unique production rule id
name  production name
rhs   right-hand side of production

◆ addTerminal()

void utl::RDparser::addTerminal ( uint_t id,
                                  const String & name,
                                  const String & regex
                                )

Add a terminal symbol.

Parameters
id     unique terminal symbol id
name   terminal name
regex  regular expression for terminal

◆ compile()

bool utl::RDparser::compile ( )

Compile the state machine to parse the grammar.

Returns
true if compilation successful, false otherwise

◆ ok()

bool utl::RDparser::ok ( ) const
inline

Determine whether a properly defined grammar has been compiled.

Returns
true if grammar compiled successfully, false otherwise

Definition at line 139 of file RDparser.h.

References utl::deInit(), and utl::init().

◆ parse() [1/2]

Graph* utl::RDparser::parse ( Stream & stream,
                              const String * prod = nullptr
                            )

Parse text from the given stream until EOF.

Returns
parse tree (Graph of ParseNode)
Parameters
stream  input stream
prod    (optional) root production

◆ parse() [2/2]

Graph* utl::RDparser::parse ( const String & str,
                              const String * prod = nullptr
                            )

Parse the given string.

Returns
parse tree (Graph of ParseNode)
Parameters
str   input string
prod  (optional) root production

The documentation for this class was generated from the following file: