The PToken Class Reference

Class representing a token in the compiled regular expression token stream.

Declaration

class reg::PToken { ... }

Enumerations Index

enum class Kind : uint16_t { ... }

The kind of token.

Public Constructors Index

PToken ()

Creates a token of kind 'End'.

PToken (Kind k)

Creates a token of the given kind k.

PToken (char c)

Creates a token for an ASCII character.

PToken (uint16_t v)

Creates a token for a byte of a UTF-8 character.

PToken (uint16_t from, uint16_t to)

Creates a token representing the range from character 'from' to character 'to'.

Public Member Functions Index

const char * kindStr () const

Returns a string representation of the token's kind (useful for debugging).

void setValue (uint16_t value)

Sets the value for a token.

Kind kind () const

Returns the kind of the token.

uint16_t from () const

Returns the 'from' part of the character range.

uint16_t to () const

Returns the 'to' part of the character range.

uint16_t value () const

Returns the value for this token.

char asciiValue () const

Returns the value for this token as an ASCII character.

bool isRange () const

Returns true iff this token represents a range of characters.

bool isCharClass () const

Returns true iff this token is a positive or negative character class.

Private Member Attributes Index

uint32_t m_rep

Description

Class representing a token in the compiled regular expression token stream.

A token has a kind and an optional value whose meaning depends on the kind. It is also possible to store a (from,to) character range in a token.

Definition at line 58 of file regex.cpp.
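The layout described above can be illustrated with a minimal stand-alone sketch. It assumes the packing visible in the constructors and accessors on this page: the kind (or the 'from' part of a range) occupies the upper 16 bits of the 32-bit representation, and the value (or the 'to' part) occupies the lower 16 bits. The names pack, upperHalf, lowerHalf and demo are hypothetical and do not exist in regex.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical free functions mirroring the packing used by the PToken
// constructors and accessors; they are not part of regex.cpp.
constexpr uint32_t pack(uint16_t upper, uint16_t lower)
{
  return (static_cast<uint32_t>(upper) << 16) | lower;
}
constexpr uint16_t upperHalf(uint32_t rep) { return static_cast<uint16_t>(rep >> 16); }
constexpr uint16_t lowerHalf(uint32_t rep) { return static_cast<uint16_t>(rep & 0xFFFF); }

// Runtime checks of the layout; call from main() to execute them.
inline void demo()
{
  // Kind::Character (0x8000) with value 'a' (0x61): kind above, value below.
  assert(pack(0x8000, 'a') == 0x80000061u);
  // A token built from a kind alone, e.g. Kind::Star (0x4008), has a zero lower half.
  assert(pack(0x4008, 0) == 0x40080000u);
  // A range 'a'..'z' stores 'from' in the half a kind would otherwise occupy.
  assert(upperHalf(pack('a', 'z')) == 'a');
  assert(lowerHalf(pack('a', 'z')) == 'z');
}
```

Because every Kind other than End is at least 0x1001, a small upper half can be told apart from a kind, which is what lets a single 32-bit word hold either form.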

Enumerations

Kind

enum class reg::PToken::Kind : uint16_t
strong

The kind of token.

Enumeration values
End (= 0x0000)
WhiteSpace (= 0x1001)
Digit (= 0x1002)
Alpha (= 0x1003)
AlphaNum (= 0x1004)
CharClass (= 0x2001)
NegCharClass (= 0x2002)
BeginOfLine (= 0x4001)
EndOfLine (= 0x4002)
BeginOfWord (= 0x4003)
EndOfWord (= 0x4004)
BeginCapture (= 0x4005)
EndCapture (= 0x4006)
Any (= 0x4007)
Star (= 0x4008)
Optional (= 0x4009)
Character (= 0x8000)

Ranges per bit mask:

  • 0x00FF (255): 'from' part of a range, except for 0, which is the End marker
  • 0x1FFF (8191): built-in ranges
  • 0x2FFF (12287): user-defined ranges
  • 0x4FFF (20479): special operations
  • 0x8000 (32768): literal character

Definition at line 70 of file regex.cpp.

enum class Kind : uint16_t
{
  End          = 0x0000,
  WhiteSpace   = 0x1001, // \s range [ \t\r\n]
  Digit        = 0x1002, // \d range [0-9]
  Alpha        = 0x1003, // \a range [a-z_A-Z\x80-\xFF]
  AlphaNum     = 0x1004, // \w range [a-z_A-Z0-9\x80-\xFF]
  CharClass    = 0x2001, // []
  NegCharClass = 0x2002, // [^]
  BeginOfLine  = 0x4001, // ^
  EndOfLine    = 0x4002, // $
  BeginOfWord  = 0x4003, // \<
  EndOfWord    = 0x4004, // \>
  BeginCapture = 0x4005, // (
  EndCapture   = 0x4006, // )
  Any          = 0x4007, // .
  Star         = 0x4008, // *
  Optional     = 0x4009, // ?
  Character    = 0x8000  // c
};
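The grouping of the enumeration values by their top four bits can be sketched as a small classifier. This is an illustration only, not code from regex.cpp: the Category enum and the classify and demo functions are hypothetical names, and the 0x1000 threshold separating ranges from kinds is taken from the condition used in kindStr().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical categories named after the bit mask list above; they are not
// part of regex.cpp.
enum class Category { End, Range, BuiltIn, UserDefined, Special, Literal, Unknown };

// Classifies a raw 32-bit token representation by its upper 16 bits.
constexpr Category classify(uint32_t rep)
{
  if (rep == 0) return Category::End;           // the all-zero token is the End marker
  const uint32_t hi = rep >> 16;
  if (hi < 0x1000) return Category::Range;      // same threshold as in kindStr()
  switch (hi & 0xF000)
  {
    case 0x1000: return Category::BuiltIn;      // WhiteSpace, Digit, Alpha, AlphaNum
    case 0x2000: return Category::UserDefined;  // CharClass, NegCharClass
    case 0x4000: return Category::Special;      // anchors, captures, Any, Star, Optional
    case 0x8000: return Category::Literal;      // Character
    default:     return Category::Unknown;      // no Kind uses these bits
  }
}

// Runtime checks; call from main() to execute them.
inline void demo()
{
  assert(classify(0x10020000u) == Category::BuiltIn);     // Digit
  assert(classify(0x20020000u) == Category::UserDefined); // NegCharClass
  assert(classify(0x40090000u) == Category::Special);     // Optional
  assert(classify(0x80000061u) == Category::Literal);     // Character 'a'
  assert(classify(0x0061007Au) == Category::Range);       // range 'a'..'z'
}
```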

Public Constructors

PToken()

reg::PToken::PToken ()
inline

Creates a token of kind 'End'.

Definition at line 124 of file regex.cpp.

PToken() : m_rep(0) {}

Reference m_rep.

PToken()

reg::PToken::PToken (Kind k)
inline explicit

Creates a token of the given kind k.

Definition at line 127 of file regex.cpp.

explicit PToken(Kind k) : m_rep(static_cast<uint32_t>(k)<<16) {}

Reference m_rep.

PToken()

reg::PToken::PToken (char c)
inline

Creates a token for an ASCII character.

Definition at line 130 of file regex.cpp.

PToken(char c) : m_rep((static_cast<uint32_t>(Kind::Character)<<16) |
                       static_cast<uint32_t>(c)) {}

References Character and m_rep.

PToken()

reg::PToken::PToken (uint16_t v)
inline

Creates a token for a byte of a UTF-8 character.

Definition at line 134 of file regex.cpp.

PToken(uint16_t v) : m_rep((static_cast<uint32_t>(Kind::Character)<<16) |
                           static_cast<uint32_t>(v)) {}

References Character and m_rep.
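The two single-character constructors use the same expression and differ only in the parameter type. The following minimal sketch, which is not code from regex.cpp, shows what that difference means for byte values of 0x80 and above. On platforms where char is signed, such a byte is negative, and casting it to uint32_t sign-extends into the bits that hold the kind. The sketch uses signed char so the result does not depend on the platform's char signedness; repFromSignedChar, repFromU16, byteValue and demo are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kCharacter = 0x8000; // value of Kind::Character in the listing

// Mirrors the expression in PToken(char), with an explicitly signed parameter.
constexpr uint32_t repFromSignedChar(signed char c)
{
  return (kCharacter << 16) | static_cast<uint32_t>(c);
}

// Mirrors the expression in PToken(uint16_t).
constexpr uint32_t repFromU16(uint16_t v)
{
  return (kCharacter << 16) | static_cast<uint32_t>(v);
}

// Hypothetical helper: converts a raw byte to its unsigned value (0..255).
constexpr uint16_t byteValue(char c)
{
  return static_cast<uint16_t>(static_cast<unsigned char>(c));
}

// Runtime checks; call from main() to execute them.
inline void demo()
{
  // ASCII input: both paths agree.
  assert(repFromSignedChar('a') == 0x80000061u);
  assert(repFromU16('a') == 0x80000061u);
  // A negative char value sign-extends and overwrites the kind bits.
  assert(repFromSignedChar(-61) == 0xFFFFFFC3u);
  assert((repFromSignedChar(-61) >> 16) == 0xFFFFu);
  // Going through an unsigned byte value keeps the kind intact.
  assert(repFromU16(byteValue('\xC3')) == 0x800000C3u);
}
```

Under these assumptions, a caller holding a UTF-8 byte in a char would convert it to an unsigned value first so that the uint16_t overload is selected.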

PToken()

reg::PToken::PToken (uint16_t from, uint16_t to)
inline

Creates a token representing the range from character 'from' to character 'to'.

Definition at line 138 of file regex.cpp.

PToken(uint16_t from,uint16_t to) : m_rep(static_cast<uint32_t>(from)<<16 | to) {}

References from, m_rep and to.

Public Member Functions

asciiValue()

char reg::PToken::asciiValue ()
inline

Returns the value for this token as an ASCII character.

Definition at line 156 of file regex.cpp.

char asciiValue() const { return static_cast<char>(m_rep); }

Reference m_rep.

Referenced by reg::Ex::Private::compile, reg::Ex::match and reg::Ex::Private::matchAt.

from()

uint16_t reg::PToken::from ()
inline

Returns the 'from' part of the character range.

Only valid if this token represents a range.

Definition at line 147 of file regex.cpp.

uint16_t from() const { return m_rep>>16; }

Reference m_rep.

Referenced by isRange, reg::Ex::Private::matchAt and PToken.

isCharClass()

bool reg::PToken::isCharClass ()
inline

Returns true iff this token is a positive or negative character class.

Definition at line 162 of file regex.cpp.

bool isCharClass() const { return kind()==Kind::CharClass || kind()==Kind::NegCharClass; }

References CharClass, kind and NegCharClass.

Referenced by reg::Ex::Private::matchAt.

isRange()

bool reg::PToken::isRange ()
inline

Returns true iff this token represents a range of characters.

Definition at line 159 of file regex.cpp.

bool isRange() const { return m_rep!=0 && from()<=to(); }

References from, m_rep and to.
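The isRange() test can be tried out in isolation with the sketch below, which re-implements the same expression on a raw 32-bit value. It is an illustration under the bit layout shown on this page, not code from regex.cpp. makeRange, contains and demo are hypothetical names, and contains is an assumed membership test that is not taken from the matcher.

```cpp
#include <cassert>
#include <cstdint>

// Same packing as PToken(uint16_t from, uint16_t to).
constexpr uint32_t makeRange(uint16_t from, uint16_t to)
{
  return (static_cast<uint32_t>(from) << 16) | to;
}

// Same expression as PToken::isRange(), applied to a raw representation.
constexpr bool isRange(uint32_t rep)
{
  return rep != 0 && (rep >> 16) <= (rep & 0xFFFF);
}

// Hypothetical helper: true if c lies within the inclusive range stored in rep.
constexpr bool contains(uint32_t rep, uint16_t c)
{
  return isRange(rep) && (rep >> 16) <= c && c <= (rep & 0xFFFF);
}

// Runtime checks; call from main() to execute them.
inline void demo()
{
  assert(isRange(makeRange('a', 'z')));
  assert(!isRange(makeRange('z', 'a')));   // reversed bounds are rejected
  assert(!isRange(makeRange(0, 0)));       // all-zero is the End marker
  assert(!isRange(0x80000061u));           // Character 'a': 0x8000 > 0x61
  assert(contains(makeRange('0', '9'), '5'));
  assert(!contains(makeRange('0', '9'), 'a'));
}
```

Tokens that carry a kind fail the test as long as their value is smaller than the kind's numeric code, which holds for the byte-sized values shown in the checks above.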

kind()

Kind reg::PToken::kind ()
inline

Returns the kind of the token.

Definition at line 144 of file regex.cpp.

Kind kind() const { return static_cast<Kind>(m_rep>>16); }

Reference m_rep.

Referenced by reg::Ex::Private::compile, isCharClass, reg::Ex::match and reg::Ex::Private::matchAt.

kindStr()

const char * reg::PToken::kindStr ()
inline

Returns a string representation of the token's kind (useful for debugging).

Definition at line 92 of file regex.cpp.

const char *kindStr() const
{
  if ((m_rep>>16)>=0x1000 || m_rep==0)
  {
    switch(static_cast<Kind>((m_rep>>16)))
    {
      case Kind::End:          return "End";
      case Kind::Alpha:        return "Alpha";
      case Kind::AlphaNum:     return "AlphaNum";
      case Kind::WhiteSpace:   return "WhiteSpace";
      case Kind::Digit:        return "Digit";
      case Kind::CharClass:    return "CharClass";
      case Kind::NegCharClass: return "NegCharClass";
      case Kind::Character:    return "Character";
      case Kind::BeginOfLine:  return "BeginOfLine";
      case Kind::EndOfLine:    return "EndOfLine";
      case Kind::BeginOfWord:  return "BeginOfWord";
      case Kind::EndOfWord:    return "EndOfWord";
      case Kind::BeginCapture: return "BeginCapture";
      case Kind::EndCapture:   return "EndCapture";
      case Kind::Any:          return "Any";
      case Kind::Star:         return "Star";
      case Kind::Optional:     return "Optional";
    }
  }
  else
  {
    return "Range";
  }
}

References Alpha, AlphaNum, Any, BeginCapture, BeginOfLine, BeginOfWord, Character, CharClass, Digit, End, EndCapture, EndOfLine, EndOfWord, m_rep, NegCharClass, Optional, Star and WhiteSpace.

Referenced by reg::Ex::Private::matchAt.

setValue()

void reg::PToken::setValue (uint16_t value)
inline

Sets the value for a token.

Definition at line 141 of file regex.cpp.

void setValue(uint16_t value) { m_rep = (m_rep & 0xFFFF0000) | value; }

References m_rep and value.
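The masking in setValue() replaces the lower 16 bits and leaves the kind in the upper 16 bits untouched. A minimal sketch of that expression as a pure function follows; withValue and demo are hypothetical names, and the value 3 is an arbitrary example rather than a value the matcher is known to use.

```cpp
#include <cassert>
#include <cstdint>

// Same expression as PToken::setValue(), written as a pure function.
constexpr uint32_t withValue(uint32_t rep, uint16_t value)
{
  return (rep & 0xFFFF0000) | value;
}

// Runtime checks; call from main() to execute them.
inline void demo()
{
  // Representation of PToken(Kind::CharClass): kind 0x2001, value 0.
  const uint32_t charClass = 0x2001u << 16;
  // Setting a value fills the lower half only.
  assert(withValue(charClass, 3) == 0x20010003u);
  // A previous value is cleared completely before the new one is stored.
  assert(withValue(0x2001FFFFu, 0) == 0x20010000u);
  // Even the largest value leaves the kind bits unchanged.
  assert((withValue(charClass, 0xFFFF) >> 16) == 0x2001u);
}
```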

to()

uint16_t reg::PToken::to ()
inline

Returns the 'to' part of the character range.

Only valid if this token represents a range.

Definition at line 150 of file regex.cpp.

uint16_t to() const { return m_rep & 0xFFFF; }

Reference m_rep.

Referenced by isRange, reg::Ex::Private::matchAt and PToken.

value()

uint16_t reg::PToken::value ()
inline

Returns the value for this token.

Definition at line 153 of file regex.cpp.

uint16_t value() const { return m_rep & 0xFFFF; }

Reference m_rep.

Referenced by reg::Ex::Private::compile, reg::Ex::Private::matchAt and setValue.

Private Member Attributes

m_rep

uint32_t reg::PToken::m_rep

Definition at line 165 of file regex.cpp.

uint32_t m_rep;

Referenced by asciiValue, from, isRange, kind, kindStr, PToken, PToken, PToken, PToken, PToken, setValue, to and value.


The documentation for this class was generated from the following file:


Generated via doxygen2docusaurus by Doxygen 1.14.0.