Derivation Tree Reference#
Fandango constraints make use of symbols (anything enclosed in <...>) to express desired or required properties of generated inputs.
During evaluation, any <SYMBOL> returns a derivation tree representing the composition of the produced or parsed string.
This derivation tree has a type of DerivationTree, as do all subtrees.
DerivationTree objects support 264 functions, methods, and operators directly; after converting DerivationTree objects into standard Python types such as str, the entire Python ecosystem is available.
Derivation Tree Structure#
Let’s have a look at the structure of a derivation tree. Consider this derivation tree from the ISO 8601 date and time grammar:
The elements of the tree are designated as follows:
- Node
Any connected element of a tree.
- Child
An immediate descendant of a node. In the above tree, the
<start>node has one child,<iso8601datetime>.- Descendant
A child of a node, or a descendant of one of its children.
<iso8601month>is a descendant of<iso8601date>. All nodes (except the root node) are descendants of the root node.- Parent
The parent of a node \(N\) is the node that has \(N\) as a child.
<start>is the parent of<iso8601datetime>.- Root
The node without parent; typically
<start>.- Terminal Symbol
A node with at least one child.
- Nonterminal Symbol
A node without children.
Concatenating all terminal symbols (using str(<SYMBOL>)) in a derivation tree yields the string represented. For the above tree, this would be 2025-10-27.
Evaluating Derivation Trees#
To write constraints, you may want to serialize derivation trees into standard Python types such as strings, bytes, or ints. To do so, simply call str(<SYMBOL>), bytes(<SYMBOL>), or int(<SYMBOL>) respectively.
Internally, the serialization is done in str objects as long as the sub-tree only contains string values, and in bytes otherwise. During concatenation, strings are converted to bytes in with utf-8 encoding, for conversion from bytes to strings, latin-1 is used to prevent breaking non-utf-8 strings. If you would like to convert them using utf-8, you can call <SYMBOL>.to_string("utf-8").
Concatenation of non-bit types with bit types always triggers Fandango to convert the remaining trailing bits into bytes (with the first bit being the most significant). If the number of trailing bits isn’t a multiple of 8, Fandango will throw an error.
int(<SYMBOL>) on a subtree consisting only of strings will attempt to parse the string as a number. If the tree consists of bits only, it will be converted to a number with the first bit being the most significant. Bytes are converted to strings and then treated accordingly.
Previous versions of Fandango automatically converted certain values — this no longer works to prevent accidental mistakes while writing constraints.
Tip
It is generally unwise to rely on the type of the internal representation of linearized DerivationTree values as these may change with future expansions of Fandango. Use functions directly implemented on them (<SYMBOL>.startswith("Hello")), or explicitly convert them to a base type in Python instead (str(<SYMBOL>)).
General DerivationTree Functions#
These functions are available for all DerivationTree objects, regardless of the type they evaluate into.
Converters#
Since any <SYMBOL> has the type DerivationTree, one must convert it first into a standard Python type before passing it as argument to a standard Python function.
str(<SYMBOL>) -> strConvert
<SYMBOL>into a Unicode string. Byte strings in<SYMBOL>are converted usinglatin-1encoding.<SYMBOL>.to_string(encoding: str = "latin-1") -> strMore flexible alternative to
str(<SYMBOL>)bytes(<SYMBOL>) -> bytesConvert
<SYMBOL>into a byte string. Unicode strings in<SYMBOL>are converted usingutf-8encoding.<SYMBOL>.to_bytes(encoding: str = "utf-8") -> bytesMore flexible alternative to
bytes(<SYMBOL>)int(<SYMBOL>) -> intConvert
<SYMBOL>into an integer, like the Pythonint()function.<SYMBOL>must be exclusively bits, or a Unicode string or byte string representing an integer literal.<SYMBOL>.to_int(encoding: str = "utf-8") -> intMore flexible alternative to
int(<SYMBOL>)<SYMBOL>.should_be_serialized_to_bytes() -> boolReturns
Trueif the sub-tree contains bits or bytes and thus is natively serialized to bytes<SYMBOL>.contains_bits() -> boolReturns
Trueif the sub-tree contains bits<SYMBOL>.contains_bytes() -> boolReturns
Trueif the sub-tree contains bytes<SYMBOL>.to_bits(encoding: str = "utf-8") -> intProvides a bitwise representation of the input, in a string of
0s and1s<SYMBOL>.is_terminal() -> boolTrue if
<SYMBOL>is a terminal node.<SYMBOL>.is_nonterminal() -> boolTrue if
<SYMBOL>is a nonterminal node.
Accessing Children#
len(<SYMBOL>) -> intReturn the number of children of
<SYMBOL>.
Important
To access the length of the string represented by <SYMBOL>, use len(str(<SYMBOL>)).
<SYMBOL>[n] -> DerivationTreeAccess the
nth child of<SYMBOL>, as aDerivationTree.<SYMBOL>[0]is the first child;<SYMBOL>[-1]is the last child.
Important
To access the nth character of <SYMBOL>, use str(<SYMBOL>)[n].
<SYMBOL>[start:stop] -> DerivationTreeReturn a new
DerivationTreewhich has the children<SYMBOL>[start]to<SYMBOL>[stop-1]as children. Ifstartis omitted, children start from the beginning; ifstopis omitted, children go up to the end, including the last one.<SYMBOL>.children() -> list[DerivationTree]Return a list containing all children of
<SYMBOL>.<SYMBOL>.children_values() -> list[TreeValue]Return a list containing the values of all children of
<SYMBOL>.
Note
TreeValue objects can be converted into standard Python objects with str(), bytes(), or int(). Alternatively, use the same base functionality from str, bytes, and int on them directly that is implemented on entire DerivationTrees.
<SYMBOL_1> in <SYMBOL_2>Return True if
<SYMBOL_1> == CHILDfor any of the children of<SYMBOL_2>.VALUE in <SYMBOL>Return True if
VALUE == CHILD.value()for any of the children of<SYMBOL>.
Accessing Descendants#
<SYMBOL>.descendants() -> list[DerivationTree]Return a list containing all descendants of
<SYMBOL>; that is, all children and their transitive children.<SYMBOL>.descendant_values() -> list[TreeValue]Return a list containing the values of all descendants of
<SYMBOL>; that is, the values of all children and their transitive children.
Note
TreeValue objects can be converted into standard Python objects with str(), bytes(), or int(). Alternatively, use the same base functionality from str, bytes, and int on them directly that is implemented on entire DerivationTrees.
Accessing Parents#
<SYMBOL>.parent() -> DerivationTree | NoneReturn the parent of the current node, or
Nonefor the root node.
Accessing Sources#
<SYMBOL>.sources() -> list[DerivationTree]Return a list containing all sources of
<SYMBOL>. Sources are symbols used in generator expressions out of which the value of<SYMBOL>was created; see the section on data conversions for details.
Comparisons#
<SYMBOL_1> == <SYMBOL_2>Returns True if both trees have the same structure and all nodes have the same values.
<SYMBOL> == VALUEReturns True if
<SYMBOL>.value() == VALUE— requires the types to match<SYMBOL_1> != <SYMBOL_2>Returns True if both trees have a different structure or any nodes have different values.
<SYMBOL> != VALUEInverse of
<SYMBOL> == VALUE.
To compare values with <, >, <=, >=, etc., explicitly convert them to the type you would like to use for comparison first.
Debugging#
See the section on output formats for details on these representations.
<SYMBOL>.to_bits() -> strReturn a bit representation (
0and1characters) of<SYMBOL>.<SYMBOL>.to_grammar() -> strReturn a grammar-like representation of
<SYMBOL>.<SYMBOL>.to_tree() -> strReturn a tree representation of
<SYMBOL>, usingTree(...)constructors.repr(<SYMBOL>) -> strReturn the internal representation of
<SYMBOL>, as aDerivationTreeconstructor that can be evaluated as a Python expression.
Type-Specific Functions#
The bulk of available functions comes from the Python standard library. The vast majority of methods implemented on str, bytes, and int are also implemented on DerivationTrees:
If a method is only implemented on one of the three base types, the
DerivationTreeinternally is first transformed into that type (usingstr(<SYMBOL>)etc.).If a method is implemented on multiple base types, but they always have different signatures (such as
.startswith, which takes astrif called on astr, andbytesif called onbytes), theDerivationTreeis transformed into the appropriate base type.If a method is implemented on multiple base types, and there is no way to distinguish based on the signatures, the method is envoked on the underlying type. This is not recommended as these methods rely on knowledge of the type of the internal representation (which may change in the future). Simply convert the tree to the desired base type first (
str(<SYMBOL>).upper()).