Converting ANTLR and Other Input Specs
Often, you may already have an input format specification available, but not (yet) in Fandango .fan format.
Fandango’s convert command allows you to automatically translate common input specifications into .fan files, or at least most of their contents.
Important
All these converters are lossy - that is, some features of the original specifications may not be converted into Fandango. Hence, the idea is that you use converted formats as a base for further manual editing and checking.
Note
All these formats define the syntax of input files, typically for the purpose of parsing.
To produce inputs that are also semantically valid, you will often have to augment the .fan files with constraints.
Under Construction
All these converters are experimental at this point.
Converting ANTLR Specs
Fandango allows you to automatically convert ANTLR grammar specifications (.g4, .antlr) into Fandango .fan files.
ANTLR is a very popular parser generator; a large collection of ANTLR grammars is available.
Simply use the command fandango convert, followed by the ANTLR file to be converted.
As an example, consider this simple Calculator.g4 ANTLR file:
// https://www.inovex.de/de/blog/building-a-simple-calculator-with-antlr-in-python/
grammar Calculator;
expression
: NUMBER # Number
| '(' expression ')' # Parentheses
| expression TIMES expression # Multiplication
| expression DIV expression # Division
| expression PLUS expression # Addition
| expression MINUS expression # Subtraction
;
PLUS : '+';
MINUS: '-';
TIMES: '*';
DIV : '/';
NUMBER : [0-9]+;
WS : [ \r\n\t]+ -> skip;
Invoking fandango convert produces an (almost) equivalent Fandango .fan file:
$ fandango convert Calculator.g4
# Automatically generated from '../src/fandango/converters/antlr/Calculator.g4'.
#
# Calculator
<expression> ::= <NUMBER> | '(' <expression> ')' | <expression> <TIMES> <expression> | <expression> <DIV> <expression> | <expression> <PLUS> <expression> | <expression> <MINUS> <expression>
<PLUS> ::= '+'
<MINUS> ::= '-'
<TIMES> ::= '*'
<DIV> ::= '/'
<NUMBER> ::= r'[0-9]'+
<WS> ::= r'[ \r\n\t]'+ # NOTE: was '-> skip'
Note the NOTE comment at the bottom: the ANTLR lexer action skip has no equivalent in Fandango; hence, WS elements will neither be skipped nor generated.
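To make the mapping above more tangible, here is a minimal Python sketch that translates single-line ANTLR lexer rules into the .fan notation shown in the converted output. It is a toy illustration only, not Fandango's converter: the function name convert_lexer_rule and the regular expressions are assumptions made for this example, and it handles only quoted literals, simple character classes, and a trailing lexer command.

```python
import re

# Hypothetical helper patterns for this sketch (not part of Fandango).
# A single-line ANTLR lexer rule such as "NUMBER : [0-9]+;".
LEXER_RULE = re.compile(r"^\s*([A-Z][A-Za-z0-9_]*)\s*:\s*(.*?)\s*;\s*$")
# A trailing lexer command such as "-> skip".
LEXER_COMMAND = re.compile(r"\s*->\s*(\w+)\s*$")
# An ANTLR character class such as "[0-9]".
CHAR_CLASS = re.compile(r"\[[^\]]*\]")


def convert_lexer_rule(line: str) -> str:
    """Translate one simple ANTLR lexer rule into a .fan-style rule."""
    match = LEXER_RULE.match(line)
    if match is None:
        raise ValueError(f"not a simple lexer rule: {line!r}")
    name, body = match.groups()

    # Lexer commands have no .fan equivalent; keep them as a NOTE comment.
    note = ""
    command = LEXER_COMMAND.search(body)
    if command is not None:
        note = f" # NOTE: was '-> {command.group(1)}'"
        body = body[: command.start()]

    # Character classes become regular-expression literals.
    body = CHAR_CLASS.sub(lambda m: f"r'{m.group(0)}'", body)
    return f"<{name}> ::= {body}{note}"


if __name__ == "__main__":
    for rule in ["PLUS : '+';", "NUMBER : [0-9]+;", r"WS : [ \r\n\t]+ -> skip;"]:
        print(convert_lexer_rule(rule))
```

The sketch keeps the information that a lexer command was present but drops its effect, which is the same kind of loss the NOTE comment signals in the real output.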
Still, we can use this grammar to produce expressions.
Note the usage of the -o option to specify an output file and the --start option to specify a start symbol.
$ fandango convert -o Calculator.fan Calculator.g4
$ fandango fuzz -f Calculator.fan --start='<expression>' -n 10
fandango:WARNING: Symbol <WS> defined, but not used
(1428)/2/173+0711/47
((6))*47-92*6938-72
81058+96206/99-8686
806
430/((79))
7647-((4189))-96*05171
((((1)*34+0)))+481
59
0
(858)
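One way to sanity-check such outputs outside of Fandango is a small hand-written recognizer for the calculator language. The following Python sketch is an assumption-laden illustration (the names tokenize and is_expression are made up for this example): it accepts a number or a parenthesized expression, followed by any number of operator/operand pairs, which describes the same set of strings as the grammar above. Like the converted .fan grammar, it does not skip whitespace.

```python
import re

# Tokens of the calculator language: numbers, parentheses, and operators.
TOKEN = re.compile(r"[0-9]+|[()+\-*/]")


def tokenize(text: str) -> list:
    """Split text into tokens; reject any character that is not a token."""
    tokens = TOKEN.findall(text)
    if "".join(tokens) != text:
        raise ValueError(f"unexpected characters in {text!r}")
    return tokens


def is_expression(text: str) -> bool:
    """Return True if text is a calculator expression (no whitespace allowed)."""
    try:
        tokens = tokenize(text)
    except ValueError:
        return False
    pos = 0

    def atom() -> bool:
        # atom ::= NUMBER | '(' expression ')'
        nonlocal pos
        if pos < len(tokens) and tokens[pos].isdigit():
            pos += 1
            return True
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1
            if not expression():
                return False
            if pos < len(tokens) and tokens[pos] == ")":
                pos += 1
                return True
        return False

    def expression() -> bool:
        # expression ::= atom (operator atom)*
        nonlocal pos
        if not atom():
            return False
        while pos < len(tokens) and tokens[pos] in "+-*/":
            pos += 1
            if not atom():
                return False
        return True

    return expression() and pos == len(tokens)


if __name__ == "__main__":
    for sample in ["(1428)/2/173+0711/47", "((((1)*34+0)))+481", "1 + 2"]:
        print(sample, is_expression(sample))
```

Note that an input such as 1 + 2 is rejected: since the skip action was lost in conversion, whitespace is not part of the language the .fan grammar describes.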
Note
Most features of ANTLR that cannot be represented in Fandango will be marked by NOTE comments.
These include:
Actions
Modifiers
Clauses such as return or throws
Exceptions
Predicate options
Element options
Negations (~) over complex expressions
Converting 010 Binary Templates
Fandango provides some basic support for converting Binary Templates (.bt, .010) for the 010 Editor.
A large collection of binary templates for various binary formats is available.
Again, simply use the command fandango convert, followed by the binary template file to be converted.
Our GIF example is automatically created from a GIF binary template.
Note
010 Binary Templates can contain arbitrary code that will be executed during parsing. Fandango will recognize a number of common patterns; features that will require manual work include
Checksums
Complex length encodings
Note
The fandango convert command provides two options to specify byte and bit orderings, should the .bt file not already do so: --endianness=(little|big) and --bitfield-order=(left-to-right|right-to-left).
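To illustrate what these two orderings mean for the resulting bytes, here is a small Python sketch. It does not use Fandango; the helper names encode_u16 and pack_bitfields are hypothetical. It also assumes the common reading of the option values: with left-to-right, the first bitfield occupies the most significant bits, and with right-to-left, the least significant bits.

```python
def encode_u16(value: int, endianness: str) -> bytes:
    """Encode a 16-bit unsigned integer as 'little' or 'big' endian bytes."""
    return value.to_bytes(2, byteorder=endianness)


def pack_bitfields(fields, order: str) -> int:
    """Pack (value, width) pairs into one integer.

    Assumption for this sketch: 'left-to-right' places the first field in the
    most significant bits, 'right-to-left' in the least significant bits.
    """
    total = sum(width for _, width in fields)
    result = 0
    offset = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit into {width} bits")
        if order == "left-to-right":
            shift = total - offset - width
        elif order == "right-to-left":
            shift = offset
        else:
            raise ValueError(f"unknown bitfield order: {order!r}")
        result |= value << shift
        offset += width
    return result


if __name__ == "__main__":
    print(encode_u16(0x1234, "little").hex())  # 3412
    print(encode_u16(0x1234, "big").hex())     # 1234
    fields = [(0b101, 3), (0b00011, 5)]        # a 3-bit field, then a 5-bit field
    print(f"{pack_bitfields(fields, 'left-to-right'):08b}")   # 10100011
    print(f"{pack_bitfields(fields, 'right-to-left'):08b}")   # 00011101
```

The same field values thus produce different bytes depending on the chosen orderings, which is why the options matter when the template leaves them unspecified.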
Converting DTDs
A Document Type Definition (DTD, .dtd) specifies the format of an XML file.
Fandango can convert these into .fan files, enabling the production of XML files that conform to the DTD.
Again, simply use the command fandango convert, followed by the DTD file to be converted.
Note
As with Binary Templates, Fandango will recognize a number of common patterns, but not all.
In the generated .fan file, you can customize every single element in its context.
As an example, consider this svg11.fan file, which specializes individual elements of a svg.fan file generated from an SVG DTD.
The DTD by itself does not specify types of individual fields, so we do this here:
include('svg.fan')
# Add standard blurb at top
<start> ::= ('<?xml version="1.0" standalone="no"?>'
'<!DOCTYPE svg>' <svg>)
<svg> ::= ('<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"'
' width=' <svg_width_value>
' height=' <svg_height_value>
' baseProfile="full" viewBox=' <svg_viewBox_value> '>'
(<desc> | <title> | <metadata> | <animate> | <set> | <animateMotion> | <animateColor> | <animateTransform> | <svg> | <g> | <defs> | <symbol> | <use> | <switch> | <image> | <style> | <path> | <rect> | <circle> | <line> | <ellipse> | <polyline> | <polygon> | <text> | <altGlyphDef> | <marker> | <color_profile> | <linearGradient> | <radialGradient> | <pattern> | <clipPath> | <mask> | <filter> | <cursor> | <a> | <view> | <script> | <font> | <font_face> | <foreignObject>){10} '</svg>')
# Standard data types
<cdata> ::= <qint> | <string>
<qnat> ::= <q> <nat> <q>
<nat> ::= r'[1-9]' <digit>* | '0'
<qint> ::= <q> <int> <q>
<int> ::= r'[1-9]' <digit>* | '-' r'[1-9]' <digit>* | '0'
<string> ::= '"' <char>* '"' | "'" <char>* "'"
<char> ::= r'[0-9a-zA-Z_-]+'
<id> ::= <q> <ascii_letter> (<ascii_letter> | <digit> | '_')* <q>
<nmtoken> ::= <id>
<pcdata> ::= <cdata>
<url> ::= <q> 'https://cispa.de' <q>
<qpercentage> ::= <q> <percentage> <q>
<percentage> ::= ("0" | r"[1-9][0-9]?" | "100")
# SVG-specific data types
<Coordinate_datatype> ::= <qint> := "'100'"
<Length_datatype> ::= <qnat>
<FontFamilyValue_datatype> ::= <string> := '"sans-serif"'
<FontSizeValue_datatype> ::= <qnat> := "'12'"
<FontSizeAdjustValue_datatype> ::= <qnat> := "'0'"
<GlyphOrientationHorizontalValue_datatype> ::= <qint> := "'0'"
<GlyphOrientationVerticalValue_datatype> ::= <qint> := "'0'"
<Number_datatype> ::= <qint>
<NumberOptionalNumber_datatype> ::= <qint>
<OpacityValue_datatype> ::= <qpercentage> := "'100'"
<PathData_datatype> ::= <q> (<int> <ws>)+ <q>
<Text_datatype> ::= <string>
<Script_datatype> ::= <string>
<SVGColor_datatype> ::= <q> '#' (<hexdigit>{3} | <hexdigit>{6}) <q>
# Mappings of attributes to data types
<accent_height_value> ::= <Number_datatype>
<alphabetic_value> ::= <Number_datatype>
<amplitude_value> ::= <Number_datatype>
<arabic_form_value> ::= <cdata>
<arcrole_value> ::= <cdata>
<ascent_value> ::= <Number_datatype>
<attributeName_value> ::= <cdata>
<attributeType_value> ::= <cdata>
<azimuth_value> ::= <Number_datatype>
<baseFrequency_value> ::= <NumberOptionalNumber_datatype>
<baseProfile_value> ::= <Text_datatype>
<base_value> ::= <cdata>
<baseline_shift_value> ::= <cdata>
<bbox_value> ::= <cdata>
<begin_value> ::= <cdata>
<bias_value> ::= <Number_datatype>
<by_value> ::= <cdata>
<cap_height_value> ::= <Number_datatype>
<class_value> ::= <cdata>
<clip_path_value> ::= <cdata>
<clip_value> ::= <cdata>
<color_profile_value> ::= <cdata>
<color_value> ::= <cdata>
<contentScriptType_value> ::= <cdata>
<contentStyleType_value> ::= <cdata>
<cursor_value> ::= <cdata>
<cx_value> ::= <Coordinate_datatype>
<cy_value> ::= <Coordinate_datatype>
<d_value> ::= <PathData_datatype>
<descent_value> ::= <Number_datatype>
<diffuseConstant_value> ::= <Number_datatype>
<divisor_value> ::= <Number_datatype>
<dur_value> ::= <cdata>
<dx_value> ::= <Number_datatype>
<dy_value> ::= <Number_datatype>
<elevation_value> ::= <Number_datatype>
<enable_background_value> ::= <cdata>
<end_value> ::= <cdata>
<exponent_value> ::= <Number_datatype>
<fePointLight_z_value> ::= <Number_datatype>
<fePointLight_y_value> ::= <Number_datatype>
<fePointLight_x_value> ::= <Number_datatype>
<feSpotLight_z_value> ::= <Number_datatype>
<feSpotLight_y_value> ::= <Number_datatype>
<feSpotLight_x_value> ::= <Number_datatype>
<fill_opacity_value> ::= <OpacityValue_datatype>
<fill_value> ::= <cdata>
<filterRes_value> ::= <NumberOptionalNumber_datatype>
<filter_value> ::= <cdata>
<flood_color_value> ::= <SVGColor_datatype>
<flood_opacity_value> ::= <OpacityValue_datatype>
<font_family_value> ::= <FontFamilyValue_datatype>
<font_size_adjust_value> ::= <FontSizeAdjustValue_datatype>
<font_size_value> ::= <FontSizeValue_datatype>
<font_stretch_value> ::= <cdata>
<font_style_value> ::= <cdata>
<font_variant_value> ::= <cdata>
<font_weight_value> ::= <cdata>
<format_value> ::= <cdata>
<from_value> ::= <cdata>
<fx_value> ::= <Coordinate_datatype>
<fy_value> ::= <Coordinate_datatype>
<g1_value> ::= <cdata>
<g2_value> ::= <cdata>
<glyphRef_value> ::= <cdata>
<glyph_name_value> ::= <cdata>
<glyph_orientation_horizontal_value> ::= <GlyphOrientationHorizontalValue_datatype>
<glyph_orientation_vertical_value> ::= <GlyphOrientationVerticalValue_datatype>
<gradientTransform_value> ::= <cdata>
<hanging_value> ::= <Number_datatype>
<height_value> ::= <Number_datatype>
<horiz_adv_x_value> ::= <Number_datatype>
<horiz_origin_x_value> ::= <Number_datatype>
<horiz_origin_y_value> ::= <Number_datatype>
<href_value> ::= <url>
<id_value> ::= <id>
<ideographic_value> ::= <Number_datatype>
<in2_value> ::= <cdata>
<in_value> ::= <cdata>
<intercept_value> ::= <Number_datatype>
<k1_value> ::= <Number_datatype>
<k2_value> ::= <Number_datatype>
<k3_value> ::= <Number_datatype>
<k4_value> ::= <Number_datatype>
<k_value> ::= <Number_datatype>
<kernelMatrix_value> ::= <cdata>
<kernelUnitLength_value> ::= <NumberOptionalNumber_datatype>
<kerning_value> ::= <cdata>
<keyPoints_value> ::= <cdata>
<keySplines_value> ::= <cdata>
<keyTimes_value> ::= <cdata>
<lang_value> ::= <nmtoken>
<letter_spacing_value> ::= <cdata>
<lighting_color_value> ::= <SVGColor_datatype>
<limitingConeAngle_value> ::= <Number_datatype>
<local_value> ::= <cdata>
<markerHeight_value> ::= <Length_datatype>
<markerWidth_value> ::= <Length_datatype>
<marker_end_value> ::= <cdata>
<marker_mid_value> ::= <cdata>
<marker_start_value> ::= <cdata>
<mask_value> ::= <cdata>
<mathematical_value> ::= <Number_datatype>
<max_value> ::= <cdata>
<media_value> ::= <cdata>
<min_value> ::= <cdata>
<name_value> ::= <cdata>
<numOctaves_value> ::= <cdata>
<offset_value> ::= <Number_datatype>
<onabort_value> ::= <Script_datatype>
<onactivate_value> ::= <Script_datatype>
<onbegin_value> ::= <Script_datatype>
<onclick_value> ::= <Script_datatype>
<onend_value> ::= <Script_datatype>
<onerror_value> ::= <Script_datatype>
<onfocusin_value> ::= <Script_datatype>
<onfocusout_value> ::= <Script_datatype>
<onload_value> ::= <Script_datatype>
<onmousedown_value> ::= <Script_datatype>
<onmousemove_value> ::= <Script_datatype>
<onmouseout_value> ::= <Script_datatype>
<onmouseover_value> ::= <Script_datatype>
<onmouseup_value> ::= <Script_datatype>
<onrepeat_value> ::= <Script_datatype>
<onresize_value> ::= <Script_datatype>
<onscroll_value> ::= <Script_datatype>
<onunload_value> ::= <Script_datatype>
<onzoom_value> ::= <Script_datatype>
<opacity_value> ::= <OpacityValue_datatype>
<order_value> ::= <Number_datatype>
<orient_value> ::= <cdata>
<orientation_value> ::= <cdata>
<origin_value> ::= <cdata>
<overline_position_value> ::= <Number_datatype>
<overline_thickness_value> ::= <Number_datatype>
<panose_1_value> ::= <cdata>
<pathLength_value> ::= <Number_datatype>
<path_value> ::= <cdata>
<patternTransform_value> ::= <cdata>
<pointsAtX_value> ::= <Number_datatype>
<pointsAtY_value> ::= <Number_datatype>
<pointsAtZ_value> ::= <Number_datatype>
<points_value> ::= <cdata>
<preserveAspectRatio_value> ::= <cdata>
<r_value> ::= <Length_datatype>
<radius_value> ::= <Number_datatype>
<refX_value> ::= <cdata>
<refY_value> ::= <cdata>
<repeatCount_value> ::= <cdata>
<repeatDur_value> ::= <cdata>
<requiredExtensions_value> ::= <cdata>
<requiredFeatures_value> ::= <cdata>
<result_value> ::= <cdata>
<role_value> ::= <cdata>
<rotate_value> ::= <cdata>
<rx_value> ::= <Length_datatype>
<ry_value> ::= <Length_datatype>
<scale_value> ::= <Number_datatype>
<seed_value> ::= <Number_datatype>
<slope_value> ::= <Number_datatype>
<specularConstant_value> ::= <Number_datatype>
<specularExponent_value> ::= <Number_datatype>
<startOffset_value> ::= <Length_datatype>
<stdDeviation_value> ::= <NumberOptionalNumber_datatype>
<stemh_value> ::= <Number_datatype>
<stemv_value> ::= <Number_datatype>
<stop_color_value> ::= <SVGColor_datatype>
<stop_opacity_value> ::= <OpacityValue_datatype>
<strikethrough_position_value> ::= <Number_datatype>
<strikethrough_thickness_value> ::= <Number_datatype>
<string_value> ::= <cdata>
<stroke_dasharray_value> ::= <cdata>
<stroke_dashoffset_value> ::= <cdata>
<stroke_miterlimit_value> ::= <cdata>
<stroke_opacity_value> ::= <OpacityValue_datatype>
<stroke_value> ::= <cdata>
<stroke_width_value> ::= <Number_datatype>
<style_value> ::= <cdata>
<surfaceScale_value> ::= <Number_datatype>
<systemLanguage_value> ::= <cdata>
<tableValues_value> ::= <cdata>
<targetX_value> ::= <cdata>
<targetY_value> ::= <cdata>
<target_value> ::= <nmtoken>
<textLength_value> ::= <Length_datatype>
<text_decoration_value> ::= <cdata>
<title_value> ::= <Text_datatype>
<to_value> ::= <cdata>
<transform_value> ::= <cdata>
<type_value> ::= <cdata>
<u1_value> ::= <cdata>
<u2_value> ::= <cdata>
<underline_position_value> ::= <Number_datatype>
<underline_thickness_value> ::= <Number_datatype>
<unicode_range_value> ::= <cdata>
<unicode_value> ::= <cdata>
<units_per_em_value> ::= <Number_datatype>
<v_alphabetic_value> ::= <Number_datatype>
<v_hanging_value> ::= <Number_datatype>
<v_ideographic_value> ::= <Number_datatype>
<v_mathematical_value> ::= <Number_datatype>
<values_value> ::= <cdata>
<vert_adv_y_value> ::= <Number_datatype>
<vert_origin_x_value> ::= <Number_datatype>
<vert_origin_y_value> ::= <Number_datatype>
<viewBox_value> ::= <q> <int> <ws> <int> <ws> <int> <ws> <int> <q>
<viewTarget_value> ::= <cdata>
<width_value> ::= <Number_datatype>
<widths_value> ::= <cdata>
<word_spacing_value> ::= <cdata>
<x1_value> ::= <Coordinate_datatype>
<x2_value> ::= <Coordinate_datatype>
<x_height_value> ::= <Number_datatype>
<x_value> ::= <Coordinate_datatype>
<y1_value> ::= <Coordinate_datatype>
<y2_value> ::= <Coordinate_datatype>
<y_value> ::= <Coordinate_datatype>
<z_value> ::= <Coordinate_datatype>
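The standard data types defined above are deliberately strict; for instance, <nat> and <int> do not allow leading zeros. As a rough cross-check outside of Fandango, the same definitions can be written as Python regular expressions. This is a sketch under the assumption that <digit> stands for a single character from 0 to 9; the names NAT, INT, PERCENTAGE, and matches are made up for this example.

```python
import re

# Regular-expression counterparts of <nat>, <int>, and <percentage>,
# assuming <digit> means [0-9].
NAT = re.compile(r"[1-9][0-9]*|0")
INT = re.compile(r"[1-9][0-9]*|-[1-9][0-9]*|0")
PERCENTAGE = re.compile(r"0|[1-9][0-9]?|100")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole text is derivable from the given data type."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    print(matches(NAT, "711"))          # True
    print(matches(NAT, "0711"))         # False: leading zero
    print(matches(INT, "-42"))          # True
    print(matches(PERCENTAGE, "101"))   # False: above 100
```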
Once this is all set, we can use this to test SVGs with extreme values, as in this svgextreme.fan example:
include('svg11.fan')
# where (int(<width_value>) > 1e8 or int(<height_value>) > 1e8)
# Check with extreme number values
where <Number_datatype> == "'1000000'"
# Ensure we have a minimum of children
where len(<svg>) > 20
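Before feeding such inputs to a program under test, it can help to confirm that they are still well-formed XML and to see where the extreme values ended up. The following Python sketch uses only the standard library. The SAMPLE document is handwritten in the shape of the <start> rule above and is not actual Fandango output; the helper names parse_svg and extreme_attributes are likewise assumptions.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Handwritten sample in the shape of the <start> rule (not Fandango output).
SAMPLE = (
    '<?xml version="1.0" standalone="no"?>'
    "<!DOCTYPE svg>"
    '<svg xmlns="http://www.w3.org/2000/svg"'
    ' xmlns:xlink="http://www.w3.org/1999/xlink"'
    " width='1000000' height='100' baseProfile=\"full\" viewBox='0 0 100 100'>"
    "<rect x='100' y='100' width='1000000' height='12'/>"
    "</svg>"
)


def parse_svg(document: str) -> ET.Element:
    """Parse the document; raises ET.ParseError if it is not well-formed."""
    root = ET.fromstring(document)
    if root.tag != f"{{{SVG_NS}}}svg":
        raise ValueError(f"unexpected root element: {root.tag}")
    return root


def extreme_attributes(root: ET.Element, threshold: int = 10**6) -> list:
    """List (tag, attribute, value) triples whose integer value is extreme."""
    found = []
    for element in root.iter():
        for name, value in element.attrib.items():
            try:
                number = int(value)
            except ValueError:
                continue  # not a plain integer, e.g. a viewBox or a color
            if abs(number) >= threshold:
                found.append((element.tag, name, number))
    return found


if __name__ == "__main__":
    root = parse_svg(SAMPLE)
    for tag, name, number in extreme_attributes(root):
        print(tag, name, number)
```

This only checks well-formedness, not validity against the DTD; a validating parser would be needed for the latter.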
Converting .fan files
With fandango convert, you can also “convert” .fan files.
This results in a “normalized” format, where all comments and blank lines have been removed.
If we send this input to fandango convert:
# A fine file to produce person names
from faker import Faker
fake = Faker()
include('persons.fan')
<first_name> ::= <name> := fake.first_name()
<last_name> ::= <name> := fake.last_name()
then we get
$ fandango convert persons-faker.fan
# Automatically generated from 'persons-faker.fan'.
#
from faker import Faker
fake = Faker()
include('persons.fan')
<first_name> ::= <name> := fake.first_name()
<last_name> ::= <name> := fake.last_name()
Note
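This feature can be useful to detect semantic changes in .fan files: when comparing the normalized versions of two files, differences in comments and blank lines no longer show up.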
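The idea can be sketched in Python as follows. This is a rough approximation, not Fandango's normalization: the hypothetical helper rough_normalize only drops blank lines and lines consisting entirely of a comment, and it does not touch trailing comments. In practice, one would rather compare the outputs of fandango convert for both versions.

```python
import difflib


def rough_normalize(text: str) -> list:
    """Drop blank lines and full-line comments (approximation only)."""
    lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        lines.append(line.rstrip())
    return lines


def semantic_diff(old: str, new: str) -> list:
    """Return a unified diff of the two texts after rough normalization."""
    return list(
        difflib.unified_diff(
            rough_normalize(old), rough_normalize(new), "old", "new", lineterm=""
        )
    )


if __name__ == "__main__":
    old = "# A fine file\n\n<first_name> ::= <name> := fake.first_name()\n"
    new = "<first_name> ::= <name> := fake.first_name()\n"
    print(semantic_diff(old, new))  # [] since only comments and blank lines differ
```

An empty result indicates that the two versions differ at most in comments and blank lines; any remaining diff lines point to changes in rules, constraints, or code.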