Converting ANTLR and Other Input Specs

Often, you may already have an input format specification available, but not (yet) in Fandango .fan format. Fandango’s convert command allows you to automatically translate common input specifications into .fan files, or at least most of their contents.

Important

All these converters are lossy: some features of the original specifications may not be converted into Fandango. Hence, treat the converted files as a base for further manual editing and checking.

Note

All these formats define the syntax of input files, typically for the purpose of parsing. To produce inputs that are also semantically valid, you will often have to augment the .fan files with constraints.

Under Construction

All these converters are experimental at this point.

Converting ANTLR Specs

Fandango allows you to automatically convert ANTLR grammar specifications (.g4, .antlr) into Fandango .fan files. ANTLR is a very popular parser generator; a large collection of ANTLR grammars is available.

Simply use the command fandango convert, followed by the ANTLR file to be converted.

As an example, consider this simple Calculator.g4 ANTLR file:

// https://www.inovex.de/de/blog/building-a-simple-calculator-with-antlr-in-python/

grammar Calculator;

expression 
    : NUMBER                        # Number
    | '(' expression ')'            # Parentheses
    | expression TIMES expression   # Multiplication
    | expression DIV expression     # Division
    | expression PLUS expression    # Addition
    | expression MINUS expression   # Subtraction
;

PLUS : '+';
MINUS: '-';
TIMES: '*';
DIV  : '/';
NUMBER : [0-9]+;
WS : [ \r\n\t]+ -> skip;

Invoking fandango convert produces an (almost) equivalent Fandango .fan file:

$ fandango convert Calculator.g4
# Automatically generated from '../src/fandango/converters/antlr/Calculator.g4'.
#
# Calculator
<expression> ::= <NUMBER> | '(' <expression> ')' | <expression> <TIMES> <expression> | <expression> <DIV> <expression> | <expression> <PLUS> <expression> | <expression> <MINUS> <expression>
<PLUS> ::= '+'
<MINUS> ::= '-'
<TIMES> ::= '*'
<DIV> ::= '/'
<NUMBER> ::= r'[0-9]'+
<WS> ::= r'[ \r\n\t]'+  # NOTE: was '-> skip'

Note the NOTE comment at the bottom: The ANTLR lexer action skip has no equivalent in Fandango; hence WS elements will neither be skipped nor generated.

Still, we can use this grammar to produce expressions. Note the usage of the -o option to specify an output file and the --start option to specify a start symbol.

$ fandango convert -o Calculator.fan Calculator.g4
$ fandango fuzz -f Calculator.fan --start='<expression>' -n 10
fandango:WARNING: Symbol <WS> defined, but not used
(1428)/2/173+0711/47
((6))*47-92*6938-72
81058+96206/99-8686
806
430/((79))
7647-((4189))-96*05171
((((1)*34+0)))+481
59
0
(858)
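
The outputs above are syntactically valid for the grammar, but a program under test may still reject some of them. As a minimal sketch, assuming the consumer rejects numbers with leading zeros (as Python's own integer literals do), the following check flags such numbers in generated expressions. The function name and the choice of this particular check are illustrative assumptions, not part of Fandango.

```python
import re

# A '0' that starts a number (not preceded by a digit) and is followed by more digits.
LEADING_ZERO = re.compile(r"(?<!\d)0\d+")

def numbers_with_leading_zero(expression):
    """Return all numbers in the expression that start with a superfluous zero."""
    return LEADING_ZERO.findall(expression)

# Samples taken from the fuzzer output shown above
samples = ["(1428)/2/173+0711/47", "806", "0", "7647-((4189))-96*05171"]
for sample in samples:
    print(sample, numbers_with_leading_zero(sample))
```

If such inputs are unwanted, one option is to rewrite <NUMBER> by hand along the lines of the <nat> rule in the DTD section of this page, which only allows a leading zero for the number 0 itself.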

Note

Most features of ANTLR that cannot be represented in Fandango will be marked by NOTE comments. These include

  • Actions

  • Modifiers

  • Clauses such as return or throws

  • Exceptions

  • Predicate options

  • Element options

  • Negations (~) over complex expressions
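
Since the converters mark unsupported features with NOTE comments, a quick scan of the converted file shows where manual editing is likely needed. The following is a minimal sketch, assuming that all such comments follow the `# NOTE: ...` form seen in the Calculator example; the helper name is hypothetical and not part of Fandango.

```python
import re

# Matches a trailing "# NOTE: ..." comment, as in the converted Calculator grammar.
NOTE_PATTERN = re.compile(r"#\s*NOTE:\s*(.*)$")

def find_conversion_notes(fan_text):
    """Return (line_number, line, note) for every line carrying a NOTE comment."""
    notes = []
    for number, line in enumerate(fan_text.splitlines(), start=1):
        match = NOTE_PATTERN.search(line)
        if match:
            notes.append((number, line.strip(), match.group(1).strip()))
    return notes

# Two rules copied from the converted Calculator grammar (raw string keeps the backslashes)
converted = r"""<NUMBER> ::= r'[0-9]'+
<WS> ::= r'[ \r\n\t]'+  # NOTE: was '-> skip'
"""

for number, line, note in find_conversion_notes(converted):
    print(f"line {number}: {note}")
```

In practice, you would read the text from the converted file (for instance, Calculator.fan) rather than from an inline string.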

Converting 010 Binary Templates

Fandango provides some basic support for converting Binary Templates (.bt, .010) for the 010 Editor. A large collection of binary templates for various binary formats is available.

Again, simply use the command fandango convert, followed by the binary template file to be converted.

Our GIF example is automatically created from a GIF binary template.

Note

010 Binary Templates can contain arbitrary code that will be executed during parsing. Fandango will recognize a number of common patterns; features that will require manual work include

  • Checksums

  • Complex length encodings

Note

The fandango convert command provides two options to specify byte and bit orderings, should the .bt file not already do so.

  • --endianness=(little|big) and

  • --bitfield-order=(left-to-right|right-to-left)
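
When converting many templates from a script, it can help to assemble the command line in one place. The sketch below only uses the options named on this page (-o, --endianness, --bitfield-order); the function name and the file names gif.bt and gif.fan are hypothetical, and it is an assumption that -o combines with the ordering options in the same way as shown for ANTLR files.

```python
import shlex

def convert_command(source, output=None, endianness=None, bitfield_order=None):
    """Build the argument list for a `fandango convert` call."""
    if endianness not in (None, "little", "big"):
        raise ValueError(f"unsupported endianness: {endianness!r}")
    if bitfield_order not in (None, "left-to-right", "right-to-left"):
        raise ValueError(f"unsupported bitfield order: {bitfield_order!r}")

    args = ["fandango", "convert"]
    if endianness:
        args.append(f"--endianness={endianness}")
    if bitfield_order:
        args.append(f"--bitfield-order={bitfield_order}")
    if output:
        args += ["-o", output]
    args.append(source)  # the file to be converted comes last
    return args

command = convert_command("gif.bt", output="gif.fan",
                          endianness="little", bitfield_order="left-to-right")
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run() as is; printing it first makes it easy to check the command before running it.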

Converting DTDs

A Document Type Definition (DTD, .dtd) specifies the format of an XML file. Fandango can convert these into .fan files, enabling the production of XML files that conform to the DTD.

Again, simply use the command fandango convert, followed by the DTD file to be converted.

Note

As with Binary Templates, Fandango will recognize a number of common patterns, but not all.

In the generated .fan file, you can customize every single element in its context. As an example, consider this svg11.fan file, which specializes individual elements of an svg.fan file generated from an SVG DTD. The DTD by itself does not specify the types of individual fields, so we do this here:

include('svg.fan')

# Add standard blurb at top
<start> ::= ('<?xml version="1.0" standalone="no"?>'
'<!DOCTYPE svg>' <svg>)

<svg> ::= ('<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"'
' width=' <svg_width_value>
' height=' <svg_height_value>
' baseProfile="full" viewBox=' <svg_viewBox_value> '>'
(<desc> | <title> | <metadata> | <animate> | <set> | <animateMotion> | <animateColor> | <animateTransform> | <svg> | <g> | <defs> | <symbol> | <use> | <switch> | <image> | <style> | <path> | <rect> | <circle> | <line> | <ellipse> | <polyline> | <polygon> | <text> | <altGlyphDef> | <marker> | <color_profile> | <linearGradient> | <radialGradient> | <pattern> | <clipPath> | <mask> | <filter> | <cursor> | <a> | <view> | <script> | <font> | <font_face> | <foreignObject>){10} '</svg>')

# Standard data types
<cdata> ::= <qint> | <string>
<qnat> ::= <q> <nat> <q>
<nat> ::= r'[1-9]' <digit>* | '0'
<qint> ::= <q> <int> <q>
<int> ::= r'[1-9]' <digit>* | '-' r'[1-9]' <digit>* | '0'
<string> ::= '"' <char>* '"' | "'" <char>* "'"
<char> ::= r'[0-9a-zA-Z_-]+'
<id> ::= <q> <ascii_letter> (<ascii_letter> | <digit> | '_')* <q>
<nmtoken> ::= <id>
<pcdata> ::= <cdata>
<url> ::= <q> 'https://cispa.de' <q>
<qpercentage> ::= <q> <percentage> <q>
<percentage> ::= ("0" | r"[1-9][0-9]?" | "100")

# SVG-specific data types
<Coordinate_datatype> ::= <qint> := "'100'"
<Length_datatype> ::= <qnat>
<FontFamilyValue_datatype> ::= <string> := '"sans-serif"'
<FontSizeValue_datatype> ::= <qnat> := "'12'"
<FontSizeAdjustValue_datatype> ::= <qnat> := "'0'"
<GlyphOrientationHorizontalValue_datatype> ::= <qint> := "'0'"
<GlyphOrientationVerticalValue_datatype> ::= <qint> := "'0'"
<Number_datatype> ::= <qint>
<NumberOptionalNumber_datatype> ::= <qint>
<OpacityValue_datatype> ::= <qpercentage> := "'100'"
<PathData_datatype> ::= <q> (<int> <ws>)+ <q>
<Text_datatype> ::= <string>
<Script_datatype> ::= <string>
<SVGColor_datatype> ::= <q> '#' (<hexdigit>{3} | <hexdigit>{6}) <q>

# Mappings of attributes to data types
<accent_height_value> ::= <Number_datatype>
<alphabetic_value> ::= <Number_datatype>
<amplitude_value> ::= <Number_datatype>
<arabic_form_value> ::= <cdata>
<arcrole_value> ::= <cdata>
<ascent_value> ::= <Number_datatype>
<attributeName_value> ::= <cdata>
<attributeType_value> ::= <cdata>
<azimuth_value> ::= <Number_datatype>
<baseFrequency_value> ::= <NumberOptionalNumber_datatype>
<baseProfile_value> ::= <Text_datatype>
<base_value> ::= <cdata>
<baseline_shift_value> ::= <cdata>
<bbox_value> ::= <cdata>
<begin_value> ::= <cdata>
<bias_value> ::= <Number_datatype>
<by_value> ::= <cdata>
<cap_height_value> ::= <Number_datatype>
<class_value> ::= <cdata>
<clip_path_value> ::= <cdata>
<clip_value> ::= <cdata>
<color_profile_value> ::= <cdata>
<color_value> ::= <cdata>
<contentScriptType_value> ::= <cdata>
<contentStyleType_value> ::= <cdata>
<cursor_value> ::= <cdata>
<cx_value> ::= <Coordinate_datatype>
<cy_value> ::= <Coordinate_datatype>
<d_value> ::= <PathData_datatype>
<descent_value> ::= <Number_datatype>
<diffuseConstant_value> ::= <Number_datatype>
<divisor_value> ::= <Number_datatype>
<dur_value> ::= <cdata>
<dx_value> ::= <Number_datatype>
<dy_value> ::= <Number_datatype>
<elevation_value> ::= <Number_datatype>
<enable_background_value> ::= <cdata>
<end_value> ::= <cdata>
<exponent_value> ::= <Number_datatype>
<fePointLight_z_value> ::= <Number_datatype>
<fePointLight_y_value> ::= <Number_datatype>
<fePointLight_x_value> ::= <Number_datatype>
<feSpotLight_z_value> ::= <Number_datatype>
<feSpotLight_y_value> ::= <Number_datatype>
<feSpotLight_x_value> ::= <Number_datatype>
<fill_opacity_value> ::= <OpacityValue_datatype>
<fill_value> ::= <cdata>
<filterRes_value> ::= <NumberOptionalNumber_datatype>
<filter_value> ::= <cdata>
<flood_color_value> ::= <SVGColor_datatype>
<flood_opacity_value> ::= <OpacityValue_datatype>
<font_family_value> ::= <FontFamilyValue_datatype>
<font_size_adjust_value> ::= <FontSizeAdjustValue_datatype>
<font_size_value> ::= <FontSizeValue_datatype>
<font_stretch_value> ::= <cdata>
<font_style_value> ::= <cdata>
<font_variant_value> ::= <cdata>
<font_weight_value> ::= <cdata>
<format_value> ::= <cdata>
<from_value> ::= <cdata>
<fx_value> ::= <Coordinate_datatype>
<fy_value> ::= <Coordinate_datatype>
<g1_value> ::= <cdata>
<g2_value> ::= <cdata>
<glyphRef_value> ::= <cdata>
<glyph_name_value> ::= <cdata>
<glyph_orientation_horizontal_value> ::= <GlyphOrientationHorizontalValue_datatype>
<glyph_orientation_vertical_value> ::= <GlyphOrientationVerticalValue_datatype>
<gradientTransform_value> ::= <cdata>
<hanging_value> ::= <Number_datatype>
<height_value> ::= <Number_datatype>
<horiz_adv_x_value> ::= <Number_datatype>
<horiz_origin_x_value> ::= <Number_datatype>
<horiz_origin_y_value> ::= <Number_datatype>
<href_value> ::= <url>
<id_value> ::= <id>
<ideographic_value> ::= <Number_datatype>
<in2_value> ::= <cdata>
<in_value> ::= <cdata>
<intercept_value> ::= <Number_datatype>
<k1_value> ::= <Number_datatype>
<k2_value> ::= <Number_datatype>
<k3_value> ::= <Number_datatype>
<k4_value> ::= <Number_datatype>
<k_value> ::= <Number_datatype>
<kernelMatrix_value> ::= <cdata>
<kernelUnitLength_value> ::= <NumberOptionalNumber_datatype>
<kerning_value> ::= <cdata>
<keyPoints_value> ::= <cdata>
<keySplines_value> ::= <cdata>
<keyTimes_value> ::= <cdata>
<lang_value> ::= <nmtoken>
<letter_spacing_value> ::= <cdata>
<lighting_color_value> ::= <SVGColor_datatype>
<limitingConeAngle_value> ::= <Number_datatype>
<local_value> ::= <cdata>
<markerHeight_value> ::= <Length_datatype>
<markerWidth_value> ::= <Length_datatype>
<marker_end_value> ::= <cdata>
<marker_mid_value> ::= <cdata>
<marker_start_value> ::= <cdata>
<mask_value> ::= <cdata>
<mathematical_value> ::= <Number_datatype>
<max_value> ::= <cdata>
<media_value> ::= <cdata>
<min_value> ::= <cdata>
<name_value> ::= <cdata>
<numOctaves_value> ::= <cdata>
<offset_value> ::= <Number_datatype>
<onabort_value> ::= <Script_datatype>
<onactivate_value> ::= <Script_datatype>
<onbegin_value> ::= <Script_datatype>
<onclick_value> ::= <Script_datatype>
<onend_value> ::= <Script_datatype>
<onerror_value> ::= <Script_datatype>
<onfocusin_value> ::= <Script_datatype>
<onfocusout_value> ::= <Script_datatype>
<onload_value> ::= <Script_datatype>
<onmousedown_value> ::= <Script_datatype>
<onmousemove_value> ::= <Script_datatype>
<onmouseout_value> ::= <Script_datatype>
<onmouseover_value> ::= <Script_datatype>
<onmouseup_value> ::= <Script_datatype>
<onrepeat_value> ::= <Script_datatype>
<onresize_value> ::= <Script_datatype>
<onscroll_value> ::= <Script_datatype>
<onunload_value> ::= <Script_datatype>
<onzoom_value> ::= <Script_datatype>
<opacity_value> ::= <OpacityValue_datatype>
<order_value> ::= <Number_datatype>
<orient_value> ::= <cdata>
<orientation_value> ::= <cdata>
<origin_value> ::= <cdata>
<overline_position_value> ::= <Number_datatype>
<overline_thickness_value> ::= <Number_datatype>
<panose_1_value> ::= <cdata>
<pathLength_value> ::= <Number_datatype>
<path_value> ::= <cdata>
<patternTransform_value> ::= <cdata>
<pointsAtX_value> ::= <Number_datatype>
<pointsAtY_value> ::= <Number_datatype>
<pointsAtZ_value> ::= <Number_datatype>
<points_value> ::= <cdata>
<preserveAspectRatio_value> ::= <cdata>
<r_value> ::= <Length_datatype>
<radius_value> ::= <Number_datatype>
<refX_value> ::= <cdata>
<refY_value> ::= <cdata>
<repeatCount_value> ::= <cdata>
<repeatDur_value> ::= <cdata>
<requiredExtensions_value> ::= <cdata>
<requiredFeatures_value> ::= <cdata>
<result_value> ::= <cdata>
<role_value> ::= <cdata>
<rotate_value> ::= <cdata>
<rx_value> ::= <Length_datatype>
<ry_value> ::= <Length_datatype>
<scale_value> ::= <Number_datatype>
<seed_value> ::= <Number_datatype>
<slope_value> ::= <Number_datatype>
<specularConstant_value> ::= <Number_datatype>
<specularExponent_value> ::= <Number_datatype>
<startOffset_value> ::= <Length_datatype>
<stdDeviation_value> ::= <NumberOptionalNumber_datatype>
<stemh_value> ::= <Number_datatype>
<stemv_value> ::= <Number_datatype>
<stop_color_value> ::= <SVGColor_datatype>
<stop_opacity_value> ::= <OpacityValue_datatype>
<strikethrough_position_value> ::= <Number_datatype>
<strikethrough_thickness_value> ::= <Number_datatype>
<string_value> ::= <cdata>
<stroke_dasharray_value> ::= <cdata>
<stroke_dashoffset_value> ::= <cdata>
<stroke_miterlimit_value> ::= <cdata>
<stroke_opacity_value> ::= <OpacityValue_datatype>
<stroke_value> ::= <cdata>
<stroke_width_value> ::= <Number_datatype>
<style_value> ::= <cdata>
<surfaceScale_value> ::= <Number_datatype>
<systemLanguage_value> ::= <cdata>
<tableValues_value> ::= <cdata>
<targetX_value> ::= <cdata>
<targetY_value> ::= <cdata>
<target_value> ::= <nmtoken>
<textLength_value> ::= <Length_datatype>
<text_decoration_value> ::= <cdata>
<title_value> ::= <Text_datatype>
<to_value> ::= <cdata>
<transform_value> ::= <cdata>
<type_value> ::= <cdata>
<u1_value> ::= <cdata>
<u2_value> ::= <cdata>
<underline_position_value> ::= <Number_datatype>
<underline_thickness_value> ::= <Number_datatype>
<unicode_range_value> ::= <cdata>
<unicode_value> ::= <cdata>
<units_per_em_value> ::= <Number_datatype>
<v_alphabetic_value> ::= <Number_datatype>
<v_hanging_value> ::= <Number_datatype>
<v_ideographic_value> ::= <Number_datatype>
<v_mathematical_value> ::= <Number_datatype>
<values_value> ::= <cdata>
<vert_adv_y_value> ::= <Number_datatype>
<vert_origin_x_value> ::= <Number_datatype>
<vert_origin_y_value> ::= <Number_datatype>
<viewBox_value> ::= <q> <int> <ws> <int> <ws> <int> <ws> <int> <q>
<viewTarget_value> ::= <cdata>
<width_value> ::= <Number_datatype>
<widths_value> ::= <cdata>
<word_spacing_value> ::= <cdata>
<x1_value> ::= <Coordinate_datatype>
<x2_value> ::= <Coordinate_datatype>
<x_height_value> ::= <Number_datatype>
<x_value> ::= <Coordinate_datatype>
<y1_value> ::= <Coordinate_datatype>
<y2_value> ::= <Coordinate_datatype>
<y_value> ::= <Coordinate_datatype>
<z_value> ::= <Coordinate_datatype>

Once this is all set, we can use this to test SVGs with extreme values, as in this svgextreme.fan example:

include('svg11.fan')

# where (int(<width_value>) > 1e8 or int(<height_value>) > 1e8)

# Check with extreme number values
where <Number_datatype> == "'1000000'"

# Ensure we have a minimum of children
where len(<svg>) > 20

Converting .fan files

With fandango convert, you can also “convert” .fan files. This results in a “normalized” format, where all comments and blank lines have been removed. If we send this input to fandango convert:

# A fine file to produce person names
from faker import Faker
fake = Faker()

include('persons.fan')

<first_name> ::= <name> := fake.first_name()
<last_name> ::= <name> := fake.last_name()

then we get

$ fandango convert persons-faker.fan
# Automatically generated from 'persons-faker.fan'.
#
from faker import Faker
fake = Faker()
include('persons.fan')

<first_name> ::= <name> := fake.first_name()
<last_name> ::= <name> := fake.last_name()

Note

This feature can be useful to detect semantic changes in .fan files.
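
As a minimal sketch of such a comparison, the following code diffs two normalized outputs with Python's difflib. It assumes that the leading `#` lines of each output are the generated header (which names the source file) and can be ignored; the function names and the changed rule in the example are illustrative only.

```python
import difflib

def strip_header(normalized):
    """Drop the leading '#' header lines that name the source file."""
    lines = normalized.splitlines()
    while lines and lines[0].startswith("#"):
        lines.pop(0)
    return lines

def normalized_diff(old, new):
    """Return unified-diff lines between two normalized .fan texts."""
    return list(difflib.unified_diff(strip_header(old), strip_header(new),
                                     fromfile="old", tofile="new", lineterm=""))

old = """# Automatically generated from 'persons-faker.fan'.
#
from faker import Faker
fake = Faker()
include('persons.fan')

<first_name> ::= <name> := fake.first_name()
<last_name> ::= <name> := fake.last_name()
"""

# Same content from a differently named file: no semantic change expected
renamed = old.replace("persons-faker.fan", "persons-copy.fan")
# A changed generator for <last_name>: this should show up in the diff
changed = old.replace("fake.last_name()", "fake.first_name()")

print(normalized_diff(old, renamed))
print("\n".join(normalized_diff(old, changed)))
```

Because comments and blank-line differences are already removed by the normalization, any remaining diff line points to a change in rules, generators, or code.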