Bits and Bit Fields

Bits and Bit Fields#

Some binary inputs use individual bits to specify contents. For instance, you might have a flag byte that holds multiple (bit) flags:

struct {
  unsigned int italic: 1;  // 1 bit
  unsigned int bold: 1;
  unsigned int underlined: 1;
  unsigned int strikethrough: 1;
  unsigned int brightness: 4;  // 4 bits
} format_flags;

How does one represent such bit fields in a Fandango spec?

Representing Bits#

In Fandango, bits can be represented in Fandango using the special values 0 (for a zero bit) and 1 (for a non-zero bit). Hence, you can define a <bit> value as

<bit> ::= 0 | 1

With this, the above format_flag byte would be specified as

<start>         ::= <format_flag>
<format_flag>   ::= <italic> <bold> <underlined> <strikethrough> <brightness>
<italic>        ::= <bit>
<bold>          ::= <bit>
<underlined>    ::= <bit>
<strikethrough> ::= <bit>
<brightness>    ::= <bit>{4}
<bit>           ::= 0 | 1

A <format_flag> symbol would thus always consist of these eight bits. We can use the special option --format=bits to view the output as a bit stream:

$ fandango fuzz --format=bits -f bits.fan -n 1 --start-symbol='<format_flag>'
fandango:WARNING: Symbol <start> defined, but not used
00110111

Note

The combination of --format=bits and --start-symbol is particularly useful to debug bit fields.

Internally, Fandango treats individual flags as integers, too. Hence, we can also apply constraints to the individual flags. For instance, we can profit from the fact that Python treats 0 as False and 1 as True:

$ fandango fuzz --format=bits -f bits.fan -n 10 -c '<italic> and <bold>'
11100010
11010101
11011110
11100100
11010110
11111111
11100110
11100001
11101100
11100101

Fandango strictly follows a “left-to-right” order - that is, the order in which bits and bytes are specified in the grammar, the most significant bit is stored first.

Hence, we can also easily set the value of the entire brightness field using a constraint:

$ fandango fuzz --format=bits -f bits.fan -n 1 -c '<brightness> == 0b1111'
00111111

Note

Fandango always strictly follows a “left-to-right” order - that is, the order in which bits and bytes are specified in the grammar.

Of course, we can also give the number in decimal format:

$ fandango fuzz --format=bits -f bits.fan -n 1 -c '<brightness> == 15'
00111111
01001111
11111111
01101111
11101111
01111111
00101111
00001111
10101111
11001111

Note how the last four bits (the <brightness> field) are always set to 1111 - the number 15.

Warning

When implementing a format, be sure to follow its conventions regarding

  • bit ordering (most or least significant bit first)

  • byte ordering (most or least significant byte first)

Parsing Bits#

Fandango also supports parsing inputs with bits. This is what happens if we send a byte \xf0 (the upper four bits set) to the parser:

$ echo -n '\xf0' | fandango parse -f bits.fan -o - --format=bits
FandangoParseError: '<stdin>', position 0x0001 (1): mismatched input b'x'
FandangoParseError: 1 error(s) during parsing
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[6], line 2
      1 get_ipython().system("echo -n '\\xf0' | fandango parse -f bits.fan -o - --format=bits --validate")
----> 2 assert _exit_code == 0

AssertionError: 

We see that the input was properly parsed and decomposed into individual bits.

This is the resulting parse tree:

The grammar format shows us that the values are properly assigned:

$ echo -n '\xf0' | fandango parse -f bits.fan -o - --format=grammar

Warning

To parse bits properly, they must come in multiples of eight.

Bits and Padding#

When generating binary inputs, you may need to adhere to specific lengths. Such lengths are often enforced by padding – that is, adding bits until the required length is achieved. For instance, let us assume you have a field consisting of some bits. However, the overall length of the field must be a multiple of eight to have it byte-aligned. For such padding, define the field as

<field> ::= <bits> <padding>
<padding> ::= 0*

combined with a constraint

where len(<field>) % 8 == 0

Note that applied on derivation trees, len() always returns the number of child elements, not the string length; here, we use this to access the number of elements in <field>.