About Fandango

Given the specification of a program’s input language, Fandango quickly generates myriads of valid sample inputs for testing.

The specification language combines a grammar with constraints written in Python, which makes it highly expressive and flexible. Most notably, you can define your own testing goals in Fandango: if you need the inputs to have particular values or distributions, you can express these directly in the specification.
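
To make the idea of "grammar plus Python constraint" concrete, here is a minimal sketch in plain Python. It does not use Fandango's specification syntax or API: the dictionary-based `GRAMMAR`, the helpers `expand` and `generate`, and the example `constraint` are all hypothetical names chosen for illustration. It also uses simple rejection sampling (generate, then filter), which is not how Fandango itself searches for inputs.

```python
import random

# Hypothetical toy grammar (NOT Fandango's .fan syntax): each nonterminal
# maps to a list of alternatives; each alternative is a list of symbols.
GRAMMAR = {
    "<start>": [["<digit>", "<digit>", "<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}

def expand(symbol, rng):
    """Randomly expand a symbol into a string of terminals."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol: emit as-is
    alternative = rng.choice(GRAMMAR[symbol])
    return "".join(expand(s, rng) for s in alternative)

def constraint(text):
    """Example testing goal in ordinary Python: an even number above 500."""
    value = int(text)
    return value > 500 and value % 2 == 0

def generate(n, seed=0):
    """Produce n grammar-valid strings that also satisfy the constraint."""
    rng = random.Random(seed)
    results = []
    while len(results) < n:
        candidate = expand("<start>", rng)
        if constraint(candidate):  # keep only semantically valid inputs
            results.append(candidate)
    return results

inputs = generate(5)
print(inputs)
```

The grammar guarantees syntactic validity (three digits), while the predicate expresses the semantic goal; swapping in a different Python function changes the goal without touching the grammar.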

Fandango supports multiple modes of operation:

  • By default, Fandango operates as a black-box fuzzer; that is, it creates inputs from a Fandango specification (.fan) file.

  • If you have sample inputs, Fandango can mutate these to obtain more realistic inputs.

Fandango comes as a portable Python program and can easily be run on a large variety of platforms.

Under the hood, Fandango uses sophisticated evolutionary algorithms to produce inputs: it starts with a population of random inputs and evolves these through mutation and crossover until they fulfill the given constraints.
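
The evolutionary loop described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not Fandango's actual implementation: the fixed-length digit strings stand in for grammar-valid inputs, the constraint (digits summing to `TARGET`) is made up, and the names `fitness`, `mutate`, `crossover`, and `evolve` are hypothetical.

```python
import random

DIGITS = "0123456789"
LENGTH = 4    # every candidate is a 4-digit string (stand-in for a grammar)
TARGET = 20   # hypothetical constraint: the digits must sum to 20

def fitness(candidate):
    """Distance from satisfying the constraint; 0 means it holds."""
    return abs(sum(int(c) for c in candidate) - TARGET)

def mutate(candidate, rng):
    """Replace one position with a random digit; the result stays valid."""
    i = rng.randrange(LENGTH)
    return candidate[:i] + rng.choice(DIGITS) + candidate[i + 1:]

def crossover(a, b, rng):
    """Single-point crossover of two parents."""
    point = rng.randrange(1, LENGTH)
    return a[:point] + b[point:]

def evolve(wanted=5, population_size=20, max_generations=200, seed=0):
    rng = random.Random(seed)
    # Start with a population of random (syntactically valid) inputs.
    population = ["".join(rng.choice(DIGITS) for _ in range(LENGTH))
                  for _ in range(population_size)]
    for _ in range(max_generations):
        solutions = [c for c in population if fitness(c) == 0]
        if len(solutions) >= wanted:
            return solutions
        # Keep the fitter half, then refill with mutated offspring.
        population.sort(key=fitness)
        survivors = population[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = survivors + children
    return [c for c in population if fitness(c) == 0]

solutions = evolve()
print(solutions)
```

Because the fitter half always survives, a candidate that satisfies the constraint is never lost, and the share of satisfying inputs grows over generations. The returned list may contain duplicates.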

Fandango is in active development! Features planned for 2025 include:

  • protocol testing

  • coverage-guided testing

  • code-directed testing

  • high-diversity inputs

and many more.

Refer to Fandango

To refer to Fandango, use its official URL:

https://fandango-fuzzer.github.io

Cite Fandango

If you want to cite Fandango in your academic work, use the ISSTA 2025 paper by Zamudio Amaya, Smytzek, and Zeller [ZASZ25]. Note that José Antonio has two proper last names, Zamudio Amaya, so the proper way to cite the paper is like this:

@article{zamudio2025fandango,
author = {Zamudio Amaya, Jos\'{e} Antonio and Smytzek, Marius and Zeller, Andreas},
title = {{FANDANGO}: {E}volving Language-Based Testing},
year = {2025},
issue_date = {July 2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {2},
number = {ISSTA},
url = {https://doi.org/10.1145/3728915},
ALTurl = {https://publications.cispa.de/articles/standard/FANDANGO_Evolving_Language-Based_Testing/28769252?file=53591066},
doi = {10.1145/3728915},
abstract = {Language-based fuzzers leverage formal input specifications (languages) to generate arbitrarily large and diverse sets of valid inputs for a program under test. Modern language-based test generators combine grammars and constraints to satisfy syntactic and semantic input constraints. ISLA, the leading input generator in that space, uses symbolic constraint solving to solve input constraints. Using solvers places ISLA among the most precise fuzzers but also makes it slow.    In this paper, we explore search-based testing as an alternative to symbolic constraint solving. We employ a genetic algorithm that iteratively generates candidate inputs from an input specification, evaluates them against defined constraints, evolving a population of inputs through syntactically valid mutations and retaining those with superior fitness until the semantic input constraints are met. This evolutionary procedure, analogous to natural genetic evolution, leads to progressively improved inputs that cover both semantics and syntax. This change boosts the efficiency of language-based testing: In our experiments, compared to ISLA, our search-based FANDANGO prototype is faster by one to three orders of magnitude without sacrificing precision.    The search-based approach no longer restricts constraints to constraint solvers' (miniature) languages. In FANDANGO, constraints can use the whole Python language and library. This expressiveness gives testers unprecedented flexibility in shaping test inputs. It allows them to state arbitrary goals for test generation: ''Please produce 1,000 valid test inputs where the voltage field follows a Gaussian distribution but never exceeds 20 mV.''},
journal = {Proc. ACM Softw. Eng.},
month = jun,
articleno = {ISSTA040},
numpages = {23},
keywords = {Language-based testing, fuzzing, test generation}
}


@inproceedings{bettscheider2024mining,
author = {Bettscheider, Leon and Zeller, Andreas},
title = {Look Ma, No Input Samples! Mining Input Grammars from Code with Symbolic Parsing},
year = {2024},
isbn = {9798400706585},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3663529.3663790},
doi = {10.1145/3663529.3663790},
abstract = {Generating test inputs at the system level (“fuzzing”) is most effective if one has a complete specification (such as a grammar) of the input language. In the absence of a specification, all known fuzzing approaches rely on a set of input samples to infer input properties and guide test generation. If the set of inputs is incomplete, however, so will be the resulting test cases; if one has no input samples, meaningful test generation so far has been hard to impossible. In this paper, we introduce a means to determine the input language of a program from the program code alone, opening several new possibilities for comprehensive testing of a wide range of programs. Our symbolic parsing approach first transforms the program such that (1) calls to parsing functions are abstracted into parsing corresponding symbolic nonterminals, and (2) loops and recursions are limited such that the transformed parser then has a finite set of paths. Symbolic testing then associates each path with a sequence of symbolic nonterminals and terminals, which form a grammar. First grammars extracted from nontrivial C subjects by our prototype show very high recall and precision, enabling new levels of effectiveness, efficiency, and applicability in test generators.},
booktitle = {Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering},
pages = {522--526},
numpages = {5},
keywords = {Input grammars, fuzzing, symbolic analysis, test generation},
location = {Porto de Galinhas, Brazil},
series = {FSE 2024}
}

@inproceedings{gopinath2020mining,
author = {Gopinath, Rahul and Mathis, Bj\"{o}rn and Zeller, Andreas},
title = {Mining input grammars from dynamic control flow},
year = {2020},
isbn = {9781450370431},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3368089.3409679},
doi = {10.1145/3368089.3409679},
abstract = {One of the key properties of a program is its input specification. Having a formal input specification can be critical in fields such as vulnerability analysis, reverse engineering, software testing, clone detection, or refactoring. Unfortunately, accurate input specifications for typical programs are often unavailable or out of date.  In this paper, we present a general algorithm that takes a program and a small set of sample inputs and automatically infers a readable context-free grammar capturing the input language of the program. We infer the syntactic input structure only by observing access of input characters at different locations of the input parser. This works on all stack based recursive descent input parsers, including parser combinators, and works entirely without program specific heuristics. Our Mimid prototype produced accurate and readable grammars for a variety of evaluation subjects, including complex languages such as JSON, TinyC, and JavaScript.},
booktitle = {Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
pages = {172--183},
numpages = {12},
keywords = {context-free grammar, control-flow, dataflow, dynamic analysis, fuzzing},
location = {Virtual Event, USA},
series = {ESEC/FSE 2020}
}

@inproceedings{schroeder2022mining,
author = {Schr\"{o}der, Michael and Cito, J\"{u}rgen},
title = {Grammars for free: toward grammar inference for Ad Hoc parsers},
year = {2022},
isbn = {9781450392242},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3510455.3512787},
doi = {10.1145/3510455.3512787},
abstract = {Ad hoc parsers are everywhere: they appear any time a string is split, looped over, interpreted, transformed, or otherwise processed. Every ad hoc parser gives rise to a language: the possibly infinite set of input strings that the program accepts without going wrong. Any language can be described by a formal grammar: a finite set of rules that can generate all strings of that language. But programmers do not write grammars for ad hoc parsers---even though they would be eminently useful. Grammars can serve as documentation, aid program comprehension, generate test inputs, and allow reasoning about language-theoretic security. We propose an automatic grammar inference system for ad hoc parsers that would enable all of these use cases, in addition to opening up new possibilities in mining software repositories and bi-directional parser synthesis.},
booktitle = {Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results},
pages = {41--45},
numpages = {5},
location = {Pittsburgh, Pennsylvania},
series = {ICSE-NIER '22}
}

@inproceedings{kulkarni2022arvada,
author = {Kulkarni, Neil and Lemieux, Caroline and Sen, Koushik},
title = {Learning highly recursive input grammars},
year = {2022},
isbn = {9781665403375},
publisher = {IEEE Press},
url = {https://doi.org/10.1109/ASE51524.2021.9678879},
doi = {10.1109/ASE51524.2021.9678879},
abstract = {This paper presents Arvada, an algorithm for learning context-free grammars from a set of positive examples and a Boolean-valued oracle. Arvada learns a context-free grammar by building parse trees from the positive examples. Starting from initially flat trees, Arvada builds structure to these trees with a key operation: it bubbles sequences of sibling nodes in the trees into a new node, adding a layer of indirection to the tree. Bubbling operations enable recursive generalization in the learned grammar. We evaluate Arvada against GLADE and find it achieves on average increases of 4.98 \texttimes{} in recall and 3.13 \texttimes{} in F1 score, while incurring only a 1.27 \texttimes{} slowdown and requiring only 0.87\texttimes{} as many calls to the oracle. Arvada has a particularly marked improvement over GLADE on grammars with highly recursive structure, like those of programming languages.},
booktitle = {Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering},
pages = {456--467},
numpages = {12},
location = {Melbourne, Australia},
series = {ASE '21}
}

@article{steinhoefel2024language,
author = {Steinh\"{o}fel, Dominic and Zeller, Andreas},
title = {Language-Based Software Testing},
year = {2024},
issue_date = {April 2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {67},
number = {4},
issn = {0001-0782},
url = {https://doi.org/10.1145/3631520},
doi = {10.1145/3631520},
abstract = {Constraints over grammar elements can make test generation easier than ever.},
journal = {Commun. ACM},
month = mar,
pages = {80--84},
numpages = {5}
}

@inproceedings{steinhoefel2022isla,
author = {Steinh\"{o}fel, Dominic and Zeller, Andreas},
title = {Input invariants},
year = {2022},
isbn = {9781450394130},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3540250.3549139},
doi = {10.1145/3540250.3549139},
abstract = {How can we generate valid system inputs? Grammar-based fuzzers are highly efficient in producing syntactically valid system inputs. However, programs will often reject inputs that are semantically invalid. We introduce ISLa, a declarative specification language for context-sensitive properties of structured system inputs based on context-free grammars. With ISLa, it is possible to specify input constraints like "a variable has to be defined before it is used," "the 'file name' block must be 100 bytes long," or "the number of columns in all CSV rows must be identical." Such constraints go into the ISLa fuzzer, which leverages the power of solvers like Z3 to solve semantic constraints and, on top, handles quantifiers and predicates over grammar structure. We show that a few ISLa constraints suffice to produce 100\% semantically valid inputs while still maintaining input diversity. ISLa can also parse and precisely validate inputs against semantic constraints. ISLa constraints can be mined from existing input samples. For this, our ISLearn prototype uses a catalog of common patterns, instantiates these over input elements, and retains those candidates that hold for the inputs observed and whose instantiations are fully accepted by input-processing programs. The resulting constraints can then again be used for fuzzing and parsing.},
booktitle = {Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
pages = {583--594},
numpages = {12},
keywords = {specification language, grammars, fuzzing, constraint mining},
location = {Singapore, Singapore},
series = {ESEC/FSE 2022}
}

Read More

To learn more about how Fandango works, start with the ISSTA 2025 paper by Zamudio Amaya, Smytzek, and Zeller [ZASZ25].

The core idea of Fandango, namely combining grammars and constraints, was introduced as language-based software testing by Steinhöfel and Zeller [SteinhofelZ24] and first implemented in the ISLa framework [SteinhofelZ22]. Both of these laid the foundation for Fandango.

The work on Fandango is funded by the ERC S3 project “Semantics of Software Systems”; the S3 grant proposal (available via the above link) lists several ideas that have been realized in Fandango (and a few more).

The work on Fandango is also related to mining grammars from programs and inputs. Important works in the field include Bettscheider and Zeller [BZ24], Gopinath, Mathis, and Zeller [GMZ20], Schröder and Cito [SchroderC22], and Kulkarni, Lemieux, and Sen [KLS22].

[BZ24]

Leon Bettscheider and Andreas Zeller. Look ma, no input samples! mining input grammars from code with symbolic parsing. In Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, FSE 2024, 522–526. New York, NY, USA, 2024. Association for Computing Machinery. URL: https://doi.org/10.1145/3663529.3663790, doi:10.1145/3663529.3663790.

[GMZ20]

Rahul Gopinath, Björn Mathis, and Andreas Zeller. Mining input grammars from dynamic control flow. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2020, 172–183. New York, NY, USA, 2020. Association for Computing Machinery. URL: https://doi.org/10.1145/3368089.3409679, doi:10.1145/3368089.3409679.

[KLS22]

Neil Kulkarni, Caroline Lemieux, and Koushik Sen. Learning highly recursive input grammars. In Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering, ASE '21, 456–467. IEEE Press, 2022. URL: https://doi.org/10.1109/ASE51524.2021.9678879, doi:10.1109/ASE51524.2021.9678879.

[SchroderC22]

Michael Schröder and Jürgen Cito. Grammars for free: toward grammar inference for ad hoc parsers. In Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results, ICSE-NIER '22, 41–45. New York, NY, USA, 2022. Association for Computing Machinery. URL: https://doi.org/10.1145/3510455.3512787, doi:10.1145/3510455.3512787.

[SteinhofelZ22]

Dominic Steinhöfel and Andreas Zeller. Input invariants. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2022, 583–594. New York, NY, USA, 2022. Association for Computing Machinery. URL: https://doi.org/10.1145/3540250.3549139, doi:10.1145/3540250.3549139.

[SteinhofelZ24]

Dominic Steinhöfel and Andreas Zeller. Language-based software testing. Commun. ACM, 67(4):80–84, March 2024. URL: https://doi.org/10.1145/3631520, doi:10.1145/3631520.

[ZASZ25]

José Antonio Zamudio Amaya, Marius Smytzek, and Andreas Zeller. FANDANGO: Evolving language-based testing. Proc. ACM Softw. Eng., June 2025. URL: https://doi.org/10.1145/3728915, doi:10.1145/3728915.

Acknowledgments

Fandango is a project of the CISPA Helmholtz Center for Information Security to facilitate highly efficient and highly customizable software testing.

This research was funded by the European Union (ERC “Semantics of Software Systems”, S3, 101093186). Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them.