**BiYoocc: A Bidirectional Compiler in Eiffel** A Conjecture =============================================================================== I have the conjecture that: Creating a Bidirectional compiler with Eiffel is a natural fit, due to its native support for invariants. Background ---------------------------------------------------------------------------------- Current programming language practice, has caught up Eiffel and Smalltalk, In recoginzing that tooling is fundamental to help programmers develop high quality software. So any mature porgramming language should offer support for: - A Debugger - A linter - Code Formatter - Structured editor - Package manager. All of these tools depend on two intertwined concepts: a Parser and a Pretty Printer. A pretty printer is a function that receives an Abstract Syntax Tree (AST) and outputs a human-readable representation. So pretty printers and parsers are important for all the programming language designers, and for their users A bidirectional compiler is one in which the pretty printing of the parsed tree (AST) is also a valid code. One would expect this to be the default behavior for all compilers. But most pretty printers evolve independently from the compiler, which leads to incompatibilities between them. This has led to the development of parser generators (BiYacc) with the express purpose of having bidirectionality by default. In this paper, we intend to show that developing bidirectional parsers is not something extraordinary, but a naturally expressed as part of Eiffel's design-by-contract and its invariant mechanism. The invariant that defines a bidirectional parser is: parse(pretty_print(parse(source))) = parse (source) This means, that parsing the pretty_printed version of a source code is equivalent to parsing the source code. But having the pretty_printed version is not enough. For a refactoring tool. We want to manipulate the AST, and yet after our manipulation we want to have code that is still valid. So that invariant is expressed like this: source |> parse |> transform |> pretty_print |> is_valid is_valid :: CST -> Bool pretty_print :: AST -> CST transform :: AST -> AST parse :: CST -> AST source :: CST CST :: String As an evolution of bidirectional parsers we have document synchronizers. A document synchronizer has a round property in which. Transfroming A to B and B to A leaves A unchanged. So the synchronizer_invariant can be expressed as follows: source_a = as_a(as_b(source_a) also observe how those two invariants are complimentary. The bidirectional_parser_invariant focuses on the parse tree. the synchronizer_invariant focuses on the text representation of the code. This invariant is naturally expressed in Eiffel and more importantly is checked, on every method invocation. which makes expressing bidirectional compilers in Eiffel a Natural fit. Known-Unknowns (Uncertainties) ------------------------------------------------------------------------------- There are a few things that at this stage we ignore about this project: 1. Yoocc, which will be the core tool of our approach was developed in 1996 and hasn't received maintenance since 1998. So getting it to work with the current Eiffel could be a challenge. This will be addressed in Stage 0 of the project, by updating the code for Yoocc. 2. There aren't many grammars created for Yoocc apart from a few demo examples in its source code. Which could challenge the validity of the approach by not using more widespread grammars. This will be addressed in Stage 2 of the plan. By developing a grammar for a light version of Java. 3. Even if it turns out that developing bidirectional compilers with Eiffel is possible, it might be that the effort is bigger than the current state of the art (BiYacc). This will be addressed by Stage 3. The plan ------------------------------------------------------------------------------- To demonstrate my claim that Eiffel (and Yoocc) make it easy to evolve a programming language grammar into bidirectionality. I will follow this stages. Stage 0: Update Yoocc to the current Eiffel. At this stage, the goal is to get the demo languages developed with Yoocc to compile again. 1. We take the Yoocc compiler developed by Monash University. 2. Take the example languages as part of the Yoocc distribution and show that the pretty printed code generated with it is not, bidirectional. Stage 1: Evolve demo languages to be bidirectional 1.1 Add the invariant of Listing 1 to the Yoocc compiler. 1.2 Try to compile the generated languages and follow the errors generated, until there is none. 1.3 Show that the pretty_printed version of the example languages are bidirectional. 1.4 With this we would have demonstrated that is possible to evolve a language to be bidirectional without changing its grammar. Stage 2: Evolve demo languages to have a Language server protocol (LSP) Continue to demonstrate that it is possible to incrementally evolve language developed on a OO Compiler. We will implement an LSP for all of the demo languages, in Yoocc. Using the same mechanism as for bidirectinoal programming: Inheritance. With this two evolutions of langugaes that were not born to be done this way. We hope to show that OO Compilers can better evolve a language than the current alternatives of today. Stage 2: Develop BiPandoc. BiPandoc was a hobby project by Zhung et al. Where they created a limited subset of Pandoc using bidirectional compiler technology. Pandoc is the most popular document translator in the world. It takes documents written in let's say markdown and translate them to multiple formats. And yet in such a tool, the transformations aren't bidirectional. For example, it is possible to translate from one source format to another but not the other way around. However, if each Pandoc language used was made with BiYoocc the translation would be bidirectional by default. So bidirectional Pandoc will implement several popular languages like Markdown, HTML, Latex. A bidirectional Pandoc clone would have an impact on millions of users, with this "universal" file format translator. Stage 4: Multi-langugae programming. (Stretch Goal) If every language is equivalent to each other (Turing complete). How far could we take the idea of programming in one language and another programming a different one? Could we perhaps have one developer working in Python and another Working in Scheme? in the same source code? Expected Contributions ---------------------------------------------------------- As a result of this work, I expect to deliver the following work products. Create a bidirectional compiler in Eiffel on top of Yoocc, called BiYoocc Return Yoocc to working condition Have working demos of bidirectional languages in BiYooc Reproduce the paper BiOOP, to demonstrate the equivalence of BiYoocc to the previous work. As a stretch goal, Develop BiPandoc, as a new product that is facilitated by the BiYooc technology. The grand vision ----------------------------------------------------------- I see BiYooc as a steping stone for a bigger vision of programming. Were the code is stored in a central database as Abstract Syntax Tree, Not as a set of text-files. One on which each diff and merge operation are semantic operations that work on the AST not the the lines of text. And that allow developers to work in the same program using different grammars. So a python developer could read code developed by a smalltalk programmer. Next Action ----------------------------------------------------------- In the following months, I intend to execute the different stages of the plan and write a paper to share the experience with the wider Programming Language Research community.