Discussions on Syntactic Meta Programming (wg-camlp4 list)

There are some interesting discussions on the wg-camlp4 mailing list. I wrote a long mail yesterday; I have cleaned it up a bit and pasted it here.


 I rewrote the whole of Camlp4 from scratch (the result is named Fan), building the quotation kit and throwing away the crappy grammar parser, so please believe me that I do understand the whole technology stack of Camlp4. If we can reach some consensus, I would be happy to hand over the maintenance of Fan. Fan does not lose any feature compared with Camlp4; in fact it has more interesting features.
   Let’s begin with some easy, not too technical parts, which nonetheless have a significant effect on user experience:
   1. Performance
          Performance does matter. It is a shame that most of the time spent compiling the OCaml compiler is dedicated to Camlp4, but this is an engineering problem: currently compiling Fan takes less than 20s, and it can be improved further.
   2. Building issues
        Having side effects through dynamic loading is generally a bad idea. In Fan, dynamic loading only registers some functionality that Fan supports; it has no other side effect. Each file stands alone and says which ppx, filters, or syntax it wants to use, with a good default option. So the build command is always something like ‘-pp fan plugin1 plugin2 plugin3’, and the order of the plugins does not matter. Loading all the plugins you have does not have any side effect either. Even better, you can statically link all the plugins you have collected, which simplifies the build process.
  3. Grammar Extension (Language namespace)
       I concur that arbitrary grammar extension is a bad idea, and I agree with Gabriel that so far only quotation is modular and composable (here quotation means a delimited DSL, and quasi-quotation means Lisp-style macros). I also agree with Gabriel that -ppx should not be used for syntax overriding (this should not actually be called syntax extension). Syntax overriding is a terrible idea, since the user never understands what is going on underneath without reading the Makefile. So my suggestion is that some really convenient syntax extensions, e.g. (let try.. in), should be merged into the built-in parser. Quotations do not bring too much heavy syntax (imho). In Fan, we proposed the concept of a hierarchical language namespace, since once quotations are heavily used it is really easy to introduce conflicts. Querying the language namespace works exactly like the Java package namespace: you can import, or close an import, to save some typing.
    Here is a taste:
     {:.Fan.Lang.Meta.expr| a + b |} -->
       `App (_loc, (`App (_loc, (`Id (_loc, (`Lid (_loc, "+")))),
                                (`Id (_loc, (`Lid (_loc, "a")))))),
                   (`Id (_loc, (`Lid (_loc, "b")))))
     {:.Fan.Lang.Meta.N.expr| a + b |} -->
       `App ((`App ((`Id (`Lid "+")), (`Id (`Lid "a")))), (`Id (`Lid "b")))
 In .Fan.Lang.Meta.expr the first ‘.’ means the name is resolved from the absolute namespace. N.expr shares exactly the same syntax, but produces an Ast without locations.
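
To make the absolute versus imported lookup concrete, here is a minimal sketch in plain OCaml. It is not Fan's implementation: the names `parse_name`, `resolve`, `registry` and `imports` are hypothetical, and it assumes a language is identified simply by its dotted path.

```ocaml
(* Hypothetical sketch, not Fan's code. A quotation name is either absolute
   (leading '.') or relative, in which case imported prefixes are tried. *)
type name = { absolute : bool; path : string list }

let parse_name (s : string) : name =
  let absolute = String.length s > 0 && s.[0] = '.' in
  let body = if absolute then String.sub s 1 (String.length s - 1) else s in
  { absolute; path = String.split_on_char '.' body }

(* [registry] lists the fully qualified languages that are registered;
   [imports] lists the prefixes the current file has imported. *)
let resolve ~(registry : string list list) ~(imports : string list list)
    (n : name) : string list option =
  if n.absolute then
    if List.mem n.path registry then Some n.path else None
  else
    let candidates = List.map (fun prefix -> prefix @ n.path) imports in
    List.find_opt (fun p -> List.mem p registry) candidates

let registry =
  [ [ "Fan"; "Lang"; "Meta"; "expr" ]; [ "Fan"; "Lang"; "Meta"; "N"; "expr" ] ]

let imports = [ [ "Fan"; "Lang"; "Meta" ] ]

(* Absolute lookup, relative lookup through an import, and a miss. *)
let full = resolve ~registry ~imports (parse_name ".Fan.Lang.Meta.expr")
let short = resolve ~registry ~imports (parse_name "N.expr")
let missing = resolve ~registry ~imports (parse_name "Unknown.expr")
```

With the import in place, the short name N.expr resolves to the same language as the fully qualified .Fan.Lang.Meta.N.expr, which is the typing saved by importing.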
   4. Portability to different compiler extensions (like LexiFi’s fork of OCaml)
       I am pretty sure this is easy to do in Fan: only the Ast2pt part (dumping the intermediate Ast into Parsetree) needs to be changed for different compilers.
Now let’s talk about some internal parts of SMP.
Quasi-quotation is the essential part of SMP. I am surprised that the discussion so far has silently ignored quasi-quotation; Leo’s answer of writing three parsers is neither satisfying nor practical (imho).
Camlp4 is mainly composed of two parts: one is the extensible parser and the other significant part is Ast lifting. Since we all agree that the extensible parser increases the complexity too much, let’s simply ignore that part.
Ast lifting is tightly coupled with the design of the abstract syntax tree. People complain that the Camlp4 Ast is hard to learn and that using quasi-quotation to do pattern matching is a bad idea.
Let me explain the topic a bit:
    The Camlp4 Ast is hard to learn, I agree; it has some alien names whose meaning nobody understands. Quasi-quotation is definitely a great idea for boosting meta-programming, but my experience is that for very, very small Ast fragments it is better to use the abstract syntax tree directly; otherwise quasi-quotation is a life saver for meta-programming.
   Luckily the quotation kit has nothing to do with the parser part. It is simply several functions (I am simplifying a bit) which turn a normal runtime value into an Ast node generically. Such functions are neither easy to write nor easy to read; the ideal case is that they are generated once and for all, and derived automatically for all the data types in normal OCaml (some ADTs containing functions cannot be derived). I bet it would most likely be a nightmare to maintain 3 parsers for the OCaml grammar, with two of them dumping to a meta level.
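
As a rough illustration of what such lifting functions look like, here is a minimal sketch. The `expr` type and the `meta_*` names are hypothetical, much smaller than anything in Fan or Camlp4, and written by hand here, whereas in practice they would be generated.

```ocaml
(* Hypothetical miniature Ast; the constructor names are illustrative and
   are not Fan's or Camlp4's actual definitions. *)
type expr =
  [ `Int of int
  | `Str of string
  | `Uid of string              (* a constructor such as None or [] *)
  | `App of expr * expr         (* application: f x *)
  | `Tup of expr list ]         (* tuple *)

(* Lifting functions: each takes a runtime value and returns the Ast of an
   expression that would evaluate to that value. *)
let meta_int (i : int) : expr = `Int i
let meta_string (s : string) : expr = `Str s

let meta_option (meta_a : 'a -> expr) (o : 'a option) : expr =
  match o with
  | None -> `Uid "None"
  | Some x -> `App (`Uid "Some", meta_a x)

let rec meta_list (meta_a : 'a -> expr) (l : 'a list) : expr =
  match l with
  | [] -> `Uid "[]"
  | x :: xs -> `App (`Uid "::", `Tup [ meta_a x; meta_list meta_a xs ])

(* Lifters compose along the structure of the type being lifted. *)
let lifted = meta_list (meta_option meta_int) [ Some 1; None ]
let lifted_strs = meta_list meta_string [ "a" ]
```

Each lifter mirrors the shape of its data type exactly, which is why they are tedious to write by hand and good candidates for automatic derivation.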
   So, how do we make Ast lifting easier?
        The first guideline is “Don’t mix with records”.
         Once you encode the Ast with records, you have to encode the records at the meta level, which increases the complexity without bringing any new features; it is simply not worthwhile.
        The second guideline is “Don’t do any syntax desugaring”. Syntax desugaring makes the semantics of syntactic meta-programming a bit weird. Syntax desugaring happens everywhere in Parsetree: think about list literals, which use syntax desugaring. If you don’t use any syntax desugaring and you want to match, for example, a bigarray access, you simply need to match `Bigarray(..)’ instead of
                     {txt= Ldot (Ldot (Lident "Bigarray", array), ("get"|"set" as gs)) ;_};_},
       The third guideline is to make it as uniform as possible.
       This not only helps the user, it also helps meta-programming over the types themselves, to derive some utility types. Take a look at my Ast encoding in Fan: https://github.com/bobzhang/Fan/blob/master/src/Ast.ml (it needs to be polished; please don’t panic when you see the variants I use here).
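
The second guideline (no desugaring) can be illustrated with a small sketch. Both encodings below are hypothetical miniatures, loosely modelled on the shapes discussed above; neither is the real Parsetree nor Fan's Ast.

```ocaml
(* Hypothetical miniature Ast that can hold both encodings side by side. *)
type longident = Lident of string | Ldot of longident * string

type expr =
  | Var of string
  | Ident of longident
  | Apply of expr * expr list
  | Bigarray of expr * expr list      (* kept as written: a.{i, j} *)

(* With desugaring, a.{i} reaches us as Bigarray.Array1.get a i, so the
   matcher has to recognise the shape of the generated identifier. *)
let is_access_desugared = function
  | Apply (Ident (Ldot (Ldot (Lident "Bigarray", _), ("get" | "set"))), _) ->
      true
  | _ -> false

(* Without desugaring, the node says what the user wrote. *)
let is_access_direct = function
  | Bigarray _ -> true
  | _ -> false

let desugared =
  Apply
    ( Ident (Ldot (Ldot (Lident "Bigarray", "Array1"), "get")),
      [ Var "a"; Var "i" ] )

let direct = Bigarray (Var "a", [ Var "i" ])
```

Note that the desugared matcher also fires on a call to Bigarray.Array1.get that the user wrote out by hand, so the two source forms can no longer be told apart.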
      The initial Ast has locations and antiquotation (ant) support, but here we derive 3 other Asts thanks to my very regular design. AstN is the Ast without locations (locations are important, but they are not much help when you only do code generation, and they complicate the expanded code a lot). AstA is the Ast without antiquotations (simply remove the ant branch); it is a subtype of Ast (thanks to our choice of variants here). AstNA is the Ast with neither locations nor antiquotations; it is a subtype of AstN. In practice, I found the Ast without locations particularly helpful when you only do code generation; it simplifies this part significantly. The beautiful part is that all four Asts share the same grammar with the same quasi-quotation mechanism, as I showed with .Fan.Lang.Meta.N.expr and .Fan.Lang.Meta.expr.
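
The subtyping relation between the derived Asts can be sketched with a few polymorphic variants. The types below are hypothetical miniatures (the real definitions live in Fan's Ast.ml and are much larger), and `forget_no_ant` and `strip_loc` are illustrative names.

```ocaml
type loc = { line : int; col : int }

(* With locations and antiquotations. *)
type ast =
  [ `Lid of loc * string
  | `App of loc * ast * ast
  | `Ant of loc * string ]

(* Same shape with the `Ant branch removed. *)
type ast_a =
  [ `Lid of loc * string
  | `App of loc * ast_a * ast_a ]

(* Same shape with locations removed. *)
type ast_n =
  [ `Lid of string
  | `App of ast_n * ast_n
  | `Ant of string ]

(* Because these are polymorphic variants, ast_a is a subtype of ast: the
   coercion is checked by the compiler and needs no conversion code. *)
let forget_no_ant (e : ast_a) : ast = (e : ast_a :> ast)

(* Dropping locations changes the payloads, so it needs a traversal. *)
let rec strip_loc : ast -> ast_n = function
  | `Lid (_, s) -> `Lid s
  | `App (_, f, x) -> `App (strip_loc f, strip_loc x)
  | `Ant (_, s) -> `Ant s

let l = { line = 1; col = 0 }
let e : ast_a = `App (l, `Lid (l, "f"), `Lid (l, "x"))
let stripped = strip_loc (forget_no_ant e)
```

With ordinary (nominal) variants the four types would need four sets of constructors and explicit conversion functions between each pair; here only the location-stripping traversal has to be written, and because the types are so regular it could itself be derived.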
    I don’t know how many parsers you would have to maintain to reach such a goal, or whether it would ever happen at all.
    Using variants to encode the intermediate Ast has a lot of other benefits, but I don’t want to cover them in such a short mail.
   So, my proposal is that the community design an intermediate Ast together, and write a built-in parser to that intermediate Ast, which is then dumped to Parsetree. I am in favor of Parsetree still being cleaned up a bit, but without too much change. I would appreciate it if you can take something away from Fan. I think Parsetree is not the ideal place to do SMP. HTH


2 thoughts on “Discussions on the Syntactic Meta Programming(wg-camlp4 list)”

  1. As an outside observer of this discussion, I find your posts very interesting (and often intriguing). May I say though that your formatting (bold, italic, underline, horizontal rules, newlines in the middle of sentences) sometimes makes your points slightly harder to follow for me?
