<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>hongboz</title>
	<atom:link href="http://hongboz.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://hongboz.wordpress.com</link>
	<description>Love the world</description>
	<lastBuildDate>Fri, 08 Feb 2013 04:50:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='hongboz.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>hongboz</title>
		<link>http://hongboz.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://hongboz.wordpress.com/osd.xml" title="hongboz" />
	<atom:link rel='hub' href='http://hongboz.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Syntactic Meta-Programming in OCaml (II)</title>
		<link>http://hongboz.wordpress.com/2013/02/05/syntactic-meta-programming-in-ocaml-ii-5/</link>
		<comments>http://hongboz.wordpress.com/2013/02/05/syntactic-meta-programming-in-ocaml-ii-5/#comments</comments>
		<pubDate>Tue, 05 Feb 2013 19:57:31 +0000</pubDate>
		<dc:creator>hongboz</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hongboz.wordpress.com/?p=155</guid>
		<description><![CDATA[In this post, we continue discussing syntactic meta-programming, following the last post. My years of experience with different meta-programming systems (Common Lisp, Template Haskell, Camlp4) tell me that quasi-quotation is the most essential part of syntactic meta-programming. All three claim to support quasi-quotation, but Template Haskell&#8217;s quasi-quotation falls far behind either Camlp4 or Common [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hongboz.wordpress.com&#038;blog=40164267&#038;post=155&#038;subd=hongboz&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>In this post, we continue discussing syntactic meta-programming<br />
following <a href="http://hongboz.wordpress.com/2013/01/28/random-thoughts-about-syntactic-meta-programming-i/">last post</a>.
</p>
<p>
My years of experience with different meta-programming systems (Common Lisp,<br />
Template Haskell, Camlp4) tell me that quasi-quotation is the most<br />
essential part of syntactic meta-programming. All three claim to<br />
support quasi-quotation, but Template Haskell&#8217;s<br />
quasi-quotation falls far behind that of Camlp4 or Common Lisp. A<br />
decent quasi-quotation system needs two things: first, nested quotation and<br />
anti-quotation; second, as in Lisp, every part of the code except keyword<br />
positions should be quotable and antiquotable, that&#8217;s to<br />
say, each part of a code fragment can be parametrized.
</p>
<p>
For the notation, we write <code>Ast^0</code> for the normal Ast, <code>Ast^1</code> for the Ast<br />
encoding <code>Ast^0</code>, and so on up to <code>Ast^n</code>.
</p>
<p>
So in this post, we discuss quasi-quotation first.
</p>
<p>
The implementation of quasi-quotation relies heavily on the<br />
implementation of the compiler, so let&#8217;s limit the scope to how to get<br />
quasi-quotation done in OCaml.
</p>
<p>
Let&#8217;s ignore the antiquote part and focus on the quote part first. The<br />
essence of quasi-quotation is to encode the Ast using the Ast itself at<br />
the meta level. There are different techniques for implementing<br />
quasi-quotation; to my knowledge there are three, summarized here:
</p>
<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Raw String manipulation</h3>
<div class="outline-text-3" id="text-1">
<p>   This is the most intuitive way. Given a string input, the normal<br />
   way of parsing is to transform it into a parsetree,
</p>
<pre class="src src-ocaml"><span style="color:#ffa500;font-weight:bold;">val</span> <span style="color:#dfaf8f;">parse</span><span style="color:#f0e68c;">:</span> <span style="color:#8cd0d3;">string </span><span style="color:#f0e68c;">-&gt;</span><span style="color:#8cd0d3;"> ast</span>
</pre>
<p>
   To encode the meta-level ast, we can unparse it again:<br />
   assume we have an unparsing function which prints the ast value back as source text
</p>
<pre class="src src-ocaml"><span style="color:#ffa500;font-weight:bold;">val</span> <span style="color:#dfaf8f;">unparse</span><span style="color:#f0e68c;">:</span> <span style="color:#8cd0d3;">ast </span><span style="color:#f0e68c;">-&gt;</span><span style="color:#8cd0d3;"> string</span>
</pre>
<p>
   so after composing parse and unparse, you have transformed a<br />
   string into the source text of its meta-level representation
</p>
<pre class="src src-ocaml"><span style="color:#f0e68c;">(</span>parse <span style="color:#cc9393;">"3"</span><span style="color:#f0e68c;">)</span>
<span style="color:#f0e68c;">-</span> `Int <span style="color:#cc9393;">"3"</span>
unparse<span style="color:#f0e68c;">(</span>parse <span style="color:#cc9393;">"3"</span><span style="color:#f0e68c;">)</span>
<span style="color:#f0e68c;">-</span> <span style="color:#cc9393;">"`Int \"3\""</span>

</pre>
<p>
   Then you can <code>parse</code> again: after <code>parse(unparse (parse "3"))</code>,<br />
   we have managed to lift the Ast to the meta level. There are serious<br />
   defects with this approach. First, it&#8217;s very brittle, since we are doing<br />
   string manipulation at different levels. Second, after <b>unparsing</b>,<br />
   the location is totally lost, and location handling is one of the most tedious<br />
   but necessary parts of a practical meta-programming system. Third,<br />
   there is no easy way to integrate antiquotation. This technique is<br />
   quite intuitive and easy to understand, but I don&#8217;t know of any<br />
   meta-programming system that does it this way, so feel free to tell me if you know<br />
   one that does something similar ;-)
</p>
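<p>
   A minimal sketch of this round trip, assuming a toy ast with three<br />
   hypothetical constructors and hand-written <code>parse</code> and <code>unparse</code><br />
   helpers (none of this is Fan&#8217;s or Camlp4&#8217;s API):
</p>
<pre class="src src-ocaml">(* Toy ast: the constructor names are illustrative assumptions. *)
type ast =
  | Int of string          (* integer literal *)
  | Str of string          (* string literal *)
  | Vrn of string * ast    (* variant tag applied to one argument *)

(* Hypothetical parser that understands only two input shapes:
   a backquoted tag followed by a quoted string, or a bare literal. *)
let parse (s : string) : ast =
  match String.index_opt s ' ' with
  | Some i when s.[0] = '`' ->
      let tag = String.sub s 1 (i - 1) in
      let quoted = String.sub s (i + 1) (String.length s - i - 1) in
      (* drop the surrounding double quotes; escapes are not handled *)
      Vrn (tag, Str (String.sub quoted 1 (String.length quoted - 2)))
  | _ -> Int s

(* Hypothetical unparser: prints an ast value as OCaml source text. *)
let rec unparse : ast -> string = function
  | Int n -> Printf.sprintf "`Int %S" n
  | Str s -> Printf.sprintf "`Str %S" s
  | Vrn (tag, arg) -> Printf.sprintf "`Vrn (%S, %s)" tag (unparse arg)

let level0 = parse "3"           (* the object-level node *)
let source = unparse level0      (* its printed form, a plain string *)
let level1 = parse source        (* Vrn ("Int", Str "3"): the lifted node *)
</pre>
<p>
   Nothing in <code>level1</code> records where the original <code>3</code> sat in its input,<br />
   which is the location problem described above.
</p>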
</div>
</div>
<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Maintaining different parsers</h3>
<div class="outline-text-3" id="text-2">
<p>
   Unlike string manipulation, this approach writes different parsers for<br />
   different purposes. Suppose we are in OCaml and want to support<br />
   quasi-quotation in the following syntax categories
</p>
<pre class="example">sig_item, str_item, patt, expr, module_type, module_expr, class_type
class_expr, class_sig_item, class_str_item, with_constr, binding, rec_binding,
match_case,
</pre>
<p>
   If you also want quasi-quotations to appear in both <code>expr</code> and <code>patt</code><br />
   positions, then you have to write <code>14 x (2+1)</code> parsers, that is 42, and the parsers<br />
   are not reusable: if you want to support <code>overloaded quotations</code> (I<br />
   will talk about them later), then you have to roll your own parser<br />
   again. Writing a parser is not hard, but it&#8217;s not fun either, and<br />
   keeping different parsers in sync is a nightmare.
</p>
<p>
   To make things worse, once anti-quotation is considered, there are<br />
   three parsers to write for each category, and anti-quotation makes<br />
   them slightly different from one another. To be honest, this approach is impractical.
</p>
</div>
</div>
<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Ast Lifting</h3>
<div class="outline-text-3" id="text-3">
<p>   Another mechanism for quasi-quotation is to imagine that we have a<br />
   powerful function:
</p>
<pre class="src src-ocaml"><span style="color:#ffa500;font-weight:bold;">val</span> <span style="color:#dfaf8f;">meta</span><span style="color:#f0e68c;">:</span> <span style="color:#8cd0d3;">ast</span><span style="color:#f0e68c;">^</span>0 <span style="color:#f0e68c;">-&gt;</span> ast<span style="color:#f0e68c;">^</span>1
</pre>
<p>
   This seems like magic, but it&#8217;s possible even though OCaml doesn&#8217;t<br />
   have generic programming support, since we have the definition<br />
   of the ast.
</p>
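<p>
   As a rough sketch of why the type definition is enough: such a function can be<br />
   written (or generated) with one case per constructor. The toy type below is an<br />
   assumption for illustration, without locations, and is not Fan&#8217;s real Ast:
</p>
<pre class="src src-ocaml">(* Toy object-level ast (Ast^0); tag names loosely follow this post. *)
type ast =
  [ `Int of string
  | `Str of string
  | `Vrn of string            (* a variant tag used as a constructor *)
  | `App of ast * ast ]

(* meta encodes every node as the expression that would build it, so the
   result is an Ast^1 value that happens to reuse the same ast type. *)
let rec meta : ast -> ast = function
  | `Int s -> `App (`Vrn "Int", `Str s)
  | `Str s -> `App (`Vrn "Str", `Str s)
  | `Vrn s -> `App (`Vrn "Vrn", `Str s)
  | `App (f, x) -> `App (`App (`Vrn "App", meta f), meta x)

let ast0 : ast = `Int "3"
let ast1 = meta ast0     (* the encoding of ast0 *)
let ast2 = meta ast1     (* applying meta again gives Ast^2 *)
</pre>
<p>
   Because the input and output share one type, the same function can be<br />
   applied repeatedly to reach any <code>Ast^n</code>.
</p>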
<p>
   The problem with this technique is that it requires an explicit<br />
   <code>Ant</code> tag in the ast representation, since at the <code>ast^0</code> level you<br />
   have to store <code>Ant</code> as an intermediate node, which is eliminated during<br />
   the meta step, once the <code>meta</code> function has been applied.
</p>
<p>
   Let&#8217;s walk through each phase in Fan.
</p>
<p>
   Think about how the following piece of code would be parsed in Fan:
</p>
<pre class="src src-ocaml"><span style="color:#f0e68c;">{:</span><span style="color:#8cd0d3;">expr</span><span style="color:#f0e68c;">|</span> <span style="color:#f0e68c;">$</span>x <span style="color:#f0e68c;">+</span> y<span style="color:#f0e68c;">|}</span>
</pre>
<p>
   For the first phase (I removed the location for simplicity)
</p>
<pre class="src src-ocaml">`App<span style="color:#f0e68c;">(</span>`App
      <span style="color:#f0e68c;">(</span> `Id <span style="color:#f0e68c;">(</span> `Lid <span style="color:#f0e68c;">(</span> <span style="color:#cc9393;">"+"</span><span style="color:#f0e68c;">)),</span>
        `Ant <span style="color:#f0e68c;">(</span> <span style="color:#f0e68c;">{</span>cxt <span style="color:#f0e68c;">=</span> <span style="color:#cc9393;">"expr"</span><span style="color:#f0e68c;">;</span> sep <span style="color:#f0e68c;">=</span> None<span style="color:#f0e68c;">;</span> decorations <span style="color:#f0e68c;">=</span> <span style="color:#cc9393;">""</span><span style="color:#f0e68c;">;</span> content <span style="color:#f0e68c;">=</span> <span style="color:#cc9393;">"x"</span><span style="color:#f0e68c;">})),</span>
     `Id <span style="color:#f0e68c;">(</span> `Lid <span style="color:#f0e68c;">(</span> <span style="color:#cc9393;">"y"</span><span style="color:#f0e68c;">)))</span>
</pre>
<p>
   Here <code>Ant</code> exists only as an intermediate node; it will be eliminated<br />
   in the meta step.
</p>
<p>
   After applying the <code>meta</code> function:
</p>
<pre class="src src-ocaml"><span style="color:#f0e68c;">(</span><span style="color:#8cd0d3;">Filters</span>.<span style="color:#8cd0d3;">ME</span>.meta_expr _loc <span style="color:#f0e68c;">(</span>t expr <span style="color:#cc9393;">"$x + y"</span><span style="color:#f0e68c;">));</span>
<span style="color:#f0e68c;">-</span> <span style="color:#f0e68c;">:</span> <span style="color:#8cd0d3;">FanAst.expr </span><span style="color:#f0e68c;">=</span>
`App
  <span style="color:#f0e68c;">(,</span>
   `App
     <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"App"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span>
      `App
        <span style="color:#f0e68c;">(,</span>
         `App
           <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"App"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span>
            `App
              <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"Id"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span>
               `App
                 <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"Lid"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span>
                  `Str <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"+"</span><span style="color:#f0e68c;">)))),</span>
         `Ant <span style="color:#f0e68c;">(,</span> <span style="color:#f0e68c;">{</span>cxt <span style="color:#f0e68c;">=</span> <span style="color:#cc9393;">"expr"</span><span style="color:#f0e68c;">;</span> sep <span style="color:#f0e68c;">=</span> None<span style="color:#f0e68c;">;</span> decorations <span style="color:#f0e68c;">=</span> <span style="color:#cc9393;">""</span><span style="color:#f0e68c;">;</span> content <span style="color:#f0e68c;">=</span> <span style="color:#cc9393;">"x"</span><span style="color:#f0e68c;">}))),</span>
   `App
     <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"Id"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span>
      `App <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"Lid"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span> `Str <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"y"</span><span style="color:#f0e68c;">))))</span>   
</pre>
<p>
   (t is a parsing function)<br />
   Here we see that the <code>Ant</code> node is still kept and remains to be<br />
   filtered. Now we can filter the <code>Ant</code> node into a normal Ast,
</p>
<pre class="src src-ocaml"><span style="color:#f0e68c;">(</span><span style="color:#8cd0d3;">Ant</span>.antiquot_expander <span style="color:#f0e68c;">~</span>parse_patt <span style="color:#f0e68c;">~</span>parse_expr<span style="color:#f0e68c;">)#</span>expr <span style="color:#f0e68c;">(</span><span style="color:#8cd0d3;">Filters</span>.<span style="color:#8cd0d3;">ME</span>.meta_expr _loc <span style="color:#f0e68c;">(</span>t expr <span style="color:#cc9393;">" $x + y"</span><span style="color:#f0e68c;">));</span>
<span style="color:#f0e68c;">-</span> <span style="color:#f0e68c;">:</span> <span style="color:#8cd0d3;">FanAst.expr </span><span style="color:#f0e68c;">=</span>
`App
  <span style="color:#f0e68c;">(,</span>
   `App
     <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"App"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span>
      `App
        <span style="color:#f0e68c;">(,</span>
         `App
           <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"App"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span>
            `App
              <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"Id"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span>
               `App
                 <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"Lid"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span>
                  `Str <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"+"</span><span style="color:#f0e68c;">)))),</span>
         `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"x"</span><span style="color:#f0e68c;">)))),</span>
   `App
     <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"Id"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span>
      `App <span style="color:#f0e68c;">(,</span> `App <span style="color:#f0e68c;">(,</span> `Vrn <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"Lid"</span><span style="color:#f0e68c;">),</span> `Id <span style="color:#f0e68c;">(,</span> `Lid <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"_loc"</span><span style="color:#f0e68c;">))),</span> `Str <span style="color:#f0e68c;">(,</span> <span style="color:#cc9393;">"y"</span><span style="color:#f0e68c;">))))</span>   
</pre>
<p>
   (locations at the meta level are ignored)<br />
   If we want to share the same grammar among all <code>Ast^n(n=0,1,2,...)</code>,<br />
   Ast lifting (a function of type <code>Ast^0 -&gt; Ast^1</code>) is necessary.
</p>
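<p>
   The two phases shown above can be imitated on a toy type. This is only a sketch:<br />
   the <code>`Ant</code> payload is reduced to a plain string, locations are dropped, and<br />
   <code>expand</code> is a hypothetical stand-in for the antiquotation expander:
</p>
<pre class="src src-ocaml">type ast =
  [ `Lid of string
  | `Str of string
  | `Vrn of string
  | `App of ast * ast
  | `Ant of string ]     (* antiquotation placeholder such as $x *)

(* Phase 1: lift every node except `Ant, which is passed through. *)
let rec meta : ast -> ast = function
  | `Lid s -> `App (`Vrn "Lid", `Str s)
  | `Str s -> `App (`Vrn "Str", `Str s)
  | `Vrn s -> `App (`Vrn "Vrn", `Str s)
  | `App (f, x) -> `App (`App (`Vrn "App", meta f), meta x)
  | `Ant _ as a -> a

(* Phase 2: replace each `Ant by a reference to the named variable. *)
let rec expand : ast -> ast = function
  | `Ant x -> `Lid x
  | `App (f, x) -> `App (expand f, expand x)
  | (`Lid _ | `Str _ | `Vrn _) as a -> a

(* The quotation $x + y after parsing, then after both phases. *)
let quoted : ast = `App (`App (`Lid "+", `Ant "x"), `Lid "y")
let lifted = expand (meta quoted)
</pre>
<p>
   In <code>lifted</code>, <code>y</code> has become data (<code>`Str "y"</code> under a <code>`Lid</code> tag) while <code>x</code> stays<br />
   a live variable reference, which is exactly the difference between quoted and antiquoted code.
</p>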
</div>
</div>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">Summary</h2>
<div class="outline-text-2" id="text-1">
<p>
  We have seen three techniques for implementing<br />
  quasi-quotation. Fan adopts the third one. Supposing we pick the<br />
  third one, let&#8217;s discuss what kind of Ast representation we need to<br />
  make life easier.
</p>
<p>
  As we discussed previously, introducing records in the Abstract Syntax<br />
  brings in unnecessary complexity when you want to encode the Ast<br />
  using the Ast itself, since you have to express the records at the<br />
  meta level as well.
</p>
<p>
  Another defect of the current Parsetree is that it was designed without<br />
  meta-programming in mind, so it does not provide an <code>Ant</code> tag in all<br />
  syntax categories. At stage zero, <code>Ast^0</code>, you cannot have an<br />
  Ast node <code>$x</code> at the outermost position, since it&#8217;s semantically incorrect in<br />
  <code>Ast^0</code>, although it is syntactically correct in <code>Ast^n(n=0,1,2,...)</code>.
</p>
<p>
  The third defect of the <code>Parsetree</code> is that it&#8217;s quite irregular,<br />
  so you cannot do any meta-programming over the parsetree type itself: for<br />
  example, stripping all the locations from the Ast nodes to derive a new<br />
  type without locations, or deriving a new type without anti-quotation tags (we<br />
  will see that such an ability is quite important in <a href="https://github.com/bobzhang/Fan">Fan</a>).
</p>
<p>
  The fourth defect is more serious from the point of view of<br />
  semantics: in OCaml there is no way to express an absolute path, and<br />
  the time at which you define the Ast lifting is<br />
  different from the time at which you use the quotations, so a name in the lifted code may resolve differently at the use site.
</p>
<p>
  Camlp4&#8217;s Ast is slightly better than Parsetree, since it does not<br />
  introduce records to increase the complexity.
</p>
<p>
  However, Camlp4&#8217;s Ast cannot express absolute paths, which<br />
  makes its semantics imprecise. Another serious implementation<br />
  defect is that it encodes anti-quotation using two<br />
  techniques at once: an explicit <code>Ant</code> tag and string mangling (prefixing<br />
  the string with <code>\\$:</code>). Camlp4&#8217;s tag names are also not<br />
  meaningful.
</p>
<p>
  Thinking a bit further about syntactic meta-programming, what we<br />
  really care about is purely syntax: <code>`Int "3"</code> should not be different whether it is of type <code>expr</code> or <code>patt</code>. If we take the<br />
  location of an ast node, we should not care whether its type is<br />
  <code>expr</code> or <code>patt</code> or <code>str_item</code>, right?
</p>
<p>
  If we compose two ast nodes using the semicolon syntax <code>;</code>, we really don&#8217;t<br />
  care whether they are expr nodes or patt nodes
</p>
<pre class="src src-ocaml"><span style="color:#ffa500;font-weight:bold;">let</span> <span style="color:#8cd0d3;">sem</span><span style="color:#dfaf8f;"> a b </span><span style="color:#f0e68c;">=</span> <span style="color:#f0e68c;">{|</span> <span style="color:#f0e68c;">$</span>a<span style="color:#f0e68c;">;</span> <span style="color:#f0e68c;">$</span>b <span style="color:#f0e68c;">|}</span>
</pre>
<p>
  The code above should work well in all syntax categories, as<br />
  long as they support the <code>`Sem</code> tag.
</p>
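<p>
  Without the quotation syntax, the same idea can be sketched with plain polymorphic<br />
  variants. The two types below are cut-down assumptions, not Fan&#8217;s real <code>expr</code><br />
  and <code>patt</code>, and locations are omitted:
</p>
<pre class="src src-ocaml">(* Two hypothetical syntax categories that both carry a `Sem tag. *)
type expr = [ `Int of string | `Sem of expr * expr ]
type patt = [ `Any | `Sem of patt * patt ]

(* One definition; its inferred type mentions only the `Sem tag,
   so it is usable in every category that has that tag. *)
let sem a b = `Sem (a, b)

let e : expr = sem (`Int "1") (`Int "2")
let p : patt = sem `Any `Any
</pre>
<p>
  With ordinary (nominal) variants, <code>sem</code> would have to be written once per category.
</p>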
<p>
  Changing the underlying representation of the Ast means that none of the existing<br />
  code in the Camlp4 engine can be reused, since its quotation kit no<br />
  longer applies in Fan. But the tough old days are already gone: Fan<br />
  has already managed to provide the whole quotation kit from scratch.  In<br />
  the next post we will talk about the underlying Ast using polymorphic<br />
  variants in Fan, and argue why it&#8217;s the right direction.
</p>
<p>
  Thanks for reading! (btw, there&#8217;s a bug in Emacs org/blog, sorry for posting several times)
</p>
</div>
</div>
</div>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/hongboz.wordpress.com/155/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/hongboz.wordpress.com/155/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hongboz.wordpress.com&#038;blog=40164267&#038;post=155&#038;subd=hongboz&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://hongboz.wordpress.com/2013/02/05/syntactic-meta-programming-in-ocaml-ii-5/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5285ae5e9b0eb3ff9612d458b48a04e1?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">hongboz</media:title>
		</media:content>
	</item>
		<item>
		<title>Discussions on the Syntactic Meta Programming(wg-camlp4 list)</title>
		<link>http://hongboz.wordpress.com/2013/01/31/discussions-on-the-syntactic-meta-programmingwg-camlp4-list/</link>
		<comments>http://hongboz.wordpress.com/2013/01/31/discussions-on-the-syntactic-meta-programmingwg-camlp4-list/#comments</comments>
		<pubDate>Thu, 31 Jan 2013 04:47:09 +0000</pubDate>
		<dc:creator>hongboz</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hongboz.wordpress.com/?p=69</guid>
		<description><![CDATA[There are some interesting discussions on the wg-camlp4 mailing list. I wrote a long mail yesterday; I have cleaned it up a bit and pasted it here.  &#8212;&#8212;&#8212;  I rewrote the whole of camlP4 (named Fan) from scratch, building the quotation kit and throwing away the crappy grammar parser, so please believe me that I do understand the whole technology stack [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=hongboz.wordpress.com&#038;blog=40164267&#038;post=69&#038;subd=hongboz&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>There are some interesting discussions on the wg-camlp4 mailing list. I wrote a long mail yesterday; I have cleaned it up a bit and pasted it here. </p>
<p>&#8212;&#8212;&#8212;</p>
<div> I rewrote the whole of camlP4 (named Fan) from scratch, building the quotation kit and throwing away the crappy grammar parser, so please believe me <b>that I do understand the whole technology stack of camlP4</b>. If we could reach some consensus, I would be happy to hand over the maintenance of Fan. Fan does not lose any feature compared with camlP4; in fact it has more interesting features.</div>
<div> </div>
<div>   Let&#8217;s begin with some easy, not too technical parts which nevertheless have a significant effect on user experience:</div>
<div>   1. Performance</div>
<div>          Performance does matter. It&#8217;s a shame that most of the time spent compiling the ocaml compiler is dedicated to camlP4, but it is an engineering problem: currently compiling Fan takes less than 20s, and it can be improved further</div>
<div>   2. Building issues</div>
<div>        The design of having side effects by dynamic loading is generally a bad idea. In Fan<b> dynamic loading only registers some functionality that Fan supports;</b> it <b>does not have any other side effect</b>. Each file stands alone and says which (ppx, or filters, or syntax) it wants to use, with a good default option, so the build is always something like &#8216;-pp fan plugin1 plugin2 plugin3&#8217;. <b>The order of plugins does not matter</b>; also, <b>loading all the plugins you have does not have any side effect. Even better, you can statically link all the plugins you collected, so the build process is simplified.  </b></div>
<div><b> </b> 3. Grammar Extension (<b>Language namespace</b>)</div>
<div><b>       </b>I concur that arbitrary grammar extension is a bad idea, and I agree with Gabriel that so far only quotation is modular and composable (here quotation means a delimited DSL, while quasi-quotation means Lisp-style macros). I also agree with Gabriel that -ppx<b> should not be used to do syntax overriding (this should not actually be called syntax extension); </b>syntax overriding is a terrible idea, since the user never understands what&#8217;s going on underneath without reading the Makefile. So my suggestion is that some really convenient syntax extensions, e.g. (let try.. in), should be merged into the built-in parser. Quotations do not bring in too much heavy syntax (imho). In Fan, we proposed the concept of a hierarchical language namespace, since once quotations are heavily used, it&#8217;s really easy to introduce conflicts. <b>Language namespace lookup is exactly like Java package namespaces:</b> you can import, and close imports, to save some typing.</div>
<div>    Here is a taste</div>
<div>   &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8211;</div>
<div>     {:.Fan.Lang.Meta.N.expr| a + b |} ----&gt; </div>
<div>      `App (`App ((`Id (`Lid "+")), (`Id (`Lid "a"))), (`Id (`Lid "b")))</div>
<div>     {:.Fan.Lang.Meta.expr| a + b |}  ----&gt;</div>
<div>      `App</div>
<div>    (_loc,</div>
<div>      (`App</div>
<div>         (_loc, (`Id (_loc, (`Lid (_loc, "+")))),</div>
<div>           (`Id (_loc, (`Lid (_loc, "a")))))),</div>
<div>      (`Id (_loc, (`Lid (_loc, "b"))))) </div>
<div>
<div> &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8211;</div>
<div> In .Fan.Lang.Meta.expr, the first &#8216;.&#8217; means the name is taken from the absolute namespace; <b>N.expr shares exactly the same syntax but without locations</b>.</div>
</div>
<div> </div>
<div>   4. Portability to different compiler extensions (like LexiFi&#8217;s fork of ocaml)</div>
<div>       I am pretty sure this is easy to do in Fan: only the Ast2pt part (dumping the intermediate Ast into Parsetree) needs to be changed for different compilers.
<div> </div>
<div>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</div>
<div>Now let&#8217;s talk about some internal parts of SMP.</div>
<div>Quasi-quotation is the essential part of SMP. I am surprised that so far the discussion <b>silently ignores quasi-quotation.</b> Leo&#8217;s answer of writing three parsers is neither satisfying nor practical (imho). </div>
<div> </div>
<div>Camlp4 is mainly composed of two parts: one is the extensible parser and <b>the other significant part is Ast Lifting</b>. Since we all agree that the extensible parser increases the complexity too much, let&#8217;s simply ignore that part.</div>
<div> </div>
<div>Ast Lifting is tightly coupled <b>with the design of the Abstract Syntax Tree.</b>  People complain that the Camlp4 Ast is hard to learn and that using quasi-quotation to do pattern matching is a bad idea.</div>
<div> </div>
<div>Let me explain the topic a bit:</div>
<div>    Camlp4Ast is hard to learn, I agree: it has some alien names whose meaning nobody understands. Quasi-quotation <b>is definitely a great idea</b> for boosting meta-programming, but my experience here is: <b>for very, very small Ast fragments, use the Abstract Syntax Tree directly;</b> otherwise quasi-quotation is a life saver for meta-programming.</div>
<div>   Luckily the quotation kit has nothing to do with the parser part; it&#8217;s simply several functions (I simplify a bit here) which turn a normal runtime value into an Ast node generically.  </div>
<div><b>Such functions are neither easy to write nor easy to read</b>; <b>the ideal case is that they are generated once and for all, and that for all the data types in normal ocaml</b> <b>they are derived automatically</b> (some ADTs containing functions cannot be derived). <b>I bet it&#8217;s most likely a nightmare if we maintain 3 parsers for the ocaml grammar, two of which dump to a meta level</b></div>
<div>  </div>
<div>   So, how do we make Ast Lifting easier? </div>
<div>        The first guideline is <b>&#8220;Don&#8217;t mix in records&#8221;. </b></div>
<div><b>         </b>Once you encode the AST with records, you have to encode the records at the meta level too, which increases the complexity without bringing any new features; <b>it&#8217;s simply not worthwhile.</b></div>
<div><b> </b></div>
<div><b>       </b> The second guideline is &#8220;Don&#8217;t do <b>any </b>syntax desugaring&#8221;. Syntax desugaring makes the semantics of syntactic meta-programming a bit weird. Syntax desugaring happens everywhere in Parsetree: think about list literals, which are desugared. If you don&#8217;t use any syntax desugaring and you want to match, for example, a bigarray access, you simply need to match `Bigarray(..) instead of </div>
<div>
<div> </div>
<div>Pexp_apply</div>
<div>        ({pexp_desc=Pexp_ident</div>
<div>                     {txt= Ldot (Ldot (Lident "Bigarray", array), ("get"|"set" as gs)) ;_};_},</div>
<div>         label_exprs)</div>
</div>
<div>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</div>
<div>       The third guideline is to<b> </b>make it <b>as uniform as possible</b>.</div>
<div><b>       </b>This not only helps the user, <b>it also helps meta-programming over types derive some utility types. </b>Take a look at my Ast encoding in Fan <a href="https://github.com/bobzhang/Fan/blob/master/src/Ast.ml" target="_blank">https://github.com/bobzhang/Fan/blob/master/src/Ast.ml</a> (it needs to be polished; please don&#8217;t panic when you see the variants I use there).</div>
<div><b>      </b>The initial Ast has locations and antiquotation (ant) support, but<b> here we derive three other Asts thanks to my very regular design</b>.<b> AstN is the Ast without locations</b> (locations are important, but they are not much help when you only do code generation, and they complicate the expanded code a lot); <b>AstA is the Ast without antiquotations (simply remove the ant branch)</b>, and it is a subtype of Ast (thanks to our choice of variants here); <b>AstNA is the Ast with neither locations nor antiquotations</b>, and it is a subtype of AstN.  <b>In practice, I have found the Ast without locations particularly helpful when you only do code generation; it simplifies that part significantly.</b> <i><span style="text-decoration:underline;">The beautiful part is that all four Asts share the same grammar and the same quasi-quotation mechanism, as I showed with .Fan.Lang.N.expr and .Fan.Lang.expr.</span></i></div>
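<div>    To make the subtyping claim concrete, here is a minimal sketch. It is not Fan&#8217;s actual definition: the types <code>expr</code>, <code>expr_a</code> and <code>expr_n</code> and the functions <code>widen</code> and <code>strip_loc</code> are hypothetical three-tag stand-ins for Ast, AstA and AstN.</div>
<pre class="src src-ocaml">(* Hypothetical miniature of the Ast family; all names are illustrative. *)
type loc = int * int                      (* start and end offsets *)

(* "Ast": locations plus an antiquotation branch. *)
type expr =
  [ `Int of loc * string
  | `App of loc * expr * expr
  | `Ant of loc * string ]

(* "AstA": the same tags minus `Ant, hence a subtype of expr. *)
type expr_a =
  [ `Int of loc * string
  | `App of loc * expr_a * expr_a ]

(* "AstN": the same tags with the location component dropped. *)
type expr_n =
  [ `Int of string
  | `App of expr_n * expr_n
  | `Ant of string ]

(* Going from AstA to Ast needs no traversal, only a coercion. *)
let widen (e : expr_a) : expr = (e : expr_a :&gt; expr)

(* Going from Ast to AstN is a mechanical traversal. *)
let rec strip_loc : expr -&gt; expr_n = function
  | `Int (_, s) -&gt; `Int s
  | `App (_, f, x) -&gt; `App (strip_loc f, strip_loc x)
  | `Ant (_, s) -&gt; `Ant s
</pre>
<div>    Because the tag names are shared, a traversal like <code>strip_loc</code> is regular enough to be derived from the type definition rather than written by hand, which is the point of keeping the encoding uniform.</div>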
<div>    I don&#8217;t know how many parsers you would have to maintain to reach such a goal, or whether it would ever happen at all.</div>
<div>    Using variants to encode the intermediate Ast has lots of other benefits, but I don&#8217;t want to cover them in such a short mail.</div>
<div> </div>
<div>   So,<b> my proposal is that the community design an intermediate Ast together and write a built-in parser that targets that intermediate Ast and then dumps to Parsetree. I also think Parsetree still needs to be cleaned up a bit, but without too much change.  </b>I would appreciate it if you could take something away from Fan; I think Parsetree is<b> not the ideal place</b> to do SMP. HTH</div>
</div>
<p> </p>
]]></content:encoded>
			<wfw:commentRss>http://hongboz.wordpress.com/2013/01/31/discussions-on-the-syntactic-meta-programmingwg-camlp4-list/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5285ae5e9b0eb3ff9612d458b48a04e1?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">hongboz</media:title>
		</media:content>
	</item>
		<item>
		<title>Random thoughts about Syntactic Meta Programming (I)</title>
		<link>http://hongboz.wordpress.com/2013/01/28/random-thoughts-about-syntactic-meta-programming-i/</link>
		<comments>http://hongboz.wordpress.com/2013/01/28/random-thoughts-about-syntactic-meta-programming-i/#comments</comments>
		<pubDate>Mon, 28 Jan 2013 05:00:00 +0000</pubDate>
		<dc:creator>hongboz</dc:creator>
				<category><![CDATA[Hello]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hongboz.wordpress.com/?p=59</guid>
		<description><![CDATA[I should have written this post a long time ago, but I am so addicted to Fan that I haven&#8217;t had time to write it; programming is much more fun than blogging. Anyway, better late than never, XD. What&#8217;s syntactic meta programming? What&#8217;s meta programming? Meta programming is an interesting but also challenging domain, the essential idea [...]]]></description>
				<content:encoded><![CDATA[<p>I should have written this post a long time ago, but I am so addicted to <a href="https://github.com/bobzhang/Fan">Fan</a> that I haven&#8217;t had time to write it; programming is much more fun than blogging. </p>
<p> Anyway, better late than never, XD. </p>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">What&#8217;s syntactic meta programming?</h2>
<div class="outline-text-2" id="text-1">     </div>
<div id="outline-container-1-1" class="outline-3">
<h3 id="sec-1-1">What&#8217;s meta programming?</h3>
<div class="outline-text-3" id="text-1-1">
<p>    Meta programming is an interesting but challenging domain; the essential idea is &#8220;program as data&#8221;. Wait, you may object that in the <a href="http://en.wikipedia.org/wiki/Von_Neumann_architecture">Von Neumann architecture</a> a program is <b>always</b> data, so to be more precise, meta programming is &#8220;program as structured data&#8221;, where the structured data should be easy to manipulate and generate. Think about Lisp: since it has almost no concrete syntax of its own, a Lisp program is always an <a href="http://en.wikipedia.org/wiki/S-expression">S-expression</a>, a hierarchical data structure which is easy to manipulate and process. </p>
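<p>    As a small illustration of &#8220;program as structured data&#8221;, here is a sketch in OCaml. The toy type <code>expr</code> and the functions <code>simplify</code> and <code>to_string</code> are made up for this example and have nothing to do with any real compiler. </p>
<pre class="src src-ocaml">(* A toy object language: arithmetic expressions as a tree. *)
type expr =
  | Num of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

(* A program that manipulates programs: fold constant subterms. *)
let rec simplify = function
  | Add (a, b) -&gt;
    (match simplify a, simplify b with
     | Num x, Num y -&gt; Num (x + y)
     | a', b' -&gt; Add (a', b'))
  | Mul (a, b) -&gt;
    (match simplify a, simplify b with
     | Num x, Num y -&gt; Num (x * y)
     | a', b' -&gt; Mul (a', b'))
  | e -&gt; e

(* Turn the data back into concrete syntax. *)
let rec to_string = function
  | Num n -&gt; string_of_int n
  | Var v -&gt; v
  | Add (a, b) -&gt; "(" ^ to_string a ^ " + " ^ to_string b ^ ")"
  | Mul (a, b) -&gt; "(" ^ to_string a ^ " * " ^ to_string b ^ ")"
</pre>
<p>    Doing the same rewrite on a raw string or a token stream would be far more painful; the tree shape is what makes the manipulation easy. </p>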
</p></div>
</p></div>
<div id="outline-container-1-2" class="outline-3">
<h3 id="sec-1-2">Meta-program at different layers</h3>
<div class="outline-text-3" id="text-1-2">
<p>    When you write a compiler, the program has different representations at different stages. Think about the OCaml compiler workflow: </p>
<pre class="example">Raw String --&gt;  Token Stream --&gt;
Parsetree --&gt; Typedtree --&gt; Lambda --&gt;
ULambda --&gt; C-- --&gt; Mach --&gt; Linear --&gt;
Assembly
</pre>
<p>    So, at different stages, the program as structured data can be processed in different ways. </p>
<p>    You can insert plugins at each level. C macros, for example, mainly do token stream transformation, but the problem with a token stream is that it is not structured data. </p>
<p>    The earlier the stage at which you do the transformation, the easier it is to map the result back to your original source program; the later the stage, the more program analysis the compiler has already done, but the harder it is to map back to the original program. So each stage has its use case. </p>
<p>    Here we only talk about syntactic meta programming (SMP), where the layer is the parsetree, also called the abstract syntax, and the only host language we discuss is <a href="http://caml.inria.fr/">OCaml</a> (OCaml is really a great language, you should give it a try!), but some of the high-level design choices should apply to other host languages as well. </p>
</p></div>
</p></div>
</p></div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">The essential part of SMP</h2>
<div class="outline-text-2" id="text-2">
<p>   I suggest that anyone who is interested in SMP learn <a href="http://en.wikipedia.org/wiki/Common_Lisp">Common Lisp</a>; there are so many brilliant ideas there that have been forgotten by people outside the community. Two books are really fun: one is <a href="http://www.paulgraham.com/onlisp.html">On Lisp</a>, the other is <a href="http://letoverlambda.com/">Let Over Lambda</a>. </p>
<p>   The essential part of SMP is quasi-quotation. There is a nice paper that introduces the benefits of quasi-quotation: <a href="#people.csail.mit.edu-alan-ftp-quasiquote-v59.ps.gz">Quasiquotation in Lisp</a>. </p>
<p>   Here we only scratch its surface a tiny bit: &#8220;Quasiquotation is a <b>parameterized version</b> of ordinary quotation, where instead of specifying a value exactly, some <b>holes</b> are left to be filled in later. A quasiquotation is a template.&#8221; Briefly, quasi-quotation gives you the ability to <b>abstract over code</b>. </p>
<p>   As the paper said, a typical use of quasiquotation in a macro   definition looks like </p>
<pre class="src src-lisp">(<span style="color:#af00ff;">defmacro</span> (<span style="color:#0000ff;">push</span> expr var)
 `(set! ,var (cons ,expr ,var)))
</pre>
<p>   Here the &#8220;`&#8221; introduces a quasi-quotation, and &#8220;,&#8221; introduces a parameter (we also call it an anti-quote). A number of languages besides the Lisp family support quasi-quotation, but <b>none</b> of them comes even close to Lisp. </p>
<p>   The challenging part, however, lies not in the quote part but in the <b>anti-quote</b> part. In Lisp, you can antiquote <b>everywhere</b>. Suppose you are writing <code>Template Haskell</code>: you cannot write something like this </p>
<pre class="example">[| import $module |]
</pre>
<p>   Lisp, by contrast, allows very <b>fine-grained</b> quasi-quotation. </p>
<p>   The other challenging part is <b>nested quasi-quotation</b>. Since a meta-program is itself a normal program, when you do a lot of meta programming in Common Lisp you will find that you have written a lot of duplicated meta-programs; this is where nested quasi-quotation comes to the rescue. </p>
<p>   Discussing nested quasi-quotation goes beyond the scope of this first post, but you can have a taste here </p>
<pre class="src src-lisp">(<span style="color:#af00ff;">defmacro</span> (<span style="color:#0000ff;">def-caller</span> abbrev proc)
 `(<span style="color:#af00ff;">defmacro</span> (,abbrev var expr)
    `(,`,proc (<span style="color:#af00ff;">lambda</span> (,var) ,expr))))
</pre>
</p></div>
<div id="outline-container-2-1" class="outline-3">
<h3 id="sec-2-1">Some defects in Lisp Style Macros</h3>
<div class="outline-text-3" id="text-2-1">
<p>    Though I really enjoy Lisp macros, to be honest, the S-expression as concrete syntax is not the optimal way to express ideas in a program. </p>
<p>    The price you pay for the extreme flexibility is that every program is written in a sub-optimal concrete syntax. </p>
<p>    The second problem is that Lisp is a dynamically typed language. Current practical type systems may catch only fairly trivial errors, but they <b>do help a lot</b>. </p>
<p>    A sufficiently smart compiler, like <a href="http://www.sbcl.org/">SBCL</a>, does type inference or constraint propagation and <b>emits really helpful warnings</b>, so type checking may not be that important there. But that depends on the compiler implementation; in some young implementations, like <a href="http://clojure.org/">clojure</a>, the compiler is not yet smart enough to help with diagnosis. </p>
<p>    The third problem is that Lisp macros ignore <b>locations</b> totally: when you process the raw S-expression, no location is kept. In some domains, code generation for example, location is not that important since you only emit a large chunk of code; in other domains, Ast transformation for example, location is important for emitting helpful error messages. Keeping locations correct is very tedious but necessary, IMHO. Some other meta programming systems, such as Template Haskell, ignore locations as well. </p>
</p></div>
</p></div>
</p></div>
<div id="outline-container-3" class="outline-2">
<h2 id="sec-3">How to do SMP in a rich-syntax language</h2>
<div class="outline-text-2" id="text-3">
<p>   Now let&#8217;s go back to OCaml, the great language XD. </p>
<p>   As in Lisp, you have to encode the Ast in the host language, and you can encode OCaml&#8217;s Ast using S-expressions as well. </p>
<p>   S-expression is a viable option; <a href="http://felix-lang.org/">Felix</a> adopts this mechanism. The advantage of using S-expressions to encode the Ast is that you reach <b>the maximum code reuse</b> and <b>don&#8217;t need to fight against the type system</b> from time to time. </p>
<p>   For example, in <a href="http://brion.inria.fr/gallium/index.php/Camlp4">Camlp4</a>, once you want to get the location of an Ast node, you have to fix its type, so you have to write a lot of boilerplate code like this </p>
<pre class="src src-ocaml"><span style="color:#0000ee;font-weight:bold;">val</span> <span style="color:#af5f00;">loc_of_expr</span><span style="color:#af0000;">:</span> <span style="color:#008700;">expr </span><span style="color:#af0000;">-&gt;</span><span style="color:#008700;"> loc</span>
<span style="color:#0000ee;font-weight:bold;">val</span> <span style="color:#af5f00;">loc_of_ctyp</span><span style="color:#af0000;">:</span> <span style="color:#008700;">ctyp </span><span style="color:#af0000;">-&gt;</span><span style="color:#008700;"> loc</span>
<span style="color:#0000ee;font-weight:bold;">val</span> <span style="color:#af5f00;">loc_of_patt</span><span style="color:#af0000;">:</span> <span style="color:#008700;">patt </span><span style="color:#af0000;">-&gt;</span><span style="color:#008700;"> loc</span>
<span style="color:#af0000;">....</span>
</pre>
<p>   Things turn out better with <a href="http://en.wikipedia.org/wiki/Type_class">type class</a> support in Haskell, but that&#8217;s another story.  </p>
<p>   Think about the case where you want to use a <b>semi</b> <code>;</code> to connect two Ast nodes; you have to write things like </p>
<pre class="src src-ocaml"><span style="color:#0000ee;font-weight:bold;">let</span> <span style="color:#0000ff;">sem</span><span style="color:#af5f00;"> e1 e2 </span><span style="color:#af0000;">=</span>
   <span style="color:#0000ee;font-weight:bold;">let</span> <span style="color:#af5f00;">_loc </span><span style="color:#af0000;">=</span> <span style="color:#008700;">Loc</span>.merge <span style="color:#af0000;">(</span>loc_of_expr e1 <span style="color:#af0000;">)</span> <span style="color:#af0000;">(</span>loc_of_expr e2<span style="color:#af0000;">)</span> <span style="color:#0000ee;font-weight:bold;">in</span>
   Sem<span style="color:#af0000;">(</span>_loc<span style="color:#af0000;">,</span> e1<span style="color:#af0000;">,</span>e2<span style="color:#af0000;">)</span>
</pre>
<p>   Every time you want to fetch the location, you have to <b>fix its type</b>. That&#8217;s too bad: the API for processing the syntax is <b>too verbose</b>. </p>
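<p>   For contrast, here is a rough sketch of the S-expression style mentioned earlier. The types <code>loc</code> and <code>sexp</code> and the functions <code>loc_of</code>, <code>merge</code> and <code>sem</code> are hypothetical and are not the real Camlp4 or Felix definitions; the point is only that one accessor can serve every syntactic category. </p>
<pre class="src src-ocaml">(* Hypothetical S-expression encoding of an Ast, for illustration only. *)
type loc = { start : int; stop : int }

type sexp =
  | Atom of loc * string
  | List of loc * sexp list

(* One accessor serves expressions, types and patterns alike. *)
let loc_of = function
  | Atom (l, _) | List (l, _) -&gt; l

(* Smallest span that covers both locations. *)
let merge a b = { start = min a.start b.start; stop = max a.stop b.stop }

(* A single sem, with no need to fix the category of its arguments. *)
let sem n1 n2 =
  let l = merge (loc_of n1) (loc_of n2) in
  List (l, [ Atom (l, ";"); n1; n2 ])
</pre>
<p>   The price of this reuse is that the type checker can no longer tell an expression from a pattern. </p>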
<p>   But using algebraic data types <b>does have some advantages</b>: the first is <b>pattern matching</b> (with exhaustiveness checking), the second is type checking. We can tell the difference between <code>Ast.expr</code> and <code>Ast.patt</code>, and that helps, but we cannot tell whether a node is an expression of type int or of type string; for example, both of the following are just <code>expr</code>  </p>
<pre class="src src-ocaml"><span style="color:#af0000;">(</span>Int <span style="color:#87005f;">"3"</span> <span style="color:#af0000;">:</span> <span style="color:#008700;">expr</span><span style="color:#af0000;">)</span>
<span style="color:#af0000;">(</span>String <span style="color:#87005f;">"3"</span> <span style="color:#af0000;">:</span><span style="color:#008700;">expr</span><span style="color:#af0000;">)</span>
</pre>
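<p>   A small sketch of what this coarse typing does and does not catch. The types <code>expr</code> and <code>patt</code> below are hypothetical miniatures, not Camlp4&#8217;s real ones. </p>
<pre class="src src-ocaml">(* Hypothetical two-category Ast, for illustration only. *)
type expr =
  | Int of string
  | String of string
  | App of expr * expr

type patt =
  | PInt of string
  | PAny

(* Accepted: the meta-level checker only sees "expr applied to expr",
   although the generated program 3 "3" is ill-typed at the object level. *)
let ill_typed_object_program : expr = App (Int "3", String "3")

(* Rejected by the meta-level checker, because a patt is not an expr:
   let wrong : expr = App (Int "3", PAny) *)
</pre>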
<p>   <a href="#http-www.metaocaml.org">MetaOCaml</a> can guarantee type correctness, but there is always a trade-off between expressivity and type safety. Anyway, in a statically typed language such as OCaml, the generated program is always type checked.  </p>
<p>   So, in OCaml or other ML dialects, you can encode the abstract syntax using one of these: untyped S-expressions, partially typed sum types, records, GADTs, or a mix of records and sum types. There is another solution that is unique to OCaml: <a href="http://caml.inria.fr/pub/docs/manual-ocaml-4.00/manual006.html">variants</a>. </p>
<p>   We will discuss it further in the next post. </p>
</p></div>
<div id="outline-container-3-1" class="outline-3">
<h3 id="sec-3-1">Quasi-quotation in OCaml</h3>
<div class="outline-text-3" id="text-3-1">
<p>       Quasi-quotation in Lisp comes for free, since the concrete syntax is exactly the same as the abstract syntax. </p>
<pre class="src src-lisp">(+ a 3 4) <span style="color:#af0000;">;; </span><span style="color:#af0000;">program</span>

`(+ a 3 4) <span style="color:#af0000;">;; </span><span style="color:#af0000;">data </span>
</pre>
<p>    There is a paper which summarizes how to do quasi-quotation in a rich-syntax language: <a href="http://ipaper.googlecode.com/git-history/969fbd798753dc0b10ea9efe5af7773ff10f728a/Miscs/why-its-nice-to-be-quoted.pdf">Why it&#8217;s nice to be quoted.</a>  </p>
<p>    In OCaml, unlike Lisp, the difference between program and data is obvious </p>
<pre class="src src-ocaml">3 <span style="color:#af0000;">(* </span><span style="color:#af0000;">program </span><span style="color:#af0000;">*)</span>
`Int <span style="color:#af0000;">(</span>_loc<span style="color:#af0000;">,</span> <span style="color:#87005f;">"3"</span><span style="color:#af0000;">)</span> <span style="color:#af0000;">(* </span><span style="color:#af0000;">data </span><span style="color:#af0000;">*)</span>

<span style="color:#87005f;">"3"</span> <span style="color:#af0000;">(* </span><span style="color:#af0000;">program </span><span style="color:#af0000;">*)</span>
`String <span style="color:#af0000;">(</span>_loc<span style="color:#af0000;">,</span> <span style="color:#87005f;">"3"</span><span style="color:#af0000;">)</span> <span style="color:#af0000;">(* </span><span style="color:#af0000;">data </span><span style="color:#af0000;">*)</span>
</pre>
<p>    (Here we use a quote &#8220;`&#8221; to denote that it&#8217;s an Ast.) </p>
<p>    Let&#8217;s take a look at the parsing phase (for simplicity, we ignore the locations). </p>
<p>    When you do the parsing, the normal behavior is as follows: </p>
<pre class="src src-ocaml"> <span style="color:#87005f;">"3 + 4"</span>
 <span style="color:#af0000;">==&gt;</span> <span style="color:#af00ff;">to</span> the Ast 
`App <span style="color:#af0000;">((</span>`App <span style="color:#af0000;">((</span>`Id <span style="color:#af0000;">(</span>`Lid <span style="color:#87005f;">"+"</span><span style="color:#af0000;">)),</span> <span style="color:#af0000;">(</span>`Int <span style="color:#87005f;">"3"</span><span style="color:#af0000;">))),</span> <span style="color:#af0000;">(</span>`Int <span style="color:#87005f;">"4"</span><span style="color:#af0000;">))</span>
</pre>
<p>    But to do quasi-quotation, you need to turn the Ast itself into data, so you need to encode the Ast using the Ast itself </p>
<pre class="src src-ocaml"><span style="color:#87005f;">"3+4"</span>
<span style="color:#af0000;">==&gt;</span> <span style="color:#af00ff;">to</span> the Ast
`App <span style="color:#af0000;">((</span>`App <span style="color:#af0000;">((</span>`Id <span style="color:#af0000;">(</span>`Lid <span style="color:#87005f;">"+"</span><span style="color:#af0000;">)),</span> <span style="color:#af0000;">(</span>`Int <span style="color:#87005f;">"3"</span><span style="color:#af0000;">))),</span> <span style="color:#af0000;">(</span>`Int <span style="color:#87005f;">"4"</span><span style="color:#af0000;">))</span>

<span style="color:#af0000;">==&gt;</span> <span style="color:#af00ff;">to</span> the Data
`App
 <span style="color:#af0000;">((</span>`App
     <span style="color:#af0000;">((</span>`Vrn <span style="color:#87005f;">"App"</span><span style="color:#af0000;">),</span>
       <span style="color:#af0000;">(</span>`App
          <span style="color:#af0000;">((</span>`App
              <span style="color:#af0000;">((</span>`Vrn <span style="color:#87005f;">"App"</span><span style="color:#af0000;">),</span>
                <span style="color:#af0000;">(</span>`App <span style="color:#af0000;">((</span>`Vrn <span style="color:#87005f;">"Id"</span><span style="color:#af0000;">),</span> <span style="color:#af0000;">(</span>`App <span style="color:#af0000;">((</span>`Vrn <span style="color:#87005f;">"Lid"</span><span style="color:#af0000;">),</span> <span style="color:#af0000;">(</span>`Str <span style="color:#87005f;">"+"</span><span style="color:#af0000;">))))))),</span>
            <span style="color:#af0000;">(</span>`App <span style="color:#af0000;">((</span>`Vrn <span style="color:#87005f;">"Int"</span><span style="color:#af0000;">),</span> <span style="color:#af0000;">(</span>`Str <span style="color:#87005f;">"3"</span><span style="color:#af0000;">))))))),</span>
   <span style="color:#af0000;">(</span>`App <span style="color:#af0000;">((</span>`Vrn <span style="color:#87005f;">"Int"</span><span style="color:#af0000;">),</span> <span style="color:#af0000;">(</span>`Str <span style="color:#87005f;">"4"</span><span style="color:#af0000;">))))</span>
</pre>
<p>    So, to do it once and for all, we need a function (simplified) </p>
<pre class="src src-ocaml"><span style="color:#0000ee;font-weight:bold;">val</span> <span style="color:#af5f00;">meta_expr</span><span style="color:#af0000;">:</span> <span style="color:#008700;">expr</span><span style="color:#af0000;">^</span>0 <span style="color:#af0000;">-&gt;</span> expr<span style="color:#af0000;">^</span>1 
</pre>
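<p>    A minimal sketch of such a lifting function, restricted to the tags that appear in the <code>3 + 4</code> example above. The type names <code>ident</code>, <code>expr</code> and <code>meta</code> and the helper <code>meta_ident</code> are assumptions made for illustration; Fan&#8217;s real Ast has many more tags and carries locations. </p>
<pre class="src src-ocaml">(* Object-level fragment: just enough tags for "3 + 4". *)
type ident = [ `Lid of string ]

type expr =
  [ `App of expr * expr
  | `Id of ident
  | `Int of string ]

(* Meta-level fragment: the tags needed to describe the fragment above. *)
type meta =
  [ `App of meta * meta
  | `Vrn of string
  | `Str of string ]

let meta_ident : ident -&gt; meta = function
  | `Lid s -&gt; `App (`Vrn "Lid", `Str s)

(* Each tag T applied to arguments becomes `Vrn "T" applied, one argument
   at a time, to the lifted arguments. *)
let rec meta_expr : expr -&gt; meta = function
  | `App (f, x) -&gt; `App (`App (`Vrn "App", meta_expr f), meta_expr x)
  | `Id i -&gt; `App (`Vrn "Id", meta_ident i)
  | `Int s -&gt; `App (`Vrn "Int", `Str s)
</pre>
<p>    Applied to the Ast of <code>3 + 4</code>, this sketch produces the nested <code>`App</code>/<code>`Vrn</code>/<code>`Str</code> value shown in the listing above. </p>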
<p>    Luckily, since <code>expr^1</code> is a subset of <code>expr^0</code>, you get the following function for free </p>
<pre class="src src-ocaml"><span style="color:#0000ee;font-weight:bold;">val</span> <span style="color:#af5f00;">meta_expr</span><span style="color:#af0000;">:</span> <span style="color:#008700;">expr</span><span style="color:#af0000;">^</span>1 <span style="color:#af0000;">-&gt;</span> expr<span style="color:#af0000;">^</span>2 
</pre>
<p>    Actually you may find that the category <code>expr^2</code> is exactly the same as <code>expr^1</code>, so once you have <code>expr^0 -&gt; expr^1</code>, you have <code>expr^0 -&gt; expr^n</code> (antiquotation will be discussed later). </p>
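<p>    This closure property can be sketched as follows, assuming a hypothetical closed fragment <code>ast</code> that contains both an object-level tag and the tags used by the lifting itself. The names <code>ast</code>, <code>meta</code> and <code>meta_n</code> are illustrative only. </p>
<pre class="src src-ocaml">(* Hypothetical closed fragment: one object-level tag (`Int) plus the
   tags that lifting itself produces. *)
type ast =
  [ `App of ast * ast
  | `Int of string
  | `Vrn of string
  | `Str of string ]

(* The result lives in the same type, so meta accepts its own output. *)
let rec meta : ast -&gt; ast = function
  | `App (f, x) -&gt; `App (`App (`Vrn "App", meta f), meta x)
  | `Int s -&gt; `App (`Vrn "Int", `Str s)
  | `Vrn s -&gt; `App (`Vrn "Vrn", `Str s)
  | `Str s -&gt; `App (`Vrn "Str", `Str s)

(* expr^0 to expr^n by plain iteration. *)
let rec meta_n n e = if n &lt;= 0 then e else meta_n (n - 1) (meta e)
</pre>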
<p>    So the problem lies only in how to write the function <code>expr^0-&gt;expr^1</code>: you need to encode the Ast using the Ast itself, which requires the Ast to be expressive enough to express itself. This is not always the case; for example, if you use <a href="http://en.wikipedia.org/wiki/Higher-order_abstract_syntax">HOAS</a>, HOAS is not expressive enough to express itself. </p>
<p>    If you mix records with sum types, you have to express both records and sum types; the Ast lifting is <b>neither easy to write</b> <b>nor easy to read</b>, and with locations it becomes even more complex. The best case is to <b>do it automatically and once and for all</b>. </p>
<p>    Suppose you only use sum types; luckily, we find that just <b>five tags</b> are expressive enough to express this function <code>expr^0 -&gt; expr^1</code>. Here are the <b>five tags</b> </p>
<pre class="src src-ocaml">App Vrn Str Tup Com
</pre>
<p>    Here <code>Tup</code> means &#8220;tuple&#8221;, and <code>Com</code> means &#8220;Comma&#8221;. </p>
<p>       The fewer, the better: as long as changes to the abstract syntax tree do not involve the <b>five tags</b>, the lifting will always work out of the box. </p>
<p>    So, to design the right Ast for meta programming, the first thing is to <b>keep it simple</b>: don&#8217;t use <b>records</b> or other complex data types. Sum types or polymorphic variants are rich enough to express the whole syntax of OCaml, yet simple enough to make Ast lifting easy. </p>
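<p>    The post does not spell out how <code>Tup</code> and <code>Com</code> are used, so the following is only one plausible reading, not Fan&#8217;s confirmed encoding: a tag with several arguments is lifted as the tag name applied to a tuple whose components are joined by commas. The names <code>meta</code>, <code>com_of_list</code> and <code>lift_tag</code> are hypothetical. </p>
<pre class="src src-ocaml">(* The five tags, as a hypothetical meta-level type. *)
type meta =
  [ `App of meta * meta
  | `Vrn of string
  | `Str of string
  | `Tup of meta
  | `Com of meta * meta ]

(* Join already lifted arguments with `Com, nested to the right. *)
let rec com_of_list : meta list -&gt; meta = function
  | [] -&gt; invalid_arg "com_of_list: empty"
  | [ x ] -&gt; x
  | x :: rest -&gt; `Com (x, com_of_list rest)

(* Lift a tag applied to its (already lifted) arguments. *)
let lift_tag (name : string) (args : meta list) : meta =
  match args with
  | [] -&gt; `Vrn name
  | [ x ] -&gt; `App (`Vrn name, x)
  | _ -&gt; `App (`Vrn name, `Tup (com_of_list args))
</pre>
<p>    Under this reading, adding a new tag to the Ast never requires a new meta-level tag, only a new string passed to <code>lift_tag</code>. </p>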
<p>    In the next post, we may discuss the right way to design an Abstract Syntax Tree for SMP. </p>
</p></div>
</p></div>
</p></div>
]]></content:encoded>
			<wfw:commentRss>http://hongboz.wordpress.com/2013/01/28/random-thoughts-about-syntactic-meta-programming-i/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5285ae5e9b0eb3ff9612d458b48a04e1?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">hongboz</media:title>
		</media:content>
	</item>
		<item>
		<title>Fan, A language to implement languages (I)</title>
		<link>http://hongboz.wordpress.com/2012/11/13/fan-a-langugage-to-implement-languages-i/</link>
		<comments>http://hongboz.wordpress.com/2012/11/13/fan-a-langugage-to-implement-languages-i/#comments</comments>
		<pubDate>Tue, 13 Nov 2012 05:00:00 +0000</pubDate>
		<dc:creator>hongboz</dc:creator>
				<category><![CDATA[Fan]]></category>

		<guid isPermaLink="false">http://hongboz.wordpress.com/?p=54</guid>
		<description><![CDATA[This will be a series of posts introducing a new programming language, Fan. Fan is OCamlPlus: it provides all the features that OCaml provides, plus a language to manipulate programs. I am also seeking collaboration if you are interested in such a fascinating project. It aims to provide OCaml + A Compiler Domain Specific Language. [...]]]></description>
				<content:encoded><![CDATA[<p>This will be a series of posts introducing a new programming language, <a href="https://github.com/bobzhang/Fan">Fan</a>. </p>
<p> Fan is OCamlPlus: it provides all the features that <a href="http://caml.inria.fr/">OCaml</a> provides, plus a language to manipulate programs. I am also <b>seeking collaboration</b> if you are interested in such a fascinating project. </p>
<p> It aims to provide <code>OCaml + A Compiler Domain Specific Language</code>. The compiler domain is a bit special: it is the domain that users can use to create their own domain specific languages, e.g. for database queries or financial modelling. Our purpose is to let you write a practical compiler <code>in one day</code>. Yes, this is not a joke: with the right tools and nice abstractions, it is very promising to help average programmers create their own languages to fit their domains in a short time. </p>
<p> The compiler domain is a rather large domain consisting of several sub-domains, so the compiler of Fan itself also benefits from domain specific languages (DSLs). Unlike other bootstrapping models, <b>all features</b> of the previous version of the Fan compiler are <b>usable</b> for the next release. Yes, Fan is written in itself; it&#8217;s really Fun <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  </p>
<p> Fan evolved from <a href="http://brion.inria.fr/gallium/index.php/Camlp4">Camlp4</a>, but with a more ambitious goal and different underlying engines; I will compare them later. </p>
<p> Ok, let&#8217;s talk business. </p>
<p> Why a new programming language? Because I haven&#8217;t found a programming language that makes me happy (yet). </p>
<p> Think about how you solve a problem. </p>
<p> It mainly divides into two steps. </p>
<ul>
<li>The first step is to think of an algorithm to tackle the problem, without ambiguity. This is what we call <b>inherent complexity</b>: however fancy the programming language is, you still have to think of a way to solve the problem.  </li>
<li>The second step is to map your algorithm into your favourite language, e.g. Haskell. Ideally this should be straightforward, but in reality it brings a lot of trouble, which we call <b>accidental complexity</b>. </li>
</ul>
<p>   What we can do to enhance a programmer&#8217;s productivity lies in avoiding the <b>accidental complexity</b> of the second step. </p>
<p> The problem is that your favourite language was not designed for your specific domain; it is a <b>general</b> purpose programming language. When you transfer your ideas into your language, you have to do <code>a lot of dirty work</code>. Modern IDEs may alleviate this a bit, but programs are not written just to be executed; their more important function is to help exchange ideas. When you want to understand how a piece of code works, you have to do <code>reverse-engineering</code> to map the program back into ideas, because in translating ideas into a program the big picture is lost and the initial concise ideas get mixed with a lot of noise. </p>
<p> It is a sad fact that this is how programmers work nowadays. <img src='http://s0.wp.com/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' />  </p>
<p> &#8220;When you have a hammer, everything is a nail&#8221;. </p>
<p> One difference between human beings and animals is that humans can use tools; the fact that humans can not only use tools but also create them is what makes human beings so intelligent. It&#8217;s a sad fact that most programmers still live in the cave age: they can only accept whatever tools are provided. Smart programmers should create the tool that best fits their domain. </p>
<p> So, what&#8217;s the right way to solve a problem? </p>
<p> When you find similar problems appearing again and again, try to design a language in which you can express your ideas <b>as isomorphically as possible</b> to the problem&#8217;s description, then write a compiler for that language. Then it&#8217;s done. People who read your program will understand it straightforwardly, you write your programs quickly, everything seems perfect, everyone is happy. </p>
<p> Wait, you may find that I am cheating: writing a toy language is not hard, writing a medium-sized language is painful, creating a general purpose language is too hard, and making your legacy libraries communicate with your new language will drive you crazy. So you may say &#8220;let&#8217;s forget about it&#8221; and shy away. </p>
<p> Yes, that&#8217;s true, and that&#8217;s why I designed a new programming language to address this issue. Remember that creating a language <b>is itself a domain</b>, and this domain shares some common abstractions which should be factored out. And to make life happier, you extend a general purpose programming language to fit your domain instead of creating a brand new language; both are compiled into <b>the same intermediate representation</b>, like C# and VB, so you never have an interoperation problem. </p>
<p> Once you have finished the language for one domain, your productivity in that domain will be boosted enormously. </p>
<p> Fan is created to help you achieve such a goal! </p>
<p> There are different abstraction and DSL solutions; in the next post I will compare them and talk about the solution Fan chooses and its good and bad effects. </p>
]]></content:encoded>
			<wfw:commentRss>http://hongboz.wordpress.com/2012/11/13/fan-a-langugage-to-implement-languages-i/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5285ae5e9b0eb3ff9612d458b48a04e1?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">hongboz</media:title>
		</media:content>
	</item>
	</channel>
</rss>
