Woo! Made an indentation grammar! Kinda!

2013-08-07 13:19:52 -04:00 · 2013-08-07 13:19:52 -04:00 · 96ec8b6319
commit 96ec8b6319
parent dff49a1cce
1 changed file with 30 additions and 0 deletions
--- a/articles/text_editors_with_contenteditable.article.html
+++ b/articles/text_editors_with_contenteditable.article.html
@@ -174,4 +174,34 @@ identifier = letters:[A-Za-z]+ { return letters.join(""); }
 whitespace = [ \n\t]*
 	</code></pre>
 	<p>So, it nearly killed me, but I managed to figure out how to make a parser (using PEG.js) that <em>actually parses indentation-based grammars!</em> The basic idea: as it matches a line, it counts the spaces in front of it and uses that count to work out which indentation level the line is at. Then, as it returns the line's payload, it pushes the payload into the matching code block. It's hacky and awful, but I think I understand how it works. I'd rather have this logic in a <code>block</code> rule than in the <code>line</code> rule; moving it should be pretty simple in theory. Here you go:</p>
 	<pre><code>
 { var code = [];
  var indentations = [0];
  var current_head = [code]; }
 b = line* { return code; }
 line = 
  (spaces:" "* &amp; 
  { var level = spaces.length;
    if (level &gt; indentations[indentations.length-1]) {
      indentations.push(level);
      var new_block = [];
      current_head[current_head.length-1].push(new_block);
       current_head.push(new_block);
    } else if (indentations.indexOf(level) == -1) {
      return false; 
    } else {
      while (indentations[indentations.length-1] != level) {
         indentations.pop();
         current_head.pop();
      }
    }
    return true; } 
  { return spaces.length; })
  payload:("ab" bs:"b"+ "a" { return "ab" + bs.join("") + "a"; }) 
  "\n"
  { current_head[current_head.length-1].push(payload);
    return payload; }
 	</code></pre>
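 	<p>If the predicate is hard to follow inside the grammar, here is a rough sketch of the same stack bookkeeping in plain JavaScript, with no PEG.js involved. The function name <code>parseIndented</code> is made up for illustration, and it makes two assumptions the grammar does not: the payload is whatever follows the leading spaces (it does not check the <code>ab</code>...<code>a</code> shape), and empty lines are skipped. It returns <code>null</code> where the grammar's predicate would return <code>false</code>.</p>

```javascript
// Hypothetical helper, not part of the grammar above: the same two-stack
// bookkeeping (indentation widths + open blocks), written as a plain function.
function parseIndented(text) {
  const code = [];            // root block, like `code` in the grammar
  const indentations = [0];   // widths of the currently open blocks
  const currentHead = [code]; // open blocks, innermost last

  for (const rawLine of text.split("\n")) {
    if (rawLine === "") continue; // assumption: ignore empty lines

    // Count leading spaces only, as the grammar's `" "*` does.
    const level = rawLine.match(/^ */)[0].length;
    const payload = rawLine.slice(level); // assumption: payload is not validated

    if (level > indentations[indentations.length - 1]) {
      // Deeper than the innermost block: open a new nested block.
      const newBlock = [];
      currentHead[currentHead.length - 1].push(newBlock);
      currentHead.push(newBlock);
      indentations.push(level);
    } else if (!indentations.includes(level)) {
      // Dedent to a width that was never opened: reject the input.
      return null;
    } else {
      // Same level or a dedent: close blocks until the widths match.
      while (indentations[indentations.length - 1] !== level) {
        indentations.pop();
        currentHead.pop();
      }
    }

    currentHead[currentHead.length - 1].push(payload);
  }
  return code;
}

console.log(JSON.stringify(parseIndented("abba\n  abbba\n  abba\nabba\n")));
// prints ["abba",["abbba","abba"],"abba"]
```

 	<p>Keeping the widths and the open blocks in two parallel stacks means a dedent is just "pop both until the widths line up", which is the same trick the <code>line</code> rule uses.</p>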
 </article>