cleaned and commented the lexer (again); interpolate_string() continues to shrink

Jeremy Ashkenas 2010-03-07 12:47:03 -05:00
parent f74fae58e3
commit 4906cf1aff
4 changed files with 403 additions and 335 deletions

@ -36,7 +36,7 @@ to avoid having a JavaScript error at runtime.</p> </td>
be used as identifiers or properties.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">JS_FORBIDDEN: </span><span class="nx">JS_KEYWORDS</span><span class="p">.</span><span class="nx">concat</span> <span class="nx">RESERVED</span></pre></div> </td> </tr> <tr id="section-9"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-9">#</a> </div> <p>Token matching regexes.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nx">IDENTIFIER</span> <span class="o">:</span> <span class="sr">/^([a-zA-Z$_](\w|\$)*)/</span>
<span class="nx">NUMBER</span> <span class="o">:</span> <span class="sr">/^(\b((0(x|X)[0-9a-fA-F]+)|([0-9]+(\.[0-9]+)?(e[+\-]?[0-9]+)?)))\b/i</span>
<span class="nx">HEREDOC</span> <span class="o">:</span> <span class="sr">/^(&quot;{6}|&#39;{6}|&quot;{3}\n?([\s\S]*?)\n?([ \t]*)&quot;{3}|&#39;{3}\n?([\s\S]*?)\n?([ \t]*)&#39;{3})/</span>
<span class="nx">INTERPOLATION</span> <span class="o">:</span> <span class="sr">/(^|[\s\S]*?(?:[\\]|\\\\)?)\$([a-zA-Z_@]\w*|{[\s\S]*?(?:[^\\]|\\\\)})/</span>
<span class="nx">INTERPOLATION</span> <span class="o">:</span> <span class="sr">/^\$([a-zA-Z_@]\w*)/</span>
<span class="nx">OPERATOR</span> <span class="o">:</span> <span class="sr">/^([+\*&amp;|\/\-%=&lt;&gt;:!?]+)/</span>
<span class="nx">WHITESPACE</span> <span class="o">:</span> <span class="sr">/^([ \t]+)/</span>
<span class="nx">COMMENT</span> <span class="o">:</span> <span class="sr">/^(((\n?[ \t]*)?#[^\n]*)+)/</span>
@ -64,31 +64,45 @@ treated as identifiers.</p> </td> <td class="code">
occurs at the start of a line. We disambiguate these from trailing whens to
avoid an ambiguity in the grammar.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">BEFORE_WHEN: </span><span class="p">[</span><span class="s1">&#39;INDENT&#39;</span><span class="p">,</span> <span class="s1">&#39;OUTDENT&#39;</span><span class="p">,</span> <span class="s1">&#39;TERMINATOR&#39;</span><span class="p">]</span></pre></div> </td> </tr> <tr id="section-15"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-15">#</a> </div> <h2>The Lexer Class</h2> </td> <td class="code"> <div class="highlight"><pre></pre></div> </td> </tr> <tr id="section-16"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-16">#</a> </div> <p>The Lexer class reads a stream of CoffeeScript and divvies it up into tagged
tokens. A minor bit of the ambiguity in the grammar has been avoided by
pushing some extra smarts into the Lexer.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">exports.Lexer: </span><span class="nx">class</span> <span class="nx">Lexer</span></pre></div> </td> </tr> <tr id="section-17"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-17">#</a> </div> <p>Scan by attempting to match tokens one at a time. Slow and steady.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">tokenize: </span><span class="p">(</span><span class="nx">code</span><span class="p">,</span> <span class="nx">options</span><span class="p">)</span> <span class="o">-&gt;</span>
pushing some extra smarts into the Lexer.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">exports.Lexer: </span><span class="nx">class</span> <span class="nx">Lexer</span></pre></div> </td> </tr> <tr id="section-17"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-17">#</a> </div> <p><strong>tokenize</strong> is the Lexer's main method. Scan by attempting to match tokens
one at a time, using a regular expression anchored at the start of the
remaining code, or a custom recursive token-matching method
(for interpolations). When the next token has been recorded, we move forward
within the code past the token, and begin again.</p>
<p>Each tokenizing method is responsible for incrementing <code>@i</code> by the number of
characters it has consumed. <code>@i</code> can be thought of as our finger on the page
of source.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">tokenize: </span><span class="p">(</span><span class="nx">code</span><span class="p">,</span> <span class="nx">options</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="nx">o</span> <span class="o">:</span> <span class="nx">options</span> <span class="o">or</span> <span class="p">{}</span>
<span class="err">@</span><span class="nx">code</span> <span class="o">:</span> <span class="nx">code</span> <span class="c1"># The remainder of the source code.</span>
<span class="err">@</span><span class="nx">i</span> <span class="o">:</span> <span class="mi">0</span> <span class="c1"># Current character position we&#39;re parsing.</span>
<span class="err">@</span><span class="nx">line</span> <span class="o">:</span> <span class="nx">o</span><span class="p">.</span><span class="nx">line</span> <span class="o">or</span> <span class="mi">0</span> <span class="c1"># The current line.</span>
<span class="err">@</span><span class="nx">indent</span> <span class="o">:</span> <span class="mi">0</span> <span class="c1"># The current indent level.</span>
<span class="err">@</span><span class="nx">indents</span> <span class="o">:</span> <span class="p">[]</span> <span class="c1"># The stack of all indent levels we are currently within.</span>
<span class="err">@</span><span class="nx">tokens</span> <span class="o">:</span> <span class="p">[]</span> <span class="c1"># Collection of all parsed tokens in the form [&#39;TOKEN_TYPE&#39;, value, line]</span>
<span class="err">@</span><span class="nx">indent</span> <span class="o">:</span> <span class="mi">0</span> <span class="c1"># The current indentation level.</span>
<span class="err">@</span><span class="nx">indents</span> <span class="o">:</span> <span class="p">[]</span> <span class="c1"># The stack of all current indentation levels.</span>
<span class="err">@</span><span class="nx">tokens</span> <span class="o">:</span> <span class="p">[]</span> <span class="c1"># Stream of parsed tokens in the form [&#39;TYPE&#39;, value, line]</span>
<span class="k">while</span> <span class="err">@</span><span class="nx">i</span> <span class="o">&lt;</span> <span class="err">@</span><span class="nx">code</span><span class="p">.</span><span class="nx">length</span>
<span class="err">@</span><span class="nv">chunk: </span><span class="err">@</span><span class="nx">code</span><span class="p">.</span><span class="nx">slice</span><span class="p">(</span><span class="err">@</span><span class="nx">i</span><span class="p">)</span>
<span class="err">@</span><span class="nx">extract_next_token</span><span class="p">()</span>
<span class="err">@</span><span class="nx">close_indentation</span><span class="p">()</span>
<span class="k">return</span> <span class="err">@</span><span class="nx">tokens</span> <span class="k">if</span> <span class="nx">o</span><span class="p">.</span><span class="nx">rewrite</span> <span class="o">is</span> <span class="kc">no</span>
<span class="k">return</span> <span class="err">@</span><span class="nx">tokens</span> <span class="k">if</span> <span class="nx">o</span><span class="p">.</span><span class="nx">rewrite</span> <span class="o">is</span> <span class="kc">off</span>
<span class="p">(</span><span class="k">new</span> <span class="nx">Rewriter</span><span class="p">()).</span><span class="nx">rewrite</span> <span class="err">@</span><span class="nx">tokens</span></pre></div> </td> </tr> <tr id="section-18"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-18">#</a> </div> <p>At every position, run through this list of attempted matches,
short-circuiting if any of them succeed.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">extract_next_token: </span><span class="o">-&gt;</span>
short-circuiting if any of them succeed. Their order determines precedence:
<code>@literal_token</code> is the fallback catch-all.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">extract_next_token: </span><span class="o">-&gt;</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">identifier_token</span><span class="p">()</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">number_token</span><span class="p">()</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">heredoc_token</span><span class="p">()</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">string_token</span><span class="p">()</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">js_token</span><span class="p">()</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">regex_token</span><span class="p">()</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">comment_token</span><span class="p">()</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">line_token</span><span class="p">()</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">whitespace_token</span><span class="p">()</span>
<span class="k">return</span> <span class="err">@</span><span class="nx">literal_token</span><span class="p">()</span></pre></div> </td> </tr> <tr id="section-19"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-19">#</a> </div> <h2>Tokenizers</h2> </td> <td class="code"> <div class="highlight"><pre></pre></div> </td> </tr> <tr id="section-20"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-20">#</a> </div> <p>Matches identifying literals: variables, keywords, method names, etc.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">identifier_token: </span><span class="o">-&gt;</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">js_token</span><span class="p">()</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">string_token</span><span class="p">()</span>
<span class="k">return</span> <span class="err">@</span><span class="nx">literal_token</span><span class="p">()</span></pre></div> </td> </tr> <tr id="section-19"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-19">#</a> </div> <h2>Tokenizers</h2> </td> <td class="code"> <div class="highlight"><pre></pre></div> </td> </tr> <tr id="section-20"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-20">#</a> </div> <p>Matches identifying literals: variables, keywords, method names, etc.
Check to ensure that JavaScript reserved words aren't being used as
identifiers. Because CoffeeScript reserves a handful of keywords that are
allowed in JavaScript, we're careful not to tag them as keywords when
referenced as property names here, so you can still do <code>jQuery.is()</code> even
though <code>is</code> means <code>===</code> otherwise.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">identifier_token: </span><span class="o">-&gt;</span>
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nv">id: </span><span class="err">@</span><span class="nx">match</span> <span class="nx">IDENTIFIER</span><span class="p">,</span> <span class="mi">1</span>
<span class="err">@</span><span class="nx">name_access_type</span><span class="p">()</span>
<span class="nv">tag: </span><span class="s1">&#39;IDENTIFIER&#39;</span>
@ -102,60 +116,55 @@ short-circuiting if any of them succeed.</p> </td> <td c
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nv">number: </span><span class="err">@</span><span class="nx">match</span> <span class="nx">NUMBER</span><span class="p">,</span> <span class="mi">1</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;NUMBER&#39;</span><span class="p">,</span> <span class="nx">number</span>
<span class="err">@</span><span class="nx">i</span> <span class="o">+=</span> <span class="nx">number</span><span class="p">.</span><span class="nx">length</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-22"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-22">#</a> </div> <p>Matches strings, including multi-line strings.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">string_token: </span><span class="o">-&gt;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-22"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-22">#</a> </div> <p>Matches strings, including multi-line strings. Ensures that quotation marks
are balanced within the string's contents, and within nested interpolations.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">string_token: </span><span class="o">-&gt;</span>
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nx">starts</span><span class="p">(</span><span class="err">@</span><span class="nx">chunk</span><span class="p">,</span> <span class="s1">&#39;&quot;&#39;</span><span class="p">)</span> <span class="o">or</span> <span class="nx">starts</span><span class="p">(</span><span class="err">@</span><span class="nx">chunk</span><span class="p">,</span> <span class="s2">&quot;&#39;&quot;</span><span class="p">)</span>
<span class="nv">string: </span><span class="err">@</span><span class="nx">balanced_token</span> <span class="p">[</span><span class="s1">&#39;&quot;&#39;</span><span class="p">,</span> <span class="s1">&#39;&quot;&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s1">&#39;${&#39;</span><span class="p">,</span> <span class="s1">&#39;}&#39;</span><span class="p">]</span>
<span class="nv">string: </span><span class="err">@</span><span class="nx">balanced_token</span> <span class="p">[</span><span class="s2">&quot;&#39;&quot;</span><span class="p">,</span> <span class="s2">&quot;&#39;&quot;</span><span class="p">]</span> <span class="k">if</span> <span class="nx">string</span> <span class="o">is</span> <span class="kc">false</span>
<span class="nv">string: </span><span class="err">@</span><span class="nx">balanced_token</span> <span class="p">[</span><span class="s2">&quot;&#39;&quot;</span><span class="p">,</span> <span class="s2">&quot;&#39;&quot;</span><span class="p">]</span> <span class="nx">unless</span> <span class="nx">string</span>
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nx">string</span>
<span class="err">@</span><span class="nx">interpolate_string</span> <span class="nx">string</span><span class="p">.</span><span class="nx">replace</span> <span class="nx">STRING_NEWLINES</span><span class="p">,</span> <span class="s2">&quot; \\\n&quot;</span>
<span class="err">@</span><span class="nx">line</span> <span class="o">+=</span> <span class="nx">count</span> <span class="nx">string</span><span class="p">,</span> <span class="s2">&quot;\n&quot;</span>
<span class="err">@</span><span class="nx">i</span> <span class="o">+=</span> <span class="nx">string</span><span class="p">.</span><span class="nx">length</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-23"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-23">#</a> </div> <p>Matches heredocs, adjusting indentation to the correct level.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">heredoc_token: </span><span class="o">-&gt;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-23"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-23">#</a> </div> <p>Matches heredocs, adjusting indentation to the correct level, as heredocs
preserve whitespace, but ignore indentation to the left.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">heredoc_token: </span><span class="o">-&gt;</span>
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nx">match</span> <span class="o">=</span> <span class="err">@</span><span class="nx">chunk</span><span class="p">.</span><span class="nx">match</span><span class="p">(</span><span class="nx">HEREDOC</span><span class="p">)</span>
<span class="nv">doc: </span><span class="err">@</span><span class="nx">sanitize_heredoc</span> <span class="nx">match</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">or</span> <span class="nx">match</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;STRING&#39;</span><span class="p">,</span> <span class="s2">&quot;\&quot;$doc\&quot;&quot;</span>
<span class="err">@</span><span class="nx">line</span> <span class="o">+=</span> <span class="nx">count</span> <span class="nx">match</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s2">&quot;\n&quot;</span>
<span class="err">@</span><span class="nx">i</span> <span class="o">+=</span> <span class="nx">match</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nx">length</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-24"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-24">#</a> </div> <p>Matches interpolated JavaScript.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">js_token: </span><span class="o">-&gt;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-24"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-24">#</a> </div> <p>Matches JavaScript interpolated directly into the source via backticks.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">js_token: </span><span class="o">-&gt;</span>
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nx">starts</span> <span class="err">@</span><span class="nx">chunk</span><span class="p">,</span> <span class="s1">&#39;`&#39;</span>
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nv">script: </span><span class="err">@</span><span class="nx">balanced_token</span> <span class="p">[</span><span class="s1">&#39;`&#39;</span><span class="p">,</span> <span class="s1">&#39;`&#39;</span><span class="p">]</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;JS&#39;</span><span class="p">,</span> <span class="nx">script</span><span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="nx">JS_CLEANER</span><span class="p">,</span> <span class="s1">&#39;&#39;</span><span class="p">)</span>
<span class="err">@</span><span class="nx">i</span> <span class="o">+=</span> <span class="nx">script</span><span class="p">.</span><span class="nx">length</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-25"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-25">#</a> </div> <p>Matches regular expression literals.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">regex_token: </span><span class="o">-&gt;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-25"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-25">#</a> </div> <p>Matches regular expression literals. Regular expression literals are difficult
to distinguish from division while lexing, so we borrow some basic heuristics from
JavaScript and Ruby.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">regex_token: </span><span class="o">-&gt;</span>
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nv">regex: </span><span class="err">@</span><span class="nx">match</span> <span class="nx">REGEX</span><span class="p">,</span> <span class="mi">1</span>
<span class="k">return</span> <span class="kc">false</span> <span class="k">if</span> <span class="nx">include</span> <span class="nx">NOT_REGEX</span><span class="p">,</span> <span class="err">@</span><span class="nx">tag</span><span class="p">()</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;REGEX&#39;</span><span class="p">,</span> <span class="nx">regex</span>
<span class="err">@</span><span class="nx">i</span> <span class="o">+=</span> <span class="nx">regex</span><span class="p">.</span><span class="nx">length</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-26"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-26">#</a> </div> <p>Matches a balanced group such as a single or double-quoted string. Pass in
a series of delimiters, all of which must be balanced correctly within the
token's contents.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">balanced_token: </span><span class="p">(</span><span class="nx">delimited</span><span class="p">...)</span> <span class="o">-&gt;</span>
<span class="nv">levels: </span><span class="p">[]</span>
<span class="nv">i: </span><span class="mi">0</span>
<span class="k">while</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="err">@</span><span class="nx">chunk</span><span class="p">.</span><span class="nx">length</span>
<span class="k">for</span> <span class="nx">pair</span> <span class="k">in</span> <span class="nx">delimited</span>
<span class="p">[</span><span class="nx">open</span><span class="p">,</span> <span class="nx">close</span><span class="p">]</span><span class="o">:</span> <span class="nx">pair</span>
<span class="k">if</span> <span class="nx">levels</span><span class="p">.</span><span class="nx">length</span> <span class="o">and</span> <span class="nx">starts</span> <span class="err">@</span><span class="nx">chunk</span><span class="p">,</span> <span class="s1">&#39;\\&#39;</span><span class="p">,</span> <span class="nx">i</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">break</span>
<span class="k">else</span> <span class="k">if</span> <span class="nx">levels</span><span class="p">.</span><span class="nx">length</span> <span class="o">and</span> <span class="nx">starts</span><span class="p">(</span><span class="err">@</span><span class="nx">chunk</span><span class="p">,</span> <span class="nx">close</span><span class="p">,</span> <span class="nx">i</span><span class="p">)</span> <span class="o">and</span> <span class="nx">levels</span><span class="p">[</span><span class="nx">levels</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span> <span class="o">is</span> <span class="nx">pair</span>
<span class="nx">levels</span><span class="p">.</span><span class="nx">pop</span><span class="p">()</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="nx">close</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="mi">1</span> <span class="nx">unless</span> <span class="nx">levels</span><span class="p">.</span><span class="nx">length</span>
<span class="k">break</span>
<span class="k">else</span> <span class="k">if</span> <span class="nx">starts</span> <span class="err">@</span><span class="nx">chunk</span><span class="p">,</span> <span class="nx">open</span><span class="p">,</span> <span class="nx">i</span>
<span class="nx">levels</span><span class="p">.</span><span class="nx">push</span><span class="p">(</span><span class="nx">pair</span><span class="p">)</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="nx">open</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span>
<span class="k">break</span>
<span class="k">break</span> <span class="nx">unless</span> <span class="nx">levels</span><span class="p">.</span><span class="nx">length</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span> <span class="s2">&quot;SyntaxError: Unterminated ${levels.pop()[0]} starting on line ${@line + 1}&quot;</span> <span class="k">if</span> <span class="nx">levels</span><span class="p">.</span><span class="nx">length</span>
<span class="k">return</span> <span class="kc">false</span> <span class="k">if</span> <span class="nx">i</span> <span class="o">is</span> <span class="mi">0</span>
<span class="k">return</span> <span class="err">@</span><span class="nx">chunk</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nx">i</span><span class="p">)</span></pre></div> </td> </tr> <tr id="section-27"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-27">#</a> </div> <p>Matches and consumes comments.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">comment_token: </span><span class="o">-&gt;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-26"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-26">#</a> </div> <p>Matches a token in which the passed delimiter pairs must be correctly
balanced (i.e. strings, JS literals).</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">balanced_token: </span><span class="p">(</span><span class="nx">delimited</span><span class="p">...)</span> <span class="o">-&gt;</span>
<span class="err">@</span><span class="nx">balanced_string</span> <span class="err">@</span><span class="nx">chunk</span><span class="p">,</span> <span class="nx">delimited</span><span class="p">...</span></pre></div> </td> </tr> <tr id="section-27"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-27">#</a> </div> <p>Matches and consumes comments. We pass through comments into JavaScript,
so they're treated as real tokens, like any other part of the language.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">comment_token: </span><span class="o">-&gt;</span>
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nv">comment: </span><span class="err">@</span><span class="nx">match</span> <span class="nx">COMMENT</span><span class="p">,</span> <span class="mi">1</span>
<span class="err">@</span><span class="nx">line</span> <span class="o">+=</span> <span class="p">(</span><span class="nx">comment</span><span class="p">.</span><span class="nx">match</span><span class="p">(</span><span class="nx">MULTILINER</span><span class="p">)</span> <span class="o">or</span> <span class="p">[]).</span><span class="nx">length</span>
<span class="nv">lines: </span><span class="nx">comment</span><span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="nx">COMMENT_CLEANER</span><span class="p">,</span> <span class="s1">&#39;&#39;</span><span class="p">).</span><span class="nx">split</span><span class="p">(</span><span class="nx">MULTILINER</span><span class="p">)</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;COMMENT&#39;</span><span class="p">,</span> <span class="nx">compact</span> <span class="nx">lines</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;TERMINATOR&#39;</span><span class="p">,</span> <span class="s2">&quot;\n&quot;</span>
<span class="err">@</span><span class="nx">i</span> <span class="o">+=</span> <span class="nx">comment</span><span class="p">.</span><span class="nx">length</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-28"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-28">#</a> </div> <p>Matches newlines, indents, and outdents, and determines which is which.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">line_token: </span><span class="o">-&gt;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-28"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-28">#</a> </div> <p>Matches newlines, indents, and outdents, and determines which is which.
If we can detect that the current line is continued onto the next line,
then the newline is suppressed:</p>
<pre><code>elements
.each( ... )
.map( ... )
</code></pre>
<p>Keeps track of the level of indentation, because a single outdent token
can close multiple indents, so we need to know how far in we happen to be.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">line_token: </span><span class="o">-&gt;</span>
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nv">indent: </span><span class="err">@</span><span class="nx">match</span> <span class="nx">MULTI_DENT</span><span class="p">,</span> <span class="mi">1</span>
<span class="err">@</span><span class="nx">line</span> <span class="o">+=</span> <span class="nx">indent</span><span class="p">.</span><span class="nx">match</span><span class="p">(</span><span class="nx">MULTILINER</span><span class="p">).</span><span class="nx">length</span>
<span class="err">@</span><span class="nx">i</span> <span class="o">+=</span> <span class="nx">indent</span><span class="p">.</span><span class="nx">length</span>
@ -165,18 +174,18 @@ token's contents.</p> </td> <td class="code">
<span class="nv">no_newlines: </span><span class="nx">next_character</span> <span class="o">is</span> <span class="s1">&#39;.&#39;</span> <span class="o">or</span> <span class="p">(</span><span class="err">@</span><span class="nx">value</span><span class="p">()</span> <span class="o">and</span> <span class="err">@</span><span class="nx">value</span><span class="p">().</span><span class="nx">match</span><span class="p">(</span><span class="nx">NO_NEWLINE</span><span class="p">)</span> <span class="o">and</span>
<span class="nx">prev</span> <span class="o">and</span> <span class="p">(</span><span class="nx">prev</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">isnt</span> <span class="s1">&#39;.&#39;</span><span class="p">)</span> <span class="o">and</span> <span class="o">not</span> <span class="err">@</span><span class="nx">value</span><span class="p">().</span><span class="nx">match</span><span class="p">(</span><span class="nx">CODE</span><span class="p">))</span>
<span class="k">if</span> <span class="nx">size</span> <span class="o">is</span> <span class="err">@</span><span class="nx">indent</span>
<span class="k">return</span> <span class="err">@</span><span class="nx">suppress_newlines</span><span class="p">(</span><span class="nx">indent</span><span class="p">)</span> <span class="k">if</span> <span class="nx">no_newlines</span>
<span class="k">return</span> <span class="err">@</span><span class="nx">suppress_newlines</span><span class="p">()</span> <span class="k">if</span> <span class="nx">no_newlines</span>
<span class="k">return</span> <span class="err">@</span><span class="nx">newline_token</span><span class="p">(</span><span class="nx">indent</span><span class="p">)</span>
<span class="k">else</span> <span class="k">if</span> <span class="nx">size</span> <span class="o">&gt;</span> <span class="err">@</span><span class="nx">indent</span>
<span class="k">return</span> <span class="err">@</span><span class="nx">suppress_newlines</span><span class="p">(</span><span class="nx">indent</span><span class="p">)</span> <span class="k">if</span> <span class="nx">no_newlines</span>
<span class="k">return</span> <span class="err">@</span><span class="nx">suppress_newlines</span><span class="p">()</span> <span class="k">if</span> <span class="nx">no_newlines</span>
<span class="nv">diff: </span><span class="nx">size</span> <span class="o">-</span> <span class="err">@</span><span class="nx">indent</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;INDENT&#39;</span><span class="p">,</span> <span class="nx">diff</span>
<span class="err">@</span><span class="nx">indents</span><span class="p">.</span><span class="nx">push</span> <span class="nx">diff</span>
<span class="k">else</span>
<span class="err">@</span><span class="nx">outdent_token</span> <span class="err">@</span><span class="nx">indent</span> <span class="o">-</span> <span class="nx">size</span><span class="p">,</span> <span class="nx">no_newlines</span>
<span class="err">@</span><span class="nv">indent: </span><span class="nx">size</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-29"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-29">#</a> </div> <p>Record an outdent token or tokens, if we happen to be moving back inwards
past multiple recorded indents.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">outdent_token: </span><span class="p">(</span><span class="nx">move_out</span><span class="p">,</span> <span class="nx">no_newlines</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-29"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-29">#</a> </div> <p>Record an outdent token or multiple tokens, if we happen to be moving back
inwards past several recorded indents.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">outdent_token: </span><span class="p">(</span><span class="nx">move_out</span><span class="p">,</span> <span class="nx">no_newlines</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="k">while</span> <span class="nx">move_out</span> <span class="o">&gt;</span> <span class="mi">0</span> <span class="o">and</span> <span class="err">@</span><span class="nx">indents</span><span class="p">.</span><span class="nx">length</span>
<span class="nv">last_indent: </span><span class="err">@</span><span class="nx">indents</span><span class="p">.</span><span class="nx">pop</span><span class="p">()</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;OUTDENT&#39;</span><span class="p">,</span> <span class="nx">last_indent</span>
@ -188,14 +197,16 @@ as being "spaced", because there are some cases where it makes a difference.</p>
<span class="nv">prev: </span><span class="err">@</span><span class="nx">prev</span><span class="p">()</span>
<span class="nv">prev.spaced: </span><span class="kc">true</span> <span class="k">if</span> <span class="nx">prev</span>
<span class="err">@</span><span class="nx">i</span> <span class="o">+=</span> <span class="nx">space</span><span class="p">.</span><span class="nx">length</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-31"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-31">#</a> </div> <p>Generate a newline token. Multiple newlines get merged together.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">newline_token: </span><span class="p">(</span><span class="nx">newlines</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-31"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-31">#</a> </div> <p>Generate a newline token. Consecutive newlines get merged together.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">newline_token: </span><span class="p">(</span><span class="nx">newlines</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;TERMINATOR&#39;</span><span class="p">,</span> <span class="s2">&quot;\n&quot;</span> <span class="nx">unless</span> <span class="err">@</span><span class="nx">tag</span><span class="p">()</span> <span class="o">is</span> <span class="s1">&#39;TERMINATOR&#39;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-32"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-32">#</a> </div> <p>Use a <code>\</code> at a line-ending to suppress the newline.
The backslash is removed here once its job is done.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">suppress_newlines: </span><span class="p">(</span><span class="nx">newlines</span><span class="p">)</span> <span class="o">-&gt;</span>
The backslash is removed here once its job is done.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">suppress_newlines: </span><span class="o">-&gt;</span>
<span class="err">@</span><span class="nx">tokens</span><span class="p">.</span><span class="nx">pop</span><span class="p">()</span> <span class="k">if</span> <span class="err">@</span><span class="nx">value</span><span class="p">()</span> <span class="o">is</span> <span class="s2">&quot;\\&quot;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-33"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-33">#</a> </div> <p>We treat all other single characters as a token. E.g.: <code>( ) , . !</code>
Multi-character operators are also literal tokens, so that Jison can assign
the proper order of operations.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">literal_token: </span><span class="o">-&gt;</span>
the proper order of operations. There are some symbols that we tag specially
here. <code>;</code> and newlines are both treated as a <code>TERMINATOR</code>; we distinguish
parentheses that indicate a method call from regular parentheses, and so on.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">literal_token: </span><span class="o">-&gt;</span>
<span class="nv">match: </span><span class="err">@</span><span class="nx">chunk</span><span class="p">.</span><span class="nx">match</span><span class="p">(</span><span class="nx">OPERATOR</span><span class="p">)</span>
<span class="nv">value: </span><span class="nx">match</span> <span class="o">and</span> <span class="nx">match</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="err">@</span><span class="nx">tag_parameters</span><span class="p">()</span> <span class="k">if</span> <span class="nx">value</span> <span class="o">and</span> <span class="nx">value</span><span class="p">.</span><span class="nx">match</span><span class="p">(</span><span class="nx">CODE</span><span class="p">)</span>
@ -227,15 +238,14 @@ if it's a special kind of accessor.</p> </td> <td class=
<span class="err">@</span><span class="nx">tag</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="s1">&#39;SOAK_ACCESS&#39;</span><span class="p">)</span>
<span class="err">@</span><span class="nx">tokens</span><span class="p">.</span><span class="nx">splice</span><span class="p">(</span><span class="o">-</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="k">else</span>
<span class="err">@</span><span class="nx">tag</span> <span class="mi">1</span><span class="p">,</span> <span class="s1">&#39;PROPERTY_ACCESS&#39;</span></pre></div> </td> </tr> <tr id="section-36"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-36">#</a> </div> <p>Sanitize a heredoc by escaping double quotes and erasing all external
indentation on the left-hand side.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">sanitize_heredoc: </span><span class="p">(</span><span class="nx">doc</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="err">@</span><span class="nx">tag</span> <span class="mi">1</span><span class="p">,</span> <span class="s1">&#39;PROPERTY_ACCESS&#39;</span></pre></div> </td> </tr> <tr id="section-36"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-36">#</a> </div> <p>Sanitize a heredoc by escaping internal double quotes and erasing all
external indentation on the left-hand side.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">sanitize_heredoc: </span><span class="p">(</span><span class="nx">doc</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="nv">indent: </span><span class="p">(</span><span class="nx">doc</span><span class="p">.</span><span class="nx">match</span><span class="p">(</span><span class="nx">HEREDOC_INDENT</span><span class="p">)</span> <span class="o">or</span> <span class="p">[</span><span class="s1">&#39;&#39;</span><span class="p">]).</span><span class="nx">sort</span><span class="p">()[</span><span class="mi">0</span><span class="p">]</span>
<span class="nx">doc</span><span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="k">new</span> <span class="nb">RegExp</span><span class="p">(</span><span class="s2">&quot;^&quot;</span> <span class="o">+</span><span class="nx">indent</span><span class="p">,</span> <span class="s1">&#39;gm&#39;</span><span class="p">),</span> <span class="s1">&#39;&#39;</span><span class="p">)</span>
<span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="nx">MULTILINER</span><span class="p">,</span> <span class="s2">&quot;\\n&quot;</span><span class="p">)</span>
<span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="sr">/&quot;/g</span><span class="p">,</span> <span class="s1">&#39;\\&quot;&#39;</span><span class="p">)</span></pre></div> </td> </tr> <tr id="section-37"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-37">#</a> </div> <p>A source of ambiguity in our grammar was parameter lists in function
definitions (as opposed to argument lists in function calls). Tag
parameter identifiers in order to avoid this. Also, parameter lists can
make use of splats.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">tag_parameters: </span><span class="o">-&gt;</span>
<span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="sr">/&quot;/g</span><span class="p">,</span> <span class="s1">&#39;\\&quot;&#39;</span><span class="p">)</span></pre></div> </td> </tr> <tr id="section-37"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-37">#</a> </div> <p>A source of ambiguity in our grammar used to be parameter lists in function
definitions versus argument lists in function calls. Walk backwards, tagging
parameters specially in order to make things easier for the parser.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">tag_parameters: </span><span class="o">-&gt;</span>
<span class="k">return</span> <span class="k">if</span> <span class="err">@</span><span class="nx">tag</span><span class="p">()</span> <span class="o">isnt</span> <span class="s1">&#39;)&#39;</span>
<span class="nv">i: </span><span class="mi">0</span>
<span class="k">while</span> <span class="kc">true</span>
@ -247,69 +257,92 @@ make use of splats.</p> </td> <td class="code">
<span class="k">when</span> <span class="s1">&#39;)&#39;</span> <span class="k">then</span> <span class="nx">tok</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">:</span> <span class="s1">&#39;PARAM_END&#39;</span>
<span class="k">when</span> <span class="s1">&#39;(&#39;</span> <span class="k">then</span> <span class="k">return</span> <span class="nx">tok</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">:</span> <span class="s1">&#39;PARAM_START&#39;</span>
<span class="kc">true</span></pre></div> </td> </tr> <tr id="section-38"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-38">#</a> </div> <p>Close up all remaining open blocks at the end of the file.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">close_indentation: </span><span class="o">-&gt;</span>
<span class="err">@</span><span class="nx">outdent_token</span><span class="p">(</span><span class="err">@</span><span class="nx">indent</span><span class="p">)</span></pre></div> </td> </tr> <tr id="section-39"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-39">#</a> </div> <p>Error for when you try to use a forbidden word in JavaScript as
<span class="err">@</span><span class="nx">outdent_token</span><span class="p">(</span><span class="err">@</span><span class="nx">indent</span><span class="p">)</span></pre></div> </td> </tr> <tr id="section-39"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-39">#</a> </div> <p>The error for when you try to use a forbidden word in JavaScript as
an identifier.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">identifier_error: </span><span class="p">(</span><span class="nx">word</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span> <span class="s2">&quot;SyntaxError: Reserved word \&quot;$word\&quot; on line ${@line + 1}&quot;</span></pre></div> </td> </tr> <tr id="section-40"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-40">#</a> </div> <p>Error for when you try to assign to a reserved word in JavaScript,
<span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span> <span class="s2">&quot;SyntaxError: Reserved word \&quot;$word\&quot; on line ${@line + 1}&quot;</span></pre></div> </td> </tr> <tr id="section-40"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-40">#</a> </div> <p>The error for when you try to assign to a reserved word in JavaScript,
like "function" or "default".</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">assignment_error: </span><span class="o">-&gt;</span>
<span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span> <span class="s2">&quot;SyntaxError: Reserved word \&quot;${@value()}\&quot; on line ${@line + 1} can&#39;t be assigned&quot;</span></pre></div> </td> </tr> <tr id="section-41"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-41">#</a> </div> <p>Expand variables and expressions inside double-quoted strings using
<a href="http://wiki.ecmascript.org/doku.php?id=strawman:string_interpolation">ECMA Harmony's interpolation syntax</a>.</p>
<span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span> <span class="s2">&quot;SyntaxError: Reserved word \&quot;${@value()}\&quot; on line ${@line + 1} can&#39;t be assigned&quot;</span></pre></div> </td> </tr> <tr id="section-41"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-41">#</a> </div> <p>Matches a balanced group such as a single or double-quoted string. Pass in
a series of delimiters, all of which must be nested correctly within the
contents of the string. This method allows us to have strings within
interpolations within strings, and so on.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">balanced_string: </span><span class="p">(</span><span class="nx">str</span><span class="p">,</span> <span class="nx">delimited</span><span class="p">...)</span> <span class="o">-&gt;</span>
<span class="nv">levels: </span><span class="p">[]</span>
<span class="nv">i: </span><span class="mi">0</span>
<span class="k">while</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="nx">str</span><span class="p">.</span><span class="nx">length</span>
<span class="k">for</span> <span class="nx">pair</span> <span class="k">in</span> <span class="nx">delimited</span>
<span class="p">[</span><span class="nx">open</span><span class="p">,</span> <span class="nx">close</span><span class="p">]</span><span class="o">:</span> <span class="nx">pair</span>
<span class="k">if</span> <span class="nx">levels</span><span class="p">.</span><span class="nx">length</span> <span class="o">and</span> <span class="nx">starts</span> <span class="nx">str</span><span class="p">,</span> <span class="s1">&#39;\\&#39;</span><span class="p">,</span> <span class="nx">i</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">break</span>
<span class="k">else</span> <span class="k">if</span> <span class="nx">levels</span><span class="p">.</span><span class="nx">length</span> <span class="o">and</span> <span class="nx">starts</span><span class="p">(</span><span class="nx">str</span><span class="p">,</span> <span class="nx">close</span><span class="p">,</span> <span class="nx">i</span><span class="p">)</span> <span class="o">and</span> <span class="nx">levels</span><span class="p">[</span><span class="nx">levels</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span> <span class="o">is</span> <span class="nx">pair</span>
<span class="nx">levels</span><span class="p">.</span><span class="nx">pop</span><span class="p">()</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="nx">close</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="mi">1</span> <span class="nx">unless</span> <span class="nx">levels</span><span class="p">.</span><span class="nx">length</span>
<span class="k">break</span>
<span class="k">else</span> <span class="k">if</span> <span class="nx">starts</span> <span class="nx">str</span><span class="p">,</span> <span class="nx">open</span><span class="p">,</span> <span class="nx">i</span>
<span class="nx">levels</span><span class="p">.</span><span class="nx">push</span><span class="p">(</span><span class="nx">pair</span><span class="p">)</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="nx">open</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span>
<span class="k">break</span>
<span class="k">break</span> <span class="nx">unless</span> <span class="nx">levels</span><span class="p">.</span><span class="nx">length</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span> <span class="s2">&quot;SyntaxError: Unterminated ${levels.pop()[0]} starting on line ${@line + 1}&quot;</span> <span class="k">if</span> <span class="nx">levels</span><span class="p">.</span><span class="nx">length</span>
<span class="k">return</span> <span class="kc">false</span> <span class="k">if</span> <span class="nx">i</span> <span class="o">is</span> <span class="mi">0</span>
<span class="k">return</span> <span class="nx">str</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nx">i</span><span class="p">)</span></pre></div> </td> </tr> <tr id="section-42"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-42">#</a> </div> <p>Expand variables and expressions inside double-quoted strings using
<a href="http://wiki.ecmascript.org/doku.php?id=strawman:string_interpolation">ECMA Harmony's interpolation syntax</a>
for substitution of bare variables as well as arbitrary expressions.</p>
<pre><code>"Hello $name."
"Hello ${name.capitalize()}."
</code></pre> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">interpolate_string: </span><span class="p">(</span><span class="nx">str</span><span class="p">)</span> <span class="o">-&gt;</span>
</code></pre>
<p>If it encounters an interpolation, this method will recursively create a
new Lexer, tokenize the interpolated contents, and merge them into the
token stream.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">interpolate_string: </span><span class="p">(</span><span class="nx">str</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="k">if</span> <span class="nx">str</span><span class="p">.</span><span class="nx">length</span> <span class="o">&lt;</span> <span class="mi">3</span> <span class="o">or</span> <span class="o">not</span> <span class="nx">starts</span> <span class="nx">str</span><span class="p">,</span> <span class="s1">&#39;&quot;&#39;</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;STRING&#39;</span><span class="p">,</span> <span class="nx">str</span>
<span class="k">else</span>
<span class="nv">lexer: </span> <span class="k">new</span> <span class="nx">Lexer</span><span class="p">()</span>
<span class="nv">tokens: </span><span class="p">[]</span>
<span class="nv">quote: </span> <span class="nx">str</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">str: </span> <span class="nx">str</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nx">str</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
<span class="k">while</span> <span class="nx">str</span><span class="p">.</span><span class="nx">length</span>
<span class="nv">match: </span><span class="nx">str</span><span class="p">.</span><span class="nx">match</span> <span class="nx">INTERPOLATION</span>
<span class="k">if</span> <span class="nx">match</span>
<span class="p">[</span><span class="nx">group</span><span class="p">,</span> <span class="nx">before</span><span class="p">,</span> <span class="nx">interp</span><span class="p">]</span><span class="o">:</span> <span class="nx">match</span>
<span class="k">if</span> <span class="nx">starts</span> <span class="nx">before</span><span class="p">,</span> <span class="s1">&#39;\\&#39;</span><span class="p">,</span> <span class="nx">before</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span>
<span class="nv">prev: </span><span class="nx">before</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nx">before</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span> <span class="p">[</span><span class="s1">&#39;STRING&#39;</span><span class="p">,</span> <span class="s2">&quot;$quote$prev$$interp$quote&quot;</span><span class="p">]</span> <span class="k">if</span> <span class="nx">before</span><span class="p">.</span><span class="nx">length</span>
<span class="k">else</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span> <span class="p">[</span><span class="s1">&#39;STRING&#39;</span><span class="p">,</span> <span class="s2">&quot;$quote$before$quote&quot;</span><span class="p">]</span> <span class="k">if</span> <span class="nx">before</span><span class="p">.</span><span class="nx">length</span>
<span class="k">if</span> <span class="nx">starts</span> <span class="nx">interp</span><span class="p">,</span> <span class="s1">&#39;{&#39;</span>
<span class="nv">inner: </span><span class="nx">interp</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nx">interp</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">nested: </span><span class="nx">lexer</span><span class="p">.</span><span class="nx">tokenize</span> <span class="s2">&quot;($inner)&quot;</span><span class="p">,</span> <span class="p">{</span><span class="nv">rewrite: </span><span class="kc">no</span><span class="p">,</span> <span class="nv">line: </span><span class="err">@</span><span class="nx">line</span><span class="p">}</span>
<span class="nx">nested</span><span class="p">.</span><span class="nx">pop</span><span class="p">()</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span> <span class="p">[</span><span class="s1">&#39;TOKENS&#39;</span><span class="p">,</span> <span class="nx">nested</span><span class="p">]</span>
<span class="k">else</span>
<span class="nv">interp: </span><span class="s2">&quot;this.${ interp.substring(1) }&quot;</span> <span class="k">if</span> <span class="nx">starts</span> <span class="nx">interp</span><span class="p">,</span> <span class="s1">&#39;@&#39;</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span> <span class="p">[</span><span class="s1">&#39;IDENTIFIER&#39;</span><span class="p">,</span> <span class="nx">interp</span><span class="p">]</span>
<span class="nv">str: </span><span class="nx">str</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="nx">group</span><span class="p">.</span><span class="nx">length</span><span class="p">)</span>
<span class="k">else</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span> <span class="p">[</span><span class="s1">&#39;STRING&#39;</span><span class="p">,</span> <span class="s2">&quot;$quote$str$quote&quot;</span><span class="p">]</span>
<span class="nv">str: </span><span class="s1">&#39;&#39;</span>
<span class="k">if</span> <span class="nx">tokens</span><span class="p">.</span><span class="nx">length</span> <span class="o">&gt;</span> <span class="mi">1</span>
<span class="k">for</span> <span class="nx">i</span> <span class="k">in</span> <span class="p">[</span><span class="nx">tokens</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span><span class="p">..</span><span class="mi">1</span><span class="p">]</span>
<span class="p">[</span><span class="nx">prev</span><span class="p">,</span> <span class="nx">tok</span><span class="p">]</span><span class="o">:</span> <span class="p">[</span><span class="nx">tokens</span><span class="p">[</span><span class="nx">i</span> <span class="o">-</span> <span class="mi">1</span><span class="p">],</span> <span class="nx">tokens</span><span class="p">[</span><span class="nx">i</span><span class="p">]]</span>
<span class="k">if</span> <span class="nx">tok</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">is</span> <span class="s1">&#39;STRING&#39;</span> <span class="o">and</span> <span class="nx">prev</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">is</span> <span class="s1">&#39;STRING&#39;</span>
<span class="p">[</span><span class="nx">prev</span><span class="p">,</span> <span class="nx">tok</span><span class="p">]</span><span class="o">:</span> <span class="p">[</span><span class="nx">prev</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nx">substring</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nx">prev</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span><span class="p">),</span> <span class="nx">tok</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nx">substring</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nx">tok</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)]</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">splice</span> <span class="nx">i</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="p">[</span><span class="s1">&#39;STRING&#39;</span><span class="p">,</span> <span class="s2">&quot;$quote$prev$tok$quote&quot;</span><span class="p">]</span>
<span class="nv">lexer: </span> <span class="k">new</span> <span class="nx">Lexer</span><span class="p">()</span>
<span class="nv">tokens: </span> <span class="p">[]</span>
<span class="nv">quote: </span> <span class="nx">str</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">[</span><span class="nx">i</span><span class="p">,</span> <span class="nx">pi</span><span class="p">]</span><span class="o">:</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">]</span>
<span class="k">while</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="nx">str</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span>
<span class="k">if</span> <span class="nx">starts</span> <span class="nx">str</span><span class="p">,</span> <span class="s1">&#39;\\&#39;</span><span class="p">,</span> <span class="nx">i</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">else</span> <span class="k">if</span> <span class="nv">match: </span><span class="nx">str</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="nx">i</span><span class="p">).</span><span class="nx">match</span> <span class="nx">INTERPOLATION</span>
<span class="p">[</span><span class="nx">group</span><span class="p">,</span> <span class="nx">interp</span><span class="p">]</span><span class="o">:</span> <span class="nx">match</span>
<span class="nv">interp: </span><span class="s2">&quot;this.${ interp.substring(1) }&quot;</span> <span class="k">if</span> <span class="nx">starts</span> <span class="nx">interp</span><span class="p">,</span> <span class="s1">&#39;@&#39;</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span> <span class="p">[</span><span class="s1">&#39;STRING&#39;</span><span class="p">,</span> <span class="s2">&quot;$quote${ str.substring(pi, i) }$quote&quot;</span><span class="p">]</span> <span class="k">if</span> <span class="nx">pi</span> <span class="o">&lt;</span> <span class="nx">i</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span> <span class="p">[</span><span class="s1">&#39;IDENTIFIER&#39;</span><span class="p">,</span> <span class="nx">interp</span><span class="p">]</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="nx">group</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span>
<span class="nv">pi: </span><span class="nx">i</span> <span class="o">+</span> <span class="mi">1</span>
<span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="nv">expr: </span><span class="err">@</span><span class="nx">balanced_string</span> <span class="nx">str</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="nx">i</span><span class="p">),</span> <span class="p">[</span><span class="s1">&#39;${&#39;</span><span class="p">,</span> <span class="s1">&#39;}&#39;</span><span class="p">])</span> <span class="o">and</span> <span class="nx">expr</span><span class="p">.</span><span class="nx">length</span> <span class="o">&gt;</span> <span class="mi">3</span>
<span class="nv">inner: </span><span class="nx">expr</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="nx">expr</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">nested: </span><span class="nx">lexer</span><span class="p">.</span><span class="nx">tokenize</span> <span class="s2">&quot;($inner)&quot;</span><span class="p">,</span> <span class="p">{</span><span class="nv">rewrite: </span><span class="kc">no</span><span class="p">,</span> <span class="nv">line: </span><span class="err">@</span><span class="nx">line</span><span class="p">}</span>
<span class="nx">nested</span><span class="p">.</span><span class="nx">pop</span><span class="p">()</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span> <span class="p">[</span><span class="s1">&#39;STRING&#39;</span><span class="p">,</span> <span class="s2">&quot;$quote${ str.substring(pi, i) }$quote&quot;</span><span class="p">]</span> <span class="k">if</span> <span class="nx">pi</span> <span class="o">&lt;</span> <span class="nx">i</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span> <span class="p">[</span><span class="s1">&#39;TOKENS&#39;</span><span class="p">,</span> <span class="nx">nested</span><span class="p">]</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="nx">expr</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span>
<span class="nv">pi: </span><span class="nx">i</span> <span class="o">+</span> <span class="mi">1</span>
<span class="nx">i</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span> <span class="p">[</span><span class="s1">&#39;STRING&#39;</span><span class="p">,</span> <span class="s2">&quot;$quote${ str.substring(pi, i) }$quote&quot;</span><span class="p">]</span> <span class="k">if</span> <span class="nx">pi</span> <span class="o">&lt;</span> <span class="nx">i</span> <span class="o">and</span> <span class="nx">pi</span> <span class="o">&lt;</span> <span class="nx">str</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span>
<span class="k">for</span> <span class="nx">each</span><span class="p">,</span> <span class="nx">i</span> <span class="k">in</span> <span class="nx">tokens</span>
<span class="k">if</span> <span class="nx">each</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">is</span> <span class="s1">&#39;TOKENS&#39;</span>
<span class="err">@</span><span class="nx">token</span> <span class="nx">nested</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="nx">nested</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="k">for</span> <span class="nx">nested</span> <span class="k">in</span> <span class="nx">each</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="err">@</span><span class="nv">tokens: </span><span class="err">@</span><span class="nx">tokens</span><span class="p">.</span><span class="nx">concat</span> <span class="nx">each</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="k">else</span>
<span class="err">@</span><span class="nx">token</span> <span class="nx">each</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="nx">each</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;+&#39;</span><span class="p">,</span> <span class="s1">&#39;+&#39;</span> <span class="k">if</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="nx">tokens</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span></pre></div> </td> </tr> <tr id="section-42"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-42">#</a> </div> <h2>Helpers</h2> </td> <td class="code"> <div class="highlight"><pre></pre></div> </td> </tr> <tr id="section-43"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-43">#</a> </div> <p>Add a token to the results, taking note of the line number.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">token: </span><span class="p">(</span><span class="nx">tag</span><span class="p">,</span> <span class="nx">value</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="err">@</span><span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span><span class="p">([</span><span class="nx">tag</span><span class="p">,</span> <span class="nx">value</span><span class="p">,</span> <span class="err">@</span><span class="nx">line</span><span class="p">])</span></pre></div> </td> </tr> <tr id="section-44"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-44">#</a> </div> <p>Peek at a tag in the current token stream.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">tag: </span><span class="p">(</span><span class="nx">index</span><span class="p">,</span> <span class="nx">tag</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="err">@</span><span class="nx">token</span> <span class="s1">&#39;+&#39;</span><span class="p">,</span> <span class="s1">&#39;+&#39;</span> <span class="k">if</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="nx">tokens</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="mi">1</span></pre></div> </td> </tr> <tr id="section-43"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-43">#</a> </div> <h2>Helpers</h2> </td> <td class="code"> <div class="highlight"><pre></pre></div> </td> </tr> <tr id="section-44"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-44">#</a> </div> <p>Add a token to the results, taking note of the line number.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">token: </span><span class="p">(</span><span class="nx">tag</span><span class="p">,</span> <span class="nx">value</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="err">@</span><span class="nx">tokens</span><span class="p">.</span><span class="nx">push</span><span class="p">([</span><span class="nx">tag</span><span class="p">,</span> <span class="nx">value</span><span class="p">,</span> <span class="err">@</span><span class="nx">line</span><span class="p">])</span></pre></div> </td> </tr> <tr id="section-45"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-45">#</a> </div> <p>Peek at a tag in the current token stream.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">tag: </span><span class="p">(</span><span class="nx">index</span><span class="p">,</span> <span class="nx">tag</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="k">return</span> <span class="nx">unless</span> <span class="nv">tok: </span><span class="err">@</span><span class="nx">prev</span><span class="p">(</span><span class="nx">index</span><span class="p">)</span>
<span class="k">return</span> <span class="nx">tok</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">:</span> <span class="nx">tag</span> <span class="k">if</span> <span class="nx">tag</span><span class="o">?</span>
<span class="nx">tok</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span></pre></div> </td> </tr> <tr id="section-45"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-45">#</a> </div> <p>Peek at a value in the current token stream.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">value: </span><span class="p">(</span><span class="nx">index</span><span class="p">,</span> <span class="nx">val</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="nx">tok</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span></pre></div> </td> </tr> <tr id="section-46"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-46">#</a> </div> <p>Peek at a value in the current token stream.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">value: </span><span class="p">(</span><span class="nx">index</span><span class="p">,</span> <span class="nx">val</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="k">return</span> <span class="nx">unless</span> <span class="nv">tok: </span><span class="err">@</span><span class="nx">prev</span><span class="p">(</span><span class="nx">index</span><span class="p">)</span>
<span class="k">return</span> <span class="nx">tok</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">:</span> <span class="nx">val</span> <span class="k">if</span> <span class="nx">val</span><span class="o">?</span>
<span class="nx">tok</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span></pre></div> </td> </tr> <tr id="section-46"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-46">#</a> </div> <p>Peek at a previous token, entire.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">prev: </span><span class="p">(</span><span class="nx">index</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="err">@</span><span class="nx">tokens</span><span class="p">[</span><span class="err">@</span><span class="nx">tokens</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="p">(</span><span class="nx">index</span> <span class="o">or</span> <span class="mi">1</span><span class="p">)]</span></pre></div> </td> </tr> <tr id="section-47"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-47">#</a> </div> <p>Attempt to match a string against the current chunk, returning the indexed
<span class="nx">tok</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span></pre></div> </td> </tr> <tr id="section-47"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-47">#</a> </div> <p>Peek at a previous token, entire.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">prev: </span><span class="p">(</span><span class="nx">index</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="err">@</span><span class="nx">tokens</span><span class="p">[</span><span class="err">@</span><span class="nx">tokens</span><span class="p">.</span><span class="nx">length</span> <span class="o">-</span> <span class="p">(</span><span class="nx">index</span> <span class="o">or</span> <span class="mi">1</span><span class="p">)]</span></pre></div> </td> </tr> <tr id="section-48"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-48">#</a> </div> <p>Attempt to match a string against the current chunk, returning the indexed
match if successful, and <code>false</code> otherwise.</p> </td> <td class="code"> <div class="highlight"><pre> <span class="nv">match: </span><span class="p">(</span><span class="nx">regex</span><span class="p">,</span> <span class="nx">index</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="k">return</span> <span class="kc">false</span> <span class="nx">unless</span> <span class="nv">m: </span><span class="err">@</span><span class="nx">chunk</span><span class="p">.</span><span class="nx">match</span><span class="p">(</span><span class="nx">regex</span><span class="p">)</span>
<span class="k">if</span> <span class="nx">m</span> <span class="k">then</span> <span class="nx">m</span><span class="p">[</span><span class="nx">index</span><span class="p">]</span> <span class="k">else</span> <span class="kc">false</span></pre></div> </td> </tr> <tr id="section-48"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-48">#</a> </div> <h2>Utility Functions</h2> </td> <td class="code"> <div class="highlight"><pre></pre></div> </td> </tr> <tr id="section-49"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-49">#</a> </div> <p>Does a list include a value?</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">include: </span><span class="p">(</span><span class="nx">list</span><span class="p">,</span> <span class="nx">value</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="nx">list</span><span class="p">.</span><span class="nx">indexOf</span><span class="p">(</span><span class="nx">value</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="mi">0</span></pre></div> </td> </tr> <tr id="section-50"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-50">#</a> </div> <p>Peek at the beginning of a given string to see if it matches a sequence.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">starts: </span><span class="p">(</span><span class="nx">string</span><span class="p">,</span> <span class="nx">literal</span><span class="p">,</span> <span class="nx">start</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="nx">string</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="nx">start</span><span class="p">,</span> <span class="p">(</span><span class="nx">start</span> <span class="o">or</span> <span class="mi">0</span><span class="p">)</span> <span class="o">+</span> <span class="nx">literal</span><span class="p">.</span><span class="nx">length</span><span class="p">)</span> <span class="o">is</span> <span class="nx">literal</span></pre></div> </td> </tr> <tr id="section-51"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-51">#</a> </div> <p>Trim out all falsy values from an array.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">compact: </span><span class="p">(</span><span class="nx">array</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nx">item</span> <span class="k">for</span> <span class="nx">item</span> <span class="k">in</span> <span class="nx">array</span> <span class="k">when</span> <span class="nx">item</span></pre></div> </td> </tr> <tr id="section-52"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-52">#</a> </div> <p>Count the number of occurrences of a character in a string.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">count: </span><span class="p">(</span><span class="nx">string</span><span class="p">,</span> <span class="nx">letter</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="k">if</span> <span class="nx">m</span> <span class="k">then</span> <span class="nx">m</span><span class="p">[</span><span class="nx">index</span><span class="p">]</span> <span class="k">else</span> <span class="kc">false</span></pre></div> </td> </tr> <tr id="section-49"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-49">#</a> </div> <h2>Utility Functions</h2> </td> <td class="code"> <div class="highlight"><pre></pre></div> </td> </tr> <tr id="section-50"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-50">#</a> </div> <p>Does a list include a value?</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">include: </span><span class="p">(</span><span class="nx">list</span><span class="p">,</span> <span class="nx">value</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="nx">list</span><span class="p">.</span><span class="nx">indexOf</span><span class="p">(</span><span class="nx">value</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="mi">0</span></pre></div> </td> </tr> <tr id="section-51"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-51">#</a> </div> <p>Peek at the beginning of a given string to see if it matches a sequence.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">starts: </span><span class="p">(</span><span class="nx">string</span><span class="p">,</span> <span class="nx">literal</span><span class="p">,</span> <span class="nx">start</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="nx">string</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="nx">start</span><span class="p">,</span> <span class="p">(</span><span class="nx">start</span> <span class="o">or</span> <span class="mi">0</span><span class="p">)</span> <span class="o">+</span> <span class="nx">literal</span><span class="p">.</span><span class="nx">length</span><span class="p">)</span> <span class="o">is</span> <span class="nx">literal</span></pre></div> </td> </tr> <tr id="section-52"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-52">#</a> </div> <p>Trim out all falsy values from an array.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">compact: </span><span class="p">(</span><span class="nx">array</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nx">item</span> <span class="k">for</span> <span class="nx">item</span> <span class="k">in</span> <span class="nx">array</span> <span class="k">when</span> <span class="nx">item</span></pre></div> </td> </tr> <tr id="section-53"> <td class="docs"> <div class="octowrap"> <a class="octothorpe" href="#section-53">#</a> </div> <p>Count the number of occurrences of a character in a string.</p> </td> <td class="code"> <div class="highlight"><pre><span class="nv">count: </span><span class="p">(</span><span class="nx">string</span><span class="p">,</span> <span class="nx">letter</span><span class="p">)</span> <span class="o">-&gt;</span>
<span class="nv">num: </span><span class="mi">0</span>
<span class="nv">pos: </span><span class="nx">string</span><span class="p">.</span><span class="nx">indexOf</span><span class="p">(</span><span class="nx">letter</span><span class="p">)</span>
<span class="k">while</span> <span class="nx">pos</span> <span class="o">isnt</span> <span class="o">-</span><span class="mi">1</span>
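
The interpolation loop shown in this hunk walks a double-quoted string with two cursors, `i` (current position) and `pi` (start of the pending literal segment), and emits a piece whenever it meets `$name` or `${ ... }`. A minimal JavaScript sketch of that idea follows. It is a simplified illustration, not the lexer's actual code: `splitInterpolations`, `findClosingBrace`, and the `EXPRESSION` tag are hypothetical names, brace matching here is a plain depth counter rather than `balanced_string`, and expressions are returned as raw text instead of being re-lexed.

```javascript
// Hypothetical, simplified sketch of splitting an interpolated string.
// Same pattern as the INTERPOLATION regex in this commit.
const INTERPOLATION = /^\$([a-zA-Z_@]\w*)/;

// Find the `}` that closes the `{` at index `open`, skipping escaped characters.
function findClosingBrace(str, open) {
  let depth = 0;
  for (let j = open; j < str.length; j++) {
    if (str[j] === '\\') { j++; continue; }
    if (str[j] === '{') depth++;
    else if (str[j] === '}' && --depth === 0) return j;
  }
  return -1;
}

// `str` includes its surrounding double quotes, e.g. '"Hello $name."'.
function splitInterpolations(str) {
  const parts = [];
  const end = str.length - 1; // index of the closing quote
  let i = 1;                  // current position, just past the opening quote
  let pi = 1;                 // start of the literal text not yet emitted
  const flush = (upTo) => {
    if (pi < upTo) parts.push(['STRING', str.substring(pi, upTo)]);
  };
  while (i < end) {
    const match = str.substring(i).match(INTERPOLATION);
    if (str[i] === '\\') {
      i += 2;                 // skip the backslash and the escaped character
    } else if (match) {
      flush(i);
      let name = match[1];
      if (name[0] === '@') name = 'this.' + name.substring(1);
      parts.push(['IDENTIFIER', name]);
      i += match[0].length;
      pi = i;
    } else if (str.startsWith('${', i)) {
      const close = findClosingBrace(str, i + 1);
      if (close > i + 2) {    // non-empty, terminated expression
        flush(i);
        parts.push(['EXPRESSION', str.substring(i + 2, close)]);
        i = close + 1;
        pi = i;
      } else {
        i += 1;               // treat an empty or unterminated `${` as literal text
      }
    } else {
      i += 1;
    }
  }
  flush(end);
  return parts;
}
```

In the real lexer the pieces are then pushed into the token stream with a `+` token between each pair, so the string compiles to a concatenation.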

View File

@ -4,16 +4,14 @@
// The CoffeeScript parser is generated by [Jison](http://github.com/zaach/jison)
// from this grammar file. Jison is a bottom-up parser generator, similar in
// style to [Bison](http://www.gnu.org/software/bison), implemented in JavaScript.
// It can recognize
// [LALR(1), LR(0), SLR(1), and LR(1)](http://en.wikipedia.org/wiki/LR_grammar)
// It can recognize [LALR(1), LR(0), SLR(1), and LR(1)](http://en.wikipedia.org/wiki/LR_grammar)
// type grammars. To create the Jison parser, we list the pattern to match
// on the left-hand side, and the action to take (usually the creation of syntax
// tree nodes) on the right. As the parser runs, it
// shifts tokens from our token stream, from left to right, and
// [attempts to match](http://en.wikipedia.org/wiki/Bottom-up_parsing)
// the token sequence against the rules below. When a match can be made, it
// reduces into the
// [nonterminal](http://en.wikipedia.org/wiki/Terminal_and_nonterminal_symbols)
// reduces into the [nonterminal](http://en.wikipedia.org/wiki/Terminal_and_nonterminal_symbols)
// (the enclosing name at the top), and we proceed from there.
// If you run the `cake build:parser` command, Jison constructs a parse table
// from our rules and saves it into `lib/parser.js`.
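
The shift/reduce cycle described in this comment can be sketched by hand for a toy grammar. The JavaScript below is only an illustration under stated assumptions: the two-rule grammar, the `parseSum` name, and the token shape `{type, value}` are all hypothetical, and the reduction checks are hard-coded rather than driven by a generated parse table as Jison's are.

```javascript
// Toy grammar (an assumption for illustration, not CoffeeScript's):
//   Sum -> Sum '+' NUM
//   Sum -> NUM
function parseSum(tokens) {
  const stack = [];
  for (const token of tokens) {
    stack.push(token); // shift the next token onto the stack
    let reduced = true;
    while (reduced) {  // reduce for as long as a rule matches the stack top
      reduced = false;
      const n = stack.length;
      const top = stack[n - 1];
      if (n >= 3 && stack[n - 3].type === 'Sum' &&
          stack[n - 2].type === '+' && top.type === 'NUM') {
        // Sum -> Sum '+' NUM: the action builds the combined value.
        stack.splice(n - 3, 3, { type: 'Sum', value: stack[n - 3].value + top.value });
        reduced = true;
      } else if (n === 1 && top.type === 'NUM') {
        // Sum -> NUM
        stack[0] = { type: 'Sum', value: top.value };
        reduced = true;
      }
    }
  }
  if (stack.length !== 1 || stack[0].type !== 'Sum') {
    throw new Error('Parse error');
  }
  return stack[0].value;
}
```

Here the action attached to each rule computes a number; in the grammar file the actions usually create syntax tree nodes instead.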

View File

@ -74,7 +74,14 @@
// pushing some extra smarts into the Lexer.
exports.Lexer = (function() {
Lexer = function Lexer() { };
// Scan by attempting to match tokens one at a time. Slow and steady.
// **tokenize** is the Lexer's main method. Scan by attempting to match tokens
// one at a time, using a regular expression anchored at the start of the
// remaining code, or a custom recursive token-matching method
// (for interpolations). When the next token has been recorded, we move forward
// within the code past the token, and begin again.
// Each tokenizing method is responsible for incrementing `@i` by the number of
// characters it has consumed. `@i` can be thought of as our finger on the page
// of source.
Lexer.prototype.tokenize = function tokenize(code, options) {
var o;
o = options || {};
@ -85,11 +92,11 @@
this.line = o.line || 0;
// The current line.
this.indent = 0;
// The current indent level.
// The current indentation level.
this.indents = [];
// The stack of all indent levels we are currently within.
// The stack of all current indentation levels.
this.tokens = [];
// Collection of all parsed tokens in the form ['TOKEN_TYPE', value, line]
// Stream of parsed tokens in the form ['TYPE', value, line]
while (this.i < this.code.length) {
this.chunk = this.code.slice(this.i);
this.extract_next_token();
@ -101,7 +108,8 @@
return (new Rewriter()).rewrite(this.tokens);
};
// At every position, run through this list of attempted matches,
// short-circuiting if any of them succeed.
// short-circuiting if any of them succeed. Their order determines precedence:
// `@literal_token` is the fallback catch-all.
Lexer.prototype.extract_next_token = function extract_next_token() {
if (this.identifier_token()) {
return null;
@ -112,12 +120,6 @@
if (this.heredoc_token()) {
return null;
}
if (this.string_token()) {
return null;
}
if (this.js_token()) {
return null;
}
if (this.regex_token()) {
return null;
}
@ -130,11 +132,22 @@
if (this.whitespace_token()) {
return null;
}
if (this.js_token()) {
return null;
}
if (this.string_token()) {
return null;
}
return this.literal_token();
};
// Tokenizers
// ----------
// Matches identifying literals: variables, keywords, method names, etc.
// Check to ensure that JavaScript reserved words aren't being used as
// identifiers. Because CoffeeScript reserves a handful of keywords that are
// allowed in JavaScript, we're careful not to tag them as keywords when
// referenced as property names here, so you can still do `jQuery.is()` even
// though `is` means `===` otherwise.
Lexer.prototype.identifier_token = function identifier_token() {
var id, tag;
if (!((id = this.match(IDENTIFIER, 1)))) {
@ -165,11 +178,15 @@
this.i += number.length;
return true;
};
// Matches strings, including multi-line strings.
// Matches strings, including multi-line strings. Ensures that quotation marks
// are balanced within the string's contents, and within nested interpolations.
Lexer.prototype.string_token = function string_token() {
var string;
if (!(starts(this.chunk, '"') || starts(this.chunk, "'"))) {
return false;
}
string = this.balanced_token(['"', '"'], ['${', '}']);
if (string === false) {
if (!(string)) {
string = this.balanced_token(["'", "'"]);
}
if (!(string)) {
@ -180,7 +197,8 @@
this.i += string.length;
return true;
};
// Matches heredocs, adjusting indentation to the correct level.
// Matches heredocs, adjusting indentation to the correct level, as heredocs
// preserve whitespace, but ignore indentation to the left.
Lexer.prototype.heredoc_token = function heredoc_token() {
var doc, match;
if (!((match = this.chunk.match(HEREDOC)))) {
@ -192,9 +210,12 @@
this.i += match[1].length;
return true;
};
// Matches interpolated JavaScript.
// Matches JavaScript interpolated directly into the source via backticks.
Lexer.prototype.js_token = function js_token() {
var script;
if (!(starts(this.chunk, '`'))) {
return false;
}
if (!((script = this.balanced_token(['`', '`'])))) {
return false;
}
@ -202,7 +223,9 @@
this.i += script.length;
return true;
};
// Matches regular expression literals.
// Matches regular expression literals. Regular expressions are difficult to
// distinguish from division when lexing, so we borrow some basic heuristics
// from JavaScript and Ruby.
Lexer.prototype.regex_token = function regex_token() {
var regex;
if (!((regex = this.match(REGEX, 1)))) {
@ -215,57 +238,15 @@
this.i += regex.length;
return true;
};
// Matches a balanced group such as a single or double-quoted string. Pass in
// a series of delimiters, all of which must be balanced correctly within the
// string.
Lexer.prototype.balanced_string = function balanced_string(str) {
var _a, _b, _c, _d, close, delimited, i, levels, open, pair;
delimited = Array.prototype.slice.call(arguments, 1);
levels = [];
i = 0;
while (i < str.length) {
_a = delimited;
for (_b = 0, _c = _a.length; _b < _c; _b++) {
pair = _a[_b];
_d = pair;
open = _d[0];
close = _d[1];
if (levels.length && starts(str, '\\', i)) {
i += 1;
break;
} else if (levels.length && starts(str, close, i) && levels[levels.length - 1] === pair) {
levels.pop();
i += close.length - 1;
if (!(levels.length)) {
i += 1;
}
break;
} else if (starts(str, open, i)) {
levels.push(pair);
i += open.length - 1;
break;
}
}
if (!(levels.length)) {
break;
}
i += 1;
}
if (levels.length) {
throw new Error("SyntaxError: Unterminated " + (levels.pop()[0]) + " starting on line " + (this.line + 1));
}
if (i === 0) {
return false;
}
return str.substring(0, i);
};
// Matches a balanced string within the token's contents.
// Matches a token in which the passed delimiter pairs must be correctly
// balanced (i.e. strings, JS literals).
Lexer.prototype.balanced_token = function balanced_token() {
var delimited;
delimited = Array.prototype.slice.call(arguments, 0);
return this.balanced_string.apply(this, [this.chunk].concat(delimited));
};
// Matches and consumes comments.
// Matches and consumes comments. We pass through comments into JavaScript,
// so they're treated as real tokens, like any other part of the language.
Lexer.prototype.comment_token = function comment_token() {
var comment, lines;
if (!((comment = this.match(COMMENT, 1)))) {
@ -279,6 +260,13 @@
return true;
};
// Matches newlines, indents, and outdents, and determines which is which.
// If we can detect that the current line is continued onto the next line,
// then the newline is suppressed:
// elements
// .each( ... )
// .map( ... )
// Keeps track of the level of indentation, because a single outdent token
// can close multiple indents, so we need to know how far in we happen to be.
Lexer.prototype.line_token = function line_token() {
var diff, indent, next_character, no_newlines, prev, size;
if (!((indent = this.match(MULTI_DENT, 1)))) {
@ -292,12 +280,12 @@
no_newlines = next_character === '.' || (this.value() && this.value().match(NO_NEWLINE) && prev && (prev[0] !== '.') && !this.value().match(CODE));
if (size === this.indent) {
if (no_newlines) {
return this.suppress_newlines(indent);
return this.suppress_newlines();
}
return this.newline_token(indent);
} else if (size > this.indent) {
if (no_newlines) {
return this.suppress_newlines(indent);
return this.suppress_newlines();
}
diff = size - this.indent;
this.token('INDENT', diff);
@ -308,8 +296,8 @@
this.indent = size;
return true;
};
// Record an outdent token or tokens, if we happen to be moving back inwards
// past multiple recorded indents.
// Record an outdent token or multiple tokens, if we happen to be moving back
// inwards past several recorded indents.
Lexer.prototype.outdent_token = function outdent_token(move_out, no_newlines) {
var last_indent;
while (move_out > 0 && this.indents.length) {
@ -336,7 +324,7 @@
this.i += space.length;
return true;
};
// Generate a newline token. Multiple newlines get merged together.
// Generate a newline token. Consecutive newlines get merged together.
Lexer.prototype.newline_token = function newline_token(newlines) {
if (!(this.tag() === 'TERMINATOR')) {
this.token('TERMINATOR', "\n");
@ -345,7 +333,7 @@
};
// Use a `\` at a line-ending to suppress the newline.
// The slash is removed here once its job is done.
Lexer.prototype.suppress_newlines = function suppress_newlines(newlines) {
Lexer.prototype.suppress_newlines = function suppress_newlines() {
if (this.value() === "\\") {
this.tokens.pop();
}
@ -353,7 +341,9 @@
};
// We treat all other single characters as a token. E.g.: `( ) , . !`
// Multi-character operators are also literal tokens, so that Jison can assign
// the proper order of operations.
// the proper order of operations. There are some symbols that we tag specially
// here: `;` and newlines are both treated as a `TERMINATOR`, we distinguish
// parentheses that indicate a method call from regular parentheses, and so on.
Lexer.prototype.literal_token = function literal_token() {
var match, not_spaced, tag, value;
match = this.chunk.match(OPERATOR);
@ -407,17 +397,16 @@
}
}
};
// Sanitize a heredoc by escaping double quotes and erasing all external
// indentation on the left-hand side.
// Sanitize a heredoc by escaping internal double quotes and erasing all
// external indentation on the left-hand side.
Lexer.prototype.sanitize_heredoc = function sanitize_heredoc(doc) {
var indent;
indent = (doc.match(HEREDOC_INDENT) || ['']).sort()[0];
return doc.replace(new RegExp("^" + indent, 'gm'), '').replace(MULTILINER, "\\n").replace(/"/g, '\\"');
};
// A source of ambiguity in our grammar was parameter lists in function
// definitions (as opposed to argument lists in function calls). Tag
// parameter identifiers in order to avoid this. Also, parameter lists can
// make use of splats.
// A source of ambiguity in our grammar used to be parameter lists in function
// definitions versus argument lists in function calls. Walk backwards, tagging
// parameters specially in order to make things easier for the parser.
Lexer.prototype.tag_parameters = function tag_parameters() {
var _a, i, tok;
if (this.tag() !== ')') {
@ -444,104 +433,126 @@
Lexer.prototype.close_indentation = function close_indentation() {
return this.outdent_token(this.indent);
};
// Error for when you try to use a forbidden word in JavaScript as
// The error for when you try to use a forbidden word in JavaScript as
// an identifier.
Lexer.prototype.identifier_error = function identifier_error(word) {
throw new Error("SyntaxError: Reserved word \"" + word + "\" on line " + (this.line + 1));
};
// Error for when you try to assign to a reserved word in JavaScript,
// The error for when you try to assign to a reserved word in JavaScript,
// like "function" or "default".
Lexer.prototype.assignment_error = function assignment_error() {
throw new Error("SyntaxError: Reserved word \"" + (this.value()) + "\" on line " + (this.line + 1) + " can't be assigned");
};
// Matches a balanced group such as a single or double-quoted string. Pass in
// a series of delimiters, all of which must be nested correctly within the
// contents of the string. This method allows us to have strings within
// interpolations within strings etc...
Lexer.prototype.balanced_string = function balanced_string(str) {
var _a, _b, _c, _d, close, delimited, i, levels, open, pair;
delimited = Array.prototype.slice.call(arguments, 1);
levels = [];
i = 0;
while (i < str.length) {
_a = delimited;
for (_b = 0, _c = _a.length; _b < _c; _b++) {
pair = _a[_b];
_d = pair;
open = _d[0];
close = _d[1];
if (levels.length && starts(str, '\\', i)) {
i += 1;
break;
} else if (levels.length && starts(str, close, i) && levels[levels.length - 1] === pair) {
levels.pop();
i += close.length - 1;
if (!(levels.length)) {
i += 1;
}
break;
} else if (starts(str, open, i)) {
levels.push(pair);
i += open.length - 1;
break;
}
}
if (!(levels.length)) {
break;
}
i += 1;
}
if (levels.length) {
throw new Error("SyntaxError: Unterminated " + (levels.pop()[0]) + " starting on line " + (this.line + 1));
}
if (i === 0) {
return false;
}
return str.substring(0, i);
};
// Expand variables and expressions inside double-quoted strings using
// [ECMA Harmony's interpolation syntax](http://wiki.ecmascript.org/doku.php?id=strawman:string_interpolation).
// [ECMA Harmony's interpolation syntax](http://wiki.ecmascript.org/doku.php?id=strawman:string_interpolation)
// for substitution of bare variables as well as arbitrary expressions.
// "Hello $name."
// "Hello ${name.capitalize()}."
// If it encounters an interpolation, this method will recursively create a
// new Lexer, tokenize the interpolated contents, and merge them into the
// token stream.
Lexer.prototype.interpolate_string = function interpolate_string(str) {
var _a, _b, _c, _d, _e, _f, _g, _h, _i, _j, _k, _l, _m, each, expression, group, i, inner, interp, last_i, lexer, match, nested, prev, quote, tok, tokens;
var _a, _b, _c, _d, _e, each, expr, group, i, inner, interp, lexer, match, nested, pi, quote, tokens;
if (str.length < 3 || !starts(str, '"')) {
return this.token('STRING', str);
} else {
lexer = new Lexer();
tokens = [];
quote = str.substring(0, 1);
i = 1;
last_i = i;
_a = [1, 1];
i = _a[0];
pi = _a[1];
while (i < str.length - 1) {
if (starts(str, '\\', i)) {
i += 1;
} else {
match = str.substring(i).match(INTERPOLATION);
if (match) {
_a = match;
group = _a[0];
interp = _a[1];
if (starts(interp, '@')) {
interp = "this." + (interp.substring(1));
}
if (last_i < i) {
tokens.push(['STRING', quote + (str.substring(last_i, i)) + quote]);
}
tokens.push(['IDENTIFIER', interp]);
i += group.length - 1;
last_i = i + 1;
} else {
expression = this.balanced_string(str.substring(i), ['${', '}']);
if (expression && expression.length > 3) {
inner = expression.substring(2, expression.length - 1);
nested = lexer.tokenize("(" + inner + ")", {
rewrite: false,
line: this.line
});
nested.pop();
if (last_i < i) {
tokens.push(['STRING', quote + (str.substring(last_i, i)) + quote]);
}
tokens.push(['TOKENS', nested]);
i += expression.length - 1;
last_i = i + 1;
}
} else if ((match = str.substring(i).match(INTERPOLATION))) {
_b = match;
group = _b[0];
interp = _b[1];
if (starts(interp, '@')) {
interp = "this." + (interp.substring(1));
}
if (pi < i) {
tokens.push(['STRING', quote + (str.substring(pi, i)) + quote]);
}
tokens.push(['IDENTIFIER', interp]);
i += group.length - 1;
pi = i + 1;
} else if (((expr = this.balanced_string(str.substring(i), ['${', '}']))) && expr.length > 3) {
inner = expr.substring(2, expr.length - 1);
nested = lexer.tokenize("(" + inner + ")", {
rewrite: false,
line: this.line
});
nested.pop();
if (pi < i) {
tokens.push(['STRING', quote + (str.substring(pi, i)) + quote]);
}
tokens.push(['TOKENS', nested]);
i += expr.length - 1;
pi = i + 1;
}
i += 1;
}
if (last_i < i && last_i < str.length - 1) {
tokens.push(['STRING', quote + (str.substring(last_i, i)) + quote]);
if (pi < i && pi < str.length - 1) {
tokens.push(['STRING', quote + (str.substring(pi, i)) + quote]);
}
if (tokens.length > 1) {
_d = tokens.length - 1; _e = 1;
for (_c = 0, i = _d; (_d <= _e ? i <= _e : i >= _e); (_d <= _e ? i += 1 : i -= 1), _c++) {
_f = [tokens[i - 1], tokens[i]];
prev = _f[0];
tok = _f[1];
if (tok[0] === 'STRING' && prev[0] === 'STRING') {
_g = [prev[1].substring(1, prev[1].length - 1), tok[1].substring(1, tok[1].length - 1)];
prev = _g[0];
tok = _g[1];
tokens.splice(i - 1, 2, ['STRING', quote + prev + tok + quote]);
}
}
}
_h = []; _i = tokens;
for (i = 0, _j = _i.length; i < _j; i++) {
each = _i[i];
_h.push((function() {
if (each[0] === 'TOKENS') {
_k = each[1];
for (_l = 0, _m = _k.length; _l < _m; _l++) {
nested = _k[_l];
this.token(nested[0], nested[1]);
}
} else {
this.token(each[0], each[1]);
}
_c = []; _d = tokens;
for (i = 0, _e = _d.length; i < _e; i++) {
each = _d[i];
_c.push((function() {
each[0] === 'TOKENS' ? (this.tokens = this.tokens.concat(each[1])) : this.token(each[0], each[1]);
if (i < tokens.length - 1) {
return this.token('+', '+');
}
}).call(this));
}
return _h;
return _c;
}
};
// Helpers
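Reviewer note: the `balanced_string` routine in this file can be exercised outside the lexer. The following is a minimal standalone sketch in plain JavaScript, not the compiled output itself. `balancedString` and the local `starts` helper are stand-ins written for illustration (the real `starts` lives in the helpers section, which this diff does not show), the delimiter pairs are passed as a single array rather than as variadic arguments, and the error message omits the line number.

```javascript
// Assumed helper: does `string` contain `literal` starting at index `start`?
function starts(string, literal, start) {
  return string.substring(start, start + literal.length) === literal;
}

// Sketch of the balanced-delimiter scan. `delimited` is an array of
// [open, close] pairs. Returns the balanced prefix of `str`, or false
// if `str` does not begin with one of the opening delimiters.
function balancedString(str, delimited) {
  const levels = []; // stack of currently open delimiter pairs
  let i = 0;
  while (i < str.length) {
    for (const pair of delimited) {
      const [open, close] = pair;
      if (levels.length && starts(str, '\\', i)) {
        i += 1; // skip the escaped character
        break;
      } else if (levels.length && starts(str, close, i) &&
                 levels[levels.length - 1] === pair) {
        levels.pop();
        i += close.length - 1;
        if (!levels.length) i += 1; // step past the final closer
        break;
      } else if (starts(str, open, i)) {
        levels.push(pair);
        i += open.length - 1;
        break;
      }
    }
    if (!levels.length) break;
    i += 1;
  }
  if (levels.length) {
    throw new Error('SyntaxError: Unterminated ' + levels.pop()[0]);
  }
  if (i === 0) return false;
  return str.substring(0, i);
}

const pairs = [['"', '"'], ['${', '}']];
console.log(balancedString('"a ${"b"} c" tail', pairs)); // "a ${"b"} c"
console.log(balancedString('abc', pairs));               // false
```

Because a closer only counts when its pair is on top of the stack, the inner `"b"` inside `${ }` does not end the outer string.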

@ -110,40 +110,54 @@ BEFORE_WHEN: ['INDENT', 'OUTDENT', 'TERMINATOR']
# pushing some extra smarts into the Lexer.
exports.Lexer: class Lexer
# Scan by attempting to match tokens one at a time. Slow and steady.
# **tokenize** is the Lexer's main method. Scan by attempting to match tokens
# one at a time, using a regular expression anchored at the start of the
# remaining code, or a custom recursive token-matching method
# (for interpolations). When the next token has been recorded, we move forward
# within the code past the token, and begin again.
#
# Each tokenizing method is responsible for incrementing `@i` by the number of
# characters it has consumed. `@i` can be thought of as our finger on the page
# of source.
tokenize: (code, options) ->
o : options or {}
@code : code # The remainder of the source code.
@i : 0 # Current character position we're parsing.
@line : o.line or 0 # The current line.
@indent : 0 # The current indent level.
@indents : [] # The stack of all indent levels we are currently within.
@tokens : [] # Collection of all parsed tokens in the form ['TOKEN_TYPE', value, line]
@indent : 0 # The current indentation level.
@indents : [] # The stack of all current indentation levels.
@tokens : [] # Stream of parsed tokens in the form ['TYPE', value, line]
while @i < @code.length
@chunk: @code.slice(@i)
@extract_next_token()
@close_indentation()
return @tokens if o.rewrite is no
return @tokens if o.rewrite is off
(new Rewriter()).rewrite @tokens
# At every position, run through this list of attempted matches,
# short-circuiting if any of them succeed.
# short-circuiting if any of them succeed. Their order determines precedence:
# `@literal_token` is the fallback catch-all.
extract_next_token: ->
return if @identifier_token()
return if @number_token()
return if @heredoc_token()
return if @string_token()
return if @js_token()
return if @regex_token()
return if @comment_token()
return if @line_token()
return if @whitespace_token()
return if @js_token()
return if @string_token()
return @literal_token()
# Tokenizers
# ----------
# Matches identifying literals: variables, keywords, method names, etc.
# Check to ensure that JavaScript reserved words aren't being used as
# identifiers. Because CoffeeScript reserves a handful of keywords that are
# allowed in JavaScript, we're careful not to tag them as keywords when
# referenced as property names here, so you can still do `jQuery.is()` even
# though `is` means `===` otherwise.
identifier_token: ->
return false unless id: @match IDENTIFIER, 1
@name_access_type()
@ -163,17 +177,20 @@ exports.Lexer: class Lexer
@i += number.length
true
# Matches strings, including multi-line strings.
# Matches strings, including multi-line strings. Ensures that quotation marks
# are balanced within the string's contents, and within nested interpolations.
string_token: ->
return false unless starts(@chunk, '"') or starts(@chunk, "'")
string: @balanced_token ['"', '"'], ['${', '}']
string: @balanced_token ["'", "'"] if string is false
string: @balanced_token ["'", "'"] unless string
return false unless string
@interpolate_string string.replace STRING_NEWLINES, " \\\n"
@line += count string, "\n"
@i += string.length
true
# Matches heredocs, adjusting indentation to the correct level.
# Matches heredocs, adjusting indentation to the correct level, as heredocs
# preserve whitespace, but ignore indentation to the left.
heredoc_token: ->
return false unless match = @chunk.match(HEREDOC)
doc: @sanitize_heredoc match[2] or match[4]
@ -182,14 +199,17 @@ exports.Lexer: class Lexer
@i += match[1].length
true
# Matches interpolated JavaScript.
# Matches JavaScript interpolated directly into the source via backticks.
js_token: ->
return false unless starts @chunk, '`'
return false unless script: @balanced_token ['`', '`']
@token 'JS', script.replace(JS_CLEANER, '')
@i += script.length
true
# Matches regular expression literals.
# Matches regular expression literals. Regular expressions are difficult
# to distinguish from division while lexing, so we borrow some basic heuristics from
# JavaScript and Ruby.
regex_token: ->
return false unless regex: @match REGEX, 1
return false if include NOT_REGEX, @tag()
@ -197,38 +217,13 @@ exports.Lexer: class Lexer
@i += regex.length
true
# Matches a balanced group such as a single or double-quoted string. Pass in
# a series of delimiters, all of which must be balanced correctly within the
# string.
balanced_string: (str, delimited...) ->
levels: []
i: 0
while i < str.length
for pair in delimited
[open, close]: pair
if levels.length and starts str, '\\', i
i += 1
break
else if levels.length and starts(str, close, i) and levels[levels.length - 1] is pair
levels.pop()
i += close.length - 1
i += 1 unless levels.length
break
else if starts str, open, i
levels.push(pair)
i += open.length - 1
break
break unless levels.length
i += 1
throw new Error "SyntaxError: Unterminated ${levels.pop()[0]} starting on line ${@line + 1}" if levels.length
return false if i is 0
return str.substring(0, i)
# Matches a balanced string within the token's contents.
# Matches a token in which the passed delimiter pairs must be correctly
# balanced (i.e. strings, JS literals).
balanced_token: (delimited...) ->
@balanced_string @chunk, delimited...
# Matches and consumes comments.
# Matches and consumes comments. We pass through comments into JavaScript,
# so they're treated as real tokens, like any other part of the language.
comment_token: ->
return false unless comment: @match COMMENT, 1
@line += (comment.match(MULTILINER) or []).length
@ -239,6 +234,15 @@ exports.Lexer: class Lexer
true
# Matches newlines, indents, and outdents, and determines which is which.
# If we can detect that the current line is continued onto the next line,
# then the newline is suppressed:
#
# elements
# .each( ... )
# .map( ... )
#
# Keeps track of the level of indentation, because a single outdent token
# can close multiple indents, so we need to know how far in we happen to be.
line_token: ->
return false unless indent: @match MULTI_DENT, 1
@line += indent.match(MULTILINER).length
@ -249,10 +253,10 @@ exports.Lexer: class Lexer
no_newlines: next_character is '.' or (@value() and @value().match(NO_NEWLINE) and
prev and (prev[0] isnt '.') and not @value().match(CODE))
if size is @indent
return @suppress_newlines(indent) if no_newlines
return @suppress_newlines() if no_newlines
return @newline_token(indent)
else if size > @indent
return @suppress_newlines(indent) if no_newlines
return @suppress_newlines() if no_newlines
diff: size - @indent
@token 'INDENT', diff
@indents.push diff
@ -261,8 +265,8 @@ exports.Lexer: class Lexer
@indent: size
true
# Record an outdent token or tokens, if we happen to be moving back inwards
# past multiple recorded indents.
# Record an outdent token or multiple tokens, if we happen to be moving back
# inwards past several recorded indents.
outdent_token: (move_out, no_newlines) ->
while move_out > 0 and @indents.length
last_indent: @indents.pop()
@ -280,20 +284,22 @@ exports.Lexer: class Lexer
@i += space.length
true
# Generate a newline token. Multiple newlines get merged together.
# Generate a newline token. Consecutive newlines get merged together.
newline_token: (newlines) ->
@token 'TERMINATOR', "\n" unless @tag() is 'TERMINATOR'
true
# Use a `\` at a line-ending to suppress the newline.
# The slash is removed here once its job is done.
suppress_newlines: (newlines) ->
suppress_newlines: ->
@tokens.pop() if @value() is "\\"
true
# We treat all other single characters as a token. Eg.: `( ) , . !`
# Multi-character operators are also literal tokens, so that Jison can assign
# the proper order of operations.
# the proper order of operations. There are some symbols that we tag specially
# here. `;` and newlines are both treated as a `TERMINATOR`; we distinguish
# parentheses that indicate a method call from regular parentheses, and so on.
literal_token: ->
match: @chunk.match(OPERATOR)
value: match and match[1]
@ -334,18 +340,17 @@ exports.Lexer: class Lexer
else
@tag 1, 'PROPERTY_ACCESS'
# Sanitize a heredoc by escaping double quotes and erasing all external
# indentation on the left-hand side.
# Sanitize a heredoc by escaping internal double quotes and erasing all
# external indentation on the left-hand side.
sanitize_heredoc: (doc) ->
indent: (doc.match(HEREDOC_INDENT) or ['']).sort()[0]
doc.replace(new RegExp("^" +indent, 'gm'), '')
.replace(MULTILINER, "\\n")
.replace(/"/g, '\\"')
# A source of ambiguity in our grammar was parameter lists in function
# definitions (as opposed to argument lists in function calls). Tag
# parameter identifiers in order to avoid this. Also, parameter lists can
# make use of splats.
# A source of ambiguity in our grammar used to be parameter lists in function
# definitions versus argument lists in function calls. Walk backwards, tagging
# parameters specially in order to make things easier for the parser.
tag_parameters: ->
return if @tag() isnt ')'
i: 0
@ -363,64 +368,85 @@ exports.Lexer: class Lexer
close_indentation: ->
@outdent_token(@indent)
# Error for when you try to use a forbidden word in JavaScript as
# The error for when you try to use a forbidden word in JavaScript as
# an identifier.
identifier_error: (word) ->
throw new Error "SyntaxError: Reserved word \"$word\" on line ${@line + 1}"
# Error for when you try to assign to a reserved word in JavaScript,
# The error for when you try to assign to a reserved word in JavaScript,
# like "function" or "default".
assignment_error: ->
throw new Error "SyntaxError: Reserved word \"${@value()}\" on line ${@line + 1} can't be assigned"
# Matches a balanced group such as a single or double-quoted string. Pass in
# a series of delimiters, all of which must be nested correctly within the
# contents of the string. This method allows us to have strings within
# interpolations within strings etc...
balanced_string: (str, delimited...) ->
levels: []
i: 0
while i < str.length
for pair in delimited
[open, close]: pair
if levels.length and starts str, '\\', i
i += 1
break
else if levels.length and starts(str, close, i) and levels[levels.length - 1] is pair
levels.pop()
i += close.length - 1
i += 1 unless levels.length
break
else if starts str, open, i
levels.push(pair)
i += open.length - 1
break
break unless levels.length
i += 1
throw new Error "SyntaxError: Unterminated ${levels.pop()[0]} starting on line ${@line + 1}" if levels.length
return false if i is 0
return str.substring(0, i)
# Expand variables and expressions inside double-quoted strings using
# [ECMA Harmony's interpolation syntax](http://wiki.ecmascript.org/doku.php?id=strawman:string_interpolation).
# [ECMA Harmony's interpolation syntax](http://wiki.ecmascript.org/doku.php?id=strawman:string_interpolation)
# for substitution of bare variables as well as arbitrary expressions.
#
# "Hello $name."
# "Hello ${name.capitalize()}."
#
# If it encounters an interpolation, this method will recursively create a
# new Lexer, tokenize the interpolated contents, and merge them into the
# token stream.
interpolate_string: (str) ->
if str.length < 3 or not starts str, '"'
@token 'STRING', str
else
lexer: new Lexer()
tokens: []
quote: str.substring(0, 1)
i: 1
last_i: i
lexer: new Lexer()
tokens: []
quote: str.substring(0, 1)
[i, pi]: [1, 1]
while i < str.length - 1
if starts str, '\\', i
i += 1
else
match: str.substring(i).match INTERPOLATION
if match
[group, interp]: match
interp: "this.${ interp.substring(1) }" if starts interp, '@'
tokens.push ['STRING', "$quote${ str.substring(last_i, i) }$quote"] if last_i < i
tokens.push ['IDENTIFIER', interp]
i += group.length - 1
last_i: i + 1
else
expression: @balanced_string str.substring(i), ['${', '}']
if expression and expression.length > 3
inner: expression.substring(2, expression.length - 1)
nested: lexer.tokenize "($inner)", {rewrite: no, line: @line}
nested.pop()
tokens.push ['STRING', "$quote${ str.substring(last_i, i) }$quote"] if last_i < i
tokens.push ['TOKENS', nested]
i += expression.length - 1
last_i: i + 1
else if match: str.substring(i).match INTERPOLATION
[group, interp]: match
interp: "this.${ interp.substring(1) }" if starts interp, '@'
tokens.push ['STRING', "$quote${ str.substring(pi, i) }$quote"] if pi < i
tokens.push ['IDENTIFIER', interp]
i += group.length - 1
pi: i + 1
else if (expr: @balanced_string str.substring(i), ['${', '}']) and expr.length > 3
inner: expr.substring(2, expr.length - 1)
nested: lexer.tokenize "($inner)", {rewrite: no, line: @line}
nested.pop()
tokens.push ['STRING', "$quote${ str.substring(pi, i) }$quote"] if pi < i
tokens.push ['TOKENS', nested]
i += expr.length - 1
pi: i + 1
i += 1
tokens.push ['STRING', "$quote${ str.substring(last_i, i) }$quote"] if last_i < i and last_i < str.length - 1
if tokens.length > 1
for i in [tokens.length - 1..1]
[prev, tok]: [tokens[i - 1], tokens[i]]
if tok[0] is 'STRING' and prev[0] is 'STRING'
[prev, tok]: [prev[1].substring(1, prev[1].length - 1), tok[1].substring(1, tok[1].length - 1)]
tokens.splice i - 1, 2, ['STRING', "$quote$prev$tok$quote"]
tokens.push ['STRING', "$quote${ str.substring(pi, i) }$quote"] if pi < i and pi < str.length - 1
for each, i in tokens
if each[0] is 'TOKENS'
@token nested[0], nested[1] for nested in each[1]
@tokens: @tokens.concat each[1]
else
@token each[0], each[1]
@token '+', '+' if i < tokens.length - 1
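
Reviewer note: the bare-variable path of `interpolate_string` can be illustrated in isolation. This is a simplified sketch in plain JavaScript under stated assumptions: `interpolateSimple` is a hypothetical name, it handles only `$name` and `$@name` (the `${ ... }` branch, which needs a nested lexer, is left out), and it returns the token pairs as an array instead of calling `@token`. The `INTERPOLATION` regex is the one this commit introduces.

```javascript
// Regex for a bare interpolated variable, as introduced by this commit.
const INTERPOLATION = /^\$([a-zA-Z_@]\w*)/;

// Hypothetical helper: split a double-quoted string into STRING and
// IDENTIFIER pieces joined by '+' tokens. `i` is the scan position and
// `pi` marks where the pending literal segment begins.
function interpolateSimple(str) {
  const quote = str.charAt(0);
  const parts = [];
  let i = 1;
  let pi = 1;
  while (i < str.length - 1) {
    if (str.charAt(i) === '\\') {
      i += 1; // skip the escaped character
    } else {
      const match = str.substring(i).match(INTERPOLATION);
      if (match) {
        const group = match[0];
        let interp = match[1];
        if (interp.charAt(0) === '@') interp = 'this.' + interp.substring(1);
        // Flush the literal text seen since the previous interpolation.
        if (pi < i) parts.push(['STRING', quote + str.substring(pi, i) + quote]);
        parts.push(['IDENTIFIER', interp]);
        i += group.length - 1;
        pi = i + 1;
      }
    }
    i += 1;
  }
  // Flush any trailing literal text before the closing quote.
  if (pi < i && pi < str.length - 1) {
    parts.push(['STRING', quote + str.substring(pi, i) + quote]);
  }
  // Join the pieces with '+' so the parser sees a concatenation.
  const tokens = [];
  parts.forEach(function (part, index) {
    tokens.push(part);
    if (index < parts.length - 1) tokens.push(['+', '+']);
  });
  return tokens;
}

console.log(interpolateSimple('"Hello $name."'));
// [['STRING', '"Hello "'], ['+', '+'], ['IDENTIFIER', 'name'],
//  ['+', '+'], ['STRING', '"."']]
```

In effect `"Hello $name."` lexes as if the author had written `"Hello " + name + "."`, which is why no separate pass to merge adjacent strings is needed in this path.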