matches against the beginning of the source code. When a match is found,
a token is produced, we consume the match, and start again. Tokens are in the
form:</p>
<pre><code>[tag, value, line_number]
</code></pre>
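<p>As a minimal sketch of that match-and-consume loop (not the Lexer's actual
code), the example below tries a short list of hypothetical rules against the
beginning of the remaining source. The names <code>RULES</code> and <code>tokenize</code>, and
the patterns themselves, are assumptions made for illustration. It is written
in current CoffeeScript syntax, which assigns with <code>=</code> rather than the <code>:</code>
used in the annotated source.</p>
<pre><code># Hypothetical rules, tried in order; these are not the real Lexer's patterns.
RULES = [
  ['NUMBER',     /^\d+/]
  ['IDENTIFIER', /^[a-zA-Z_$]\w*/]
  ['NEWLINE',    /^\n/]
  ['SPACE',      /^[ \t]+/]
]

tokenize = (code) ->
  tokens = []
  line = 0
  while code.length
    matched = false
    for [tag, regex] in RULES
      match = regex.exec code
      continue unless match
      value = match[0]
      # Whitespace is consumed but produces no token in this sketch.
      tokens.push [tag, value, line] unless tag is 'SPACE'
      line += 1 if tag is 'NEWLINE'
      # Consume the match and start again on the remainder.
      code = code.slice value.length
      matched = true
      break
    unless matched
      # Fallback: any other single character becomes its own token.
      tokens.push [code.charAt(0), code.charAt(0), line]
      code = code.slice 1
  tokens
</code></pre>
<p>Under these assumptions, <code>tokenize "x 42\ny"</code> yields an <code>IDENTIFIER</code> and
a <code>NUMBER</code> on line 0, a <code>NEWLINE</code>, and an <code>IDENTIFIER</code> on line 1, each in
the three-element form shown above.</p>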
<p>This token format can be fed directly into <ahref="http://github.com/zaach/jison">Jison</a>.</p></td><tdclass="code"><divclass="highlight"><pre></pre></div></td></tr><trid="section-2"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-2">#</a></div><p>Set up the Lexer for both Node.js and the browser, depending on where we are.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="k">if</span><spanclass="nx">process</span><spanclass="o">?</span>
<spanclass="nv">helpers: </span><spanclass="k">this</span><spanclass="p">.</span><spanclass="nx">helpers</span></pre></div></td></tr><trid="section-3"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-3">#</a></div><p>Import the helpers we need.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">include: </span><spanclass="nx">helpers</span><spanclass="p">.</span><spanclass="nx">include</span>
<spanclass="nv">balanced_string: </span><spanclass="nx">helpers</span><spanclass="p">.</span><spanclass="nx">balanced_string</span></pre></div></td></tr><trid="section-4"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-4">#</a></div><h2>The Lexer Class</h2></td><tdclass="code"><divclass="highlight"><pre></pre></div></td></tr><trid="section-5"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-5">#</a></div><p>The Lexer class reads a stream of CoffeeScript and divvies it up into tagged
pushing some extra smarts into the Lexer.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">exports.Lexer: </span><spanclass="nx">class</span><spanclass="nx">Lexer</span></pre></div></td></tr><trid="section-6"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-6">#</a></div><p><strong>tokenize</strong> is the Lexer's main method. Scan by attempting to match tokens
<spanclass="err">@</span><spanclass="nx">code</span><spanclass="o">:</span><spanclass="nx">code</span><spanclass="c1"># The remainder of the source code.</span>
<spanclass="err">@</span><spanclass="nx">i</span><spanclass="o">:</span><spanclass="mi">0</span><spanclass="c1"># Current character position we're parsing.</span>
<spanclass="err">@</span><spanclass="nx">line</span><spanclass="o">:</span><spanclass="nx">o</span><spanclass="p">.</span><spanclass="nx">line</span><spanclass="o">or</span><spanclass="mi">0</span><spanclass="c1"># The current line.</span>
<spanclass="err">@</span><spanclass="nx">indent</span><spanclass="o">:</span><spanclass="mi">0</span><spanclass="c1"># The current indentation level.</span>
<spanclass="err">@</span><spanclass="nx">indents</span><spanclass="o">:</span><spanclass="p">[]</span><spanclass="c1"># The stack of all current indentation levels.</span>
<spanclass="err">@</span><spanclass="nx">tokens</span><spanclass="o">:</span><spanclass="p">[]</span><spanclass="c1"># Stream of parsed tokens in the form ['TYPE', value, line]</span>
<spanclass="p">(</span><spanclass="k">new</span><spanclass="nx">Rewriter</span><spanclass="p">()).</span><spanclass="nx">rewrite</span><spanclass="err">@</span><spanclass="nx">tokens</span></pre></div></td></tr><trid="section-7"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-7">#</a></div><p>At every position, run through this list of attempted matches,
short-circuiting if any of them succeed. Their order determines precedence:
<code>@literal_token</code> is the fallback catch-all.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">extract_next_token: </span><spanclass="o">-></span>
<spanclass="k">return</span><spanclass="err">@</span><spanclass="nx">literal_token</span><spanclass="p">()</span></pre></div></td></tr><trid="section-8"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-8">#</a></div><h2>Tokenizers</h2></td><tdclass="code"><divclass="highlight"><pre></pre></div></td></tr><trid="section-9"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-9">#</a></div><p>Language extensions get the highest priority, first chance to tag tokens
Check to ensure that JavaScript reserved words aren't being used as
identifiers. Because CoffeeScript reserves a handful of keywords that are
allowed in JavaScript, we're careful not to tag them as keywords when
referenced as property names here, so you can still do <code>jQuery.is()</code> even
though <code>is</code> means <code>===</code> otherwise.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">identifier_token: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-11"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-11">#</a></div><p>Matches numbers, including decimals, hex, and exponential notation.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">number_token: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-12"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-12">#</a></div><p>Matches strings, including multi-line strings. Ensures that quotation marks
are balanced within the string's contents, and within nested interpolations.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">string_token: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-13"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-13">#</a></div><p>Matches heredocs, adjusting indentation to the correct level, as heredocs
preserve whitespace, but ignore indentation to the left.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">heredoc_token: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-14"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-14">#</a></div><p>Matches JavaScript interpolated directly into the source via backticks.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">js_token: </span><spanclass="o">-></span>
JavaScript and Ruby, borrow slash balancing from <code>@balanced_token</code>, and
borrow interpolation from <code>@interpolate_string</code>.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">regex_token: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-16"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-16">#</a></div><p>Matches a token in which the passed delimiter pairs must be correctly
<spanclass="nx">balanced_string</span><spanclass="err">@</span><spanclass="nx">chunk</span><spanclass="p">,</span><spanclass="nx">delimited</span></pre></div></td></tr><trid="section-17"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-17">#</a></div><p>Matches and consumes comments. We pass through comments into JavaScript,
so they're treated as real tokens, like any other part of the language.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">comment_token: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-18"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-18">#</a></div><p>Matches newlines, indents, and outdents, and determines which is which.
If we can detect that the current line is continued onto the next line,
then the newline is suppressed:</p>
<pre><code>elements
.each( ... )
.map( ... )
</code></pre>
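<p>One way such a continuation could be detected, sketched here under the
assumption that a following line starting with a dot signals a chained call
(the helper name <code>continues_chain</code> is hypothetical and this is not
necessarily how the Lexer itself decides):</p>
<pre><code># Hypothetical helper: true when the text that follows (starting at the
# newline) begins, after any whitespace, with a dot, as in a chained call.
continues_chain = (rest_of_source) ->
  /^\s*\./.test rest_of_source
</code></pre>
<p>With this sketch, <code>continues_chain "\n  .each( ... )"</code> is true, so the
newline would be suppressed, while <code>continues_chain "\nnext_statement"</code> is
false.</p>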
<p>Keeps track of the level of indentation, because a single outdent token
can close multiple indents, so we need to know how far in we happen to be.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">line_token: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-19"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-19">#</a></div><p>Record an outdent token or multiple tokens, if we happen to be moving back
inwards past several recorded indents.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">outdent_token: </span><spanclass="p">(</span><spanclass="nx">move_out</span><spanclass="p">,</span><spanclass="nx">no_newlines</span><spanclass="p">)</span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-20"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-20">#</a></div><p>Matches and consumes non-meaningful whitespace. Tag the previous token
as being "spaced", because there are some cases where it makes a difference.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">whitespace_token: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-21"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-21">#</a></div><p>Generate a newline token. Consecutive newlines get merged together.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">newline_token: </span><spanclass="p">(</span><spanclass="nx">newlines</span><spanclass="p">)</span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-22"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-22">#</a></div><p>Use a <code>\</code> at a line-ending to suppress the newline.
The slash is removed here once its job is done.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">suppress_newlines: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-23"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-23">#</a></div><p>We treat all other single characters as a token, e.g. <code>( ) , . !</code>
the proper order of operations. There are some symbols that we tag specially
here. <code>;</code> and newlines are both treated as a <code>TERMINATOR</code>; we distinguish
parentheses that indicate a method call from regular parentheses, and so on.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">literal_token: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-24"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-24">#</a></div><h2>Token Manipulators</h2></td><tdclass="code"><divclass="highlight"><pre></pre></div></td></tr><trid="section-25"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-25">#</a></div><p>As we consume a new <code>IDENTIFIER</code>, look at the previous token to determine
if it's a special kind of accessor.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">name_access_type: </span><spanclass="o">-></span>
<spanclass="err">@</span><spanclass="nx">tag</span><spanclass="mi">1</span><spanclass="p">,</span><spanclass="s1">'PROPERTY_ACCESS'</span></pre></div></td></tr><trid="section-26"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-26">#</a></div><p>Sanitize a heredoc by escaping internal double quotes and erasing all
external indentation on the left-hand side.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">sanitize_heredoc: </span><spanclass="p">(</span><spanclass="nx">doc</span><spanclass="p">)</span><spanclass="o">-></span>
<spanclass="p">.</span><spanclass="nx">replace</span><spanclass="p">(</span><spanclass="sr">/"/g</span><spanclass="p">,</span><spanclass="s1">'\\"'</span><spanclass="p">)</span></pre></div></td></tr><trid="section-27"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-27">#</a></div><p>A source of ambiguity in our grammar used to be parameter lists in function
definitions versus argument lists in function calls. Walk backwards, tagging
parameters specially in order to make things easier for the parser.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">tag_parameters: </span><spanclass="o">-></span>
<spanclass="kc">true</span></pre></div></td></tr><trid="section-28"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-28">#</a></div><p>Close up all remaining open blocks at the end of the file.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">close_indentation: </span><spanclass="o">-></span>
<spanclass="err">@</span><spanclass="nx">outdent_token</span><spanclass="p">(</span><spanclass="err">@</span><spanclass="nx">indent</span><spanclass="p">)</span></pre></div></td></tr><trid="section-29"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-29">#</a></div><p>The error for when you try to use a forbidden word in JavaScript as
an identifier.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">identifier_error: </span><spanclass="p">(</span><spanclass="nx">word</span><spanclass="p">)</span><spanclass="o">-></span>
<spanclass="k">throw</span><spanclass="k">new</span><spanclass="nb">Error</span><spanclass="s2">"SyntaxError: Reserved word \"$word\" on line ${@line + 1}"</span></pre></div></td></tr><trid="section-30"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-30">#</a></div><p>The error for when you try to assign to a reserved word in JavaScript,
<spanclass="k">throw</span><spanclass="k">new</span><spanclass="nb">Error</span><spanclass="s2">"SyntaxError: Reserved word \"${@value()}\" on line ${@line + 1} can't be assigned"</span></pre></div></td></tr><trid="section-31"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-31">#</a></div><p>Expand variables and expressions inside double-quoted strings using
<spanclass="nx">tokens</span></pre></div></td></tr><trid="section-32"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-32">#</a></div><h2>Helpers</h2></td><tdclass="code"><divclass="highlight"><pre></pre></div></td></tr><trid="section-33"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-33">#</a></div><p>Add a token to the results, taking note of the line number.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">token: </span><spanclass="p">(</span><spanclass="nx">tag</span><spanclass="p">,</span><spanclass="nx">value</span><spanclass="p">)</span><spanclass="o">-></span>
<spanclass="err">@</span><spanclass="nx">tokens</span><spanclass="p">.</span><spanclass="nx">push</span><spanclass="p">([</span><spanclass="nx">tag</span><spanclass="p">,</span><spanclass="nx">value</span><spanclass="p">,</span><spanclass="err">@</span><spanclass="nx">line</span><spanclass="p">])</span></pre></div></td></tr><trid="section-34"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-34">#</a></div><p>Peek at a tag in the current token stream.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">tag: </span><spanclass="p">(</span><spanclass="nx">index</span><spanclass="p">,</span><spanclass="nx">tag</span><spanclass="p">)</span><spanclass="o">-></span>
<spanclass="nx">tok</span><spanclass="p">[</span><spanclass="mi">0</span><spanclass="p">]</span></pre></div></td></tr><trid="section-35"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-35">#</a></div><p>Peek at a value in the current token stream.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">value: </span><spanclass="p">(</span><spanclass="nx">index</span><spanclass="p">,</span><spanclass="nx">val</span><spanclass="p">)</span><spanclass="o">-></span>
<spanclass="nx">tok</span><spanclass="p">[</span><spanclass="mi">1</span><spanclass="p">]</span></pre></div></td></tr><trid="section-36"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-36">#</a></div><p>Peek at a previous token, entire.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">prev: </span><spanclass="p">(</span><spanclass="nx">index</span><spanclass="p">)</span><spanclass="o">-></span>
<spanclass="err">@</span><spanclass="nx">tokens</span><spanclass="p">[</span><spanclass="err">@</span><spanclass="nx">tokens</span><spanclass="p">.</span><spanclass="nx">length</span><spanclass="o">-</span><spanclass="p">(</span><spanclass="nx">index</span><spanclass="o">or</span><spanclass="mi">1</span><spanclass="p">)]</span></pre></div></td></tr><trid="section-37"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-37">#</a></div><p>Attempt to match a string against the current chunk, returning the indexed
match if successful, and <code>false</code> otherwise.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">match: </span><spanclass="p">(</span><spanclass="nx">regex</span><spanclass="p">,</span><spanclass="nx">index</span><spanclass="p">)</span><spanclass="o">-></span>
<spanclass="k">if</span><spanclass="nx">m</span><spanclass="k">then</span><spanclass="nx">m</span><spanclass="p">[</span><spanclass="nx">index</span><spanclass="p">]</span><spanclass="k">else</span><spanclass="kc">false</span></pre></div></td></tr><trid="section-38"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-38">#</a></div><p>There are no extensions to the core lexer by default.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">Lexer.extensions: </span><spanclass="p">[]</span></pre></div></td></tr><trid="section-39"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-39">#</a></div><h2>Constants</h2></td><tdclass="code"><divclass="highlight"><pre></pre></div></td></tr><trid="section-40"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-40">#</a></div><p>Keywords that CoffeeScript shares in common with JavaScript.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">JS_KEYWORDS: </span><spanclass="p">[</span>
<spanclass="p">]</span></pre></div></td></tr><trid="section-41"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-41">#</a></div><p>CoffeeScript-only keywords, which we're more relaxed about allowing. They can't
be used standalone, but you can reference them as an attached property.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">COFFEE_KEYWORDS: </span><spanclass="p">[</span>
<spanclass="p">]</span></pre></div></td></tr><trid="section-42"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-42">#</a></div><p>The combined list of keywords is the superset that gets passed verbatim to
the parser.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">KEYWORDS: </span><spanclass="nx">JS_KEYWORDS</span><spanclass="p">.</span><spanclass="nx">concat</span><spanclass="nx">COFFEE_KEYWORDS</span></pre></div></td></tr><trid="section-43"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-43">#</a></div><p>The list of keywords that are reserved by JavaScript, but not used, or are
used by CoffeeScript internally. We throw an error when these are encountered,
to avoid having a JavaScript error at runtime.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">RESERVED: </span><spanclass="p">[</span>
<spanclass="p">]</span></pre></div></td></tr><trid="section-44"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-44">#</a></div><p>The superset of both JavaScript keywords and reserved words, none of which may
be used as identifiers or properties.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">JS_FORBIDDEN: </span><spanclass="nx">JS_KEYWORDS</span><spanclass="p">.</span><spanclass="nx">concat</span><spanclass="nx">RESERVED</span></pre></div></td></tr><trid="section-45"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-45">#</a></div><p>Token matching regexes.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nx">IDENTIFIER</span><spanclass="o">:</span><spanclass="sr">/^([a-zA-Z\$_](\w|\$)*)/</span>
<spanclass="nx">HEREDOC_INDENT</span><spanclass="o">:</span><spanclass="sr">/^[ \t]+/mg</span></pre></div></td></tr><trid="section-48"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-48">#</a></div><p>Tokens which a regular expression will never immediately follow, but which
<p>Our list is shorter, due to sans-parentheses method calls.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">NOT_REGEX: </span><spanclass="p">[</span>
<spanclass="p">]</span></pre></div></td></tr><trid="section-49"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-49">#</a></div><p>Tokens which could legitimately be invoked or indexed. An opening
of a function invocation or indexing operation.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">CALLABLE: </span><spanclass="p">[</span><spanclass="s1">'IDENTIFIER'</span><spanclass="p">,</span><spanclass="s1">'SUPER'</span><spanclass="p">,</span><spanclass="s1">')'</span><spanclass="p">,</span><spanclass="s1">']'</span><spanclass="p">,</span><spanclass="s1">'}'</span><spanclass="p">,</span><spanclass="s1">'STRING'</span><spanclass="p">,</span><spanclass="s1">'@'</span><spanclass="p">]</span></pre></div></td></tr><trid="section-50"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-50">#</a></div><p>Tokens that indicate an access -- keywords immediately following will be
treated as identifiers.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">ACCESSORS: </span><spanclass="p">[</span><spanclass="s1">'PROPERTY_ACCESS'</span><spanclass="p">,</span><spanclass="s1">'PROTOTYPE_ACCESS'</span><spanclass="p">,</span><spanclass="s1">'SOAK_ACCESS'</span><spanclass="p">,</span><spanclass="s1">'@'</span><spanclass="p">]</span></pre></div></td></tr><trid="section-51"><tdclass="docs"><divclass="octowrap"><aclass="octothorpe"href="#section-51">#</a></div><p>Tokens that, when immediately preceding a <code>WHEN</code>, indicate that the <code>WHEN</code>
avoid an ambiguity in the grammar.</p></td><tdclass="code"><divclass="highlight"><pre><spanclass="nv">BEFORE_WHEN: </span><spanclass="p">[</span><spanclass="s1">'INDENT'</span><spanclass="p">,</span><spanclass="s1">'OUTDENT'</span><spanclass="p">,</span><spanclass="s1">'TERMINATOR'</span><spanclass="p">]</span>