Closes gh-3
This is a complicated issue, but I'll do my best to explain it here.
By default, Haml encodes its templates as Encoding.default_internal,
which is usually UTF-8. This means that strings printed to the
template should be either UTF-8 or UTF-8-compatible ASCII. So far, all
well and good.
Now, it's possible to have strings that are marked as ASCII-8BIT, but
which contain non-ASCII bytes and so aren't UTF-8 compatible. This
includes valid UTF-8 strings that are forced into the ASCII-8BIT
encoding. I call this sort of string "fake ASCII." If one of these
strings is concatenated to a UTF-8 string that also contains non-ASCII
characters, Ruby says "I don't know what to do with these non-ASCII
characters!" and raises an Encoding::CompatibilityError.
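
As a rough illustration of the failure described above, the following
sketch builds a fake ASCII string by hand and concatenates it to a
UTF-8 string. The variable names are made up for this example and do
not come from Haml or Rails.

```ruby
# A genuine UTF-8 string containing a non-ASCII character ("café: ").
utf8 = "caf\u00E9: "

# "Fake ASCII": the UTF-8 bytes of "résumé", relabeled as ASCII-8BIT.
# The bytes are untouched; only the encoding label changes.
fake_ascii = "r\u00E9sum\u00E9".dup.force_encoding("ASCII-8BIT")

error = nil
begin
  utf8 + fake_ascii
rescue Encoding::CompatibilityError => e
  # Ruby refuses to guess how the two sets of non-ASCII bytes relate.
  error = e
end

# If the UTF-8 side is pure ASCII, the same concatenation does not
# raise, which is why the problem can stay hidden until real data
# shows up.
harmless = "plain: " + fake_ascii
```

Note that the error depends on both strings containing non-ASCII
bytes, so a template with only ASCII text would not trigger it.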
This is what was happening in the referenced GitHub issue (or at least
in the sample app Adam Salter created at
http://github.com/adamsalter/test-project/tree/haml_utf8). The
template was UTF-8 encoded, and it choked when it was passed a fake
ASCII string: one marked as ASCII-8BIT but containing UTF-8 byte
sequences.
The issue now becomes: where is this fake ASCII string coming from?
From the database. The database drivers used by Rails aren't Ruby 1.9
compatible. Despite storing UTF-8 strings in the database, the drivers
return fake ASCII strings.
The best solution to this is clearly to fix the database drivers, but
that will probably take some time. One stop-gap would be to call
`force_encoding("utf-8")` on all the database values somewhere, which
is still a little annoying. Finally, the solution provided in this
commit is to set `:encoding => "ascii-8bit"` for Haml. This makes the
Haml template itself fake ASCII, which is wrong but will help prevent
encoding errors.
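
The two workarounds above can be sketched in plain Ruby, without Haml
or a database. This is only an approximation of what happens inside a
template: `from_db` is a hypothetical stand-in for a value returned by
a database driver, and the "template" is just a string.

```ruby
# Hypothetical driver output: UTF-8 bytes for "café" labeled ASCII-8BIT.
from_db = "caf\u00E9".b

# Stop-gap 1: relabel the value as UTF-8 before it reaches the template.
# force_encoding changes only the label, never the bytes.
relabeled = from_db.dup.force_encoding("UTF-8")
page = "Men\u00FC: " + relabeled

# Stop-gap 2 (the idea behind this commit): make the template side
# binary as well. Two ASCII-8BIT strings concatenate without complaint,
# although the result is itself fake ASCII.
binary_template = "Men\u00FC: ".b
binary_page = binary_template + from_db
```

The second approach avoids the error but leaves the output labeled as
ASCII-8BIT, which is why it is described above as wrong but helpful.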
It's not actually the case that they preserve nested content;
this would be very difficult to do in a general way.
They actually only generate content given with =.
Closes gh-46