allow usage of non-ascii bytestring literals in templates #11
Comments
Michael Bayer (@zzzeek) wrote: I've added backslash replacing for non-ASCII characters to the code sent for AST parsing from expressions, Python code blocks, and control lines in [changeset:189]. Check the unit tests added in that changeset to get the idea. Note that using non-ASCII characters anywhere in a template requires that the template's encoding be specified at the top via a "magic encoding comment".
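A minimal sketch of the backslash-replacing idea, written for Python 3 with only the standard library. It is not the code from [changeset:189], which is not reproduced in this thread; the helper name `escape_non_ascii` is hypothetical. It shows an expression being made ASCII-only before it is handed to the AST parser, with the string literal still evaluating to the original character.

```python
import ast


def escape_non_ascii(source: str) -> str:
    """Hypothetical helper: replace each non-ASCII character with a backslash escape."""
    return source.encode("ascii", "backslashreplace").decode("ascii")


# The expression from the issue; the literal contains the character U+0142.
expr = "f('\u0142')"

# After escaping, the source is pure ASCII: f('\u0142') spelled with a real backslash.
escaped = escape_non_ascii(expr)

# The parser now only ever sees ASCII text.
tree = ast.parse(escaped, mode="eval")

# Evaluate the first call argument to see which string the literal denotes.
value = ast.literal_eval(tree.body.args[0])
```

This only makes sense for non-ASCII characters that sit inside string literals. Note also that in Python 2, which this thread is about, a plain `'...'` bytestring literal does not interpret `\u` escapes, so the same escaped source denotes a different value there unless the literal is written as `u'...'`.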
Anonymous wrote: I'm afraid it's still wrong. Test case: import mako.template returns u'\u0142', should return u'\u0142' (tested on svn rev 190).
Michael Bayer (@zzzeek) wrote: I'm sorry, I don't understand at this point. Test case:
Generated code (if you believe this is incorrect, tell me what it should say. Note that all expressions are expected to be str()-able or unicode expressions, since they get passed to unicode() unconditionally; use context.write() to bypass this):
Program output (the assertion case passes):
Also observe the unit tests added in the changeset, which embed literal multibyte expressions that come out identical to the original.
Michael Bayer (@zzzeek) wrote: Also, try out the attached patch. It breaks all the current unit tests, but I think it's what you are looking for: it basically passes the string straight through and adds the "coding" comment to the top of the generated file. It seems I would essentially have to throw out the whole way Mako does unicode and rewrite it to take this approach.
Michael Bayer (@zzzeek) wrote: OK, it was using cStringIO. This one passes most tests. Again, the basic idea is just to spit out the generated module in the same encoding as what was given. I'm not sure it's working all the way, though. I know what you're looking for: the total "straight through" approach without using u"" at all. I'm not sure I can get this working totally.
Michael Bayer (@zzzeek) wrote: Also, I'm being told that Genshi requires non-ASCII strings to be sent as u'' as well, so I'm not sure this issue is limited to Mako.
Anonymous wrote: I guess I introduced confusion with '\u0142' which should actually be u'\u0142' - a subtle but important difference :)
Now, this assertion should hold, but doesn't:
assert f(u'\u0142') == te.render_unicode(f=f)
where
te = Template(u"#-*- encoding:utf-8\n${f('\u0142')}".encode('utf-8'))
I'm currently reviewing your code and the attached patch and looking for a way to implement what I want. Will keep you updated.
Michael Bayer (@zzzeek) wrote: ultimately, to make everyone no longer notice that you have to say
Migrated issue, originally created by Anonymous
The Mako template parser has a problem, or a weirdness, depending on your view. Basically, it is not possible to compile any template that contains non-ASCII characters inside the ${} code. The problem traces back to the inability of Python's built-in compiler to compile unicode source containing non-ASCII characters. Fixing it would need some kind of encoding juggling inside ast.py (the 'parse' function?) as well as adding a #-*- prefix to the code being compiled there. Alas, I haven't been able to fix this myself (mysterious body snatcher exceptions pop out), nor do I have enough time to work on it, but I'm sure you get the idea.
To replicate the problem, just compile "${f('\u0142')}" as a Mako template.
I should add that the problem is serious, at least for us, and a showstopper for Mako adoption in our project.
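For illustration only, here is how the "#-*-" encoding prefix behaves with Python's built-in compile(). This sketch targets Python 3, not the Python 2 of this report, and does not touch Mako's ast.py. The choice of ISO-8859-2 and the filename "<template>" are assumptions made for the example.

```python
# Source code as raw bytes in ISO-8859-2, starting with an encoding declaration.
# The string literal holds the single character U+0142.
source_bytes = "# -*- coding: iso-8859-2 -*-\nvalue = '\u0142'\n".encode("iso-8859-2")

# compile() accepts bytes and uses the declaration to decode them.
code = compile(source_bytes, "<template>", "exec")

# Run the compiled module body and pick up the assigned name.
namespace = {}
exec(code, namespace)
value = namespace["value"]
```

Without the declaration, these bytes would be read using the default source encoding, so the declaration is what tells the compiler how to interpret the non-ASCII byte in the literal.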
Attachments: alternate_unicode.patch