Validation optional #153

benkiel · 2018-07-03T14:10:09Z

Add the option to turn on/off validation. Switch to lxml.


anthrotype · 2018-07-03T15:57:48Z

can you remind me again why in the end we decided to have lxml as a hard dependency, as opposed to using the ElementTree API (supported by both lxml and the built-in ElementTree library), trying lxml when it's available and falling back to the built-in library when it's not?

I have mixed feelings because, on the one hand, I'm happy when we add good third-party dependencies (usually better than our artisanal "reinvented" wheel); on the other hand, this shouldn't be done too lightly, especially when it is a compiled extension, and one which, unlike the others (compreffor, pyclipper, etc.), we don't control directly.
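
For reference, the "use lxml when available, else fall back" pattern usually looks something like the sketch below. It assumes the code sticks to the API shared by both libraries; `HAVE_LXML` and `make_glyph_element` are illustrative names, not anything that exists in ufoLib.

```python
# Prefer lxml when it is installed; otherwise use the standard library.
try:
    from lxml import etree
    HAVE_LXML = True
except ImportError:
    import xml.etree.ElementTree as etree
    HAVE_LXML = False


def make_glyph_element(name):
    """Build a minimal <glyph> element using only the shared API."""
    glyph = etree.Element("glyph", name=name, format="2")
    etree.SubElement(glyph, "unicode", hex="0062")
    return glyph


glyph = make_glyph_element("b")
print(HAVE_LXML, glyph.tag, glyph.get("name"), len(glyph))
```

The pattern only holds up as long as nothing lxml-specific is passed to the serializer.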

anthrotype · 2018-07-03T16:05:55Z

https://travis-ci.org/robofab-developers/fontParts/jobs/399648345#L492

benkiel · 2018-07-03T16:07:44Z

@anthrotype We could do that, not opposed. We wanted to make lxml the thing for speed, but understand wanting a graceful fallback.

anthrotype · 2018-07-03T16:11:30Z

but maybe you or @typesupply decided to require lxml because you're depending on some lxml-specific features that are not available with the built-in elementtree library?
we need to test that this is not the case

anthrotype · 2018-07-03T16:12:33Z

or if it is, then fine, we can add lxml as required dependency.

the problem is, we need to decide now, otherwise everybody who does pip install fontmake or fontParts or anything will get ImportError

typesupply · 2018-07-03T16:13:27Z

It's been a really long time since I wrote this... but, I didn't think I was relying on any LXML specific features. Falling back to ET when LXML isn't available is fine with me.

anthrotype · 2018-07-03T17:18:32Z

we are using the pretty_print=True option in the etree.tostring function, which only works for lxml, not for the elementtree library, whose writer can't automatically indent.
I guess we need to use something like this as fallback:
https://github.com/fonttools/fonttools/blob/8d7774a3e81df1afc6dedb1272e43c7c3aea100e/Lib/fontTools/designspaceLib/__init__.py#L91-L105
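
A fallback of that kind is typically a small recursive helper that rewrites `text` and `tail` whitespace in place. The version below is a minimal sketch of the general recipe, not a copy of the linked fontTools code; the name `indent` and the two-space `step` are assumptions.

```python
import xml.etree.ElementTree as ET


def indent(elem, level=0, step="  "):
    """Add whitespace to text/tail in place so the tree serializes indented."""
    pad = "\n" + level * step
    if len(elem):
        # Whitespace before the first child.
        if not elem.text or not elem.text.strip():
            elem.text = pad + step
        for child in elem:
            indent(child, level + 1, step)
        # The last child's tail sits before the parent's closing tag,
        # so it is dedented by one level.
        if not child.tail or not child.tail.strip():
            child.tail = pad
    # Every non-root element gets a newline plus its own indentation after it.
    if level and (not elem.tail or not elem.tail.strip()):
        elem.tail = pad


root = ET.Element("glyph", name="a")
ET.SubElement(root, "unicode", hex="0061")
outline = ET.SubElement(root, "outline")
ET.SubElement(outline, "contour")
indent(root)
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```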

typesupply · 2018-07-03T17:24:56Z

Ah, yeah. I forgot about that. It's so weird that ET doesn't have this.

anthrotype · 2018-07-03T17:31:52Z

and also, the encoding="unicode" argument to etree.tostring is a lxml thing.
Actually it's a python3 thing
https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.tostring

anthrotype · 2018-07-03T17:32:52Z

i'm going to encode as a UTF-8 bytes string and decode it as Unicode string before returning it from writeGlyphToString
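
Roughly, the encode-then-decode round trip would look like this. It is a sketch using the standard library only; `element_to_text` is a made-up helper name, and the real change lives in writeGlyphToString.

```python
import xml.etree.ElementTree as ET


def element_to_text(element):
    """Serialize to UTF-8 bytes, then decode, so the result is always text."""
    data = ET.tostring(element, encoding="utf-8")
    return data.decode("utf-8")


glyph = ET.Element("glyph", name="a")
ET.SubElement(glyph, "note").text = "café"
text = element_to_text(glyph)
print(text)
```

Asking for bytes explicitly avoids depending on `encoding="unicode"`, which not every ElementTree implementation accepts.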

anthrotype · 2018-07-03T17:46:46Z

two more problems

ElementTree's tostring always seems to sort attributes alphabetically, even if we use OrderedDict; lxml uses the order we give it. (in the diff below, the minus lines are lxml's, plus are ElementTree's)

 - <glyph name="a" format="2">
 + <glyph format="2" name="a">

When it's a simple element (without children), ElementTree always adds an extra space before the ending />, lxml doesn't:

 - <unicode hex="0062"/>
 + <unicode hex="0062" />
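
Both differences can be checked with the standard library alone. The snippet below is a small sketch for reproducing them; comparing attributes as a mapping is a suggestion for order-insensitive checks, not something this PR does. Note that attribute sorting depends on the Python version: releases before 3.8 sort on output, while 3.8 and later keep insertion order.

```python
import xml.etree.ElementTree as ET

# A childless element: the standard library writes " />" with a leading space.
unicode_el = ET.Element("unicode", hex="0062")
print(ET.tostring(unicode_el, encoding="unicode"))

# Attribute order varies with the serializer and Python version,
# so compare the attributes as a mapping rather than as raw text.
glyph = ET.Element("glyph")
glyph.set("name", "a")
glyph.set("format", "2")
serialized = ET.tostring(glyph, encoding="unicode")
reparsed = ET.fromstring(serialized)
print(dict(reparsed.attrib) == {"name": "a", "format": "2"})
```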

anthrotype · 2018-07-03T17:51:43Z

yeah, ElementTree's _serialize_xml function calls sorted on the attributes items:
https://github.com/python/cpython/blob/831c29721dcb1b768c6315a4b8a4059c4c97ee8b/Lib/xml/etree/ElementTree.py#L928

😢

benkiel · 2018-07-03T18:00:24Z

Perhaps that's enough to say "Alright, sorry, lxml is a requirement"?

anthrotype · 2018-07-03T18:05:13Z

it was worth a try. I really wouldn't like to implement our own elementtree xml serializer..
Let's require lxml then.


benkiel · 2018-07-03T18:14:22Z

@anthrotype Agree. If there is pushback on the lxml requirement, we can revisit. Thanks for all the work, really appreciated!

anthrotype · 2018-07-03T18:19:12Z

just pushed v2.2.1.

Oh, when tagging a new release, I like to "edit" the github release notes so that they will be displayed as formatted markdown, instead of as a monospaced code block that is used by default when pushing a git tag. I basically copy and paste the same text of the git tag. Probably there are tools to automate this, but I'm lazy.

benkiel · 2018-07-03T20:01:16Z

@anthrotype got it. and, thanks again; know we just did that one quick — it had been sitting for a while with no comments on the other thread so before the code got too stale it was merged. Thanks again for your running down ETree and everything — it is really appreciated!

chrissimpkins · 2018-07-04T00:42:34Z

Thoughts about a major version increment for these changes? It sounds like the mandatory use of lxml for writes has the potential to produce large source text diffs compared with the last release for some users (if I understood Cosimo's comment above correctly), and the elimination of default source validation on UFO reads through the main i/o reading class is a significant change in expected behavior. This has the potential to lead to breakage in projects that depend on the previous default behaviors.

benkiel · 2018-07-04T03:45:53Z

@chrissimpkins Fair point. There shouldn't (fingers crossed) be too many diff changes, we didn't change the test cases and made lxml conform to the tests, though I suppose there will still be some. Also open to changing read to validate by default if that seems better.

anthrotype · 2018-07-04T10:12:35Z

I also think it would be better to bump major version for this. I'll add the plistlib module from ufoLib2 and then go to 3.0, if you guys are also OK with it.

benkiel · 2018-07-04T13:19:11Z

@anthrotype Agree. What do you think about the default being to validate on read?

anthrotype · 2018-07-04T14:22:17Z

maybe that could be safer, as long as it's easy to opt-out.
@typemytype seems to prefer validate=False here
https://github.com/typesupply/defcon/pull/191/files

anthrotype · 2018-07-04T14:25:47Z

in the defcon PR I linked, the defaults are rather the opposite of what @benkiel was suggesting:

https://github.com/typesupply/defcon/pull/191/files#diff-c6a63107dc3c355ab30047131285dc06R452

read is False, write is True.. Not sure what's best.

typesupply · 2018-07-04T15:26:10Z

My 2¢ is that validation should be on by default in all cases. That preserves backwards compatibility, for what that is worth. I also didn't engineer defcon against malicious input so I prefer that turning validation off be something that one has to knowingly do.

anthrotype · 2018-07-04T15:49:39Z

makes sense. Ok, let's flip the default to be validate=True in ufoLib, and probably defcon too.
@benkiel would you work on that? I'll work on integrating the new plist module.

typemytype · 2018-07-04T16:33:25Z

I was just following ufoLib in defcon, will change it so that both read and write are validated by default in defcon.

benkiel · 2018-07-04T18:59:19Z

@anthrotype done in #155. This now means that validation is on by default for both read and write (it was on for write before)
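
To make the agreed behavior concrete: validation runs unless the caller explicitly turns it off. The sketch below is a toy illustration only. `ToyReader`, `validate_glyph_data`, `ValidationError` and the per-call override are all hypothetical; only the `validate` keyword and its `True` default come from this thread.

```python
class ValidationError(Exception):
    """Raised when data fails a structural check."""


def validate_glyph_data(data):
    # Toy check standing in for real UFO validation.
    if not isinstance(data.get("name"), str) or not data["name"]:
        raise ValidationError("glyph needs a non-empty string name")


class ToyReader:
    def __init__(self, validate=True):
        # Validation is on unless the caller explicitly opts out.
        self.validate = validate

    def read_glyph(self, data, validate=None):
        # A per-call argument overrides the instance default (an assumption).
        if validate is None:
            validate = self.validate
        if validate:
            validate_glyph_data(data)
        return dict(data)


strict = ToyReader()
lenient = ToyReader(validate=False)
print(lenient.read_glyph({"name": ""}))
```

Skipping the checks then requires a deliberate `validate=False`, which is the "knowingly" part of the argument above.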

benkiel and others added 29 commits June 11, 2018 11:07

UFOReader validation turned off by default. Starting on UFOWriter

b8d4b55

UFOWriter validation turned on by default.

42b94f5

Catch additional calls to GlyphSet that need validation. Initial start of adding validation on/off to GlyphSet

5f8b600

Update glyphLib.

a7aa115

Update tests. Fix errors. Add documentation.

Merge pull request #150 from benkiel/master

62dba9b

Make validation optional

Relax the point data validation.

4c75212

Add in more validator switches

83303d5

Fix tests, needed named arg.

f81e9b8

Remove some validation escaping when it comes to file existence tests. Those should always fail.

34681b8

Remove some validation escaping.

efd2575

Don't rebuild the list of existing file names each time a glyph is written. That's expensive.

5867e9b

Typo and fallback.

382271c

Use lxml for writing.

739ec8a

Add lxml to requirements

6d774d7

Need to force text to a string and not bytes for Py3

f3b0703

Remove slice from splitlines that isn't needed for lxml

c54c54b

Need to pass an OrderedDict to get correct order.

c8608c2

Found some attrib type errors.

213b90a

Make attrs OrderedDicts

3ae4f10

Get encoding right for py2 and py3

Trick lxml into not writing self-closing tags for outline and contour elements.

3cfb942

Height before width.

4c52a51

Write the note the old way.

5f39650

fix writing lib with lxml

e9df5be

Add indentation to the self-closing prevention.

dab4800

Use len(element).

5ec6b5e

Use lxml for reading.

c008e20

Unneeded import.

1d6dddc

Merge branch 'validation_optional' into master

fe88d1d

Merge pull request #152 from unified-font-object/master

55f131d

Syncing with master

benkiel requested a review from anthrotype July 3, 2018 14:32

anthrotype added a commit that referenced this pull request Jul 3, 2018

Add lxml >= 4.0 to setup.py install_requires

e33561b

fixes https://travis-ci.org/robofab-developers/fontParts/jobs/399648345#L492 See discussion at #153 (comment)

chrissimpkins mentioned this pull request Jul 4, 2018

Fix UFO source validations after ufoLib validation on read changes source-foundry/ufolint#10

Merged

benkiel mentioned this pull request Jul 16, 2018

This isn’t practical in the real world #102

Closed
