
glTF 2.0 syntax changes and JSON encoding restrictions #831

Closed
lexaknyazev opened this issue Feb 4, 2017 · 1 comment

@lexaknyazev
Member

Here are the reasons behind the glTF 2.0 syntax change (objects to arrays) and the new JSON encoding restrictions. While each of them on its own has only a small-to-medium impact on robustness or performance, the WG believes that their combination justifies this breaking change.

Syntax change

1. Incorrect usage of JSON (from Vulkan loader)

As opposed to using arrays whose elements carry a name property, glTF 1.0 uses objects with arbitrarily named members.

Normally, JSON objects have a fixed, well-defined set of member names. Here, a JSON parser is needed that can not only look up object members by name, but also iterate over the members as if they were array elements. Not all JSON parsers support this.

JSON parsers that do support this behavior are typically not optimized for objects with many members.
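For illustration, here is a sketch of the two layouts; the IDs and accessor values are invented and abridged, not taken from a real asset:

```js
// glTF 1.0: accessors is an object keyed by arbitrary string IDs.
// A loader has to iterate member names to visit every accessor.
const gltf1 = {
  accessors: {
    "accessor_23": { bufferView: "bufferView_7", componentType: 5126, count: 1024, type: "VEC3" },
    "accessor_24": { bufferView: "bufferView_7", componentType: 5126, count: 1024, type: "VEC2" }
  }
};

// glTF 2.0: accessors is a plain array; elements are addressed by index.
const gltf2 = {
  accessors: [
    { bufferView: 7, componentType: 5126, count: 1024, type: "VEC3" },
    { bufferView: 7, componentType: 5126, count: 1024, type: "VEC2" }
  ]
};
```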

2. Specifics of web browsers' JSON parsers

Modern JS engines try to build hidden-class-based representations for object shapes they have already seen.

While that makes perfect sense for actual glTF objects like "accessor" or "node", there is no point in creating an object type for the "accessors" dictionary. Moreover, when the JS runtime sees too many properties, it can fall back to a slower map-based internal representation (this applies to V8).

This process consumes more client memory and cycles than creating a dense JS array of elements of the same type.

When an engine loads different glTF assets one after another, the JS runtime can reuse the previously created hidden classes for objects like "accessor", but the top-level dictionaries will always have unique signatures, so they cannot be optimized the same way.
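A rough sketch of why the top-level dictionary defeats this optimization (the IDs below are invented):

```js
// Every per-accessor object has the same set of keys, so assets A and B can
// in principle share one object shape for them:
//   { bufferView, componentType, count, type }
//
// The ID-keyed dictionaries, however, have a different key set in every asset,
// so each one gets its own one-off shape (or falls back to map mode):
const accessorsA = { "accessor_1": { /* ... */ }, "accessor_2": { /* ... */ } };
const accessorsB = { "chair_positions": { /* ... */ }, "chair_normals": { /* ... */ } };

// With arrays, both assets produce the same structure: a dense array whose
// elements all share one shape.
const accessorsArray = [ { /* ... */ }, { /* ... */ } ];
```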

3. Asset size reduction

Array-based JSON tends to consume less disk/transfer space and less client RAM, so the JSON parser has fewer bytes to process.

Exporters/converters have to generate unique strings (because of the global ID scope); collada2gltf does this by prefixing an index with the type, e.g. "bufferView_25". This step is not needed with arrays. Objects that have a meaningful string name (nodes, meshes, and/or materials) usually carry it in two places: as the ID and as the "name" property.
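As a sketch (indices and field values invented), the reference style changes from string IDs to array indices:

```js
// glTF 1.0: references are globally unique string IDs.
const accessor1 = { bufferView: "bufferView_25", componentType: 5126, count: 36, type: "VEC3" };
const view1 = gltf1.bufferViews[accessor1.bufferView]; // member lookup by name

// glTF 2.0: references are integer indices into the corresponding array.
const accessor2 = { bufferView: 25, componentType: 5126, count: 36, type: "VEC3" };
const view2 = gltf2.bufferViews[accessor2.bufferView]; // plain array indexing
```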

Minifying JSON by renaming all IDs to as-short-as-possible unique strings (like "Zq", "Rw", "w$", etc.) certainly reduces file size, but it also eliminates whatever readability string-based IDs provide.

For example, minifying a big asset (an actual scene with roughly 1300 nodes and meshes and about 4200 accessors) reduces the JSON to 77% of its original size (collada2gltf output); combined with removing "name" properties, to 70%. Converting the same collada2gltf output to array form and removing names reduces the file to 66% of the original size.

4. Parsing performance

The difference in browser parsing performance should be noticeable on big assets (at least 1 MB of JSON).

A call to JSON.parse() is synchronous and causes small hangs even with Web Workers (when the data is transferred between threads). So an application that renders and loads assets continuously over time (think mapping applications or Cesium's 3D Tiles) can benefit even from small performance gains.

Obviously, we can't easily measure the internals of a JSON.parse() call and say exactly how much each processing stage costs, but there is a clear win with arrays. Actual processing time varies by browser vendor and by CPU.

Here are samples of the duration (performance.now) of the first call to JSON.parse(), and the size of the parsed object in the heap afterwards (as reported by the browser's dev tools).

The first number in each pair is for objects (names removed, IDs minified, 1011478 bytes), the second for arrays (also no names, 957153 bytes).

| User-Agent | Memory, MB (Objects) | Memory, MB (Arrays) | Time, ms, Core i5-6600 W10x64 (Objects) | Time, ms, Core i5-6600 W10x64 (Arrays) | Time, ms, Celeron 847 W10x64 (Objects) | Time, ms, Celeron 847 W10x64 (Arrays) |
| --- | --- | --- | --- | --- | --- | --- |
| MS Edge 38 | 2.77 | 2.16 | 12 | 7 | 37 | 28 |
| Mozilla Firefox 51 | 2.52 | 1.93 | 14 | 10 | 60 | 45 |
| Google Chrome 56 | 4.02 | 3.67 | 14 | 10 | 60 | 45 |
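For reference, a minimal sketch of how such a timing could be reproduced; the file names are placeholders, and the heap numbers still come from the browser's dev tools rather than from script:

```js
// Measure the first JSON.parse() call on a downloaded glTF asset.
// "asset-objects.gltf" / "asset-arrays.gltf" are placeholder URLs; run once
// per page load, since a warmed-up parser may behave differently.
async function timeParse(url) {
  const text = await (await fetch(url)).text();
  const start = performance.now();
  const gltf = JSON.parse(text);
  const elapsed = performance.now() - start;
  console.log(`${url}: ${text.length} bytes parsed in ${elapsed.toFixed(1)} ms`);
  return gltf;
}

timeParse("asset-objects.gltf"); // object-based variant
timeParse("asset-arrays.gltf");  // array-based variant
```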

JSON encoding restrictions

JSON has some vague string-related specifics that can be avoided by enforcing additional restrictions on the glTF encoding:

  • JSON allows different representations of the same string, e.g. "%" == "\u0025". JSON parsers must understand that.
  • JSON allows the full Unicode character set, but even one non-ASCII character can cut parsing performance in V8 by a factor of two, because V8 has a fast path for strings whose characters are all "one-byte".

To reduce the possible impact of incorrect string handling in custom loaders, the following restrictions are proposed:

  • All glTF-critical strings (i.e., property names and enums) must be defined in the spec.
  • They must use plain-text encoding (i.e., "buffer" instead of "\u0062\u0075\u0066\u0066\u0065\u0072") and must be limited to ASCII characters only.
  • The glTF asset must use ASCII/UTF-8 encoding (i.e., non-ASCII characters are allowed only in app-specific strings).

With these restrictions, we can be sure that a minimal glTF loader does not need full Unicode support.
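A minimal sketch of a quick check a tool could run on the raw JSON bytes to see whether an asset contains any non-ASCII characters at all (and so would stay on V8's one-byte fast path); the function name is invented:

```js
// Returns true when every byte of the JSON chunk is plain ASCII. Non-ASCII
// UTF-8 sequences always contain bytes >= 0x80, so one pass over the bytes
// is enough; no decoding is needed.
function isAsciiOnly(bytes /* Uint8Array */) {
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] > 0x7f) return false;
  }
  return true;
}

// Usage: const ok = isAsciiOnly(new TextEncoder().encode(jsonString));
```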

@lexaknyazev lexaknyazev changed the title glTF 2.0 syntax changes and data encoding restrictions glTF 2.0 syntax changes and JSON encoding restrictions Feb 4, 2017
@pjcozzi
Member

pjcozzi commented Jun 15, 2017

Updated in #826
