AGS 4: Implement script RTTI #1922

ivan-mogilko · 2023-02-15T07:32:38Z

Resolves #1259.

CC @fernewelten, because he is working on a new script compiler, and I added a few functions to it.

The purpose is described in detail in the task ticket #1259; here I will give a brief overview.
Several potential operations within the script interpreter are impossible to achieve without an indication and description of an object's type and its inner contents:

- Having managed pointers in managed structs and being able to dereference them at runtime when disposing of the parent object (thus avoiding memory leaks).
- Dynamic pointer casting (at runtime), parent-to-child type casting specifically.
- Being able to identify the script data in game saves (see Version-independent save games #1371).

The following would also potentially be easier to implement if we had type indication and content description:

- Virtual methods (would mostly need a runtime type reference; full RTTI tables are probably not needed for this).
- Runtime debugging, such as watching a struct's fields in real time.

This is where the RTTI feature comes in. The idea is to have a table of types and their contents, generated by the script compiler as an extra step, and written either as an extra (optional) part of the script data, or as a separate file alongside the game project.

This PR does the following:

- Implements RTTI generation in both supported script compilers (the classic ags3 one and the new ags4 one).
- Serializes the generated RTTI per script, as an extension to the script data.
- Makes the runtime interpreter create a joint RTTI collection, updating it after each loaded script.
- Adds a "--print-rtti" command arg, which makes the engine print the resulting joint collection into the log on each room load (this seemed to be the most logical place for it, as the room script is the last script loaded).

--print-rtti command arg is useful for testing the resulting RTTI table.

Additionally, the engine creates "quick references" within the gathered RTTI, which are basically pointers between type and field structs. This makes it easier to traverse types and their respective fields under the debugger.

Generation

Currently the type information consists of an ID, a name, a location (where the type was declared), a parent's ID (if it has a parent), a set of flags which define the type's kind, and a collection of fields.
The fully qualified type name may be generated by combining a location's name (usually a header or script name) and a type's name. This is required to distinguish unrelated types of the same name declared in different scripts.
Field info contains: relative offset, name, type ID, and a set of flags which define the field's kind and qualifiers.
The gathered data may be easily expanded in the future (also see the notes on serialization).

Compilers generate RTTI for each compiled script unit. Because of that, a type's ID is a local ID, valid only within that compiled unit. The engine has to use fully qualified names to map a script's local ID to a global ID in the joint table.
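
A minimal sketch of that local-to-global mapping, in Python for brevity. All names here (`LocalType`, `JointRTTI`, `merge`) and the "::" separator are assumptions made for illustration, not the engine's actual classes or naming scheme.

```python
from dataclasses import dataclass, field


@dataclass
class LocalType:
    local_id: int   # ID valid only inside one compiled script unit
    name: str       # unqualified type name
    location: str   # header or script where the type was declared


def qualified_name(location: str, type_name: str) -> str:
    # The "::" separator is an assumption of this sketch.
    return f"{location}::{type_name}"


@dataclass
class JointRTTI:
    # fully qualified name -> global type ID
    global_ids: dict = field(default_factory=dict)

    def merge(self, unit_types):
        """Merge one script unit; return its local ID -> global ID map."""
        local_to_global = {}
        for t in unit_types:
            fqn = qualified_name(t.location, t.name)
            # Reuse the global ID if an earlier unit already registered this type.
            gid = self.global_ids.setdefault(fqn, len(self.global_ids))
            local_to_global[t.local_id] = gid
        return local_to_global


joint = JointRTTI()
# Two units see the same "Sword" type under different local IDs.
map_a = joint.merge([LocalType(1, "Weapon", "Items.ash"),
                     LocalType(2, "Sword", "Items.ash")])
map_b = joint.merge([LocalType(1, "Sword", "Items.ash"),
                     LocalType(2, "Weapon", "Enemy.asc")])
```

Here "Weapon" declared in two different locations gets two distinct global IDs, while "Sword" from the shared header resolves to the same global ID in both units.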

Serialization

The RTTI serialization format is designed with ease of expansion in mind. For that purpose it is not a single table of types with nested field items inside, but a number of separate tables, where each table's entries have a fixed size. The RTTI header describes the tables' offsets and the fixed sizes of their items. The tables are connected using index references (meaning an item in table 1 may refer to an item in table 2 by its index).

The advantages of such a structure are:

- faster reading;
- faster and easier parsing, e.g. when writing a tool that does not read the full data but only wants to find a particular item in a file stream;
- much easier extension: to expand an existing item, you increase the fixed item size in the header, and even an older program will still be able to read the new data (although it will only understand the old parts of it); to add a completely new kind of information, you add a new table, which a parser may easily skip if it does not need it or does not know it.
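
The extension point can be sketched as follows: a reader steps through a table using the entry size taken from the header, and decodes only the leading fields it knows. This is an illustration in Python, not the engine's reader; `read_entries` and the little-endian byte order are assumptions.

```python
import struct

# Fields an "older" reader understands in a location entry: local id, name offset.
KNOWN_LOCATION = struct.Struct("<II")


def read_entries(data: bytes, table_offset: int, entry_size: int,
                 count: int, known: struct.Struct):
    """Read `count` entries, decoding only the leading part the reader knows."""
    if entry_size < known.size:
        raise ValueError("entry is smaller than the fields this reader requires")
    entries = []
    for i in range(count):
        start = table_offset + i * entry_size
        # Bytes past known.size belong to a newer format and are skipped.
        entries.append(known.unpack_from(data, start))
    return entries


# A hypothetical "newer" writer added a third uint32 to every location entry,
# so the header would report an entry size of 12 bytes instead of 8.
newer_table = struct.pack("<III", 0, 0, 111) + struct.pack("<III", 1, 8, 222)
old_view = read_entries(newer_table, 0, 12, 2, KNOWN_LOCATION)
```

The older reader still recovers both entries correctly because it advances by the declared entry size rather than by the size it was compiled with.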

Following is a format description.

RTTI header

| field | type / size | comment |
| --- | --- | --- |
| format | uint32 | for expanding the RTTI format |
| header size | uint32 | size in bytes of the header (counting from the "format" field until the header ends) |
| full rtti size | uint32 | size in bytes of the RTTI data (counting from the "format" field until the data ends) |
| location entry size | uint32 | fixed size of a location's description in bytes (may depend on format) |
| locations table offset | uint32 | a relative position of the locations table (in file) |
| num locations | uint32 | total number of location entries |
| type entry size | uint32 | fixed size of a type's description in bytes (may depend on format) |
| types table offset | uint32 | a relative position of the types table (in file) |
| num types | uint32 | total number of type entries |
| field entry size | uint32 | fixed size of a type field's description in bytes (may depend on format) |
| fields table offset | uint32 | a relative position of the type fields table (in file) |
| num type fields | uint32 | total number of all fields in all types |
| string table offset | uint32 | a relative position of the strings table (in file) |
| string table size | uint32 | total size of the string table, in bytes |
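
As an illustration, the 14 header fields above could be decoded like this. It is a sketch under assumptions: little-endian byte order, a format value of 0, and offsets counted from the "format" field are all guesses made for the example, and `RTTIHeader` / `read_header` are hypothetical names.

```python
import struct
from collections import namedtuple

RTTIHeader = namedtuple("RTTIHeader", [
    "format", "header_size", "full_size",
    "loc_entry_size", "loc_table_offset", "num_locations",
    "type_entry_size", "type_table_offset", "num_types",
    "field_entry_size", "field_table_offset", "num_fields",
    "string_table_offset", "string_table_size",
])

# 14 uint32 values; little-endian is an assumption of this sketch.
HEADER_STRUCT = struct.Struct("<14I")


def read_header(data: bytes, start: int = 0) -> RTTIHeader:
    """Decode the fixed header fields starting at `start`."""
    return RTTIHeader._make(HEADER_STRUCT.unpack_from(data, start))


# Header of an empty RTTI block: every table starts right after the header.
# Entry sizes (8, 32, 20 bytes) match the uint32 counts in the entry tables.
size = HEADER_STRUCT.size
sample = HEADER_STRUCT.pack(0, size, size,
                            8, size, 0,
                            32, size, 0,
                            20, size, 0,
                            size, 0)
header = read_header(sample)
```

A real reader would use `header_size` to skip header fields it does not know, in the same spirit as the fixed entry sizes.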

RTTI tables

| field | type / size | comment |
| --- | --- | --- |
| location table | num locs * loc size | see "Location description" below |
| type table | num types * type size | see "Type description" below |
| fields table | num type fields * field entry size | see "Field description" below |
| string table | string table size | all RTTI null-terminated strings packed in a single array (separated by 0s) |
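
A small sketch of how names could be stored in and read back from such a packed string table. The helper names and the UTF-8 encoding are assumptions made for the example; sharing one offset between identical strings is an optional space saving, not something the format description requires.

```python
def build_string_table(names):
    """Pack names as null-terminated strings; return (table, name -> offset)."""
    table = bytearray()
    offsets = {}
    for name in names:
        if name not in offsets:  # identical strings share one offset
            offsets[name] = len(table)
            table += name.encode("utf-8") + b"\0"
    return bytes(table), offsets


def read_string(string_table: bytes, offset: int) -> str:
    """Read the null-terminated string that starts at `offset`."""
    end = string_table.index(b"\0", offset)
    return string_table[offset:end].decode("utf-8")


table, offsets = build_string_table(["Items.ash", "Weapon", "damage", "Weapon"])
weapon_name = read_string(table, offsets["Weapon"])
```

Entries in the other tables then only need to store a uint32 offset, which keeps them fixed-size regardless of name length.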

Location description

| field | type / size | comment |
| --- | --- | --- |
| local id | uint32 | local ID of this location |
| name | uint32 | an offset of a name in a string table |

Type description

| field | type / size | comment |
| --- | --- | --- |
| local id | uint32 | local ID of this type |
| name | uint32 | an offset of a name in a string table |
| location id | uint32 | ID of the location this type was declared at |
| parent type | uint32 | local type ID; 0 if no parent |
| type flags | uint32 | may contain helper flags which simplify analyzing this type |
| size | uint32 | in bytes |
| num fields | uint32 | number of member fields this type has, 0 if none |
| field table index | uint32 | index of the first field in the fields table |

Field description

| field | type / size | comment |
| --- | --- | --- |
| offset | uint32 | relative offset of this field, in bytes |
| name | uint32 | an offset of a name in a string table |
| type | uint32 | this field's local type ID |
| type flags | uint32 | may contain helper flags which simplify analyzing this member |
| num elements | uint32 | number of (array's) elements |
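
The link between a type entry and its fields ("num fields" plus "field table index") can be sketched like this. It assumes that a type's fields are stored contiguously in the fields table and that values are little-endian; `fields_of` is a hypothetical helper.

```python
import struct

# local id, name, location id, parent type, type flags, size, num fields, field table index
TYPE_ENTRY = struct.Struct("<8I")
# offset, name, type, type flags, num elements
FIELD_ENTRY = struct.Struct("<5I")


def fields_of(type_entry, fields_table: bytes):
    """Return the decoded field entries that belong to one type entry."""
    num_fields, first_index = type_entry[6], type_entry[7]
    return [
        FIELD_ENTRY.unpack_from(fields_table, (first_index + i) * FIELD_ENTRY.size)
        for i in range(num_fields)
    ]


# One shared fields table: type 1 owns field 0, type 2 owns fields 1 and 2.
fields_table = b"".join([
    FIELD_ENTRY.pack(0, 0, 0, 0, 1),
    FIELD_ENTRY.pack(0, 0, 0, 0, 1),
    FIELD_ENTRY.pack(4, 0, 1, 0, 1),
])
type_two = TYPE_ENTRY.unpack(TYPE_ENTRY.pack(2, 0, 0, 0, 0, 8, 2, 1))
type_two_fields = fields_of(type_two, fields_table)
```

Non-array fields use "num elements" of 1 here, matching the current behaviour mentioned in the TODO list.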

TODO

- Make sure each compiler generates a type with typeid 0 meaning "No Type" (I think the old compiler does not do this atm).
- There's a nasty problem with fully qualified names taking extra space in the file and in memory because of the repeated location (script) name. I'm investigating options, but possibly the type desc may contain a "location id" instead, and then either 1) location names are stored separately, 2) location names are taken from section names in the basic script data, or 3) a new table of "locations" is created inside RTTI (which may have some extra uses in the future).
- The compiler should only put the member name for fields; currently one or both of the compilers put Type::Member there instead, which is redundant.
- Perhaps format the --print-rtti output better.
- Might add an explicit compiler option for generating & saving RTTI.
- I made fields that are not arrays save "num elems" as 1, but maybe that's redundant (need to think this over).

ivan-mogilko · 2023-02-20T00:14:03Z

I decided to do some refactoring on this PR, because in the current variant the RTTI struct exposes too many internal "mechanics", so it's too easy to produce an invalid state (and there's code duplication too).

ivan-mogilko · 2023-02-23T20:27:40Z

Added a separate "locations" table. Now the fully qualified type name is generated at runtime by combining the location's name and the type's name.

AlanDrake · 2023-03-08T17:24:17Z

Does this PR have any impact on script performance?

ivan-mogilko · 2023-03-08T18:42:44Z

> Does this PR have any impact on script performance?

Script saving and loading take a little longer.
Script execution will be affected by another PR (#1923), which actually puts this to use in the new "new" command, which saves the type. But I don't think the impact will be large (this may be tested though).

I was planning to do one last refactor in the next few days tbh (I got distracted by the latest 3.6.0 bugs), because I put everything in script.h/cpp, and one class could be split into two for more consistent behavior, but this may also be done later.

Script writing was reimplemented in the managed code (C++ CLR) in c5c7c87, but this is quite inconvenient: the native part still has script serialization code, which is used by the compiled room serialization code, so we would have to duplicate any additions in native and managed code in parallel. This change removes the managed reimplementation and makes CompiledScript::Write use native serialization instead. Because the scripts are written in the middle of the compiled game format, which is also serialized by the managed code, we cannot use the same stream object directly. There are two options: 1) write to a temp file using a native stream, then copy the file contents to the provided managed stream; 2) write to a native membuf, then copy the buffer's contents to the managed stream (might require conversion, or marshaling). I implemented this using a temp file for now. If this proves to be slow with a large number of scripts, we may switch to using a memory buffer (and memory stream) instead.

TODO: section names are missing, so fully qualified type names are not yet available.

ivan-mogilko · 2023-03-09T21:08:12Z

Okay, I think this is done. I had doubts about how the structs are organized, but maybe I am overthinking again, and it should not be impossible to refactor them into something more convenient (if the need arises). The serialization format is something I am certain about though.

As has been mentioned, it's possible to completely disable RTTI generation by toggling a compiler option. The engine can load scripts both with and without RTTI.

ivan-mogilko added the "ags 4" (related to the ags4 development), "context: script compiler" and "context: script vm" labels Feb 15, 2023

ivan-mogilko added this to the 4.0.0 (preliminary) milestone Feb 15, 2023

ivan-mogilko force-pushed the ags4--rtti branch 2 times, most recently from d076b7a to 41fde01 Compare February 15, 2023 08:30

ivan-mogilko mentioned this pull request Feb 15, 2023

AGS 4: properly support managed pointers in managed structs #1923

Merged

5 tasks

ivan-mogilko changed the title ~~AGS 4.0: Implement script RTTI~~ AGS 4: Implement script RTTI Feb 15, 2023

ivan-mogilko marked this pull request as draft February 20, 2023 00:12

ivan-mogilko force-pushed the ags4--rtti branch 3 times, most recently from 030cfdd to 840fe39 Compare February 23, 2023 20:21

ivan-mogilko marked this pull request as ready for review February 23, 2023 21:31

ivan-mogilko force-pushed the ags4--rtti branch 2 times, most recently from fd7fd67 to 8a6cf80 Compare March 9, 2023 19:02

ivan-mogilko added 10 commits March 9, 2023 22:16

Common: added RTTI classes and serialization in ccScript

ca16a29

Compiler (old): gather RTTI

06a0ad8

Compiler (new): gather RTTI

bdde8a2

TODO: section names are missing, so fully qualified type names are not yet available.

Engine: ccInstance creates global RTTI table

bcddb3e

Engine: implemented "--print-rtti" command for logging joint RTTI table

ada37b5

Compilers: ensure "undefined" type 0 in RTTI

1577906

Common: for RTTI write field names without struct name

65bc9f5

Common: RTTI stores locations table, type info has loc id

52853eb

Compiler (new): implement adding locations for RTTI

5f3119a

Compilers: explicit option for exporting RTTI

65dabbf

ivan-mogilko force-pushed the ags4--rtti branch from be7bb6c to 798a63c Compare March 9, 2023 19:22

Tools, Tests: fixed linking compiler libs

b00186d

ivan-mogilko force-pushed the ags4--rtti branch from 798a63c to b00186d Compare March 9, 2023 20:08

ivan-mogilko merged commit f13fd18 into adventuregamestudio:ags4 Mar 9, 2023

ivan-mogilko deleted the ags4--rtti branch March 9, 2023 21:08

ivan-mogilko mentioned this pull request Apr 2, 2023

AGS 4: RTTI for AGS script #1259

Closed

ivan-mogilko mentioned this pull request Jun 1, 2023

Support script object pointer downcast #2018

Closed
