[Complex Text Layouts: 1/4] BiDi/Shaping engine API structure, text processing refactoring. #1180
Current list of proposed changes: Core Changes (String)
#ifdef WINDOWS_ENABLED
#define FROM_WC_STR(m_value, m_len) (String((const CharType *)(m_value), (m_len)))
#define WC_STR(m_value) ((const wchar_t *)((m_value).get_data()))
#else
#define FROM_WC_STR(m_value, m_len) (String::utf32((const CharType32 *)(m_value), (m_len)))
#define WC_STR(m_value) ((const wchar_t *)((m_value).utf32().get_data()))
#endif
static bool is_single(char16_t p_char);
static bool is_surrogate(char16_t p_char);
static bool is_surrogate_lead(char16_t p_char);
static bool is_surrogate_trail(char16_t p_char);
static char32_t get_supplementary(char16_t p_lead, char16_t p_trail);
static char16_t get_lead(char32_t p_supplementary);
static char16_t get_trail(char32_t p_supplementary);
TextServer (handles font and text shaper implementations)
API mock-up for TextServer base class:
class TextServer : public Object {
GDCLASS(TextServer, Object);
public:
enum TextDirection {
TEXT_DIRECTION_AUTO, // Detects text direction based on string content and specific locale
TEXT_DIRECTION_LTR, // Left-to-right text.
TEXT_DIRECTION_RTL // Right-to-left text.
};
enum TextOrientation {
TEXT_ORIENTATION_HORIZONTAL_TB, // Text flows horizontally, next line is below.
TEXT_ORIENTATION_VERTICAL_RL, // For LTR text flows vertically top to bottom, next line is to the left. For RTL, text flows from bottom to top, next line to the right. Vertical scripts displayed upright.
TEXT_ORIENTATION_VERTICAL_LR, // For LTR text flows vertically top to bottom, next line is to the right. For RTL, text flows from bottom to top, next line to the left. Vertical scripts displayed upright.
TEXT_ORIENTATION_SIDEWAYS_RL, // ... Vertical scripts displayed sideways.
TEXT_ORIENTATION_SIDEWAYS_LR
};
enum TextJustification {
TEXT_JUSTIFICATION_NONE = 0,
TEXT_JUSTIFICATION_KASHIDA = 1 << 1, // Change width or add/remove kashidas (ــــ).
TEXT_JUSTIFICATION_WORD_BOUND = 1 << 2, // Adds/removes extra space between the words (for some languages, should add spaces even if there were none in the original string, using dictionary).
TEXT_JUSTIFICATION_GRAPHEME_BOUND = 1 << 3, // Adds/removes extra space in between all non-joining graphemes.
TEXT_JUSTIFICATION_GRAPHEME_WIDTH = 1 << 4 // Adjusts width of the graphemes visually (if supported by font), 10-15% of change should be OK in general.
};
enum TextBreak {
TEXT_BREAK_NONE = 0,
TEXT_BREAK_MANDATORY = 1 << 1, // Breaks line at the explicit line break characters ("\n" etc).
TEXT_BREAK_WORD_BOUND = 1 << 2, // Breaks line between the words.
TEXT_BREAK_GRAPHEME_BOUND = 1 << 3 // Breaks line between any graphemes (in general it's OK to break line anywhere, as long as it isn't reshaped after).
};
enum TextCaretMove {
TEXT_CARET_GRAPHEME,
TEXT_CARET_WORD,
TEXT_CARET_SENTENCE,
TEXT_CARET_PARAGRAPH
};
enum TextGraphemeFlags {
GRAPHEME_FLAG_VALID = 1 << 1,
GRAPHEME_FLAG_RTL = 1 << 2,
GRAPHEME_FLAG_ROTCW = 1 << 3, // For sideways vertical layout.
GRAPHEME_FLAG_ROTCCW = 1 << 4,
};
struct Grapheme {
struct Glyph {
uint32_t glyph_index = 0; // Glyph index is internal value of the font and can't be reused with other fonts, or store UTF-32 codepoint for invalid glyphs (for faster invalid char hex code box display).
Vector2 offset; // Offset from the origin of the glyph.
};
Vector<Glyph> glyphs;
Vector2i range; // Range in the original string this grapheme corresponds to.
Vector2 advance; // Advance to the next glyph.
/*TextGraphemeFlags*/ uint8_t flags; // Used for caret drawing.
RID font;
};
struct Caret {
Rect2 position; // Caret rectangle
bool is_primary;
};
protected:
//......//
virtual bool has_feature(Feature p_ftr); // --> BiDi, Shaping, System Fonts
// Font API
virtual RID create_font_system(const String &p_name); // Loads OS default font by name (if supported).
virtual RID create_font_resource(const String &p_filename); // Loads custom font from "res://" or filesystem.
virtual RID create_font_memory(const Vector<uint8_t> &p_data); // Loads custom font from memory (for built-in fonts).
virtual float font_get_height(RID p_font, float p_size) const;
virtual float font_get_ascent(RID p_font, float p_size) const;
virtual float font_get_descent(RID p_font, float p_size) const;
virtual float font_get_underline_position(RID p_font, float p_size) const;
virtual float font_get_underline_thickness(RID p_font, float p_size) const;
virtual bool font_has_feature(RID p_font, FontFeature p_feature) const; // Outline, Resizable, Distance field
virtual bool font_language_supported(RID p_font, const String &p_locale) const;
virtual bool font_script_supported(RID p_font, const String &p_script) const;
virtual void font_draw_glyph(RID p_font, RID p_canvas, float p_size, const Vector2 &p_pos, uint32_t p_index, const Color &p_color) const;
virtual void font_draw_glyph_outline(RID p_font, RID p_canvas, float p_size, const Vector2 &p_pos, uint32_t p_index, const Color &p_color) const;
virtual void font_draw_invalid_glyph(RID p_font, RID p_canvas, float p_size, const Vector2 &p_pos, uint32_t p_index, const Color &p_color) const; // Draws box with hex code, scaled to match font size.
// Shaped Text Buffer
virtual RID create_shaped_text(TextDirection p_direction = TEXT_DIRECTION_AUTO, TextOrientation p_orientation = TEXT_ORIENTATION_HORIZONTAL_TB);
virtual void shaped_set_direction(RID p_shaped, TextDirection p_direction = TEXT_DIRECTION_AUTO);
virtual void shaped_set_orientation(RID p_shaped, TextOrientation p_orientation = TEXT_ORIENTATION_HORIZONTAL_TB);
virtual bool shaped_add_text(RID p_shaped, const String &p_text, const List<RID> &p_font, float p_size, const String &p_features = "", const String &p_locale = ""); // Add text and object to span stack, lazy
virtual bool shaped_add_object(RID p_shaped, Variant p_id, const Size2 &p_size, VAlign p_inline_align); // Add inline object
virtual RID shaped_create_substr(RID p_shaped, int p_start, int p_length) const; // Get shaped substring (e.g for line breaking)
virtual Vector<Grapheme> shaped_get_graphemes(RID p_shaped) const; // Returns graphemes as is or BiDi reorders them for the line if range is specified. Graphemes are returned in visual (LTR) order. Returned graphemes should be usable in place of characters for most UI use cases, without massive code changes.
virtual TextDirection shaped_get_direction(RID p_shaped) const; // Returns detected base direction of the string if it was shaped with AUTO direction.
virtual Vector<Vector2i> shaped_get_line_breaks(RID p_shaped, float p_width, /*TextBreak*/ uint8_t p_break_mode) const; // Returns line ranges, ranges can be directly used with get_graphemes function to render multiline text.
virtual Rect2 shaped_get_object_rect(RID p_shaped, Variant p_id) const;
virtual Size2 shaped_get_size(RID p_shaped) const;
virtual float shaped_get_ascent(RID p_shaped) const; // For some languages, graphemes can be offset from the base line significantly, these functions should return maximum ascent and descent, though for most cases using font ascent/descent is OK.
virtual float shaped_get_descent(RID p_shaped) const; // Also, can include size of inline objects.
virtual float shaped_get_line_spacing(RID p_shaped) const; // Offset to the next line (in the direction specified by text orientation)
virtual float shaped_fit_to_width(RID p_shaped, float p_width, /*TextJustification*/ uint8_t p_justification_mode) const; // Adjusts spaces and elongations in the line to fit it to the specified width, returns line width after adjustment.
// Shaped Text Buffer helpers for input controls
virtual Vector<Caret> shaped_get_carets(RID p_shaped, int p_pos) const;
virtual Vector<Rect2> shaped_get_selection(RID p_shaped, int p_start, int p_end) const;
virtual int shaped_hit_test(RID p_shaped, const Vector2 &p_coords) const;
// String API
virtual bool string_get_word(const String &p_string, int p_offset, int &r_beg, int &r_end) const;
virtual bool string_get_line(const String &p_string, int p_offset, int &r_beg, int &r_end) const;
virtual int caret_advance(const String &p_string, int p_value, TextCaretMove p_type) const;
virtual bool is_uppercase(char32_t p_char) const;
virtual bool is_lowercase(char32_t p_char) const;
virtual bool is_titlecase(char32_t p_char) const;
virtual bool is_digit(char32_t p_char) const;
virtual bool is_alphanumeric(char32_t p_char) const;
virtual bool is_punctuation(char32_t p_char) const;
virtual char32_t to_lowercase(char32_t p_char) const;
virtual char32_t to_uppercase(char32_t p_char) const;
virtual char32_t to_titlecase(char32_t p_char) const;
virtual int32_t to_digit(char32_t p_char, int p_radix) const;
// Common
virtual bool load_data(const String &p_filename); // Load custom ICU data file.
virtual void free(RID p_rid);
};
Core changes (Font)
class FontData : public Resource {
GDCLASS(FontData, Resource);
protected:
static void _bind_methods();
public:
RID get_rid() const;
bool load_system(const String &p_name);
bool load_resource(const String &p_filename);
bool load_memory(const Vector<uint8_t> &p_data);
float get_height(float p_size) const;
float get_ascent(float p_size) const;
float get_descent(float p_size) const;
float get_underline_position(float p_size) const;
float get_underline_thickness(float p_size) const;
bool has_feature(Feature p_feature) const;
bool language_supported(const String &p_locale) const;
bool script_supported(const String &p_script) const;
void font_draw_glyph(RID p_canvas, float p_size, const Vector2 &p_pos, uint32_t p_index, const Color &p_color) const;
void font_draw_glyph_outline(RID p_canvas, float p_size, const Vector2 &p_pos, uint32_t p_index, const Color &p_color) const;
void font_draw_invalid_glyph(RID p_canvas, float p_size, const Vector2 &p_pos, uint32_t p_index, const Color &p_color) const;
};
class Font : public Resource {
GDCLASS(Font, Resource);
protected:
static void _bind_methods();
public:
float get_height(float p_size) const;
float get_ascent(float p_size) const;
float get_descent(float p_size) const;
float get_underline_position(float p_size) const;
float get_underline_thickness(float p_size) const;
Size2 get_string_size(const String &p_string, float p_size) const;
Size2 get_wordwrap_string_size(const String &p_string, float p_size, float p_width) const;
void draw(RID p_canvas_item, float p_size, const Point2 &p_pos, const String &p_text, const Color &p_modulate = Color(1, 1, 1), int p_clip_w = -1, const Color &p_outline_modulate = Color(1, 1, 1)) const;
void draw_halign(RID p_canvas_item, float p_size, const Point2 &p_pos, const String &p_text, HAlign p_align, float p_width, const Color &p_modulate = Color(1, 1, 1), const Color &p_outline_modulate = Color(1, 1, 1)) const;
void draw_wordwrap(RID p_canvas_item, float p_size, const Point2 &p_pos, const String &p_text, HAlign p_align, float p_width, const Color &p_modulate = Color(1, 1, 1), const Color &p_outline_modulate = Color(1, 1, 1)) const;
void add_data(const Ref<FontData> &p_data);
void set_data(int p_idx, const Ref<FontData> &p_data);
void set_data_language_support_override(int p_idx, const Vector<String> &p_locales);
void set_data_script_support_override(int p_idx, const Vector<String> &p_scripts);
int get_data_count() const;
Ref<FontData> get_data(int p_idx) const;
void remove_data(int p_idx);
List<RID> get_data_for_locale(const String &p_locale, const String &p_script);
};
Core add (ShapedString/ShapedText)
class ShapedText : public Reference {
GDCLASS(ShapedText, Reference);
protected:
void set_direction(TextDirection p_direction);
TextDirection get_direction() const;
void set_orientation(TextOrientation p_orientation);
TextOrientation get_orientation() const;
bool add_text(const String &p_text, const Ref<Font> &p_font, float p_size, const String &p_features = "", const String &p_locale = "");
bool add_object(Variant p_id, const Size2 &p_size, VAlign p_inline_align);
Ref<ShapedText> substr(int p_start, int p_length) const;
Vector<Grapheme> get_graphemes() const;
Vector<Ref<ShapedText>> break_lines(float p_width, /*TextBreak*/ uint8_t p_break_mode) const;
Rect2 get_object_rect(Variant p_id) const;
Size2 get_size() const;
float get_ascent() const;
float get_descent() const;
float get_line_spacing() const;
float fit_to_width(float p_width, /*TextJustification*/ uint8_t p_justification_mode) const;
void draw(RID p_canvas_item, const Point2 &p_pos, const Color &p_modulate = Color(1, 1, 1), const Color &p_outline_modulate = Color(1, 1, 1)) const;
Vector<Caret> get_carets(int p_pos) const;
Vector<Rect2> get_selection(int p_start, int p_end) const;
int hit_test(const Vector2 &p_coords) const;
};
Control changes
Cases of
Cases of
Cases of
Export
Automatically include the ICU database in the exported project.
|
There's this asset that I wrote, which adds a new label node with simple BiDi reordering and Arabic shaping. It's written in GDScript, so it's probably not useful here, but perhaps something similar could be made for a fallback module, if any. |
Proposal looks great, Love the idea of having a TextServer and optionally making use of platform implementations to avoid having to include ICU or similar in all the export templates. |
My only feedback is that I am not sure if it's worth having String as UTF-16 (as opposed to just using UCS-4 everywhere). Nowadays platforms have too much memory to make it worth saving it on this, and strings will never take up that much space. |
The only reason for UTF-16 is ICU, which is using it for its APIs. |
@bruvzg but most string manipulation in Godot assumes UCS, from parsers to text editors and all other stuff, so I feel it may be a better idea to, worst case, just convert to UTF-16 when calling ICU. I am not sure if this has a cost other than converting the string, though. |
Converting should be fast, and ICU has its own API for it. I guess we can go with UTF-32 (and only convert it to UTF-16 to get BiDi runs). If it's gonna cause too much trouble, moving to UTF-16 after the rest of the CTL stuff is implemented won't be any harder than doing it first. And some ICU APIs have already moved to the extendable UText abstraction layer, which can be used with UTF-32 strings directly (BiDi is currently not one of them, but eventually we'll be able to get rid of the conversion). |
Done some testing, conversion cost is quite low, going with UTF-32 should be fine. Tests done with the Noto font sample texts (about 11 % of the strings have characters outside BMP).
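For illustration, the conversion being measured amounts to roughly the loop below. This is a standalone sketch using standard library strings, not Godot's `String` and not ICU's own conversion API; the function name and the choice to replace invalid code points with U+FFFD are assumptions.

```cpp
#include <string>

// Converts UTF-32 to UTF-16, e.g. right before passing a paragraph to a UTF-16-only API.
// Lone surrogates and values above 0x10FFFF are replaced with U+FFFD.
static inline std::u16string utf32_to_utf16(const std::u32string &p_text) {
    std::u16string out;
    out.reserve(p_text.size()); // Exact when every character is inside the BMP.
    for (char32_t c : p_text) {
        if ((c >= 0xD800 && c <= 0xDFFF) || c > 0x10FFFF) {
            out.push_back(char16_t(0xFFFD));
        } else if (c < 0x10000) {
            out.push_back(char16_t(c)); // BMP: one code unit.
        } else {
            c -= 0x10000; // Supplementary plane: surrogate pair.
            out.push_back(char16_t(0xD800 + (c >> 10)));
            out.push_back(char16_t(0xDC00 + (c & 0x3FF)));
        }
    }
    return out;
}
```

The work is a single linear pass with at most two code units written per character, which is consistent with the low cost reported above.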
|
sounds great then! |
Where would I plug in (multichannel) signed distance field font atlases for normal and complex layouts? Edited: https://github.com/fire/godot/tree/msdf-oct-2020 and the shader at https://github.com/V-Sekai/godot-msdf-project Edited: I don't know which proposal describes this feature. I saw a paragraph on it.
|
Proposal is implemented in godotengine/godot#41100 and godotengine/godot#42595. |
Godot resolved the hindering issue 2 years ago. Godot 4 ships by default with ligatures. Issue godotengine/godot-proposals#1180
This proposal is a follow-up to #4, to get community feedback/preferences on specific parts of the CTL support implementation.
Describe the project you are working on:
The Godot Engine
Describe the problem or limitation you are having in your project:
Currently, text display is extremely limited, and only supports simple, left-to-right scripts.
Describe the feature / enhancement and how it helps to overcome the problem or limitation:
Proper display of the text requires multiple steps to be done:
🔹 BiDi reordering (placing parts of the text as they are displayed) should be done on the whole paragraph of text, i.e. any part of the text that is logically independent of the rest.
🔹 Shaping (choosing context dependent glyphs from the font and their relative positions).
🔹 Since text in each single line should maintain logical order, breaking is done on non-reordered text (but it should be shaped, and shaping requires direction to be known, hence text is temporarily reordered back for breaking).
Then each line is reordered again (using a slightly different algorithm); there's no need to shape it again, results can be taken from step 2.
🔹 Optionally, some advanced techniques can be used for line justification, but just expanding spaces should be OK in general.
🔹 For some types of data (urls/emails/source code) each part should be processed separately.
🔹 Doing these steps is quite expensive, so the results should probably be cached, and results of steps 1 and 2 can be reused for steps 3 and 4 (e.g. when resizing controls).
🔹 macOS and Windows have powerful built-in BiDi/shaping engines (CoreText and DirectWrite), and there are open source solutions for both: FriBidi (LGPL) and ICU (MIT-like license) for BiDi (ICU is quite big, but also provides tons of potentially useful i18n stuff) and HarfBuzz for shaping (MIT, AFAIK there are no alternatives).
🔹 Most shapers only support widely used languages; for more exotic ones SIL Graphite (MPL2 or LGPL) can be used (the shaping engine for the font is integrated as bytecode into the font itself), which can be used as a backend for HarfBuzz.
🔹 For a cross-platform engine, ICU+HarfBuzz+Graphite seems to be the most logical choice, but we should probably have some way to plug in custom platform-specific implementations.
🔹 The majority of games do not need any dynamic text (or dynamic fonts), everything can be pre-rendered as images, so both (Dynamic font and CTL) should probably be optional modules to avoid wasting space.
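As a rough illustration of step 3 (breaking on logical-order text while reusing the shaping results from step 2), a greedy word-boundary breaker might look like the sketch below. `ShapedGrapheme` and `break_lines` are hypothetical stand-ins, not part of the proposed API, and real break opportunities would come from a proper break iterator rather than a per-grapheme flag.

```cpp
#include <utility>
#include <vector>

// Hypothetical, simplified shaped grapheme, stored in logical (non-reordered) order.
struct ShapedGrapheme {
    float advance; // Horizontal advance taken from the shaping step.
    bool is_space; // True if a line may be broken right after this grapheme.
};

// Greedy line breaking: returns [start, end) grapheme ranges, one per line.
// A word wider than p_width is left overflowing; no grapheme-level breaking is attempted.
static inline std::vector<std::pair<int, int>> break_lines(const std::vector<ShapedGrapheme> &p_graphemes, float p_width) {
    std::vector<std::pair<int, int>> lines;
    const int count = (int)p_graphemes.size();
    int line_start = 0;
    int last_break = -1; // Index just past the most recent break opportunity.
    float width = 0.f;
    for (int i = 0; i < count; i++) {
        width += p_graphemes[i].advance;
        if (width > p_width && last_break > line_start) {
            lines.push_back({ line_start, last_break });
            line_start = last_break;
            // Re-measure the graphemes carried over to the new line.
            width = 0.f;
            for (int j = line_start; j <= i; j++) {
                width += p_graphemes[j].advance;
            }
        }
        if (p_graphemes[i].is_space) {
            last_break = i + 1;
        }
    }
    if (line_start < count) {
        lines.push_back({ line_start, count });
    }
    return lines;
}
```

Because the function only reads cached advances, it can be re-run cheaply whenever the available width changes, without repeating BiDi analysis or shaping. Each returned range would then be reordered for display on its own.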
Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:
What's the best way to implement it?
Core only, module or GDNative?
🔹 Built-in CTL as the only way to display text.
🔹 Built-in CTL module that can be disabled and the simple fallback module (handling text as it is done now) and GDNative for custom implementations.
🔹 Built-in simple fallback and GDNative only CTL (include dynamic library with the editor and export templates).
Module or GDNative providing what?:
🔹 Only BiDi/shaping APIs.
🔹 Both BiDi/shaping and font implementations. (Simple + BitmapFont/CTL + DynamicFont)
API (Base), How low or high level should it be?
🔹 Low level functions (do all the steps in Font->draw and complex controls):
Or have an abstraction to expose all internal implementation structures (e.g. BiDi context and shaping buffers).
🔹 Do BiDi and shaping in one step, but expose the results (see #4 (comment))
🔹 High level, use a ShapedString structure containing both input and output data as a single entity, w/o the need to directly access the underlying low-level stuff, with lazy BiDi/shaping and caching. (see #4 (comment))
🔹 High level, use a ShapedParagraph structure handling all layout features for whole paragraphs at once.
🔹 Use both ShapedString and Paragraph as a higher level helper for multiline controls. (see https://github.com/bruvzg/godot_tl)
🔹 Which part of the API should handle caching? Controls and Font or module functions?
🔹 Something else?
API (Text input, cursor/selection control), should be handled by controls or module?
🔹 Only complex, font specific functions (e.g. ligature cursors), do everything else in the controls.
🔹 Common cursor control API for all controls (e.g. ShapedString->move_caret(CursorPos, +/- Magnitude, Type WORD/CHAR/LINE/PARA) -> CursorPos, ShapedString->hit_test(..., Coords) -> CursorPos).
🔹 Something else?
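To make the hit_test idea concrete, here is a minimal single-line sketch that maps an x coordinate to a caret position. `HitGrapheme` and this `hit_test` signature are hypothetical simplifications of the proposal's structures; vertical layouts and multi-line text are ignored.

```cpp
#include <vector>

// Hypothetical stand-in: one entry per grapheme, stored in visual left-to-right order.
struct HitGrapheme {
    float advance; // Horizontal advance.
    int start; // First character index covered in the source string.
    int end; // One past the last character index covered (may span several characters for ligatures).
    bool rtl; // True if the grapheme belongs to a right-to-left run.
};

// Returns the caret position (character index) closest to p_x.
static inline int hit_test(const std::vector<HitGrapheme> &p_graphemes, float p_x) {
    if (p_graphemes.empty()) {
        return 0;
    }
    float x = 0.f;
    for (const HitGrapheme &g : p_graphemes) {
        if (p_x < x + g.advance) {
            const bool left_half = p_x < x + g.advance * 0.5f;
            // In an RTL run the logical start of a grapheme sits on its right edge.
            return (left_half != g.rtl) ? g.start : g.end;
        }
        x += g.advance;
    }
    // Past the end of the line: snap to the right edge of the last grapheme.
    const HitGrapheme &last = p_graphemes.back();
    return last.rtl ? last.start : last.end;
}
```

Keeping this logic next to the shaped data means every control gets the same behaviour for mixed-direction text, which is the argument for a common cursor API rather than per-control implementations.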
API (Font, Canvas)
Currently, we have duplicate string drawing functions both in Canvas and Font, do we need both?
If this enhancement will not be used often, can it be worked around with a few lines of script?:
It will be used to draw all text in the editor and exported apps.
Is there a reason why this should be core and not an add-on in the asset library?:
The main implementation can and probably should be a module, with support for custom GDNative implementations, but substantial changes to the core are required anyway.