Replace CodepointWidthDetector's runtime table with a static one #2368
This pull request replaces CodepointWidthDetector's dynamically-generated map with a static constexpr one that's compiled into the binary.

It also almost totally removes the notion of an `Invalid` width. We definitely had gaps in our character coverage where we'd report a character as invalid, but we'd then flatten that down to `Narrow` when asked. By combining the not-present state and the narrow state, we get to save a significant chunk of data.

I've tested this by feeding it all 0x10FFFF codepoints (and then some) and making sure they 100% match the old code's outputs.

| Metric                       | Then          | Now            |
|------------------------------|---------------|----------------|
| disk space                   | 56k (`.text`) | 3k (`.rdata`)  |
| runtime memory (allocations) | 1088          | 0              |
| runtime memory (bytes)       | 51k           | ~0             |
| memory behavior              | not shared    | fully shared   |
| lookup time                  | ~31ns         | ~9ns           |
| first hit penalty            | ~170000ns     | 0ns            |
| lines of code                | 1088          | 285            |
| clarity                      | extreme       | slightly worse |

I also took a moment and cleaned up a stray boolean that we didn't need.
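To make the idea concrete, here is a minimal sketch of a static constexpr range table with a binary-search lookup in which "not present" and "narrow" are the same answer. The names `Width`, `Range`, `kRanges`, and `LookupWidth` are hypothetical, and the four ranges are a tiny hand-picked sample, not the table this change generates.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

// Hypothetical names for illustration; not the identifiers used in the PR.
enum class Width : std::uint8_t { Narrow, Wide, Ambiguous };

struct Range {
    char32_t lower;  // first codepoint in the range (inclusive)
    char32_t upper;  // last codepoint in the range (inclusive)
    Width width;
};

// Sorted by lower bound and non-overlapping. Because the array is constexpr,
// it needs no run-time construction and no heap allocation.
static constexpr Range kRanges[] = {
    {0x00A1, 0x00A1, Width::Ambiguous},
    {0x1100, 0x115F, Width::Wide},
    {0x3041, 0x3096, Width::Wide},
    {0xAC00, 0xD7A3, Width::Wide},
};

// Finds the first range whose upper bound is >= cp, then checks that cp is
// not below that range's lower bound. Anything that falls in a gap is
// reported as Narrow, so no separate "invalid" state is needed.
inline Width LookupWidth(char32_t cp) {
    const auto it = std::lower_bound(
        std::begin(kRanges), std::end(kRanges), cp,
        [](const Range& r, char32_t value) { return r.upper < value; });
    if (it != std::end(kRanges) && it->lower <= cp) {
        return it->width;
    }
    return Width::Narrow;
}
```

With this sample table, `LookupWidth(0x1100)` reports `Wide`, while `LookupWidth(U'A')` falls into a gap and reports `Narrow`.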
I played around with packing that struct (21 bits for each bound, 2 bits for the width) and it gets really interesting. It's even smaller (of course), but the lookup time across all 0x10FFFF codepoints goes up to 12ns/op instead of 9ns/op, and each entry only goes down to 8 bytes (from 12). \shrug/
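For reference, one way such a packing could look is sketched below. `PackedRange` and `kMask21` are hypothetical names, and the bit layout (lower bound in bits 0-20, upper bound in bits 21-41, width in bits 42-43) is an assumption chosen for illustration; the comment above does not say how the fields were actually arranged.

```cpp
#include <cstdint>

// 21 bits are enough for any codepoint up to 0x10FFFF.
constexpr std::uint64_t kMask21 = (std::uint64_t{1} << 21) - 1;
static_assert(0x10FFFF <= kMask21, "21 bits must cover the Unicode range");

// Hypothetical packed entry: two 21-bit bounds and a 2-bit width in 64 bits.
class PackedRange {
public:
    constexpr PackedRange(char32_t lower, char32_t upper, unsigned width)
        : bits_((static_cast<std::uint64_t>(lower) & kMask21) |
                ((static_cast<std::uint64_t>(upper) & kMask21) << 21) |
                ((static_cast<std::uint64_t>(width) & 0x3) << 42)) {}

    // Every accessor needs a shift and a mask, which a plain struct avoids.
    constexpr char32_t lower() const {
        return static_cast<char32_t>(bits_ & kMask21);
    }
    constexpr char32_t upper() const {
        return static_cast<char32_t>((bits_ >> 21) & kMask21);
    }
    constexpr unsigned width() const {
        return static_cast<unsigned>((bits_ >> 42) & 0x3);
    }

private:
    std::uint64_t bits_;
};

static_assert(sizeof(PackedRange) == 8, "one entry fits in 8 bytes");
```

The extra shift and mask on every comparison is one plausible reason for a slower lookup, though the measurements above do not attribute the difference to any particular cause.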
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
🚀
@DHowett-MSFT If each UnicodeRange no longer carries a CodepointWidth field, and the codepoint ranges for each width are kept in separate tables instead, I think the following code could reduce the size on disk:
// https://github.com/microsoft/terminal/blob/734fc1dcc6de4315d4cc91944c5ea83b7b8a7e1a/src/types/CodepointWidthDetector.cpp
#ifndef BELA_UCWIDTH_HPP
#define BELA_UCWIDTH_HPP
#include <cstddef>
#include <iterator>
namespace bela::unicode {
struct interval final {
  char32_t first;
  char32_t last;
};
inline static bool operator<(const interval &range, const char32_t searchTerm) {
  return range.last < searchTerm;
}
// generated from
// http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt
static constexpr interval AmbiguousTable[] = {
{0xa1, 0xa1}, {0xa4, 0xa4}, {0xa7, 0xa8},
{0xaa, 0xaa}, {0xad, 0xae}, {0xb0, 0xb4},
{0xb6, 0xba}, {0xbc, 0xbf}, {0xc6, 0xc6},
{0xd0, 0xd0}, {0xd7, 0xd8}, {0xde, 0xe1},
{0xe6, 0xe6}, {0xe8, 0xea}, {0xec, 0xed},
{0xf0, 0xf0}, {0xf2, 0xf3}, {0xf7, 0xfa},
{0xfc, 0xfc}, {0xfe, 0xfe}, {0x101, 0x101},
{0x111, 0x111}, {0x113, 0x113}, {0x11b, 0x11b},
{0x126, 0x127}, {0x12b, 0x12b}, {0x131, 0x133},
{0x138, 0x138}, {0x13f, 0x142}, {0x144, 0x144},
{0x148, 0x14b}, {0x14d, 0x14d}, {0x152, 0x153},
{0x166, 0x167}, {0x16b, 0x16b}, {0x1ce, 0x1ce},
{0x1d0, 0x1d0}, {0x1d2, 0x1d2}, {0x1d4, 0x1d4},
{0x1d6, 0x1d6}, {0x1d8, 0x1d8}, {0x1da, 0x1da},
{0x1dc, 0x1dc}, {0x251, 0x251}, {0x261, 0x261},
{0x2c4, 0x2c4}, {0x2c7, 0x2c7}, {0x2c9, 0x2cb},
{0x2cd, 0x2cd}, {0x2d0, 0x2d0}, {0x2d8, 0x2db},
{0x2dd, 0x2dd}, {0x2df, 0x2df}, {0x300, 0x36f},
{0x391, 0x3a1}, {0x3a3, 0x3a9}, {0x3b1, 0x3c1},
{0x3c3, 0x3c9}, {0x401, 0x401}, {0x410, 0x44f},
{0x451, 0x451}, {0x2010, 0x2010}, {0x2013, 0x2016},
{0x2018, 0x2019}, {0x201c, 0x201d}, {0x2020, 0x2022},
{0x2024, 0x2027}, {0x2030, 0x2030}, {0x2032, 0x2033},
{0x2035, 0x2035}, {0x203b, 0x203b}, {0x203e, 0x203e},
{0x2074, 0x2074}, {0x207f, 0x207f}, {0x2081, 0x2084},
{0x20ac, 0x20ac}, {0x2103, 0x2103}, {0x2105, 0x2105},
{0x2109, 0x2109}, {0x2113, 0x2113}, {0x2116, 0x2116},
{0x2121, 0x2122}, {0x2126, 0x2126}, {0x212b, 0x212b},
{0x2153, 0x2154}, {0x215b, 0x215e}, {0x2160, 0x216b},
{0x2170, 0x2179}, {0x2189, 0x2189}, {0x2190, 0x2199},
{0x21b8, 0x21b9}, {0x21d2, 0x21d2}, {0x21d4, 0x21d4},
{0x21e7, 0x21e7}, {0x2200, 0x2200}, {0x2202, 0x2203},
{0x2207, 0x2208}, {0x220b, 0x220b}, {0x220f, 0x220f},
{0x2211, 0x2211}, {0x2215, 0x2215}, {0x221a, 0x221a},
{0x221d, 0x2220}, {0x2223, 0x2223}, {0x2225, 0x2225},
{0x2227, 0x222c}, {0x222e, 0x222e}, {0x2234, 0x2237},
{0x223c, 0x223d}, {0x2248, 0x2248}, {0x224c, 0x224c},
{0x2252, 0x2252}, {0x2260, 0x2261}, {0x2264, 0x2267},
{0x226a, 0x226b}, {0x226e, 0x226f}, {0x2282, 0x2283},
{0x2286, 0x2287}, {0x2295, 0x2295}, {0x2299, 0x2299},
{0x22a5, 0x22a5}, {0x22bf, 0x22bf}, {0x2312, 0x2312},
{0x2460, 0x24e9}, {0x24eb, 0x254b}, {0x2550, 0x2573},
{0x2580, 0x258f}, {0x2592, 0x2595}, {0x25a0, 0x25a1},
{0x25a3, 0x25a9}, {0x25b2, 0x25b3}, {0x25b6, 0x25b7},
{0x25bc, 0x25bd}, {0x25c0, 0x25c1}, {0x25c6, 0x25c8},
{0x25cb, 0x25cb}, {0x25ce, 0x25d1}, {0x25e2, 0x25e5},
{0x25ef, 0x25ef}, {0x2605, 0x2606}, {0x2609, 0x2609},
{0x260e, 0x260f}, {0x261c, 0x261c}, {0x261e, 0x261e},
{0x2640, 0x2640}, {0x2642, 0x2642}, {0x2660, 0x2661},
{0x2663, 0x2665}, {0x2667, 0x266a}, {0x266c, 0x266d},
{0x266f, 0x266f}, {0x269e, 0x269f}, {0x26bf, 0x26bf},
{0x26c6, 0x26cd}, {0x26cf, 0x26d3}, {0x26d5, 0x26e1},
{0x26e3, 0x26e3}, {0x26e8, 0x26e9}, {0x26eb, 0x26f1},
{0x26f4, 0x26f4}, {0x26f6, 0x26f9}, {0x26fb, 0x26fc},
{0x26fe, 0x26ff}, {0x273d, 0x273d}, {0x2776, 0x277f},
{0x2b56, 0x2b59}, {0x3248, 0x324f}, {0xe000, 0xf8ff},
{0xfe00, 0xfe0f}, {0xfffd, 0xfffd}, {0x1f100, 0x1f10a},
{0x1f110, 0x1f12d}, {0x1f130, 0x1f169}, {0x1f170, 0x1f18d},
{0x1f18f, 0x1f190}, {0x1f19b, 0x1f1ac}, {0xe0100, 0xe01ef},
{0xf0000, 0xffffd}, {0x100000, 0x10fffd}};
static constexpr interval WideTable[] = {
{0x1100, 0x115f}, {0x231a, 0x231b}, {0x2329, 0x232a},
{0x23e9, 0x23ec}, {0x23f0, 0x23f0}, {0x23f3, 0x23f3},
{0x25fd, 0x25fe}, {0x2614, 0x2615}, {0x2648, 0x2653},
{0x267f, 0x267f}, {0x2693, 0x2693}, {0x26a1, 0x26a1},
{0x26aa, 0x26ab}, {0x26bd, 0x26be}, {0x26c4, 0x26c5},
{0x26ce, 0x26ce}, {0x26d4, 0x26d4}, {0x26ea, 0x26ea},
{0x26f2, 0x26f3}, {0x26f5, 0x26f5}, {0x26fa, 0x26fa},
{0x26fd, 0x26fd}, {0x2705, 0x2705}, {0x270a, 0x270b},
{0x2728, 0x2728}, {0x274c, 0x274c}, {0x274e, 0x274e},
{0x2753, 0x2755}, {0x2757, 0x2757}, {0x2795, 0x2797},
{0x27b0, 0x27b0}, {0x27bf, 0x27bf}, {0x2b1b, 0x2b1c},
{0x2b50, 0x2b50}, {0x2b55, 0x2b55}, {0x2e80, 0x2e99},
{0x2e9b, 0x2ef3}, {0x2f00, 0x2fd5}, {0x2ff0, 0x2ffb},
{0x3000, 0x303e}, {0x3041, 0x3096}, {0x3099, 0x30ff},
{0x3105, 0x312e}, {0x3131, 0x318e}, {0x3190, 0x31ba},
{0x31c0, 0x31e3}, {0x31f0, 0x321e}, {0x3220, 0x3247},
{0x3250, 0x32fe}, {0x3300, 0x4dbf}, {0x4e00, 0xa48c},
{0xa490, 0xa4c6}, {0xa960, 0xa97c}, {0xac00, 0xd7a3},
{0xf900, 0xfaff}, {0xfe10, 0xfe19}, {0xfe30, 0xfe52},
{0xfe54, 0xfe66}, {0xfe68, 0xfe6b}, {0xff01, 0xff60},
{0xffe0, 0xffe6}, {0x16fe0, 0x16fe1}, {0x17000, 0x187ec},
{0x18800, 0x18af2}, {0x1b000, 0x1b11e}, {0x1b170, 0x1b2fb},
{0x1f004, 0x1f004}, {0x1f0cf, 0x1f0cf}, {0x1f18e, 0x1f18e},
{0x1f191, 0x1f19a}, {0x1f200, 0x1f202}, {0x1f210, 0x1f23b},
{0x1f240, 0x1f248}, {0x1f250, 0x1f251}, {0x1f260, 0x1f265},
{0x1f300, 0x1f320}, {0x1f32d, 0x1f335}, {0x1f337, 0x1f37c},
{0x1f37e, 0x1f393}, {0x1f3a0, 0x1f3ca}, {0x1f3cf, 0x1f3d3},
{0x1f3e0, 0x1f3f0}, {0x1f3f4, 0x1f3f4}, {0x1f3f8, 0x1f43e},
{0x1f440, 0x1f440}, {0x1f442, 0x1f4fc}, {0x1f4ff, 0x1f53d},
{0x1f54b, 0x1f54e}, {0x1f550, 0x1f567}, {0x1f57a, 0x1f57a},
{0x1f595, 0x1f596}, {0x1f5a4, 0x1f5a4}, {0x1f5fb, 0x1f64f},
{0x1f680, 0x1f6c5}, {0x1f6cc, 0x1f6cc}, {0x1f6d0, 0x1f6d2},
{0x1f6eb, 0x1f6ec}, {0x1f6f4, 0x1f6f8}, {0x1f910, 0x1f93e},
{0x1f940, 0x1f94c}, {0x1f950, 0x1f96b}, {0x1f980, 0x1f997},
{0x1f9c0, 0x1f9c0}, {0x1f9d0, 0x1f9e6}, {0x20000, 0x2fffd},
{0x30000, 0x3fffd}};
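// Binary search over a sorted, non-overlapping interval table.
// Note: `max` is the index of the last entry, not the element count.
inline bool bisearch(char32_t ch, const interval *table, std::size_t max) {
  std::size_t min = 0;
  // Rejecting out-of-range input up front also means `mid - 1` below is
  // never evaluated with mid == 0, so the unsigned `max` cannot wrap around.
  if (ch < table[0].first || ch > table[max].last) {
    return false;
  }
  while (max >= min) {
    const std::size_t mid = min + (max - min) / 2;
    if (ch > table[mid].last) {
      min = mid + 1;
      continue;
    }
    if (ch < table[mid].first) {
      max = mid - 1;
      continue;
    }
    return true;
  }
  return false;
}
// Returns 2 for entries in WideTable, 0 for entries in AmbiguousTable,
// and 1 for everything else.
inline std::size_t CalculateWidthInternal(char32_t ch) {
  if (bisearch(ch, WideTable, std::size(WideTable) - 1)) {
    return 2;
  }
  if (bisearch(ch, AmbiguousTable, std::size(AmbiguousTable) - 1)) {
    return 0;
  }
  return 1;
}
} // namespace bela::unicode
#endif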
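The header above declares an `operator<` between an interval and a codepoint but never calls it, since `bisearch` compares the bounds by hand. As a sketch of what that overload makes possible, the same membership test can be written with `std::lower_bound`. The `sketch` namespace, `SampleTable`, and `contains` are hypothetical names, and the three intervals are only a sample copied from the start of `WideTable`.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

namespace sketch {

struct interval {
    char32_t first;
    char32_t last;
};

// Same shape as the overload in the header above: an interval orders before
// a codepoint when the whole interval lies below that codepoint.
constexpr bool operator<(const interval& range, char32_t searchTerm) {
    return range.last < searchTerm;
}

// Small sample for illustration; must be sorted and non-overlapping.
static constexpr interval SampleTable[] = {
    {0x1100, 0x115f}, {0x231a, 0x231b}, {0x2329, 0x232a}};

// std::lower_bound uses the operator< above (found by argument-dependent
// lookup) to locate the first interval whose `last` is >= ch. The codepoint
// is in the table only if it is also >= that interval's `first`.
template <std::size_t N>
bool contains(const interval (&table)[N], char32_t ch) {
    const auto it = std::lower_bound(std::begin(table), std::end(table), ch);
    return it != std::end(table) && it->first <= ch;
}

}  // namespace sketch
```

Taking the array by reference lets the compiler deduce its size, which avoids having to pass `std::size(table) - 1` at each call site.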
This commit replaces CodepointWidthDetector's dynamically-generated map with a static constexpr one that's compiled into the binary.

It also almost totally removes the notion of an `Invalid` width. We definitely had gaps in our character coverage where we'd report a character as invalid, but we'd then flatten that down to `Narrow` when asked. By combining the not-present state and the narrow state, we get to save a significant chunk of data.

I've tested this by feeding it all 0x10FFFF codepoints (and then some) and making sure they 100% match the old code's outputs.

| Metric                       | Then          | Now            |
|------------------------------|---------------|----------------|
| disk space                   | 56k (`.text`) | 3k (`.rdata`)  |
| runtime memory (allocations) | 1088          | 0              |
| runtime memory (bytes)       | 51k           | ~0             |
| memory behavior              | not shared    | fully shared   |
| lookup time                  | ~31ns         | ~9ns           |
| first hit penalty            | ~170000ns     | 0ns            |
| lines of code                | 1088          | 285            |
| clarity                      | extreme       | slightly worse |

I also took a moment and cleaned up a stray boolean that we didn't need.

(cherry picked from commit 16e1e29)
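The exhaustive check mentioned in the commit message can be sketched as a differential test: run an old-style and a new-style lookup over every codepoint and count disagreements. Everything below is a hypothetical stand-in (`OldLookup`, `NewLookup`, `CountMismatches`, and the two sample ranges); the real test harness is not shown in this conversation.

```cpp
#include <cstddef>
#include <map>

enum class Width { Narrow, Wide, Invalid };

struct Range {
    char32_t lower;
    char32_t upper;
};

// Tiny sample of wide ranges, for illustration only.
static constexpr Range kWide[] = {{0x1100, 0x115F}, {0xAC00, 0xD7A3}};

// Stand-in for the old behavior: a map built at run time (upper -> lower),
// where a missing codepoint is Invalid and is then flattened to Narrow.
inline Width OldLookup(const std::map<char32_t, char32_t>& ranges, char32_t cp) {
    const auto it = ranges.lower_bound(cp);  // first range with upper >= cp
    const Width raw = (it != ranges.end() && it->second <= cp) ? Width::Wide
                                                               : Width::Invalid;
    return raw == Width::Invalid ? Width::Narrow : raw;
}

// Stand-in for the new behavior: static table, gaps are Narrow directly.
inline Width NewLookup(char32_t cp) {
    for (const Range& r : kWide) {
        if (cp >= r.lower && cp <= r.upper) {
            return Width::Wide;
        }
    }
    return Width::Narrow;
}

// Compares both lookups for every codepoint, plus a margin past 0x10FFFF,
// and returns how many inputs produced different answers.
inline std::size_t CountMismatches() {
    std::map<char32_t, char32_t> oldRanges;
    for (const Range& r : kWide) {
        oldRanges.emplace(r.upper, r.lower);
    }
    constexpr char32_t kLast = 0x10FFFF + 0x1000;
    std::size_t mismatches = 0;
    for (char32_t cp = 0; cp <= kLast; ++cp) {
        if (OldLookup(oldRanges, cp) != NewLookup(cp)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

A result of zero from `CountMismatches()` means the two stand-ins agree on every input, which is the property the commit message describes checking against the old code.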