-
-
Notifications
You must be signed in to change notification settings - Fork 21.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor how CowData
manages its allocations
#74582
base: master
Are you sure you want to change the base?
Conversation
75c68e7
to
345093c
Compare
I built Godot 4.1.2-rc with this PR and #77306, using g++-mingw-w64-x86-64-posix 10.3.0-14ubuntu1+24.3. Then I tested Terrain3D saving a large data set described in 62585 and our issue. I loaded up 16k x 8k of data, attempting to save 1.4gb memory (though we need 3gb now for 16k terrains, and will be increasing that later). I saved using the inspector, which just uses the ResourceSaver::save(), independent of our code. Upon saving, Godot did not crash so that's an improvement. However I get these errors:
And now my resource file is corrupt. Previously in 4.1.1-stable Godot crashed, but the file was untouched. |
Code-wise and rationale-wise, this looks good to me. Functionality-wise (perf, etc.) I would leave some benchmarking to others to really know how beneficial this is in real world cases. Besides, I've made a branch with this PR and one commit of mine on top to suggest those changes to be integrated here: https://github.com/RandomShaper/godot/commits/pr74582_suggestions. |
51f90f0
to
6979ae1
Compare
- Use an explicit type for the data prefix to enable easier changes. - Explicitly store the capacity instead of using the next power of two. - If resizing to a significantly larger size (more 2x the current capacity), assume that the caller has just resized to their final size and don't allocate any excess capacity. - Trim the capacity less aggressively to fix edge cases when repeatedly adding and removing elements causing extreme performance issues. - Do not rely on overflow checked operations to determine the maximum size, to avoid crashes with some compilers when using too large sizes. - On 64-bit system increase the the maximum size to 0x7ffffff0, generally by a factor of the size of the elements. - On 32-bit system the maximum size remains limited by the size of the address space / memory, but in some cases it may still be possible to allocate larger arrays compared to before. - Add a minimum capacity to remove some unnecessary reallocation for small arrays. - Remove a few unnecessary potential copies when resizing, adding or removing elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the slow review.
Just left a few minor suggestions.
Timeline wise, I think we're getting a bit too close to 4.3 beta to merge such a core change now, unless @godotengine/core members think it particularly safe or critical.
But I'd like to collect a few approvals from the team so we can merge early on for 4.4.
#define ERR_PROPAGATE(m_expr) \ | ||
if (true) { \ | ||
Error err = m_expr; \ | ||
if (unlikely(err)) { \ | ||
return err; \ | ||
} \ | ||
} else \ | ||
((void)0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs a dedicated macro just for 4 cases in this class. It's a common pattern that we fully spell out when it's used.
template <typename T> | ||
constexpr auto ALIGN_AT(const T m_number, const T m_alignment) { | ||
return ((m_number + (m_alignment - 1)) / m_alignment) * m_alignment; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only used once in CowData
, so I would keep it defined there, and likely directly inline.
int capacity; | ||
int size; | ||
}; | ||
static_assert(std::is_trivially_destructible<CowDataPrefix>::value, ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static_assert(std::is_trivially_destructible<CowDataPrefix>::value, ""); | |
static_assert(std::is_trivially_destructible<CowDataPrefix>::value, "CowDataPrefix should be trivially destructible."); |
return rc; | ||
|
||
CowDataPrefix *mem_new = (CowDataPrefix *)Memory::realloc_static(prefix, size_t(p_new_capacity) * sizeof(T) + prefix_offset()); | ||
ERR_FAIL_COND_V_MSG(!mem_new, ERR_OUT_OF_MEMORY, "Insufficient memory to allocate CowData"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ERR_FAIL_COND_V_MSG(!mem_new, ERR_OUT_OF_MEMORY, "Insufficient memory to allocate CowData"); | |
ERR_FAIL_NULL_V_MSG(mem_new, ERR_OUT_OF_MEMORY, "Insufficient memory to allocate CowData."); |
|
||
uint32_t *mem_new = (uint32_t *)Memory::alloc_static(_get_alloc_size(current_size), true); | ||
CowDataPrefix *mem_new = (CowDataPrefix *)Memory::alloc_static(size_t(p_capacity) * sizeof(T) + prefix_offset()); | ||
ERR_FAIL_COND_V_MSG(!mem_new, ERR_OUT_OF_MEMORY, "Insufficient memory to allocate CowData"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ERR_FAIL_COND_V_MSG(!mem_new, ERR_OUT_OF_MEMORY, "Insufficient memory to allocate CowData"); | |
ERR_FAIL_NULL_V_MSG(mem_new, ERR_OUT_OF_MEMORY, "Insufficient memory to allocate CowData."); |
@hpvb, this one has been up for quite long and I thought you may want to take a look. I don't quite remember the details, but it looked interesting. |
Awesome! I just found this randomly. Practically, I think it's unavoidable that we have an explicit @RedworkDE Are you still interested in maintaining this PR? I know it's been quite a while since it's been opened. |
Context: I was looking into making CowData use 64-bit indices to solve the various crashes / issues caused by too large arrays, and found that the internal structure of CowData is fundamentally not suited for such a thing (and also really weird) so I went and reworked how the internal data storage is managed.
To the outside this PR shouldn't change any behavior, except that allocating somewhat larger arrays works now (and also on MSVC there are no longer cases where operations return OK but actually failed).
There are a few more improvements that could / should be done, but I stopped before feature creep completely overtook this.
Needs #74555 to pass CI.What to do with the
ERR_PROPAGATE
macro?I added this to check an
Error
return value and to return it from the current method if it is not OK, is this usefull for other places in the engine as well and should be added to the other ERR macros? Should I just inline that check everywhere? Or can it just stay as is?What about alignment?
The old version of this just didn't care about alignment and just returned what it got from the allocator (which is malloc aligned + 16 bytes).
This assumes that the allocator pointer is aligned correctly and then offsets the first element by an multiple of the alignment, but I am not sure if the alignment is correct for types with large alignments, ie simd types. Should this be extended to ensure correct alignment for such types?
About the buffers in
max_size
:The 64 bytes buffer to filling the entire address space with one array is arbitrary and not going to matter beyond preventing overflows when computing the size of the allocation as way more than that are going to be filled with godot, so such allocations are going to fail anyways.
The 15 indices buffer is similarly arbitrary and serves to prevent overflows when computing
size()+1
and similar things.For reference in C# the maximum array size is
0x7fffffc7
bytes regardless of element size. (Buffer of 56).About that performance hole in the current implementation:
When repeatedly increasing and decreasing the size of the allocation across a power of 2, on every operation the entire backing array will be reallocated, causing (in constructed examples 1000x+) slowdowns compared to doing the same with just a few more or less elements.
Before: (on a production build)
After: (on a debug build)
(And no this didn't just move where this happens)
Allowed array sizes:
Previously (unpacked) arrays with 100 million + elements would fail to allocate / crash
Now all Arrays can have 2 billion elements (if you have the 50GB ram needed for that) and all failures will correctly return error codes.
Unfortunately allocating arrays with over 4 billion elements will appear to work, but the size actually gets truncated to 32bit and thus the result is actually smaller.
This happens at the gdscript to native boundary tho, so I am not sure if this can easily be fixed without breaking lots of things or actually making the index/size 64 bit large.See #74676Memory savings:
Because this PR doesn't over-allocate when resizing to the required maximum size, this saves quite a bit of memory.
https://github.com/Calinou/godot-reflection goes from 20MiB+9MiB (30% waste) to 20MiB+800KiB (4% waste)
See https://github.com/RedworkDE/godot/tree/resource-mon-cowdata and https://github.com/RedworkDE/godot/tree/resource-mon-cowdata-64 for adding a resource monitor for this.
Potential issues:
Potential advantages: