Add 64-bit support to Array underlying storage #12221
One of the simplest breaks becomes any program that is doing the following:

for (int i = 0; i < array.Length; i++)
{
}

This works fine for any existing program, since CoreCLR actually defines a limit that is just under int.MaxValue.
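To make the failure mode concrete, here is a small sketch (not from the thread) that uses a plain long as a stand-in for a hypothetical 64-bit Length:

long length = (long)int.MaxValue + 10;   // stand-in for an array longer than int.MaxValue
int i = int.MaxValue;

// The comparison promotes i to long, so 'i < length' is still true here...
Console.WriteLine(i < length);   // True

// ...but the increment happens in 32 bits and wraps around, so the classic loop
// never reaches 'length' and would start producing negative indexes instead.
unchecked { i++; }
Console.WriteLine(i);            // -2147483648
Console.WriteLine(i < length);   // True, so the loop keeps running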
I frequently run into this limit when processing large chunks of data from network streams into byte arrays, and I always need to implement chunking logic in order to be able to process the data.
@tannergooding That's why I proposed above to keep the existing behaviour of throwing OverflowException on Length when there are more than int.MaxValue elements. Nothing changes there. I'm suggesting that changing the CLR implementation as proposed above would allow applications and people who want to use large arrays to do so without breaking existing applications. You are right that simply passing a large array into a library that does not support it will break, but at least this will give us a choice. We need to start somewhere, and .NET cannot keep ignoring this problem.
Yesterday I happily created an array in Python which contained more than two billion elements. When can I do the same in .NET? Currently we get an exception if we try to construct an array with more than 2B elements. What's wrong with deferring that exception until something calls the Length property?
This is an interesting idea, but I wonder if it would be better to have a dedicated LargeArray<T> type instead.
@GrabYourPitchforks interesting facts about the GC. I wasn't aware of such limitations.
I guess I could stomach a LargeArray, but if that's going to be implemented then I hope it doesn't create a new type which is not actually an Array. IMHO it would have been much easier to address this if we had created another subtype of Array internally instead of inventing Span; however, Span also solves other problems...
These days 2GB arrays are barely enough for many applications to run reliably. RAM prices have stagnated for a few years now, but surely the industry will resolve this problem sooner or later. As RAM amounts resume increasing at a Moore's-law rate, this 2GB array issue will become very commonplace. Thinking in the very long term, it seems far better to find a way to fix Array itself.
If 64-bit support is implemented, there could be a tool that analyzes your code for legacy patterns (e.g. usage of int-based indexing and Length).
Since this would be such a massive ecosystem-breaking change, one other thing you'd probably have to do is analyze the entire graph of all dependencies your application consumes. If the change were made to Array itself, every one of those dependencies would need to be vetted for large-array safety.
Not if large array were an array, i.e. derived from it (if only internally, perhaps), like I suggested back in the Span threads.
If large array is a glorified array, then you could pass a large array into an existing API that accepts a normal array, and you'd end up right back with the problems as originally described in this thread.
Furthermore, I'm not sure I buy the argument that adding large array would bifurcate the ecosystem. The scenario for large array is that you're operating with enormous data sets (potentially over 2bn elements). By definition you wouldn't be passing this data to legacy APIs anyway since those APIs wouldn't know what to do with that amount of data. Since this scenario is so specialized it almost assumes that you've already accepted that you're limited to calling APIs which have been enlightened.
You have LongLength on the Array. The only fundamental difference is that one is on the LOH and one is not. By the same token, a span wouldn't be able to hold more than that either, so a large span would be needed as well...
From what I can gather and summarise from the above, there are two pieces of work.
But the more I think about it, the more I think this issue is missing the point and ultimately the real problem with .NET as it stands. Even if the initial proposal were implemented, it would only fix the immediate 64-bit issue with Array but still leave collections and Span with the same indexing and length limitations.

I've started reading The Rust Programming Language (kind of felt overdue) and it struck me that Rust also mimics C++ size_t and ssize_t with usize and isize. C# on the other hand somehow decided not to expose this CPU architectural detail and forces everyone to the lowest common denominator for most of its API: a 32-bit CPU with 32-bit addressing.

I'd like to emphasize that the 32-bit limitation is purely arbitrary from a user point of view. There is no such thing as a small array and a big array; an image application should not have to be implemented differently depending on whether it works with 2,147,483,647 pixels or 2,147,483,648 pixels. Especially when it's data driven and the application has little control over what the user is up to. It is even more frustrating when the hardware has long been capable of it. If you do not believe me or think I'm talking nonsense, I invite you to learn how to program for 16-bit MS-DOS with NEAR and FAR pointers (hint: there is a reason why Doom required a 32-bit 386 CPU).

Instead of tinkering around the edges, what is the general appetite for a more ambitious approach to fix this limitation? Here is a controversial suggestion, bit of a kick in the hornet's nest:
I understand this is far from ideal and can create uncomfortable ecosystem situations (Python2 vs Python3 anyone?). Open to suggestions on how to introduce size types in .NET in a way that doesn't leave .NET and C# more irrelevant on modern hardware each year.
If we can solve the issue with the GC not being tolerant of variable-length objects bigger than 2 GB,
This scheme would allow
This would make
I would go with overflow checks when
@kasthack I was trying to come up with a compatible solution where one never has to worry about getting an unexpected OverflowException. With explicit casting, it's clear that we're crossing a boundary into "legacy land" and we need to make sure "legacy land" can do everything it would reasonably be able to do with a normal array.
Arrays don't implement ISet/IDictionary so a cast to these would never succeed. For IList, the same rules would apply as for ICollection (ICollection was just an example above).
@GPSnoopy's post makes me wonder whether the following variation might make sense:
It has been proposed that we could make array lengths and indexes native-sized (nint). This would be a portability issue. Code would need to be tested on both bitnesses, which currently is rarely required for most codebases. Code on the internet would be subtly broken all the time because developers would only test their own bitness. Likely, there would also be language-level awkwardness when mixing int and nint. In C languages the zoo of integer types with their loose size guarantees is a pain point. I don't think we want to routinely use variable-length types in normal code.

All arrays should transparently support large sizes and large indexing. There should be no separate large-array type. If the object size increase on 32 bit is considered a problem (it might well be) this could be behind a config switch. Code that cares about large arrays needs to switch to 64-bit-friendly indexing. Likely, it will be fine even in the very long term to keep most code on int.

I see no realistic way to upgrade the existing collection system to 64-bit indexes. It would create unacceptable compatibility issues. It would be nicer to have a more fundamental and more elegant solution. But I think this is the best tradeoff that we can achieve.
Just note there is already a case today where Length can throw. Multi-dimensional arrays can be large enough that the total number of elements is greater than int.MaxValue. I think this is why LongCount was originally added.
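For reference, a sketch of that case (it assumes <gcAllowVeryLargeObjects> is enabled in the runtime config and that enough memory is available; the sizes are only illustrative):

var big = new byte[50_000, 50_000];   // 2,500,000,000 elements in total

Console.WriteLine(big.LongLength);    // 2500000000
Console.WriteLine(big.Length);        // throws OverflowException: more than int.MaxValue elements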
Note: I really hope all size-like parameters across the ecosystem end up using native-sized integers.
One option is to create a NuGet package for a LargeArray<T> type first. But the CLR should be ready by then.
@lostmsu Are you aware of ongoing work to fix this in the CLR that I'm not aware of?
@GPSnoopy nope, that was an argument to start the CLR work ahead of time before the BCL, so that the BCL could catch up.
Users always need the right equipment to run a given piece of software anyway.
There is a large difference between specialized software and general-purpose software, and many of the callouts you gave are still categorized under "general purpose". Even for games, the "typical" gaming PC (based on the Steam Hardware Survey, but also accounting for consoles and other reporting sites) is a 6-core, 16GB PC running something like an RTX 3060. Machines running 4 cores, 8GB of RAM, and older cards like a GTX 1060 aren't that far behind in usage either, nor are many mobile chipsets. But even if you do have a more powerful machine, a typical user is still sharing that machine with other apps and services, and still doesn't want to go over around 80% consumption of the max (due to the risk of causing thrashing and contention issues).
This world did semi-regularly have issues, often large issues, which app developers had to deal with. Almost every single platform (C, C++, Java, Rust, C#, Python, etc) has been moving towards making their ABIs more stable.
Right, sometimes breaks are necessary; especially in the AOT world. But they are being made intentional and targeted to reduce risk and pain.
There are two main considerations here. First, you need to show that this is an actual foundational issue. The general argument I've given above is that while having 64-bit arrays would be convenient, it is not actually a foundational issue and in many ways not an issue at all due to the alternatives that exist and the real world considerations that an application or library working with such large data would fundamentally have to make to ensure that it remains usable. The second is that .NET Core 1.0 did try to make some big breaks and we found out by .NET Core 2.1 that many of them were a mistake and caused too much pain, so much so that we brought most APIs back and made the ecosystem almost whole/consistent again by .NET Core 3.1. The few exceptions were scenarios that were used so little and which were so fundamentally problematic that bringing them forward as-is was a non-starter and where we've since provided or are working on providing alternatives instead.
Yes, but working with large data is a different experience altogether regardless. You can't just take a data structure that was designed to work with thousands of elements and expect it to behave well when working with billions or more elements. The inverse is also true. Any application that required working with many gigabytes or more of data and decided to do that by simply writing their code the same way as they would for kilobytes of data would end up digging its own pitfalls. We don't live in an idealized world, costs don't actually scale linearly, not every operation is equivalent. We have many tiers of costs and we have limits on operations and how much can be pushed through a given pipeline. An application that might saturate those pipelines has to rethink how it's working so it doesn't become bottlenecked. It likewise needs to aim to balance across all pipelines to ensure they all achieve suitable levels of use (neither maxing them out nor underutilizing them). .NET arrays already allow you to work with gigabytes of data, up to almost 2.14 billion elements (regardless of the size per element). Once you start approaching this limit, you're already at the point where you probably should've started restructuring and rethinking how your application works, even if your machine has terabytes of RAM available to it.
It's certainly true that applications dealing with substantial amounts of data need careful consideration in their design and implementation. However, rather than solely attributing responsibility to the .NET Runtime for addressing users' mistakes, it's worth considering a more balanced perspective. Mistakes and challenges in dealing with large-scale data can indeed be valuable learning experiences for programmers. Furthermore, while partitioning memory allocations may seem advantageous, it comes with its own set of drawbacks.
The crux of the matter isn't merely the fixed value of int.MaxValue. While implementing radical changes at this stage could entail significant disruptions, there's merit in introducing new types such as LargeArray<T>.
Yes. However, if .NET already doesn't support the thing developers shouldn't be doing (in this case for historical reasons) then there is little to no benefit in .NET doing the massively complex work to support it, because we'd be adding support for something that no real world scenario should be using in the first place. The discussion basically stops there as there is no justification for adding support for something which no one should ever be using and for which, if users "really" decide they need it, there are plenty of viable workarounds. If developers need to learn, they can just as easily find this issue and read some of the feedback. Developers coming in and saying "we want this anyways, even if it's the wrong thing" just adds confusion to the topic and hinders that learning process.
That's really not how that works. The GC and memory allocators in general are optimized for the common cases. They tend to have more overhead, more restrictions, and generally behave worse when introduced to uncommon patterns or extremely large allocations. Today, the GC considers anything over 85KB as "large" and places it on a special heap (LOH) which is almost never compacted or moved. When such large allocations are over reference types, they have a lot more data that needs to be processed to determine liveness and the overhead of that many objects is going to be substantially more than almost anything else. Even if the support did exist, it wouldn't get optimized because it wouldn't be a common case. It would never become common because users would start trying it and see that the perf was awful and they'd end up having to research and find out they shouldn't be doing this stuff in the first place. They then wouldn't use the feature as they'd see they should be doing other things instead.
This is a straw person argument. Computers being restricted to 16-bits would indeed be limiting, but we aren't restricted to 16-bits. We're restricted to a reasonably large boundary (2.14 billion elements, which could take up more than 32-bits of memory space) well above and beyond what hardware supports for its built in systems and the amounts they are designed to be efficient around. What you're stating is like arguing that because we went from 8-bit to 16-bit, 16-bit to 32-bit, and 32-bit to 64-bit address spaces in the past 40 years that we will need to go from 64-bit to 128-bit in the next 100. This is the type of thing that a person might naively state without considering that this is an exponential growth and that 2^64 is such an astronomically large number that a single computer cannot actually hold that much RAM without us fundamentally breaking how the hardware is manufactured and even beyond that, essentially breaking our current understanding of physics. To actually work with that much memory, you need many machines. Once you have many machines, you have non-coherent memory. Once you have non-coherent memory, you have to fundamentally change how you work with the data for both correctness and performance. Distributed computing then typically has a customized Operating System, specialized infrastructure and networking layouts, and unique data management systems to ensure that data can be correctly and efficiently moved about without hitting bottlenecks. This is true for super computers, IoT compute clusters, server setups for datacenters, etc. The same general considerations tend to happen when you get beyond 2 billion elements. Once you start getting that big, you can't actually reliably make the allocations unless you limit the hardware it runs against. Even if you restrict the hardware you run against, you're going to ultimately hurt the overall performance and considerations of the application because you aren't the only process running. The cost of accessing any piece of memory isn't linear, it starts getting bottlenecked by the increasingly smaller caches and prefetch mechanisms (so much so that in many cases you want to bypass these caches by using non-temporal accesses). You also start getting into compute times where you fundamentally have to start parallelizing the computation. All of these changes end up meaning that the way the data needs to be worked with has changed as does how you need to manage the data. You have a need to chunk the data into many separate views that can be independently accessed by separate cores and then independently freed, paged out, etc as the data gets processed. You have to balance the management of that data with the processing time of that data to ensure that the CPU stays reasonably saturated without oversaturating it (typically about 80% utilization is the ideal). You may start considering the need for GPU compute, where a single GPU may peak around 24GB of RAM. Where the GPU itself may be limited in how much they can transfer at one time (until relatively recently this was capped at 256MB). Where you actually have to copy data from one memory system to another and where that memory is non-coherent. Where the allocations and the amount a shader can access at any one time may itself be limited, etc. None of this is a problem unique to .NET. It's just the world of working with "big" data and where "big" is a lot smaller than 2 billion elements.
This also isn't really a good argument and there are many reasons why using types like
Oops.
Just to cross check a few facts, plus some comments.
AMD Zen 4 increased its virtual address space from 48-bit to 57-bit. Wikipedia indicates Intel did the same with Ice Lake in 2019 (https://en.wikipedia.org/wiki/Intel_5-level_paging). The Intel 386 (the first 32-bit x86 CPU) was introduced in 1985. The AMD Opteron (the first x86-64 CPU) was introduced in 2003. That is 18 years later, not 40. While there is no immediate need for 128-bit physical addressing, actual support for virtual addressing using more than 64 bits is not as far off as you think. In the meantime, the typical amount of RAM per CPU (i.e. UMA) on a server is 768GB-1.5TB. AWS has NUMA instances that go up to 24TB.
You assume too much about how one might use large arrays. The random-access and linear-access costs of an array are the same for any size. Cache locality for random access on large arrays depends heavily on the algorithm and its data locality, rather than on the fact that a large array has been allocated in one go. Cache locality goes out of the window if you do a linear access on an array that does not fit in cache (even if you split the array into smaller ones). Not sure why you want to deny people the ability to use large arrays.
Last night, NVIDIA announced the Blackwell GPU, which has 192GB of RAM. The previous generation, the H100, has 80GB of RAM. CUDA has no issue with large data transfers (larger than 256MB) or 64-bit addressing.
You and I do not have the same definition of big data. If I can fit it on a laptop, it's not big.
I think the whole technical discussion is moot. @tannergooding is just regurgitating the same arguments people made in 2003 when AMD64 came out: arguing why it's useless, has too many performance compromises (e.g. pointers being twice the size), and somehow going on a crusade to unilaterally deny programmers/users access to it. In the meantime the world has moved on. Seriously. The real question is where the actual decision makers at Microsoft see C# and .NET going. My observation (and this is purely my biased opinion) is that Microsoft nowadays has little appetite for .NET in markets such as system programming, REST servers, large-scale services, HPC, GPUs, AI, etc., instead chasing subscription-based revenues and AI services; leaving .NET mostly for "small" applications or services, mobile apps, and some Unity games.
If you agree that this is a specialized scenario for specialized hardware, sure. I was pushing back on gigabytes of memory being called normal data rather than big data.
Yes, and that is still nowhere near the entire 64-bit address space. It is exponential growth. Different scales in terms of nanoseconds: 2^32 nanoseconds is about 4.3 seconds, while 2^64 nanoseconds is roughly 585 years.
The reason why I use nanoseconds is because computers currently operate in the GHz range; at 1 GHz, one cycle is therefore 1 ns. The fastest CPUs can boost to around 6GHz today and with the assistance of liquid nitrogen we've managed to nearly hit 9GHz. The fastest CPUs can do parallel dispatch of up to around 5 additions per cycle, assuming they are non-dependent. So assuming we simply wanted to increment 2^64 times, it would take a single core on a modern processor 12.97 years to do so, operating under peak efficiency and never slowing down. If we scale that to the peak number of cores in a single CPU (192) and factor in that they're hyperthreaded, and without factoring in the lower clock speeds, then a single modern CPU can iterate 2^64 times in around 12.33 days. This of course isn't factoring in that memory access is tens to hundreds of times slower than direct instruction execution, that the peak transfer rate of things like DDR5 is 64 GB/s and in practice is much lower due to contention, random access, etc. It also isn't factoring in that we're hitting the limits of how small we can make transistors due to the effects of quantum tunnelling, or that there are other fundamental barriers related to heat dissipation, the speed at which electricity can travel through various materials, etc. While we will continue to see some increases, we require a massive scientific breakthrough, and likely a change in how computers are designed, to get them substantially faster than what we have today.
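A quick back-of-the-envelope check of those figures (assuming the ~9 GHz liquid-nitrogen peak and ~5 independent additions dispatched per cycle mentioned above):

double opsPerSecond = 9e9 * 5;                        // ~45 billion increments per second per core
double seconds = Math.Pow(2, 64) / opsPerSecond;
Console.WriteLine(seconds / (365.25 * 24 * 3600));    // ~13 years for a single core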
The consideration is that many of the necessary data management tasks can no longer be trivially done when it is actually a single allocation. They become significantly simpler if you break the data up into appropriately sized chunks, and that is just how the data is actually interacted with in systems that are designed to work with large data. They recognize that it is technically a contiguous thing, but that it needs to be appropriately buffered, chunked, streamed, and managed to account for real-world hardware limitations.
Yes and this is an extremely high end GPU that is not meant for a typical consumer. The typical GPU is far below 24GB with 24GB being the upper end of what the highest end consumer PCs and even the upper end of what many cloud systems offer.
There is a difference between the API chunking the data for you behind the scenes and it actually working with greater-than-256MB chunks. The limit of 256MB transfers comes about from the PCIe specification, which required an optional feature introduced in 2007, known as Resizable BAR, to lift it.
You're free to have your own definitions for big vs small, but that doesn't negate how general systems think about the data. To the GC, anything over 85KB is large and goes on a special heap. To standard implementations of malloc, allocations past a similar threshold are likewise special-cased and handed off to the OS directly.

The same general considerations also apply to the CPU optimization manuals and to the implementations of core memory primitives. In terms of address space, these are relatively tiny sizes. Even in terms of typical file size (considering modern images or videos), these are relatively tiny amounts. But they are still significant in terms of the scale of the CPU, the hardware itself, and the limits it is designed around for efficiency.
They are very different arguments. When 64-bit came out, we were already hitting the limits of the 32-bit address space, especially across an entire machine. There was a clear need for the growth even if individual applications weren't near these limits and even if typical applications weren't near those limits. There has been no argument against needing to work with data that is more than 2GB in size, nor against there being cases where having a single linear allocation would be convenient.

The arguments have therefore been around how the considerations for accessing big data change compared to accessing lesser amounts of data, how we are nowhere near the limits of the 64-bit address space, how we are approaching fundamental limits in how computers operate that are unlikely to change without a fundamental breakthrough in our understanding of physics, and how the effort would be better spent investing in making it easier to work with big data, irrespective of how it's been allocated and particularly allowing for many separate allocations so that the code can work across a broader range of machines and considerations (both small and large distributed systems require similar considerations here for big data).
.NET works just fine with large scale services today and is used by many systems that operate on global scales servicing millions of customers per day, touching petabytes and more worth of data: https://dotnet.microsoft.com/en-us/platform/customers

There are of course even more big customers and scenarios that aren't listed on the showcase page. There are plenty of community projects that are targeting these scenarios, there are efforts by the .NET Libraries team to continue improving our support for things like Tensors, working with large data in general, etc. .NET has a clear interest in supporting working with big data, but that in part has to come with consideration for backwards compatibility, with consideration for real world usage (not just convenient usage), and with consideration for what will lead users towards the greatest success.
While I like the appeal of using
That's true, but:
I thought using a native size would be more efficient, because the JIT has to widen an int index to the native pointer width when addressing memory.
TL;DR: Not really. It's implicit as part of 32-bit operations.

For RISC-based architectures (Arm32/Arm64, RISC-V, LoongArch and others), they typically use a fixed-width encoding and so the actual decoding cost is the same. The execution costs, however, can still differ, although this mainly applies to operations like multiplication and division, not to operations like addition, subtraction, or shifting. However, because it's "fixed-width", generating constants can take multiple instructions and so you may need 3-4 instructions to generate some 64-bit constants.

For CISC-based architectures (namely x86 and x64), they typically use a variable-width encoding. For many reasons, including back compat but also factoring in common access sizes, the smallest encoding works with 32-bit registers and you need an additional prefix byte to do 8, 16, or 64-bit register access. This minorly increases the cost to the decoder and can negatively impact other code by pushing bytes outside the normal decode/icache windows.

For both sets of architectures, it is typical that doing a 32-bit operation will implicitly zero the upper 32 bits, so an int-based index can be consumed for 64-bit addressing without any extra widening instruction. Consider:

public static int M(Span<int> span)
{
int sum = 0;
for (int i = 0; i < span.Length; i++)
{
sum += span[i];
}
return sum;
}

Which generates the following:

; Method Program:M(System.Span`1[int]):int (FullOpts)
G_M000_IG01: ;; offset=0x0000
; No prologue
G_M000_IG02: ;; offset=0x0000
mov rax, bword ptr [rcx] ; load the byref field into rax
mov ecx, dword ptr [rcx+0x08] ; load the length into ecx, implicitly zero-extending
xor edx, edx ; zero the sum local
xor r8d, r8d ; zero the index
test ecx, ecx ; check if the length is 0
jle SHORT G_M000_IG04 ; skip loop if it is
align [0 bytes for IG03]
G_M000_IG03: ;; offset=0x000F
mov r10d, r8d ; copy index into r10d, technically unnecessary
add edx, dword ptr [rax+4*r10] ; load element from the index and add to the sum
inc r8d ; increment the index
cmp r8d, ecx ; check if we've hit the bounds
jl SHORT G_M000_IG03 ; if not, continue the loop
G_M000_IG04: ;; offset=0x001E
mov eax, edx ; move the sum into the return register
G_M000_IG05: ;; offset=0x0020
ret ; return
; Total bytes of code: 33

If you instead use 64-bit based indexing, you end up increasing the encoding cost by around 1 byte per instruction (which uses the wider register). It probably won't matter in practice, but it's still unnecessary "bloat" for the 99% use case.
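For comparison, here is a sketch (not from the thread) of the same loop written with a native-sized index; on x64 the codegen has to use the 64-bit forms of the index instructions, which is roughly the extra prefix byte per instruction described above:

public static int M64(Span<int> span)
{
    int sum = 0;
    for (nint i = 0; i < span.Length; i++)   // nint index: 64-bit registers on a 64-bit target
    {
        sum += span[(int)i];                 // Span<T>'s indexer itself still takes an int
    }
    return sum;
}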
I think we shouldn't waste much time on how large the benefit is - it is definitely beneficial for many cases. Instead, the topic should focus more on how world-breaking it would be for existing 32-bit-based code. Doing more magic in codegen may also affect performance.
This looks xarch-specific. Popular RISC architectures use the same size for all instruction encodings. This is something to consider but should not matter too much.
OK, I found you're right about this. I agree.
Would you clarify what the drawbacks of using unsigned types are? I initially thought that
Here's a tiny portion of the reasons why unsigned indexing wouldn't harm performance anyway:
I don't see anything for which I absolutely require signed index types.
For using signed, there is a certain reason: the stupid Enumerator design of C#:

public ref struct Enumerator
{
/// <summary>The span being enumerated.</summary>
private readonly Span<T> _span;
/// <summary>The next index to yield.</summary>
private int _index;
/// <summary>Initialize the enumerator.</summary>
/// <param name="span">The span to enumerate.</param>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal Enumerator(Span<T> span)
{
_span = span;
_index = -1;
}
/// <summary>Advances the enumerator to the next element of the span.</summary>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public bool MoveNext()
{
int index = _index + 1;
if (index < _span.Length)
{
_index = index;
return true;
}
return false;
}
/// <summary>Gets the element at the current position of the enumerator.</summary>
public ref T Current
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
get => ref _span[_index];
}
}

Java-style iterators are better with unsigned:

public ref struct Enumerator
{
private readonly Span<T> _span;
private uint _index;
internal Enumerator(Span<T> span)
{
_span = span;
_index = 0;
}
public bool HasNext() => _index < _span.Length;
public ref T Next() => ref _span[_index++];
}
You can do the same thing with the MoveNext()/Current pattern:

public ref struct Enumerator
{
private readonly ref T _head;
private readonly nuint _length;
/// <summary>The next index to yield.</summary>
private nuint _index;
/// <summary>Initialize the enumerator.</summary>
/// <param name="span">The span to enumerate.</param>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal Enumerator(Span<T> span)
{
_head = ref MemoryMarshal.GetReference(span);
_length = (uint)span.Length;
_index = ~(nuint)0;
}
/// <summary>Advances the enumerator to the next element of the span.</summary>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public bool MoveNext()
{
var index = ++_index;
return index < _length;
}
/// <summary>Gets the element at the current position of the enumerator.</summary>
public T Current
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
get => _index < _length ? Unsafe.Add(ref _head, _index) : default!;
}
}
That's part of the discussion point. Whether there is actually benefit or whether it is simply convenience. Many times convenience is a benefit, but other times it may itself be a pit of failure and be inappropriate due to the pain it causes or due to leading users down the wrong path, especially users who may not know better.
.NET cannot make a breaking change here. As has been iterated many times, almost every single for loop ever written would now have a chance of silent failure. Any such change to Array itself is effectively off the table.

If it were deemed viable, a new array type could be introduced which new code could opt into using. However, there have been many reasons pointed out as to why this itself isn't viable and how it interplays poorly with the GC. Such a type would reasonably need to be restricted in which element types it could support. The most viable path here would be to introduce some
The considerations for RISC architectures were also called out, including the extended instruction sequences often required to work with anything over a 16-bit constant. Every architecture has tradeoffs here and almost every one ends up being more expensive for larger register sizes. It only varies where that expense is seen (encoding size vs operation count vs number of instructions vs ...).
Sure. One of the primary things to note is that using a signed value doesn't mean negative values are allowed, so many of the "benefits" you've listed for unsigned types equally apply to signed. Many of the checks are likewise equivalent due to the two's complement representation.
The length for types like Array and Span<T> is known to never be negative.
You only have a singular check for signed as well. Due to two's complement representation, a single unsigned comparison such as (uint)index < (uint)length rejects negative indexes and too-large indexes at the same time.
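For reference, a small sketch of that single-comparison pattern (the same trick the thread later refers to as the "unsigned cast hack"):

// One comparison covers both "index < 0" and "index >= length":
// a negative int cast to uint becomes a huge value and fails the check.
static bool IsInRange(int index, int length)
    => (uint)index < (uint)length;

// IsInRange(-1, 10) == false; IsInRange(3, 10) == true; IsInRange(10, 10) == false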
This is really a non-issue. Due to having an operating system, a runtime, general program state, etc., you can never actually have a single object that spans the full address space anyway.

On a 32-bit system, it's typical for the OS to explicitly reserve at minimum 1GB of memory, so the maximum user-addressable space is 3GB, which of course is shared with other systems. Without explicit opt-in, many systems historically limited this user space to 2GB as well, notably. On 64-bit systems, most modern hardware actually has special support for always masking off the most significant 7-11 bits of the address so that they can be used to encode special information. However, even if this were to ever expand to the full space, it's very likely that the last bit would be reserved for the system regardless. The amount of space required to encode the page tables when you actually have that much memory, the general needs of the system and memory managers to maintain state, the need to represent the actual programs, etc. all prevent you from ever actually using all the bits, so reserving the last bit is fine.
Depends on the CPU. Many modern CPUs have zero-cost sign-extension as well. However, sign-extension is only needed for values that "could" be negative. The general index usage, bounds checks, and well-known state for core types mean that values are typically known to never be negative and thus can use zero-extension regardless. There are a couple of niche cases otherwise, but those are easy to work around. -- More notably, if using
This depends. Many native libraries do take size_t. It's ultimately all about tradeoffs, and for typical usages the types can be treated equivalently due to knowing the value is never negative.
TBH, I don't think it's possible that unsigned integers would ever be considered — they are not even CLS-compliant. That's a huge drawback for interoperating with, or implementing, some languages. There are several possible ways to introduce huge arrays: new types, a simple breaking change, port
CLS compliance notably doesn't really matter that much anymore and we've even looked at removing it in the past. It was a consideration 20+ years ago when .NET was first introduced and there are technically some languages that may not support unsigned types. However, CLS compliance itself hasn't been updated since then and doesn't take into account any new features (nint, for example). We've also, since .NET Core 2.1 at least, been a lot more flexible and overall consistent with exposing support for unsigned types where that makes sense or to ensure they have general parity with signed types (such as for Generic Math, conversion support, optimizations, Math and reading/writing APIs, etc). So, if we truly thought that unsigned types were the right choice here, CLS compliance wouldn't be the blocker.
I would expect the most likely thing to be some It might be conceivable to then define some
You still have to check it in
The unsigned cast hack is not an exception to the negativity check.
Yes, I knew that. An additional question: why do you need to reinterpret values of signed types as unsigned ones, if the signed types are better?
A lot has been written about the advantages and disadvantages of signed and unsigned types for lengths. The general consensus is that signed lengths are less error-prone for the general higher-level programming that is the focus of C# and .NET. Other, much newer languages that target a similar audience have made the same choice.
On the other hand, Rust, which is more focused on low-level programming, has unsigned lengths.
I don't think this is the appropriate place for that discussion. But my 2 cents are: while I like the appeal of unsigned lengths as something that can be correct by construction if your API never accepts negative values, so that you don't have to check for it on every input... in practice, unsigned lengths have many disadvantages - the biggest one being that if you have two unsigned values, you just can't take their difference, because the difference might not fit into an unsigned value. This is not only an annoying issue, but also a correctness issue, because even if you don't accept negative values, you often need them as part of your computation. Even C++ introduced std::ssize for this reason. I would love it if there were unsigned types that had the same max value as the corresponding signed types, just couldn't be negative - so you could always take their difference. Those would have none of the disadvantages of unsigned types. But unfortunately, this isn't how either the hardware or the platforms are designed, and we have what we have. The best we can do, given what we have, are signed types.
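A tiny illustration of that difference problem (a sketch, not from the thread):

uint lengthA = 3, lengthB = 5;

// In the default unchecked context the unsigned "difference" silently wraps around:
uint diff = lengthA - lengthB;
Console.WriteLine(diff);           // 4294967294

// With signed values the intermediate result is simply negative and easy to reason about:
int signedDiff = (int)lengthA - (int)lengthB;
Console.WriteLine(signedDiff);     // -2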
That's true, but @MineCake147E still has a point (although I don't agree that unsigned would have more benefits than downsides). There are many places where you need to check that an int argument is not negative.
Although as I said, this isn't actually a benefit - in practice it's a problem.
I imagined letting the JIT insert some magic to adapt nint lengths to 32-bit indexing code. This would certainly bring some performance impact. Existing code shouldn't fail while processing small arrays. But yes, making it possible for that code to fail at all is painful.
Even Zen 4 and Golden Cove need 1 clock cycle to perform a sign-extending move (it is not eliminated at register rename).
This is why the carry flag exists.

var s = length - start;
return length > start ? s : 0;

The JIT could optimize it to something like (in the future):

xor eax, eax
sub rdx, rcx ; rdx = length - start
cmova rax, rdx
ret
As I mentioned earlier, you can simply replace
I wonder what you mean here.
Yeah... those should never have existed. If you have no value, just use |
It's not like it's OK to index into an array with a negative int. The reasoning I can see behind using unsigned for length is that you could then get an array with more than Int32.MaxValue elements.
I would say it's the opposite: unsigned numbers lead to more bugs (especially when combining them with or converting to/from signed numbers, which you often have to). Using a single data type for all kinds of integers has a huge advantage. This discussion is pointless though; I'm pretty sure the API review board would never approve an unsigned data type for regular usage like length or indexing. At least not without a really good reason for why it's not only useful, but necessary.
While the System.Array API supports LongLength and operator this[long i], the CLR does not allow arrays to be allocated with more than 2^31-1 elements (int.MaxValue).
This limitation has become a daily annoyance when working with HPC or big data. We frequently hit this limit.
Why this matters
In C++ this is solved with std::size_t (whose typedef changes depending on the target platform). Ideally, .NET would have taken the same route when designing System.Array. Why it didn't is a mystery, given that AMD64 and the .NET Framework appeared around the same time.
Proposal
I suggest that when the CLR/JIT runs a .NET application on x64, it allows the long-based array constructor to allocate more than int.MaxValue items:
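A sketch of the kind of usage this would enable (illustrative only; Array.CreateInstance already exposes a long-based overload, it just rejects lengths above the current limit at runtime):

long count = 3_000_000_000;

// Today this throws; under the proposal it would succeed on a 64-bit runtime.
var bytes = (byte[])Array.CreateInstance(typeof(byte), count);

Console.WriteLine(bytes.LongLength);   // 3000000000
// Length could keep throwing OverflowException above int.MaxValue, preserving the
// existing contract for code that assumes 32-bit lengths (as discussed earlier in the thread).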
I naively believe that the above should not break any existing application.
Bonus points for extending 64-bit support to Span and ReadOnlySpan.