[rlgl] Software renderer support? #3928

Open
raysan5 opened this issue Apr 21, 2024 · 27 comments

Labels
enhancement This is an improvement of some feature

Comments

@raysan5
Owner

raysan5 commented Apr 21, 2024

Just a tentative issue to study the possibility of supporting basic software rendering using the new library: https://github.com/Bigfoot71/PixelForge

@Bigfoot71
Contributor

I'm currently working on verifying that all the necessary features for rlgl are implemented and reviewing the API to facilitate its integration.

Firstly, we should therefore think about how to update the window framebuffer in a simple and efficient way.

The first idea that comes to mind would be, roughly, to add a #ifdef SOFTWARE_RENDERING case in the EndDrawing() function and proceed with the update here, but this goes beyond the "framework of rlgl."

And there are also other points that may require consideration, such as the fact that "textures" are not managed via identifiers by PixelForge but are full-fledged structures.

@MrScautHD
Contributor

is that replacing OpenGL?

@Bigfoot71
Contributor

> is that replacing OpenGL?

For software rendering: yes.
Entirely: no.


@raysan5 Otherwise, if the fact that textures are managed via structs and not by identifiers with PixelForge is an issue for its integration into raylib, there is TinyGL by Fabrice Bellard, and another repository that has taken it up: https://github.com/C-Chads/tinygl/tree/main

I don't plan to change the fact that textures are managed by structs in order to match TinyGL; my library is meant to be inspired by the OpenGL 1 API, not to copy it entirely.

Perhaps TinyGL would be much simpler to integrate than PixelForge in this regard.
On the other hand, the fact that glOrtho is missing in TinyGL can also be problematic for raylib...

I have almost implemented everything necessary compared to TinyGL, and plan to implement the missing functions, some of which have already been done like glOrtho or glPointSize for example.

@Bigfoot71
Contributor

Bigfoot71 commented Apr 28, 2024

Also, one thing I've considered is that mesh uploading to the GPU happens automatically in most cases, which might need to be reviewed, perhaps with definition checks.

And UploadMesh should be removed, no? (Or rather, the function should do nothing.)

The Material structure would also need to be reconsidered in the case of software rendering support.

It's quite tricky: since raylib uses GLFW by default, we'll have to deal with OpenGL directly.

With SDL, it's simpler because we can easily access a window's surface and update it directly; SDL handles everything in the background.


Edit: For the issue of textures not being managed by ID in PixelForge, we could perhaps do something like typedef pfTexture Texture2D; for its integration into raylib?

But this suggestion clearly falls outside the scope of RLGL.

After checking, I realize that if we intend to make this addition to raylib from rlgl, it would require modifying a certain number of function signatures, which is not feasible.

PixelForge therefore does not seem to be the most suitable choice unfortunately for doing it this way.
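
One way the mismatch described above is sometimes bridged is a small handle table that maps integer ids to texture structs, so ID-based function signatures can stay unchanged. The following is only a minimal sketch under that assumption: `SoftTexture`, `RegisterSoftTexture`, `GetSoftTexture` and `UnregisterSoftTexture` are hypothetical names, and `SoftTexture` is a stand-in, not PixelForge's actual `pfTexture` layout.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for a struct-based texture; the real pfTexture
// layout is not shown in this thread.
typedef struct SoftTexture {
    int width;
    int height;
    const void *pixels;
} SoftTexture;

#define MAX_SOFT_TEXTURES 64

static SoftTexture textureSlots[MAX_SOFT_TEXTURES];
static unsigned char slotUsed[MAX_SOFT_TEXTURES];

// Store a texture struct and return a non-zero id (0 means "no free slot")
unsigned int RegisterSoftTexture(SoftTexture tex)
{
    assert((tex.width > 0) && (tex.height > 0));

    for (unsigned int i = 0; i < MAX_SOFT_TEXTURES; i++)
    {
        if (!slotUsed[i])
        {
            textureSlots[i] = tex;
            slotUsed[i] = 1;
            return i + 1;   // ids start at 1 so that 0 can mean "invalid"
        }
    }

    return 0;
}

// Resolve an id back to its struct, or NULL if the id is not in use
SoftTexture *GetSoftTexture(unsigned int id)
{
    if ((id == 0) || (id > MAX_SOFT_TEXTURES) || !slotUsed[id - 1]) return NULL;
    return &textureSlots[id - 1];
}

// Free the slot so the id can be handed out again
void UnregisterSoftTexture(unsigned int id)
{
    if (GetSoftTexture(id) != NULL) slotUsed[id - 1] = 0;
}
```

With this layout the first registered texture gets id 1, and a released id is reused by the next registration. The cost is one extra lookup per texture bind, which is a design trade-off rather than something the thread settles.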

TinyGL would be much simpler to implement, but it would require a version that adds at least support for 'glOrtho'.

I would be happy to do it if it's for raylib!

@Bigfoot71
Contributor

One thing I'm thinking about all this...

The basic idea would be to enable software rendering via rlgl, but that would remove OpenGL support from rlgl (for software rendering), while it would still be needed with glfw in the context of raylib, so using OpenGL outside of rlgl seems odd to me.

If you say rlgl can, for example, handle texture updates via OpenGL itself while being compiled in "software" mode, I would find that even stranger.

The ideal solution would therefore be to allow the use of both rendering modes, via a switch between the two modes for example.

Well, it's easy to say, but much harder to do properly, it's just an idea.
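
The "switch between the two modes" idea can be pictured as a table of function pointers selected at run time. This is a rough sketch only, not how rlgl is structured: `RenderBackend`, `SetRenderBackend`, `ClearScreen` and both backends are hypothetical, and the hardware path merely counts calls where a real one would issue OpenGL commands.

```c
#include <assert.h>
#include <stdint.h>

#define FB_WIDTH  4
#define FB_HEIGHT 4

// RAM color buffer used by the software path
static uint32_t softBuffer[FB_WIDTH*FB_HEIGHT];

// Stands in for real GPU work in this sketch
static int hardwareClearCalls = 0;

// Hypothetical backend interface: one entry per operation the renderer needs
typedef struct RenderBackend {
    const char *name;
    void (*clear)(uint32_t color);
} RenderBackend;

void SoftClear(uint32_t color)
{
    for (int i = 0; i < FB_WIDTH*FB_HEIGHT; i++) softBuffer[i] = color;
}

void HardwareClear(uint32_t color)
{
    (void)color;
    hardwareClearCalls++;   // a real backend would call into OpenGL here
}

const RenderBackend backendSoftware = { "software", SoftClear };
const RenderBackend backendHardware = { "opengl", HardwareClear };

static const RenderBackend *activeBackend = &backendHardware;

// Select which backend later drawing calls are routed to
void SetRenderBackend(const RenderBackend *backend)
{
    assert(backend != 0);
    activeBackend = backend;
}

// Public entry point: callers never see which backend is active
void ClearScreen(uint32_t color)
{
    activeBackend->clear(color);
}
```

Calling `SetRenderBackend(&backendSoftware)` and then `ClearScreen(color)` fills `softBuffer` without touching the hardware path. The indirection adds one pointer call per operation, which matters more for per-vertex calls than for per-frame ones.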

@RobLoach
Contributor

Another promising implementation is PortableGL from @rswinkle.

@colesnicov

> Another promising implementation is PortableGL from @rswinkle.

I am look at like the pook. Impressive.

@Bigfoot71
Contributor

> Another promising implementation is PortableGL from @rswinkle.
>
> I am look at like the pook. Impressive.

Yes, we really need to look into this further. At first glance it would be great for raylib.

@bohonghuang
Contributor

It sounds promising. A software renderer may enable more use cases of Raylib, such as image rendering on servers without OpenGL context.

@raysan5
Owner Author

raysan5 commented Aug 24, 2024

@bohonghuang Actually that use case is already possible with provided Image*() functions, latest commit 4c9282b also allows Font loading with no need of an OpenGL context / GPU.

@raysan5
Owner Author

raysan5 commented Aug 24, 2024

@Bigfoot71 What is the state of PixelForge? I've seen you keep working on it and you also have some raylib examples!

raysan5 added the "enhancement" label Aug 24, 2024

@Bigfoot71
Contributor

@raysan5 Yes, indeed, I'm still working on it, although I'm struggling to find time for it.

Currently, I'm focusing on optional SSE(x)/AVX2 support before moving on to other optimization ideas for its architecture.

But as I mentioned, I'm currently severely lacking time...

You can find the current progress on SIMD support here: Bigfoot71/PixelForge#3

These advancements include SIMD-compatible bilinear sampling and some other nice little things.

@dineth-lochana

I don't know anything about the implementation of graphics APIs, but I can share some of my own experiences with OpenGL and software rendering. There are some edge-case devices like my HP ProBook 4430s, which has broken OpenGL support on Windows. It still runs OpenGL, but very slowly, and I have seen that in games like Doom, software rendering is much faster on my PC than OpenGL (old or modern) ever is.

Even though I'm certain Raylib would benefit from software rendering, if Raysan and the other contributors deem it unwanted, that's fine as well. Most people do fine with OpenGL, and it doesn't seem to be going away anytime soon. Vulkan support seems to have stagnated, and most people target DirectX (90% of games on Steam in 2023).

@wwderw

wwderw commented Sep 4, 2024

I would love to see a software renderer here. I have some projects that just don't need everything acceleration provides, and it would be nice not to depend on whatever the popular flavor of graphics API happens to be.

I was wondering (and I know just enough to cause damage): would it be possible with the SDL backend to use SDL's software rendering? I don't know whether that would make getting it into Raylib easier or harder, or whether, if it is going to be done, it would be better as Raylib's own implementation (or a small header library implementation).

@Bigfoot71
Contributor

Bigfoot71 commented Sep 5, 2024

@wwderw raylib already has everything needed under the hood to do simple rendering of 2D shapes or images within other images. Here, the question of software rendering is about being able to use all of raylib's rendering functions, including those like DrawCube, DrawMesh, DrawModel, etc., to render directly to the screen in software. That therefore includes the steps of clipping, rasterization, color blending, etc.

Edit: What you would do with SDL, that is, create a surface, draw your pixels, and update the window surface or your renderer, is equivalent in raylib to creating an image, drawing in it, and then updating a texture that you subsequently render to the screen.

@wwderw

wwderw commented Sep 5, 2024

@Bigfoot71

> raylib already has everything needed under the hood to do simple rendering of 2D shapes or images within other images. Here, the question of software rendering is about being able to use all of raylib's rendering functions, including those like DrawCube, DrawMesh, DrawModel, etc., to render directly to the screen in software. That therefore includes the steps of clipping, rasterization, color blending, etc.
>
> Edit: What you would do with SDL, that is, create a surface, draw your pixels, and update the window surface or your renderer, is equivalent in raylib to creating an image, drawing in it, and then updating a texture that you subsequently render to the screen.

Yep, just know enough to cause damage. Fact that I didn't think that it already had some capability shows it right there.

@ElectroidDes1

what's the point of using software rendering in Raylib - if Raylib uses OpenGL anyway??

@RobLoach
Contributor

RobLoach commented Oct 3, 2024

For lower-end devices where OpenGL isn't available. Think raylib on a Gameboy.

@colesnicov

colesnicov commented Oct 3, 2024

I like Raylib, but I ran into a minor problem: I only get 4 FPS on a Raspberry Pi 2B :( I haven't found out where the problem is. So now I'm going to try native libDRM, but I haven't tested the FPS yet.

@Bigfoot71
Contributor

Bigfoot71 commented Oct 3, 2024

@raysan5 I am personally ready to work on this issue, but we need to agree on how it should function and how we should implement it.

Initially, I considered rewriting a version of rlgl.h that only includes the features it already has for OpenGL 1.1, but that would still be very limited, especially given the current implementation of rlgl.h for GL 1.1, such as the lack of lighting management...

Or the other solution would be to use a third-party library. In both cases, the rendering would be done in our own RAM buffer, and we would need to determine where and how, depending on the platform and the window management library, to update the final framebuffer with our buffer.

Some platforms may allow direct writing to the 'final framebuffer', while others may not, which can be a problematic implementation detail.
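
Where a platform does expose writable window memory, the update step could look roughly like the copy below. This is a sketch under stated assumptions: `PlatformSurface` and `PresentSoftwareFrame` are hypothetical names, both buffers are assumed to share the same 32-bit pixel format, and the only platform difference handled is row padding (`pitch`).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Hypothetical description of the memory a platform exposes for the window.
// 'pitch' is the byte distance between rows and may exceed width*4.
typedef struct PlatformSurface {
    uint8_t *pixels;
    int width;
    int height;
    int pitch;
} PlatformSurface;

// Copy a tightly packed 32-bit color buffer into the platform surface
void PresentSoftwareFrame(const uint32_t *colorBuffer, int width, int height, PlatformSurface *dst)
{
    assert((colorBuffer != NULL) && (dst != NULL) && (dst->pixels != NULL));
    assert((dst->width == width) && (dst->height == height));
    assert(dst->pitch >= width*(int)sizeof(uint32_t));

    size_t rowBytes = (size_t)width*sizeof(uint32_t);

    if ((size_t)dst->pitch == rowBytes)
    {
        // Rows are contiguous: a single bulk copy is enough
        memcpy(dst->pixels, colorBuffer, rowBytes*(size_t)height);
    }
    else
    {
        // Rows are padded: copy line by line and skip the padding bytes
        for (int y = 0; y < height; y++)
        {
            memcpy(dst->pixels + (size_t)y*(size_t)dst->pitch,
                   colorBuffer + (size_t)y*(size_t)width,
                   rowBytes);
        }
    }
}
```

On platforms that do not allow direct writes, the same RAM buffer would instead have to be handed to whatever upload call the windowing layer provides, which is the part that varies per backend.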

Have you thought about this? If you prefer the first method, I can start on it right away. If you prefer the second method, we should look into which library to use.

  • TinyGL has almost all the features of OpenGL 1.2, and I find it really very efficient.
  • PortableGL seems less efficient but handles shaders through function pointers, however.
  • Mesa, but this would be the same as running raylib on a machine with a Mesa driver, so no modifications would be needed...

I also think that if the goal is to render 3D graphics on limited embedded hardware, we should either use TinyGL or create our own minimalist implementation. Otherwise, Mesa already does the job on older machines...

@colesnicov

@Bigfoot71

> I also think that if the goal is to render 3D graphics on limited embedded hardware, we should either use TinyGL or create our own minimalist implementation. Otherwise, Mesa already does the job on older machines...

TinyGL does not support OpenGL ES2!

@Bigfoot71
Contributor

Bigfoot71 commented Oct 4, 2024

@colesnicov

> TinyGL does not support OpenGL ES2!

Indeed, I mentioned this clearly in my message (GL 1.2), but if the goal is execution on embedded hardware, shader support is likely to be very difficult to implement.

The most "obvious" issue is that you need to manage the interpolation of a variable number of attributes, which themselves can be of different types. Then there are also the "varyings" that need to be interpolated between vertices, which can also vary in number and type.

It’s already quite complicated to write such a system. I’ve tried it in C before, but it quickly became a real nightmare to manage. If I’m not mistaken, the creator of PortableGL chose to write that part in C++.

Also, how are we going to manage shaders? The approach taken by PortableGL, which is to use function pointers, seems ideal to me, but probably not suitable for raylib. The other option would be to write a GLSL interpreter/compiler (?) ...

Edit: Yes, a GLSL interpreter is simply unthinkable for pixel-by-pixel processing, so the answer is in the question...
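
To make the function-pointer approach concrete, here is a minimal sketch of a fragment "shader" written as an ordinary C function and handed to the rasterizer as a pointer. It is not PortableGL's actual API: `FragmentShaderFn`, `TintUniforms`, `TintedColorShader` and `ShadeFragment` are all hypothetical names chosen for illustration.

```c
#include <assert.h>

typedef struct Color4f { float r, g, b, a; } Color4f;

// Hypothetical shader signature: interpolated floats in, color out
typedef Color4f (*FragmentShaderFn)(const float *varyings, int varyingCount, const void *uniforms);

// Hypothetical per-draw constants, passed to the shader as an opaque pointer
typedef struct TintUniforms { Color4f tint; } TintUniforms;

// Example shader: reads the first three varyings as an RGB vertex color
// and multiplies them by a uniform tint
Color4f TintedColorShader(const float *varyings, int varyingCount, const void *uniforms)
{
    assert(varyingCount >= 3);

    const TintUniforms *u = (const TintUniforms *)uniforms;
    Color4f out = { varyings[0]*u->tint.r, varyings[1]*u->tint.g, varyings[2]*u->tint.b, u->tint.a };

    return out;
}

// The rasterizer only knows the function pointer, not what the shader does
Color4f ShadeFragment(FragmentShaderFn shader, const float *varyings, int varyingCount, const void *uniforms)
{
    assert(shader != 0);
    return shader(varyings, varyingCount, uniforms);
}
```

The shader is compiled with the rest of the program, so there is no GLSL parsing at run time; the price is that existing GLSL shaders would have to be rewritten by hand in this form.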

@rswinkle

rswinkle commented Oct 4, 2024

Hi, author of PortableGL here, thought I'd chime in and correct a few things.

> Indeed, I mentioned this clearly in my message (GL 1.2), but if the goal is execution on embedded hardware, shader support is likely to be very difficult to implement.

PortableGL already does, without modification, as long as the hardware supports C99 and 32-bit floats (even soft floats, though that would be monstrously slow).

> The most "obvious" issue is that you need to manage the interpolation of a variable number of attributes, which themselves can be of different types. Then there are also the "varyings" that need to be interpolated between vertices, which can also vary in number and type.
>
> It’s already quite complicated to write such a system. I’ve tried it in C before, but it quickly became a real nightmare to manage. If I’m not mistaken, the creator of PortableGL chose to write that part in C++

It's really not that complicated. I have a whole section on that in my README (see the section on TTSIOD). Long story short, no type information is needed, just how many floats you need to interpolate. Also, I'm confused by what you mean by distinguishing between interpolation of attributes and "also the 'varyings'". Attributes are attributes; either they vary or they don't.

Also PortableGL is pure C. You can write shaders in pure C. The built in shader library shaders are in C. I just use C++ in many demos/examples for operator overloading to make it easier and more comparable to GLSL.

> Also, how are we going to manage shaders? The approach taken by PortableGL, which is to use function pointers, seems ideal to me, but probably not suitable for raylib. The other option would be to write a GLSL interpreter/compiler (?) ...

From some discussions about using PortableGL for raylib here, it really would work perfectly fine. As I understand it, Raylib uses a few set shaders which, if not already covered by the standard shader library, would be easy enough to port. Users' custom shaders (if they used any, and if Raylib even supports that) would have to be ported by the user, of course.

EDIT: wrong link

@Bigfoot71
Contributor

Bigfoot71 commented Oct 4, 2024

> It's really not that complicated. I have a whole section on that in my README (see the section on TTSIOD). Long story short, no type information is needed, just how many floats you need to interpolate. Also, I'm confused by what you mean by distinguishing between interpolation of attributes and "also the 'varyings'". Attributes are attributes; either they vary or they don't.

Excuse me, it had been a long time since I read your readme, and I must have mixed everything up.

And yes, regarding the attributes, I repeated myself. I rewrote my message before sending it, it was very late and I used a translator, I would have been better off re-reading the whole thing....

However, treating all values as if they were floats seems like a quick fix to me. On modern hardware, perhaps the overhead is negligible or equivalent to another method that would use templates/specializations, but on limited embedded hardware that wouldn’t even have Mesa support available, which is the case where such support in raylib would be necessary, I doubt this is a viable solution...

Another example of a rendering library specialized for embedded hardware that has made the choice for specialization is: EmberGL.

So when I say that it would be complicated, I meant for embedded hardware. Achieving something decent on PC is quite feasible, but on limited hardware seems to be a different matter, and treating all values as floats really doesn't seem to be a viable solution in our case...

Maybe it's a mistake to think that way, which is why I will conduct some comparative tests. I will come back here once they are done with reproducible tests; it's quite an interesting question.

@Bigfoot71
Contributor

Bigfoot71 commented Oct 4, 2024

I conducted a very naive test on linear interpolation, here are the links:

Here are the results on my side for the same number of iterations as in the snippets on Godbolt:

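| Version | Optimization | Average time (ms) |
|---|---|---|
| C | Without O | 0.133465 |
| C | With Os | 0.000021 |
| C | With O2 | 0.000020 |
| C++ with template | Without O | 0.116042 |
| C++ with template | With Os | 0.000020 |
| C++ with template | With O2 | 0.000020 |
| C++ with specialization | Without O | 0.117768 |
| C++ with specialization | With Os | 0.155322 |
| C++ with specialization | With O2 | 0.000021 |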

Here’s the version of GCC I used for the test:

gcc --version
gcc.exe (Rev3, Built by MSYS2 project) 14.1.0

The examples on Godbolt don’t include the timing functions, I used QueryPerformanceFrequency from the Windows API, just to measure the second loop, and then averaged the results once the first loop was completed.

For the non-optimized version, I ran the program three times and kept the best result, but for the optimized version, I simply took the first result, as the values were so small that it obviously wasn’t significant anymore.
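
For readers who want to repeat this kind of measurement, a portable sketch of the timing loop follows. It deliberately swaps the Windows API call mentioned above for standard C `clock()`, and it does not reproduce the Godbolt snippets, which are not shown here; `Lerp` and `AverageLerpMs` are hypothetical names.

```c
#include <assert.h>
#include <time.h>

// Function under test: plain linear interpolation between a and b
float Lerp(float a, float b, float t)
{
    return a + t*(b - a);
}

// Average milliseconds per run, over 'runs' runs of 'iterations' Lerp calls each.
// The accumulated result is written to 'sink' so the work has a visible effect.
double AverageLerpMs(int runs, int iterations, float *sink)
{
    assert((runs > 0) && (iterations > 0) && (sink != 0));

    double totalMs = 0.0;
    volatile float acc = 0.0f;   // volatile so the optimizer cannot drop the loop

    for (int r = 0; r < runs; r++)
    {
        clock_t start = clock();

        for (int i = 0; i < iterations; i++)
        {
            acc += Lerp(0.0f, 1.0f, (float)i/(float)iterations);
        }

        clock_t end = clock();
        totalMs += 1000.0*(double)(end - start)/(double)CLOCKS_PER_SEC;
    }

    *sink = acc;
    return totalMs/(double)runs;
}
```

`clock()` measures processor time at a coarse resolution, so short runs may report 0 ms; raising `iterations` until a run takes several milliseconds gives more usable numbers.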

It’s not very scientific, of course, but it gives a general overview. I’ll conduct a more thorough test in a real case scenario by rasterizing a triangle using the three methods on different machines as soon as I have time. I’ll put everything on a repository if you’re interested.

For now, the trend seems to show that the C version doesn’t seem to be the most efficient.

By the way, in this test, the specialized version is a bit weak on its own, I didn’t have much flexibility here. Still, without optimization, it remains better than the C version, even after repeating the test again and again on my side... However yes, the function is not inlined in the C version on godbolt, but even by forcing it on my side the results seem identical...

But I still believe that if the goal is to provide a way to do software graphics rendering for embedded hardware using raylib, then each step must be approached in detail. To provide shader support, it needs to be well thought out or more restricted.

If, once again, that's not the goal, then Mesa is already doing the job after all...

@rswinkle

rswinkle commented Oct 4, 2024

I wasn't even referring to speed when I said it was simple, but even your C implementation (which is not what PortableGL does) matches or surpasses both C++ versions with O2, which means they're being compiled to essentially the same thing.

However, I think there's a miscommunication here. When I say no types, I mean not caring about whether an attribute is a float, vec2, vec3, etc. or in TTSIOD (which is not OpenGL) whether it's a light direction or an ambient light component or whatever.

In OpenGL all vertex attributes regardless of type are converted to floats internally. So in the end you have some number of float components to interpolate and it doesn't matter what they actually were originally or how they were grouped (ie a uvec2, an ivec3, and 2 vec3's just becomes 11 floats) which brings me to what PortableGL actually does.
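
The "11 floats" example from the paragraph above can be sketched as a simple packing step. This only illustrates the idea as described in the comment and is not PortableGL code; `PushUnsigned`, `PushSigned`, `PushFloats` and `PackExampleVertex` are hypothetical helpers.

```c
#include <assert.h>

// Append 'count' unsigned components as floats; returns the new write position
int PushUnsigned(float *dst, int pos, const unsigned int *src, int count)
{
    for (int i = 0; i < count; i++) dst[pos + i] = (float)src[i];
    return pos + count;
}

// Append 'count' signed components as floats; returns the new write position
int PushSigned(float *dst, int pos, const int *src, int count)
{
    for (int i = 0; i < count; i++) dst[pos + i] = (float)src[i];
    return pos + count;
}

// Append 'count' float components unchanged; returns the new write position
int PushFloats(float *dst, int pos, const float *src, int count)
{
    for (int i = 0; i < count; i++) dst[pos + i] = src[i];
    return pos + count;
}

// Pack one vertex made of a uvec2, an ivec3 and two vec3s into a flat float array.
// 'dst' must have room for at least 11 floats.
int PackExampleVertex(float *dst, const unsigned int uv[2], const int iv[3],
                      const float a[3], const float b[3])
{
    assert(dst != 0);

    int pos = 0;
    pos = PushUnsigned(dst, pos, uv, 2);
    pos = PushSigned(dst, pos, iv, 3);
    pos = PushFloats(dst, pos, a, 3);
    pos = PushFloats(dst, pos, b, 3);

    return pos;   // 2 + 3 + 3 + 3 = 11 float components
}
```

After packing, the rasterizer only needs the returned component count; the original grouping into vectors no longer matters to the interpolation step.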

I don't use any function calls at all. A for loop doing interpolation on each component inline should always be faster than function calls and your method is particularly inefficient since it's calling a function for every component unlike TTSIOD which uses templates for each hard coded attribute (ie one function handles all components of ambient light). Here's the loop for linear interpolation and here's what the barycentric interpolation looks like for a filled triangle. Clearly I need to update the line numbers referenced in the readme.
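
Since the links to the actual loops are not reproduced here, the following is a generic sketch of per-component interpolation over a flat float array, not PortableGL's code. `InterpolateLinear` and `InterpolateBarycentric` are hypothetical names, and perspective correction is left out to keep the example short.

```c
#include <assert.h>

// Linear interpolation of 'count' float components between a and b, with t in [0, 1]
void InterpolateLinear(const float *a, const float *b, float t, int count, float *out)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) out[i] = a[i] + t*(b[i] - a[i]);
}

// Barycentric interpolation of 'count' float components inside a triangle.
// w0, w1, w2 are the weights of the three vertices and are expected to sum to 1.
void InterpolateBarycentric(const float *v0, const float *v1, const float *v2,
                            float w0, float w1, float w2, int count, float *out)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) out[i] = w0*v0[i] + w1*v1[i] + w2*v2[i];
}
```

Both loops treat the vertex data as an untyped run of floats, so the same code handles colors, texture coordinates or normals. In a real rasterizer the loop body would typically sit directly inside the per-pixel loop, as the comment above describes, rather than behind a function call.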

One of the biggest reasons PGL is probably slower than TinyGL is that it's doing so much more and has so much more flexibility. While in the long term I may try to optimize it to some extent, I really think it's better for people to optimize it for their use case/platform since that will always be more effective than general platform agnostic optimizations and it will keep the code simpler (vs trying to for example support x86 SIMD and NEON etc. conditionally). It's easy to rip code out that you don't need (don't need blending or zbuffer? toss it out, don't use anything but floats? replace the conversion function with a single line and on and on), and profile your application.

@Bigfoot71
Contributor

Bigfoot71 commented Oct 4, 2024

Thank you for the links! And indeed, there was a misunderstanding, it was on my part, I apologize for that...

To get straight to the point, I simply wanted to say that where a software renderer would be useful for raylib is on platforms that don't support a Mesa driver.

These platforms are therefore limited, and no matter how we implement a "programmable shader system", it will require a lot of thought to minimize the additional cost it brings in order to achieve continuous and stable rendering.

And given TinyGL's performance compared to many other software rendering libraries, it is certainly the best choice for raylib to provide stable real-time rendering on limited devices.


Otherwise, if shaders are very important, there is EmberGL, which is the most efficient I know of so far for embedded devices (with shaders*). It shows stable rendering with lighting on a Teensy 4.0! But its integration into raylib is likely to be more complex than PortableGL's :/


10 participants