
Brave story - graphics glitches (debugged) #6754

Closed
whyisthisfieldhere opened this issue Aug 20, 2014 · 39 comments

Comments

@whyisthisfieldhere

Hi, I found a few issues in Brave Story (hardware graphics) and debugged them to varying degrees.
In order from most to least simple to fix / likely to be fixed:

  1. Smoke & fire are colored oddly with hardware graphics (also see #6332, Dante's Inferno weird colors around the fire).
    From debugging, it looks like the smoke primitives contain varying color data while the GE's shademodel is set to GE_SHADE_FLAT.
    The software renderer appears to check the value of shademodel and ignore all but the first color in this case, resulting in the correct image.
    However, it looks like no other renderers check shademodel, which likely results in the colorful smoke.
    This should be easy for a dev to fix, I think - something along the lines of the sketch after the screenshots below. (NOW FIXED)

  2. Enemy colors don't always differ in hardware graphics (notice the color of the right bunny in the screenshots below, software vs. hardware):
    From debugging, it looks like brave story sets the framebuffer to the clut area (with inappropriate framebuffer height) and draws colors into it. It then uses the clut for drawing the texture.
    In software, you can see that the clut area in memory gets updated between the first and second rabbit.
    In hardware, you can see that the clut area in memory doesn't get updated in this case.
    (I'm not sure how it's updated at all in hardware, to be honest)

  3. With vertex cache set (which is the default), there is serious flickering in the worldmap. Perhaps vertex cache shouldn't be enabled by default until it's more reliable?

Screens of bugs 1 & 2:
Hardware: (screenshot: hardware)
Software (no bug): (screenshot: software)
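
For bug 1, here's a minimal standalone sketch of the kind of check I mean (invented struct and helper names, not PPSSPP's real decoder types - the actual fix would live wherever the hardware backends build their vertex data):

    // Hypothetical sketch: when the GE shade model is flat, collapse the
    // per-vertex colors so every vertex of a primitive carries the same color,
    // mirroring what the software renderer does by ignoring all but one color.
    #include <cstdint>
    #include <vector>

    struct DecodedVert {
        float pos[3];
        uint32_t color;  // RGBA8888 after decoding
    };

    enum ShadeMode { SHADE_GOURAUD, SHADE_FLAT };

    void ApplyFlatShading(std::vector<DecodedVert> &verts, int vertsPerPrim, ShadeMode mode) {
        if (mode != SHADE_FLAT)
            return;
        for (size_t i = 0; i + vertsPerPrim <= verts.size(); i += vertsPerPrim) {
            for (int j = 1; j < vertsPerPrim; ++j)
                verts[i + j].color = verts[i].color;  // first color wins, as in software
        }
    }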

@hrydgard
Owner

  1. Ah, nice catch. We should be able to emulate that fairly easily with the "flat" GLSL interpolation qualifier, in GL3+ at least (see the sketch below).

  2. Hm, rendering to CLUT is nasty... Technically, it seems the "Read framebuffers to RAM" modes should take care of it, though. Other than that, catching it would require some fairly expensive checks...
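
For 1), the shader-side change is tiny - roughly these declarations, sketched here as the string fragments the shader generator would emit (illustrative only; the real generator builds its shaders dynamically):

    // With the "flat" interpolation qualifier, the whole primitive takes its
    // color from the provoking vertex, so we get one color per primitive
    // without touching the vertex data. GL3+/GLES3 only; if the PSP picks a
    // different vertex than GL's default, glProvokingVertex can adjust that
    // on desktop GL.
    static const char *vs_color_decl = "flat out vec4 v_color0;";  // vertex shader side
    static const char *fs_color_decl = "flat in vec4 v_color0;";   // fragment shader side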

@unknownbrackets
Collaborator

The first problem might be a bit of a pain to fix. We would possibly have to decode the vertices differently to apply flat shading (GLES 2.0 doesn't support flat). I'm not really sure, since OpenGL is not my strong suit. @hrydgard?

The second problem is somewhat hard. We try to detect when a game uses a render surface as a texture, because downloading the render to memory is expensive. How different is the height? Are the addresses identical?

Oh, did you mean it actually renders the clut palette?

The third problem - yes, there are several games that the vertex cache causes problems for. And optimizations have been made that leave the vertex cache not much faster in some games. I would like to disable it by default, personally, but there are a lot of games it works fine for and is still much faster in. People end up complaining that the new release is slower and will just use the older version (with its other bugs).

Another idea might be to find a way to make the vertex cache less aggressive in cases where it's wrong, somehow, or learn from its failures better.

-[Unknown]

@hrydgard
Owner

Actually, 1) is even easier if the game is rendering just RECT primitives: then we can just copy the color from the first vertex to the second (or the other way around?) before expanding to a RECT.
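
Roughly like this, right before the expansion runs (hypothetical vertex struct, just to illustrate the idea):

    #include <cstdint>

    // A GE rectangle arrives as a two-vertex pair; making both source vertices
    // agree on one color means all four expanded corners come out identical.
    struct RectVert { float x, y, z; uint32_t color; };

    void FlattenRectColor(RectVert &v0, RectVert &v1) {
        // Copy the first vertex's color onto the second (or possibly the other
        // way around - that part needs a hardware test) before the pair is
        // expanded into four corners.
        v1.color = v0.color;
    }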

Yeah, vertex cache is a big tricky tradeoff between glitches and speed that could probably use more work.

@unknownbrackets
Collaborator

Maybe we could bail all flat prims to software transform? I wonder how common they are.

-[Unknown]

@hrydgard
Owner

For ES 2.0 where flat is not available, we would have to add some extra code to "de-index" triangles too, similarly to how we expand rectangles, in order to set vertex colors separately. Although that would be pretty easy.
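
A sketch of what that de-indexing could look like (standalone illustration with invented types, not the real decoder):

    // For ES 2.0, where the flat qualifier isn't available: expand an indexed
    // triangle list into plain triangles so each output vertex can be given
    // the primitive's single flat color explicitly.
    #include <cstdint>
    #include <vector>

    struct Vert { float pos[3]; uint32_t color; };

    std::vector<Vert> DeindexFlatTriangles(const std::vector<Vert> &verts,
                                           const std::vector<uint16_t> &inds) {
        std::vector<Vert> out;
        out.reserve(inds.size());
        for (size_t i = 0; i + 2 < inds.size(); i += 3) {
            // Take the flat color from one vertex of the triangle (which one
            // the PSP actually uses would need to be confirmed by tests).
            const uint32_t flatColor = verts[inds[i]].color;
            for (int j = 0; j < 3; ++j) {
                Vert v = verts[inds[i + j]];
                v.color = flatColor;
                out.push_back(v);
            }
        }
        return out;
    }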

I don't think enabling flat shading is very common, maybe we should just add some reporting and see. It could be that games enable it for 2D stuff for no good reason then leave it enabled when drawing non-lit 3D stuff by accident though, and then we could lose performance...

@whyisthisfieldhere
Author

Thanks, guys.

  1. Yikes, I didn't even imagine there'd be OpenGL/ES versions not supporting something as basic as flat shading, but it seems like you have some ideas to work around those cases as well.
  2. The game indeed renders to the cluts before using them. There are two cluts, at 0x1ff800 & 0x1ffc00. Before drawing each monster, the game renders to them via:
    (screenshot: clutrender)
    About the height - it looks like you estimate it based on scissor/viewport/region, which this game doesn't set differently for the clut rendering:
    (screenshot: clutrenderinfo)
    hrydgard, by "Read framebuffers to RAM", do you mean "Read framebuffers to memory (CPU/GPU)"? Both of these cause a crash on startup when running Brave Story. It looks like it reads GE command lists from VRAM (0x04xxxxxx) and goes off the rails when these settings are enabled. Perhaps it writes junk due to wrong framebuffer size estimations? I can provide more details if needed.
    But perhaps you could detect a clut being loaded from where a framebuffer is, similar to how you detect textures being drawn from where framebuffers are? Something along the lines of the sketch below.
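
For illustration, the kind of check I mean at clut-load time (completely made-up names, just to show the shape of it):

    // If the CLUT load address overlaps a framebuffer currently being rendered
    // on the GPU, its contents in PSP RAM are stale, so the rendered pixels
    // would need to be read back before the CLUT is used.
    #include <cstdint>
    #include <vector>

    struct VirtualFB {
        uint32_t addr;      // guest address the game renders to
        uint32_t byteSize;  // stride * estimated height * bytes per pixel
    };

    bool ClutOverlapsFramebuffer(uint32_t clutAddr, uint32_t clutBytes,
                                 const std::vector<VirtualFB> &fbs) {
        for (const auto &fb : fbs) {
            if (clutAddr < fb.addr + fb.byteSize && clutAddr + clutBytes > fb.addr)
                return true;  // the palette lives in rendered pixels - download needed
        }
        return false;
    }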

@hrydgard
Owner

Can definitely detect it, but those checks have a cost. If this is the only game that renders directly to CLUT, maybe we could do something specific like we had to do for Dangan Ronpa...

@unknownbrackets
Collaborator

Does the CLUT load address match the render address exactly? 0x041ff800 and 0x041ffc00, I mean. It might not be too bad to call PerformMemoryDownload for the CLUT size in that case.

-[Unknown]

@unknownbrackets
Collaborator

Oh, and about the height - we have no other choice but to estimate it. Games do not need to set any form of height to render. They just set a start address, a stride, and send coordinates.

This is actually why "read framebuffers to ram" crashes, since it simply writes all renders to RAM always, and does so based on estimates which aren't always right. We actually have a bug about removing or at least hiding that option.

-[Unknown]

@whyisthisfieldhere
Author

Yes, the framebuffer ptr matches the clut load address exactly in Brave Story.
Up to you whether it's worth fixing and how.

Curious - are there any discussions about alternative graphics rendering approaches (i.e. somewhat slower, but still fast on desktop, and high quality) that would give better compatibility?

@unknownbrackets
Collaborator

I think detecting a direct clut would not be so expensive (on clut load), will see. Do you have a savestate there with both enemies? I beat the game a while back on my PSP... don't remember where such enemies were common.

Hmm, only fragmented ones. The best place may be the forum or a new bug if we have a specific direction in mind.

The software renderer could use SIMD and definitely has optimization potential. There are still bugs in it, and things we have not tested or fully understood yet. But on desktop I think it's feasible to make it way faster.

Something that gave us more direct access to unswizzled vram, if it existed, would of course be awesome.

-[Unknown]

@whyisthisfieldhere
Author

I uploaded a savestate with the colorful rabbits:
https://drive.google.com/file/d/0B0YP0xIZ8e4gWm9LLWpLdEJkNDg/edit?usp=sharing

@unknownbrackets
Collaborator

Looks like it also uses a block transfer for the same CLUT when rendering other enemies. So if we always download the rendered pixels, we end up wrong for the other rabbit (and the render is not updated because the single row of pixels in the top left uses the wrong stride.)

Probably the best thing, in this specific case at least, would be to detect a render to an address that is used as a clut, and flush it right away. It might not be so expensive to check whenever switching render targets, will just need to keep track of (and decimate) a list of CLUT addresses.
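
Roughly this kind of bookkeeping, I mean (invented names, just a sketch of the shape of it):

    // Remember recently used CLUT load addresses; when a render target is
    // bound at one of them, the caller downloads that render right away.
    #include <cstdint>
    #include <map>

    class ClutRenderDetector {
    public:
        void NotifyClutLoad(uint32_t addr, uint32_t sizeBytes, int frame) {
            cluts_[addr] = { sizeBytes, frame };
        }

        // Called when switching render targets. Returns how many bytes should
        // be downloaded immediately, or 0 if this address was never a CLUT.
        uint32_t BytesToFlush(uint32_t renderAddr, int frame) {
            Decimate(frame);
            auto it = cluts_.find(renderAddr);
            return it != cluts_.end() ? it->second.sizeBytes : 0;
        }

    private:
        struct Entry { uint32_t sizeBytes; int lastFrame; };

        void Decimate(int frame) {
            // Forget CLUT addresses that haven't been loaded for a while so
            // the check stays cheap and stale entries don't force downloads.
            for (auto it = cluts_.begin(); it != cluts_.end(); ) {
                if (frame - it->second.lastFrame > 60)
                    it = cluts_.erase(it);
                else
                    ++it;
            }
        }

        std::map<uint32_t, Entry> cluts_;
    };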

A bit hacky though. Ugh.

-[Unknown]

@hrydgard
Owner

By the way, it's unclear in general which of the two input vertices of a rectangle we should take the color from for each of the four output vertices, flat mode or not. I think we'll need some testing there, just like for the UVs, as mentioned in #3902.

@hrydgard
Owner

@whyisthisfieldhere yes, a way faster software renderer could be made, but it would be a lot of work. Maybe we could even hook in JPCSP's new "external software renderer" although I'm not sure if they would be OK with that.

@unknownbrackets
Collaborator

Does #6855 help with the vertex cache flickering?

-[Unknown]

@daniel229
Collaborator

Still flickering, but not as strong.
https://www.youtube.com/watch?v=ySX0-X2h2o8

@daniel229
Collaborator

The smoke is fixed in #7115.
The enemy colors and the flickering are not fixed yet.

@hrydgard
Owner

Thanks for testing, that's good. It's expected that the enemy colors aren't working yet because we have no checks for rendering to a palette yet...

@unknownbrackets
Collaborator

Even forcing the vertex cache to rehash every draw, the flicker is still there. In fact, deleting the vbo/ebo and regenerating them every time still flickers.

I also tried disabling draws by vert count (they're all either 4, 18, 138, 384, 3750, or 6528 in the area I was testing). Different counts of verts caused different areas and types of flicker. All but the 4-vert draws used indexes, and disabling all draws using indexes did kill the flicker.

After drawing the background, it uses a texture (0x041977f0 in my case) to texture the ground (you can see the texturing in the video). It uses fixed blending to do this: 0 * src + src * dst, or effectively src * dst. The verts have only normals and positions (UVs are omitted; it uses tex matrix UV gen).

The following completely eliminated the flicker:

        if (gstate.getUVGenMode() != GE_TEXMAP_TEXTURE_COORDS)
            useCache = false;

This only affected ~4% of draws. So I think this means we have some bug in the vertex cache affecting UVgen. But it could just be a coincidence (something else affecting exactly the same verts.)

-[Unknown]

@hrydgard
Owner

Hm, interesting. What kind of bug could that be, that would not affect non-cached drawing? We do have a slightly different path when uvgen is active, but only when texture coord prescale is enabled...

@unknownbrackets
Collaborator

I'm not sure. I wasn't able to find anything - the shader id is the same, the vertex format is the same, etc. The issue only happens when using a vbo - using an ebo but no vbo doesn't do it. And prescale doesn't affect it.

Strange...

-[Unknown]

unknownbrackets added a commit to unknownbrackets/ppsspp that referenced this issue Apr 28, 2015
This is pretty much only tested with Brave Story.  See hrydgard#6754.

There may be other cases which are not handled yet.
@unknownbrackets
Collaborator

For reference, a version of render-to-clut that does work for this game:
https://github.com/hrydgard/ppsspp/compare/hrydgard:master...unknownbrackets:clut-render?expand=1
(this is "the lazy way" of downloading on every clut load.)

However, the performance hit is not good at all. 815% -> 420% in battle with the rabbits.

-[Unknown]

@hrydgard
Owner

Cool!

But yeah, that's surely the downloads killing performance. It will be even worse on mobile, most likely... I fear we will have to "depal" on the GPU, either directly when rendering (so we'd have to do the filtering ourselves too) or in a prepass like we do with paletted texturing from framebuffers, as we discussed.
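
The prepass idea boils down to a fullscreen pass with a shader along these lines (a generic sketch, not PPSSPP's actual depal shader):

    // Read the indexed framebuffer and use each index to fetch the final color
    // from a 256x1 CLUT texture, writing the depalettized result to a new
    // target that normal texturing can then sample.
    static const char *depal_fs = R"GLSL(
    #version 300 es
    precision highp float;
    uniform sampler2D u_index;  // framebuffer holding palette indices in .r
    uniform sampler2D u_clut;   // 256x1 palette texture
    in vec2 v_texcoord;
    out vec4 fragColor;
    void main() {
        float index = texture(u_index, v_texcoord).r;  // 0..1 from an 8-bit channel
        float u = (index * 255.0 + 0.5) / 256.0;       // center of the CLUT texel
        fragColor = texture(u_clut, vec2(u, 0.5));
    }
    )GLSL";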

@unknownbrackets
Collaborator

Indeed, I just wanted to validate the uploads to clut, etc. (since the rabbits are rendered differently - one is from a block transfer, the other from a render).

The upload to clut (to the clut framebuffer, I mean) isn't free either, but it's not nearly as bad. Without download it's about 780%.

-[Unknown]

@hrydgard
Owner

Yeah, uploads are much cheaper: they can be queued by the driver and don't stall rendering for more than the time it takes to copy the data into the driver's buffer queue. Downloads, on the other hand, require the GPU to catch up to a certain point and sync, which is terrifyingly expensive in comparison, as you know.
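
Concretely, the asymmetry in plain GL terms (illustration only, not our actual code paths; assumes a bound GL_TEXTURE_2D and a 256x1 RGBA8888 buffer named clutData):

    // Upload: the driver copies the data into its own queue and returns right
    // away; the GPU consumes it whenever it gets around to it.
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 256, 1,
                    GL_RGBA, GL_UNSIGNED_BYTE, clutData);

    // Download: every queued draw touching the source must finish first, so
    // the CPU stalls until the GPU catches up before the copy can happen.
    glReadPixels(0, 0, 256, 1, GL_RGBA, GL_UNSIGNED_BYTE, clutData);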

@daniel229
Collaborator

Also fixes #6162

@unknownbrackets
Collaborator

Neat, thanks for testing. I should've realized it might be doing that.

@hrydgard, would it be better to upload as single channel? It's tempting to try to upload the raw texture directly, but I think we'd need to translate everything (definitely at least 4-bit) to 8-bit first. Then, maybe upload as GL_ALPHA / GL_RED depending on GL version?
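
For the 4-bit case, the expansion itself is trivial - a sketch (the PSP packs two indices per byte, low nibble first; the choice between GL_ALPHA and GL_RED would depend on the GL version, as noted above):

    #include <cstdint>
    #include <vector>

    // Expand packed 4-bit palette indices into one byte per pixel, ready to be
    // uploaded as a single-channel texture.
    std::vector<uint8_t> ExpandIndices4To8(const uint8_t *src, int numPixels) {
        std::vector<uint8_t> out(numPixels);
        for (int i = 0; i < numPixels; i += 2) {
            out[i] = src[i / 2] & 0x0F;        // first pixel: low nibble
            if (i + 1 < numPixels)
                out[i + 1] = src[i / 2] >> 4;  // second pixel: high nibble
        }
        return out;
    }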

With my other change (tex-split), we can potentially detect 1:1 texture draws, which might allow us to make different choices about where we decode the texture. But I suppose the simplest thing would be to process it first (like fb depal).

Not sure if it makes sense to try to combine this code into the existing depal. Maybe not...

-[Unknown]

unknownbrackets added commits to unknownbrackets/ppsspp that referenced this issue on Aug 16, 2015, Sep 14, 2015, Oct 24, 2015 (twice), and Nov 26, 2015, each with the same message:
This is pretty much only tested with Brave Story. See hrydgard#6754.

There may be other cases which are not handled yet.
@unknownbrackets
Collaborator

I haven't necessarily given up on #8246 yet, but I'm not really sure how much gain we'll see from it now...

Anyway, this particular issue is fixed, I believe. The vertex cache flickering is still there a bit, but not as bad. I would personally like to disable it by default, but it's still a big speedup on some devices and in some games.

Can reopen if I missed anything.

-[Unknown]

@LunaMoo
Collaborator

LunaMoo commented Dec 18, 2017

Reopening since it still seems to be a thing, and it differs depending on the backend:

  • OGL - enemies are usually completely invisible, though their effects, like fire, look correct:
    (screenshot: ulus10279_00001)

  • D3D11/D3D9/Vulkan - some enemies are partially flashing or partially invisible. I guess not those birds, but they have another problem: their fire effects have rainbow colors:
    (screenshot: ulus10279_00002)
    (The enemies below are from the Earthrift / optional dungeon on the northeast island; here, parts of them disappear roughly every other frame, leaving holes, especially noticeable on the enemy on the right.)
    (screenshot: ulus10279_00003)

Weirdly, the graphics sometimes display fine, or the problems are reduced by disabling and re-enabling slower effects; using savestates also seems to affect it in various ways.

The problem doesn't show up in frame dumps, so here's post-game savedata with access to pretty much every place on the world map via dragon travel (Triangle -> Other -> Vyne), as well as the ability to freely swap party members, since some special attacks might also be affected/buggy on their own.
ULUS102790002 post game.zip

Edit:
Seems like in non-OGL backends (tested in D3D11) the problem can be partially worked around by:

  • making a savestate in a place with flashing/glitchy opponents,
  • disabling slow effects and checking in-game,
  • re-enabling slow effects, <- the speed of the flashing changes here,
  • loading the savestate. <- the flashing stops and the models look correct; the fire on those bird enemies is still rainbow-colored, though.

Just reloading the savestate doesn't do anything, so this is all weird and might be some race issue :].

In OGL nothing helps other than disabling slower effects and sacrificing colors; it would be nice if someone could confirm on a non-AMD GPU, just in case... drivers, heh ;p.

LunaMoo reopened this Dec 18, 2017
@unknownbrackets
Collaborator

The game does draws to render the CLUTs for characters - it sounds like maybe this is what's going wrong on your card?

Vulkan definitely has messed up effects for me, but OpenGL - I messed around and didn't see any missing parts or invisible enemies. I only tried like 5 or 6 battles though - how common was it for you?

I think the CLUT download isn't fully working on non-GL, though... that or flat interpolation isn't.

-[Unknown]

@LunaMoo
Collaborator

LunaMoo commented Dec 20, 2017

Heh, so another driver bug it is... For me, every single fight in OGL has invisible enemies; unless I disable slow effects, only their special effects, like the fire on those birds, can be seen. Stupidly, OGL is the only backend where that fire actually looks OK :P.

I tried to check whether it's a regression, but older versions of PPSSPP just crashed for me when trying to boot the game in OGL.

@unknownbrackets
Collaborator

Hmm, crash on boot in this game was caused by (and I think has always been caused by) "read framebuffers to memory".

-[Unknown]

@hrydgard
Owner

It's really time to rip out those modes now... I know, I know, some rare game still needs them, but they're clearly way more trouble than they're worth.

@LunaMoo
Collaborator

LunaMoo commented Dec 20, 2017

I never use "read framebuffers to memory", so I'm quite sure it's my AMD OGL drivers :C. But yeah, the option itself doesn't have much reason to be available. Many people might see the progression from skipping buffer effects through buffered rendering as a gradation and assume "read framebuffers to memory" is the best option, which is really bad.

Edit:
I tested even older builds: in 1.3-718-g79bd01e, which I still had, everything worked fine, while 1.4-232 was already crashing, and the latest builds have missing graphics. Guess I'll try to find out exactly where things started changing. It's probably still an AMD driver bug, but maybe the problem can be worked around somehow.

@LunaMoo
Collaborator

LunaMoo commented Dec 20, 2017

Actually, it turns out I was wrong about the latest buildbot not crashing - it still crashes, I just confused it with my custom build. So the official version doesn't have missing graphics in OGL, it simply crashes.

Also, I found out it's caused by the removal of the legacy BGRA path, which also caused missing graphics in Final Fantasy on AMD GPUs, as can be seen here. Except in this game it simply crashes :c.

Edit: On a side note, the missing graphics I experienced in my custom build instead of the crash could just be me doing a poor job of trying to restore that legacy path. :|

LunaMoo added the D3D11, D3D9 and Vulkan labels Dec 21, 2017
@unknownbrackets
Collaborator

On NVIDIA, all backends look good. Is this still an issue on AMD with the latest git?

Note: make sure vertex cache is off. It's now off by default, except when OpenGL is the default.

-[Unknown]

@unknownbrackets
Collaborator

How does this look now on AMD devices?

Note: The BGRA path has probably been removed again with the latest rewrite, but hopefully it still works.

-[Unknown]

@unknownbrackets
Collaborator

I think this is probably fixed - not sure about AMD devices, but hopefully we're in a good place now. We should probably create new issues if there are new problems.

-[Unknown]
