Reconstructing the viewport matrix from the composite matrix and inverse model view matrix

Erik Abair edited this page Apr 24, 2022 · 6 revisions

To operate in clip space as OpenGL expects, the viewport transform applied by Xbox shaders must be undone. Via the pushbuffer, programs provide an inverse model view matrix and a composite matrix.

The composite matrix is: (model_view_matrix x projection_matrix x viewport_matrix)

Because (inverse_model_view_matrix x model_view_matrix) is the identity, (inverse_model_view_matrix x composite_matrix) gives (projection_matrix x viewport_matrix).
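The cancellation can be checked numerically. The following is a minimal sketch, assuming row-vector (D3D-style) 4x4 matrices stored as nested lists. The translation used as the model view matrix and the values in `projection_viewport` are arbitrary placeholders, and none of the names come from xemu.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def translation(tx, ty, tz):
    """Row-vector translation matrix (the offsets live in the last row)."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]


# Hypothetical stand-ins for the matrices a program would submit.
model_view = translation(2.0, -3.0, 5.0)
inverse_model_view = translation(-2.0, 3.0, -5.0)
projection_viewport = [[320.0, 0.0, 0.0, 0.0],
                       [0.0, -240.0, 0.0, 0.0],
                       [320.0, 240.0, 1.5, 1.0],
                       [0.0, 0.0, -0.5, 0.0]]

# composite = model_view x (projection x viewport)
composite = matmul(model_view, projection_viewport)

# Left-multiplying by the inverse model view cancels the model view term.
recovered = matmul(inverse_model_view, composite)
```

With these inputs `recovered` matches `projection_viewport` entry for entry. With real data the comparison should allow for floating-point error.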

It is likely safe to assume that users delegate the creation of the projection and viewport matrices to DirectX, so some assumptions can be made about the structure of those matrices (see for example https://docs.microsoft.com/en-us/windows/win32/direct3d9/d3dxmatrixperspectivefovlh):

The D3D projection matrix is of the form:

|   | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | xScale | 0 | 0 | 0 |
| 2 | 0 | yScale | 0 | 0 |
| 3 | 0 | 0 | zf/(zf-zn) | 1 |
| 4 | 0 | 0 | -zn*zf/(zf-zn) | 0 |

where:

  • yScale = cot(fovY/2)
  • xScale = yScale / aspect_ratio
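As an illustration, the matrix above can be built directly from its definition. This is a sketch that follows the left-handed, row-vector layout shown above. The function name is chosen here for readability and is not a real D3D or xemu binding.

```python
import math


def d3d_perspective_fov_lh(fov_y, aspect_ratio, zn, zf):
    """Build the 4x4 projection matrix described above (row-major nested lists).

    fov_y is the vertical field of view in radians; zn and zf are the near
    and far plane distances.
    """
    y_scale = 1.0 / math.tan(fov_y / 2.0)  # cot(fovY/2)
    x_scale = y_scale / aspect_ratio
    q = zf / (zf - zn)
    return [[x_scale, 0.0, 0.0, 0.0],
            [0.0, y_scale, 0.0, 0.0],
            [0.0, 0.0, q, 1.0],
            [0.0, 0.0, -zn * q, 0.0]]


# Example: 90 degree vertical FOV, 4:3 aspect ratio, near = 1, far = 100.
projection = d3d_perspective_fov_lh(math.pi / 2.0, 4.0 / 3.0, 1.0, 100.0)
```

For a 90 degree field of view, yScale is cot(45°) = 1, so xScale reduces to 1 / aspect_ratio.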

The D3D viewport matrix is of the form:

|   | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | (width / 2) | 0 | 0 | 0 |
| 2 | 0 | (-height / 2) | 0 | 0 |
| 3 | 0 | 0 | MaxDepthBuffer * (MaxZ - MinZ) | 0 |
| 4 | (width / 2) | (height / 2) | MaxDepthBuffer * MinZ | 1 |
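To make the effect of this matrix concrete, the sketch below builds it and pushes two normalized device coordinates through it as row vectors. The function names are illustrative. The 640x480 size and the MaxDepthBuffer value of 16777215 (assuming a 24-bit fixed-point depth buffer) are example inputs, not values taken from this page.

```python
def d3d_viewport_matrix(width, height, min_z, max_z, max_depth_buffer):
    """Build the 4x4 viewport matrix described above (row-major nested lists)."""
    half_w = width / 2.0
    half_h = height / 2.0
    return [[half_w, 0.0, 0.0, 0.0],
            [0.0, -half_h, 0.0, 0.0],
            [0.0, 0.0, max_depth_buffer * (max_z - min_z), 0.0],
            [half_w, half_h, max_depth_buffer * min_z, 1.0]]


def transform(point, matrix):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return [sum(point[k] * matrix[k][c] for k in range(4)) for c in range(4)]


# Assumed example inputs: 640x480 surface, full [0, 1] depth range.
viewport = d3d_viewport_matrix(640, 480, 0.0, 1.0, 16777215.0)

# Top-left corner on the near plane maps to pixel (0, 0) at depth 0.
top_left_near = transform([-1.0, 1.0, 0.0, 1.0], viewport)

# Bottom-right corner on the far plane maps to (width, height, MaxDepthBuffer).
bottom_right_far = transform([1.0, -1.0, 1.0, 1.0], viewport)
```

The negative height term flips the Y axis, which is why y = +1 lands on row 0 of the surface.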

Given a projection matrix p and a viewport matrix v, each with elements numbered 0 through 15 in row-major order, the combined matrix c = p x v is:

|   | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | p0 v0 + p3 v12 + p1 v4 + p2 v8 | p0 v1 + p3 v13 + p1 v5 + p2 v9 | p2 v10 + p3 v14 + p0 v2 + p1 v6 | p2 v11 + p3 v15 + p0 v3 + p1 v7 |
| 2 | p4 v0 + p7 v12 + p5 v4 + p6 v8 | p4 v1 + p7 v13 + p5 v5 + p6 v9 | p6 v10 + p7 v14 + p4 v2 + p5 v6 | p6 v11 + p7 v15 + p4 v3 + p5 v7 |
| 3 | p8 v0 + p11 v12 + p9 v4 + p10 v8 | p8 v1 + p11 v13 + p9 v5 + p10 v9 | p10 v10 + p11 v14 + p8 v2 + p9 v6 | p10 v11 + p11 v15 + p8 v3 + p9 v7 |
| 4 | p12 v0 + p15 v12 + p13 v4 + p14 v8 | p12 v1 + p15 v13 + p13 v5 + p14 v9 | p14 v10 + p15 v14 + p12 v2 + p13 v6 | p14 v11 + p15 v15 + p12 v3 + p13 v7 |

This can be simplified substantially because most entries of both matrices are zero (and p11 = v15 = 1):

|   | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | p0 * v0 | 0 | 0 | 0 |
| 2 | 0 | p5 * v5 | 0 | 0 |
| 3 | v12 | v13 | p10 * v10 + v14 | 1 |
| 4 | 0 | 0 | p14 * v10 | 0 |
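The simplification can be sanity-checked by comparing the full product against the reduced form. This sketch stores each matrix as a flat 16-element list so that the indices match the p0..p15 and v0..v15 notation. The numeric values are arbitrary and chosen only to fill the non-zero slots.

```python
def matmul_flat(p, v):
    """Full 4x4 product of two row-major, 16-element matrices."""
    return [sum(p[4 * r + k] * v[4 * k + c] for k in range(4))
            for r in range(4) for c in range(4)]


# Projection: only p0, p5, p10, p11 (= 1) and p14 are non-zero.
p = [0.0] * 16
p[0], p[5], p[10], p[11], p[14] = 0.75, 1.0, 1.25, 1.0, -0.5

# Viewport: only v0, v5, v10, v12, v13, v14 and v15 (= 1) are non-zero.
v = [0.0] * 16
v[0], v[5], v[10] = 320.0, -240.0, 8.0
v[12], v[13], v[14], v[15] = 320.0, 240.0, 2.0, 1.0

full = matmul_flat(p, v)

# The reduced form from the table above.
simplified = [
    p[0] * v[0], 0.0, 0.0, 0.0,
    0.0, p[5] * v[5], 0.0, 0.0,
    v[12], v[13], p[10] * v[10] + v[14], 1.0,
    0.0, 0.0, p[14] * v[10], 0.0,
]

matches = all(abs(a - b) < 1e-9 for a, b in zip(full, simplified))
```

`matches` is true for these inputs, which confirms that the sixteen-term expansion collapses to the six non-trivial entries shown.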

Substituting from the matrices:

|   | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | xScale * (width / 2) | 0 | 0 | 0 |
| 2 | 0 | yScale * (-height / 2) | 0 | 0 |
| 3 | (width / 2) | (height / 2) | (zf/(zf-zn)) * (MaxDepthBuffer * (MaxZ - MinZ)) + (MaxDepthBuffer * MinZ) | 1 |
| 4 | 0 | 0 | (-zn*zf/(zf-zn)) * (MaxDepthBuffer * (MaxZ - MinZ)) | 0 |

The primary values of interest in the xemu case are MaxZ and MinZ, which appear only in c10 and c14:

  • c10 = (zf/(zf-zn)) * (MaxDepthBuffer * (MaxZ - MinZ)) + (MaxDepthBuffer * MinZ)
  • c14 = (-zn*zf/(zf-zn)) * (MaxDepthBuffer * (MaxZ - MinZ))
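These two equations can be inverted for MinZ and MaxZ once the remaining terms are fixed. The sketch below assumes that zn, zf, and MaxDepthBuffer are already known. This page does not say how zn and zf would be obtained, so treat that as an assumption of the example rather than a description of what xemu does. The function name is hypothetical.

```python
def recover_depth_range(c10, c14, zn, zf, max_depth_buffer):
    """Solve the c10 and c14 equations above for (MinZ, MaxZ).

    ASSUMPTION: zn (non-zero), zf and max_depth_buffer are known in advance.
    """
    q = zf / (zf - zn)
    scale = c14 / (-zn * q)    # MaxDepthBuffer * (MaxZ - MinZ)
    offset = c10 - q * scale   # MaxDepthBuffer * MinZ
    min_z = offset / max_depth_buffer
    max_z = min_z + scale / max_depth_buffer
    return min_z, max_z


# Round trip with assumed example values.
zn, zf, max_depth_buffer = 1.0, 100.0, 65535.0
true_min_z, true_max_z = 0.25, 0.75

q = zf / (zf - zn)
c10 = q * (max_depth_buffer * (true_max_z - true_min_z)) + max_depth_buffer * true_min_z
c14 = (-zn * q) * (max_depth_buffer * (true_max_z - true_min_z))

min_z, max_z = recover_depth_range(c10, c14, zn, zf, max_depth_buffer)
```

The round trip returns the original MinZ and MaxZ to within floating-point error. Note that `offset` also equals c10 + c14 / zn, so the offset term depends on zn but not on zf.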

MaxDepthBuffer can be reliably calculated using knowledge of the selected z-buffer bit depth and whether floating point is enabled or not.
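A minimal sketch of that calculation follows. It assumes that a fixed-point depth buffer of n bits has a maximum value of 2^n - 1, and that only 16-bit and 24-bit depths occur. The page does not give the maximum for floating-point formats, so the sketch takes it as a caller-supplied parameter instead of guessing. The function and parameter names are hypothetical.

```python
def max_depth_buffer_value(bit_depth, float_enabled=False, float_max=None):
    """Return the largest depth value for the selected z-buffer format.

    ASSUMPTIONS: fixed-point formats top out at 2**bit_depth - 1, and the
    float maximum is supplied by the caller because this page does not
    state it.
    """
    if bit_depth not in (16, 24):
        raise ValueError("expected a 16-bit or 24-bit depth buffer")
    if float_enabled:
        if float_max is None:
            raise ValueError("float_max is required for floating-point depth")
        return float(float_max)
    return float((1 << bit_depth) - 1)


d16 = max_depth_buffer_value(16)  # 65535.0
d24 = max_depth_buffer_value(24)  # 16777215.0
```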