Welcome to September’s progress report! Firstly, we would like to apologize for the delay. Our progress reports are written by volunteers, and sadly most of them were unavailable to contribute this month. However, there is a silver lining here. The additional time gave us a unique opportunity to turn this month’s progress report into a hybrid of progress report and technical exposition!
We’ll be featuring a deep dive into the inner workings of the texture cache in RPCS3, and how it was improved thanks to the contributions of ruipin and Nekotekina. We will also uncover the wide variety of improvements that kd-11 made and showcase some massive improvements to various AAA titles. Without further ado, let’s jump straight into September’s irregular progress report!
In addition to the following report, further details about Nekotekina’s and kd-11’s work during September and about their upcoming contributions can be found in their weekly reports on Patreon. The month’s Patreon reports were:
In the compatibility database statistics, we can see all the numbers moving further in the right direction. The Ingame category has broken the 1300-game barrier, while Playable continues to rise slowly because of how long it takes to produce a Playable compatibility report. Intro also saw a decent reduction as a result of recent improvements and lots of testers making compatibility reports. For more details, take a look at the compatibility history page to see which games in particular had their status changed during the month.
August has been an amazing month for RPCS3 as we crossed multiple new milestones. This month saw massive performance improvements to many AAA titles, accuracy and performance enhancements to SPU LLVM, support for C++17, the groundwork for macOS support and much more!
In addition to the following report, further details of Nekotekina and kd-11’s work during August and upcoming contributions can be found in their weekly reports on Patreon. This month’s Patreon reports are:
The Playable category has finally crossed the 1,000-title milestone! Considering that this time last year the Playable category held only a little over 400 titles, this truly demonstrates the amazing pace of development. For all other categories, we can see the metrics moving in the right direction, with the elusive Nothing category dropping by 1, leaving only 5 games in it. For a more detailed look, you can view the compatibility history page to see exactly which games had their status changed this month.
Shader compilation stutter is nothing new to most emulator users, especially on RPCS3. However, it is worth clearing up some misconceptions about how RPCS3 shaders work. I’ll try to quickly go over the history of shader compilation on RPCS3 and hopefully explain why the shader compilation stutter appeared and why some people believe RPCS3 did not have shaders before.
Shader complexity and custom vertex fetch
In early 2017, I embarked on a task to remove the very expensive vertex preprocessing step from the CPU side of RPCS3. This basically meant implementing all those custom vertex types and vertex reading techniques in the vertex shader and providing only the raw memory view that the PS3 hardware would see. This greatly improved RPCS3 performance, more than tenfold in some applications. This change is what made RPCS3 usable for playing real commercial games at playable framerates without needing a HEDT system. However, the new fetch technique increased the size of the vertex shader and added a complex function to extract vertex data from the memory block. This made the graphics drivers take a very long time to link the programs, even without optimizations, likely due to the use of vector indexing, switch blocks and loops with dynamic exits. Extra operations including bitshifts and masking were also needed to decode the vertex layout block. The code runs very fast, but the linking step is very slow. A shader cache system already existed before this change, and if you ran an area for the first time, there was slight microstuttering that some users did not notice; it's this stutter that got much worse. The solution to this: preload the shaders so that you don’t need to compile them next time. This led to the infamous “Loading Pipeline Object…” screen and the “Compiling shaders…” notification.
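To give a feel for what "decoding the vertex layout block with bitshifts and masking" means, here is a minimal sketch in Python rather than shader code. The bit layout of the 32-bit word, the type codes and every function name below are invented for illustration and are not the real RSX or RPCS3 encoding. The only thing taken from the hardware is that PS3 memory is big-endian.

```python
import struct

# Hypothetical 32-bit layout word (NOT the real RSX encoding):
#   bits 0-7    stride in bytes between consecutive vertices
#   bits 8-15   byte offset of the attribute inside one vertex
#   bits 16-19  number of components (1-4)
#   bits 20-23  component type
TYPE_F32 = 0  # 32-bit float
TYPE_U8N = 1  # unsigned byte, normalized to [0, 1]


def pack_layout(stride, offset, count, ctype):
    """Build the hypothetical layout word (what the CPU side would upload)."""
    return ((stride & 0xFF)
            | ((offset & 0xFF) << 8)
            | ((count & 0xF) << 16)
            | ((ctype & 0xF) << 20))


def decode_layout(word):
    """Undo pack_layout with shifts and masks, as the fetch function must."""
    return {
        "stride": word & 0xFF,
        "offset": (word >> 8) & 0xFF,
        "count": (word >> 16) & 0xF,
        "type": (word >> 20) & 0xF,
    }


def fetch_attribute(memory, layout_word, vertex_index):
    """Read one attribute of one vertex straight from the raw memory block."""
    layout = decode_layout(layout_word)
    base = vertex_index * layout["stride"] + layout["offset"]
    values = []
    for i in range(layout["count"]):
        if layout["type"] == TYPE_F32:
            # ">f" reads a big-endian float, matching PS3 memory order.
            (value,) = struct.unpack_from(">f", memory, base + 4 * i)
        elif layout["type"] == TYPE_U8N:
            value = memory[base + i] / 255.0
        else:
            raise ValueError("unsupported component type")
        values.append(value)
    return values


# Two vertices, each 16 bytes: three floats (position) + four bytes (color).
memory = (struct.pack(">3f4B", 1.0, 2.0, 3.0, 255, 0, 0, 255)
          + struct.pack(">3f4B", 4.0, 5.0, 6.0, 0, 255, 0, 255))
position_layout = pack_layout(16, 0, 3, TYPE_F32)
color_layout = pack_layout(16, 12, 4, TYPE_U8N)

print(fetch_attribute(memory, position_layout, 1))
print(fetch_attribute(memory, color_layout, 0))
```

The branch on the component type and the loop over a runtime component count are the same kind of constructs mentioned above as likely culprits for slow linking once they live inside a vertex shader.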
There are several challenges to tackling RSX shaders. First, the RSX is not a unified architecture like most programmers are used to today. It has separate vertex and fragment pipelines, each with its own ISA. They are also very limited, and larger or more complex programs can result in very messy binaries. One of the largest problems is that the bytecode itself does not contain all the information required to run the program; extra configuration is supplied via registers as draw calls are passed in. A good example is that the TEX instruction does not differentiate between texture types, but a texture configuration register exists that allows using the same program to read 1D, 2D, 3D, or CUBE textures as well as their shadow comparison variants. This means you can only know the generated shader once the texture register has been set up. There are other cases like this, where the game has to set up the program environment before the program itself can be compiled.
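The texture-type problem can be sketched as a cache lookup. The following is a simplified illustration, not RPCS3's actual cache: the class, the function names and the string texture-type codes are all hypothetical, and the "compiler" only emits sampler declarations. It assumes the generated shader is GLSL, where `sampler1D`, `sampler2D`, `sampler3D` and `samplerCube` are the standard sampler types.

```python
import hashlib

# Hypothetical stand-in for the texture configuration register:
# one texture-type string per texture unit the program samples from.
SAMPLER_FOR_TYPE = {
    "1D": "sampler1D",
    "2D": "sampler2D",
    "3D": "sampler3D",
    "CUBE": "samplerCube",
}


def generate_source(bytecode, texture_types):
    """Emit GLSL-like declarations; the body translation is elided."""
    lines = []
    for unit, tex_type in enumerate(texture_types):
        lines.append(f"uniform {SAMPLER_FOR_TYPE[tex_type]} tex{unit};")
    lines.append(f"// body translated from {len(bytecode)} bytes of bytecode")
    return "\n".join(lines)


class ShaderCache:
    """Cache keyed on the bytecode AND the register state it depends on."""

    def __init__(self):
        self._programs = {}
        self.compile_count = 0

    def get(self, bytecode, texture_types):
        # The bytecode hash alone is not enough: the same TEX instruction
        # turns into different source depending on the texture type.
        key = (hashlib.sha1(bytecode).hexdigest(), tuple(texture_types))
        if key not in self._programs:
            self.compile_count += 1
            self._programs[key] = generate_source(bytecode, texture_types)
        return self._programs[key]


cache = ShaderCache()
bytecode = b"\x01\x02\x03\x04"  # placeholder bytes, not real RSX bytecode

as_2d = cache.get(bytecode, ["2D"])
as_cube = cache.get(bytecode, ["CUBE"])
print(as_2d)
print(as_cube)
```

Because the key includes the register state, the same bytecode can occupy several cache entries, and none of them can be built until the game has actually written the texture configuration for a draw call.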