Welcome to April’s Progress Report! Firstly, we would like to apologise for the delay in publishing this report. RPCS3’s progress reports are written solely by volunteers, and a few of our regular writers could not contribute to this report due to personal commitments. If you hate seeing RPCS3’s reports get delayed and would like to contribute to them, please apply here.
This has been a very busy month, which saw many contributions from our regular developers and even a few newcomers. Major improvements have been made to RSX emulation by kd-11, fixing the texture cache and improving the shader decompiler. Meanwhile, eladash, in his usual style, fixed a multitude of bugs relating to savedata handling and the PPU/SPU interpreters/recompilers. Numan (Inviuz) implemented a fringe syscall needed for Metal Gear Solid 4 to boot, and Nekotekina squeezed quite a bit more performance out of the SPU LLVM path. To improve the visual aspect of the emulator, drysalter created two beautiful new themes, and lastly GalCiv expanded DualShock 3 support to Linux. These and many more improvements have all contributed to making RPCS3 both a better piece of software and a better emulator, moving a bunch of new games into the Playable category.
In addition to the following report, further details of Nekotekina and kd-11’s work during April and upcoming contributions can be found in their weekly reports on Patreon. This month’s Patreon reports are:
Status update from kd-11 (2019-04-07)
Status update from kd-11 (2019-04-23)
This month RPCS3 reached another milestone in game compatibility. It’s the first time that the Playable category has surpassed 40%! As a result, the Ingame and Intro categories see an equivalent decrease, while the Loadable category goes down to 29 games. We are also gearing up to undertake further maintenance of the compatibility list by identifying and merging duplicate entries in the coming months! For a more detailed look, you can view the compatibility history page to see exactly which games had their status changed this month.
On Git statistics, there have been 9921 lines of code added and 2967 removed through 113 commits by 16 authors.
Major RPCS3 Improvements
Texture cache fixes and shader decompiler improvements (#5785, #5813, #5860)
In April, kd-11 switched focus to tackle the blit engine portion of the texture cache. Even with the initial implementation, multiple games responded positively to the improved accuracy of the texture and surface cache in general, without too great a performance hit. kd-11 added a few more improvements along with those patches, such as implementing a few missing compressed formats and finally getting bit-casting of texture data working across the DEPTH_STENCIL<->COLOR border.
For those interested in technical details, this is particularly important since the RSX has no concept of texture storage (this is usually just an API convenience for user-side), and arbitrary recasting of memory pools is quite common. For example, you can store a normal 32-bit RGBA texture somewhere, even render to it and then for the next pass cast it as a 2×16-bit format like RG_16 or RG_16FLOAT, or even a depth format like DEPTH24_STENCIL8. Since many PlayStation 3 formats do not map properly onto modern PC graphics cards, it is essential to implement a proper decoding step to correctly emulate such formats. kd-11 has set this up with compute for Vulkan, while OpenGL has the benefit of a flexible pack/unpack configuration for memory copies between textures and arbitrary buffers. These improvements have fixed effects such as broken lighting, depth of field and fog in titles such as Uncharted 2.
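To make the idea of recasting concrete, here is a minimal sketch in Python of what it means to read the same 32 bits of memory as different formats. It is not RPCS3 code: the function names, the big-endian byte order and the placement of depth in the upper 24 bits are assumptions made purely for illustration.

```python
import struct

def rgba8_as_rg16(r, g, b, a):
    """Reinterpret four 8-bit channels as two 16-bit channels.

    Assumes the texel is laid out big-endian in memory (illustrative only).
    """
    raw = struct.pack(">4B", r, g, b, a)   # the 4 bytes as stored in memory
    return struct.unpack(">2H", raw)       # same bytes, read as 2 x 16 bits

def rgba8_as_d24s8(r, g, b, a):
    """Reinterpret the same 4 bytes as 24-bit depth plus 8-bit stencil.

    Assumes depth occupies the upper 24 bits (illustrative only).
    """
    raw = struct.pack(">4B", r, g, b, a)
    (word,) = struct.unpack(">I", raw)     # same bytes, read as one 32-bit word
    return word >> 8, word & 0xFF

print(rgba8_as_rg16(0x12, 0x34, 0x56, 0x78))    # (4660, 22136), i.e. 0x1234, 0x5678
print(rgba8_as_d24s8(0x12, 0x34, 0x56, 0x78))   # (1193046, 120), i.e. 0x123456, 0x78
```

No bytes are changed by either function; only the interpretation differs, which is why the emulator needs an explicit decoding step whenever the host GPU cannot perform such a cast itself.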
Also included in this changeset is an improvement to RPCS3’s Vulkan memory management which should reduce the likelihood of encountering a “Working buffer not big enough” crash.
While testing the texture cache improvements, kd-11 picked up a side task to improve the shader decompiler. There were several examples of titles where different hardware was producing wildly varying results. This includes titles like Ridge Racer 7, which was playable on AMD GPUs but a flickering mess on NVIDIA GPUs. A common issue with graphics hardware is that GPUs from different vendors are often fundamentally very different internally, something the manufacturers can get away with due to the API-only access provided to programmers and end users. This means that, unlike on a CPU, the same inputs to a function may not always produce the same results on different hardware.
In this case, there was a significant arithmetic drift on NVIDIA GPUs, where some mathematical operations appear to have been carried out at low enough precision that a comparison test against a constant provided by the game was failing. AMD and Intel hardware was seemingly unaffected, so kd-11 started investigating further and found many similar problems.
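The effect of reduced precision on a comparison can be sketched in a few lines of Python. This is only an illustration: half precision is used here as a convenient stand-in for "lower precision", and the value and threshold are made up. It does not describe what any particular GPU actually does internally.

```python
import struct

def to_half(value):
    """Round a float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

def passes_test(value, threshold=1.0):
    """A shader-style comparison against a constant supplied by the game."""
    return value < threshold

full = 0.99995           # result computed at higher precision
reduced = to_half(full)  # the same result after losing precision

print(passes_test(full))     # True
print(reduced)               # 1.0
print(passes_test(reduced))  # False
```

The two values differ by a tiny amount, yet the comparison flips, which is enough to turn a correctly drawn object into a discarded or flickering one.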
To handle these variances, kd-11 spent a whole week throwing random instructions with corner-case inputs at a PlayStation 3 to better document the behaviour of the RSX shader processing unit. This has yielded increased accuracy in several areas where RPCS3 was rendering graphics incorrectly and hiding a few hard-to-fix bugs.
To round off coverage of kd-11’s work this month, here are comparison images of Ridge Racer 7 now displaying foliage on NVIDIA GPUs and Battlefield: Bad Company no longer showing glowing lights on screen:
kd-11 also set up a framework to take advantage of graphics cards offering native FP16 support, both to improve the accuracy of emulation and to potentially improve performance in the future, especially with newer GPUs that support this feature out of the box, such as Turing and Vega.
eladash’s improvements (#5749, #5792, #5850)
As with last month, eladash fixed so many bugs that it’s nigh impossible to cover them all adequately, but here’s the summary.
One of eladash’s focuses was to fix issues with how RPCS3 handled savedata. This was apparent in some games such as Lord of the Rings: Conquest, which would try to read the savefile directory even though it didn’t exist yet. On a real PlayStation 3, an error code is returned to the game, telling it that the directory doesn’t exist. However, before this fix, the emulator never returned this particular error code, so the game tried to load a nonexistent savefile, leading to an access violation crash and rendering the game unplayable unless a savefile already existed in the directory. After eladash implemented proper handling of this situation, Lord of the Rings: Conquest and Fritz Chess were fixed, the latter going from the Intro category straight to Playable!
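The behaviour described above boils down to reporting a missing directory to the caller instead of carrying on as if it existed. A minimal sketch in Python, assuming hypothetical names throughout (`OK`, `ERR_NO_SAVEDATA_DIR` and `list_savedata` are invented for this example and are not the real cellSaveData error codes or API):

```python
import os
import tempfile

OK = 0
ERR_NO_SAVEDATA_DIR = -1  # hypothetical error code, not a real cellSaveData value

def list_savedata(base_dir, name):
    """Return (status, files) for a savedata directory.

    A missing directory is reported through the status code so the
    caller can create a fresh save instead of reading data that is not there.
    """
    path = os.path.join(base_dir, name)
    if not os.path.isdir(path):
        return ERR_NO_SAVEDATA_DIR, []
    return OK, sorted(os.listdir(path))

with tempfile.TemporaryDirectory() as base:
    print(list_savedata(base, "SAVE0001"))   # (-1, [])
    os.mkdir(os.path.join(base, "SAVE0001"))
    print(list_savedata(base, "SAVE0001"))   # (0, [])
```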
eladash also tackled a tricky race condition that was plaguing RPCS3. Race conditions arise in complex multi-threaded applications like RPCS3, where the timing between different parts of the program that run concurrently is critical to the program functioning properly. They can be caused by two threads depending on the same data, or expecting that data to be in a specific state when they start working on it, while that state may be altered if the other thread unexpectedly operates on it first. These bugs are difficult to debug and result in problems such as the infinite loading screen in Family Guy: Back To The Multiverse and the inability to load savegames in Cars: Race-O-Rama. eladash fixed both of these issues by simply making the thread sleep (wait) in some places, which corrected the timings.
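As a rough illustration of why waiting in the right place fixes such a bug, here is a small Python sketch. It is not eladash’s patch: the names are invented, and a `threading.Event` is used as a stand-in for whatever wait the emulator actually performs. The consumer thread is started first on purpose, yet it never reads the shared state before the producer has finished preparing it.

```python
import threading

def run_with_wait():
    """Start the consumer first, but make it wait until the state is ready."""
    state = {"value": None}
    ready = threading.Event()
    seen = []

    def producer():
        state["value"] = 42   # prepare the shared state
        ready.set()           # signal that it is now safe to read

    def consumer():
        ready.wait()          # without this wait, the read could see None
        seen.append(state["value"])

    consumer_thread = threading.Thread(target=consumer)
    producer_thread = threading.Thread(target=producer)
    consumer_thread.start()
    producer_thread.start()
    consumer_thread.join()
    producer_thread.join()
    return seen[0]

print(run_with_wait())  # 42
```

Removing the `ready.wait()` line leaves the outcome dependent on thread scheduling, which is exactly what makes this class of bug so hard to reproduce.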
eladash’s next focus was on improving the PPU and SPU interpreters and recompilers. This includes fixing incorrectly handled instructions and getting all these different decoders to behave more similarly, yielding small performance improvements in certain cases. This fixed the orb rendering in SoulCalibur IV’s menu screen with the PPU Precise Interpreter as well as black screens and crashes in Sonic Unleashed, and made the PPU LLVM Recompiler behave more accurately and closer to the PPU Interpreter.
Finally, let’s look into how he corrected the handling of SIMD instructions with saturation on both the PPU Fast and Precise decoders. When decoding these SIMD instructions, we iterate over all the elements individually, apply the instruction and write the result to the appropriate element of the destination register. That last part, however, causes issues when the destination register is also one of the source registers, as we are now overwriting the source register and corrupting data for the next element. The solution was simple: just use a temporary variable as the destination and copy that into the actual destination register at the very end. This specifically fixes the loading icon in Beyond: Two Souls.
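The aliasing problem is easy to reproduce in a small model. The Python sketch below is an assumption-laden illustration, not RPCS3’s decoder: it models a 128-bit register as a 16-byte `bytearray` and uses a pack-with-saturation operation (eight signed 16-bit elements from each source squeezed into sixteen signed 8-bit elements) as the example instruction. All function names are invented.

```python
import struct

def make_reg(halfwords):
    """Build a 16-byte register from eight signed 16-bit values (big-endian)."""
    return bytearray(struct.pack(">8h", *halfwords))

def as_bytes(reg):
    """View a register as sixteen signed 8-bit values."""
    return list(struct.unpack(">16b", reg))

def saturate_s8(value):
    """Clamp a value to the signed 8-bit range."""
    return max(-128, min(127, value))

def pack_in_place(dst, a, b):
    """Buggy variant: writes each result straight into dst.

    If dst is the same register as a, later elements of a are
    overwritten before they have been read.
    """
    for i in range(8):
        struct.pack_into(">b", dst, i, saturate_s8(struct.unpack_from(">h", a, i * 2)[0]))
        struct.pack_into(">b", dst, i + 8, saturate_s8(struct.unpack_from(">h", b, i * 2)[0]))

def pack_via_temporary(dst, a, b):
    """Fixed variant: build the result in a temporary, then copy it over."""
    tmp = bytearray(16)
    pack_in_place(tmp, a, b)   # tmp never aliases a or b
    dst[:] = tmp

b = make_reg([300, -300, 0, 0, 0, 0, 0, 0])

good = make_reg([1, 2, 3, 4, 5, 6, 7, 8])
pack_via_temporary(good, good, b)   # destination is also the first source
print(as_bytes(good))  # [1, 2, 3, 4, 5, 6, 7, 8, 127, -128, 0, 0, 0, 0, 0, 0]

bad = make_reg([1, 2, 3, 4, 5, 6, 7, 8])
pack_in_place(bad, bad, b)
print(as_bytes(bad) == as_bytes(good))  # False
```

In the buggy variant the fifth source element has already been partially overwritten by the time it is read, so the output is corrupted from that point on.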
Initial sys_overlay support (#4007)
Many of you have probably heard of Numan’s (Inviuz) implementation of the infamous sys_overlay syscall, used (as far as we know) exclusively by Metal Gear Solid 4. Ever since Numan opened this pull request at the end of 2017, everyone in the community has been eagerly waiting for it to be merged, and now it has finally happened!
While it has been talked about quite often, and also teased in this progress report, here is still a quick recap for the uninitiated:
Syscalls are functions provided by the PlayStation 3’s OS that act as a way for programs to interact with the system. There are many syscalls for everything from allocating memory to communicating with USB devices, and while RPCS3 implements many of them, some are still missing. One of those missing syscalls was sys_overlay, which had not been implemented until now because there was virtually no documentation on its functionality and only a single game makes use of it. This syscall provides games with a special way to load external SELF files and is used by Metal Gear Solid 4 to boot a PS1 arcade stage.
Now that it is implemented, we are one step closer to getting MGS4 to boot. Unfortunately, there is still much to do until that happens, but there are fixes in the works from our developers, so stay tuned!
SPU LLVM performance improvements (#5882)
Next up are some much-needed performance improvements to the SPU LLVM recompiler by Nekotekina. A trick was used to optimize LLVM’s LICM (loop-invariant code motion) pass, giving up to a 10% improvement in FPS in many games when using accurate xfloat. Approximate xfloat, however, was unaffected by this change.
To understand how Nekotekina achieved this, we first need to understand what the LICM pass does. Simply put, it checks all the loops in the code for parts that don’t change (loop-invariant) and moves them out of the loop (motion) so they are not unnecessarily executed over and over again. This can be a major optimization in some cases; however, LLVM doesn’t always apply this pass perfectly. That is why Nekotekina decided to improve it by replacing some instructions that prevented LLVM from correctly identifying these loop-invariant sections with dummy instructions that are known not to interfere with the LICM pass. After LLVM has done its job, those dummy instructions are replaced with the real ones again, leaving the code functionally equivalent, yet better optimized.
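For readers who have never seen loop-invariant code motion, the transformation looks roughly like the following when done by hand. This Python sketch only illustrates what the LICM pass does in general; the function names are made up, and it says nothing about the dummy-instruction trick used in the SPU recompiler.

```python
def scale_naive(values, gain, offset):
    """Recomputes the same factor on every iteration."""
    out = []
    for v in values:
        factor = gain * gain + offset   # loop-invariant: depends on nothing in the loop
        out.append(v * factor)
    return out

def scale_hoisted(values, gain, offset):
    """Same result, with the invariant computation moved out of the loop."""
    factor = gain * gain + offset       # computed exactly once
    return [v * factor for v in values]

print(scale_naive([1, 2, 3], 2, 1))    # [5, 10, 15]
print(scale_hoisted([1, 2, 3], 2, 1))  # [5, 10, 15]
```

A compiler can only perform this hoisting automatically when it can prove the computation really is invariant, which is why instructions it cannot reason about end up blocking the optimization.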
Captain America: Super Soldier saw an increase in FPS from 20 to 24 in slow areas!
New Skyline and Envy themes (#5789, #5884)
For those of you that didn’t know, RPCS3 has given users the ability to customize the appearance of the emulator for quite a while now and since then, a few custom themes have been added to the roster.
This month, two stunning new themes have been added by drysalter in his first ever contribution to RPCS3. The Envy theme sports a completely new neon design, giving users another dark theme to use, while the Skyline themes are inspired by DAGINATSUKO’s colour scheme for the website and come in two variations: Standard and Nightfall. You can find these and other cool themes by navigating to Config > GUI > UI > Stylesheets. Just choose one, click Apply and instantly see RPCS3 in a new light!
DualShock 3 support on Linux (#5888)
Lastly, let’s take a quick look at GalCiv’s (RipleyTom) improvements to the native DualShock 3 pad handler. As we covered in last month’s report, GalCiv implemented a native DualShock 3 pad handler that allows emulation of features such as pressure-sensitive buttons and motion controls in RPCS3 when using the DualShock 3 controller. While this change was very welcome, the implementation was sadly restricted to Windows, as the use of generic USB drivers required special consideration on Linux.
GalCiv, not wanting to leave the job half done, looked into implementing the same on Linux and made an interesting discovery: the Linux kernel has drivers for the DualShock 3 controller built in. This allows direct support for the controller without the need for the solutions used on Windows.
Taking advantage of this, GalCiv implemented a DualShock 3 handler on Linux utilising this driver, providing the full range of features such as pressure-sensitive buttons and motion controls through USB or Bluetooth without the use of any third-party devices. It is fascinating to see what started as a deficiency turn into yet another resounding win for team Pingu!
Thanks to eladash’s improvements to the PPU LLVM recompiler, this game no longer hangs during night stages and when doing quicksteps. Performance has also vastly improved during night stages, but it’s still too slow to be considered Playable. Check out the video footage from our official YouTube channel below:
Let’s begin with some sports games. NCAA Football 09 and NCAA Football 10, two football games, are now playable this month. Both games run well, with good graphics and performance.
Madden NFL 11 is the first game in the Madden series to have been moved to the Playable category after Nekotekina fixed some regressions this month, while Madden NFL 13 and 09 are now in the Ingame category.
Most NBA games, for example NBA 2K7 and NBA 2K8, are now going ingame for the first time. However, both titles are too slow to be considered playable at this point.
Moving on to baseball games, MLB 08: The Show and MLB Front Office Manager both became playable this month.
There have been numerous other pull requests merged during the month that couldn’t make it to the Major Improvements section. We have collected a list of all such improvements here, and attached a brief overview to each. Make sure to check out the links provided for them if you are interested, as their GitHub pages usually uncover further details as well as the code changes themselves. To see this whole list right on GitHub, click here.
5784 – Eases cache line interference on TSX path, emulates POSIX unlink on Windows and extends spurious access error workaround to all directory renames;
5822 – Disables GCC build for Travis, improving overall build time;
5823 – Fixes some regressions caused after #4097 and #5784;
5832 – Updates LLVM submodule and increases max stack size to avoid stack overflow crash on Windows;
5844 – Minor refactoring to the SPU;
5855 – Refactors the LLVM DSL;
5882 – Improves SPU LLVM performance when Accurate xfloat is enabled. See coverage in major improvements here.
5785 – Further improvements to the texture cache. See coverage in major improvements here;
5813 – Fixes some regressions caused by last month’s PRs. See coverage in major improvements here;
5860 – Improvements made for the shader decompiler, see coverage in major improvements here;
5809 – Adds a new hint to the game list that shows the highest available version if the installed version of that title is lower than the one that was found in the compatibility database.
5780 – Adds more filters to firmware installation file dialogs and enables the search and installation of firmware files that are named differently than PS3UPDAT.PUP.
5666 – Prints the currently installed firmware version into the first line of the log.
5763 – Adds a missing entry for the recently added 3rd party software libusb into the Visual Studio solution files.
5687 – Adds an “Exit RPCS3?” dialog if the emulator is being closed while a game is running. This dialog uses the same option in the gui config tab as the “Exit Game?” dialog. The confirmation dialog will always be on top to prevent it from disappearing behind other windows. From now on the game window will also properly leave fullscreen mode if an “Exit game?” pop up is triggered when the game window is being closed.
5749 – Several improvements for the PPU and SPU interpreters and recompilers, see coverage in major improvements here;
5861 – Sets the performance overlay detail to Medium by default because the RSX Guest utilization confuses people and it’s only meaningful for debugging.
5792 – Various improvements and bugfixes for cellSaveData module, see coverage in major improvements here;
5871 – Adds missing SDK version check for setParam->reserved2 in cellSaveData module.
5850 – Fixes unregistered HLE function access on PPU LLVM and PPU breakpoints, see coverage in major improvements here;
5790 – Adds some missing functions for cellCelpEnc, cellHttp and sceNpMatchingInt modules;
5797 – Adds more missing functions for several modules;
5814 – Adds some other functions for several modules.
5819 – Fixes compiling for MSVC and CMake;
5851 – Drops severity of empty queue error on sys_net_bnet_close from fatal to error, and fixes the stack size argument. Previously, it was spamming unknown option on every compiled file. Only affects people using MSVC with CMake;
5870 – Prints the user’s operating system version to the log.
5789 – Add Envy and Skyline themes, see coverage in major improvements here;
5884 – Touch-ups to Skyline and Envy themes, see coverage in major improvements here.
4007 – Provides initial sys_overlay support which is used in Metal Gear Solid 4, see coverage in major improvements here.
5888 – Reimplements part of DS3 pad handler to make it work with hidapi for Linux, see coverage in major improvements here.
5873 – Changes vendor and product IDs from the Rock Band series to use the Guitar Hero ones and adds vendor and product IDs from the DJ Hero Turntable and the Dance Dance Revolution Mat.
4097 – Adds extra argument checks for _sys_ppu_thread_create and sys_ppu_thread_rename and pauses the thread instead of throwing an exception when it reaches a trap instruction (useful for debugging).
5897 – Updates the Qt dependency for Debian and Ubuntu in the README file.
5840 – The languages available in the combo box in the system settings will appear sorted alphabetically instead of being sorted by internal ID.
5794 – Fixes a Travis deployment bug for Linux builds.
5782 – Fixes broken macOS compilation.
5775 – Updates the domain name of the Vulkan mirror for our Appveyor build.
If you like in-depth technical reports, early access to information, or you simply want to contribute, consider becoming a patron! All donations are greatly appreciated. RPCS3 now has two full-time coders that greatly benefit from the continued support of over 800 generous patrons.
We’re always looking for dedicated writers to help us write these reports. If you have the skill, time and are willing to help, please apply here. Also, come check out our YouTube channel and Discord to stay up-to-date with any big news.
This report was written by MarioSonic2987, elEnemigo and HerrHulaHoop.