Progress Report: May 2019

Welcome to May’s Progress Report! Firstly we would like to apologise for the delay in publishing this report. RPCS3’s progress reports are solely written by volunteers and a few of our regular writers could not contribute to this report due to personal commitments. If you hate seeing RPCS3’s reports get delayed and would like to contribute to them, please apply here.

This month saw some major leaps by Nekotekina and kd-11 on the SPU and RSX fronts. Nekotekina implemented SPU PIC support while kd-11 improved the surface cache implementation. Meanwhile, Megamouse made multiple improvements to the UI, GalCiv overhauled the DualShock 3 pad handler and ruipin tackled regressions in the SPU LLVM backend when using Mega SPU block size.

In addition to the following report, further details of Nekotekina and kd-11’s work during May and upcoming contributions can be found in their weekly reports on Patreon. This month’s Patreon reports are:

Status update from kd-11 (2019-05-09)
Status update from Nekotekina (2019-05-14)
Status update from kd-11 (2019-05-27)

Table of Contents

Major Improvements
Games
Other Improvements
Conclusion

This month RPCS3 reached one of the most significant milestones in game compatibility. After intensive testing and merging of duplicates, as mentioned in the previous month’s report, the Playable category has tied with the Ingame category at 43.71%. RPCS3 has surely come a long way to have the Playable category stand on the cusp of overtaking the Ingame category. On the other hand, the Intro category saw a modest drop while the number of Loadable and Nothing games remained unchanged. For a more detailed look, you can view the compatibility history page to see exactly which games had their status changed this month.

Game Compatibility: Game Status
Game Compatibility: Monthly Improvements (May 2019)

On Git statistics, there have been 9391 lines of code added and 5430 removed through 45 pull requests by 12 authors.

Major RPCS3 Improvements

Surface cache improvements (#5937)

At the beginning of May, kd-11 began improving how RPCS3 handles the surface cache, which is not trivial since the way the PlayStation 3 renders to a framebuffer is different from a PC. Originally, kd-11’s plan was to implement variable-sized framebuffers to solve a problem in the rendering method which the emulator used. Previously, RPCS3 kept a list of all the available framebuffers and their locations in the system’s memory and while this was fast, it had its limitations. To try and keep this as simple as possible, let’s take a look at a few diagrams on how RPCS3 handles framebuffers:

Variable-sized framebuffers was an idea where instead of making a viewport of the small red area you see above before drawing the circle inside, we would create a big area that covers the full framebuffer (the entire gray area) but only render to the red area. This would make the framebuffer’s size “variable” as it is allowed to be larger than the exact area being rendered to and adjust itself dynamically to preserve data. This sounds like a good idea, but there is one critical flaw and that is memory ordering.

Every drawcall in RPCS3 tags the framebuffer with a value to identify how new it is, so that when we recreate an image it always starts drawing the oldest pixels first. In the example above, it doesn’t matter if we render the red or blue circle first since they are separate from each other. Drawing the black triangle is OK. Drawing the blue circle is also fine. But when it’s time to draw the red circle, the grey area now inherits the black triangle and is tagged as part of the newest data. The blue area is therefore older than the grey area, so it is completely overwritten when we draw the scene using variable-sized framebuffers.
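To make the ordering problem concrete, here is a minimal one-dimensional sketch of painting surfaces oldest-first. It is only a model and not RPCS3’s actual code: the Surface class, the composite function and the single-letter labels (t for the triangle, b for blue, r for red, g for the enlarged grey surface) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Surface:
    label: str  # single character painted into the row
    x0: int     # covers the half-open pixel range [x0, x1)
    x1: int
    tag: int    # larger tag means written more recently


def composite(surfaces, width):
    """Rebuild one row of pixels by painting surfaces oldest-first."""
    row = ["."] * width
    for surface in sorted(surfaces, key=lambda s: s.tag):
        for x in range(surface.x0, surface.x1):
            row[x] = surface.label
    return "".join(row)


# Exact-sized surfaces: each draw only tags the area it touched.
separate = [
    Surface("t", 0, 8, tag=1),  # black triangle spanning the framebuffer
    Surface("b", 5, 8, tag=2),  # blue circle
    Surface("r", 0, 3, tag=3),  # red circle
]

# Variable-sized idea: the red draw grows to the full framebuffer,
# inherits the triangle and is tagged as the newest data.
variable = [
    Surface("b", 5, 8, tag=2),
    Surface("g", 0, 8, tag=3),
]

separate_row = composite(separate, 8)
variable_row = composite(variable, 8)
print(separate_row)
print(variable_row)
```

In this model the blue pixels survive in the first row but disappear from the second, because the enlarged surface carries the newest tag and is painted last over everything.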

Thus, kd-11 had to go back to the drawing board and come up with a solution: performing intersections for all reads and writes. Instead of the red area extending all the way to the boundary of the framebuffer (gray area), we can now do this:

In the above example, we’ve now split the framebuffer into three surfaces. The red area is replaced by a new framebuffer and tagged as new, which allows us to update it separately from the rest of the surfaces.
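The idea of intersecting each write against existing surfaces can be sketched as a rectangle subtraction. This is a simplified illustration under our own assumptions, not kd-11’s implementation: Rect, intersect, subtract and write are hypothetical names, and surfaces are modelled as plain (rectangle, tag) pairs.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    x0: int  # half-open on both axes: [x0, x1) by [y0, y1)
    y0: int
    x1: int
    y1: int


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlapping region of a and b, or None if they are disjoint."""
    x0, y0 = max(a.x0, b.x0), max(a.y0, b.y0)
    x1, y1 = min(a.x1, b.x1), min(a.y1, b.y1)
    return Rect(x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None


def subtract(old: Rect, new: Rect) -> list:
    """Split old into the pieces that are not covered by new."""
    hit = intersect(old, new)
    if hit is None:
        return [old]
    pieces = []
    if old.y0 < hit.y0:  # strip above the overlap
        pieces.append(Rect(old.x0, old.y0, old.x1, hit.y0))
    if hit.y1 < old.y1:  # strip below the overlap
        pieces.append(Rect(old.x0, hit.y1, old.x1, old.y1))
    if old.x0 < hit.x0:  # strip left of the overlap
        pieces.append(Rect(old.x0, hit.y0, hit.x0, hit.y1))
    if hit.x1 < old.x1:  # strip right of the overlap
        pieces.append(Rect(hit.x1, hit.y0, old.x1, hit.y1))
    return pieces


def write(surfaces, target: Rect, tag: int):
    """Record a write: older surfaces keep their tag only where not overwritten."""
    result = []
    for rect, old_tag in surfaces:
        result.extend((piece, old_tag) for piece in subtract(rect, target))
    result.append((target, tag))
    return result


framebuffer = [(Rect(0, 0, 8, 4), 1)]
surfaces = write(framebuffer, Rect(0, 0, 3, 4), tag=3)
print(surfaces)
```

Because only the written rectangle receives the new tag, data elsewhere in the framebuffer keeps its original age and is no longer overwritten by mistake.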

This change fixed a variety of issues with games having a black screen in certain scenarios. In Ratchet & Clank: A Crack in Time, this happened at certain spots throughout each level, making it hard to progress. In Resistance 3, it occurred when HP was low. And finally, Nascar 14 and Nascar Inside Line were also outputting a black screen for a few seconds whenever you crashed into a car. NeverDead’s graphics were also fixed, removing the split-screen effect. There are sure to be many other games that improved from this, so feel free to let us know if you find anything!

PIC support for SPU (#5944)

Continuing to optimize emulation of the Synergistic Processing Unit (SPU), this month Nekotekina has added support for Position-independent code (PIC), bringing with it less stuttering and more frames per second!

To recap, the SPU recompiler works by taking in chunks of game code meant to be run on the PlayStation 3, analyzing it and generating code that can run on your computer (with the use of LLVM), which is then stored as a program. Programs are created every time a unique chunk is identified and cached to be re-used when the same chunk is executed again at a later point. This helps us avoid the arduous process of recompilation as emulation is momentarily paused every time a new game chunk is recompiled.

Since the emulator has no visibility into the game code, there is no way to speculate which chunk will be run and consequently, all such recompilation is done at runtime. This causes a dip in performance when new SPU programs are recompiled. Thankfully, this process generally only lasted for a few minutes and the performance uplift obtained from using the SPU LLVM recompiler was well worth the wait. However, for certain titles (famously Red Dead Redemption), the SPU recompiler would continuously compile new SPU programs (colloquially referred to as infinite compilation), giving the impression that the game was continuously generating unique chunks of code. On closer inspection, this was not the case. While the games used the same chunk of code, it was loaded into a different memory address each time. In the previous implementation, the SPU programs used absolute addressing, making them reliant on being loaded into a specific location. If the same chunk was loaded in any other location, it was wrongly identified as unique code and recompiled once again.

This is where Nekotekina’s pull request comes in, allowing generation of these programs in a position-independent way, making them loadable anywhere and thus reusable. This completely solved the issue of infinite compilation and greatly reduced the number of SPU programs compiled. For example, in the case of Spider-Man Shattered Dimensions, just thirty seconds of gameplay would previously generate over 11,000 SPU programs with only 2,054 of those programs actually being unique. Now, with Nekotekina’s improvements, this title generates only 2,113 SPU programs in the same amount of time.
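The effect on the program cache can be modelled with a small sketch. This is not the recompiler itself, which is written in C++ and built on LLVM. The ProgramCache class, its fields and the use of a SHA-1 digest as the cache key are all assumptions made purely for illustration.

```python
import hashlib


class ProgramCache:
    """Toy cache of compiled programs, keyed with or without the load address."""

    def __init__(self, position_independent):
        self.position_independent = position_independent
        self.programs = {}
        self.compilations = 0

    def lookup(self, load_address, code):
        digest = hashlib.sha1(code).hexdigest()
        # Absolute addressing ties a program to where its chunk was loaded,
        # so the address has to be part of the key. A position-independent
        # program can be keyed on the code alone.
        key = digest if self.position_independent else (load_address, digest)
        if key not in self.programs:
            self.compilations += 1  # stands in for the expensive recompilation
            self.programs[key] = "compiled:" + digest[:8]
        return self.programs[key]


chunk = b"\x01\x02\x03\x04"  # the same game code every time
absolute = ProgramCache(position_independent=False)
pic = ProgramCache(position_independent=True)

for address in range(0, 100 * 0x80, 0x80):  # 100 different load addresses
    absolute.lookup(address, chunk)
    pic.lookup(address, chunk)

print(absolute.compilations, pic.compilations)
```

In this model, the address-keyed cache compiles once per load address, while the position-independent cache compiles the chunk a single time and reuses it wherever it is loaded.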

This is a huge improvement that benefits almost every game when using the SPU LLVM recompiler. For example, where the FPS used to drop to 11 when using an intensive attack in Fist of the North Star: Ken’s Rage, it now stays at a smooth 25-30FPS. Currently this only works with either Safe or Mega SPU block sizes.

Comparison of how many SPU modules are compiled before and after these changes across various games. Note that some games shown in this graph (GTA V or One Piece) previously were compiling SPU modules indefinitely.

Improved Game Collections Support (#4450)

Among the various GUI improvements this month, Megamouse also reworked the way RPCS3 handles game collection discs. Before we dive into the changes made, here’s a quick overview of how PlayStation 3 discs are structured. Every disc contains two folders, namely PS3_GAME and PS3_Update, and one file, PS3_DISC.SFB. This is the general layout for the game discs. The PS3_DISC.SFB file does not contain any relevant data within it other than the product ID. However, while the file does not contain any critical information, the file itself is used by RPCS3 to identify the existence of a disc-based game. Once identified, the emulator scans the PS3_GAME folder to pull all the necessary details about the game and populates the gamelist to allow users to easily launch the game.

With game collections, there were two distinct file layouts. The first (and more common) layout followed the above layout with all games being stored within the PS3_GAME folder and appearing as a single game in the game list. Once launched, users will be prompted with the choice to select which specific game they wish to launch and the selected game will boot in a new window.

Launch screen of Kingdom Hearts HD 1.5 ReMIX collection that allows selection of titles present within the collection

However, with the second type, the file layout is a little different. Here, each game in the collection is stored separately and hence you find additional folders such as PS3_GM01 and PS3_GM02 within a disc. This can be seen in discs such as The Disgaea Triple Play Collection, Ultimate Stealth Triple Pack and Ultimate Action Triple Pack among others. When inserting these discs into an actual PlayStation 3, three separate disc games would appear in the game list as opposed to just one. In RPCS3’s previous implementation, only the contents present in the PS3_GAME folder were scanned and populated to the gamelist and hence the games present in the additional folders were ignored.

Here’s a comparison showing the file layout difference between regular discs and the special collection discs

To fix this, Megamouse reworked the existing implementation of detecting game files by adding m_game_dir, a new variable that points to the relevant game folder. Now, instead of hardcoding the PS3_GAME folder, the emulator will search for other folders, identify whether they are also games and correctly populate them into RPCS3’s gamelist.
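The scanning idea can be sketched roughly as follows. This is not RPCS3’s code: find_game_dirs is a hypothetical helper, and treating a folder as a game when it contains a PARAM.SFO file is an assumption made for the sake of the example.

```python
import tempfile
from pathlib import Path


def find_game_dirs(disc_root):
    """Return the names of folders on a disc that look like games."""
    root = Path(disc_root)
    # The marker file identifies the folder as a disc-based game.
    if not (root / "PS3_DISC.SFB").is_file():
        return []
    # Assumption: a folder counts as a game if it holds a PARAM.SFO file.
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and (entry / "PARAM.SFO").is_file()
    )


# Build a throwaway collection-style disc layout and scan it.
with tempfile.TemporaryDirectory() as tmp:
    disc = Path(tmp)
    (disc / "PS3_DISC.SFB").touch()
    for name in ("PS3_GAME", "PS3_GM01", "PS3_GM02"):
        (disc / name).mkdir()
        (disc / name / "PARAM.SFO").touch()
    (disc / "PS3_Update").mkdir()  # no PARAM.SFO, so it is skipped
    found = find_game_dirs(disc)
    no_marker = find_game_dirs(disc / "PS3_GAME")

print(found)
```

Checking every folder rather than a single hardcoded name is what lets the additional games on a collection disc show up as separate entries.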

Batch processing for all titles (#5519)

Lastly, we have a quality of life improvement by Megamouse, who added the option to compile the PPU cache for all titles at once, as well as to clear all caches or custom configurations. Gone are the days of having to compile the PPU cache for each title individually. While this improvement may not be as exciting as the ones mentioned above, it does add a much-needed comfort that regular RPCS3 users will definitely come to appreciate.
This new feature can be accessed under File > All Titles.

Games

Yakuza 3 & 4

Thanks to improvements made by kd-11 to frame data handling, Yakuza 3 & 4 no longer have exploding vertices when using AMD GPUs. This change brings further parity of graphical output between Nvidia and AMD GPUs to provide users a more uniform experience irrespective of the hardware used. Check out the clip below to see the improvement in action:

Genji: Days of the Blade

Genji: Days of the Blade is now the second PlayStation 3 launch title to become playable after Ridge Racer 7.

Cash Gun Chaos DLX

From the arcade corner, this month saw Cash Gun Chaos DLX become fully playable. This arcade style shooter works with steady performance and graphics. Check out some gameplay footage below:

SIREN: Blood Curse

The PlayStation 3 exclusive survival horror game, SIREN: Blood Curse, was found to be playable this month. A user from our forums, jade010, managed to finish this game on RPCS3 as well. Please note that you need a DualShock 3 or 4 to play this game as it makes use of the controller’s gyroscope.

One Piece: Pirate Warriors

This is one of the titles that has improved after Nekotekina’s PIC support for SPU. While the game runs well with great performance and graphics, it suffers from audio stuttering when there are too many enemies on screen, keeping the title from being playable.

Mortal Kombat vs. DC Universe

This crossover fighting game is now playable with great graphics and performance. Previously, this title suffered from various graphical glitches and did not work with the PPU LLVM recompiler. However, thanks to the improvements made to the emulator in recent months, all these issues have now been addressed.

Cross Edge

Another niche console exclusive, Cross Edge, a tactical RPG title that features characters from a wide variety of franchises, was found to be playable this month.

Dark Mist

To finish our roundup of games, Dark Mist was found to be playable this month. This title previously required LLE libvdec to be enabled, which reduced performance enough to keep it from being playable. However, with improvements made to the emulator, this is no longer required. Users should note that this title currently only works with the OpenGL renderer.

Other Improvements

There have been numerous other pull requests merged during the month that couldn’t make it to the Major Improvements section. We have collected a list of all such improvements here, and attached a brief overview to each. Make sure to check out the links provided for them if you are interested, as their GitHub pages usually uncover further details as well as the code changes themselves. To see this whole list right on GitHub, click here.

Nekotekina

5901 – Fixed a few regressions in SPU LLVM caused by PR 5882;

5915 – Fixed a regression in SPU LLVM where Drakengard 3 couldn’t create a new savegame;

5923 – Improved SPU Analyser when block size is set to Giga;

5975 – Fixed regressions caused by the above pull request;

5944 – Provided PIC support for the SPU LLVM recompiler. See coverage in major improvements here;

5967 – Fixed regressions caused by the above pull request;

5976 – Fixed a minor bug when using Safe SPU block size. Also, fixed broken BISLED instruction behaviour in SPU ASMJIT recompiler;

5993 – Improved performance when using TSX instructions;

6020 – Fixed regressions caused by the above pull request.

kd-11

5895 – Removed a workaround for AMD GPUs that is no longer needed with newer drivers (since version 19.4.3), fixed texture cache search for flip source, fixed confirmed_range calculation for hardware blit transfers (software memory write barrier for flushing), fixed out-of-bounds transfers due to no bounds check when testing local resources for overlap and fixed clear failure due to null stencil mask triggering an assert;

5922 – Allowed certain drivers to bypass the window state polling if they properly handle OUT_OF_DATE and/or SUBOPTIMAL return codes to signal that the window surface has changed. This offsets the surprisingly large penalty of polling the window size from Qt on some Linux window managers. Also, fixed a typo in OpenGL renderer code;

5937 – Improvements to the surface cache. See coverage in major improvements here;

5977 – Transition attachments to LAYOUT_GENERAL in case of a feedback loop. Fixed appearance of garbage along polygon edges in some post-processing passes. Also reversed this transition when rendering goes back to normal. Optimized transitions from LAYOUT_GENERAL to some common read/write layouts;

5979 – Fixed a bug where vk::sampler::matches would always return false because the info was never initialized. Added the first step of centralized resource management by managing a sampler pool instead of pushing and deleting resources every frame. Fixed performance gap between AMD and Nvidia GPUs found in Ninja Gaiden Sigma;

5988 – Bumped max allocated draw call resources up to 16K from 4K to avoid easily running out of resources mid-frame in heavy scenes. Used a simple FIFO queue for frame data handling. In the future, this can easily be expanded to use a present scheduler thread for frame-pacing support; Fixed exploding vertices on Ratchet & Clank and Yakuza titles when using AMD GPUs; Fixed a regression caused by PR 5565 where inFamous crashes during intro;

5995 – Fixed staging buffer size calculation where pitch was row-aligned to 1 byte instead of 4 bytes; Fixed gsl::span assert when using OpenGL renderer in some games; Fixed Ni no Kuni stuttering ingame caused by the above pull request; Fixed a crash in Uncharted: Drake’s Fortune when the Debug output option is enabled;

6011 – Ensured the current renderpass matched the image properties even when a cyclic reference was detected; Solved problems due to mismatching layouts and renderpasses;

6025 – Refactored out framebuffers from the renderer core, bumped up shader cache version and implemented a proper cache with sorted queues for faster searching;

6030 – Updated glslang;

6039 – Fixed aux context usage when handling swap queues. The aux context did not have its own descriptor pool and borrows from the laggy frame context which had a pool that was still in use; Refactored out GLSLCommon from VKHelpers since VKHelpers was included in GUI code to assist with context initialization. This removed a lot of compiler warning spam about unused static functions declared in the header.

Megamouse

5519 – Fixed wrong module count in Qt Compilation Dialogs; Fixed a glitch that stacks Qt Compilation Dialogs; Don’t add games to the Recent Games List when the add_only option is used; Ignore operations other than boot when rebooting the last game from the GUI; Show all unique game data entries instead of only the first; Added option to batch create PPU cache and batch remove PPU cache, SPU cache, shader cache and custom configurations;

5916 – Properly scale game icons as well as the compatibility circle-marks in both the game list and the game grid on high-DPI screens. This is especially noticeable on 4K screens where, prior to this pull request, the icons would look jagged and aliased;

5655 – Updated the Travis Mac build process to Xcode 10.2 and removed an unnecessary workaround for Xcode 10.1;

5929 – Added per controller color picker in the pad settings dialog for DualShock 4 controller LEDs and future implementations of other game pad LEDs;

5960 – Added an optional custom pad config for every game.
The configs will be saved per game in config/custom_input_configs/. The configs behave exactly like the normal pad configs just with their own config files including the profiles. New icons were added to show which game has its own pad configuration. Users can assign a new config from the game list using the context menu;

5980 – Added 10 step navigation to the native in-game save data dialogs. You can now use the L1 and R1 buttons to jump 10 entries back and forth in the list, which makes navigating through big lists faster and more responsive;

5978 – Added simple multipliers for left and right analog sticks to the input config files. The allowed range is 0-200 (0.0 – 2.0) with the default value being 100 (1.0). Access to this functionality through the GUI is also planned to be added; Fixed a crash in the pad settings dialog that was caused when clicking “Filter Noise” while the current pad was disabled. The button will now only be activated when there is a pad connected; Fixed a bug that shows the wrong color when selecting LED colors; Fixed a bug that resets the LED colors when clicking the vibration checkboxes; Fixed a crash when downloading the compatibility database twice in a row; Fixed a crash when trying to load an empty PARAM.SFO in the save data manager; Removed unnecessary warnings;

4450 – Improved support for game collections. See coverage in major improvements here.

scribam

5949 – Fixed a minor typo in OpenGL and Vulkan renderers used only for debugging;

5956 – Removed redundant semicolons. Added missing #pragma once directive;

6042 – Removed duplicated condition in pipeline_props struct equal operator in Vulkan.

MSuih

5905 – Used RPCS3’s wiki for DualShock 3 and DualShock 4 troubleshooting instructions;

5982 – Added a Max SPURS threads option in the Debug tab;

6040 – Removed the SPU verification setting from Debug tab and changed the Log shader programs setting’s behaviour.

Whatcookie

5919 – Doubled dpad repeat rate;

5998 – Fixed inaccurate FPS counter in the performance overlay on Windows when the detail level is set to High;

6006 – Enabled window resizing again on Linux with amdvlk/amdgpu-pro drivers.

elad335

5907 – Fixed sys_rwlock_runlock on waiting readers and sys_rwlock_wlock timedout event; Returned ESRCH if PPU thread ID was not found in sys_cond_signal_to;

5939 – Fixed 3D swizzled texture to linear conversion. This fixed an old regression in Call of Duty titles.

drysalter

5899 – Fixed minor visual bugs in RPCS3’s GUI;

5932 – Further improvements to the Skyline and Envy themes.

GalCiv (RipleyTom)

5933 – Improved the DS3 pad handler on Windows. Now supports Sony’s official driver distributed with the PlayStation Now installer.

ruipin

5951 – Fixed several regressions caused by PR 5923 when using the SPU LLVM recompiler with Mega SPU block size.

z0z0z

5963 – Used setenv instead of qputenv for QT_AUTO_SCREEN_SCALE_FACTOR parameter.

Exfiltratior

5930 – Fixed cache deletion on Linux.

Closing Words

We hope you liked this report and look forward to the next one! If you would like to contribute to the project, you can do so either by contributing code, helping the community or becoming a patron. RPCS3 has two full-time developers working on it who greatly benefit from the continued support of the many generous patrons. In exchange, patrons also get special support over on our Discord server and get access to early updates directly from our lead developers. If you are interested in supporting us, consider visiting our Patreon page at the link below and becoming a patron, or join our Discord server to learn about other ways of contribution.

We’re always looking for dedicated writers to help us write these reports. If you have the skill, time and are willing to help, please apply here. Also, come check out our YouTube channel and Discord to stay up-to-date with any big news.

This report was written by Asinine, elEnemigo, MarioSonic2987, HerrHulaHoop, Digitaldude555 and Megamouse.