Progress Report: May 2018

RPCS3 continues to see major improvements over the month of May, with Nekotekina implementing SPU LLVM (still WiP) and kd-11 continuing with improvements to RPCS3’s RSX emulation. More major AAA exclusives have also started to go ingame for the first time! We also saw new contributors join in and make much appreciated improvements to RPCS3.

In addition to the following report, further details of Nekotekina and kd-11’s work during May and upcoming contributions can be found in their weekly reports on Patreon. This month’s Patreon reports are:

Status update from kd-11 (2018-05-05)
Status update from Nekotekina (2018-05-14)
Status update from kd-11 (2018-05-20)
Status update from Nekotekina (2018-05-29)

Table of Contents

Major Improvements
Games
Commits
Conclusion

The Nothing category is now at an all-time low, having shrunk by just over half, and work is already under way to shrink it even further! The Playable category has had another nice increase, as more Playable titles were found. Furthermore, many entries for the same Game Media on the list were merged (due to new reports for different regions being submitted these past months), so the overall game count has decreased, even though there were new unique submissions. For a more detailed look, you can view the compatibility history page to see exactly which games had their status changed this month.

Game Compatibility: Game Status
Game Compatibility: Monthly Improvements (May 2018)

On Git statistics, 23 166 lines of code were added and 6 732 were removed by 17 authors.

Major RPCS3 Improvements

SPU LLVM Recompiler (#4504)

Nekotekina implemented the SPU LLVM Recompiler during May, though it is still experimental! It may not currently work with many games, but the ones that do work saw a good improvement in performance and audio. For example, Wangan Midnight, which works with SPU LLVM, has seen its audio stutter disappear, and other titles such as Drakengard 3, Killzone 3 (which now goes ingame for the first time) and Tales of Vesperia also received noteworthy improvements. Be sure to test your games with it and report back if you find anything interesting!

Shader Analyser (#4505)

On the first of May, kd-11 began implementing a work-in-progress shader analyser which will help reduce the number of shaders RPCS3 compiles. This is done by gathering metadata during the compilation of a shader and using that information to check shaders against bound textures. By doing this, unbound textures will simply be skipped instead of having to be decompiled just to check if they are actually being used.
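As a rough illustration of the idea (not RPCS3's actual code), the sketch below records which texture units a shader references the first time it is processed, then uses that metadata so that only those units take part in the lookup. The names `ShaderMetadata`, `analyse` and `cache_key`, and the toy `TEX<n>` syntax, are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ShaderMetadata:
    """Facts gathered once, while the shader is first processed."""
    referenced_units: frozenset  # texture unit indices the shader samples


def analyse(source: str) -> ShaderMetadata:
    # Toy stand-in for real analysis: treat every "TEX<n>" token as a
    # sample from texture unit n.
    units = {int(n) for n in re.findall(r"TEX(\d+)", source)}
    return ShaderMetadata(frozenset(units))


def cache_key(source: str, meta: ShaderMetadata, bound_textures: dict) -> tuple:
    # Only units the shader actually references take part in the key.
    # Whatever is bound (or not bound) on other units is skipped entirely.
    relevant = tuple(
        (unit, bound_textures.get(unit)) for unit in sorted(meta.referenced_units)
    )
    return (hash(source), relevant)
```

With this shape, binding an extra texture to a unit the shader never samples yields the same key, so no new shader variant is needed for it.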

kd-11 also made RPCS3 assume that if there is no L2 cache eviction then the fragment program microcode has not been updated, and made the same assumption for some registers that control vertex programs. As the hardware would not see the changes anyway, there is no need for RPCS3 to do extra work. This improves performance in titles such as Far Cry 2, Prince of Persia and Metal Gear Solid 1 by a significant margin.
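The general pattern here is a dirty flag: keep using the cached program until something signals that it may be stale. The following is a minimal sketch of that pattern only; `FragmentProgramCache` and `notify_cache_eviction` are hypothetical names and do not correspond to RPCS3 internals.

```python
class FragmentProgramCache:
    """Re-reads program microcode only after an invalidation event."""

    def __init__(self, read_ucode):
        self._read_ucode = read_ucode  # callable returning the current microcode bytes
        self._cached = None
        self._dirty = True             # nothing cached yet
        self.reads = 0                 # how many times memory was actually re-read

    def notify_cache_eviction(self):
        # Hypothetical hook: called when the guest signals that cached
        # data may be stale.
        self._dirty = True

    def current_program(self):
        if self._dirty:
            self._cached = self._read_ucode()
            self.reads += 1
            self._dirty = False
        return self._cached
```

Without an eviction, a change in guest memory is deliberately ignored, mirroring the reasoning that the real hardware would not have seen it either.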

This work lays the foundation for future improvements that should eventually eliminate shader compilation stutter entirely. The Pull Request also fixed or improved shadows in Assassin’s Creed, Saw 2 and Prince of Persia.

Texture cache Improvements (#4611)

By re-implementing RPCS3’s texture cache to take advantage of the memory mirror support added by Nekotekina, kd-11 fixed the following issues: flickering cutscenes in Uncharted, Sly Cooper Thieves in Time and Tekken Tag Tournament 2, and broken particle and alpha effects in Beyond Good & Evil HD. This change also improved RPCS3’s stability.

Memory protection fixes and RSX Improvements (#4649, #4671)

kd-11 made some fixes to RPCS3’s memory protection which fix flickering and missing objects in some games, especially if Write Color Buffers was enabled. This also reduces the need for Strict Rendering Mode in various Naughty Dog games such as The Last of Us and the Uncharted games. This means that you can now use resolution scaling in The Last of Us! Click on the image below to view it at the full resolution.


You may have noticed already that the visuals are not very impressive, even though it’s running at 4k. This is caused by issues with post-processing. kd-11 plans to work on improving this in the future.

kd-11 also fixed resolution scaling in God of War 3 by constantly pinging the RSX thread while in the spinwait state to check for cyclic dependencies. This allowed the game to no longer run in software mode, which resulted in scaling being fixed! kd-11 also fixed a regression where God of War 3 required the debug option “Force CPU Blit” to be enabled, which lowered performance and caused confusion among users.

These improvements among others can be seen in the video below.

RSX Capture/Replay (#4510)

On the RSX front, Jarveson implemented the beginnings of a fairly useful debugging feature: capturing and replaying! Anyone who has ever tried to debug graphics issues knows that RenderDoc is invaluable when trying to track them down and fix them. Its shortcoming, however, is that it doesn’t tell the whole story from the emulator side: it leaves out what the game is actually telling the RSX to render. This feature aims to fix that. By capturing the RSX commands and memory needed for a displayed frame, we’re now able to replay and display that exact frame in the emulator. A developer can now see the full path from game to RSX to screen, without having to waste time on trying to reproduce the error. This can also help with regression testing, as replays can be compared between builds. It should be noted that there are a few restrictions and issues with the initial implementation; check out the pull request description for more info.
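To make the capture and replay idea more concrete, here is a small sketch under heavy simplification. It is not the format or code used by the pull request; `FrameCapture`, `Recorder` and `replay` are hypothetical names, commands are reduced to (register, value) pairs, and guest memory is a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class FrameCapture:
    """Everything needed to redraw one frame, as recorded from the guest."""
    commands: list = field(default_factory=list)  # (register, value) pairs in submission order
    memory: dict = field(default_factory=dict)    # address -> bytes snapshot


class Recorder:
    def __init__(self, guest_memory: dict):
        self._guest_memory = guest_memory
        self.capture = FrameCapture()

    def on_command(self, register: int, value: int, touched_addresses=()):
        # Record the command and snapshot any memory it depends on, so the
        # replay does not need the game to be running.
        self.capture.commands.append((register, value))
        for addr in touched_addresses:
            self.capture.memory.setdefault(addr, bytes(self._guest_memory[addr]))


def replay(capture: FrameCapture, execute):
    # Feed the recorded commands back in their original order.
    # `execute` stands in for the emulated GPU's command handler.
    for register, value in capture.commands:
        execute(register, value, capture.memory)
```

Because the memory is copied at capture time, the replay sees the same data even if the game has since overwritten it, which is what makes comparing the same frame across builds possible.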

Vulkan Memory Allocator & Timing Fixes (#4579, #4635)

Pauls-gh discovered that Tales of Vesperia was bottlenecked by the GPU in certain areas and went on to profile the code. He then found that Vulkan video memory allocations were very slow. This led to adding the open source Vulkan Memory Allocator library. With this library, Tales of Vesperia sees a performance improvement of up to 100%.
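The core trick behind allocators of this kind is sub-allocation: request a few large blocks from the driver and carve individual resources out of them. The sketch below shows that technique in its simplest form (a bump allocator with no freeing); it is not the Vulkan Memory Allocator API, and `BlockSubAllocator` and its default sizes are hypothetical.

```python
class BlockSubAllocator:
    """Hands out aligned offsets from a few large blocks instead of
    making one expensive device allocation per resource."""

    def __init__(self, allocate_block, block_size=64 * 1024 * 1024):
        self._allocate_block = allocate_block  # the slow call, e.g. a driver allocation
        self._block_size = block_size
        self._blocks = []                      # list of [handle, next_free_offset]

    def allocate(self, size: int, alignment: int = 256):
        if size > self._block_size:
            raise ValueError("request larger than a block")
        for block in self._blocks:
            offset = -(-block[1] // alignment) * alignment  # round up to alignment
            if offset + size <= self._block_size:
                block[1] = offset + size
                return block[0], offset
        # No room anywhere: pay for one new large block.
        handle = self._allocate_block(self._block_size)
        self._blocks.append([handle, size])
        return handle, 0
```

Most requests are then served by simple arithmetic on an existing block, and the slow driver call only happens when a new block is needed.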

Shortly after, Pauls-gh looked into a long-standing issue with Tales of Vesperia where character animations would play in slow motion. After finding that Linux did not have this issue, he used Microsoft Concurrency Visualizer and found that the game was spending a lot of time waiting on condition variables (cond_variable::imp_wait).

This led to the discovery that the OS wait (NtWaitForKeyedEvent) was not microsecond accurate, as Windows can only sleep with a resolution of 0.5 msec. After modifying the PS3 function sys_timer_usleep to skip sleeping if the time was less than 300 usec, character animations finally ran at the correct speed on Windows.
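A small model helps show why short sleeps were so costly. The sketch below assumes, as a simplification, that a host sleep always lasts a whole number of 0.5 msec ticks and at least one; the function names are hypothetical and this is not the code from the pull request.

```python
import math

TIMER_RESOLUTION_US = 500  # 0.5 msec host sleep granularity, from the text above
SKIP_THRESHOLD_US = 300    # requests shorter than this are not slept at all


def host_sleep_duration_us(requested_us: int) -> int:
    """Model of a coarse host timer: a sleep lasts at least one whole tick."""
    return max(1, math.ceil(requested_us / TIMER_RESOLUTION_US)) * TIMER_RESOLUTION_US


def emulated_usleep(requested_us: int, host_sleep) -> int:
    """Returns how long the guest thread was actually stalled, in usec."""
    if requested_us < SKIP_THRESHOLD_US:
        return 0                     # skip: sleeping would overshoot badly
    stalled = host_sleep_duration_us(requested_us)
    host_sleep(stalled / 1_000_000)  # e.g. time.sleep, which takes seconds
    return stalled
```

Under this model a 30 usec request would stall the thread for 500 usec, more than sixteen times longer than asked, which is consistent with animations appearing to run in slow motion. Skipping the sleep removes that overshoot at the cost of not waiting at all.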

However, this wasn’t hardware accurate, as kd-11 quickly discovered after running hardware tests on a real PS3. To address this, kd-11 modified the RPCS3 implementation of sys_timer_usleep to make it usec accurate with real PS3 hardware, then tuned values to ensure performance was not lost on low-end CPUs (committed as part of #4661).

This led to massive Windows performance improvements in Nier, Tales of Vesperia, Atelier Escha & Logy Alchemists of the Dusk Sky and more! Nier in particular went from single digit frame rates to a locked 30 FPS on many machines, which resulted in this exclusive becoming playable!

Here are a few comparative screenshots of Tales of Vesperia where the image on the left has Vulkan Memory Allocator disabled and the image on the right has Vulkan Memory Allocator enabled:

Performance Overlay (#4620)

This month VelocityRa added a much-requested feature, a Performance Overlay. Work on the overlay began after nitrohigito opened a feature request (#4500) in April. After 2 months of fleshing out the scope and design of the overlay, the PR was finalised and merged this month. The Overlay presents various information such as:
1. Total CPU Usage along with the breakdown between the PPU, SPU and RSX emulation
2. Thread count for PPU, SPU and RSX emulation
3. FPS and frame time
4. RSX Load

The Overlay has a few configurable options, the most important of which is the Detail Level.
It has four such presets, namely ‘Minimal’, ‘Low’, ‘Medium’ and ‘High’, each with increasing levels of detail:

Since the Performance Overlay made use of the native UI, kd-11 refactored and fixed various issues present in it to make it easier for other developers to cleanly manage adding new dialogs and interfaces without worrying about object management (#4623).
Also, Megamouse handled the Qt aspects by adding a GUI tab to the settings dialogue and exposed some Performance Overlay options to it (#4673).

Revamped Trophy Manager (#4604)

Thanks to the work of flash-fire, RPCS3 gained a trophy manager back in October 2017. While the initial implementation focused primarily on the design and functionality, the trophy manager did come with a collapsible column structure that maintained both the game and trophies in a single table. While this was a welcome addition, the trophy list proved arduous to use with multiple games in the list. To fix this, Megamouse completely rewrote the list structure of the trophy manager this month! The Trophy Manager now uses two tables, one for the games and one for the selected game’s trophies. You can switch to a game simply by selecting it with either mouse or keyboard and thus display its trophies to the right.

Also, game icons have now been added to the games table! Just like the trophy icons, these icons are also resizable. You can also use the handle between the tables to resize the tables and even hide either of them. A dropdown was added to the leftmost toolbar in order to make it possible to still choose games while the games list is hidden.

Games

NieR

As mentioned earlier in the report, NieR saw a massive performance boost on Windows specifically due to some timing fixes. Instead of single digit framerates on Windows, you can now expect a locked 30 FPS even on a mid-range CPU. This made NieR playable for the first time! If you want to try running the game for yourself, just remember to enable Write Color Buffers (GPU Settings Tab).

Lollipop Chainsaw Massacre

While this title has been playable for around a month now, graphics have recently improved and it no longer requires strict rendering mode, meaning you can play at high resolutions without breaking visuals!

Shadows of the Damned

Another console exclusive game is now Playable. This one runs quite well; just remember to keep the aspect ratio set to 16:9 and Anisotropic Filtering on Automatic if you want to try it out for yourself!

Killzone 2 & 3

With the recent improvements to SPU ASMJIT Recompiler by Nekotekina, these 2 AAA PlayStation 3 exclusives finally go ingame!

Silent Hill Downpour

A fix made in April for a cutscene issue that also affected Ni No Kuni allowed Silent Hill Downpour to get past the prologue without a work-around. This change, along with other graphical and performance improvements, has allowed Silent Hill Downpour to be considered playable!

NCAA Football

NCAA Football 10 and 11 and most likely many more titles go ingame after user “sftt” fixed an underflow issue in sys_memory.

Pro Evolution Soccer

Thanks to recent improvements by elad335 to the memory mapping alignment check, various PES games now reach Intro.

Disorder 6

This console exclusive was previously ingame due to graphics and audio glitches. Recent testing has shown that this title is playable!

Ketsui Kizuna Jigokutachi Extra

Another arcade console exclusive title which made its way to playable.

Muv-Luv Photonflowers and Photonmelodies

These exclusive VNs are now playable. More weeb games!

Commits

As always, this is not a complete list of PRs or commits, nor does it necessarily list every single thing done by a given PR. For a full list, see PRs merged in May. Many of the unmentioned PRs are simply small updates, fixes, and other small improvements, sometimes only to the GUI.

Nekotekina

4504 – SPU improvements, see coverage in major improvements.

4553 – Improves Travis automatic build systems, SPU analyzer and code generation.

4565 – Fixes some regressions caused by the previous PR.

4580 – Asmfixes, rewrite of the TSX transactions’ logic in assembler which improves the speed and consistency between different platforms and compilers. Also fixed a bug in Linux where some games would only show a black screen #4460.

4603 – Asmfixups, fixes some regressions caused by the previous PR.

4622 – Diagnostic, fixes some more regressions caused by #4580;

4631 – Perf fixes, Improves emulator for non-TSX processors. This work is intertwined with eliminating deadlocks in RSX.

4646 – Perf fixups, Continuation of the previous PR.

4664 – Insignificant fixes, rewrite of SSSE3 path to use ASMJIT to generate SSSE3+ code at runtime. Helps AppImages which were previously compiled without SSSE3 support. Also, LLVM compilation progress dialog will now use a single dialog for all threads!

kd-11

4505 – rsx: Stuff, Introduces the base for GPU program analysis and management; Properly works around two AMD driver bugs, one Vulkan bug relating to primitive restart and another OpenGL bug caused by broken generation of gl_vertexID in some situations; Properly initializes CC registers to fix some poorly written assembly shader ucode; Implements flow control, with proper jumps and function calls (CAL, BRI, RET, etc). This adds proper support for subroutines among other improvements.

4623 – Refactor of the Native UI overlays system, see coverage in major improvements. This PR also fixes some annoying flickering with native interfaces or in some cases invisible interfaces.

4628 – Fixup: Overlays, addendum to the above PR.

4611 – rsx: Fixes, Reimplement texture cache to take advantage of the memory mirror support and minor fixes to logging to reduce log spam.

4649 – rsx: Fixup, Memory protection fixup which fixes flickering and missing stuff in some games especially if Write color buffer is enabled; Optimize a check in main RSX loop to restore performance lost in the above PR; Fix OpenGL Write color buffer regression (‘unreachable’ assert).

4661 – Stuff, Disable thread scheduler for Intel CPUs; Fix a race condition causing crashing in overlays; Promote FIFO optimizations to stable, which prevents the feature from getting disabled when using Strict rendering mode and allows Strict rendering mode to work with usable performance. A debug option has also been added for developers; Finish reimplementation of sys_timer_usleep.

4671 – More stuff, Evade semaphore acquire deadlock by constantly pinging RSX thread during spinwait to check for cyclic dependencies; Simplify task queue management and use yield instead of busy_wait to reduce audio stutter on Intel i5 processors; Fix silly bug causing native UI crashes when trophy popup closes.

4679 – Fixes a leak in performance overlay causing ever-increasing compiled buffer size with ridiculous redraw and rework of Vulkan overlays handling with better memory management.

Megamouse

4550 – Moved the additional Qt plugins (like the folders: bearer, imageformats, styles and platforms) into a subfolder “qt/plugins” using windeployqt and a new .conf file. The old folder structure can still be used to preserve backwards compatibility;

4562 – Fixes a bug in the previous PR which resulted in broken Appimages. Paths for windeployqt are now moved to an extra windows.conf file;

4594 – The prior gitignore file did not always exclude all files inside the bin directory. This resulted in config.yml files, logs, 7z and many more to show up in the unstaged files in gitkraken. A rather simple rewording fixed this issue;

4407 – Includes notable new feature additions to the gamelist along with other minor fixes. First of all, a new column has been added to the gamelist: “Move Support”. By checking a flag in the param.sfo this column shows us if a game supports the move controllers, which will hopefully soon be added to the roster of usable peripherals. Currently this is very useful in order to sort for games that can or can’t be played at the moment. Another bigger change (apart from some rare GUI crash fixes and optimizations) is the complete overhaul of the gamelist data model. This fixed a lot of minor issues, like weird scrolling after sorting or game additions/removals, wrongly selected items after similar interactions and last but not least wrong sorting results;

4604 – Major rewrite of the Trophy Manager, see coverage in major improvements. Further improvements were also made to the gamelist such as fixing size issues regarding the gamelist headers, rework the way their settings are saved in order to fix future inconsistencies and more;

4378 – You can now just press any “Play” option on starting RPCS3 in order to boot your last played game. This is handy if you play or test the same game repeatedly, whether because it crashes the emulator, because you use different builds, or because you just want to back up logs. You could already do this before with a shortcut: ctrl+1. Also, the log message severity for “Recent Game Not Valid” has been moved from Error to Warning (only the common occurrence that happens when you delete or move games);

4617 – Bugfixing for 4594, 4604 and 4407;

4621 – Bugfixing for 4604;

4397 – Fixes an error that prevented “Battle Princess of Arcadias” from proceeding ingame. A savedata function falsely aborted when an allowed (albeit technically incorrect) status was returned by a callback. Allowing the function to progress and only filtering out actual errors was the correct solution to this problem;

4637 – Fixes various minor things in the gamelist and also changes the size ratios of a few main window elements. For example: The log frame is now smaller when you first boot up RPCS3 and also the gamelist columns will have a slight offset to each other;

4673 – All UI settings in the Emulator tab have now been moved to a dedicated GUI tab. This was done primarily to accommodate Performance overlay options of VelocityRa in the Emulator tab. It now has a checkbox that enables the Performance Overlay and its other options: a combobox for the 4 detail levels from Minimal to High, a slider to change its update interval in milliseconds and a slider that controls the font size. Hopefully this part of the settings dialog will come in handy!

elad335

4660 – Rsx: Naughty fixes, Fixes RSX IO addressing assignment when one of the IO address ranges has been deallocated which allows the IO mapper to return the correct IO offset values.
This fix allowed exclusives such as Uncharted 2 and Uncharted 3 to finally go ingame and also allowed The Last of Us to progress a little further!

4672 – Fixes memory mapping alignment check by checking the address alignment via the alignment of its page. This fix allowed five Pro Evolution Soccer games to go from Nothing/Loadable into Intro!

jarveson

4510 – rsx: initial capture/replay functionality, see coverage in major improvements.

jbeich

4532 – Unbreak build on FreeBSD;

4540 – Make sure assertions (e.g., in LLVM) are disabled by default;

4596 – Unbreak build on FreeBSD;

4655 – Unbreak build on FreeBSD.

Nicba1010

4650 – Vulkan SDK Mirror, Sets up a mirror for the Vulkan SDK executable, whose download cap was previously causing automatic builds to fail.

scribam

4514 – hle: make cellSubDisplayInit returns CELL_SUBDISPLAY_ERROR_ZERO_REGISTERED, Simulates that a remote play device is not connected. Improves Class of Heroes 2G and Guacamelee! (Playable Demo);

4572 – Fixes some typos in RPCS3’s code.

4574 – Improved cellSync2 error checks and logging (HLE).

4567 – Housekeeping, Removes some redundant code relating to rpcs3-tests and ps3emu_api;

4593 – Compilation fix for GCC 8;

4578 – Cmake improvements;

4577 – Travis improvements, Linux builds now rely on Vulkan packages which negates the need to checkout the submodule. Also, leftover instructions related to the usage of the Qt installer have been removed;

4608 – 3rdparty: update Vulkan integration, Remove the Vulkan-LoaderAndValidationLayers submodule/project to directly use the official Vulkan SDK;

4639 – Link dynamically with the Vulkan loader, Avoids linking to the static library which may cause issues. With this PR, Vulkan is now compatible with monitoring tools such as RivaTuner/MSI Afterburner.

pauls-gh

4579 – Timer resolution fix for Windows 10 update 1803 performance regression, see coverage in major improvements.

4635 – Vulkan memory allocator performance enhancement, see coverage in major improvements.

VelocityRa

4618 – Fix Windows build;

4620 – Performance Overlay, see coverage in major improvements.

hcorion

4581 – Makes AppImages work and fixes xxHash issues.

4588 – Re-enable LLVM for travis and build AppImages with LLVM 6.

creeperjedi

4564 – Added “(experimental)” next to the SPU LLVM Recompiler.

tlm-2501

4558 – Add description for SPU LLVM.

isJuhn

4539 – Fix setParam in cellHddGameCheck which allows Record of Agarest War to go ingame.

AniLeo

4526 – gui/themes: Refactor Kuroi (Dark).

Orphis

4507 – CMake build optimization.

Maxetto

4490 – Update some lv2 syscall names.

sftt

4399 – Avoid illegal available_user_memory (<0 or underflow) in sys_memory.cpp, which allowed Shining Resonance, Everybody’s Golf: World Tour, and NCAA Football 10 and 11 to go ingame!
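For readers unfamiliar with this class of bug, the sketch below shows how an unsigned subtraction can wrap around into a huge "available" value, and one simple way to guard against it. The 32-bit width and both function names are assumptions made for illustration; the actual change in the pull request may differ.

```python
U32_MASK = 0xFFFF_FFFF


def u32_sub(a: int, b: int) -> int:
    """Subtraction as a 32-bit unsigned register would perform it."""
    return (a - b) & U32_MASK


def available_user_memory(total: int, used: int) -> int:
    """Clamp to zero instead of wrapping when more is used than the total."""
    return total - used if used <= total else 0
```

With a total of 0x1000 and 0x2000 in use, the wrapping version reports 0xFFFFF000 bytes free, while the clamped version reports none.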

Closing Words

If you like in-depth technical reports, early access to information, or you simply want to contribute, consider becoming a patron! All donations are greatly appreciated. As of last month, RPCS3 now has two full-time coders that greatly benefit from the continued support of over 700 generous patrons.

Also, come check out our YouTube channel and Discord to stay up-to-date with any big news.

This report was written by Asinine, HerrHulaHoop, Jarves, VelocityRa and Megamouse.