When we execute a JIT block, we have to make sure that both the PC and
the DR/IR bits of MSR are the same as they were when the block was
compiled. When jumping to a block from the dispatcher, this is done in
the way you would expect: By checking the PC and the relevant MSR bits.
However, when returning to a block using the BLR optimization, we only
check the PC. Checking the MSR bits is done by instead resetting the
stack when the MSR changes, making PC checks afterwards fail.
Except... We were only resetting the stack on rfi instructions. There
are actually many more ways for the MSR to change, and we weren't
covering those at all. I looked into resetting the stack on all of them,
but it would be pretty cumbersome both in terms of writing the code and
in terms of how often at runtime we'd have to reset the stack, so I
think the better option would be to check the MSR bits along with the
PC. That's what this commit implements.
Modify PPCSTATE_OFF and PPCSTATE_OFF_ARRAY macros when using GCC to
avoid useless log spam. Specifically, use a consteval lambda with gcc
_Pragma statements to disable the -Winvalid-offsetof warning inside the
macros.
Each successful build (and many failing ones) on the Android buildbot
generates almost 300 cases of -Winvalid-offsetof, resulting in thousands
of lines of log spam per build. In addition to bloating the log filesize
these spurious warnings make it harder to find actual warnings.
These warnings are generated by calls to the macros PPCSTATE_OFF and
PPCSTATE_OFF_ARRAY, which in turn are used by many other macros used by
the JIT. The ultimate cause is that offsetof is only conditionally
supported on non-standard-layout types, which includes the PowerPCState
struct.
To address potential questions of whether there's a better way to handle
this:
The obvious solution would be to modify PowerPCState so that it does
have a standard layout. This is unfortunately impractical.
To have a standard layout a type can only contain other types with
standard layouts. None of the stl containers are guaranteed to have
standard layouts, and PowerPCState contains a std::tuple and std::array.
PowerPCState also contains a PowerPC::Cache and InstructionCache which
themselves contain std:arrays and std::vectors.
Furthermore InstructionCache derives from Cache, and a derived class can
only have standard layout if at most one class in its hierarchy has a
non-static data member, but both classes have such members. Making
InstructionCache have a standard layout would require duplicating all
the functionality of Cache so it no longer derived from it, as well as
replacing the stl containers. This might require having a raw pointer to
said containers, with the manual memory management that implies.
All of that would be much more disruptive than would be justified to get
rid of some warnings (however annoying they might be). This is
compounded by the fact that PowerPCState hasn't had a standard layout
for a long time, if ever, and if the PPCSTATE_OFF macros weren't working
reliably it would have become obvious a long time ago.
As to why I picked the lambda solution over other potential changes:
- Keeping the define as-is and wrapping some gcc #pragmas around it
doesn't work because the pragmas don't get included when the define is
substituted to the call site.
- Keeping the define as a non-lambda expression and using inline
_Pragma() statements would ideally be better and works fine for msvc,
but fails for GCC with "'#pragma' is not allowed here".
- Turning off -Winvalid-offsetof globally for gcc would work, but there
might be other contexts where offsetof is problematic and GCC seems to
be the only compiler warning about it.
Turns out that we have two subsystems that want to register CPU thread
callbacks from a different thread than the CPU thread: FreeLookConfig
and VideoConfig. Both seem to happen from the host thread before the CPU
thread starts, so there's no thread safety issue. But ideally, if we
want to allow registering callbacks from threads other than the CPU
thread, we should make registering callbacks actually thread-safe. This
is an unsolved problem for the regular Config system, so I would like to
leave it outside the scope of this PR.
In theory, our config system supports calling Set from any thread. But
because we have config callbacks that call RunAsCPUThread, it's a lot
more restricted in practice. Calling Set from any thread other than the
host thread or the CPU thread is formally thread unsafe, and calling Set
on the host thread while the CPU thread is showing a panic alert causes
a deadlock. This is especially a problem because 04072f0 made the
"Ignore for this session" button in panic alerts call Set.
Because so many of our config callbacks want their code to run on the
CPU thread, I thought it would make sense to have a centralized way to
move execution to the CPU thread for config callbacks. To solve the
deadlock problem, this new way is non-blocking. This means that threads
other than the CPU thread might continue executing before the CPU thread
is informed of the new config, but I don't think there's any problem
with that.
Intends to fix https://bugs.dolphin-emu.org/issues/13108.
The previous list had some issues. A lot of variant id's were set to 0x0000. Althought this works for some figures, on a technicallity implemented into the games, they are technically wrong and don't result in exactly the same experience as the real figures. For example, the previous small fry got a "series 1" text in the summon screen. The real small fry does not have this. I also added figure types so I can add seperate generation logic later.
The Kaos element only applies to 3 items. So, I decided to throw it under others since it's not listed as an element in the manual and you can easily search for Kaos
Now block link nearcode is back to a length of three instructions.
Unfortunately, the code I'm adding to Jit.cpp ends up being a bit messy
because we need to handle the case of already being in farcode...
Jumping between linked blocks currently works as follows: First, at the
end of the first block, we check if the downcount is greater than zero.
If it is, we jump to the `normalEntry` of the block. So far so good. But
if the downcount wasn't greater than zero, we jump to the `checkedEntry`
of the block, which checks the downcount *again* and then jumps to
`do_timing` if it's less than zero (which seems like an off by one error
- Jit64 doesn't do anything like this). This second check is rather
redundant. Let's jump to `do_timing` where we previously jumped to
`checkedEntry`.
Jit64 doesn't check the downcount on block entry. See 5236dc3.
I initially thought the 0x01 side was both sides (equavalent to just C. However, this turned out to be something I forgot I implemented in my personal interface. 0x01 does not seem to change any colors.
Recently discovered how exactly the last 2 bytes of the J command for timing data
using portmapping with hosting while using traversal server (which is possible by checking the option while under "direct connect" and flipping back to traversal server) causes dolphin to request a mapping to external port 0.
In UPnP a mapping to external port 0 is actually the wildcard, which means that connection requests on all
external ports (that are not otherwise mapped) will be forwarded to the client.
Additionally it seems like using port mapping with traversal server is probably not expected behavior, as the option checkbox disappears when traversal server is used.
By misusing Config, this netplay-related code opened up a race condition between Config::OnConfigChanged() and SerialInterface::SerialInterfaceManager::UpdateDevices() that could cause iterator invalidation.