Sintendo
0637a7ec59
Jit64: divwx - Optimize power-of-two divisors
...
Power-of-two divisors can be done more elegantly, so handle them
separately.
- Division by 4
Before:
41 BD 04 00 00 00 mov r13d,4
41 8B C0 mov eax,r8d
45 85 ED test r13d,r13d
74 0D je overflow
3D 00 00 00 80 cmp eax,80000000h
75 0E jne normal_path
41 83 FD FF cmp r13d,0FFFFFFFFh
75 08 jne normal_path
overflow:
C1 F8 1F sar eax,1Fh
44 8B E8 mov r13d,eax
EB 07 jmp done
normal_path:
99 cdq
41 F7 FD idiv eax,r13d
44 8B E8 mov r13d,eax
done:
After:
45 85 C0 test r8d,r8d
45 8D 68 03 lea r13d,[r8+3]
45 0F 49 E8 cmovns r13d,r8d
41 C1 FD 02 sar r13d,2
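The "After" sequence above can be read as the following C++ sketch. The function name and the `shift` parameter are illustrative assumptions, not part of the commit; the sketch only mirrors the lea/cmovns/sar pattern, which biases negative dividends by (divisor - 1) so that the arithmetic shift rounds toward zero like a signed division does.

```cpp
#include <cstdint>

// Hypothetical helper: divides n by (1 << shift), for 1 <= shift <= 30.
// Assumes >> on a negative int32_t is an arithmetic shift (guaranteed since
// C++20, and the behaviour of mainstream compilers before that).
int32_t DivideByPowerOfTwo(int32_t n, int shift)
{
  // lea r13d,[r8+3] computes n + (divisor - 1); for divisor 4 the bias is 3.
  const int32_t bias = (int32_t{1} << shift) - 1;
  // cmovns keeps the unbiased value when n is non-negative.
  const int32_t adjusted = (n < 0) ? n + bias : n;
  // sar r13d,2
  return adjusted >> shift;
}
```

Without the bias, -7 >> 2 would give -2 (rounding toward negative infinity), whereas a signed division of -7 by 4 gives -1.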
2021-03-07 18:29:12 +01:00
Sintendo
530475dce8
Jit64: divwx - Micro-optimize certain divisors
...
When the multiplier is positive (which is the most common case), we can
generate slightly better code.
- Division by 30307
Before:
49 63 C5 movsxd rax,r13d
48 69 C0 65 6B 32 45 imul rax,rax,45326B65h
4C 8B C0 mov r8,rax
48 C1 E8 3F shr rax,3Fh
49 C1 F8 2D sar r8,2Dh
44 03 C0 add r8d,eax
After:
49 63 C5 movsxd rax,r13d
4C 69 C0 65 6B 32 45 imul r8,rax,45326B65h
C1 E8 1F shr eax,1Fh
49 C1 F8 2D sar r8,2Dh
44 03 C0 add r8d,eax
2021-03-07 18:29:12 +01:00
Sintendo
95698c5ae1
Jit64: divwx - Optimize constant divisor
...
Optimize division by a constant into multiplication. This method is also
used by GCC and LLVM.
We also add optimized paths for divisors 0, 1, and -1, because they
don't work using this method. They don't occur very often, but are
necessary for correctness.
- Division by 1
Before:
41 BF 01 00 00 00 mov r15d,1
41 8B C5 mov eax,r13d
45 85 FF test r15d,r15d
74 0D je overflow
3D 00 00 00 80 cmp eax,80000000h
75 0E jne normal_path
41 83 FF FF cmp r15d,0FFFFFFFFh
75 08 jne normal_path
overflow:
C1 F8 1F sar eax,1Fh
44 8B F8 mov r15d,eax
EB 07 jmp done
normal_path:
99 cdq
41 F7 FF idiv eax,r15d
44 8B F8 mov r15d,eax
done:
After:
45 8B FD mov r15d,r13d
- Division by 30307
Before:
41 BA 63 76 00 00 mov r10d,7663h
41 8B C5 mov eax,r13d
45 85 D2 test r10d,r10d
74 0D je overflow
3D 00 00 00 80 cmp eax,80000000h
75 0E jne normal_path
41 83 FA FF cmp r10d,0FFFFFFFFh
75 08 jne normal_path
overflow:
C1 F8 1F sar eax,1Fh
44 8B C0 mov r8d,eax
EB 07 jmp done
normal_path:
99 cdq
41 F7 FA idiv eax,r10d
44 8B C0 mov r8d,eax
done:
After:
49 63 C5 movsxd rax,r13d
48 69 C0 65 6B 32 45 imul rax,rax,45326B65h
4C 8B C0 mov r8,rax
48 C1 E8 3F shr rax,3Fh
49 C1 F8 2D sar r8,2Dh
44 03 C0 add r8d,eax
- Division by 30323
Before:
41 BA 73 76 00 00 mov r10d,7673h
41 8B C5 mov eax,r13d
45 85 D2 test r10d,r10d
74 0D je overflow
3D 00 00 00 80 cmp eax,80000000h
75 0E jne normal_path
41 83 FA FF cmp r10d,0FFFFFFFFh
75 08 jne normal_path
overflow:
C1 F8 1F sar eax,1Fh
44 8B C0 mov r8d,eax
EB 07 jmp done
normal_path:
99 cdq
41 F7 FA idiv eax,r10d
44 8B C0 mov r8d,eax
done:
After:
49 63 C5 movsxd rax,r13d
4C 69 C0 19 25 52 8A imul r8,rax,0FFFFFFFF8A522519h
49 C1 E8 20 shr r8,20h
44 03 C0 add r8d,eax
C1 E8 1F shr eax,1Fh
41 C1 F8 0E sar r8d,0Eh
44 03 C0 add r8d,eax
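As a rough illustration of what the two "After" sequences compute, here is a minimal C++ sketch for positive divisors. The function name and signature are assumptions made for this example; the multiplier and shift values are taken from the listings above (0x45326B65 with a total shift of 0x2D = 32 + 13 for 30307, and 0x8A522519 with a shift of 32 + 14 for 30323).

```cpp
#include <cstdint>

// Hypothetical helper, not Dolphin's API: divides n by a positive constant
// that is described by a 32-bit magic multiplier and a post-shift.
// Assumes >> on negative values is an arithmetic shift (guaranteed since C++20).
int32_t DivideByMagic(int32_t n, int32_t multiplier, int shift)
{
  // movsxd + imul: full 64-bit signed product.
  const int64_t product = static_cast<int64_t>(n) * multiplier;
  // Upper 32 bits of the product.
  int64_t high = product >> 32;
  // A negative multiplier stands for the unsigned value (multiplier + 2^32),
  // so the dividend is added back once (the first "add r8d,eax" for 30323).
  if (multiplier < 0)
    high += n;
  int64_t quotient = high >> shift;
  // The shift rounds toward negative infinity; adding the dividend's sign bit
  // makes the result round toward zero, as a signed division does.
  quotient += (n < 0) ? 1 : 0;
  return static_cast<int32_t>(quotient);
}
```

For example, DivideByMagic(n, 0x45326B65, 13) is meant to agree with n / 30307 for every 32-bit n.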
2021-03-07 18:29:01 +01:00
Sintendo
5bb8798df6
JitCommon: Signed 32-bit division magic constants
...
Add a function to calculate the magic constants required to optimize
signed 32-bit division.
Since this optimization is not exclusive to any particular architecture,
JitCommon seemed like a good place to put this.
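For readers unfamiliar with how such constants are derived, the following is a minimal sketch of one common derivation (the one described in Hacker's Delight), restricted to divisors from 2 to INT32_MAX. The struct and function names are assumptions for this example and are not claimed to match the function added in this commit.

```cpp
#include <cstdint>

// Hypothetical result type; names are illustrative only.
struct SignedMagic
{
  int32_t multiplier;
  int shift;  // shift applied after taking the upper 32 bits of the product
};

// Sketch for divisors in [2, INT32_MAX] only; 0, 1 and negative divisors
// are deliberately not handled here.
SignedMagic ComputeSignedMagic(int32_t divisor)
{
  const uint64_t d = static_cast<uint64_t>(divisor);
  const uint64_t two31 = uint64_t{1} << 31;
  // Largest value below 2^31 that is congruent to d - 1 modulo d.
  const uint64_t nc = two31 - 1 - (two31 % d);

  for (int p = 32; p < 63; ++p)
  {
    const uint64_t pow = uint64_t{1} << p;
    // Distance from 2^p up to the next multiple of d strictly above it.
    const uint64_t gap = d - (pow % d);
    // The smallest p satisfying this bound gives an exact quotient for
    // every 32-bit dividend.
    if (pow > nc * gap)
    {
      const uint64_t m = (pow + gap) / d;
      // m can exceed INT32_MAX; it is then stored as a negative int32_t.
      return {static_cast<int32_t>(static_cast<uint32_t>(m)), p - 32};
    }
  }
  return {0, 0};  // not reached for divisors in the supported range
}
```

With this sketch, a divisor of 30307 yields multiplier 0x45326B65 and shift 13, and 30323 yields 0x8A522519 (negative as a signed value) and shift 14, which are the constants that appear in the disassembly of the commit above.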
2021-03-07 18:27:36 +01:00
Sintendo
c9adc60d73
Jit64: divwx - Special case dividend == 0
...
Zero divided by any number is still zero. For whatever reason, this case
shows up frequently too.
Before:
B8 00 00 00 00 mov eax,0
85 F6 test esi,esi
74 0C je overflow
3D 00 00 00 80 cmp eax,80000000h
75 0C jne normal_path
83 FE FF cmp esi,0FFFFFFFFh
75 07 jne normal_path
overflow:
C1 F8 1F sar eax,1Fh
8B F8 mov edi,eax
EB 05 jmp done
normal_path:
99 cdq
F7 FE idiv eax,esi
8B F8 mov edi,eax
done:
After:
Nothing!
2021-03-07 18:27:30 +01:00
Sintendo
c081e3f2b3
Jit64: divwx - Optimize constant dividend
...
When the dividend is known at compile time, we can eliminate some of the
branching and precompute the result for the overflow case.
Before:
B8 54 D3 E6 02 mov eax,2E6D354h
85 FF test edi,edi
74 0C je overflow
3D 00 00 00 80 cmp eax,80000000h
75 0C jne normal_path
83 FF FF cmp edi,0FFFFFFFFh
75 07 jne normal_path
overflow:
C1 F8 1F sar eax,1Fh
8B F8 mov edi,eax
EB 05 jmp done
normal_path:
99 cdq
F7 FF idiv eax,edi
8B F8 mov edi,eax
done:
After:
85 FF test edi,edi
75 04 jne normal_path
33 FF xor edi,edi
EB 0A jmp done
normal_path:
B8 54 D3 E6 02 mov eax,2E6D354h
99 cdq
F7 FF idiv eax,edi
8B F8 mov edi,eax
done:
Fairly common with constant dividend of zero. Non-zero values occur
frequently in Ocarina of Time Master Quest.
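The branching in the listings above can be summarised with a small C++ sketch. Both function names are assumptions for this example; the overflow result (dividend shifted right arithmetically by 31, so 0 or -1) is read directly off the "sar eax,1Fh" in the generic path.

```cpp
#include <cstdint>

// Generic path: mirrors the "Before" listing.
int32_t EmulateDivwx(int32_t dividend, int32_t divisor)
{
  // Division by zero and INT32_MIN / -1 take the overflow path.
  if (divisor == 0 || (dividend == INT32_MIN && divisor == -1))
    return dividend < 0 ? -1 : 0;  // same value as sar eax,31
  return dividend / divisor;
}

// Constant dividend: the overflow result is known ahead of time, and the
// INT32_MIN / -1 check disappears unless the dividend really is INT32_MIN.
template <int32_t Dividend>
int32_t DivwxConstantDividend(int32_t divisor)
{
  constexpr int32_t overflow_result = Dividend < 0 ? -1 : 0;
  if (divisor == 0)
    return overflow_result;
  if (Dividend == INT32_MIN && divisor == -1)  // folded away for other dividends
    return overflow_result;
  return Dividend / divisor;
}
```

For the dividend 0x2E6D354 used in the listing, the overflow result is 0, which is why the "After" code only needs "xor edi,edi" on the divisor-is-zero path.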
2021-03-07 18:25:08 +01:00
JosJuice
14bfc0be78
DiscIO: Fix reading certain WIA chunks with many exceptions
...
The loop in WIARVZFileReader::Chunk::Read could terminate
prematurely if the size argument was smaller than the size
of an exception list which had only been partially loaded.
2021-03-07 14:14:45 +01:00
JosJuice
96ebf01ea8
VolumeVerifier: Fix potential crash when cancelling
...
The async operations may contain references to class members, so
any running async operations must end before destroying the class.
2021-03-07 13:56:06 +01:00
Léo Lam
61198541a0
Merge pull request #9562 from sepalani/dis-icons
...
Breakpoints: Change icon when disabled
2021-03-07 12:14:12 +01:00
Pokechu22
df81210e96
Use formatters in GetBPRegInfo; add missing commands
...
BPMEM_TEV_COLOR_ENV + 6 (0xC6) was missing due to a typo. BPMEM_BP_MASK (0xFE) does not lend itself well to documentation with the current FIFO analyzer implementation (since it requires remembering the values in BP memory) but still shouldn't be treated as unknown. BPMEM_TX_SETMODE0_4 and BPMEM_TX_SETMODE1_4 (0xA4-0xAB) were missing entirely.
2021-03-06 19:27:20 -08:00
Pokechu22
70f9fc4e75
Convert BPMemory to BitField and enum class
...
Additional changes:
- For TevStageCombiner's ColorCombiner and AlphaCombiner, op/comparison and scale/compare_mode have been split as there are different meanings and enums if bias is set to compare. (Shift has also been renamed to scale)
- In TexMode0, min_filter has been split into min_mip and min_filter.
- In TexImage1, image_type is now cache_manually_managed.
- The unused bit in GenMode is now exposed.
- LPSize's lineaspect is now named adjust_for_aspect_ratio.
2021-03-06 19:27:19 -08:00
Pokechu22
db8ced7e4e
Add FogParam0::FloatValue and FogParam3::FloatValue
...
This value will be used in the register description, so expose it in a way that can be reused instead of calculating it in two places later.
2021-03-06 19:27:18 -08:00
Pokechu22
f2bea67709
Fix typo with ztex2 op in UseVertexDepthRange
2021-03-06 19:27:17 -08:00
Pokechu22
762fe33a3d
Rename BPMEM_EFB_BR to BPMEM_EFB_WH
2021-03-06 19:27:16 -08:00
Pokechu22
81b84a5ebe
Use XFMEM_REGISTERS_START/END in XFRegWritten and LoadXFReg
2021-03-06 19:27:15 -08:00
Pokechu22
8c80369373
Add names and descriptions for regular XF memory
2021-03-06 19:27:15 -08:00
Pokechu22
2d6ec7457d
Add names and descriptions for XF registers to the FIFO analyzer
2021-03-06 19:27:14 -08:00
Pokechu22
aab81d5aa0
Convert XFMemory to BitField and enum class
...
Additionally a new ClipDisable union has been added (though it is not currently used by Dolphin).
2021-03-06 19:27:14 -08:00
Pokechu22
953e09428f
Add names and descriptions for CP registers to the FIFO analyzer
2021-03-06 19:27:14 -08:00
Pokechu22
f749fcfa9f
Convert CPMemory to BitField and enum class
...
Additionally, VCacheEnhance has been added to UVAT_group1. According to YAGCD, this field is always 1.
TVtxDesc also now has separate low and high fields whose hex values correspond with the proper registers, instead of having one 33-bit value. This change was made in a way that should be backwards-compatible.
2021-03-06 19:27:08 -08:00
Pokechu22
c27efb3f1f
Create constants for CP registers and masks
2021-03-06 17:34:05 -08:00
Pokechu22
d702f3b4ad
DolphinNoGUI/PlatformX11: Work around X.h's None being undefined
2021-03-06 17:34:04 -08:00
Pokechu22
f697e17dd1
Create BitFieldArray
2021-03-06 17:34:03 -08:00
Pokechu22
1273c5e395
Add fmt support to BitField
2021-03-06 14:58:32 -08:00
Pokechu22
cf95deaf6d
Allow specifying StorageType for BitField
...
This is useful for BitFields that are bools.
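To illustrate why a separate storage type helps for bools, here is a deliberately simplified sketch. It is not Dolphin's BitField class: the name BitFieldSketch, the member layout, and the Value() accessor are assumptions for this example, and it assumes bits is smaller than the width of StorageType. The point is that a one-bit bool field inside a 32-bit register still needs to read all 32 bits, which a bool-sized storage could not hold.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified bit field: T is the type handed to the caller,
// StorageType is the integer type that actually holds the register bits.
template <std::size_t position, std::size_t bits, typename T, typename StorageType = T>
struct BitFieldSketch
{
  StorageType storage;

  T Value() const
  {
    // Extract `bits` bits starting at `position` and convert them to T.
    const StorageType mask = (StorageType{1} << bits) - 1;
    return static_cast<T>((storage >> position) & mask);
  }
};
```

For example, BitFieldSketch<3, 1, bool, uint32_t> reads bit 3 of a 32-bit value and returns it as a bool, while BitFieldSketch<4, 4, uint32_t> keeps the default where both types are the same.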
2021-03-06 14:57:44 -08:00
Pokechu22
6653bd7199
Create EnumFormatter
2021-03-06 14:57:42 -08:00
iwubcode
dbb0b72cc5
InputCommon: instead of blocking on individual DSU server sockets, block on a selector built up from all server sockets
2021-03-05 12:05:38 -06:00
Sintendo
2454bd5ba6
Jit64: Add optional argument to GenerateOverflow
...
This allows setting the overflow flag based on any condition code.
Defaults to NO (no overflow).
2021-03-05 17:14:45 +01:00
Léo Lam
5f7d935b0a
Merge pull request #9533 from sepalani/mmu-is-ram
...
MMU: Fix IsRAMAddress not working
2021-03-05 11:49:55 +01:00
JMC47
fc86e554e0
Merge pull request #9559 from iwubcode/gdb-stub-raii
...
Common / Core: add raii object that cleans up WSA on destruction in gdb-stub
2021-03-05 05:28:31 -05:00
Léo Lam
adcdeda372
Merge pull request #9565 from sepalani/qt-blocker
...
BreakpointWidget: Use QSignalBlocker
2021-03-05 10:44:44 +01:00
Léo Lam
a4de2502c5
Merge pull request #9550 from endrift/gba-flush
...
SI/DeviceGBA: Ensure data socket isn't backed up
2021-03-05 10:38:55 +01:00
Sepalani
1e6dfc6b91
BreakpointWidget: Use QSignalBlocker
2021-03-05 13:35:33 +04:00
Sepalani
fd7eeb7221
BreakpointWidget: Fix delete deleting both MBP and BP at address
2021-03-05 13:01:32 +04:00
Sepalani
359a539f25
Breakpoints: Change icon when disabled
2021-03-05 11:21:37 +04:00
Léo Lam
1e3e5680db
Merge pull request #9561 from sepalani/fix-watches
...
Watches: Fix Save and Load from strings
2021-03-05 00:57:40 +01:00
iwubcode
7d5052896d
IOS: update network/ip/top to use the RAII winsock context
2021-03-04 13:55:20 -06:00
iwubcode
e4f74bea42
Core: Use RAII winsock object to cleanly create and destroy WSA in gdb-stub
2021-03-04 13:47:32 -06:00
iwubcode
00bc7e6b38
Common: Add RAII object that initializes and cleans up winsock
2021-03-04 13:44:12 -06:00
Sepalani
ef977123d5
BreakpointWidget: Emit BreakpointsChanged to update views
2021-03-04 21:10:37 +04:00
Sepalani
6786340a7c
Watches: Fix Save and Load from strings
2021-03-04 17:55:52 +04:00
Léo Lam
be500a98e2
Merge pull request #8779 from sepalani/open-dump
...
NetworkWidget: Reorganise SSL options group box
2021-03-04 13:37:10 +01:00
Léo Lam
511e9dcd2f
Merge pull request #9542 from InusualZ/toggle-bp
...
BreakpointWidget: Allow breakpoints to be toggled between enable/disable
2021-03-04 12:34:03 +01:00
Léo Lam
48a5846aee
Merge pull request #9548 from AdmiralCurtiss/fastmem-active-regions
...
Core/Memmap: Memory mapping logic fixes.
2021-03-04 12:18:59 +01:00
Léo Lam
9c6c77351f
Merge pull request #9556 from JosJuice/cmake-msvc-latest
...
CMake: Build with -std:c++latest for MSVC
2021-03-04 12:12:06 +01:00
Léo Lam
00db622d50
Merge pull request #9560 from JosJuice/cmake-msvc-wil
...
CMake: Include WIL headers
2021-03-04 12:08:05 +01:00
JosJuice
2cb3f663bc
CMake: Include WIL headers
...
MSBuild does this, so CMake should too. Fixes a Windows build error.
2021-03-04 10:26:31 +01:00
JosJuice
0cb71d3f47
CMake: Disable warning C5054 on DolphinQt
...
Same as 33c0abd.
Also removing -D_SILENCE_CXX17_RESULT_OF_DEPRECATION_WARNING
to match MSBuild. Qt is no longer triggering that warning.
2021-03-04 09:29:30 +01:00
Dentomologist
6e13d35026
DolphinQt: Removed unused this capture in lambda
...
The Host constructor sets a callback to a lambda that in turn calls
Host_UpdateDisasmDialog. Since that function is not a member function,
capturing this is unnecessary.
Fixes -Wunused-lambda-capture warning on freebsd-x64.
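A minimal sketch of the pattern, using made-up names (HostSketch, UpdateDisasmDialogSketch, g_update_count) rather than the real Dolphin code: a lambda that only calls a free function does not touch any member, so an empty capture list is enough and the warning goes away.

```cpp
#include <functional>

int g_update_count = 0;  // hypothetical counter, only here to observe the call

// Stand-in for a free (non-member) function such as Host_UpdateDisasmDialog.
void UpdateDisasmDialogSketch()
{
  ++g_update_count;
}

class HostSketch
{
public:
  HostSketch()
  {
    // Writing [this] here would trigger -Wunused-lambda-capture, because the
    // body never uses a member of HostSketch.
    m_callback = [] { UpdateDisasmDialogSketch(); };
  }

  void Fire() const { m_callback(); }

private:
  std::function<void()> m_callback;
};
```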
2021-03-03 13:18:17 -08:00
JMC47
d2eb846e6a
Merge pull request #9549 from Dentomologist/ppcstate_off_to_s32
...
JitArm64: Fix unsigned/signed argument/parameter mismatch
2021-03-03 14:56:40 -05:00