Code:
dt 0.0
~ wrote:
To define an 80-bit value we can do:
Code:
dt 0.0
How to do the same if I want to define an 80-bit bitwise variable?

There are no bit/integer operations on 80-bit values in x86. You'll need to compose, e.g., an 80-bit XOR from a 64-bit XOR and a 16-bit XOR.
Code:
dq bits63_0
dw bits79_64
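The 64-bit plus 16-bit composition mentioned above could be sketched roughly as follows. This is only a minimal illustration, assuming NASM in 64-bit mode; the labels `value_a`, `value_b`, and `xor80` are hypothetical and do not come from the thread.

```nasm
; Sketch: 80-bit XOR built from a 64-bit XOR and a 16-bit XOR.
; Assumption: 64-bit mode. All labels here are hypothetical.
bits 64
default rel

section .data
value_a: dq 0x0123456789ABCDEF    ; bits 0-63
         dw 0x00FF                ; bits 64-79
value_b: dq 0xFFFFFFFF00000000    ; bits 0-63
         dw 0x0F0F                ; bits 64-79

section .text
global xor80
xor80:                            ; value_a ^= value_b
    mov rax, [value_b]            ; low 64 bits of the second operand
    xor [value_a], rax            ; 64-bit XOR on bits 0-63
    mov ax, [value_b + 8]         ; high 16 bits of the second operand
    xor [value_a + 8], ax         ; 16-bit XOR on bits 64-79
    ret
```

With these sample values, `value_a` ends up holding 0xFEDCBA9889ABCDEF in bits 0-63 and 0x0FF0 in bits 64-79. XOR needs no carry between the two parts; an 80-bit ADD would need ADC for the upper word.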
Code:
dq 0000000000000000000000000000000000000000000000000000000000000000b ;0-63
dw 0000000000000000b ;64-79
Code:
dd 00000000000000000000000000000000b ;0-31
dd 00000000000000000000000000000000b ;32-63
dw 0000000000000000b ;64-79
iansjack wrote:
I think you mean
resb 10
You're right! For some reason I read 80 bytes.
Solar wrote:
Smelling an X-Y problem here... what are you wanting an 80-bit variable for?
Are you, by any chance, trying to store an x86 extended-precision floating-point value?
I want to bulk-move 80 bytes of binary data with the FPU (using its eight 10-byte registers).
Code:
;It can run 0xFFFFFFFF iterations in
;around 1:09.05 minutes on a dual-core CPU
;at 3.4 GHz (less than 1:12 minutes).
;;
;It could be extended with all GPRs
;to move through the destination pointer
;every 10 bytes (70 bytes for EAX,
;EBX, ECX, EDX, ESI, EDI, EBP and maybe
;ESP with which we address for the
;80-byte FPU stack)
;instead of subtracting a single one, or a
;2-pointer struct for the initial source and
;destination values for each general-purpose
;register.
;
;Param 1 -- Source register value
;Param 2 -- Destination register value
;;
%macro MICRO_LONG__move_80_bytes_with_x87_FPU 2
;Load the whole FPU stack:
;;
add %1,10*7 ;Move to the last TWORD of the 8
fld tword[%1]
sub %1,10
fld tword[%1]
sub %1,10
fld tword[%1]
sub %1,10
fld tword[%1]
sub %1,10
fld tword[%1]
sub %1,10
fld tword[%1]
sub %1,10
fld tword[%1]
sub %1,10
fld tword[%1]
;Copy 10 bytes at a time:
;;
fstp tword[%2]
add %2,10
fstp tword[%2]
add %2,10
fstp tword[%2]
add %2,10
fstp tword[%2]
add %2,10
fstp tword[%2]
add %2,10
fstp tword[%2]
add %2,10
fstp tword[%2]
add %2,10
fstp tword[%2]
%endmacro
;;INIT: Initialize the FPU for this program
;;INIT: Initialize the FPU for this program
;;INIT: Initialize the FPU for this program
;;INIT: Initialize the FPU for this program
;Initialize the FPU to its default state and configuration
;after checking for pending unmasked floating-point exceptions:
;;
finit
;Wait for any pending FPU operations
;or exceptions and clear any
;pending exceptions:
;;
fclex
;Load new x87 Control Word
;into FPU control register:
;;
fldcw [x87_FPU_New_Control_Word]
;;END: Initialize the FPU for this program
;;END: Initialize the FPU for this program
;;END: Initialize the FPU for this program
;;END: Initialize the FPU for this program
align 4,db 0
x87_FPU_New_Control_Word equ $+ImageBase-__data_RVA_Localize__
dw 0001111110111111b
;Bit layout, from bit 0 (rightmost digit) upward:
;
; Bit 0 - IM - Invalid Operation Interrupt Mask (exception):
;   0: Generate INT/IRQ (disable handling at the FPU).
;   1: Do not generate INT/IRQ (selected at initialization, enable handling at the FPU).
;
; Bit 1 - DM - Denormalized Interrupt Mask (exception):
;   0: Generate INT/IRQ (disable handling at the FPU).
;   1: Do not generate INT/IRQ (selected at initialization, enable handling at the FPU).
;
; Bit 2 - ZM - Zero Divide Interrupt Mask (exception):
;   0: Generate INT/IRQ (disable handling at the FPU).
;   1: Do not generate INT/IRQ (selected at initialization, enable handling at the FPU).
;
; Bit 3 - OM - Overflow Interrupt Mask (exception):
;   0: Generate INT/IRQ (disable handling at the FPU).
;   1: Do not generate INT/IRQ (selected at initialization, enable handling at the FPU).
;
; Bit 4 - UM - Underflow Interrupt Mask (exception):
;   0: Generate INT/IRQ (disable handling at the FPU).
;   1: Do not generate INT/IRQ (selected at initialization, enable handling at the FPU).
;
; Bit 5 - PM - Precision Interrupt Mask (exception):
;   0: Generate INT/IRQ (disable handling at the FPU).
;   1: Do not generate INT/IRQ (selected at initialization, enable handling at the FPU).
;
; Bit 7 - IEM - Interrupt Enable Mask (global for INT bits 0-5):
;   0: Enable interrupts.
;   1: Disable interrupts (selected at initialization).
;
; Bits 8-9 - PC - Precision Control:
;   00b: 24 bits (REAL4).
;   01b: Unused.
;   10b: 53 bits (REAL8).
;   11b: 64 bits (REAL10, selected at initialization).
;
; Bits 10-11 - RC - Rounding Control:
;   00b: Round to nearest, or to even if equidistant (selected at initialization).
;   01b: Round down (towards -Infinity).
;   10b: Round up (towards +Infinity).
;   11b: Truncate (towards 0).
;
; Bit 12 - IC - Infinity Control (more modern CPUs always use -Infinity and +Infinity):
;   0: Use unsigned Infinity (selected at initialization).
;   1: Respect -Infinity and +Infinity.
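As a cross-check of the bit layout documented above, the same control word could be assembled from named fields instead of a binary literal. This is just a sketch; the `CW_*` constant names and the `control_word` label are hypothetical and not part of the original post.

```nasm
; Sketch: the control word above, built from named bit fields.
; Assumption: all CW_* names and the control_word label are hypothetical.
CW_EXC_MASKS  equ 111111b     ; bits 0-5: IM, DM, ZM, OM, UM, PM all set
CW_IEM        equ 1 << 7      ; bit 7: Interrupt Enable Mask
CW_PC_64      equ 11b << 8    ; bits 8-9: 64-bit precision (REAL10)
CW_RC_TRUNC   equ 11b << 10   ; bits 10-11: truncate (towards 0)
CW_IC         equ 1 << 12     ; bit 12: Infinity Control

section .data
align 4, db 0
control_word:
    dw CW_EXC_MASKS | CW_IEM | CW_PC_64 | CW_RC_TRUNC | CW_IC  ; 0x1FBF = 0001111110111111b
```

Written this way, it is easier to see that this particular value sets RC to 11b (truncate) rather than the 00b that the comments mark as the initialization default.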
Code:
;INIT: Main benchmark
;INIT: Main benchmark
;INIT: Main benchmark
;INIT: Main benchmark
mov widecx,0
mov widesi,source_10_byte_buffer_or_String_10_byte_end
mov widedi,dest_buffer
align wideword_sz
.tttt:
MICRO_LONG__move_80_bytes_with_x87_FPU widesi,widedi
dec widecx
jnz .tttt
;END: Main benchmark
;END: Main benchmark
;END: Main benchmark
;END: Main benchmark
iansjack wrote:
I suspect that you have discovered the slowest way of moving memory.
According to these tables, with a bit of optimization it's pretty close to the fastest way to move memory on an 8088. I'm not sure it's possible to beat REP MOVSW though.
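For comparison, an 80-byte copy with REP MOVSW could look like the sketch below. It assumes 16-bit real-mode code (in keeping with the 8088 discussion) with DS and ES already pointing at the right segments; `copy_80_bytes`, `src_buf`, and `dst_buf` are hypothetical names.

```nasm
; Sketch: copy 80 bytes with REP MOVSW.
; Assumptions: 16-bit code, DS and ES already set up,
; and all labels here are hypothetical.
bits 16

copy_80_bytes:
    cld                     ; DF = 0, so SI and DI advance after each word
    mov si, src_buf         ; DS:SI -> source
    mov di, dst_buf         ; ES:DI -> destination
    mov cx, 80 / 2          ; 40 words = 80 bytes
    rep movsw               ; copy CX words from DS:SI to ES:DI
    ret

src_buf: times 80 db 0xAA   ; sample source data
dst_buf: times 80 db 0      ; destination buffer
```

Unlike the FPU macro, this version needs no unrolling and leaves the x87 register stack untouched.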