Linear algebra using shows different results on different m

devc1
Member
Posts: 439
Joined: Fri Feb 11, 2022 4:55 am
Location: behind the keyboard

Post by devc1 »

My simple cubic Bézier drawing function shows different results on different machines.
For example:
- On my SSE Core 2 Duo laptop, QEMU, and VirtualBox (AVX), it shows the right result.
- On VMware (AVX) and my main computer (AVX), it shows a straight line at the top of the screen.

Is there some SSE or AVX initialization that I am missing, or is something wrong with my code?
- At startup I set MXCSR to 0x1F80 (the default value). On an AVX machine, do I need to use the AVX instruction (vldmxcsr) instead of ldmxcsr?
This is the SSE version; I wrote this function myself:

Code: Select all

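_SSE_ComputeBezier: 
 ; In:  rcx = beta, rdx = NumCordinates (declared as a 32-bit UINT), xmm2 = percent
 ; Out: rax = beta[0] after the last pass, converted to an integer by cvtss2si
 ; for(register UINT k = 1;k < NumCordinates;k++) 
     mov r8, 1 
     mov r9, rdx 
     ; xmm1[0] = 1.0 - percent
     cvtsi2ss xmm1, r8 
     subss xmm1, xmm2 
     ; copy the low float of xmm1 into all four lanes
     movd eax, xmm1
     mov r11, rax 
     shl r11, 32 
     or rax, r11 
     movq xmm1, rax 
  
     movlhps xmm1, xmm1 
  
     ; copy percent (low float of xmm2) into all four lanes
     movd eax, xmm2 
     mov r11, rax 
     shl r11, 32 
     or rax, r11 
     movq xmm2, rax 
     movlhps xmm2, xmm2 
  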
 .loop0: 
     cmp r8, rdx 
     je .Exit 
     ; for(register UINT i = 0;i<NumCordinates - k;i++) 
     inc r8 
     dec r9 
     xor r11, r11 
     mov r10, rcx 
     .loop1: 
         cmp r11, r9 
         jae .loop0 
         ; beta[i]:XMM0 = (1 - percent) * beta[i] + percent * beta[i + 1]; 
  
         ; XMM0 = (1 - percent) * beta[i] 
         ; XMM0 = XMM1 * XMM3 
         movaps xmm0, xmm1 
         movups xmm3, [r10] 
         mulps xmm0, xmm3 
         ; XMM3 = percent * beta[i + 1] 
         ; XMM3 = XMM2 * XMM4 
         movaps xmm3, xmm2 
         movups xmm4, [r10 + 4] 
         mulps xmm3, xmm4 
         ; XMM0 = XMM0 + XMM3 
         addps xmm0, xmm3 
         movups [r10], xmm0 
         add r10, 0x10 
         add r11, 4 
         jmp .loop1 
 .Exit: 
     movups xmm0, [rcx] 
     cvtss2si rax, xmm0 
     ret
Correct result on QEMU (SSE), VirtualBox (AVX), and the SSE laptop:
[Image]

Wrong result on VMware (AVX) and the AVX host computer:
[Image]
nullplan
Member
Posts: 1790
Joined: Wed Aug 30, 2017 8:24 am

Post by nullplan »

What steps have you taken to debug the issue? This is calling out for some exploratory printing at different steps. That would also immediately tip you off if your SSE initialization is inadequate. However, I do think you only need to initialize the AVX unit if you enable it.
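
One way to do that kind of exploratory printing is to compare every pass of the SSE routine against a plain scalar version of the same recurrence. Below is a minimal sketch in C, assuming a hosted environment with printf; inside the kernel the printing would have to go through its own debug output instead. The names ref_bezier and REF_MAX_POINTS and the trace flag are made up for illustration and are not part of the original code.

```c
#include <stdio.h>
#include <string.h>

#define REF_MAX_POINTS 16 /* same size as the 0x10-float betabuffer in the thread */

/*
 * Hypothetical scalar reference for the recurrence in the assembly comments:
 *     beta[i] = (1 - percent) * beta[i] + percent * beta[i + 1]
 * It works on a private copy of the coordinates, one element at a time.
 * If trace is non-zero, beta[] is printed after every pass k.
 * Returns the unrounded curve value; 0 for an unsupported point count.
 */
static float ref_bezier(const float *coords, unsigned n, float t, int trace)
{
    float beta[REF_MAX_POINTS];

    if (n == 0 || n > REF_MAX_POINTS)
        return 0.0f;

    memcpy(beta, coords, n * sizeof beta[0]);

    for (unsigned k = 1; k < n; k++) {
        for (unsigned i = 0; i < n - k; i++)
            beta[i] = (1.0f - t) * beta[i] + t * beta[i + 1];

        if (trace) {
            printf("k=%u:", k);
            for (unsigned i = 0; i < n - k; i++)
                printf(" %g", beta[i]);
            printf("\n");
        }
    }
    return beta[0];
}
```

For example, ref_bezier(YCords, 4, 0.25f, 1) prints the three intermediate passes and returns 14.0625. Note that this reference returns the unrounded float, whereas _SSE_ComputeBezier ends with cvtss2si, which rounds according to the rounding mode in MXCSR, so compare the values before that final conversion or round the reference the same way.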
devc1
Member
Posts: 439
Joined: Fri Feb 11, 2022 4:55 am
Location: behind the keyboard

Post by devc1 »

I discovered that running QEMU with UEFI at a 1920p resolution shows the same problem, and QEMU has no AVX. How do you initialize SSE?
Debugging: I have not taken any steps; I just tried the OS on different machines.
Octocontrabass
Member
Posts: 5563
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

You should do some more debugging. What values are you passing to this function?
devc1
Member
Posts: 439
Joined: Fri Feb 11, 2022 4:55 am
Location: behind the keyboard

Post by devc1 »

This is the function declaration:

Code: Select all

extern UINT64 __fastcall _SSE_ComputeBezier(float* beta, UINT NumCordinates, float percent);
This is how it is currently called:

Code: Select all

	UINT XOff = 200;
	UINT YOff = 300;
	float XCords[] = {0, 50, 100, 150};
	float YCords[] = {0, 50, -50, 0};
	float betabuffer[0x10] = {0};
	float IncValue = 0.1;
	float X0 = GetBezierPoint(XCords, betabuffer, 4, 0.1);
	float X1 = GetBezierPoint(XCords, betabuffer, 4, 0.2);
	float Y0 = GetBezierPoint(YCords, betabuffer, 4, 0.1);
	float Y1 = GetBezierPoint(YCords, betabuffer, 4, 0.2);
	double Distance = __sqrt(pow(X1 - X0, 2) + pow(Y1 - Y0, 2));
	if(Distance > 2) {
		IncValue /= (Distance - 1);
	}

	// IncValue /= 10;

	SystemDebugPrint(L"Starting...");
	for(UINT c = 0;c<0x10000;c++) {
		UINT64 LastX = XOff;
		UINT64 LastY = YOff;

		UINT64 X = 0;
		UINT64 Y = 0;

		for(float t = 0;t<=1;t+=IncValue) {
			X = XOff + GetBezierPoint(XCords, betabuffer, 4, t);
			Y = YOff + GetBezierPoint(YCords, betabuffer, 4, t);
			// if(X > LastX + 1 || X < LastX - 1 || Y > LastY + 1 || Y < LastY - 1) {
			// 	LineTo(LastX, LastY, X, Y, 0xFF0000);
			// } else {
				*(UINT32*)(InitData.fb->FrameBufferBase + ((UINT64)X << 2) + ((UINT64)Y * InitData.fb->Pitch)) = 0xFF0000;
			// }
			LastX = X;
			LastY = Y;
		}
	}
	SystemDebugPrint(L"All done");
GetBezierPoint():

Code: Select all

UINT64 __fastcall GetBezierPoint(float* cordinates, float* beta, UINT8 NumCordinates, float percent){
	memcpy(beta, cordinates, NumCordinates << 2);
	if(ExtensionLevel == EXTENSION_LEVEL_SSE) {
		return _SSE_ComputeBezier(beta, NumCordinates, percent);
	} else if(ExtensionLevel == EXTENSION_LEVEL_AVX) {
		return _AVX_ComputeBezier(beta, NumCordinates, percent);
	}
	return 0;
	// _SSE_BezierCopyCords(beta, cordinates, NumCordinates);
	// return _SSE_ComputeBezier(beta, NumCordinates, percent);
}

I do not initialize MXCSR; I just reset it to 0x1F80. I don't know anything more about it.
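
For reference, the value 0x1F80 can be taken apart with a few masks. The sketch below is plain C that only decodes a number; it does not read or write the real register. The bit positions follow the MXCSR layout in the Intel manuals, while the struct and function names are hypothetical and exist only for this example.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the decoded MXCSR fields. */
struct mxcsr_fields {
    unsigned flags; /* bits 0-5: sticky exception flags (IE DE ZE OE UE PE) */
    unsigned daz;   /* bit 6: denormals-are-zero */
    unsigned masks; /* bits 7-12: exception masks (IM DM ZM OM UM PM) */
    unsigned rc;    /* bits 13-14: rounding control */
    unsigned ftz;   /* bit 15: flush-to-zero */
};

/* Split a raw MXCSR value into its fields. */
static struct mxcsr_fields decode_mxcsr(uint32_t v)
{
    struct mxcsr_fields f;
    f.flags = v & 0x3Fu;
    f.daz   = (v >> 6) & 1u;
    f.masks = (v >> 7) & 0x3Fu;
    f.rc    = (v >> 13) & 3u;
    f.ftz   = (v >> 15) & 1u;
    return f;
}

/* Rounding-control encodings: 0 nearest even, 1 down, 2 up, 3 toward zero. */
static const char *rc_name(unsigned rc)
{
    static const char *const names[] = {
        "nearest even", "toward -inf", "toward +inf", "toward zero"
    };
    return names[rc & 3u];
}

/* Print one line describing a raw MXCSR value. */
static void print_mxcsr(uint32_t v)
{
    struct mxcsr_fields f = decode_mxcsr(v);
    printf("MXCSR=0x%04X flags=0x%02X daz=%u masks=0x%02X rc=%s ftz=%u\n",
           (unsigned)v, f.flags, f.daz, f.masks, rc_name(f.rc), f.ftz);
}
```

Decoding 0x1F80 this way gives all six exception masks set, round to nearest even, and both DAZ and FTZ off. There is only one MXCSR register; vldmxcsr is the VEX-encoded form of ldmxcsr and loads that same register.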