Video Memory and Memcpy Optimizations

Posted: Sat Mar 14, 2009 1:57 am
by JohnnyTheDon
My memcpy implementation first does single-byte moves (movsb) until the remaining length is a multiple of 8, and then moves a qword at a time (rep movsq). This didn't seem to be a problem until I tried running my system under KVM virtualization. Now, whenever I use the normal memcpy on text-mode video memory (for scrolling the screen), only every 4th letter seems to change. I never have this problem on real hardware or under non-KVM emulation.

Is this just an artifact of KVM, or will some hardware have issues with qword moves in video memory? If it makes a difference, my computer has VMX (not AMD-V).

PS: This is on a 64-bit host, running 64-bit Arch Linux, and my OS is 64-bit as well.

Re: Video Memory and Memcpy Optimizations

Posted: Sun Mar 15, 2009 10:38 am
by 01000101
Maybe we could see your code for the memcpy and a sample usage (perhaps where you notice the issue)? I have a similar memcpy function that aligns on a 16-byte boundary and then uses SSE's 16-byte moves where possible, or qword moves on 64-bit systems. I haven't encountered any issues with it on real or emulated hardware.
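
For readers following along, the approach described in this post could look roughly like the C sketch below. It is not 01000101's actual code: the name `memcpy_align16` is hypothetical, it assumes the destination is the pointer being aligned, and it assumes SSE is already enabled in the kernel. When the compiler does not target SSE2, this sketch simply falls back to byte copies rather than qword moves.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics (x86 / x86-64 targets only) */
#endif

/* Hypothetical sketch: align the destination to 16 bytes, then copy
   16 bytes at a time. Not the poster's implementation. */
void *memcpy_align16(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Byte copies until the destination sits on a 16-byte boundary. */
    while (n != 0 && ((uintptr_t)d & 15) != 0) {
        *d++ = *s++;
        n--;
    }

#if defined(__SSE2__)
    /* 16 bytes per iteration: unaligned load, aligned store. */
    while (n >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_store_si128((__m128i *)d, v);
        d += 16;
        s += 16;
        n -= 16;
    }
#endif

    /* Trailing bytes (or everything, when SSE2 is unavailable). */
    while (n != 0) {
        *d++ = *s++;
        n--;
    }
    return dest;
}
```

Only the destination is aligned here, so the load has to be the unaligned variant; aligning the source instead would be an equally valid choice.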

Re: Video Memory and Memcpy Optimizations

Posted: Sun Mar 15, 2009 12:15 pm
by JohnnyTheDon
The memcpy:

Code:

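memcpy:
	mov rax, rdi		; return dest (assuming System V AMD64: rdi = dest, rsi = src, rdx = n)
	mov rcx, rdx		; rcx = bytes remaining
	cld			; copy forward
.test:
	test rcx, 0111b		; is the remaining length a multiple of 8?
	jz .q
	cmp rcx, 0		; never equal here, since the low bits are nonzero
	je .ret
	movsb			; copy a single byte
	dec rcx
	jmp .test
.q:
	shr rcx, 3		; bytes -> qwords
	rep movsq		; copy the rest 8 bytes at a time
.ret:
	ret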
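
For comparison, a rough C rendering of the routine above is sketched below. The name `memcpy_len_aligned` is hypothetical, and the fixed-size `memcpy` calls are only a portable stand-in for an 8-byte move (compilers typically lower them to a single load and store). Note that, like the assembly, it makes the remaining length a multiple of 8 and does nothing to align the pointers themselves.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C rendering of the assembly routine; illustrative only. */
void *memcpy_len_aligned(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Byte copies until the remaining length is a multiple of 8.
       The addresses are not aligned by this step. */
    while (n & 7) {
        *d++ = *s++;
        n--;
    }

    /* The remaining length is a multiple of 8: one qword per iteration. */
    for (; n != 0; n -= 8) {
        uint64_t q;
        memcpy(&q, s, sizeof q);  /* 8-byte load, any alignment  */
        memcpy(d, &q, sizeof q);  /* 8-byte store, any alignment */
        d += 8;
        s += 8;
    }
    return dest;
}
```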
The usage:

Code:

static void putc_text(char c)
{
	if(c == '\n') // If the character is a newline just move to the next line
	{
		console_text_x = 0;
		console_text_y++;
	} else {
		// Print the character to the screen
		console_text_mem[(console_text_y*CONSOLE_TEXT_WIDTH+console_text_x)*2] = c;
		if(++console_text_x >= CONSOLE_TEXT_WIDTH)
		{
			console_text_x = 0;
			console_text_y++;
		}
	}
	// Move the screen up if necessary
	if(console_text_y >= CONSOLE_TEXT_HEIGHT)
	{
		console_text_y = CONSOLE_TEXT_HEIGHT-1;

               
		memcpy(console_text_mem, console_text_mem+(CONSOLE_TEXT_WIDTH*2), CONSOLE_TEXT_WIDTH*(CONSOLE_TEXT_HEIGHT-1)*2);
		 // Epic Fail ^^^^^^^^^^^^


		// Clear the last line
		for(int i = CONSOLE_TEXT_WIDTH-1; i >= 0; i--)
		{
			console_text_mem[((CONSOLE_TEXT_WIDTH*(CONSOLE_TEXT_HEIGHT-1))+i)*2] = ' ';
		}
	}
}
My workaround was to create a new memcpy_noopt function that does a regular rep movsb move. Seems to work okay, and I don't stay in text mode long in any case.
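
A C analogue of that kind of workaround might look like the sketch below. The original `memcpy_noopt` is described as a plain rep movsb, so this is not the same code: the name `memcpy_bytewise` and its signature are assumptions, and `volatile` is used here only to keep the compiler from merging the byte accesses into wider moves.

```c
#include <stddef.h>

/* Hypothetical byte-at-a-time copy; every access is exactly one byte wide. */
void *memcpy_bytewise(volatile void *dest, const volatile void *src, size_t n)
{
    volatile unsigned char *d = dest;
    const volatile unsigned char *s = src;

    /* Forward copy, one byte per iteration. */
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];

    return (void *)dest;
}
```

In the scrolling code this would take the place of the memcpy call in putc_text. Because it copies forward and the destination lies below the source there, each source byte is read before it can be overwritten, so the overlap is harmless.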