Video Memory and Memcpy Optimizations
Posted: Sat Mar 14, 2009 1:57 am
by JohnnyTheDon
My memcpy implementation copies one byte at a time (movsb) until the remaining length is a multiple of 8, and then copies a qword at a time (rep movsq). This didn't seem to be a problem until I tried running my system under KVM virtualization. Now, whenever I do a memcpy on text-mode video memory (to scroll the screen) with the normal memcpy, it only seems to change every 4th letter. I never have this problem on real hardware or in non-KVM emulation.
Is this just an artifact of KVM, or will some hardware have issues with qword moves in video memory? If it makes a difference, my computer has VMX (not AMD-V).
PS: This is on a 64-bit host, running 64-bit Arch Linux, and my OS is 64-bit as well.
Re: Video Memory and Memcpy Optimizations
Posted: Sun Mar 15, 2009 10:38 am
by 01000101
Could we see your code for the memcpy and a sample usage (ideally where you notice the issue)? I have a similar memcpy function that aligns on a 16-byte boundary and then uses SSE's 16-byte moves where possible, or qword moves on 64-bit systems. I haven't encountered any issues with it on real or emulated hardware.
Re: Video Memory and Memcpy Optimizations
Posted: Sun Mar 15, 2009 12:15 pm
by JohnnyTheDon
The memcpy:
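memcpy:
    mov rax, rdi        ; return value: dest
    mov rcx, rdx        ; rcx = byte count
    cld                 ; copy forward
.test:
    test rcx, 0111b     ; is the remaining length a multiple of 8?
    jz .q               ; yes: switch to qword moves
    cmp rcx, 0          ; never true here: rcx = 0 already takes the jz above
    je .ret
    movsb               ; copy one byte
    dec rcx
    jmp .test
.q:
    shr rcx, 3          ; byte count -> qword count
    rep movsq           ; copy 8 bytes at a time
.ret:
    ret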
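For readers who prefer C, the routine above can be sketched roughly as follows. This is a hosted approximation, not the poster's code: the name memcpy_qword_sketch is hypothetical, and the inner 8-byte transfers use the standard library's memcpy only to keep the sketch free of alignment assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C rendering of the assembly routine: copy single bytes until
   the remaining length is a multiple of 8, then copy 8 bytes at a time. */
void *memcpy_qword_sketch(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Byte phase: mirrors the movsb loop (runs n % 8 times). */
    while (n & 7) {
        *d++ = *s++;
        n--;
    }

    /* Qword phase: mirrors "shr rcx, 3" followed by "rep movsq". */
    for (size_t q = n >> 3; q != 0; q--) {
        uint64_t tmp;
        memcpy(&tmp, s, sizeof tmp);  /* alignment-safe 8-byte load */
        memcpy(d, &tmp, sizeof tmp);  /* alignment-safe 8-byte store */
        s += 8;
        d += 8;
    }
    return dest;
}
```

Note that the byte phase only makes the length a multiple of 8; it does not align the source or destination addresses.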
The usage:
static void putc_text(char c)
{
    if(c == '\n') // If the character is a newline just move to the next line
    {
        console_text_x = 0;
        console_text_y++;
    } else {
        // Print the character to the screen
        console_text_mem[(console_text_y*CONSOLE_TEXT_WIDTH+console_text_x)*2] = c;
        if(++console_text_x >= CONSOLE_TEXT_WIDTH)
        {
            console_text_x = 0;
            console_text_y++;
        }
    }
    // Move the screen up if necessary
    if(console_text_y >= CONSOLE_TEXT_HEIGHT)
    {
        console_text_y = CONSOLE_TEXT_HEIGHT-1;
        memcpy(console_text_mem, console_text_mem+(CONSOLE_TEXT_WIDTH*2), CONSOLE_TEXT_WIDTH*(CONSOLE_TEXT_HEIGHT-1)*2);
        // Epic Fail ^^^^^^^^^^^^
        // Clear the last line
        for(int i = CONSOLE_TEXT_WIDTH-1; i >= 0; i--)
        {
            console_text_mem[((CONSOLE_TEXT_WIDTH*(CONSOLE_TEXT_HEIGHT-1))+i)*2] = ' ';
        }
    }
}
My workaround was to create a new memcpy_noopt function that does a plain rep movsb copy. It seems to work okay, and I don't stay in text mode for long in any case.
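The post does not show memcpy_noopt itself. A minimal sketch of what such a byte-at-a-time copy might look like in C, assuming a plain loop stands in for rep movsb, is:

```c
#include <stddef.h>

/* Sketch only: a byte-at-a-time copy standing in for "rep movsb".
   The volatile qualifiers ask the compiler to perform each byte access
   separately instead of merging them into wider moves. */
void *memcpy_noopt(void *dest, const void *src, size_t n)
{
    volatile unsigned char *d = dest;
    const volatile unsigned char *s = src;

    /* Copy forward, so an overlapping copy toward lower addresses
       (as when scrolling the screen up) reads each byte before it
       is overwritten. */
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dest;
}
```

In the scrolling code, this would simply replace the memcpy call with the same three arguments.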