memcpy large number of 4KB chunks

sawdust
Member
Posts: 51
Joined: Thu Dec 20, 2007 4:04 pm

Post by sawdust »

Hello Gurus,
What is a good way to do an efficient memcpy of a large number of 4KB chunks? The source is near the 1GB address. If my CPU spends a lot of time doing this memcpy, it is going to crawl. I'm doing a lot of this memory copying in my little kernel. There is no paging.
TIA :?: :idea:
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: memcpy large number of 4KB chunks

Post by Combuster »

Grab the AMD optimisation reference (includes optimizing memcpy as an example) and try to minimize the amount of copying you need to do.
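
As a rough illustration of "minimizing the amount of copying" without paging, chunks could be shared by reference and only copied when someone actually needs to write to a shared one. This is a minimal sketch under assumptions, not anything from the AMD guide: `chunk_t`, `chunk_share`, `chunk_make_writable` and friends are hypothetical names, and `calloc`/`malloc`/`free` stand in for whatever allocator the kernel provides.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 4096u

/* Hypothetical reference-counted 4 KB chunk (illustration only). */
typedef struct chunk {
    uint32_t refs;              /* number of current holders */
    uint8_t  data[CHUNK_SIZE];
} chunk_t;

/* Allocate a zeroed chunk with a single owner. Returns NULL on failure. */
chunk_t *chunk_new(void)
{
    chunk_t *c = calloc(1, sizeof *c);
    if (c)
        c->refs = 1;
    return c;
}

/* "Copy" a chunk by sharing it: no bytes are moved. */
chunk_t *chunk_share(chunk_t *c)
{
    c->refs++;
    return c;
}

/* Drop one reference; free the chunk when the last holder lets go. */
void chunk_release(chunk_t *c)
{
    if (--c->refs == 0)
        free(c);
}

/*
 * Return a chunk the caller may write to. The 4 KB copy happens only
 * if the chunk is currently shared; a sole owner gets the same pointer
 * back. Returns NULL if a needed allocation fails.
 */
chunk_t *chunk_make_writable(chunk_t *c)
{
    if (c->refs == 1)
        return c;

    chunk_t *copy = malloc(sizeof *copy);
    if (!copy)
        return NULL;

    memcpy(copy->data, c->data, CHUNK_SIZE);
    copy->refs = 1;
    c->refs--;                  /* caller no longer holds the shared one */
    return copy;
}
```

With this scheme, chunks that are only ever read are never copied at all, which is the same idea CoW implements in hardware with paging.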

The real guru would probably do paging and CoW, but since you don't want to hear that... :-#
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
i586coder
Member
Posts: 143
Joined: Sat Sep 20, 2008 6:43 am

Re: memcpy large number of 4KB chunks

Post by i586coder »

First of all, I'm not a guru, I'm just a kernel coder :mrgreen:
Anyway, the solution for your memcpy that copies huge chunks of data is SSE.
This instruction set extension is present on both Intel and AMD CPUs.

To read more about SSE, use this link:
http://softpixel.com/~cwright/programming/simd/sse.php
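
For a sense of what that could look like, here is a minimal sketch of copying one 4 KB chunk with 128-bit loads and stores. It makes several assumptions: it uses the SSE2 integer intrinsics from `<emmintrin.h>`, both pointers must be 16-byte aligned, and the name `copy_chunk_sse2` is made up for illustration. In a kernel, SSE also has to be enabled before any of this runs (clear CR0.EM, set CR0.MP, set CR4.OSFXSR and CR4.OSXMMEXCPT). Whether it beats a plain copy on a given machine has to be measured.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

#define CHUNK_SIZE 4096u

/*
 * Copy one 4 KB chunk. Assumes dst and src are both 16-byte aligned
 * and do not overlap. Falls back to memcpy when SSE2 is unavailable
 * at compile time.
 */
void copy_chunk_sse2(void *dst, const void *src)
{
#if defined(__SSE2__)
    __m128i       *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;

    /* 4096 / 16 = 256 vectors; move 64 bytes (4 vectors) per iteration. */
    for (size_t i = 0; i < CHUNK_SIZE / sizeof(__m128i); i += 4) {
        __m128i a = _mm_load_si128(s + i);
        __m128i b = _mm_load_si128(s + i + 1);
        __m128i c = _mm_load_si128(s + i + 2);
        __m128i e = _mm_load_si128(s + i + 3);
        _mm_store_si128(d + i,     a);
        _mm_store_si128(d + i + 1, b);
        _mm_store_si128(d + i + 2, c);
        _mm_store_si128(d + i + 3, e);
    }
#else
    memcpy(dst, src, CHUNK_SIZE);
#endif
}
```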

Good luck.
Distance doesn't make you any smaller,
but it does make you part of a larger picture.
sawdust
Member
Posts: 51
Joined: Thu Dec 20, 2007 4:04 pm

Re: memcpy large number of 4KB chunks

Post by sawdust »

AhmadTayseerDajani wrote: The solution for your memcpy that copies huge chunks of data is SSE.
This instruction set extension is present on both Intel and AMD CPUs.

To read more about SSE, use this link:
http://softpixel.com/~cwright/programming/simd/sse.php
Thanks. I haven't used SIMD so far. I'll look now.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: memcpy large number of 4KB chunks

Post by Brendan »

Hi,
sawdust wrote: What is a good way to do an efficient memcpy of a large number of 4KB chunks? The source is near the 1GB address. If my CPU spends a lot of time doing this memcpy, it is going to crawl. I'm doing a lot of this memory copying in my little kernel. There is no paging.
These clues lead me to several possibilities, but none of these possibilities lead to anything sane.

It's like saying that hitting your head with a hammer hurts, and asking if there's something you can do to make it hurt less.... ;)

What exactly are you copying, and why?

Note: Using SSE won't increase RAM bandwidth, and RAM bandwidth is probably the biggest bottleneck with whatever you're already using. Prefetching won't help because the CPU's own hardware prefetcher will start prefetching for you after the first few cache lines. Flushing cache lines (and/or using non-temporal stores) won't improve the time spent copying (but would reduce the number of cache misses you get after copying).
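
For reference, a copy using non-temporal stores might look like the sketch below. It is only an illustration under assumptions: `copy_chunk_nt` is a made-up name, both pointers are assumed 16-byte aligned, and it relies on the SSE2 intrinsics `_mm_stream_si128` and `_mm_sfence`. In line with the note above, the point is to keep the destination from displacing useful data in the cache, not to make the copy itself faster.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_stream_si128, _mm_sfence */
#endif

#define CHUNK_SIZE 4096u

/*
 * Copy one 4 KB chunk using non-temporal (streaming) stores, which
 * write around the cache. Assumes dst and src are 16-byte aligned
 * and do not overlap. Falls back to memcpy without SSE2.
 */
void copy_chunk_nt(void *dst, const void *src)
{
#if defined(__SSE2__)
    __m128i       *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;

    for (size_t i = 0; i < CHUNK_SIZE / sizeof(__m128i); i++)
        _mm_stream_si128(d + i, _mm_load_si128(s + i));

    /* Make the streaming stores globally visible before returning. */
    _mm_sfence();
#else
    memcpy(dst, src, CHUNK_SIZE);
#endif
}
```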


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.