The point is, when you put it all together, that the Amiga was a very different machine from modern PCs and cache friendliness was not a consideration. So choosing to go for linked lists was a very different decision there. These days, self-referential data structures are just one cache miss after the next.
That said, using dynamic arrays all over the place is disastrous for other reasons (like memory fragmentation). As I said in the beginning, there is no silver bullet. There are many choices, and there are pros and cons to all of them. If your array goes beyond the cache size, it too becomes one cache miss after the next, especially if you access it randomly. And allocating an array when the key space is large and sparsely populated is wasteful.
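To make the trade-off concrete, here is a minimal sketch in C of the two traversal patterns discussed above. The names (`struct node`, `sum_list`, `sum_array`, `list_from_array`, `free_list`) are made up for illustration and are not from any particular codebase. Both loops compute the same sum; the difference is that the list walk has to load each `next` pointer before it knows where the following element lives, while the array walk touches consecutive addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type: every element lives in its own allocation. */
struct node {
    int value;
    struct node *next;
};

/* Frees every node in the list. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds a list holding values[0..count-1] in order.
 * Returns NULL if count is 0 or an allocation fails. */
struct node *list_from_array(const int *values, size_t count)
{
    struct node *head = NULL;
    for (size_t i = count; i > 0; i--) {
        struct node *n = malloc(sizeof *n);
        if (n == NULL) {
            free_list(head);
            return NULL;
        }
        n->value = values[i - 1];
        n->next = head;
        head = n;
    }
    return head;
}

/* List walk: each step depends on the pointer loaded in the previous step. */
long sum_list(const struct node *head)
{
    long total = 0;
    for (const struct node *n = head; n != NULL; n = n->next)
        total += n->value;
    return total;
}

/* Array walk: the next address is known without loading anything. */
long sum_array(const int *values, size_t count)
{
    assert(values != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

How much the two differ in practice depends on the allocator, the element count, and the access pattern, so measure on the target machine rather than assuming a fixed ratio.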
Optimization
Re: Optimization
I am sure this discussion is very helpful to the OP.
Re: Optimization
kzinti wrote: I am sure this discussion is very helpful to the OP.
You must admit that the OP had a rather, er, I'll be diplomatic and say overly-broad, question. That's just asking for someone to go off-topic.
I'll admit I'm enjoying the discussion about the Amiga's cache more than the OP, as the OP's answer could be discovered with knowledge of CS fundamentals.
Re: Optimization
So optimization at the micro level (CPU cache, registers, smaller and aligned data) will be helpful for all kinds of tasks. It may quadruple performance or even more, judging by some simple tests I made.
How about things such as task switching? Can you get a faster register save/restore in long mode with some instruction?
The fxsave saves the state of MMX, FPU and SSE.
What about xsave, xsaves and xsaveopt? I guess they save the AVX state, but do they also save the general-purpose registers?
Should I save the registers one by one in 64-bit mode? Isn't there a better way?
Re: Optimization
devc1 wrote: So optimization at the micro level (CPU cache, registers, smaller and aligned data) will be helpful for all kinds of tasks. It may quadruple performance or even more, judging by some simple tests I made.
No. Micro-optimization is always optional. Beneficial, yes, but not necessary.
devc1 wrote: How about things such as task switching? Can you get a faster register save/restore in long mode with some instruction?
No. Just several good ole' pushes.
devc1 wrote: The fxsave saves the state of MMX, FPU and SSE.
Yes.
devc1 wrote: What about xsave, xsaves and xsaveopt? I guess they save the AVX state, but do they also save the general-purpose registers?
Read the Intel manuals?
devc1 wrote: Should I save the registers one by one in 64-bit mode? Isn't there a better way?
It's the only way.
Re: Optimization
nexos wrote: No. Just several good ole' pushes.
There are some microarchitectures where using SUB and several MOV will be faster than several PUSH if the increased code size doesn't cause additional cache misses. So, there is another way, but it's probably not a better way.
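For anyone who wants to see that the two styles are interchangeable, here is a rough simulation in C rather than real assembly. Everything in it is hypothetical: `struct sim_stack` models a downward-growing stack of 64-bit slots, and the two functions mimic "one PUSH per register" and "one SUB, then a MOV per register at a fixed offset". It says nothing about which is faster on a given CPU; it only shows that both leave the same frame in memory and the same final stack pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 64  /* size of the simulated stack, an arbitrary choice */

/* Hypothetical model of a downward-growing stack of 64-bit slots. */
struct sim_stack {
    uint64_t mem[STACK_WORDS];
    size_t sp;  /* index of the current top slot; STACK_WORDS means empty */
};

void sim_init(struct sim_stack *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->sp = STACK_WORDS;
}

/* Style 1: one PUSH per register. The stack pointer changes on every store. */
void save_with_pushes(struct sim_stack *s, const uint64_t *regs, size_t count)
{
    assert(count <= s->sp);
    for (size_t i = 0; i < count; i++)
        s->mem[--s->sp] = regs[i];  /* push reg */
}

/* Style 2: one SUB reserves the whole frame, then each register is
 * stored with a MOV at a fixed offset from the new stack pointer. */
void save_with_sub_and_movs(struct sim_stack *s, const uint64_t *regs, size_t count)
{
    assert(count <= s->sp);
    s->sp -= count;  /* sub rsp, count * 8 */
    for (size_t i = 0; i < count; i++)
        s->mem[s->sp + (count - 1 - i)] = regs[i];  /* mov [rsp + offset], reg */
}
```

Saving the same register values through both functions on two freshly initialized stacks leaves identical memory contents, so a context-switch routine can use either style as long as the restore path uses the matching offsets.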
Re: Optimization
nexos wrote: No. Just several good ole' pushes.
Octocontrabass wrote: There are some microarchitectures where using SUB and several MOV will be faster than several PUSH if the increased code size doesn't cause additional cache misses. So, there is another way, but it's probably not a better way.
I recall people preferring LEA over SUB for manipulating ESP, but it has been a long time since I read that and I'm not sure why. LEA doesn't change the flags and lets you put the result in a different register, though the latter isn't relevant here. Address computations use a separate unit, so it may have improved parallelization on some older processors. Maybe it's obsolete, maybe not.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
Re: Optimization
nexos wrote: No. Just several good ole' pushes.
Octocontrabass wrote: There are some microarchitectures where using SUB and several MOV will be faster than several PUSH if the increased code size doesn't cause additional cache misses. So, there is another way, but it's probably not a better way.
I think I recall seeing that method in TempleOS now that you mention it.