I've been looking around for some recent information on opcode timings. Everything seems to be based on old processors (Pentium 1, etc.). I have an AMD Athlon XP 2600, and there are only a couple of instructions I'm after:
1) Callgate in use on the same segment and of less protection
2) Ret in the same segment and of lesser protection
Any input would be much appreciated
Thanks in advance
Opcode times
Re:Opcode times
The reason they don't publish them anymore is that they were never accurate once all the caching/pipelining stuff had been taken into consideration. Just do some benchmarking against your clock/counter.
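A minimal sketch of that kind of benchmarking, assuming GCC or Clang on x86 (where `__rdtsc()` from `<x86intrin.h>` reads the time-stamp counter). The names `read_ticks`, `measure_min_ticks`, `empty_candidate` and `busy_candidate` are made up for illustration. It times a candidate many times and keeps the minimum, so cache misses and interrupts in individual runs are filtered out. Timing a call gate or a far RET this way would additionally need a wrapper that executes the far call through a gate your kernel has set up, which is not shown here.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime in the fallback path */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>
/* x86: read the CPU time-stamp counter (ticks, not retired instructions). */
uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Assumption: on non-x86 builds fall back to a monotonic clock in ns. */
uint64_t read_ticks(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

typedef void (*candidate_fn)(void);

/* Run fn `runs` times and return the smallest elapsed tick count seen. */
uint64_t measure_min_ticks(candidate_fn fn, unsigned runs) {
    assert(fn != NULL && runs > 0);
    uint64_t best = UINT64_MAX;
    for (unsigned i = 0; i < runs; i++) {
        uint64_t start = read_ticks();
        fn();
        uint64_t elapsed = read_ticks() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}

/* Baseline: measures only the call and counter-read overhead. */
void empty_candidate(void) {}

/* Hypothetical stand-in for the code under test. */
void busy_candidate(void) {
    volatile unsigned sink = 0;
    for (unsigned i = 0; i < 10000; i++)
        sink += i;
}
```

Subtracting `measure_min_ticks(empty_candidate, n)` from the result for the real candidate gives a rough per-call cost. Treat the number as an estimate for your own machine only, not as a published timing.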
- Pype.Clicker
- Member
- Posts: 5964
- Joined: Wed Oct 18, 2006 2:31 am
- Location: In a galaxy, far, far away
- Contact:
Re:Opcode times
Moreover, the actual number of clock cycles required for such operations will heavily depend on how those 'complex instructions' are encoded into micro-rom instructions (they're more or less interpreted on modern RISC-cored CPUs ...)
Re:Opcode times
If you want to optimize your code, check out the "IA-32 Intel Architecture Optimization Reference Manual". It's a companion to the Software Developer Manuals. It explains all the branch-prediction, pipelining etc. - I'm afraid there is no easy "here's the table" answer to the question how to write utmost efficient code.
Note, however, the olden rule: Premature optimization is the root of all evil. (D. Knuth) Before you start hacking Assembler "because it's faster", assert that:
* the code section you are optimizing actually is the one causing performance problems, and
* your Assembler actually is faster than compiled C.
In both cases, chances are the answer is "no"...
Every good solution is obvious once you've found it.
Re:Opcode times
Solar wrote: I'm afraid there is no easy "here's the table" answer to the question how to write utmost efficient code.
Here's the table: (amd.com) http://www.amd.com/us-en/assets/content ... /22007.pdf
It's in appendix F.
I just hate to see people give incorrect answers...
Note that they ARE correct about it being totally useless. It does exist though. The other 80% of the book is about how to optimize.
For more asm optimization stuff, look at this thread: (masmforum.com)
http://www.masmforum.com/viewtopic.php?t=3329&start=0
- Pype.Clicker
Re:Opcode times
great stuff, candy ... all those "optimized long integer multiply" and other "optimized decimal-to-asciiz conversion" routines will certainly please those who're writing a stdlib replacement
Re:Opcode times
Candy wrote: I just hate to see people give incorrect answers...
Note that:
1) That table does not list execution time in clock cycles, but execute latencies and decode type - which is a different ballgame, and to understand all implications you have to understand the architecture;
2) Numbers given are for AMD Athlon; are you sure it won't differ with the Athlon XP?
Re:Opcode times
Solar wrote: 1) That table does not list execution time in clock cycles, but execute latencies and decode type - which is a different ballgame, and to understand all implications you have to understand the architecture;
Not necessarily. The numbers CAN also be seen as max clock cycles, not counting memory access latencies, just like on a 486 f.ex.
Solar wrote: 2) Numbers given are for AMD Athlon; are you sure it won't differ with the Athlon XP?
Athlon is an architecture (actually K7), Athlon XP is a die size plus marketing. The Morgan-series Durons are the exact same but have less cache, the Spitfire Durons are the same at a larger die size (0.18 afaik), etc etc etc. The 32-bit Athlons are all the same (note, this is NOT true for the 64-bit ones), except for some minor differences in features (but not in latencies!). A processor won't have a latency for a feature it doesn't have, of course.