Fast FXSR

JohnnyTheDon
Member
Posts: 524
Joined: Sun Nov 09, 2008 2:55 am
Location: Pennsylvania, USA

Fast FXSR

Post by JohnnyTheDon »

While looking into supporting 3DNow! instructions, I found a Fast FXSR bit. This apparently allows the processor to skip saving the XMM registers when doing an FXSAVE/FXRSTOR. My question is: why? Is it substantially faster to save these registers manually on task switches?
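
For reference, a minimal sketch of how the bit in question is usually identified. The numeric values are quoted from memory of the AMD64 manuals (CPUID leaf 0x80000001, EDX bit 25 for support; EFER MSR 0xC0000080, bit 14 to enable) and should be checked against the current documentation. The helper names `cpu_has_ffxsr` and `efer_set_ffxsr` are made up for illustration, and the actual CPUID and RDMSR/WRMSR instructions are left out so the logic stays plain C. As far as I recall, the fast path only applies to FXSAVE/FXRSTOR executed in 64-bit mode at CPL 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as recalled from the AMD64 manuals; verify before relying on them. */
#define CPUID_EXT_FEATURES 0x80000001u  /* CPUID leaf that reports the flag   */
#define CPUID_EDX_FFXSR    (1u << 25)   /* EDX bit 25: FFXSR is supported     */
#define MSR_EFER           0xC0000080u  /* Extended Feature Enable Register   */
#define EFER_FFXSR         (1ull << 14) /* EFER bit 14: fast FXSAVE/FXRSTOR   */

/* Hypothetical helper: edx is the EDX value returned by CPUID_EXT_FEATURES. */
static bool cpu_has_ffxsr(uint32_t edx)
{
    return (edx & CPUID_EDX_FFXSR) != 0;
}

/* Hypothetical helper: takes the current EFER value (from RDMSR) and
 * returns the value to write back with WRMSR. */
static uint64_t efer_set_ffxsr(uint64_t efer, bool enable)
{
    return enable ? (efer | EFER_FFXSR) : (efer & ~EFER_FFXSR);
}
```

A kernel would call `cpu_has_ffxsr` once at boot and only touch EFER if it returns true.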
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Re: Fast FXSR

Post by jal »

JohnnyTheDon wrote: While looking into supporting 3DNow! instructions, I found a Fast FXSR bit. This apparently allows the processor to skip saving the XMM registers when doing an FXSAVE/FXRSTOR. My question is: why? Is it substantially faster to save these registers manually on task switches?
Presumably, when many applications use the FPU but few use XMM, it will save time (you'd manually save/restore the XMM registers only when a task is using them).


JAL
01000101
Member
Posts: 1599
Joined: Fri Jun 22, 2007 12:47 pm

Re: Fast FXSR

Post by 01000101 »

If you save/restore the FPU/MMX state in every interrupt routine, or otherwise do it frequently, it will probably save you some time. But if you ever plan to use SSE (XMM regs), then it's probably best to do a full save/restore via FXSAVE/FXRSTOR, as some things are worth taking the cycle hit. I haven't really looked into the Fast FXSR bit, but does it require less memory space? The normal FXSAVE/FXRSTOR requires a full 512 bytes of memory aligned on a 16-byte boundary (iirc).
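
To make the 512-byte figure concrete, here is a rough sketch of the FXSAVE image as it is laid out in 64-bit mode. The struct and field names are invented for illustration; the offsets follow the published layout as I remember it, with the XMM registers occupying bytes 160 to 415. That block is the part a fast save would leave untouched, so the sketch assumes the buffer itself keeps its full size and alignment.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout of the 512-byte FXSAVE area (64-bit mode format). */
struct fxsave_area {
    alignas(16) uint16_t fcw; /* x87 control word; forces 16-byte alignment */
    uint16_t fsw;             /* x87 status word                            */
    uint8_t  ftw;             /* abridged x87 tag word                      */
    uint8_t  reserved0;
    uint16_t fop;             /* last x87 opcode                            */
    uint64_t fip;             /* last x87 instruction pointer               */
    uint64_t fdp;             /* last x87 data pointer                      */
    uint32_t mxcsr;           /* SSE control/status register                */
    uint32_t mxcsr_mask;
    uint8_t  st_mm[8][16];    /* ST0-ST7 / MM0-MM7, one 16-byte slot each   */
    uint8_t  xmm[16][16];     /* XMM0-XMM15, bytes 160..415                 */
    uint8_t  reserved1[96];   /* pads the image out to 512 bytes            */
};

/* Compile-time checks that the sketch matches the sizes quoted above. */
_Static_assert(sizeof(struct fxsave_area) == 512, "FXSAVE area is 512 bytes");
_Static_assert(offsetof(struct fxsave_area, xmm) == 160, "XMM0 at offset 160");
```

Declaring one `struct fxsave_area` per thread gives a buffer that satisfies both the size and the alignment requirement without any manual padding.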
JohnnyTheDon
Member
Posts: 524
Joined: Sun Nov 09, 2008 2:55 am
Location: Pennsylvania, USA

Re: Fast FXSR

Post by JohnnyTheDon »

I would probably have to allocate the space anyway, just in case the application started using SSE at some point. It would also be annoying to leave TS set until the thread used SSE (it would fault on every FPU/MMX instruction as well as SSE). I think I'm just going to forget it; it doesn't seem like that good of an idea unless I know that the program won't use SSE.
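
The TS problem described here can be sketched as a small model. This is a deliberate simplification: it ignores CR0.EM, the MP/WAIT interaction, and the handful of SSE-era instructions that do not touch XMM state. `raises_nm` and the `insn_class` enum are hypothetical names, not anything from a real kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_TS (1u << 3) /* CR0 bit 3: Task Switched */

/* Coarse instruction classes, for illustration only. */
enum insn_class { INSN_INTEGER, INSN_X87, INSN_MMX, INSN_SSE };

/* Simplified model: would executing an instruction of this class raise
 * the device-not-available exception (#NM) with the given CR0 value? */
static bool raises_nm(uint32_t cr0, enum insn_class c)
{
    if (!(cr0 & CR0_TS))
        return false;
    /* TS does not distinguish x87/MMX from SSE: all three classes trap. */
    return c == INSN_X87 || c == INSN_MMX || c == INSN_SSE;
}
```

Because the flag covers all three classes, leaving TS set as an "SSE detector" means a thread that only uses x87 or MMX still faults on every such instruction.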