I'm writing my CPUID interface (so that programs can get the CPU information) and it works properly apart from one small detail: when I use Bochs compiled with SMP support and enable 2 processors, the 'CPU count' field is 0, which seems wrong to me. Any ideas why? I'm not planning on using both processors; I would just like to be able to tell whether there are 2 (or more) processors.
Also, how do I use SSE instructions? I've heard that they can help me (128-bit variables?) but I don't know how to implement them in my kernel. Any pointers, tips or code samples would be welcome (I have Googled, so don't point me towards Google).
CPUID Processor Count, and SSE...
- Combuster
Normally, the multimedia registers are not used in the kernel, because saving their state is slow: the FPU and SSE state would have to be saved on every kernel entry and restored on return, even though the program might not use these registers at all (a case where lazy FPU switching can be used instead).
Enabling the multimedia extensions requires some bits in the control registers to be set correctly:
cr0.em should be cleared
cr0.mp should be set
cr0.ts should be cleared
For SSE you need to set these as well (don't set them if the CPU does not support SSE):
cr4.osfxsr should be set
cr4.osxmmexcpt should be set
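The bit settings listed above can be sketched in C roughly as follows. This is only a minimal sketch, not Combuster's code: the function names (cr0_for_fpu, cr4_for_sse, enable_fpu_and_sse) are made up for illustration, and the privileged part assumes GCC-style inline assembly running in ring 0. The bit positions are the architectural ones (CR0.MP = bit 1, CR0.EM = bit 2, CR0.TS = bit 3, CR4.OSFXSR = bit 9, CR4.OSXMMEXCPT = bit 10), and the support check uses the FXSR and SSE feature bits from CPUID leaf 1, EDX.

```c
#include <stdint.h>

/* CR0 bits named in the post */
#define CR0_MP          ((uintptr_t)1 << 1)
#define CR0_EM          ((uintptr_t)1 << 2)
#define CR0_TS          ((uintptr_t)1 << 3)
/* CR4 bits named in the post */
#define CR4_OSFXSR      ((uintptr_t)1 << 9)
#define CR4_OSXMMEXCPT  ((uintptr_t)1 << 10)
/* CPUID leaf 1, EDX feature bits */
#define CPUID1_EDX_FXSR ((uint32_t)1 << 24)
#define CPUID1_EDX_SSE  ((uint32_t)1 << 25)

/* Hypothetical helper: new CR0 value with EM and TS cleared, MP set. */
uintptr_t cr0_for_fpu(uintptr_t cr0)
{
    cr0 &= ~(CR0_EM | CR0_TS);
    cr0 |= CR0_MP;
    return cr0;
}

/* Hypothetical helper: new CR4 value. The SSE bits are only set when
 * CPUID reports both FXSR and SSE; otherwise CR4 is returned unchanged. */
uintptr_t cr4_for_sse(uintptr_t cr4, uint32_t cpuid1_edx)
{
    uint32_t need = CPUID1_EDX_FXSR | CPUID1_EDX_SSE;
    if ((cpuid1_edx & need) != need)
        return cr4;
    return cr4 | CR4_OSFXSR | CR4_OSXMMEXCPT;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Privileged register access: these fault outside ring 0. */
static inline uintptr_t read_cr0(void)
{
    uintptr_t v;
    __asm__ volatile("mov %%cr0, %0" : "=r"(v));
    return v;
}
static inline void write_cr0(uintptr_t v)
{
    __asm__ volatile("mov %0, %%cr0" : : "r"(v) : "memory");
}
static inline uintptr_t read_cr4(void)
{
    uintptr_t v;
    __asm__ volatile("mov %%cr4, %0" : "=r"(v));
    return v;
}
static inline void write_cr4(uintptr_t v)
{
    __asm__ volatile("mov %0, %%cr4" : : "r"(v) : "memory");
}

/* Kernel-only: apply both settings; cpuid1_edx comes from CPUID leaf 1. */
void enable_fpu_and_sse(uint32_t cpuid1_edx)
{
    write_cr0(cr0_for_fpu(read_cr0()));
    write_cr4(cr4_for_sse(read_cr4(), cpuid1_edx));
}
#endif
```

Keeping the bit arithmetic in plain functions means it can be checked on a normal host build; only enable_fpu_and_sse() has to run inside the kernel.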
Note that on a hardware task switch cr0.ts will be set, and any multimedia instruction executed after that will raise an exception (device not available, #NM), which allows lazy context switching.
My kernel uses that exception handler to save/restore the FPU context, and it sets cr0.ts itself on software task switches.
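To make the lazy switching idea concrete, here is a small host-side model of it. It is not Combuster's kernel code: struct task, switch_to() and handle_device_not_available() are hypothetical names, and the hardware is simulated with two variables (sim_cr0_ts for the TS flag, sim_fpu_regs for the register file). In a real kernel the memcpy calls would be fxsave/fxrstor on a 512-byte, 16-byte-aligned area, and clearing the flag would be the clts instruction.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FXSAVE_AREA_SIZE 512   /* size of the FXSAVE/FXRSTOR save area */

/* Hypothetical per-task structure holding the saved FPU/SSE state. */
struct task {
    _Alignas(16) unsigned char fpu_state[FXSAVE_AREA_SIZE];
};

/* Simulated hardware state (assumption: stands in for CR0.TS and the
 * real FPU/SSE registers so the logic can run on a host). */
static bool sim_cr0_ts;
static unsigned char sim_fpu_regs[FXSAVE_AREA_SIZE];

static struct task *current_task;  /* task that is running now */
static struct task *fpu_owner;     /* task whose state is in the registers */

/* Software task switch: do not touch the FPU state, just set TS so the
 * next FPU/SSE instruction traps. */
void switch_to(struct task *next)
{
    current_task = next;
    sim_cr0_ts = true;
}

/* Device-not-available (#NM) handler: runs on the first FPU/SSE
 * instruction after a switch. */
void handle_device_not_available(void)
{
    sim_cr0_ts = false;                      /* real kernel: clts */
    if (fpu_owner == current_task)
        return;                              /* registers already ours */
    if (fpu_owner != NULL)                   /* real kernel: fxsave */
        memcpy(fpu_owner->fpu_state, sim_fpu_regs, FXSAVE_AREA_SIZE);
    /* real kernel: fxrstor */
    memcpy(sim_fpu_regs, current_task->fpu_state, FXSAVE_AREA_SIZE);
    fpu_owner = current_task;
}
```

The point of the design is that a task which never touches the FPU or SSE registers never pays for a save or restore; the state is only moved when the exception actually fires.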
Re: CPUID Processor Count, and SSE...
Hi,
pcmattman wrote: I'm writing my CPUID interface (so that programs can get the CPU information) and it works properly apart from one small detail: when I use Bochs compiled with SMP support and enable 2 processors, the 'CPU count' field is 0, which seems wrong to me. Any ideas why? I'm not planning on using both processors; I would just like to be able to tell whether there are 2 (or more) processors.
CPUID will tell you information for that CPU only. This means that in your case (Bochs emulating 2 separate chips) the information you're getting is correct: there are "0 + 1" logical CPUs in the chip.
However, CPUID will tell you how many cores are in the same chip, and (for hyperthreading) will also tell you how many logical CPUs are in each core.
I can't remember the exact details, but IIRC Bochs is buggy when it comes to hyper-threading, and reports logical CPUs as multiple cores (or something like that). For my own purposes I rewrote the code in Bochs that emulates CPUID a while ago to avoid this problem (and others).
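As a rough illustration of the per-chip counts, the sketch below decodes two CPUID fields. It rests on assumptions the thread does not confirm: that the 'CPU count' field in question is leaf 1, EBX bits 23:16 (logical processors per package, only meaningful when the HTT bit, EDX bit 28, is set), and that the core count comes from Intel's leaf 4, EAX bits 31:26, which stores the count minus one. The function names are invented for the example, and query_logical_per_package() relies on GCC's <cpuid.h>.

```c
#include <stdint.h>

#define CPUID1_EDX_HTT ((uint32_t)1 << 28)

/* Logical processors per physical package, from CPUID leaf 1.
 * Without the HTT flag the EBX field is not meaningful, so report 1. */
unsigned logical_per_package(uint32_t leaf1_ebx, uint32_t leaf1_edx)
{
    unsigned n;
    if (!(leaf1_edx & CPUID1_EDX_HTT))
        return 1;
    n = (leaf1_ebx >> 16) & 0xFF;
    return n ? n : 1;
}

/* Cores per physical package, from CPUID leaf 4 (Intel only, and only
 * valid when the maximum basic leaf is at least 4). The field holds
 * "count - 1", so a raw value of 0 means one core. */
unsigned cores_per_package(uint32_t leaf4_eax)
{
    return ((leaf4_eax >> 26) & 0x3F) + 1;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>

/* Reads leaf 1 on the CPU this code runs on; returns 0 if the leaf is
 * not available. */
unsigned query_logical_per_package(void)
{
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return logical_per_package(ebx, edx);
}
#endif
```

Either way the result describes one chip only, so two separate emulated chips will each report their own count rather than a system-wide total.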
pcmattman wrote: Also, how do I use SSE instructions? I've heard that they can help me (128-bit variables?) but I don't know how to implement them in my kernel. Any pointers, tips or code samples would be welcome (I have Googled, so don't point me towards Google).
Except for Combuster's comments, I've got no idea how you'd use MMX or SSE in GCC. I'd assume you'd need to use inline assembly...
Cheers,
Brendan
Output of cc1 --help from gcc 4.1.2:

Target specific options:
  -m128bit-long-double         sizeof(long double) is 16
  -m32                         Generate 32bit i386 code
  -m3dnow                      Support 3DNow! built-in functions
  -m64                         Generate 64bit x86-64 code
  -m80387                      Use hardware fp
  -m96bit-long-double          sizeof(long double) is 12
  -maccumulate-outgoing-args   Reserve space for outgoing arguments in the function prologue
  -malign-double               Align some doubles on dword boundary
  -malign-functions=           Function starts are aligned to this power of 2
  -malign-jumps=               Jump targets are aligned to this power of 2
  -malign-loops=               Loop code aligned to this power of 2
  -malign-stringops            Align destination of the string operations
  -march=                      Generate code for given CPU
  -masm=                       Use given assembler dialect
  -mbranch-cost=               Branches are this expensive (1-5, arbitrary units)
  -mcmodel=                    Use given x86-64 code model
  -mfancy-math-387             Generate sin, cos, sqrt for FPU
  -mfp-ret-in-387              Return values of functions in FPU registers
  -mfpmath=                    Generate floating point mathematics using given instruction set
  -mhard-float                 Use hardware fp
  -mieee-fp                    Use IEEE math for fp comparisons
  -minline-all-stringops       Inline all known string operations
  -mlarge-data-threshold=      Data greater than given threshold will go into .ldata section in x86-64 medium model
  -mmmx                        Support MMX built-in functions
  -mms-bitfields               Use native (MS) bitfield layout
  -momit-leaf-frame-pointer    Omit the frame pointer in leaf functions
  -mpreferred-stack-boundary=  Attempt to keep stack aligned to this power of 2
  -mpush-args                  Use push instructions to save outgoing arguments
  -mred-zone                   Use red-zone in the x86-64 code
  -mregparm=                   Number of registers used to pass integer arguments
  -mrtd                        Alternate calling convention
  -msoft-float                 Do not use hardware fp
  -msse                        Support MMX and SSE built-in functions and code generation
  -msse2                       Support MMX, SSE and SSE2 built-in functions and code generation
  -msse3                       Support MMX, SSE, SSE2 and SSE3 built-in functions and code generation
  -msseregparm                 Use SSE register passing conventions for SF and DF mode
  -mstack-arg-probe            Enable stack probing
  -msvr3-shlib                 Uninitialized locals in .bss
  -mtls-dialect=               Use given thread-local storage dialect
  -mtls-direct-seg-refs        Use direct references against %gs when accessing tls data
  -mtune=                      Schedule code for given CPU

Regards,
John.
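With -msse from the list above, GCC exposes the SSE built-ins through <xmmintrin.h>, so inline assembly is not strictly required. The following is only a sketch of what that could look like: add4() is a hypothetical function name, and the scalar fallback is there so the file still builds when the compiler is not targeting SSE. In a kernel this must only run after CR4.OSFXSR has been set as described earlier in the thread, otherwise the SSE instructions raise an invalid-opcode exception.

```c
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics, available with -msse */
#endif

/* Adds four packed floats: out[i] = a[i] + b[i] for i = 0..3.
 * The unaligned load/store forms are used so the arrays need no
 * special alignment. */
void add4(const float *a, const float *b, float *out)
{
#if defined(__SSE__)
    __m128 va = _mm_loadu_ps(a);     /* load 4 floats into one 128-bit register */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));  /* one ADDPS does all four adds */
#else
    int i;
    for (i = 0; i < 4; i++)          /* scalar fallback without SSE */
        out[i] = a[i] + b[i];
#endif
}
```

A build line along the lines of "gcc -msse -c file.c" should be enough for the intrinsic path; note that passing -msse may also let GCC generate SSE code on its own in other functions, which matters for the kernel-entry state-saving concern raised above.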