Reverse engineering Intel Speed Step

Posted: Tue May 28, 2013 9:15 am
by rdos
Key to doing power management on Intel processors seems to be the IA32_PERF_CTL and IA32_PERF_STATUS MSRs. The problem is that Intel documents neither the 16-bit ID field in IA32_PERF_CTL nor the fields in IA32_PERF_STATUS.

However, trying different values on a real system (Intel Atom N455) gives some indications.

First, it seems like the low word of IA32_PERF_CTL is composed of two different values. The high byte can range between 6 and 0xA on my CPU, while the low byte can range between 0x13 and 0x24. Also, the CPU doesn't allow values higher than 0x3F in the low byte. It seems like the low byte should be the VID value documented in the Intel Atom manual. After boot, the frequency value is 0xA and the VID value is 0x24 (which corresponds to 1.05 V and 1.66 GHz according to BIOS).
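
The split described above can be sketched in C as follows. This is only an illustration of the layout inferred from the experiments (frequency value in bits 8 to 15, VID in bits 0 to 7), not something taken from Intel documentation; the type and function names are hypothetical, and actually reading or writing the MSR (IA32_PERF_CTL is MSR 199h) needs ring-0 rdmsr/wrmsr code that is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the low word of IA32_PERF_CTL on the Atom,
   as inferred from experiments, not from Intel documentation. */
typedef struct {
    uint8_t fid; /* frequency value: high byte, 0x06..0x0A observed */
    uint8_t vid; /* voltage ID: low byte, 0x13..0x24 observed */
} perf_ctl_fields;

/* Extract the two bytes from a raw MSR value. */
static perf_ctl_fields perf_ctl_decode(uint64_t msr)
{
    perf_ctl_fields f;
    f.fid = (uint8_t)((msr >> 8) & 0xFF);
    f.vid = (uint8_t)(msr & 0xFF);
    return f;
}

/* Build a new MSR value: replace only the low word, keep bits 16..63. */
static uint64_t perf_ctl_encode(uint64_t old_msr, perf_ctl_fields f)
{
    assert(f.vid <= 0x3F); /* the CPU rejects larger VID values */
    return (old_msr & ~(uint64_t)0xFFFF) | ((uint64_t)f.fid << 8) | f.vid;
}
```

With the boot value 0x0A24, decoding gives a frequency value of 0x0A and a VID of 0x24. Preserving the upper bits on write is a cautious choice, since nothing in the experiments says what they do.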

The IA32_PERF_STATUS MSR seems to echo the requested frequency and VID values in the low word. The actual operating point seems to be in the upper 32 bits.

The question is how to design a power management system for Intel CPUs without relying on processor specific drivers (which you need to obtain from Intel, and they won't have those for your OS)? Could a driver just assume that frequency and VID values are coded in the low word, and try to change them and observe the effects?

The bad thing is that ACPI doesn't provide relevant objects for supported P-states and their frequencies (the HI0 and HC0 objects are undocumented, while the _PDC and _OSC objects are useless).

Also interesting is this web page: https://lkml.org/lkml/2007/6/4/238

Re: Reverse engineering Intel Speed Step

Posted: Tue May 28, 2013 3:04 pm
by rdos
I've tried the code on a dual core Intel Atom, and it seems to live its own life. The status values change dynamically between cores, which makes it hard to figure out how it works.

So I did a more complex test that can write a requested value to IA32_PERF_CTL on all cores, while also reading out the two control registers. Looking at the results, it seems like high values in the high byte (possibly requested frequency) will not be reflected on all cores, while low values generally will be. The same seems to be the case for the low byte.

I also tried the performance MSRs (IA32_MPERF and IA32_APERF), but these seem to be useless for measuring actual processor frequency. I think I need to write a stress application that times its own performance in order to see which of the settings is the frequency.

Re: Reverse engineering Intel Speed Step

Posted: Tue May 28, 2013 4:49 pm
by jnc100
Do you have a _PSS object in your ACPI namespace? According to the ACPI specification it provides a mapping between core frequency, power and IA32_PERF_CTL values for your system.
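
For reference, each _PSS entry is a package of six integers: core frequency (MHz), power (mW), transition latency (µs), bus master latency (µs), a control value and a status value. A minimal sketch of how an OS might hold and search such a table is below; the struct and function names are hypothetical, and evaluating _PSS itself through the AML interpreter is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One _PSS entry, fields in the order the ACPI specification lists them. */
typedef struct {
    uint32_t core_freq_mhz; /* CoreFrequency */
    uint32_t power_mw;      /* Power */
    uint32_t latency_us;    /* Latency */
    uint32_t bm_latency_us; /* BusMasterLatency */
    uint32_t control;       /* value to write to the control register */
    uint32_t status;        /* value expected back from the status register */
} pss_entry;

/* Return the slowest state that still runs at min_mhz or faster,
   or NULL if no entry qualifies. Does not rely on the table order. */
static const pss_entry *pss_pick(const pss_entry *tab, size_t n,
                                 uint32_t min_mhz)
{
    const pss_entry *best = NULL;
    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tab[i].core_freq_mhz < min_mhz)
            continue;
        if (best == NULL || tab[i].core_freq_mhz < best->core_freq_mhz)
            best = &tab[i];
    }
    return best;
}
```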

Regards,
John.

Re: Reverse engineering Intel Speed Step

Posted: Tue May 28, 2013 6:06 pm
by Cognition
jnc100 wrote:Do you have a _PSS object in your ACPI namespace? According to the ACPI specification it provides a mapping between core frequency, power and IA32_PERF_CTL values for your system.

Regards,
John.
That's the correct way to do things on an ACPI system. Some relevant documentation is here.

Re: Reverse engineering Intel Speed Step

Posted: Wed May 29, 2013 12:35 am
by rdos
No, none of my Intel Atom computers have a _PSS object. The dual core machine does have information on frequency modulation, but that is not really what I want. I want the power interface that can lower voltage in order to save power and decrease temperature. The frequency modulation interface mostly seems useless as the hlt instruction would do the same thing (basically).

Both have the _PDC and _OSC objects, but AFAIK this doesn't help much, unless setting bits in _PDC would expose new objects.

Re: Reverse engineering Intel Speed Step

Posted: Wed May 29, 2013 9:01 am
by rdos
Using a "load-application" that creates one thread per processor-core, I now know that the high byte in IA32_PERF_CTL is the frequency. My computer supports from 1GHz to 1.66GHz in 5 steps (settings 06 to 0A).

Using an ammeter, I also know that the lower byte affects power consumption, and that setting lower values will lower power consumption. Thus, the lower byte must be the VID setting (settings 13 to 24).

There is also another peculiarity that seems to hold for both Atom processors, and that is that the limits are in the upper 32 bits. Thus, the minimum setting (lowest power and frequency) is in bits 48 to 63 and the maximum setting (highest power and frequency) is in bits 32 to 47.
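
A sketch of pulling those limits out of a 64-bit register value follows. It assumes the value was read from IA32_PERF_STATUS (MSR 198h) and that each 16-bit limit uses the same frequency/VID byte layout as the low word; both are inferences from the posts above, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t fid, vid; } op_point;   /* frequency value + VID */
typedef struct { op_point min, max; } op_limits; /* lowest and highest setting */

/* Split one 16-bit setting into frequency value (high byte) and VID (low byte). */
static op_point op_from_word(uint16_t w)
{
    op_point p = { (uint8_t)(w >> 8), (uint8_t)(w & 0xFF) };
    return p;
}

/* Assumed layout: bits 32..47 = maximum setting, bits 48..63 = minimum setting. */
static op_limits limits_decode(uint64_t status)
{
    op_limits l;
    l.max = op_from_word((uint16_t)(status >> 32));
    l.min = op_from_word((uint16_t)(status >> 48));
    assert(l.min.fid <= l.max.fid); /* sanity check on the assumed layout */
    return l;
}
```

A driver could use the decoded limits to clamp any requested setting, which avoids hard-coding the 06..0A and 13..24 ranges seen on one particular machine.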

This means that I have reverse-engineered the power control of the Intel Atom family at least. What is less clear is which combinations are guaranteed to work, since it is possible to set any combination of VID and frequency, which might cause malfunction. The power measurements seem to show that such combinations do take effect.

Total power (24 V supply voltage):
Unloaded after boot: 0.607A / 14.5W
Loaded after boot: 0.673A / 16.2W (f=1.66GHz, VID=24h)
Loaded min freq: 0.643A / 15.4W (f=1GHz, VID=24h)
Loaded min freq: 0.610A / 14.6W (f=1GHz, VID=13h)
Unloaded min freq: 0.584A / 14.0W (f=1GHz, VID=13h)
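
The wattage figures are simply current times the 24 V supply. Under load, going from 1.66GHz/VID=24h (0.673A) to 1GHz/VID=13h (0.610A) saves about 1.5W, or roughly 9% of total system power. A small sketch of that arithmetic, with hypothetical helper names:

```c
#include <assert.h>

/* Power drawn from the supply: P = U * I. */
static double watts(double volts, double amps)
{
    assert(volts >= 0.0);
    return volts * amps;
}

/* Relative saving of new_w compared with base_w, in percent. */
static double saving_percent(double base_w, double new_w)
{
    assert(base_w > 0.0);
    return 100.0 * (base_w - new_w) / base_w;
}
```

Note that these are whole-system numbers measured at the supply, so the saving relative to the CPU's own consumption is larger than 9%.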

Edit: The dual core Atom processor has the limits in the top 32 bits as well. It also seems like the frequency value can be used to calculate the operating frequency as f = nominal frequency * freq value / max freq value.
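
The frequency formula from the edit above, written out as a small C helper. The function name and the choice of kHz as the unit are illustrative assumptions; the nominal frequency has to come from somewhere else (BIOS, CPUID brand string or a calibration loop).

```c
#include <assert.h>
#include <stdint.h>

/* f = nominal frequency * freq value / max freq value, in kHz.
   The multiplication is done in 64 bits to avoid overflow. */
static uint32_t fid_to_khz(uint32_t nominal_khz, uint8_t fid, uint8_t max_fid)
{
    assert(max_fid != 0);
    return (uint32_t)(((uint64_t)nominal_khz * fid) / max_fid);
}
```

With a nominal 1.66GHz and a maximum value of 0x0A, the setting 06 comes out at 996000 kHz, which matches the 1GHz minimum reported above.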

I think a reasonable algorithm for the VID value is linear interpolation. For instance, if the selected frequency value lies somewhere between min and max, then the VID value would lie at the same relative position between its own min and max.
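
A minimal sketch of such a linear mapping, assuming the min/max frequency values and VIDs have already been read from the limit fields. The function name is hypothetical, and rounding the VID up (so the voltage never falls below the straight line) is a conservative choice made for this example, not something the post specifies.

```c
#include <assert.h>
#include <stdint.h>

/* Map a frequency value to a VID by linear interpolation between
   (min_fid, min_vid) and (max_fid, max_vid). Out-of-range inputs
   are clamped to the nearest limit. */
static uint8_t vid_for_fid(uint8_t fid, uint8_t min_fid, uint8_t max_fid,
                           uint8_t min_vid, uint8_t max_vid)
{
    assert(min_fid <= max_fid && min_vid <= max_vid);
    if (fid <= min_fid)
        return min_vid;
    if (fid >= max_fid)
        return max_vid;
    /* Here min_fid < fid < max_fid, so span_f is at least 2. */
    unsigned span_f = (unsigned)(max_fid - min_fid);
    unsigned span_v = (unsigned)(max_vid - min_vid);
    unsigned step = (unsigned)(fid - min_fid);
    /* Integer division rounded up. */
    return (uint8_t)(min_vid + (step * span_v + span_f - 1) / span_f);
}
```

For the single core machine (frequency values 06 to 0A, VID 13h to 24h) this gives VID 18h, 1Ch and 20h for the frequency values 07, 08 and 09. Whether those intermediate points are actually stable is exactly the open question about which combinations are guaranteed to work.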

New power management algorithm now running on the dual core Atom processor, and it appears to work. :wink:

Re: Reverse engineering Intel Speed Step

Posted: Wed May 29, 2013 2:58 pm
by jnc100
rdos wrote:No, none of my Intel Atom computers have a _PSS object.
That's odd, as it's a relatively modern chip and according to Microsoft they require this object for processor speed state control in Win7. Are you searching in both the _SB and _PR namespaces? Also, are you writing PSTATE_CNT to SMI_CMD? It may be (I'm theorising here) that you need to do this before the BIOS exposes the relevant ACPI objects.

Regards,
John.

Re: Reverse engineering Intel Speed Step

Posted: Wed May 29, 2013 3:18 pm
by rdos
Here is the full ACPI namespace on the dual core:

Code:

\_SB_ (TCG1, PPRQ, PPLO, PPRP, PPOR, TPRS, TPMV, MOR_, PHSR, SMI0, SMIC, SMI1, BCMD, DID_, INFO, INF_, PR00, AR00, PR04, AR04, PR05, AR05, PR06, AR06, PR07, AR07, PR01, AR01, SNVS, SECI, DB00, DW00, OSYS, BFCC, PVFN, IGDS, TLST, CADL, PADL, CSTE, NSTE, SSTE, NDID, BRTL, PSVT, TC1V, TC2V, TSPV, CRTT, ACTT, MPEN, PPCS, PPCM, PCP0, PCP1, GSSR, DIAG, TZON, NIST, RIST, RCST, CCST, RCNT, C3SU, C1ON, BMLF, TEST, MDEL, BCMV, C4AC, TOUA, MSEC, SECT, TRPS, SECS, SECB, SECW, STRP, SOST, OSTB, OSTY, TPOS, OSTP, OSHT, SEQL)
\_SB_.PCI0 (MPCE, PEXE, LENG, EXBA, _INI, _HID, _CID, _ADR, _OSC, REGS, PAM0, PAM1, PAM2, PAM3, PAM4, PAM5, PAM6, HEN_, TASM, Z000, RSRC, _CRS, _S3D, _S4D, _PRT, NATA, GETP, GETD, GETT, GETF, SETP, SETD, SETT)
\_SB_.PCI0.IGD0 (_ADR, IGDP, GIVD, GUMA, GMFN, SSRW, ASLE, GSSE, GSSB, GSES, CDVL, ASLS, _STA, IGDM, SIGN, SIZE, OVER, SVER, VVER, GVER, MBOX, DRDY, CSTS, CEVT, DIDL, CPDL, CADL, NADL, ASLP, TIDX, CHPD, CLID, CDCK, SXSW, EVTS, CNOT, NRDY, SCIE, GEFC, GXFC, GESF, PARM, DSLP, ARDY, ASLC, TCHE, ALSI, BCLP, PFIT, GVD1, IBTT, IPAT, ITVF, ITVM, IPSC, IBLC, IBIA, ISSC, I409, I509, I609, I709, IDMM, IDMS, IF1E, GSMI, HVCO, LIDS, CGCS, DBTB, SUCC, NVLD, CRIT, NCRT, GBDA, SBCB, OPRN, _DOS, _DOD, BRTN)
\_SB_.PCI0.IGD0.DD01 (_ADR, _DCS, _DGS, _DSS)
\_SB_.PCI0.IGD0.DD02 (_ADR, _DCS, _DGS, _DSS)
\_SB_.PCI0.IGD0.DD03 (_ADR, _DCS, _DGS, _DSS)
\_SB_.PCI0.IGD0.DD04 (_ADR, _DCS, _DGS, _DSS, _BCL, _BCM, _BQC)
\_SB_.PCI0.IGD0.DD05 (_ADR, _DCS, _DGS, _DSS)
\_SB_.PCI0.EXP1 (_ADR, P1CS, ABP1, PDC1, PDS1, PSP1, HPCS, PMCS, _PRW, _PRT)
\_SB_.PCI0.EXP1.PXS1 (_ADR, X1CS, X1DV, _RMV)
\_SB_.PCI0.EXP2 (_ADR, P2CS, ABP2, PDC2, PDS2, PSP2, HPCS, PMCS, _PRW, _PRT)
\_SB_.PCI0.EXP2.PXS2 (_ADR, X2CS, X2DV, _RMV)
\_SB_.PCI0.EXP3 (_ADR, P3CS, ABP3, PDC3, PDS3, PSP3, HPCS, PMCS, _PRW, _PRT)
\_SB_.PCI0.EXP3.PXS3 (_ADR, X3CS, X3DV, _RMV)
\_SB_.PCI0.EXP4 (_ADR, P4CS, ABP4, PDC4, PDS4, PSP4, HPCS, PMCS, _PRW, _PRT)
\_SB_.PCI0.EXP4.PXS4 (_ADR, X4CS, X4DV, _RMV)
\_SB_.PCI0.PCIB (_ADR, _PRW, _PRT)
\_SB_.PCI0.LPC0 (_ADR, DVEN, TCOI, SCIS, DECD, MMTO, MTSE, GPOX, IO27, LV27, BL27, PIRX, PIRA, PIRB, PIRC, PIRD, PIRY, PIRE, PIRF, PIRG, PIRH, ELR0, PBLV, ELSS, ELST, ELPB, ELLO, ELGN, ELYL, ELBE, ELIE, ELSN, ELOC, ELSO, ROUT, GPI0, GPI1, GPI2, GPI3, GPI4, GPI5, GPI6, GPI7, GPI8, GPI9, GP10, GP11, GP12, GP13, GP14, GP15, PMIO, GPES, GPEE, REGS, PMBA, GPBA)
\_SB_.PCI0.LPC0.H_EC (_HID, _UID, _CRS, ECR_, SPTR, SSTS, SADR, SCMD, SBFR, SCNT, B1EX, ACEX, SWBE, DCBE, WLST, LIDS, B1ST, BRIT, B1RP, B1RA, B1PR, B1VO, B1DA, B1DF, B1DV, B1DL, CTMP, TIST, B1TI, B1SE, B1CR, B1TM, _REG, _GPE, _Q51, _Q52, _Q53, _Q54, _Q55, _Q5B, _Q5D, _Q5E, _Q5F, _Q60, _Q61, _Q63, _Q64, _Q65, _Q66, _Q68, _Q69, _Q70, _Q73, _Q76, _Q77, _Q79, _Q7A, _Q7D, _Q7E, _Q7F, _Q80)
\_SB_.PCI0.LPC0.H_EC.BAT1 (_HID, _UID, BATI, _BIF, STAT, _BST, _STA, _PCL)
\_SB_.PCI0.LPC0.MBRD (_HID, _UID, RSRC, _CRS)
\_SB_.PCI0.LPC0.DMAC (_HID, _CRS)
\_SB_.PCI0.LPC0.MATH (_HID, _CRS)
\_SB_.PCI0.LPC0.PIC_ (_HID, _CRS)
\_SB_.PCI0.LPC0.RTC_ (_HID, BUF0, BUF1, _CRS)
\_SB_.PCI0.LPC0.SPKR (_HID, _CRS)
\_SB_.PCI0.LPC0.TIMR (_HID, BUF0, BUF1, _CRS)
\_SB_.PCI0.LPC0.TPM_ (_HID, _CID, _UID, _STA, BUF0, BUF1, BUF2, _CRS, UCMP, _DSM)
\_SB_.PCI0.LPC0.KBC0 (_HID, _CRS)
\_SB_.PCI0.LPC0.MSE0 (_HID, _STA, _CRS)
\_SB_.PCI0.LPC0.MSE1 (_HID, _STA, _CID, _CRS)
\_SB_.PCI0.LPC0.LNKA (_HID, _UID, _PRS, RSRC, _DIS, _CRS, _SRS, _STA)
\_SB_.PCI0.LPC0.LNKB (_HID, _UID, _PRS, RSRC, _DIS, _CRS, _SRS, _STA)
\_SB_.PCI0.LPC0.LNKC (_HID, _UID, _PRS, RSRC, _DIS, _CRS, _SRS, _STA)
\_SB_.PCI0.LPC0.LNKD (_HID, _UID, _PRS, RSRC, _DIS, _CRS, _SRS, _STA)
\_SB_.PCI0.LPC0.LNKE (_HID, _UID, _PRS, RSRC, _DIS, _CRS, _SRS, _STA)
\_SB_.PCI0.LPC0.LNKF (_HID, _UID, _PRS, RSRC, _DIS, _CRS, _SRS, _STA)
\_SB_.PCI0.LPC0.LNKG (_HID, _UID, _PRS, RSRC, _DIS, _CRS, _SRS, _STA)
\_SB_.PCI0.LPC0.LNKH (_HID, _UID, _PRS, RSRC, _DIS, _CRS, _SRS, _STA)
\_SB_.PCI0.LPC0.FWH_ (_HID, _CRS)
\_SB_.PCI0.IDE1 (_ADR, IDEP, PCMD, IDES, SCMD, IDEC, PRIT, SECT, PSIT, SSIT, SDMA, SDT0, SDT1, SDT2, SDT3, ICR0, ICR1, ICR2, ICR3, ICR4, ICR5, IDE1, MAP_, PCS_, PBIO, PBSY, SBIO, SBSY, BSSP, CTYP)
\_SB_.PCI0.IDE1.PRID (_ADR, _GTM, _STM, _PS0, _PS3)
\_SB_.PCI0.IDE1.PRID.P_D0 (_ADR, _GTF)
\_SB_.PCI0.IDE1.PRID.P_D1 (_ADR, _GTF)
\_SB_.PCI0.IDE1.SECD (_ADR, _GTM, _STM, _PS0, _PS3)
\_SB_.PCI0.IDE1.SECD.S_D0 (_ADR, _GTF)
\_SB_.PCI0.IDE1.SECD.S_D1 (_ADR, _GTF)
\_SB_.PCI0.SMBS (_ADR)
\_SB_.PCI0.USB1 (_ADR, USBO, RSEN, UPRW, _PSW, _S3D, _S4D)
\_SB_.PCI0.USB1.RHUB (_ADR)
\_SB_.PCI0.USB1.RHUB.PRT2 (_ADR, _UPC)
\_SB_.PCI0.USB2 (_ADR, USBO, RSEN, UPRW, _PSW, _S3D, _S4D)
\_SB_.PCI0.USB2.RHUB (_ADR)
\_SB_.PCI0.USB2.RHUB.PRT2 (_ADR, _UPC)
\_SB_.PCI0.USB3 (_ADR, USBO, RSEN, UPRW, _PSW, _S3D, _S4D)
\_SB_.PCI0.USB3.RHUB (_ADR)
\_SB_.PCI0.USB3.RHUB.PRT2 (_ADR, _UPC)
\_SB_.PCI0.USB4 (_ADR, USBO, RSEN, EPRW, _PSW, _S3D, _S4D)
\_SB_.PCI0.USB4.RHUB (_ADR)
\_SB_.PCI0.USB4.RHUB.PRT2 (_ADR, _UPC)
\_SB_.PCI0.EUSB (_ADR, UPRW, _S3D, _S4D)
\_SB_.PCI0.EUSB.RHUB (_ADR)
\_SB_.PCI0.EUSB.RHUB.PRT2 (_ADR, _UPC)
\_SB_.PCI0.EUSB.RHUB.PRT4 (_ADR, _UPC)
\_SB_.PCI0.EUSB.RHUB.PRT6 (_ADR, _UPC)
\_SB_.PCI0.EUSB.RHUB.PRT8 (_ADR, _UPC)
\_SB_.ADP1 (_HID, _PSR, _PCL, _STA)
\_SB_.LID0 (_HID, _LID)
\_SB_.PWRB (_HID, _PRW)
\_SB_.SLPB (_HID)
\_TZ_ (TZ00)
\_PR_.CPU0 (_TPC, _PTC, TSSI, TSSM, TSSF, _TSS, _TSD, HI0_, HC0_, _PDC, _OSC)
\_PR_.CPU1 (_TPC, _PTC, _TSS, _TSD, HI1_, HC1_, _PDC, _OSC)
\_PR_.CPU2 (_TPC, _PTC, _TSS, _TSD, HI2_, HC2_, _PDC, _OSC)
\_PR_.CPU3 (_TPC, _PTC, _TSS, _TSD, HI3_, HC3_, _PDC, _OSC)
I use ACPICA, and it should identify itself as Windows.