
Write Through

Posted: Thu Jun 16, 2005 1:59 pm
by dc0d32
How does the so called "Write through" technique work? Does it affect L2 alone or both L1 and L2 caches?

Re:Write Through

Posted: Thu Jun 16, 2005 2:36 pm
by mystran
Write through stores writes in both the cache and real memory at once. Write back stores writes in the cache only, and syncs them to real memory when the cache line is discarded.

So write through is useful when you want to get your writes to real memory when you perform them. Good idea for things like framebuffers, audio buffers.

Performance of Write-back is usually superior.

AFAIK cache settings affect both L1 and L2 cache.
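
To make the difference concrete, here's a rough toy model (not how any real CPU is built) that just counts how many writes reach "real memory" under each policy. The names (toy_cache, toy_write, toy_flush) and the 4-line direct-mapped layout are made up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LINES 4   /* hypothetical size, direct mapped, one address per line */

enum toy_policy { TOY_WRITE_THROUGH, TOY_WRITE_BACK };

struct toy_line {
    bool valid;
    bool dirty;       /* only ever set in write-back mode */
    uint32_t addr;
};

struct toy_cache {
    enum toy_policy policy;
    struct toy_line lines[TOY_LINES];
    unsigned mem_writes;   /* writes that actually reached memory */
};

void toy_init(struct toy_cache *c, enum toy_policy policy)
{
    assert(c != NULL);
    c->policy = policy;
    c->mem_writes = 0;
    for (size_t i = 0; i < TOY_LINES; i++) {
        c->lines[i].valid = false;
        c->lines[i].dirty = false;
        c->lines[i].addr = 0;
    }
}

/* Discard a line; a dirty line must be written to memory first. */
static void toy_evict(struct toy_cache *c, struct toy_line *l)
{
    if (l->valid && l->dirty)
        c->mem_writes++;
    l->valid = false;
    l->dirty = false;
}

void toy_write(struct toy_cache *c, uint32_t addr)
{
    struct toy_line *l = &c->lines[addr % TOY_LINES];

    if (!l->valid || l->addr != addr) {   /* miss: replace the old line */
        toy_evict(c, l);
        l->valid = true;
        l->addr = addr;
    }
    if (c->policy == TOY_WRITE_THROUGH)
        c->mem_writes++;                  /* memory is updated on every write */
    else
        l->dirty = true;                  /* memory is updated later, on discard */
}

/* Discard everything, roughly what WBINVD does for a write-back cache. */
void toy_flush(struct toy_cache *c)
{
    for (size_t i = 0; i < TOY_LINES; i++)
        toy_evict(c, &c->lines[i]);
}
```

Ten writes to the same address cost ten memory writes with write through, but only one (at discard time) with write back, which is where the performance difference comes from.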

Re:Write Through

Posted: Thu Jun 16, 2005 10:26 pm
by Brendan
Hi,

It's possibly worth pointing out that there are 6 different caching methods the CPU can use:

Strong uncacheable
Uncacheable
Write combining
Write through
Write back
Write protected

Write combining is where writes go into a special CPU buffer and are sent out in larger chunks - typically used for video display memory.

The method used for any one access is determined by combining the MTRR and the PAT entry for the page.

MTRR stands for Memory Type Range Registers - MSRs that apply to physical addresses. The Page Attribute Table is configured through an MSR too, but the index into this table comes from page table entries. On older CPUs there are 2 cache control flags (PCD and PWT) in the page table entries instead.

In general the MTRRs set the highest usable caching type, and the PAT entry reduces it - for example, if the MTRRs say an access uses write back then the PAT entry can change the access to any caching type, but if the MTRRs say an access is uncacheable then the PAT entry can't override this (although write combining is allowed).

Then there are some global flags - CD and NW in CR0 act as a "master switch" for all caching, while PCD and PWT in CR3 only control how the CPU caches the page directory itself (they don't override the settings for the pages it maps).

Mystran is right too - the final setting affects all CPU caches (except the TLB, which is a completely different story).


Cheers,

Brendan

Re:Write Through

Posted: Fri Jun 17, 2005 12:45 am
by dc0d32
thanks
then how do i enable it for the linear framebuffer [i486+] ?
thanks

Re:Write Through

Posted: Fri Jun 17, 2005 1:27 am
by Brendan
Hi,
prashant wrote: then how do i enable it for the linear framebuffer [i486+] ?
To set a LFB to write combining, you'd need to set the MTRRs and PAT entry to one of the following combinations:

MTRR    PAT
UC      WC
WC      UC (not strong uncacheable)
WC      WC
WC      WB
WT      WC
WB      WC
WP      WC

Where:
UC = uncacheable
WC = write combining
WT = write through
WB = write back
WP = write protected
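
If it helps, the table above can be folded into a small check. This is only a sketch that encodes the seven listed combinations, not the CPU's full effective-type table; the enum and function names are my own, and MT_UC_MINUS stands for the "UC (not strong uncacheable)" PAT type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; these are NOT the hardware encodings. */
enum mem_type { MT_UC, MT_UC_MINUS, MT_WC, MT_WT, MT_WB, MT_WP, MT_COUNT };

/* True if this MTRR type + PAT type pair is one of the listed
 * combinations that results in write combining. */
bool yields_write_combining(enum mem_type mtrr, enum mem_type pat)
{
    assert((int)mtrr < MT_COUNT && (int)pat < MT_COUNT);

    if (mtrr == MT_UC_MINUS)
        return false;          /* UC- exists in the PAT only, not in MTRRs */
    if (pat == MT_WC)
        return true;           /* PAT = WC works with every MTRR type */
    if (mtrr == MT_WC)
        return pat == MT_UC_MINUS || pat == MT_WB;
    return false;
}
```

For example, yields_write_combining(MT_UC, MT_WC) is true (the "leave the MTRRs alone and use the PAT" case), while yields_write_combining(MT_WB, MT_WB) is false.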

The PAT itself was introduced with the Pentium III processor, and defaults to "compatibility" values, so you'd need to modify the PAT MSR and the page table entries to set a page to write combining. The compatibility values are used because the PAT can't be disabled. The index into the PAT comes from the PAT bit (bit 7 in a 4 KB page table entry, bit 12 in a page directory entry that maps a large page) and the old PCD and PWT flags in the same entry. For older CPUs the PCD and PWT flags in the page table entry can be used to select uncacheable, write through or write back only.
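
A minimal sketch of the bit fiddling involved, assuming the layout from the Intel manuals as I understand it: one 64-bit PAT MSR (IA32_PAT) with 8 one-byte entries, and for 4 KB pages an index of PAT:PCD:PWT taken from bits 7, 4 and 3 of the page table entry. The helper names are made up, and actually loading the MSR (WRMSR in ring 0) is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Cache control bits in a 4 KB page table entry. */
#define PTE_PWT     (1ULL << 3)
#define PTE_PCD     (1ULL << 4)
#define PTE_PAT_4K  (1ULL << 7)   /* reserved on CPUs without PAT */

/* Memory type encodings used inside the PAT MSR. */
#define PAT_UC        0x00
#define PAT_WC        0x01
#define PAT_WT        0x04
#define PAT_WP        0x05
#define PAT_WB        0x06
#define PAT_UC_MINUS  0x07

/* "Compatibility" value: WB, WT, UC-, UC, repeated twice. */
#define PAT_COMPAT_DEFAULT 0x0007040600070406ULL

/* Which of the 8 PAT entries does this 4 KB PTE select? */
unsigned pat_index_from_pte(uint64_t pte)
{
    return ((pte & PTE_PAT_4K) ? 4u : 0u) |
           ((pte & PTE_PCD)    ? 2u : 0u) |
           ((pte & PTE_PWT)    ? 1u : 0u);
}

/* Return the PTE with its cache control bits changed to select 'index'. */
uint64_t pte_with_pat_index(uint64_t pte, unsigned index)
{
    assert(index < 8);
    pte &= ~(PTE_PAT_4K | PTE_PCD | PTE_PWT);
    if (index & 4) pte |= PTE_PAT_4K;
    if (index & 2) pte |= PTE_PCD;
    if (index & 1) pte |= PTE_PWT;
    return pte;
}

/* Return a PAT MSR value with entry 'index' replaced by 'type'. */
uint64_t pat_set_entry(uint64_t pat, unsigned index, uint8_t type)
{
    assert(index < 8);
    unsigned shift = index * 8;
    pat &= ~(0xFFULL << shift);
    pat |= (uint64_t)type << shift;
    return pat;
}
```

One possible choice is to put WC in entry 4, e.g. pat_set_entry(PAT_COMPAT_DEFAULT, 4, PAT_WC), so entries 0 to 3 keep their compatibility meaning for pages that only use PCD and PWT, and then map the LFB pages with pte_with_pat_index(pte, 4).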

The MTRRs were introduced with the P6 family of processors but you should check CPUID to see if they're supported or not. Once you know MTRRs are supported you'd need to check a flag (bit 10) in the "IA32_MTRRCAP" MSR (MTRR capabilities) to see if write combining is also supported (IIRC write combining is supported on all Intel CPUs that support MTRRs, but other CPU manufacturers may differ).
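
Roughly, the checks look like this. It's a sketch that only decodes raw register values, assuming you've already fetched EDX from CPUID function 1 and (if MTRRs exist) read IA32_MTRRCAP, which is MSR 0xFE, with RDMSR in ring 0. The struct and function names are hypothetical; the bit positions are the ones I know from the Intel manuals (MTRR = EDX bit 12, PAT = EDX bit 16).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPUID1_EDX_MTRR    (1u << 12)
#define CPUID1_EDX_PAT     (1u << 16)

#define MSR_IA32_MTRRCAP   0xFEu          /* MSR number, for RDMSR */
#define MTRRCAP_VCNT_MASK  0xFFULL        /* number of variable range MTRRs */
#define MTRRCAP_WC         (1ULL << 10)   /* write combining supported */

struct cache_features {
    bool has_mtrr;
    bool has_pat;
    bool has_wc;               /* WC memory type available through MTRRs */
    unsigned variable_ranges;
};

/* Decode raw CPUID.1:EDX and IA32_MTRRCAP values. 'mtrrcap' is ignored
 * when CPUID says there are no MTRRs, because the MSR doesn't exist then
 * and the caller must not try to read it. */
void decode_cache_features(uint32_t cpuid1_edx, uint64_t mtrrcap,
                           struct cache_features *out)
{
    assert(out != NULL);
    out->has_mtrr = (cpuid1_edx & CPUID1_EDX_MTRR) != 0;
    out->has_pat  = (cpuid1_edx & CPUID1_EDX_PAT) != 0;
    out->has_wc   = out->has_mtrr && (mtrrcap & MTRRCAP_WC) != 0;
    out->variable_ranges =
        out->has_mtrr ? (unsigned)(mtrrcap & MTRRCAP_VCNT_MASK) : 0;
}
```

On a 486 or Pentium you'd get has_mtrr and has_pat both false, which is the "fall back to PCD/PWT" case.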

This creates a painful situation - for older CPUs there is no write combining, for P6 you need to set MTRRs, and for Pentium III and above you can leave the MTRRs set to uncacheable and just use the page table entries and PAT.

There's also some caution needed. If the same physical area can be accessed via different page tables with different PAT values the CPU can get messed up (make sure each physical page always uses the same caching method). You also need to be careful if you're dealing with multi-CPUs (and hyper-threading) - in this case all CPUs must have the same MTRR values and the same PAT values, and you need to make sure each physical page always uses the same caching method on all CPUs (otherwise the MP cache snooping fails). For hyper-threading the same MTRR and PAT MSRs are shared on current implementations (but this is an implementation defined thing and may change, so it's best/easiest to ignore this and do changes on each logical CPU anyway).

This means when changing the caching, the safest sequence is to use CD in CR0 to disable all caching, then issue a WBINVD to flush any caches, then change the MTRR and/or PAT MSRs, then enable caching again with CD in CR0. For multi-CPU this sequence should be followed by all CPUs, and synchronized to ensure consistency (i.e. synchronize, disable caching and do WBINVD on all CPUs, synchronize, make changes, synchronize, then re-enable caching on all CPUs). In any case (even with single CPU) it's probably advisable (but not necessarily required) to disable interrupts during this sequence...
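
For a single CPU, the sequence above could be sketched like this. The privileged operations are passed in as function pointers because how you reach CR0, WBINVD and the interrupt flag depends on your kernel; struct cache_hw and every name in it are hypothetical. The only hard fact used is that CD is bit 30 of CR0. The multi-CPU synchronization points are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CR0_CD ((uintptr_t)1 << 30)   /* cache disable */

uintptr_t cr0_caching_off(uintptr_t cr0) { return cr0 | CR0_CD; }
uintptr_t cr0_caching_on(uintptr_t cr0)  { return cr0 & ~CR0_CD; }

/* Hooks your kernel would supply (all hypothetical). */
struct cache_hw {
    uintptr_t (*read_cr0)(void);
    void (*write_cr0)(uintptr_t value);
    void (*wbinvd)(void);        /* write back and invalidate all caches */
    void (*irq_disable)(void);
    void (*irq_restore)(void);   /* put the interrupt flag back as it was */
};

/* Run 'apply' (which writes the MTRR and/or PAT MSRs) with caching
 * turned off, following the order described in the post. */
void change_caching(const struct cache_hw *hw,
                    void (*apply)(void *ctx), void *ctx)
{
    assert(hw != NULL && apply != NULL);

    hw->irq_disable();
    hw->write_cr0(cr0_caching_off(hw->read_cr0()));  /* 1. CD = 1 */
    hw->wbinvd();                                    /* 2. flush caches */
    apply(ctx);                                      /* 3. change MSRs */
    hw->write_cr0(cr0_caching_on(hw->read_cr0()));   /* 4. CD = 0 */
    hw->irq_restore();
}
```

On a multi-CPU system each CPU would run the same steps, with a barrier before step 1, after step 2 and after step 3.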

For older CPUs that don't support write combining, use write through. It'll waste some cache space but it's better than using uncacheable.

Also, if you're writing a video driver and using memory mapped I/O, make sure that you only use write combining (or write through) for display memory and not the control registers (which should be set to uncacheable).


Cheers,

Brendan