Hi all,
I am wondering whether the NOP instruction on x86 is issued on the processor bus, or whether it never leaves the CPU core.
Also, is there any indication of how long the NOP instruction takes, in cycles?
Lastly, if NOP is actually issued on the processor bus, is there any instruction I can execute that performs no operation and is not issued on the processor bus (i.e. stays inside the CPU core itself)?
Thanks a lot in advance.
Is NOP instruction on x86 issued on the processor bus?
- Combuster
Re: Is NOP instruction on x86 issued on the processor bus?
Undefined,
undefined,
undefined.
If you want a specific number of cycles, tune a function against the TSC for small cycle counts, or use a timer for longer durations. The behaviour of individual instructions is only defined with respect to their visible effects. A NOP is allowed to stall for two minutes if the chip designer wants, and still be completely correct.
The architecture does not define the bus, and bus silence cannot be proven. In practice, HLT is the closest you can get.
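As a rough illustration of "tuning against the TSC", here is a minimal sketch of a busy-wait that spins until a given number of counter ticks has passed. The names `read_ticks` and `spin_for_ticks` are made up for this example, the `__rdtsc()` intrinsic assumes GCC or Clang on x86, and the non-x86 branch is only a stand-in counter so the sketch still compiles off-target. Ticks are not a fixed unit of time unless the TSC is invariant.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the time-stamp counter (GCC/Clang intrinsic, x86 only). */
uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Stand-in for non-x86 hosts: a software counter that advances per call. */
uint64_t read_ticks(void) { static uint64_t n; return ++n; }
#endif

/* Hypothetical helper: spin until at least `ticks` counter ticks have
 * elapsed. Returns the number of ticks actually waited. */
uint64_t spin_for_ticks(uint64_t ticks, uint64_t (*read_counter)(void))
{
    uint64_t start = read_counter();
    uint64_t elapsed = 0;

    while (elapsed < ticks) {
#if defined(__x86_64__) || defined(__i386__)
        /* PAUSE is a spin-loop hint; it has no architectural effect. */
        __asm__ volatile("pause" ::: "memory");
#endif
        /* Unsigned subtraction stays correct if the counter wraps. */
        elapsed = read_counter() - start;
    }
    return elapsed;
}
```

Calling `spin_for_ticks(1000, read_ticks)` waits for at least 1000 ticks; how long that is in wall-clock time depends on the counter's frequency on the particular CPU.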
Re: Is NOP instruction on x86 issued on the processor bus?
Hi,
limp wrote: I am wondering if the NOP instruction on x86 is issued on the processor bus or it's not even issued outside the CPU core.
In general, most instructions only cause bus activity as a (typically unwanted but unavoidable) side effect (e.g. cache miss, TLB fetch, etc). This includes instructions with a LOCK prefix, as modern 80x86 CPUs will try to do that inside the cache without any bus activity if possible.
The only instructions that deliberately (rather than accidentally) cause bus activity are the IO port instructions (e.g. IN, OUT) and reads and/or writes to physical addresses that aren't using the "write-back" cache type (e.g. areas configured as "uncacheable" in the MTRRs).
limp wrote: Also, is there any indication in terms of the duration of the NOP instruction in cycles?
No. In some CPUs it can be effectively zero cycles (e.g. discarded by the instruction decoder and never sent to any pipeline or stored in a trace cache), depending on where your bottleneck is (e.g. if the bottleneck is instruction fetch, then any instruction will cost something). In other CPUs (especially older CPUs) it may cost one or more cycles.
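Since the cost is implementation-specific, about the only way to get a figure for a particular chip is to measure it. The following is a rough sketch, not a rigorous benchmark: it times a loop of NOPs against the TSC and keeps the smallest result over several trials. The function names are invented for this example, `__rdtsc()` assumes GCC or Clang on x86, and the non-x86 branch is just a stand-in counter. The result includes loop overhead and RDTSC is not a serializing instruction, so treat the number as an upper bound at best.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the time-stamp counter (GCC/Clang intrinsic, x86 only). */
uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Stand-in for non-x86 hosts: a software counter that advances per call. */
uint64_t read_ticks(void) { static uint64_t n; return ++n; }
#endif

/* Execute `count` NOPs; `volatile` keeps the compiler from dropping them. */
void run_nops(uint32_t count)
{
    for (uint32_t i = 0; i < count; i++)
        __asm__ volatile("nop");
}

/* Hypothetical helper: smallest observed tick count for a run of `count`
 * NOPs over `trials` attempts. Returns UINT64_MAX if `trials` <= 0. */
uint64_t min_ticks_for_nops(uint32_t count, int trials)
{
    uint64_t best = UINT64_MAX;

    for (int t = 0; t < trials; t++) {
        uint64_t start = read_ticks();
        run_nops(count);
        uint64_t elapsed = read_ticks() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Taking the minimum rather than the average is a deliberate choice here: interrupts and cache misses can only make a trial slower, never faster.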
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Is NOP instruction on x86 issued on the processor bus?
Thanks a lot guys for your replies.
Brendan wrote: In general, most instructions only cause bus activity as a (typically unwanted but unavoidable) side effect (e.g. cache miss, TLB fetch, etc). This includes instructions with a LOCK prefix, as modern 80x86 CPUs will try to do that inside the cache without any bus activity if possible. The only instructions that deliberately (rather than accidentally) cause bus activity are the IO port instructions (e.g. IN, OUT) and reads and/or writes to physical addresses that aren't using the "write-back" cache type (e.g. areas configured as "uncacheable" in the MTRRs).
That's quite clear, Brendan, thanks. However, I need to ensure somehow that I execute an instruction that certainly doesn't cause any bus activity.
I guess instructions like CPUID, RDTSC or UD2 (Undefined Instruction) shouldn't issue anything on the bus (all their effects should be internal to the CPU). Any comments on that? I find it a significant omission on Intel's part not to mention this in their instruction set reference manual.
Brendan wrote: No. In some CPUs it can be effectively zero cycles (e.g. discarded by the instruction decoder and never sent to any pipeline or stored in a trace cache); depending on where your bottleneck is (e.g. if the bottleneck is instruction fetch, then any instruction will cost something). In other CPUs (especially older CPUs) it may cost one or more cycles.
OK, so we don't know for sure. My target is an Intel Atom 330, by the way.
Thanks again guys.
Re: Is NOP instruction on x86 issued on the processor bus?
limp wrote: I need to ensure somehow that I execute an instruction that certainly doesn't cause any bus activity.
Would you care to tell us why?
Every good solution is obvious once you've found it.
Re: Is NOP instruction on x86 issued on the processor bus?
Solar wrote: Would you care to tell us why?
I just want to explore whether that is a more efficient way of delaying, one that doesn't affect the other cores by using bus bandwidth.
Re: Is NOP instruction on x86 issued on the processor bus?
Hi,
limp wrote: I just want to explore if that is a more efficient way for delaying without affecting the other cores by using any bus bandwidth.
It's not.
CPUs don't run at a fixed speed - a CPU's speed will vary due to things like hyper-threading, turbo-boost, thermal throttling and power management. You must use a proper time source.
A proper time source may include the CPU's TSC (but only for newer CPUs that have the "invariant TSC" feature flag set in CPUID), local APIC timer (but be careful of certain sleep states), HPET, ACPI timer, PIT, RTC, etc.
Of these, the TSC and local APIC timer shouldn't involve bus activity. The rest all involve IO ports and/or reading from "uncacheable" parts of the physical address space.
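For reference, the "invariant TSC" flag mentioned above is reported in CPUID leaf 0x80000007, EDX bit 8. A minimal sketch of the check follows; the function names are invented for this example, and `__get_cpuid` from `<cpuid.h>` is a GCC/Clang helper rather than anything the architecture defines. On non-x86 builds the sketch simply reports that the feature is absent.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header, x86 only */
#endif

/* Invariant TSC is bit 8 of EDX from CPUID leaf 0x80000007. */
bool edx_has_invariant_tsc(uint32_t edx)
{
    return ((edx >> 8) & 1u) != 0;
}

/* Hypothetical helper: true if this CPU reports an invariant TSC. */
bool cpu_has_invariant_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx))
        return false;
    return edx_has_invariant_tsc(edx);
#else
    return false;
#endif
}
```

If `cpu_has_invariant_tsc()` returns false, the TSC rate may change with power states, and one of the other time sources listed above is the safer choice.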
For longer delays (delays greater than maybe 1 ms) you want something like "sleep()", where the scheduler doesn't give the task any CPU time and executes other tasks until the delay expires. You don't want to waste CPU time on "while(get_time() < start_time + six_hours) {}".
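The split between short and long delays could be sketched like this. It is a user-space illustration only, assuming a hosted POSIX environment (`clock_gettime`, `nanosleep`); inside your own kernel the sleeping branch would call whatever your scheduler provides. The names `should_sleep`, `now_ns` and `delay_ns`, and the exact 1 ms threshold, are assumptions for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Roughly 1 ms, following the rule of thumb above (an assumed cut-off). */
#define SPIN_THRESHOLD_NS 1000000ull

/* Long delays should give the CPU away; short ones may spin. */
int should_sleep(uint64_t ns)
{
    return ns > SPIN_THRESHOLD_NS;
}

/* Current monotonic time in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: delay for at least `ns` nanoseconds. */
void delay_ns(uint64_t ns)
{
    if (should_sleep(ns)) {
        struct timespec req;
        req.tv_sec = (time_t)(ns / 1000000000ull);
        req.tv_nsec = (long)(ns % 1000000000ull);
        /* If a signal interrupts the sleep, continue with the remainder. */
        while (nanosleep(&req, &req) == -1 && errno == EINTR) {
        }
    } else {
        /* Short delay: poll a real time source instead of counting loops. */
        uint64_t start = now_ns();
        while (now_ns() - start < ns) {
        }
    }
}
```

For example, `delay_ns(500)` spins briefly, while `delay_ns(5000000)` blocks in `nanosleep` so other tasks can run in the meantime.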
Cheers,
Brendan
Re: Is NOP instruction on x86 issued on the processor bus?
Cheers Brendan!