How do SYSCALL/SYSRET instructions perform across x86 CPUs?
SYSCALL and SYSRET (and their 32-bit-only Intel counterparts SYSENTER and SYSEXIT) are usually described as a "generally faster" way to enter and exit supervisor mode on x86 processors than call gates or software interrupts, but the exact figures underlying that claim remain largely undocumented. In particular, none of the Intel or AMD optimization guides I have been able to find mention these instructions at all. So:
- How many cycles (estimated) do SYSCALL and SYSRET take across recent Intel 64 microarchitectures? This is measurable by direct experimentation, but there are quite a few different CPUs to test.
Depending on the order of magnitude of that number, more detailed questions may be relevant:
- Do they incur a complete pipeline stall, or some other kind of stall?
- How, if at all, do they interact with branch prediction (e.g. the return stack buffer) and fetch logic?
- What are their latencies, data dependencies, and serialization properties?
- etc.
Assume 64-bit code on the userspace side, no additional address-space switches (writes to CR3), and matching SYSCALL/SYSRET pairs, if that matters.
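For concreteness, the direct-experimentation approach could look roughly like the sketch below. It assumes Linux on x86-64 with GCC or Clang, and the helper names (raw_getpid, tsc_now, min_ticks) are made up for illustration. Note what it does and does not measure: the result is a full user-to-kernel round trip in TSC reference ticks, so it includes the kernel's entry/exit path and any mitigations, and it is only an upper bound on the cost of the SYSCALL and SYSRET instructions themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>  /* SYS_getpid */
#include <x86intrin.h>    /* __rdtscp, _mm_lfence (GCC/Clang) */

/* Issue getpid with a raw SYSCALL instruction, bypassing libc.
 * SYSCALL saves RIP in RCX and RFLAGS in R11, so both are clobbered. */
static inline long raw_getpid(void)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"((long)SYS_getpid)
                      : "rcx", "r11", "memory");
    return ret;
}

/* Read the TSC. RDTSCP waits for earlier instructions to execute;
 * the LFENCE keeps later instructions from starting before the read. */
static inline uint64_t tsc_now(void)
{
    unsigned aux;
    uint64_t t = __rdtscp(&aux);
    _mm_lfence();
    return t;
}

/* Smallest TSC delta seen over `iters` timed regions.
 * do_syscall = 1: time one getpid round trip per region.
 * do_syscall = 0: time an empty region (timer overhead baseline).
 * Taking the minimum is an assumption meant to filter out
 * interrupts, cache misses and other one-off disturbances. */
uint64_t min_ticks(size_t iters, int do_syscall)
{
    assert(iters > 0);
    uint64_t best = UINT64_MAX;
    for (size_t i = 0; i < iters; i++) {
        uint64_t t0 = tsc_now();
        if (do_syscall)
            (void)raw_getpid();
        uint64_t t1 = tsc_now();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

Calling min_ticks(1000000, 1) and subtracting min_ticks(1000000, 0) gives a rough round-trip figure for one machine. Converting TSC ticks to core clock cycles would additionally require pinning the thread and knowing the ratio between the TSC frequency and the actual core frequency.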