Linux 6.18 Lands Retpoline Optimization To Help With Intel E Cores

Author: MoriTsukiKawa
A penguin programmer from Antarctica

The Linux 6.18 merge window is winding down this weekend, with Linux 6.18-rc1 expected on Sunday. Merged today were some remaining x86 core updates, which include a Retpoline optimization patch intended to help Intel E-core CPUs.

Return trampolines (“Retpolines”) are used for Spectre Variant 2 mitigations. Intel engineer Peter Zijlstra landed a patch optimizing the x86 patch_retpoline() code within the kernel. He explains in the patch:

Currently the very common retpoline: “CS CALL __x86_indirect_thunk_r11” is transformed into “CALL *R11; NOP3” for eIBRS/BHI_NO parts.

Similarly, paranoid fineibt has: “CALL *R11; NOP”.

Recognise that CS stuffing can avoid the extra NOP. However, due to prefix decode penalties, make sure to not emit too many CS prefixes. Notably: “CS CALL __x86_indirect_thunk_rax” must not become “CS CS CS CS CALL *RAX”. Prefix decode penalties are typically many more cycles than decoding an extra NOP.

Additionally, if the retpoline is a tail-call, the “JMP *%\reg” should be followed by INT3 for straight-line-speculation mitigation, since emit_indirect() now has a length argument, move this into emit_indirect() such that other users (paranoid-fineibt) also do this.
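
The byte-level effect described in the commit message can be sketched in Python. This is a simplified model for illustration, not the kernel's C code: the emit_indirect() signature used here, the MAX_CS constant, the fallback to a plain NOP when more than three prefixes would be needed, and filling the whole tail of a JMP site with INT3 are all assumptions. Only the instruction encodings (FF /2 for CALL, FF /4 for JMP, 0x2E for the CS prefix, 0x41 for REX.B) are standard x86-64.

```python
CS_PREFIX = 0x2E  # CS segment override, used here purely as padding
REX_B = 0x41      # REX.B prefix, needed to address r8..r15
INT3 = 0xCC       # breakpoint, stops straight-line speculation after JMP
MAX_CS = 3        # assumed prefix budget ("3 prefixes is mostly good")

# Standard NOP encodings for 0..4 bytes of padding.
NOPS = {
    0: b"",
    1: b"\x90",
    2: b"\x66\x90",
    3: b"\x0f\x1f\x00",
    4: b"\x0f\x1f\x40\x00",
}


def emit_indirect(op: str, reg: int, length: int) -> bytes:
    """Model of rewriting a `length`-byte thunk call site into an
    indirect CALL/JMP through register number `reg` (0 = rax, 11 = r11)."""
    if op not in ("call", "jmp"):
        raise ValueError("op must be 'call' or 'jmp'")
    if not 0 <= reg <= 15:
        raise ValueError("reg must be in 0..15")

    core = bytearray()
    if reg >= 8:
        core.append(REX_B)
    # ModRM: mod=3 (register direct), /2 for CALL or /4 for JMP, rm=reg.
    ext = 2 if op == "call" else 4
    core += bytes([0xFF, 0xC0 | (ext << 3) | (reg & 7)])

    excess = length - len(core)
    if excess < 0:
        raise ValueError("call site too short")

    if op == "jmp":
        # Tail-call: nothing after the JMP executes, so pad with INT3.
        return bytes(core) + bytes([INT3]) * excess
    if excess <= MAX_CS:
        # CS stuffing: grow the instruction itself instead of adding a NOP.
        return bytes([CS_PREFIX]) * excess + bytes(core)
    if excess not in NOPS:
        raise ValueError("padding length not modelled")
    # Too many prefixes would be needed: keep the extra NOP instead.
    return bytes(core) + NOPS[excess]


if __name__ == "__main__":
    print(emit_indirect("call", 11, 6).hex(" "))  # CS CS CS CALL *R11
    print(emit_indirect("call", 0, 6).hex(" "))   # CALL *RAX; NOP4
    print(emit_indirect("jmp", 11, 5).hex(" "))   # JMP *R11; INT3; INT3
```

In this model a 6-byte site targeting R11 becomes a single 6-byte instruction (2e 2e 2e 41 ff d3), while the RAX case would need four prefixes and therefore keeps a NOP after the 2-byte CALL.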

The original mailing list post for the patch adds more context:

“Finding the exact prefix decode penalties for uarchs that have eIBRS/BHI_NO is not a fun time. I’ve stuck to the general wisdom that 3 prefixes is mostly good (notably, the instruction at hand has no 0x0f escape which is sometimes counted towards the prefix budget – it can have a REX prefix, but those are generally not counted towards the prefix budget).

In general Intel P-cores do not have prefix decode penalties, but the E-cores (or rather the Atom line) generally does. And since this all runs on hybrid cores, the code must accommodate them.

I hate all this.”
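
The "prefix budget" rule of thumb from the post can be illustrated with a small counter. This is a rough sketch under stated assumptions: the helper names prefix_cost() and within_budget() and the PREFIX_BUDGET constant are hypothetical, and real decode penalties differ between microarchitectures. It counts leading legacy prefixes, skips a REX prefix without charging for it, and optionally charges for a 0x0F escape byte.

```python
# Legacy x86 prefixes: segment overrides, operand/address size, LOCK/REP.
LEGACY_PREFIXES = frozenset({
    0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65,
    0x66, 0x67,
    0xF0, 0xF2, 0xF3,
})
PREFIX_BUDGET = 3  # assumption taken from the "3 prefixes" rule of thumb


def prefix_cost(insn: bytes, count_escape: bool = False) -> int:
    """Count the bytes charged against the prefix budget for one instruction."""
    i = 0
    while i < len(insn) and insn[i] in LEGACY_PREFIXES:
        i += 1
    cost = i
    # A REX prefix (0x40..0x4F in 64-bit mode) is skipped, not counted.
    if i < len(insn) and 0x40 <= insn[i] <= 0x4F:
        i += 1
    # Some decoders reportedly count the 0x0F escape byte as well.
    if count_escape and i < len(insn) and insn[i] == 0x0F:
        cost += 1
    return cost


def within_budget(insn: bytes, count_escape: bool = False) -> bool:
    return prefix_cost(insn, count_escape) <= PREFIX_BUDGET


if __name__ == "__main__":
    stuffed_r11 = bytes.fromhex("2e2e2e41ffd3")  # CS CS CS CALL *R11
    stuffed_rax = bytes.fromhex("2e2e2e2effd0")  # CS CS CS CS CALL *RAX
    print(prefix_cost(stuffed_r11), within_budget(stuffed_r11))
    print(prefix_cost(stuffed_rax), within_budget(stuffed_rax))
```

Under these assumptions the R11 form costs three prefixes and fits the budget, while the four-prefix RAX form does not, which matches the case the commit message says must be avoided.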

That patch was merged to Linux Git today via the x86/core pull, ahead of Linux 6.18-rc1 tomorrow.