    x86/process/64: Move cpu_current_top_of_stack out of TSS · 1591584e
    Lai Jiangshan authored
    
    
    cpu_current_top_of_stack is currently stored in TSS.sp1. TSS is exposed
    through the cpu_entry_area which is visible with user CR3 when PTI is
    enabled and active.
    
    This makes it an attractive target for attackers.  An attacker who reads
    the kernel stack top from it can use that address as the starting point
    for further attacks on the kernel stack.
    
    But it does not actually need to be stored in the TSS.  It is only
    accessed after the entry code has switched to kernel CR3 and kernel
    GS_BASE, which means it can be in any regular percpu variable.
    
    It lives in the TSS for historical (pre-PTI) reasons: the TSS is also
    used as scratch space in SYSCALL_64 and is therefore cache hot.
    
    A syscall also needs the per CPU variable current_task and eventually
    __preempt_count, so placing cpu_current_top_of_stack next to them makes it
    likely that they end up in the same cache line which should avoid
    performance regressions. This is not enforced as the compiler is free to
    place these variables, so these entry relevant variables should move into
    a data structure to make this enforceable.
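
To illustrate the last point, here is a minimal user-space sketch of how grouping the entry-relevant variables into one aligned structure could make the cache-line placement checkable at compile time. It is not code from this commit: the names entry_hot_data, CACHE_LINE_SIZE, cache_line_of() and entry_fields_share_line() are hypothetical, the 64-byte line size is an assumption, and the aligned attribute is a GCC/Clang extension.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

/* Assumption: 64-byte cache lines, as on typical x86-64 parts. */
#define CACHE_LINE_SIZE 64

struct task;  /* opaque stand-in for the kernel's task_struct */

/*
 * Hypothetical container for the variables the syscall entry path
 * touches. Aligning the struct to a cache line means the members
 * start at the beginning of a line instead of wherever the compiler
 * or linker happens to place independent variables.
 */
struct entry_hot_data {
    struct task  *current_task;
    int           preempt_count;
    unsigned long top_of_stack;
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* The build fails if the hot members ever outgrow one cache line. */
static_assert(sizeof(struct entry_hot_data) <= CACHE_LINE_SIZE,
              "entry hot data must fit in a single cache line");

/* Index of the cache line that a byte offset falls into. */
static inline size_t cache_line_of(size_t offset)
{
    return offset / CACHE_LINE_SIZE;
}

/* Returns 1 if the first and last hot members share a cache line. */
static inline int entry_fields_share_line(void)
{
    size_t first = cache_line_of(offsetof(struct entry_hot_data,
                                          current_task));
    size_t last  = cache_line_of(offsetof(struct entry_hot_data,
                                          top_of_stack)
                                 + sizeof(unsigned long) - 1);
    return first == last;
}
```

With separate variables, adjacency is only likely. With a structure like this, a layout change that splits the members across lines would be caught by the static assertion rather than by a benchmark.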
    
    The seccomp_benchmark shows no performance loss in the "getpid native"
    test result.  The result actually changes from 93ns before to 92ns with
    this change when KPTI is disabled.  The test is very stable, and although
    it does not offer a higher degree of precision, it gives enough
    confidence that moving cpu_current_top_of_stack does not cause a
    regression.
    
    [ tglx: Removed unneeded export. Massaged changelog ]
    
    Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lore.kernel.org/r/20210125173444.22696-2-jiangshanlai@gmail.com