
  1. Compile-time stack metadata validation
  2. ======================================
  3. Overview
  4. --------
  5. The kernel CONFIG_STACK_VALIDATION option enables a host tool named
  6. objtool which runs at compile time. It has a "check" subcommand which
  7. analyzes every .o file and ensures the validity of its stack metadata.
  8. It enforces a set of rules on asm code and C inline assembly code so
  9. that stack traces can be reliable.
  10. Currently it only checks frame pointer usage, but there are plans to add
  11. CFI validation for C files and CFI generation for asm files.
  12. For each function, it recursively follows all possible code paths and
  13. validates the correct frame pointer state at each instruction.
  14. It also follows code paths involving special sections, like
  15. .altinstructions, __jump_table, and __ex_table, which can add
  16. alternative execution paths to a given instruction (or set of
  17. instructions). Similarly, it knows how to follow switch statements, for
  18. which gcc sometimes uses jump tables.
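The path-following idea above can be sketched on a toy instruction set. This is only an illustration, not objtool's implementation: the opcodes, the verdict names, and the two-bit frame state are assumptions made for the example, and jump targets are assumed to be valid indices into the instruction array.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction set; every name here is hypothetical, not objtool's. */
enum op { OP_OTHER, OP_PUSH_FP, OP_MOV_SP_FP, OP_POP_FP,
          OP_CALL, OP_JMP, OP_JCC, OP_RET };

struct insn {
    enum op op;
    int target;         /* jump target index, used by OP_JMP and OP_JCC */
};

enum verdict { V_OK, V_CALL_WITHOUT_FRAME, V_RET_WITHOUT_RESTORE,
               V_STATE_MISMATCH, V_FALLS_OFF_END };

#define FP_SAVED  1     /* the old frame pointer has been pushed */
#define FP_SETUP  2     /* the frame pointer points at the new frame */
#define MAX_INSNS 32

/* Follow one code path from index i, carrying the frame pointer state. */
static enum verdict walk(const struct insn *code, int n, int i, int state,
                         int *seen)
{
    while (i >= 0 && i < n) {
        if (seen[i] >= 0)   /* reached before: both states must agree */
            return seen[i] == state ? V_OK : V_STATE_MISMATCH;
        seen[i] = state;

        switch (code[i].op) {
        case OP_PUSH_FP:   state |= FP_SAVED; break;
        case OP_MOV_SP_FP: state |= FP_SETUP; break;
        case OP_POP_FP:    state = 0; break;
        case OP_CALL:
            if (state != (FP_SAVED | FP_SETUP))
                return V_CALL_WITHOUT_FRAME;
            break;
        case OP_RET:
            return state == 0 ? V_OK : V_RET_WITHOUT_RESTORE;
        case OP_JMP:
            i = code[i].target;
            continue;
        case OP_JCC: {
            /* Branch taken: recurse. Branch not taken: fall through. */
            enum verdict v = walk(code, n, code[i].target, state, seen);
            if (v != V_OK)
                return v;
            break;
        }
        case OP_OTHER:
            break;
        }
        i++;
    }
    return V_FALLS_OFF_END;
}

/* Validate one function whose first instruction is code[0]. */
static enum verdict validate(const struct insn *code, int n)
{
    int seen[MAX_INSNS];

    assert(n >= 0 && n <= MAX_INSNS);
    for (int i = 0; i < n; i++)
        seen[i] = -1;       /* -1 marks "not visited yet" */
    return walk(code, n, 0, 0, seen);
}
```

In this sketch, a function that pushes and sets up the frame pointer before its call and pops it before returning validates as V_OK. A call with no frame setup yields V_CALL_WITHOUT_FRAME, and an instruction reachable both with and without a frame yields V_STATE_MISMATCH.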
  19. Why do we need stack metadata validation?
  20. -----------------------------------------
  21. Here are some of the benefits of validating stack metadata:
  22. a) More reliable stack traces for frame pointer enabled kernels
  23. Frame pointers are used for debugging purposes. They allow runtime
  24. code and debug tools to walk the stack to determine the
  25. chain of function call sites that led to the currently executing
  26. code.
  27. For some architectures, frame pointers are enabled by
  28. CONFIG_FRAME_POINTER. For some other architectures they may be
  29. required by the ABI (sometimes referred to as "backchain pointers").
  30. For C code, gcc automatically generates instructions for setting up
  31. frame pointers when the -fno-omit-frame-pointer option is used.
  32. But for asm code, the frame setup instructions have to be written by
  33. hand, which most people don't do. So the end result is that
  34. CONFIG_FRAME_POINTER is honored for C code but not for most asm code.
  35. For stack traces based on frame pointers to be reliable, all
  36. functions which call other functions must first create a stack frame
  37. and update the frame pointer. If a first function doesn't properly
  38. create a stack frame before calling a second function, the *caller*
  39. of the first function will be skipped on the stack trace.
  40. For example, consider the following backtrace with frame
  41. pointers enabled:
  42. [<ffffffff81812584>] dump_stack+0x4b/0x63
  43. [<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30
  44. [<ffffffff8127f568>] seq_read+0x108/0x3e0
  45. [<ffffffff812cce62>] proc_reg_read+0x42/0x70
  46. [<ffffffff81256197>] __vfs_read+0x37/0x100
  47. [<ffffffff81256b16>] vfs_read+0x86/0x130
  48. [<ffffffff81257898>] SyS_read+0x58/0xd0
  49. [<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76
  50. It correctly shows that the caller of cmdline_proc_show() is
  51. seq_read().
  52. If we remove the frame pointer logic from cmdline_proc_show() by
  53. replacing the frame pointer related instructions with nops, here's
  54. what it looks like instead:
  55. [<ffffffff81812584>] dump_stack+0x4b/0x63
  56. [<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30
  57. [<ffffffff812cce62>] proc_reg_read+0x42/0x70
  58. [<ffffffff81256197>] __vfs_read+0x37/0x100
  59. [<ffffffff81256b16>] vfs_read+0x86/0x130
  60. [<ffffffff81257898>] SyS_read+0x58/0xd0
  61. [<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76
  62. Notice that cmdline_proc_show()'s caller, seq_read(), has been
  63. skipped. Instead the stack trace seems to show that
  64. cmdline_proc_show() was called by proc_reg_read().
  65. The benefit of objtool here is that because it ensures that *all*
  66. functions honor CONFIG_FRAME_POINTER, no functions will ever[*] be
  67. skipped on a stack trace.
  68. [*] unless an interrupt or exception has occurred at the very
  69. beginning of a function before the stack frame has been created,
  70. or at the very end of the function after the stack frame has been
  71. destroyed. This is an inherent limitation of frame pointers.
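To see why the caller is skipped, the two backtraces above can be reproduced with a small simulation of a frame pointer unwinder. This is a sketch under simplifying assumptions: the stack is an array of words, a "return address" is just an id naming the calling function, and the call chain is shortened to the five functions involved. None of the sim_* names come from the kernel.

```c
#include <assert.h>

/* Ids standing in for return addresses inside each function. */
enum fn { FN_ENTRY = 1, FN_PROC_REG_READ, FN_SEQ_READ,
          FN_CMDLINE_PROC_SHOW, FN_DUMP_STACK };

#define STACK_WORDS 32

struct sim {
    long stack[STACK_WORDS];
    int sp;     /* index of the top-of-stack word; the stack grows down */
    int fp;     /* index of the current frame's saved-fp word, -1 if none */
};

/* Model a call instruction: push the return address into the caller. */
static void sim_call(struct sim *s, enum fn caller)
{
    assert(s->sp > 0);
    s->stack[--s->sp] = caller;
}

/* Model frame setup: push the old frame pointer, then point fp at it. */
static void sim_frame_begin(struct sim *s)
{
    assert(s->sp > 0);
    s->stack[--s->sp] = s->fp;
    s->fp = s->sp;
}

/* Walk the frame pointer chain; the return address sits just above fp. */
static int sim_unwind(const struct sim *s, enum fn current,
                      enum fn *out, int max)
{
    int n = 0;

    if (n < max)
        out[n++] = current;
    for (int fp = s->fp; fp >= 0 && n < max; fp = (int)s->stack[fp])
        out[n++] = (enum fn)s->stack[fp + 1];
    return n;
}

/*
 * Build entry -> proc_reg_read -> seq_read -> cmdline_proc_show ->
 * dump_stack, then unwind from dump_stack. cmdline_has_frame selects
 * whether cmdline_proc_show creates a stack frame.
 */
static int trace_chain(int cmdline_has_frame, enum fn *out, int max)
{
    struct sim s = { .sp = STACK_WORDS, .fp = -1 };

    sim_call(&s, FN_ENTRY);             /* entry calls proc_reg_read */
    sim_frame_begin(&s);
    sim_call(&s, FN_PROC_REG_READ);     /* proc_reg_read calls seq_read */
    sim_frame_begin(&s);
    sim_call(&s, FN_SEQ_READ);          /* seq_read calls cmdline_proc_show */
    if (cmdline_has_frame)
        sim_frame_begin(&s);
    sim_call(&s, FN_CMDLINE_PROC_SHOW); /* cmdline_proc_show calls dump_stack */
    sim_frame_begin(&s);
    return sim_unwind(&s, FN_DUMP_STACK, out, max);
}
```

With cmdline_has_frame set, the unwinder reports all five functions. Without it, dump_stack's saved frame pointer refers to seq_read's frame, so the walk goes straight from the return address in cmdline_proc_show to the one in proc_reg_read, and the return address in seq_read is never read.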
  72. b) 100% reliable stack traces for DWARF enabled kernels
  73. (NOTE: This is not yet implemented)
  74. As an alternative to frame pointers, DWARF Call Frame Information
  75. (CFI) metadata can be used to walk the stack. Unlike frame pointers,
  76. CFI metadata is out of band. So it doesn't affect runtime
  77. performance and it can be reliable even when interrupts or exceptions
  78. are involved.
  79. For C code, gcc automatically generates DWARF CFI metadata. But for
  80. asm code, generating CFI is a tedious manual process which requires
  81. .cfi assembler macros to be placed by hand throughout the
  82. code. It's clumsy and very easy to get wrong, and it makes the real
  83. code harder to read.
  84. objtool will improve this situation in several ways. For code
  85. which already has CFI annotations, it will validate them. For code
  86. which doesn't have CFI annotations, it will generate them. So an
  87. architecture can opt to strip out all the manual .cfi annotations
  88. from their asm code and have objtool generate them instead.
  89. We might also add a runtime stack validation debug option where we
  90. periodically walk the stack from schedule() and/or an NMI to ensure
  91. that the stack metadata is sane and that we reach the bottom of the
  92. stack.
  93. So the benefit of objtool here will be that external tooling should
  94. always show perfect stack traces. And the same will be true for
  95. kernel warning/oops traces if the architecture has a runtime DWARF
  96. unwinder.
  97. c) Higher live patching compatibility rate
  98. (NOTE: This is not yet implemented)
  99. Currently with CONFIG_LIVEPATCH there's a basic live patching
  100. framework which is safe for roughly 85-90% of "security" fixes. But
  101. patches can't have complex features like function dependency changes,
  102. prototype changes, or data structure changes.
  103. There's a strong need to support patches which have the more complex
  104. features so that the patch compatibility rate for security fixes can
  105. eventually approach something resembling 100%. To achieve that, a
  106. "consistency model" is needed, which allows tasks to be safely
  107. transitioned from an unpatched state to a patched state.
  108. One of the key requirements of the currently proposed livepatch
  109. consistency model [*] is that it needs to walk the stack of each
  110. sleeping task to determine if it can be transitioned to the patched
  111. state. If objtool can ensure that stack traces are reliable, this
  112. consistency model can be used and the live patching compatibility
  113. rate can be improved significantly.
  114. [*] https://lkml.kernel.org/r/cover.1423499826.git.jpoimboe@redhat.com
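The stack check that such a consistency model relies on might look roughly like the sketch below. It assumes one plausible reading of the requirement: a sleeping task may be transitioned only if its stack trace is known to be reliable and none of the functions being patched appear on it. The struct and function names are hypothetical and are not taken from the livepatch code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical snapshot of a sleeping task's stack trace. */
struct task_trace {
    const char *const *funcs;   /* function names, innermost first */
    size_t nr;                  /* number of entries in funcs */
    bool reliable;              /* false if the unwinder hit bad metadata */
};

/*
 * Return true if the task can be switched to the patched state: the
 * trace must be reliable and must not contain any patched function.
 */
static bool can_transition(const struct task_trace *t,
                           const char *const *patched, size_t nr_patched)
{
    assert(t != NULL);
    if (!t->reliable)
        return false;   /* a skipped caller could hide a patched function */

    for (size_t i = 0; i < t->nr; i++)
        for (size_t j = 0; j < nr_patched; j++)
            if (strcmp(t->funcs[i], patched[j]) == 0)
                return false;
    return true;
}
```

The reliable flag is the point where stack validation matters: if a trace can silently skip a caller, as in the frame pointer example earlier, a "clean" result from this check cannot be trusted.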
  115. Rules
  116. -----
  117. To achieve the validation, objtool enforces the following rules:
  118. 1. Each callable function must be annotated as such with the ELF
  119. function type. In asm code, this is typically done using the
  120. ENTRY/ENDPROC macros. If objtool finds a return instruction
  121. outside of a function, it flags an error since that usually indicates
  122. callable code which should be annotated accordingly.
  123. This rule is needed so that objtool can properly identify each
  124. callable function in order to analyze its stack metadata.
  125. 2. Conversely, each section of code which is *not* callable should *not*
  126. be annotated as an ELF function. The ENDPROC macro shouldn't be used
  127. in this case.
  128. This rule is needed so that objtool can ignore non-callable code.
  129. Such code doesn't have to follow any of the other rules.
  130. 3. Each callable function which calls another function must have the
  131. correct frame pointer logic, if required by CONFIG_FRAME_POINTER or
  132. the architecture's back chain rules. This can be done in asm code
  133. with the FRAME_BEGIN/FRAME_END macros.
  134. This rule ensures that frame pointer based stack traces will work as
  135. designed. If function A doesn't create a stack frame before calling
  136. function B, the _caller_ of function A will be skipped on the stack
  137. trace.
  138. 4. Dynamic jumps and jumps to undefined symbols are only allowed if:
  139. a) the jump is part of a switch statement; or
  140. b) the jump matches sibling call semantics and the frame pointer has
  141. the same value it had on function entry.
  142. This rule is needed so that objtool can reliably analyze all of a
  143. function's code paths. If a function jumps to code in another file,
  144. and it's not a sibling call, objtool has no way to follow the jump
  145. because it only analyzes a single file at a time.
  146. 5. A callable function may not execute kernel entry/exit instructions.
  147. The only code which needs such instructions is kernel entry code,
  148. which shouldn't be in callable functions anyway.
  149. This rule is just a sanity check to ensure that callable functions
  150. return normally.
  151. Errors in .S files
  152. ------------------
  153. If you're getting an error in a compiled .S file which you don't
  154. understand, first make sure that the affected code follows the above
  155. rules.
  156. Here are some examples of common warnings reported by objtool, what
  157. they mean, and suggestions for how to fix them.
  158. 1. asm_file.o: warning: objtool: func()+0x128: call without frame pointer save/setup
  159. The func() function made a function call without first saving and/or
  160. updating the frame pointer.
  161. If func() is indeed a callable function, add proper frame pointer
  162. logic using the FRAME_BEGIN and FRAME_END macros. Otherwise, remove
  163. its ELF function annotation by changing ENDPROC to END.
  164. If you're getting this error in a .c file, see the "Errors in .c
  165. files" section.
  166. 2. asm_file.o: warning: objtool: .text+0x53: return instruction outside of a callable function
  167. A return instruction was detected, but objtool couldn't find a way
  168. for a callable function to reach the instruction.
  169. If the return instruction is inside (or reachable from) a callable
  170. function, the function needs to be annotated with the ENTRY/ENDPROC
  171. macros.
  172. If you _really_ need a return instruction outside of a function, and
  173. are 100% sure that it won't affect stack traces, you can tell
  174. objtool to ignore it. See the "Adding exceptions" section below.
  175. 3. asm_file.o: warning: objtool: func()+0x9: function has unreachable instruction
  176. The instruction lives inside of a callable function, but there's no
  177. possible control flow path from the beginning of the function to the
  178. instruction.
  179. If the instruction is actually needed, and it's actually in a
  180. callable function, ensure that its function is properly annotated
  181. with ENTRY/ENDPROC.
  182. If it's not actually in a callable function (e.g. kernel entry code),
  183. change ENDPROC to END.
  184. 4. asm_file.o: warning: objtool: func(): can't find starting instruction
  185. or
  186. asm_file.o: warning: objtool: func()+0x11dd: can't decode instruction
  187. Did you put data in a text section? If so, that can confuse
  188. objtool's instruction decoder. Move the data to a more appropriate
  189. section like .data or .rodata.
  190. 5. asm_file.o: warning: objtool: func()+0x6: kernel entry/exit from callable instruction
  191. This is a kernel entry/exit instruction like sysenter or sysret.
  192. Such instructions aren't allowed in a callable function, and are most
  193. likely part of the kernel entry code.
  194. If the instruction isn't actually in a callable function, change
  195. ENDPROC to END.
  196. 6. asm_file.o: warning: objtool: func()+0x26: sibling call from callable instruction with changed frame pointer
  197. This is a dynamic jump or a jump to an undefined symbol. objtool
  198. assumed it's a sibling call and detected that the frame pointer
  199. wasn't first restored to its original state.
  200. If it's not really a sibling call, you may need to move the
  201. destination code to the local file.
  202. If the instruction is not actually in a callable function (e.g.
  203. kernel entry code), change ENDPROC to END.
  204. asm_file.o: warning: objtool: func()+0x5c: frame pointer state mismatch
  205. The instruction's frame pointer state is inconsistent, depending on
  206. which execution path was taken to reach the instruction.
  207. Make sure the function pushes and sets up the frame pointer (for
  208. x86_64, this means rbp) at the beginning of the function and pops it
  209. at the end of the function. Also make sure that no other code in the
  210. function touches the frame pointer.
  211. Errors in .c files
  212. ------------------
  213. 1. c_file.o: warning: objtool: funcA() falls through to next function funcB()
  214. This means that funcA() doesn't end with a return instruction or an
  215. unconditional jump, and that objtool has determined that the function
  216. can fall through into the next function. There could be different
  217. reasons for this:
  218. 1) funcA()'s last instruction is a call to a "noreturn" function like
  219. panic(). In this case the noreturn function needs to be added to
  220. objtool's hard-coded global_noreturns array. Feel free to bug the
  221. objtool maintainer, or you can submit a patch.
  222. 2) funcA() uses the unreachable() annotation in a section of code
  223. that is actually reachable.
  224. 3) If funcA() calls an inline function, the object code for funcA()
  225. might be corrupt due to a gcc bug. For more details, see:
  226. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
  227. 2. If you're getting any other objtool error in a compiled .c file, it
  228. may be because the file uses an asm() statement which has a "call"
  229. instruction. An asm() statement with a call instruction must declare
  230. the use of the stack pointer in its output operand. For example, on
  231. x86_64:
  232. register void *__sp asm("rsp");
  233. asm volatile("call func" : "+r" (__sp));
  234. Otherwise the stack frame may not get created before the call.
  235. 3. Another possible cause for errors in C code is if the Makefile removes
  236. -fno-omit-frame-pointer or adds -fomit-frame-pointer to the gcc options.
  237. Also see the above section for .S file errors for more information about what
  238. the individual error messages mean.
  239. If the error doesn't seem to make sense, it could be a bug in objtool.
  240. Feel free to ask the objtool maintainer for help.
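The "noreturn" case of the falls-through warning comes down to a name lookup, which can be sketched as follows. The helper names and list contents are hypothetical. Only the idea of a hard-coded list of noreturn function names, with panic() as an example, comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if name appears in a hard-coded list of noreturn functions. */
static bool is_noreturn(const char *const *list, size_t nr, const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < nr; i++)
        if (strcmp(list[i], name) == 0)
            return true;
    return false;
}

/*
 * A function whose last instruction is a call is treated as falling
 * through to whatever follows it, unless the callee is known not to
 * return.
 */
static bool last_call_falls_through(const char *const *list, size_t nr,
                                    const char *callee)
{
    return !is_noreturn(list, nr, callee);
}
```

In this model, a function ending in a call to a hypothetical my_die() is flagged until "my_die" is added to the list, which mirrors the fix suggested in case 1) of the first warning.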
  241. Adding exceptions
  242. -----------------
  243. If you _really_ need objtool to ignore something, and are 100% sure
  244. that it won't affect kernel stack traces, you can tell objtool to
  245. ignore it:
  246. - To skip validation of a function, use the STACK_FRAME_NON_STANDARD
  247. macro.
  248. - To skip validation of a file, add
  249. OBJECT_FILES_NON_STANDARD_filename.o := y
  250. to the Makefile.
  251. - To skip validation of a directory, add
  252. OBJECT_FILES_NON_STANDARD := y
  253. to the Makefile.
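As a rough illustration of the first option, the macro is placed after the function definition and takes the function's name. The macro body below is a userspace stand-in written for this example, on the assumption that the kernel macro works by recording the function's address in a dedicated section. The kernel's own definition lives in a kernel header and may differ in detail. odd_frame_func() is a hypothetical function.

```c
#include <assert.h>

/*
 * Stand-in definition (an assumption, not the kernel header): keep a
 * pointer to the function in a named section so a tool can find it.
 */
#if defined(__GNUC__) && defined(__ELF__)
#define NON_STANDARD_ATTR \
    __attribute__((used, section("__func_stack_frame_non_standard")))
#else
#define NON_STANDARD_ATTR
#endif

#define STACK_FRAME_NON_STANDARD(func) \
    static void NON_STANDARD_ATTR \
        *__func_stack_frame_non_standard_##func = (void *)func

/* Hypothetical function whose frame handling should not be validated. */
int odd_frame_func(int x)
{
    return x + 1;
}
STACK_FRAME_NON_STANDARD(odd_frame_func);
```

The annotation does not change what the function does. It only leaves a marker in the object file, so it should be reserved for cases where the function's stack usage is known not to affect kernel stack traces.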