bzt wrote:I know ROP, and that link does not demonstrate "injecting an arbitrary code and then executing it". Just to make it extremely simple and crystal clear:
PoC code injection with WNX, that's the mission.
The Wikipedia article on ROP claims that the technique is Turing complete. I think this may be a bit of a strong claim; it is probably more accurate to say that it is *likely* to be Turing complete for a given codebase that is being attacked. Furthermore, as set forth in the next paragraph, I think there's a good argument for viewing ROP as code injection:
As I see it, any codebase `K` has a statistical likelihood `L`, when compiled for a given host ISA `H`, of containing a subset that accidentally implements some ad-hoc bytecode interpreter `I`. `L` is going to be some increasing function of size (the larger `K` is, the more likely you can patch together some `I`), and there is going to be some further likelihood `L_t` that `I` is Turing-complete. `L_t` will likewise be an increasing function of size. Individual instructions in `H` machine code form the "microcode" for `I`, instruction sequences in `K` implement the instructions for `I`, and the stack pointer for `H` is the instruction pointer for `I`. The `ret` opcode for `H` is the `fetch` microop for `I`, `H` instructions that add to or subtract from the stack pointer are `I` microops used to implement `jmp` instructions in `I`, etc. If `H` enforces W^X, then buffer overflows cannot execute arbitrary `H` code, but they can still execute arbitrary `I` code. If `I` is Turing-complete (which is probably the case if `K` contains any sequence of bytes that constitutes a conditional instruction in `H` followed by a `pop`, `add sp`, or `sub sp`, which would constitute the microcode for a conditional branch in `I`), then it doesn't really matter that we can't inject `H` instructions: there will be some sequence of `I` instructions that achieves the same result as any given sequence of `H` instructions. It would probably be overkill, but we could even have the exploit code we want to run already written in C, and then, once we find a suitable `I` within `K`, write a C compiler for `I` (or for real overkill, target GCC to `I`), and compile our code to `I` bytecode.
It is my view that the success ROP has had demonstrates that `L_t` becomes significant for decently small codebases. This also demonstrates why putting the dynamic linker in the kernel is questionable: if the kernel is vulnerable to buffer overflows, the linker increases the kernel's size, and thus its `L_t`. There's certainly an argument for putting certain *parts* of the functionality of the dynamic linker in the kernel, and having some way of making sure that the only userspace component that can make syscalls to the relevant kernel components is the dynamic linker, but stuffing the whole linker into the kernel strikes me as likely to do more harm than good.
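To make "increasing function of size" a bit more concrete, here is a back-of-the-envelope model. Everything in it is my own assumption and does not come from any of the papers: that `K` has n byte offsets, that each offset independently starts a usable fragment of kind i with some small probability p_i, and that `I` needs m distinct kinds of fragment to be Turing-complete. Real machine code is not independent random bytes, so this only sketches the trend.

```latex
% n: size of K in bytes; p_i: assumed per-offset probability of fragment kind i
L_i(n) = 1 - (1 - p_i)^{n}
\qquad
L_t(n) \approx \prod_{i=1}^{m} L_i(n) = \prod_{i=1}^{m} \left( 1 - (1 - p_i)^{n} \right)

% Adding \Delta bytes (for example, a linker) cannot lower the estimate:
L_t(n + \Delta) \ge L_t(n) \quad \text{for } \Delta \ge 0,\; 0 < p_i < 1
```

Under these assumptions every factor tends to 1 as n grows, which is the same intuition as above: whatever is added to a vulnerable kernel raises its `L_t`.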
The sources that Wikipedia cites for the claim that ROP is Turing complete are:
[*] Abadi, M. N.; Budiu, M.; Erlingsson, Ú.; Ligatti, J. (November 2005). "Control-Flow Integrity: Principles, Implementations, and Applications". Proceedings of the 12th ACM conference on Computer and communications security - CCS '05. pp. 340–353. doi:10.1145/1102120.1102165. ISBN 1-59593-226-7.
and
[*] Abadi, M. N.; Budiu, M.; Erlingsson, Ú.; Ligatti, J. (October 2009). "Control-flow integrity principles, implementations, and applications". ACM Transactions on Information and System Security. 13: 1–40. doi:10.1145/1609956.1609960.
The following is cited in the same paragraph and seems to be relevant:
[*] Shacham, H. (October 2007). "The geometry of innocent flesh on the bone: return-into-libc without function calls (on the x86)". Proceedings of the 14th ACM conference on Computer and communications security - CCS '07. pp. 552–561.
Unfortunately, none of them are available online, but I expect that they contain proof-of-concept Turing-complete exploits, and either an analysis of the statistical likelihood of finding a Turing-complete set of code fragments, or findings from looking for such fragments in existing code. Combined with my analysis above of the rationale for viewing ROP as code injection, we then have code injection even if the hardware and OS enforce W^X: code for `I` is data for `H`, so R permission for `H` is X permission for `I`, and denying a program either read or write access to its stack won't work well.
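The mapping between `H` and `I` can be illustrated with a toy simulation. This is only a sketch in Python, not machine code and not an exploit: the gadget names, their "addresses", the three registers, and the word-sized branch offsets are all hypothetical choices of mine. The point it tries to show is that the "program" is an ordinary list that is only ever read, yet it fully determines what runs.

```python
# Toy model: fixed "gadgets" stand in for fragments already present in K.
# The stack is plain data for H, but it is the program for I.

HALT = 0x00                      # hypothetical sentinel, not a real gadget
POP_A, POP_B, POP_C = 0x10, 0x20, 0x30
ADD_A_C, DEC_B, JNZ_B = 0x40, 0x50, 0x60


def pop_into(reg):
    """Build a 'pop reg; ret' gadget: consume one stack word into reg."""
    def gadget(regs, stack, sp):
        regs[reg] = stack[sp]
        return sp + 1
    return gadget


def add_a_c(regs, stack, sp):
    """'add a, c; ret'"""
    regs["a"] += regs["c"]
    return sp


def dec_b(regs, stack, sp):
    """'dec b; ret'"""
    regs["b"] -= 1
    return sp


def jnz_b(regs, stack, sp):
    """Conditional 'add sp, imm': read an offset, apply it only if b != 0."""
    offset = stack[sp]
    sp += 1
    if regs["b"] != 0:
        sp += offset             # moving sp is a jump in I
    return sp


GADGETS = {
    POP_A: pop_into("a"), POP_B: pop_into("b"), POP_C: pop_into("c"),
    ADD_A_C: add_a_c, DEC_B: dec_b, JNZ_B: jnz_b,
}


def run(stack, gadgets, max_steps=10_000):
    """Execute I bytecode; sp of H doubles as the instruction pointer of I."""
    regs = {"a": 0, "b": 0, "c": 0}
    sp = 0
    for _ in range(max_steps):
        addr = stack[sp]         # 'ret' acts as fetch: next address off the stack
        sp += 1
        if addr == HALT:
            return regs
        sp = gadgets[addr](regs, stack, sp)
    raise RuntimeError("step limit reached")


# I bytecode for 6 * 7 by repeated addition; offsets are in stack words.
PROGRAM = [
    POP_A, 0,                    # a = 0
    POP_B, 7,                    # b = 7 (loop counter)
    POP_C, 6,                    # c = 6 (addend)
    ADD_A_C,                     # loop: a += c
    DEC_B,                       # b -= 1
    JNZ_B, -4,                   # if b != 0, jump back to ADD_A_C
    HALT,
]

result = run(PROGRAM, GADGETS)
print(result["a"])               # 42
```

Nothing in `PROGRAM` is ever executed by the host (here, the Python interpreter); `run` only reads it. That is the sense in which read permission for `H` behaves like execute permission for `I`.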