Code obfuscating on popular OS

pranavappu007 · Post by **pranavappu007** » Sun Jan 17, 2021 1:21 am

I was wondering: can you encrypt or obfuscate parts of your code, then decrypt them at runtime and run them? In a non-OS environment I think it's easier, as you have to write the function in assembly and hardcode the encrypted binary as an array or something in the OS code, along with the decryptor and caller. Because our code runs in ring 0, we can easily execute the stack just like code (correct me if I'm wrong).

Running it on Windows (and Linux, probably) is much trickier, due not just to the protections in place but to how programs actually run. I tried to run it without any encryption at first. I kind of made it work on Windows using VirtualAlloc, as stack execution is not possible by default. But it is going haywire. You can't access anything outside the 'function', unless you have a pointer passed into it. And sometimes even if you have, you might just get MEMORY_ACCESS_VIOLATION. I was trying to make some changes to a global array (which is not accessible, so I passed a pointer to it) inside this 'array' function. Weirdly, it started to get access violations after modifying 4 bytes. I still don't know why.

Have you guys ever tried to do stuff like this? Execution from stack? On OS or non-OS environment? Can you help me?

(It might not be relevant to this forum, but again, this is General Ramblings.)

nullplan · Post by **nullplan** » Sun Jan 17, 2021 6:50 am

What is the point? I mean, you give the hackers the run-around for a while, but it will only hinder them, not make it impossible for them to understand your code. And it will make your code slower even for all of your legitimate users. So again, what's the point?

Anyway, yes, you can save an encrypted version of your program in the executable file, then at run time decrypt it and execute it. On Linux, you have fexecve(), so you can do the decryption in a temporary file and just run it. On Windows I'm not sure. I think you need to execute an actually reachable file. None of that needs to be in assembler, so I don't know what you mean there. But that is utterly pointless, since you encrypt something, and then put the means of its decryption right beside it. It would be like having a safe with the key lying on top of it, you might as well not have bothered.

pranavappu007 wrote:Because our code runs in ring 0, we can easily execute the stack just like code (correct me if I'm wrong).

The NX bit affects ring-0 code as well. So no, this is not possible.

pranavappu007 wrote:You can't access anything outside the 'function', unless you have a pointer passed into it.

So you have an encrypted function as part of a bigger program? That can only work if you fix up any relocations in that function. Or else if the function is self-contained.

pranavappu007 · Post by **pranavappu007** » Sun Jan 17, 2021 11:15 am

nullplan wrote:What is the point? I mean, you give the hackers the run-around for a while, but it will only hinder them, not make it impossible for them to understand your code. And it will make your code slower even for all of your legitimate users. So again, what's the point?

To know how to do it, just in case, and also how to identify such code, in case we might want to reverse engineer someone else's code. (Legal use, ofc.)

nullplan wrote:But that is utterly pointless, since you encrypt something, and then put the means of its decryption right beside it. It would be like having a safe with the key lying on top of it, you might as well not have bothered.

I didn't mean full-blown encryption, more like partly hidden, so that anyone reversing wouldn't notice right away. I think it's useful to prevent cracking of stuff. Anyway, I am not a pro coder, so my intentions are learning and experience.

nullplan wrote:The NX bit affects ring-0 code as well. So no, this is not possible.

I mean, I don't know much about paging, but with segmentation you can just copy the code into an executable segment and execute it, right?

nullplan wrote:So you have an encrypted function as part of a bigger program? That can only work if you fix up any relocations in that function. Or else if the function is self-contained.

As I've said above, it's partly hidden (at least that's the idea). Yeah, agreed on that. It's haard..

I was just curious to know what others have to say about this, as I have already found it's a bit above my current level. More like other people's experiences of stuff like this. And also opinions like this one.

nullplan · Post by **nullplan** » Sun Jan 17, 2021 12:26 pm

pranavappu007 wrote:To know how to do it, just in case, and also how to identify such code, in case we might want to reverse engineer someone else's code. (Legal use, ofc.)

In my country, it is always legal to reverse engineer a program you possess lawfully (freedom of research), and you cannot contract that right away.

Again, it is nice that you are studying the topic for your own edification, but what is the point even if someone else were to do it? Not the point of studying, but the point of doing something like that.

pranavappu007 wrote:I didn't mean full-blown encryption, more like partly hidden, so that anyone reversing wouldn't notice right away. I think it's useful to prevent cracking of stuff.

Yeah, that will give a potential reverse engineer the run-around, but will not actually prevent anything. And if someone finding out how your program works means they cracked something, something has gone terribly wrong with your program already. I can think of few non-adversarial uses of this idea.

pranavappu007 wrote:I mean, I don't know much about paging, but with segmentation you can just copy the code into an executable segment and execute it, right?

Yes, you can. You can also copy stuff into a page, then mark it executable and execute it. If the OS is particularly hardened and doesn't let you mark a page that previously was writable as executable, then you can also write the stuff to a file and execute that.
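The copy-then-mark-executable sequence could look roughly like this on a POSIX system. This is a minimal sketch, not anyone's actual code from this thread: run_hidden, hidden_code and KEY are made-up names, the single-byte XOR merely stands in for whatever encoding is used, and the sample machine code (a routine that returns 42) is an assumption that only covers x86 and AArch64. A hardened kernel may refuse the mprotect() call, in which case the function returns -1.

```c
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define KEY 0x5A /* hypothetical single-byte XOR key */

/* The routine is stored XOR-ed with KEY, so the plain opcodes
 * do not appear in the binary. */
#if defined(__x86_64__) || defined(__i386__)
/* mov eax, 42 ; ret */
static const uint8_t hidden_code[] = {
    0xB8 ^ KEY, 0x2A ^ KEY, 0x00 ^ KEY, 0x00 ^ KEY, 0x00 ^ KEY, 0xC3 ^ KEY
};
#elif defined(__aarch64__)
/* mov w0, #42 ; ret */
static const uint8_t hidden_code[] = {
    0x40 ^ KEY, 0x05 ^ KEY, 0x80 ^ KEY, 0x52 ^ KEY,
    0xC0 ^ KEY, 0x03 ^ KEY, 0x5F ^ KEY, 0xD6 ^ KEY
};
#else
#error "no sample machine code for this architecture"
#endif

/* Decodes the routine into a fresh page, makes the page executable,
 * calls it, and returns its result (or -1 on failure). */
int run_hidden(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    uint8_t *mem = mmap(NULL, page, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    /* Decode while the page is writable but not executable. */
    for (size_t i = 0; i < sizeof hidden_code; i++)
        mem[i] = (uint8_t)(hidden_code[i] ^ KEY);

    /* Drop write permission and add execute permission. */
    if (mprotect(mem, page, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, page);
        return -1;
    }
    /* Needed on architectures with separate instruction caches. */
    __builtin___clear_cache((char *)mem, (char *)mem + sizeof hidden_code);

    int (*fn)(void) = (int (*)(void))(uintptr_t)mem;
    int result = fn();
    munmap(mem, page);
    return result;
}
```

Calling run_hidden() should yield 42 wherever the page can be made executable. The page is never writable and executable at the same time, which is the case a stricter system is more likely to allow.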

pranavappu007 wrote:As I've said above, it's partly hidden (at least that's the idea). Yeah, agreed on that. It's haard..

Encrypting a single function is much harder than the whole program. So just encrypt the whole program. That way, you can take the entire completely linked executable file as input and do something sensible with it.

Probably the best-known implementation of this idea is UPX. That program has an actual use, however, since it compresses the file rather than encrypting it: it compresses the executable, then adds a little loader program that decompresses the program and executes it. bzip2 also has an implementation of this called bzexe. However, both of these optimize for something people tend not to care a great deal about: disk space. Even in most resource-constrained settings I have seen, disk space has been far more readily available than RAM, and this approach uses more RAM.

First, without the encoding, the program can be faulted into the address space one page at a time, so only the code you actually use is loaded; with this encoding scheme, the entire executable is put into RAM at the start. Second, with a non-encoded program, multiple instances can at least share the code pages; with the encoded program, the pages cannot be shared. Thus the whole thing becomes a waste of memory, in addition to time.

The whole thing, in the end, reminds me of several implementations of DRM, and it is doomed to failure like most of those schemes have.

eekee · Post by **eekee** » Sun Jan 17, 2021 4:21 pm

nullplan wrote:
pranavappu007 wrote:To know how to do it, just in case, and also how to identify such code, in case we might want to reverse engineer someone else's code. (Legal use, ofc.)
In my country, it is always legal to reverse engineer a program you possess lawfully (freedom of research), and you cannot contract that right away.

Note that the majority of today's commercial software licenses don't give you right of ownership. They only give you a right to use it in the ways they permit, and they always explicitly deny reverse engineering. There are exceptions, of course.

bzt · Post by **bzt** » Sun Jan 17, 2021 5:20 pm

eekee wrote:Note that the majority of today's commercial software licenses don't give you right of ownership. They only give you a right to use it in the ways they permit, and they always explicitly deny reverse engineering. There are exceptions, of course.

I'd question whether that's even legit. I mean, I'm not buying a service with a monthly fee, but rather something for a single price. As long as I pay the entire price for something, ownership must be mine. Retaining the ownership wouldn't be the first illegal move on big corps' part... just because they have an armada of lawyers looking for loopholes in the law doesn't make this morally right (but this is very off-topic).

nullplan wrote:What is the point? I mean, you give the hackers the run-around for a while, but it will only hinder them, not make it impossible for them to understand your code. And it will make your code slower even for all of your legitimate users.

This is very true.

pranavappu007 wrote:I was wondering: can you encrypt or obfuscate parts of your code, then decrypt them at runtime and run them? In a non-OS environment I think it's easier, as you have to write the function in assembly and hardcode the encrypted binary as an array or something in the OS code, along with the decryptor and caller.

You can obfuscate without encrypting. For example, instead of

Code: Select all

char hello[12] = "Hello World";

one could do

Code: Select all

char hello[12];
hello[0]='H'; hello[1]='e'; hello[10]=hello[1]-1; hello[2]=hello[3]=hello[9]='l'; hello[4]=hello[7]='o'; hello[5]=' '; hello[6]='W'; hello[8]=hello[7]+3; hello[11]=0;

This way the executable would be obfuscated, as there would be no clearly readable "Hello World" bitchunk in the binary. Problematic to use, granted, but might be useful to hardcode passwords or encryption keys into executables.
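A variation on the same idea, as a hedged sketch: keep the bytes XOR-ed with a key in the binary and rebuild the string only at run time. KEY, hidden and reveal are hypothetical names chosen for illustration. The volatile qualifier is there to discourage the compiler from folding the loop back into a plain string.

```c
#include <stddef.h>

#define KEY 0x37 /* hypothetical XOR key */

/* Each initializer is a constant expression, so only the XOR-ed
 * bytes are stored in the binary, not the text "Hello World". */
static const volatile unsigned char hidden[] = {
    'H' ^ KEY, 'e' ^ KEY, 'l' ^ KEY, 'l' ^ KEY, 'o' ^ KEY, ' ' ^ KEY,
    'W' ^ KEY, 'o' ^ KEY, 'r' ^ KEY, 'l' ^ KEY, 'd' ^ KEY
};

/* Decodes the string into out and NUL-terminates it.
 * out must have room for sizeof hidden + 1 bytes (12 here). */
void reveal(char *out)
{
    size_t i;
    for (i = 0; i < sizeof hidden; i++)
        out[i] = (char)(hidden[i] ^ KEY);
    out[i] = '\0';
}
```

After char buf[12]; reveal(buf); the buffer holds the clear string, so anyone who can inspect memory at that point still sees it; only the file on disk is less obvious.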

pranavappu007 wrote:Have you guys ever tried to do stuff like this?

Yes, and it's problematic. It's easier if you put the decrypter in your ELF/PE loader (that way the files are encrypted on disk, but decrypted in memory on load, before relocation takes place). Once I had to reverse engineer a virus which used a simple XOR as a cypher on the instructions, but only decrypted a small window around the current IP, so it was a nightmare to debug (it was in the DOS era, long before the NX bit). I've always found Russian coders are very creative about these things. It's worth studying some of their code (most notably some viruses; they are often polymorphic too).

nullplan wrote:On Linux, you have fexecve(), so you can do the decryption in a temporary file and just run it.

Not a good idea, because then you'll have the decrypted file on your disk (or tmpfs). It's better if you put the decrypter in the ELF/PE loader as I've said and only decrypt in memory where other processes can't access it.

pranavappu007 wrote:I was just curious to know what others have to say about this, as I have already found it's a bit above my current level. More like other people's experiences of stuff like this. And also opinions like this one.

I've used a simpler approach in my boot loader. It can load an SHA-XOR-CBC or AES-256-CBC encrypted initrd. On disk, the entire initrd is a bit-sausage, but when booted it becomes clear data, and the kernel is none the wiser. (FYI I place the kernel inside the initrd, so this encrypts the executable as well).

Cheers,
bzt

eekee · Post by **eekee** » Sun Jan 17, 2021 7:54 pm

bzt wrote:
eekee wrote:Note that the majority of today's commercial software licenses don't give you right of ownership. They only give you a right to use it in the ways they permit, and they always explicitly deny reverse engineering. There are exceptions, of course.
I'd question whether that's even legit. I mean, I'm not buying a service with a monthly fee, but rather something for a single price. As long as I pay the entire price for something, ownership must be mine. Retaining the ownership wouldn't be the first illegal move on big corps' part... just because they have an armada of lawyers looking for loopholes in the law doesn't make this morally right (but this is very off-topic).

You never at any point buy the software; they don't sell it. You buy a license to use it. This has been going on since the 90s, so if it could be cracked, I think it would be by now. Having said that, I just remembered Microsoft's licenses are less draconian than they were, so it's not all bad news. As for the morality of it, I'll get worked up if I think about it, and my heart can't take me getting worked up any more. Besides, when I get worked up, my perspective seems to get unbalanced. I'm just glad I can believe God's Kingdom will set everything right in the future.

bzt wrote:one could do
Code: Select all
char hello[12];
hello[0]='H'; hello[1]='e'; hello[10]=hello[1]-1; hello[2]=hello[3]=hello[9]='l'; hello[4]=hello[7]='o'; hello[5]=' '; hello[6]='W'; hello[8]=hello[7]+3; hello[11]=0;
This way the executable would be obfuscated, as there would be no clearly readable "Hello World" bitchunk in the binary. Problematic to use, granted, but might be useful to hardcode passwords or encryption keys into executables.

There are obfuscation tools for Javascript. There may be for C, too. These tools take your cleanly-written code and obfuscate it. You could make your build system invoke them.

Incidentally, I once, long ago, looked at the assembly language gcc produced for ARM32. It was a tiny program which assigned a string within a function call. There was no "Hello world" in the output, but rather a series of instructions to build the string at run-time from 32-bit values loaded into registers. I was quite surprised. I've long since lost the code, but it looked something like this:

Code: Select all

#include <stdio.h>

void foo(void) {
	const char *bar;

	bar = "Hello World";
	printf("%s\n", bar);
}

bar may have been defined as an array; I don't remember.

Octocontrabass · Post by **Octocontrabass** » Sun Jan 17, 2021 8:12 pm

bzt wrote:This way the executable would be obfuscated, as there would be no clearly readable "Hello World" bitchunk in the binary.

Turn on optimizations. Odds are, your compiler will rearrange your obfuscated code into something significantly more readable than you intended.

pranavappu007 · Post by **pranavappu007** » Sun Jan 17, 2021 8:32 pm

Code: Select all

char hello[12];
hello[0]='H'; hello[1]='e'; hello[10]=hello[1]-1; hello[2]=hello[3]=hello[9]='l'; hello[4]=hello[7]='o'; hello[5]=' '; hello[6]='W'; hello[8]=hello[7]+3; hello[11]=0;

You can still identify the variable and the values being written into it. If you print the starting address and then the contents, it would be visible, right? Also, the compiler can rearrange this into more readable code. I have actually seen compilers turn all array assignments into 4-byte chunks regardless of array size or type (8 bytes on x64).

By obfuscation, I meant that the code cannot be detected by debuggers and reverse engineering tools like Ghidra. Something like binary code, but with values offset by a number. At runtime some other function subtracts this number from the sequence, thus making it executable.

bzt · Post by **bzt** » Mon Jan 18, 2021 9:22 am

Octocontrabass wrote:Turn on optimizations. Odds are, your compiler will rearrange your obfuscated code into something significantly more readable than you intended.

There's no way it would rearrange independent char operations into a single string data and make "Hello World" clearly readable, but just in case, use "volatile char[]".

pranavappu007 wrote:You can still identify the variable and the values being written into it. If you print the starting address and then the contents, it would be visible, right?

Then I must ask, what do you want to obfuscate? a) the source, b) the resulting binary, c) the instructions in run-time memory, or d) the data in run-time memory? Of course you could print the variable; the point is to de-obfuscate the data only at run-time, so the string is readable neither in the source nor in the binary, but running code can access it without problems.

pranavappu007 wrote:By obfuscation, I meant that the code cannot be detected by debuggers and reverse engineering tools like Ghidra.

What you want is just not possible, because in run-time the CPU must understand the instructions clearly without obfuscation no matter what, and a debugger is nothing more than a monitor on what the CPU does. You can make the life of a debugger hard, you could specifically mislead some tools, but it is not possible to avoid reverse engineering in general. For example you could jump in the middle of an instruction, that would scramble a disassembler like objdump, but if you run that code in qemu with "-d in_asm" you'd still see a properly disassembled code.

Cheers,
bzt

pranavappu007 · Post by **pranavappu007** » Mon Jan 18, 2021 9:51 am

bzt wrote:What you want is just not possible, because in run-time the CPU must understand the instructions clearly without obfuscation no matter what, and a debugger is nothing more than a monitor on what the CPU does.

I did not mean to make it impossible to debug. I meant to obfuscate the existing code (and maybe data too) so that a tool analyzing the binary won't identify the function, but the program de-scrambles the function just before executing it. Or you can scramble the function addresses and store them in an array, only de-scrambling them at the function call.
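The second idea (scrambled function addresses kept in an array) might be sketched as follows. Everything here is illustrative: MASK, table, table_init and call_scrambled are invented names, and converting function pointers to uintptr_t and back is implementation-defined in C, though it works on the usual flat-address platforms. Note that the real addresses still appear in table_init, so this only hides the call target at the call site.

```c
#include <stdint.h>

#define MASK ((uintptr_t)0xA5A5A5A5u) /* hypothetical scrambling mask */

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

/* Holds the function addresses XOR-ed with MASK. */
static uintptr_t table[2];

/* Fills the table at run time; a static initializer cannot be used,
 * because an address XOR-ed with a mask is not a constant expression. */
void table_init(void)
{
    table[0] = (uintptr_t)add_one ^ MASK;
    table[1] = (uintptr_t)twice ^ MASK;
}

/* De-scrambles entry idx (caller must pass 0 or 1) and calls it. */
int call_scrambled(unsigned idx, int arg)
{
    int (*fn)(int) = (int (*)(int))(table[idx] ^ MASK);
    return fn(arg);
}
```

After table_init(), call_scrambled(0, 41) goes through add_one and call_scrambled(1, 21) goes through twice. A static analysis tool sees only an indirect call at that point, although a debugger stepping through it sees the real target.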

Or if you have multiple threads, you can execute code in one thread that affects the code in other thread, but it's not noticeable right away. A debugger or analysis tool will think it's unreachable code, or it does nothing, but in reality it does.

rdos · Post by **rdos** » Mon Jan 18, 2021 10:12 am

I think a more interesting topic is if you can install something in a popular OS (like Windows or Linux) that can take over the machine and execute your own code in the kernel. Or spy on the kernel code of the OS.

A very possible way to do this is with an FPGA with a PCIe connector.

nullplan · Post by **nullplan** » Mon Jan 18, 2021 12:59 pm

bzt wrote:Not a good idea, because then you'll have the decrypted file on your disk (or tmpfs).

Well, if you use a tmpfs, and delete the file before writing to it (or just open an FD with O_TMPFILE), the file doesn't have (or will never get) a name, so no-one can look at it. Unless you look at the FD. But anyone that can look at the FD can also just read out the process's memory (/proc/<PID>/mem), so it is a lost cause either way.

Now for the advanced question: Can you fexecve() an FD that was opened with O_CLOEXEC? And what will /proc/self/exe be then?

bzt wrote:It's better if you put the decrypter in the ELF/PE loader as I've said and only decrypt in memory where other processes can't access it.

As I said, /proc/<PID>/mem exists. Other processes of the same user can always read out the memory. And this approach depends on being able to create writable and executable pages (not necessarily at the same time), which a hardened kernel may deny.

bzt wrote:There's no way it would rearrange independent char operations into a single string data and make "Hello World" clearly readable, but just in case, use "volatile char[]".

You underestimate your compilers. If they can prove the results will always be the same, they will make them always the same. Now, in general memory access is slower than computation, so the compiler will likely convert this into a series of constant assignments, but there is no rule saying it can't be done. In fact, there is no rule saying your snippet can't be implemented with strcpy(hello, "Hello World") even if hello is volatile-qualified.

bzt · Post by **bzt** » Mon Jan 18, 2021 3:37 pm

nullplan wrote:Well, if you use a tmpfs, and delete the file before writing to it (or just open an FD with O_TMPFILE), the file doesn't have (or will never get) a name, so no-one can look at it. Unless you look at the FD. But anyone that can look at the FD can also just read out the process's memory (/proc/<PID>/mem), so it is a lost cause either way.

Yep, agreed. But you can't access /proc/<PID>/mem from other processes with the same user id, even though the file permissions say so. I've just checked.

Code: Select all

$ ps aux | grep mc
bzt          856  0.0  0.1  20652  6748 pts/0    S+   Jan15   0:08 mc
bzt       102411  0.2  0.2  20056 10612 pts/2    S+   22:30   0:00 mc
$ cat /proc/856/mem
cat: /proc/856/mem: Permission denied
$ cat /proc/102411/mem
cat: /proc/102411/mem: Permission denied
$ ls -la /proc/856/mem
-rw------- 1 bzt bzt 0 Jan 18 22:33 /proc/856/mem
$

Maybe your Linux is misconfigured?
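For reference, reading through the mem file looks something like the sketch below, assuming Linux with /proc mounted and a 64-bit process. peek_memory is a made-up helper, not an existing API. It has to pread() at a mapped address, so a plain cat starting at offset 0 would not get far even when the open succeeds. Opening another process's mem file is additionally subject to the kernel's ptrace access check (see proc(5)), independent of the file mode bits, so results can differ between systems.

```c
#define _XOPEN_SOURCE 700 /* for pread() */
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Copies len bytes found at virtual address addr of the process behind
 * mem_path (e.g. "/proc/self/mem") into out.
 * Returns the number of bytes read, or -1 on error. */
ssize_t peek_memory(const char *mem_path, const void *addr,
                    void *out, size_t len)
{
    int fd = open(mem_path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset is the virtual address in the target process. */
    ssize_t n = pread(fd, out, len, (off_t)(uintptr_t)addr);
    close(fd);
    return n;
}
```

Pointing it at "/proc/self/mem" with the address of a local buffer reads the caller's own memory, which is always permitted. For another PID the path would be /proc/<PID>/mem and the address must be valid in that process.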

nullplan wrote:Now for the advanced question: Can you fexecve() an FD that was opened with O_CLOEXEC? And what will /proc/self/exe be then?

Hmm, good question! I'll have to write a small test.

nullplan wrote:And this approach depends on being able to create writable and executable pages (not necessarily at the same time), which a hardened kernel may deny.

Normally true, but not for the ELF/PE loader. It must always have the privilege to write to memory that will later be executed; you simply cannot load executables into memory otherwise, so it makes sense to put the deobfuscator there.

nullplan wrote:You underestimate your compilers. If they can prove the results will always be the same, they will make them always the same. Now, in general memory access is slower than computation, so the compiler will likely convert this into a series of constant assignments, but there is no rule saying it can't be done. In fact, there is no rule saying your snippet can't be implemented with strcpy(hello, "Hello World") even if hello is volatile-qualified.

No, there's no optimization that would replace several character operations with string data. I've tried. But to be future-proof, you could insert some asm volatile("" ::: "memory"); statements to force barriers and rule out any not-yet-implemented, upcoming optimizations, or you could use inline assembly to construct the data with a bunch of "mov"s. The point is, don't store the data as-is; rather, provide a procedural way of constructing it at run-time. You can use whatever procedural method you deem necessary.

Cheers,
bzt

OSDev.org
