OS Updater

AndrewAPrice · Post by **AndrewAPrice** » Sun Jun 29, 2014 2:05 pm

It depends on how you implement stuff in your OS, but you shouldn't have a problem updating the file and then sending a message saying either "quit and relaunch yourself" or "hibernate, relaunch, and resume yourself".

How do you update the file of a running executable? Well, unless you're doing something tricky like deferred loading of machine code or you have metadata/embedded attachments, there's no use in keeping a lock on the file once it's loaded into memory.

Security will probably be your biggest concern (could someone fake that an update is available over a public network?). You may want to consider signing your updates with some kind of public/private key scheme.

madanra · Post by **madanra** » Sun Jun 29, 2014 2:48 pm

The real fun starts if you want to live update your kernel.

I know Linux has limited support for this via ksplice/kpatch - has anyone here attempted live kernel update without affecting running processes?

sortie · Post by **sortie** » Sun Jun 29, 2014 3:05 pm

I think live updates of the kernel could be done in a reasonable manner if you have proper hibernation support: Simply suspend everything (perhaps to RAM, so no disk access is required), kexec, and resume the user-space.

Nable · Post by **Nable** » Sun Jun 29, 2014 4:34 pm

sortie wrote:I think live updates of the kernel could be done in a reasonable manner if you have proper hibernation support: Simply suspend everything (perhaps to RAM, so no disk access is required), kexec, and resume the user-space.

User-space cannot be separated from a lot of kernel variables, at least it cannot be done 'simply'. Opened files, sockets, buffers... After all, I've heard about such an implementation in one commercial product that offers so-called container virtualization for Linux - they suspend containers, do kexec and then resume execution of virtual environments. Maybe it even works.

Love4Boobies · Post by **Love4Boobies** » Sun Jun 29, 2014 6:21 pm

Ksplice and kGraft both have very poor designs, unfortunately. The design I came up with involves a cooperative model: the update manager announces to the components in question that updates are available, and they reach state quiescence (predictable states that allow the state transfer functions to be simple and deterministic) in bounded time, due to certain constraints. There is much to be said on how these state transfer functions can be created automatically, since writing them by hand (the common approach) is error-prone. Furthermore, if anything goes wrong with the update, a rollback will be performed.
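
To make the cooperative model a bit more concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration: the `component` struct, the `in_flight` counter used as the quiescence test, and the `try_update` function are hypothetical names, and the bounded-time guarantee is not modelled at all. It only shows the shape of "check quiescence, snapshot, transfer, roll back on failure".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical component interface; none of these names come from the post. */
struct component {
    int  version;
    int  in_flight;                        /* requests currently being served */
    bool (*transfer)(struct component *);  /* state transfer; false on failure */
};

enum update_result { UPDATED, NOT_QUIESCENT, ROLLED_BACK };

/* Example transfer functions for the sketch. */
static bool transfer_ok(struct component *c)
{
    (void)c;
    return true;
}

static bool transfer_fails(struct component *c)
{
    c->in_flight = -1;   /* leave the state damaged to show the rollback */
    return false;
}

/* Called by the (hypothetical) update manager after announcing an update. */
static enum update_result try_update(struct component *c, int new_version)
{
    assert(c && c->transfer);

    if (c->in_flight != 0)        /* not at a quiescent point yet */
        return NOT_QUIESCENT;

    struct component saved = *c;  /* snapshot for the rollback */
    c->version = new_version;
    if (!c->transfer(c)) {
        *c = saved;               /* restore the pre-update state */
        return ROLLED_BACK;
    }
    return UPDATED;
}
```

A component that is still serving requests gets `NOT_QUIESCENT` and the manager would retry later; a failed transfer leaves the component exactly as it was before the attempt.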

Owen · Post by **Owen** » Sun Jun 29, 2014 7:21 pm

I've always quite liked what Microsoft have done to all of Windows' core DLLs, and then sat on and not used since about 3 months after they rolled out the feature* (to everyone's annoyance).

For every function, the compiler emits the code in the format

    db 0,0,0,0,0            ; 5 bytes of scratch space
function:
    <2-byte NOP>            ; Windows uses mov edi, edi
    <function body>

Hot patches are aimed at specific DLL versions, and basically say "Patch function X to redirect to replacement function Y". You can probably only do this at exports, because of the desire to support multiple "layered" hotpatches (e.g. consider a system with 365 days uptime - the core system libraries are liable to have accrued many patches during that time)

The hotpatching procedure is:

The 5 byte scratch space before the function is overwritten with JMP <thenewfunction>
Fence
The 2 byte NOP is atomically replaced with JMP -5, i.e. a jump backwards to the start of the scratch space

It's important that the first instruction of the function be two bytes long to ensure that it can be replaced atomically (i.e. that no process can have its IP pointing into the middle of it). This produces

    JMP replacement
function:
    JMP -5
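
As a rough sketch of those three steps in C, the function below builds the patched bytes in an ordinary buffer rather than in live code. The name `hotpatch_x86` and its parameters are hypothetical, it assumes 32-bit x86 (where a 5-byte `E9 rel32` jump can reach any address), and it skips everything a real patcher needs around it, such as making the code pages writable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper. `func` points at the 2-byte NOP inside a writable
 * copy of the code; the 5 scratch bytes sit directly before it.
 * `func_addr` and `new_addr` are the 32-bit addresses the code runs at. */
static void hotpatch_x86(uint8_t *func, uint32_t func_addr, uint32_t new_addr)
{
    assert(func);
    uint8_t *pad = func - 5;

    /* Step 1: E9 rel32 in the scratch space. The displacement is measured
     * from the end of the 5-byte jump, which is func_addr itself. */
    uint32_t rel32 = new_addr - func_addr;
    pad[0] = 0xE9;
    pad[1] = (uint8_t)(rel32 & 0xFF);          /* little-endian rel32 */
    pad[2] = (uint8_t)((rel32 >> 8) & 0xFF);
    pad[3] = (uint8_t)((rel32 >> 16) & 0xFF);
    pad[4] = (uint8_t)((rel32 >> 24) & 0xFF);

    /* Step 2: make the long jump visible before the entry point changes. */
    atomic_thread_fence(memory_order_release);

    /* Step 3: EB F9 is a short jump with rel8 = -7, measured from the end
     * of this 2-byte instruction, so it lands on func - 5. A real patcher
     * must write both bytes with one aligned 16-bit store; two byte writes
     * are only acceptable here because nothing executes this buffer. */
    func[0] = 0xEB;
    func[1] = 0xF9;
}
```

For a function at 0x1000 redirected to 0x2000, the scratch space ends up holding `E9 00 10 00 00` and the entry point holds `EB F9`.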

A similar system can be devised for other architectures. For example, ARM:

    .long 0
function:
    <4-byte instruction>    // All instructions are 4 bytes in ARM code. For Thumb code, care must be taken.

with result

    .long replacement
function:
    ldr.w pc, [pc, #-12] // -8 for thumb
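
The -12 and -8 offsets fall out of how each instruction set reads the PC. The small C function below redoes that arithmetic; `ldr_pc_offset` is a hypothetical name, and the sketch assumes the literal word is 4-byte aligned and sits directly before the function, as in the layout above.

```c
#include <assert.h>
#include <stdint.h>

/* Offset to encode in a PC-relative load at `insn_addr` so that it reads
 * the word stored at `literal_addr`. */
static int32_t ldr_pc_offset(uint32_t insn_addr, uint32_t literal_addr, int thumb)
{
    assert((literal_addr & 3u) == 0);   /* word loads want an aligned literal */

    /* ARM state reads PC as the instruction address + 8. Thumb state reads
     * it as the instruction address + 4, rounded down to a multiple of 4
     * for literal loads. */
    uint32_t pc = thumb ? ((insn_addr + 4u) & ~3u) : (insn_addr + 8u);

    return (int32_t)((int64_t)literal_addr - (int64_t)pc);
}
```

With the literal at 0x8000 and the function at 0x8004, this gives -12 in ARM state and -8 in Thumb state, matching the comment in the listing.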

Things may vary slightly depending upon the architecture. On x86_64, a more complicated procedure is needed, due to the limited range of the jump instruction (though it benefits from the availability of RIP relative addressing). On ARM64, it is likely required that things be implemented as a jump backwards to a PC relative load followed by a branch to specified register.

This approach has its limitations - it can only work for patches which do not change data structures. However, it cannot be denied its utility in being able to patch uncooperative running processes.

The ideal situation, IMO, is that every security update to such a system library include a "hotpatch" file alongside the new library version. The updater would "ping" all running processes to inform them that a hotpatch has been installed and they would then, if the to-be-patched library is loaded, layer the patch on top of it.

(*Other people have, to their own nefarious ends...)

Love4Boobies · Post by **Love4Boobies** » Sun Jun 29, 2014 8:15 pm

This solution is dangerous. For example, consider something like the following:

void init()
{
    ++count;
    // ...
}

void run()
{
    // ...
}

gets changed to

void init()
{
    // ...
}

void run()
{
    ++count;
    // ...
}
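
The hazard can be simulated in a few lines of C. This is only a sketch: the function pointers `init_fn` and `run_fn` stand in for the redirected entry points, and `patched_between_calls` is a hypothetical scenario in which the patch lands after the old init() has run but before run() is called.

```c
#include <assert.h>

static int count;

/* Version 1: init() does the increment. */
static void init_v1(void) { ++count; }
static void run_v1(void)  { }

/* Version 2: the increment moved into run(). */
static void init_v2(void) { }
static void run_v2(void)  { ++count; }

/* Patchable entry points, standing in for the redirected functions. */
static void (*init_fn)(void) = init_v1;
static void (*run_fn)(void)  = run_v1;

/* Either version on its own leaves count at 1. */
static int unpatched(void (*init)(void), void (*run)(void))
{
    count = 0;
    init();
    run();
    return count;
}

/* Hypothetical scenario: the hot patch lands between the two calls. */
static int patched_between_calls(void)
{
    count = 0;
    init_fn = init_v1;
    run_fn  = run_v1;

    init_fn();            /* old init(): count becomes 1 */
    assert(count == 1);

    init_fn = init_v2;    /* hot patch applied here */
    run_fn  = run_v2;

    run_fn();             /* new run(): count is incremented again */
    return count;
}
```

Both versions are correct in isolation, but the mixed execution increments `count` twice, which neither version intended.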

vfehring · Post by **vfehring** » Mon Jun 30, 2014 9:14 am

I would love to see live patching come to Mac OS and Windows all the same. At the same time, I find it an interesting concept to patch the system while the user is interacting with it. It could make some users feel insecure about using the system. People have gotten so used to the fact that systems need to reboot in order to perform an update.

bluemoon · Post by **bluemoon** » Mon Jun 30, 2014 9:24 am

vfehring wrote:People have gotten so used to the fact that systems need to reboot in order to perform an update.

It depends. For example, according to the FreeBSD Handbook:

If the update applies any kernel patches, the system will need a reboot in order to boot into the patched kernel. If the patch was applied to any running binaries, the affected applications should be restarted so that the patched version of the binary is used.

Some critical parts require a reboot to reduce the risk posed by complicated kernel state. Rebooting to activate is the simplest solution for most users, but if you absolutely don't want to reboot you may still update the following (though complicated actions are involved):

Kernel modules (if they can be unloaded and are compatible with the active kernel)
Shared libraries (newly launched applications can use the new version, while already running applications remain untouched)
Userland applications (obvious)

AndrewAPrice · Post by **AndrewAPrice** » Mon Jun 30, 2014 9:53 am

Nable wrote:
sortie wrote:I think live updates of the kernel could be done in a reasonable manner if you have proper hibernation support: Simply suspend everything (perhaps to RAM, so no disk access is required), kexec, and resume the user-space.
User-space cannot be separated from a lot of kernel variables, at least it cannot be done 'simply'. Opened files, sockets, buffers... After all, I've heard about such an implementation in one commercial product that offers so-called container virtualization for Linux - they suspend containers, do kexec and then resume execution of virtual environments. Maybe it even works.

Are you talking about implementing hibernation?

For the most part, you save these variables with the process's memory dump.

Certain things, such as hardware state, can't easily be saved and resumed. For example, Direct3D has a 'device lost' event, presumably for situations such as hibernating or switching to another 'exclusive' full-screen application, which tells the application that the GPU's state has been reset and that it needs to re-upload buffers, textures, and shaders and reinitialize the GPU's state. You can do the same with other hardware: send the application or driver a 'device lost' message and tell it to reinitialize the device. Network sockets are automatically closed when hibernating, and it's up to the application whether to try to reconnect, which in many cases (web browsers) will be seamless to the user.

In most cases, file handles can be reloaded as is - the only time they can't is if you hibernate, remove the hard drive or boot into another operating system, delete/modify the locked file, then reboot back into your OS. You can handle these special cases by forcefully closing the file handle, in the same way as if the user pulled out removable media.

Have a website/web server somewhere that hosts files.
Poll that website to see if a new version is available.
Download the new version.
Verify the new version's signature with your public key to ensure it's authentic (the private key stays with you and is only used to sign releases).
Hibernate all running processes.
Apply update.
Reload the kernel.
Resume running processes.

You need to consider if you'd like your updates to be 'patches' or to be complete versions.

The differences are:

Patches:

With patches, you only store the difference between that version and the previous, so the file size is much smaller as it only contains what has changed.
In the long run, you may end up with a lot of server storage because you need to store all versions on a server, since someone running a very old version may want to update to the latest version, and suddenly they're installing 10,000 patches or more.

Complete versions:

Updating from one version to another requires re-downloading the entire OS, even if only a few files changed.
You only need to store the latest version on the server.

Or you could do a hybrid approach. For example, major versions like 1.2, 1.3, 1.4 are complete versions, while minor versions like 1.2.1, 1.2.2 are patches.
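
A minimal sketch of how an updater might plan a hybrid update, in C. The `version` and `update_plan` structs and the `plan_update` function are hypothetical, and it assumes patches are numbered consecutively from 1 within each complete x.y release (so 1.2.3 is the third patch on top of 1.2).

```c
#include <assert.h>
#include <stdbool.h>

struct version { int major_no, minor_no, patch_no; };  /* e.g. 1.2.1 */

struct update_plan {
    bool full_download;  /* fetch the complete x.y release first */
    int  first_patch;    /* number of the first patch to apply */
    int  patch_count;    /* how many patches to apply, in order */
};

static struct update_plan plan_update(struct version from, struct version to)
{
    assert(from.patch_no >= 0 && to.patch_no >= 0);

    /* Patches only chain within the same complete release. */
    bool same_base = from.major_no == to.major_no &&
                     from.minor_no == to.minor_no &&
                     from.patch_no <= to.patch_no;

    /* After a full download, no patches have been applied yet. */
    int applied = same_base ? from.patch_no : 0;

    struct update_plan plan;
    plan.full_download = !same_base;
    plan.first_patch   = applied + 1;
    plan.patch_count   = to.patch_no - applied;
    return plan;
}
```

Going from 1.2.1 to 1.2.4 needs only patches 2 through 4, while going from 1.2.3 to 1.4.2 means downloading the complete 1.4 release and then applying its two patches. This keeps the patch chain short no matter how old the installed version is.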

You could flag some updates with metadata, e.g. some patches may not be able to hibernate/resume because a critical process data structure changes.

Love4Boobies · Post by **Love4Boobies** » Mon Jun 30, 2014 8:57 pm

That's obviously not all that's involved because processes may have some state associated with the kernel and that may be incompatible between kernel versions. This needs to be translated. If you're going to keep that, you're probably going to have to have bookkeeping data as well---same problem with that. All in all, restarting the kernel seems to be a terrible idea because initialization brings you to a state you don't want. Furthermore, it may be that if the update happens at the wrong time and the translation doesn't cover that case, you will be left with an inconsistent state (see my previous example).

Regarding hibernation, I don't see how that's relevant at all.

madanra · Post by **madanra** » Mon Jun 30, 2014 11:55 pm

I think the use of hibernation is that, if the serialisation format you use for hibernation is kernel independent, then you can use the same serialisation to update the kernel without killing running processes (presumably serialising to RAM instead of disk for efficiency, though that's an implementation detail). Of course that only works as long as you don't change the serialisation format, so it would probably only work between minor version updates.

That seems like a reasonable scheme - though the use of the term hibernation could be confusing, as typically that's only used for when you serialise to disk and power off, and this would be serialising to RAM and not powering off.

Love4Boobies · Post by **Love4Boobies** » Tue Jul 01, 2014 12:48 am

Serialization of what?

madanra · Post by **madanra** » Tue Jul 01, 2014 1:12 am

The running state of the machine. That's what hibernating is: serialising the running state of the machine, so it can be restored later.

Love4Boobies · Post by **Love4Boobies** » Tue Jul 01, 2014 1:22 am

For hibernation, you don't need to serialize anything; you just need to gracefully handle things that can't be saved, save the rest (deciding on the two categories can be tricky, too; e.g., consider tasks where timing is relevant---but several solutions come to mind), and then restore them when you're up and running again. However, the point about live updates is that you need to run this state through transfer functions that do some translation, otherwise you might end up with new code accessing different and/or buggy data that old code acquired. These transfer functions must have intimate knowledge about the differences between the behaviors of the two versions and can get extremely complex.
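
As a toy illustration of a state transfer function, here is a sketch in C. The two struct layouts are invented for the example: it assumes the old version kept a timeout in seconds and the new version keeps milliseconds plus a retry counter that did not exist before. Real transfer functions would have to cover far more than a unit change.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state layouts of the old and new code. */
struct conn_state_v1 { uint32_t timeout_s; };
struct conn_state_v2 { uint32_t timeout_ms; uint32_t retries; };

/* Translates old state into the form the new code expects.
 * Returns 0 on success, -1 if the old value cannot be represented,
 * in which case the update should be abandoned. */
static int transfer_v1_to_v2(const struct conn_state_v1 *old_state,
                             struct conn_state_v2 *new_state)
{
    assert(old_state && new_state);

    if (old_state->timeout_s > UINT32_MAX / 1000u)
        return -1;                      /* would overflow in milliseconds */

    new_state->timeout_ms = old_state->timeout_s * 1000u;
    new_state->retries    = 0;          /* no v1 equivalent; use v2's initial value */
    return 0;
}
```

Even here the function needs to know what the old field meant, what the new code assumes, and what to do when the two don't line up, which is the "intimate knowledge" part.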

OSDev.org
