Bootsector Reads Slowly

BugHunter · Post by **BugHunter** » Fri Aug 11, 2006 7:48 am

I've been viewing the OS-FAQ and forum for a while, I like this website alot.

I found some examples on bootsector codes, like John Fines' and Anselm Meyn. Also, I looked at BOS (from Bubach) bootsector, his bootsector also reads the kernel (BOS v0.04 kernel is 18kb) pretty slowly.

My bootsector is pretty similar to the last 2 examples mentioned, and as my kernel is getting bigger too (though I'm still experimenting and coding with OS-Dev stuff) the bootsector reads more slowly.

I was wondering if anyone knows optimizations to speed up the loading process of loading the kernel.

If anyone wants to view my bootstrap, post it here, I'm sorry I did not add it as attachment since I'm in the public library and not having my bootsector by hand (soon I'll have internet at home since I just moved)

-- Greetz, BugHunter

bluecode · Post by **bluecode** » Fri Aug 11, 2006 8:48 am

I assume your loading your kernel from floppy disk. And you know what? Floppy disks are slow.

Kemp · Post by **Kemp** » Fri Aug 11, 2006 9:34 am

I have to admit that I forgot just how slow they were myself. I'm used to USB2 flash sticks, then one day I had to copy a small file onto a floppy and I was like
"Wtf? Is this even working?"

Pype.Clicker · Post by **Pype.Clicker** » Fri Aug 11, 2006 10:37 am

you might switch to grub and use its .gz auto-decompression ... you could also opt for boot-from-cdrom, but that put apart i don't see what kind of "optimization" we could expect from does dmn floppies...

Combuster · Post by **Combuster** » Sun Aug 13, 2006 1:29 pm

To optimize floppy speed, try to:

- Minimize seeking
- Cache the FAT (and directory tables) in local memory
- Read as many consecutive sectors as possible
- Read both heads if possible
- Use DMA

Also, avoid GRUB (more kbs to load than most hobby kernels, but then again, multiboot rules)

To optimize overall speed you can compress it, and not waste the clocks during floppy waits (i.e. decompress block while the fdc loads the next)

If that's not enough for you, try harddisk booting

BugHunter · Post by **BugHunter** » Mon Aug 14, 2006 6:47 am

Ah, I had almost forgot floppy disks are slow.

But since i'm using a win32 virtual floppy driver it should be fast, and it is on BOCHS, but it's slower on VMwar.

Anyway it's clear now.

Another question, does anyone have any ideas on how to implement compression in my custom bootloader (without GRUB)?

Combuster · Post by **Combuster** » Mon Aug 14, 2006 7:15 am

you can try to make a bZip, gZip, 7z or PKZip decompression scheme, but they can be relatively large in comparison to the kernel.

You can also try to make your own compression scheme, huffman or LZ, with the advantage that you can keep code to a minimum and compression rates pretty decent

I made a very cheap LZ algorithm once, and i think it should be codable in the 512 byte limit if you want to.:
read a word (NOT dword). test bit 15, use the rest as count. if set, use count to determine how many bytes to copy from source to destination. if clear, read another word, subtract it from the current pointer, then copy count bytes. Repeat until out of data.

The compressing procedure should be relatively obvious based on this.

the downside for custom schemes is that you need to write your own compressing utility...

Pype.Clicker · Post by **Pype.Clicker** » Mon Aug 14, 2006 7:16 am

BugHunter wrote: Another question, does anyone have any ideas on how to implement compression in my custom bootloader (without GRUB)?

A) Find a library that does zip/gz unpacking and that fits your operating environment (real/pmode, lack of file-reading interrupts, whatever)

B) Hunt for a book about compression techinques and implement your own version of packing/unpacking stuff...

there are also tools for creating self-unpacking executable, but honnestly, if you feel a need for compression in a bootloader, my guess is you should really turn to GRUB -- unless what you want to develop is a bootloader rather than an OS. Same would go for "how can i boot from the network" etc.

Candy · Post by **Candy** » Mon Aug 14, 2006 10:23 am

Combuster wrote: You can also try to make your own compression scheme, huffman or LZ, with the advantage that you can keep code to a minimum and compression rates pretty decent

I made a very cheap LZ algorithm once, and i think it should be codable in the 512 byte limit if you want to.:
read a word (NOT dword). test bit 15, use the rest as count. if set, use count to determine how many bytes to copy from source to destination. if clear, read another word, subtract it from the current pointer, then copy count bytes. Repeat until out of data.

I made a compressor on my website that's public domain, and that includes a 60-byte decompressor for the explicit intent of being used in boot code. Have tested it somewhat, not perfectly but at least a bit, and it works decently. It includes compression + decompression tools for unix that are not at all even close to being quick. I think compressing >200k is sort of masochistic, it'll take half an hour or more. It's O(n^4) for those interested, it can easily be taken down to O(n^2) without decreasing its compression or without making it incompatible with the decompressor.

I believe there's a smallish hack in there that requires a certain define in the compressor/decompressor to keep the boot code fully 16-bit and small. It only affects files larger than 32k and doesn't decrease compression by much.

It's on http://www.sf.net/projects/atlantisos in the SVN archive at http://svn.sourceforge.net/viewvc/atlan ... ompressor/
. Hope you can use it!

BugHunter · Post by **BugHunter** » Fri Aug 18, 2006 6:25 am

Thanks Candy, anyway you forgot and 's' after 'project' in the first URL.

I'll check it out, I hope I can understand some of it as I'm not into compressing at all but very interested in it.

EDIT: I've checked them out, now my question is, how can I compress/decompress some bytes, in the bootsector directory 'dc.asm' should be able to compress/decompress (the note in the directory says so). If you have some compressing tutorials/docs I'm also interested, I hope you want to help me

Combuster · Post by **Combuster** » Fri Aug 18, 2006 8:12 am

Compression docs:
http://en.wikipedia.org/wiki/LZ77_and_LZ78
http://en.wikipedia.org/wiki/Huffman_coding
which are the basic algorithms. Most losless compression schemes use one or more of these three algorithms.

If you want a quickstart using candy's code, I took a look at it to see where the bytes were going.
Even though its poorly documented even for my standards, it looks like you need to supply the following registers:
SI = source image start
DI = destination start
DX = source image size

Use of segments and D bit are a bit vague... (ES=DS, CLD?)
I'm sure Candy knows more specific details of his code than what i can figure at this point - so don't count on me being correct here.
the only real problem i see is the 16 bit limit ( 64 Kbyte ).

Candy · Post by **Candy** » Fri Aug 18, 2006 9:28 am

Combuster wrote: Compression docs:
http://en.wikipedia.org/wiki/LZ77_and_LZ78
http://en.wikipedia.org/wiki/Huffman_coding
which are the basic algorithms. Most losless compression schemes use one or more of these three algorithms.

Check. LZ77 it is.

If you want a quickstart using candy's code, I took a look at it to see where the bytes were going.
Even though its poorly documented even for my standards, it looks like you need to supply the following registers:
SI = source image start
DI = destination start
DX = source image size

Well... that's true, it's very badly documented. It's also quite hacked to make it very small, so you might find a few things unclear. I'll add comments tonight (GMT). I'm pretty sure you have the right info about those, since I think I used ds:si as source and es:di as destination.

Use of segments and D bit are a bit vague... (ES=DS, CLD?)

IIRC DS is the source segment, ES is the destination segment, not sure though. Will check & post a more contentual reply.

I'm sure Candy knows more specific details of his code than what i can figure at this point - so don't count on me being correct here.
the only real problem i see is the 16 bit limit ( 64 Kbyte ).

That's in fact a problem. You can't uncompress more than 64k (at a time) with this code. You can't reference beyond 64k either, but there's a way around it. The format uses back references (a la LZ77 iirc) of up to 32k. If you decode 32k, move it back and decode another 32k, you can keep the compression references available.

Speed shouldn't be the issue in decompressing in your boot sector.

Candy · Post by **Candy** » Fri Aug 18, 2006 9:53 am

It's not poorly documented.

It isn't documented.

----

It wasn't documented.

I've added calling convention documentation at the top, comments in the code about how it works (although the C version remains the reference, since the asm version is optimized for size) and made es!=ds work. I shrunk it by a few bytes in the process, it's now 56 bytes.

----

I have not tested this version on bootcode yet. I'm about to modify my boot loader (also in svn) to allow for compressed images and then I'll test what it does. I have tested the version before changes in Linux, by changing all si/di to esi/edi and compiling it with BITS 32, in a small wrapper. That one worked and did the same as the C decompressor, if the compressor was compiled with limit.

The limit is a limit on the backlink. The 15th bit is used for indicating >16k back to copy from for the C version, whereas the assembly version (as it's limited to 64k anyway) uses that bit for another 16k of valid back linking (as opposed to a few megs it couldn't address anyway, which would also enlarge it by quite a bit).

The numbers are encoded in an expanding number format. The most significant bit is used for indicating an expansion to the next byte (where the next byte would be loaded before this one). To allow a bit more stretch, a two-byte number should be read as the number you find + the first number not encodable by 1-byte encoding. Examples:

length 0: 00
length 66 (42) : 42
length 127 (7F) : 7F
length 128 (80) : 80 00 - that's a coincidence, didn't consider
length 129 (81) : 81 00
length 255 (FF) : FF 00
length 256 (100) : 80 01
length 4095 (FFF) : FF 1E
length 4096 (1000) : 80 1F
length 32767 (7FFF) : FF FE
length 32768 (8000) : 80 FF
length 32995 (807F) : FF FF

This is for the bootsector code.

If the number decoded is the bytecount, the lowest bit is the compressed-bit. You shift the number right by one (limiting it to 0x403F) for the actual byte count (with constant offset).

When copying an uncompressed block, the bytecount should be raised by 1. When copying a compressed block (from the destination), the bytecount should be raised by 3. The raise-by-one is to avoid a bytecount of 0 from being pointless (wasting a small bit of a bit) and the raise-by-3 is to make sure only useful back pointers are encoded (since they take two bytes at least, you need them to fill at least 3 bytes in the output or they're useless - so to save sub-bits the count is increased by 3).
Compressed blocks take between 2 and 4 bytes in the input, uncompressed take bytecount + 1 or 2.

Candy · Post by **Candy** » Fri Aug 18, 2006 10:19 am

I made a very cheap LZ algorithm once, and i think it should be codable in the 512 byte limit if you want to.:
read a word (NOT dword). test bit 15, use the rest as count. if set, use count to determine how many bytes to copy from source to destination. if clear, read another word, subtract it from the current pointer, then copy count bytes. Repeat until out of data.

That's my algorithm if you rip out the number encoding. That would reduce compression by only a few percent (possibly increase it slightly) but would also reduce its size to around 40 bytes. I'm not sure that's worth it, but if I have a chance I'll add a second boot sector example with that in it, plus compressor and decompressor to go along with it.

OSDev.org

Bootsector Reads Slowly

Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly

Re:Bootsector Reads Slowly