[SOLVED] Writing to floppy causes data to be wiped!
Hello fellow os devers!
I am currently writing an OS in protected mode, and have just put the finishing touches on the "Floppy disk: read sector" driver code. All works great.
So now it was time to write "Floppy disk: write sector". Well, no problem: the two are similar in terms of programming. All there is to do is change the command sent to the FDC, and SEND bytes instead of receiving them. I finished writing the code and started the bug hunt. Some minor bugs occurred (inc instead of dec in a loop counter, and off-by-one errors), and then the driver was ready. I loaded my new system onto a floppy disk, and lo and behold: it executed with no errors. I took out the floppy, put it in my development machine, and examined the target sector in a hex editor. Yup. The sector was filled with 0x00 0x01 0x02 0x03 0x04... 0xff. Test was successful!
I then wiped that sector back to null, just to see it run again. I like it when stuff works twice. But this time my OS was not able to boot! Not even an error showed up!
I examined the floppy again, and on closer inspection I discovered that random chunks had been filled with zeros!
I reloaded my OS onto the floppy and changed some code. Instead of filling a sector with 0x00 0x01 0x02 0x03... 0xff, I made it fill a sector with 512 x 0xdc, just to see what would happen.
Now I ran the code, and saw the same result. Random chunks filled with 0x00 in the middle of my masterpiece! Except for one of these chunks: it started with 15-20 bytes of 0xdc, followed by 400-500 bytes of 0.
I have read the FDC manual and article on the wiki countless times. I have searched hither (this very forum) and thither (Google...) for an answer on what is going on.
I have a guess though: The FDC starts writing before the right sector has been reached.
Does anyone know what is happening here?
Last edited by danielbj on Thu Jan 08, 2015 9:11 am, edited 1 time in total.
Daniel Broder Jensen
UNICORN OS
Re: Writing to floppy causes data to be wiped!
Hi!
There are several things that can cause your problem. We need more information first.
1- Can you post your relevant driver code?
2- Are you running on an emulator or on bare metal?
3- If running on an emulator, which one?
4- Can you provide details about what exactly happens in those "mysterious" moments at bootup?
It seems to me you're running on bare metal. If so, please try to use an emulator, which makes debugging easier. Also, you'll be able to use the serial ports to report output, rather than the VGA (which can easily fail).
I'm pretty sure we'll be able to help you, provided you give us enough information.
Happy New Code!
Hello World in Brainfuck:
Code:
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
Re: Writing to floppy causes data to be wiped!
1- Can you post your relevant driver code?
Yes I can. But I must warn you: this is very system-specific assembly. I hope you get the general idea of what's going on. Let me try and guide you through the most critical system calls made when writing to the disk.
First, system call number 0x0004000c (fdwrsec(0x0004000c, basesid, base, sector)) is performed by the test code. Below is an example of how:
Code:
push 0 ;first we push the last argument: the LBA sector to be written to.
push 0 ;then the base address of what we want to write
push eax ;then the Service ID of the base. A service is roughly the same as a segment to be selected in any segment register.
push 0x0004000c ;this is the callcode, specifying what we want the kernel to do. When CC=0x0004000c, it knows we want to do fdwrsec
int 0x30 ;this will call the kernel, and return an errorcode in eax.
add esp, 4*4 ;cleaning up the stack
Before this code, the floppy motor is started. And after this code, the floppy motor is stopped.
Now, the below code will run. This is the system call responsible for writing a sector to the disk.
Code:
cc0004000c:;fdwrsec(cc, basesid, base, sector), eax=ec
;Writes one sector to floppy disk.
;Get CHS
push 3
call getarg;eax=?sector (getarg is a local function, returning argn in eax)
add esp, 4*1
push eax;sec
push 0x00040000;cc=getchs
int 0x30;eax=c, ebx=h, ecx=s
add esp, 4*2
mov edx, eax;edx=c (to free eax of the burden of storage)
;Issue command 0xC5
push 0xc5;command
push 0x00040001;cc=fdcommand
int 0x30;eax=ec
add esp, 4*2
mov rs_eax, eax;ec
test eax, eax
jnz .ret
;parameter 1: h*4+drive
shl ebx, 2
push ebx
push 0x00040002;cc=fdparameter
int 0x30
add esp, 4*2
shr ebx, 2
;parameter 2: c
push edx;parameter
push 0x00040002;cc=fdparameter
int 0x30
add esp, 4*2
;parameter 3: h
push ebx;parameter
push 0x00040002;cc=fdparameter
int 0x30
add esp, 4*2
;parameter 4: s
push ecx;parameter
push 0x00040002;cc=fdparameter
int 0x30
add esp, 4*2
;parameter 5: 2
push 2;parameter
push 0x00040002;cc=fdparameter
int 0x30
add esp, 4*2
;parameter 6: sector count (1)
push 1;parameter
push 0x00040002;cc=fdparameter
int 0x30
add esp, 4*2
;parameter 7: 0x1b
push 0x1b;parameter
push 0x00040002;cc=fdparameter
int 0x30
add esp, 4*2
;parameter 8: 0xff
push 0xff;parameter
push 0x00040002;cc=fdparameter
int 0x30
add esp, 4*2
;Transfer data
push 2
call getarg;eax=?base
add esp, 4*1
mov ebx, eax;ebx=?base
push 1
call getarg;eax=?basesid
add esp, 4*1
push ebx;base
push eax;basesid
push 0x00040004;cc=fdo (Another critical system call. This will output 512 bytes to the FIFO. Code below this.)
int 0x30;eax=ec
add esp, 4*3
mov rs_eax, eax;ec
test eax, eax
jnz .ret
;result 1: st0
push 0x00040005;cc=fdresult
int 0x30;eax=result
add esp, 4*1
;result 2: st1
push 0x00040005;cc=fdresult
int 0x30;eax=result
add esp, 4*1
mov ebx, eax;ebx=st1
;result 3: st2
push 0x00040005;cc=fdresult
int 0x30;eax=result
add esp, 4*1
;result 4: c
push 0x00040005;cc=fdresult
int 0x30;eax=result
add esp, 4*1
;result 5: h
push 0x00040005;cc=fdresult
int 0x30;eax=result
add esp, 4*1
;result 6: s
push 0x00040005;cc=fdresult
int 0x30;eax=result
add esp, 4*1
;result 7: 2
push 0x00040005;cc=fdresult
int 0x30;eax=result
add esp, 4*1
;Check st1
mov rs_eax, 0x00040001
test bl, 00010000b
jnz .ret
mov rs_eax, 0x00040002
test bl, 00000010b
jnz .ret
mov rs_eax, 0;ec
.ret:
ret
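The ST1 check at the end of fdwrsec can be sketched in C as follows. This is only an illustration, not the poster's code: the function and macro names are hypothetical, and the meaning attached to the two error codes is an assumption read off the bits being tested (bit 4 of ST1 is overrun/underrun and bit 1 is "not writable" on the 82077AA).

```c
#include <stdint.h>

/* ST1 bits tested by the assembly above (82077AA result phase). */
#define ST1_NOT_WRITABLE 0x02u /* bit 1: medium is write protected */
#define ST1_OVERRUN      0x10u /* bit 4: FIFO overrun/underrun */

/* The numeric codes come from the posted code; the names are assumptions. */
#define EC_OK           0x00000000u
#define EC_OVERRUN      0x00040001u
#define EC_NOT_WRITABLE 0x00040002u

/* Hypothetical helper: map ST1 to an error code, testing the bits in the
 * same order as the assembly (overrun first, then not writable). */
uint32_t fd_check_st1(uint8_t st1)
{
    if (st1 & ST1_OVERRUN)
        return EC_OVERRUN;
    if (st1 & ST1_NOT_WRITABLE)
        return EC_NOT_WRITABLE;
    return EC_OK;
}
```

Note that, like the assembly, this ignores ST0 and the other ST1 bits, so errors such as "no data" (sector not found) would go unreported.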
Below is the fdo system call. This system call outputs 512 bytes from a buffer to the FIFO register.
Code:
cc00040004:;fdo(cc, basesid, base), eax=ec
;Send data from base to FDC.
push 1
call getarg;eax=?basesid
add esp, 4*1
push eax;sid
push 0x00020019;cc=sidds
int 0x30;eax=ec, ds=segment (You just have to trust some functions. Including this one. This will load our base segment into ds)
add esp, 4*2
mov rs_eax, eax;ec
test eax, eax
jnz .ret
push 2
call getarg;eax=?base
add esp, 4*1
mov ebx, eax;ebx=?base
mov ecx, 512;counter
@@:
mov dx, 0x03f4;select MSR
in al, dx;get MSR
test al, 00100000b;test NDMA
jz @b;Jump if data not ready
test al, 10000000b;test RQM
jz @b;jump if data not ready
mov dx, 0x03f5;select FIFO
mov al, [ebx];get data
out dx, al;set data
inc ebx;update pointer
dec ecx;update counter
jnz @b
mov rs_eax, 0;ec
.ret:
ret
Some of this code is not the best. May be optimized in the future. Just have to get it working before moving on to that task
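For comparison, the MSR test in the fdo loop can be written as a pure function in C. This is a hedged sketch with hypothetical names. It also checks the DIO bit, which the posted loop does not, and adds a second predicate for the case where the controller has already left the execution phase. In that case NDMA stays clear, and a loop that waits only for NDMA and RQM would spin forever.

```c
#include <stdbool.h>
#include <stdint.h>

/* Main Status Register bits (port 0x3f4). */
#define MSR_RQM  0x80u /* controller is ready for a FIFO access */
#define MSR_DIO  0x40u /* 1 = controller to CPU, 0 = CPU to controller */
#define MSR_NDMA 0x20u /* set during a non-DMA execution phase */

/* True when the FDC is asking the CPU for the next data byte. */
bool fd_can_send_data_byte(uint8_t msr)
{
    return (msr & (MSR_RQM | MSR_DIO | MSR_NDMA)) == (MSR_RQM | MSR_NDMA);
}

/* True when the execution phase is over and a result byte is waiting. */
bool fd_result_phase_started(uint8_t msr)
{
    return (msr & (MSR_RQM | MSR_DIO | MSR_NDMA)) == (MSR_RQM | MSR_DIO);
}
```

A polling loop built on these two checks could stop feeding data as soon as `fd_result_phase_started` returns true and go straight to reading ST0 to ST2.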
I should also mention the state of the floppy:
IRQs: On
Drive Polling mode: On
FIFO: On
Threshold: 8
Implied seek: On
Precompensation: 0
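The settings listed above correspond to the parameters of the FDC's Configure command (0x13). A minimal C sketch of how those four bytes are assembled is shown here. The function name is hypothetical, and mapping "Precompensation: 0" to the PRETRK byte is an assumption. Note that the FIFO and polling bits are "disable" bits, and the threshold is stored minus one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: build the 4-byte Configure command.
 * Byte 2 layout: 0 | EIS | EFIFO | POLL | FIFOTHR[3:0]
 *   EIS   = 1 enables implied seek
 *   EFIFO = 1 DISABLES the FIFO
 *   POLL  = 1 DISABLES drive polling
 *   FIFOTHR = threshold - 1 (threshold range 1..16) */
void fd_build_configure(uint8_t out[4], bool implied_seek, bool fifo_on,
                        bool polling_on, uint8_t threshold, uint8_t pretrk)
{
    uint8_t cfg = (uint8_t)((threshold - 1u) & 0x0Fu);
    if (implied_seek)
        cfg |= 0x40u;
    if (!fifo_on)
        cfg |= 0x20u;
    if (!polling_on)
        cfg |= 0x10u;

    out[0] = 0x13;   /* Configure command */
    out[1] = 0x00;   /* always zero */
    out[2] = cfg;
    out[3] = pretrk; /* precompensation start track */
}
```

With the state listed above (implied seek on, FIFO on, polling on, threshold 8, precompensation 0) the configuration byte works out to 0x47.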
2- Are you running on an emulator or on bare metal?
Bare metal.
Compaq Armada 1592 DT, to be exact.
3- If running on an emulator, which one?
Nope. No emulators are allowed for me. I believe in doing it the hard way.
4- Can you provide details about what exactly happens in that "mysterious" moments at bootup?
POST completes, and then it just hangs. But I don't blame the poor thing: it is executing corrupted code. (This is when booting from a disk that has already been booted once, i.e. a disk that has undergone the OS's write operation and now has random patches of 0's in the code.)
It seems to me you're running on bare metal. If so, please try to use an emulator, thus making debugging easier. Also, you'll be able to use the serial ports to report output, rather than the VGA (which can easily fail).
I'm sorry, but I am not running on any emulator. Neither am I planning to, unless ABSOLUTELY necessary.
I hope this will help. If you have any questions regarding the code, please feel free to ask me.
Daniel Broder Jensen
UNICORN OS
Re: Writing to floppy causes data to be wiped!
danielbj wrote: I'm sorry, but I am not running on any emulator. Neither am I planning to, unless ABSOLUTELY necessary.
Although it's more hardcore, using only real hardware, which involves writing to physical media and then waiting out the boot delay, is like coding machine code in hex. It can be done, but it's not the most efficient way to get to your end result. Proving something works in an emulator first and then moving it to real hardware will save you a lot of time, and can provide more feedback about any errors that are detected.
If you insist on using hardware first, then fall back to an emulator for debugging errors like this; that is what would be considered ABSOLUTELY necessary.
"God! Not Unix" - Richard Stallman
Website: venom Dev
OS project: venom OS
Hexadecimal Editor: hexed
Re: Writing to floppy causes data to be wiped!
I gave it a try, and ran an image of my floppy in Bochs. It did not like that. At all. In the console it tells me the following:
Code:
[BIOS] Booting from 0000:7c00
[FLOPPY] controller reset in software (This tells me that my floppy driver is starting its initialization)
[FLOPPY] controller reset in software (This tells me that my floppy driver is specifying)
[FLOPPY] non DMA mode not fully implemented yet
It looks like Bochs does not support the good ol' fashioned way of doing things.
So well... that's all the information I can give you from running the code on an emulator.
Daniel Broder Jensen
UNICORN OS
Re: Writing to floppy causes data to be wiped!
danielbj wrote:
Code:
int 0x30;eax=ec, ds=segment (You just have to trust some functions. Including this one. This will load our base segment into ds)
add esp, 4*2
mov rs_eax, eax;ec
test eax, eax
jnz .ret
push 2
call getarg;eax=?base
Are you changing the data selector but still using local variables in that last section of code?
"God! Not Unix" - Richard Stallman
Website: venom Dev
OS project: venom OS
Hexadecimal Editor: hexed
Re: Writing to floppy causes data to be wiped!
b.zaar wrote: Are you changing the data selector but still using local variables in that last section of code?
If you by
Code:
call getarg;eax=?base
Daniel Broder Jensen
UNICORN OS
Re: Writing to floppy causes data to be wiped!
Now some time has passed, I have been playing around with the controller timing (HLT, HUT and SRT) and nothing seems to work.
Help and ideas are still very welcome!
Daniel Broder Jensen
UNICORN OS
Re: Writing to floppy causes data to be wiped!
danielbj wrote: Now some time has passed, I have been playing around with the controller timing (HLT, HUT and SRT) and nothing seems to work.
Help and ideas are still very welcome!
Hi! Sorry for not responding to your replies to my questions two months ago. I forgot about the post.
I will still insist on an emulator. Try using Bochs' debugger, or Bochs + GDB, or even QEMU + GDB if you want to use "the good ol' way". Stepping through the code in a debugger will hopefully help you find the faulty code. I'm not sure, but I think Bochs has a command for viewing the floppy's contents. You could single-step with GDB and then inspect the floppy's contents to find the faulty code.
BTW, I think Bochs now supports "the good ol' way". Check this bug report. Maybe you were using an outdated version (from 2002?); that could be caused by your distro's packages being outdated. If so, try compiling Bochs from source!
Happy New Code!
Hello World in Brainfuck:
Code:
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
Re: Writing to floppy causes data to be wiped!
Tried in QEMU. All works fine. Nothing to debug.
However, I ran more tests on the physical device and made another discovery.
I tried writing 0xdb to a sector, and sure enough, the sector gets filled with 0xdb. But instead of nulling random sectors as well, it 0xdb'ed them!
I suspect it has something to do with how the physical magnetic hardware works. I have read the 82077AA documentation over and over again, but it does not go into any further detail on how the magnetic head is loaded/unloaded, or what that even means.
I will now play around with adding artificial (independent) delays between sending the last command byte and sending the first data byte, just to see if the drive is indeed starting to write while the head mechanism is still moving.
EDIT: Another discovery I made: when reading, no errors are detected. But when writing, a buffer underrun/overrun occurs.
EDIT 2: I have now played around with the aforementioned artificial delays. After inserting a 100 ms delay before sending the data to write to the FIFO, the RQM bit of the MSR never becomes 1, meaning the controller is not ready to send or receive data.
EDIT 3: The information in edit 2 is false. It seems that when reading from or writing to a sector above 18, the command is not executed, and therefore no bytes can be transferred to or from the FIFO. I presume that Implied Seek is not working.
Also, in addition to Edit 1, the write command seems to want to write more than one sector, hence the FIFO underrun. I now have the scattered zeroing under control. The specified sector is being written correctly. After that, the last byte written to the first sector is written to the first byte of the next sector, followed by what I think is the contents of the FIFO, and then the rest is zero. If I send 1024 bytes (two sectors), the first two sectors are written correctly, and then the next (third) sector gets last byte+FIFO+nulls.
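Since the failure shows up once the sector number goes above 18, it may help to see where an LBA lands in CHS terms. The sketch below is a generic conversion assuming a 1.44 MB disk (18 sectors per track, 2 heads). It is not the poster's getchs call, and the names are hypothetical.

```c
#include <stdint.h>

/* Geometry of a 1.44 MB 3.5" floppy (assumed; not stated in the thread). */
#define FD_SECTORS_PER_TRACK 18u
#define FD_HEADS             2u

struct fd_chs {
    uint8_t c; /* cylinder, 0-based */
    uint8_t h; /* head, 0-based */
    uint8_t s; /* sector, 1-based */
};

/* Hypothetical helper: convert a 0-based LBA to cylinder/head/sector. */
struct fd_chs fd_lba_to_chs(uint32_t lba)
{
    struct fd_chs r;
    r.c = (uint8_t)(lba / (FD_SECTORS_PER_TRACK * FD_HEADS));
    r.h = (uint8_t)((lba / FD_SECTORS_PER_TRACK) % FD_HEADS);
    r.s = (uint8_t)(lba % FD_SECTORS_PER_TRACK + 1u);
    return r;
}
```

Under this geometry, LBA 0 to 17 stay on cylinder 0, head 0. LBA 18 is the first sector that needs the other head, and LBA 36 is the first that needs the heads to move to another cylinder, which is where a seek (implied or explicit) becomes necessary.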
Daniel Broder Jensen
UNICORN OS
Re: [SOLVED] Writing to floppy causes data to be wiped!
FINALLY!
It turns out that (for some reason) the FDC ignored Implied Seek. I also suspect that the head assembly was misaligned with the track, and therefore the FDC could not tell where the sector ends. Turning off implied seek and seeking manually (with a separate command) solved the problem!
Also, the EOT parameter of the read/write command is the number of the last sector to transfer (the operation stops after that sector has been processed), and NOT the number of sectors to transfer!
Thank you all for your help!
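The two fixes described in this post can be sketched in C as plain command-byte builders. This is an illustration under assumptions, not the poster's final code: the function names are hypothetical, and the write command reuses the parameter values from the fdwrsec listing earlier in the thread (0xC5, sector size code 2, gap 0x1b, data length 0xff). The difference from that listing is parameter 6, which pushed a constant 1 there. Here it is EOT, the number of the last sector to write.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build the 3-byte Seek command (0x0F),
 * issued as a separate command before the write. */
size_t fd_build_seek(uint8_t out[3], uint8_t drive, uint8_t head, uint8_t cyl)
{
    out[0] = 0x0F;
    out[1] = (uint8_t)((head << 2) | drive);
    out[2] = cyl;
    return 3;
}

/* Hypothetical helper: build the 9-byte Write Data command.
 * Assumes s + count - 1 does not run past the last sector of the track. */
size_t fd_build_write(uint8_t out[9], uint8_t drive, uint8_t c, uint8_t h,
                      uint8_t s, uint8_t count)
{
    out[0] = 0xC5;                        /* write data, MT + MFM */
    out[1] = (uint8_t)((h << 2) | drive); /* head and drive select */
    out[2] = c;                           /* cylinder */
    out[3] = h;                           /* head */
    out[4] = s;                           /* first sector (1-based) */
    out[5] = 2;                           /* 512 bytes per sector */
    out[6] = (uint8_t)(s + count - 1u);   /* EOT: LAST sector, not a count */
    out[7] = 0x1B;                        /* gap length */
    out[8] = 0xFF;                        /* data length (unused when N != 0) */
    return 9;
}
```

For a single-sector write to sector 5, EOT is 5; for two sectors starting at 5, EOT is 6.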
Daniel Broder Jensen
UNICORN OS