[SOLVED] Writing to floppy causes data to be wiped!

danielbj
Member
Member
Posts: 36
Joined: Thu Dec 16, 2010 3:08 pm

Post by danielbj »

Hello fellow os devers!

I am currently writing an OS in protected mode, and have just put my finishing touches to the "Floppy disk: read sector"-related driver code. All works great.
So now it was time to write the "Floppy disk: write sector" code. Well, no problem: the two are quite similar in terms of programming. All there is to do is change which command is sent to the FDC, and SEND bytes instead of receiving them. I finished writing the code and started the bug hunt. Some minor bugs occurred (inc instead of dec in a loop counter, and off-by-one errors), and then the driver was ready. I loaded my new system onto a floppy disk, and lo and behold: it executed with no errors. I took out the floppy, put it in my development machine, and examined the target sector in a hex editor. Yup. The sector was filled with 0x00 0x01 0x02 0x03 0x04... 0xff. The test was successful!

I then wipe that sector back to null, just to see it run again. I like it when stuff works twice. But this time my OS was not able to boot! Not even an error showed up!

I examined the floppy again, and on closer inspection I discovered that random chunks had been filled with zeros!
I reloaded my OS to the floppy and changed some code. Instead of filling a sector with 0x00 0x01 0x02 0x03... 0xff, I made it fill a sector with 512 x 0xdc, just to see what would happen.

I ran the code and saw the same result: random chunks filled with 0x00 in the middle of my masterpiece! Except for one of these chunks: it started with 15-20 bytes of 0xdc, followed by 400-500 bytes of 0.

I have read the FDC manual and article on the wiki countless times. I have searched hither (this very forum) and thither (Google...) for an answer on what is going on.
I have a guess though: The FDC starts writing before the right sector has been reached.

Does anyone know what is happening here? :(
Last edited by danielbj on Thu Jan 08, 2015 9:11 am, edited 1 time in total.

Daniel Broder Jensen
UNICORN OS
KemyLand
Member
Member
Posts: 213
Joined: Mon Jun 16, 2014 5:33 pm
Location: Costa Rica

Re: Writing to floppy causes data to be wiped!

Post by KemyLand »

Hi!
There are several things that can cause your problem. We need more information first.

1- Can you post your relevant driver code?
2- Are you running on an emulator or on bare metal?
3- If running on an emulator, which one?
4- Can you provide details about what exactly happens in those "mysterious" moments at bootup?

It seems to me you're running on bare metal. If so, please try to use an emulator, thus making debugging easier. Also, you'll be able to use the serial ports to report output, rather than the VGA (which can easily fail).

I'm pretty sure we'll be able to help you, given you provide enough information :wink:
Happy New Code!
Hello World in Brainfuck :D:

Code:

++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.

Re: Writing to floppy causes data to be wiped!

Post by danielbj »

1- Can you post your relevant driver code?
Yes I can. But I must warn you: this is very system-specific assembly. I hope you get the general idea of what's going on. Let me try to guide you through the most critical system calls made when writing to the disk.

First, the test code performs system call number 0x0004000c, fdwrsec(0x0004000c, basesid, base, sector). Below is an example of how:

Code:

push 0 ;first we push the last argument: the LBA sector to be written to.
push 0 ;then the base address of what we want to write
push eax ;then the Service ID of the base. A service is roughly the same as a segment to be selected in any segment register.
push 0x0004000c ;this is the callcode, specifying what we want the kernel to do. When CC=0x0004000c, it knows we want to do fdwrsec
int 0x30 ;this will call the kernel, and return an errorcode in eax.
add esp, 4*4 ;cleaning up the stack

Before this code runs, the floppy motor is started. After it returns, the motor is stopped.
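As background for the motor start/stop mentioned above: on a PC-compatible controller the motor is switched through the Digital Output Register (DOR, port 0x3F2 on the primary controller). The following is a minimal C sketch of how the DOR value is composed, assuming the standard DOR bit layout. `fdc_dor_value` is a hypothetical helper for illustration and not part of the driver posted in this thread.

```c
#include <stdint.h>

/* Digital Output Register bit layout (standard PC floppy controller). */
enum {
    DOR_DRIVE_MASK = 0x03, /* bits 0-1: drive select */
    DOR_NOT_RESET  = 0x04, /* bit 2: 0 holds the controller in reset */
    DOR_IRQ_DMA    = 0x08, /* bit 3: enable the IRQ and DMA signals */
    DOR_MOTOR_A    = 0x10  /* bit 4: motor of drive 0; bits 5-7 for drives 1-3 */
};

/* Hypothetical helper: DOR value that selects `drive` with its motor on or off. */
static uint8_t fdc_dor_value(unsigned drive, int motor_on)
{
    uint8_t dor = (uint8_t)(DOR_NOT_RESET | DOR_IRQ_DMA | (drive & DOR_DRIVE_MASK));
    if (motor_on)
        dor |= (uint8_t)(DOR_MOTOR_A << (drive & DOR_DRIVE_MASK));
    return dor;
}
```

For drive 0 this gives 0x1C with the motor on and 0x0C with the motor off. The drive needs some time to spin up after the motor bit is set, so a driver would normally wait before issuing a read or write.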



The code below then runs. This is the system call responsible for writing a sector to the disk.

Code:

cc0004000c:;fdwrsec(cc, basesid, base, sector), eax=ec
;Writes one sector to floppy disk.
	
	;Get CHS
	push 3
	call getarg;eax=?sector (getarg is a local function, returning argn in eax)
	add esp, 4*1
	
	push eax;sec
	push 0x00040000;cc=getchs
	int 0x30;eax=c, ebx=h, ecx=s
	add esp, 4*2
	
	mov edx, eax;edx=c (to free eax of the burden of storage)
	
	;Issue command 0xC5
	push 0xc5;command
	push 0x00040001;cc=fdcommand
	int 0x30;eax=ec
	add esp, 4*2
	
	mov rs_eax, eax;ec
	test eax, eax
	jnz .ret
	
	;parameter 1: h*4+drive
	shl ebx, 2
	
	push ebx
	push 0x00040002;cc=fdparameter
	int 0x30
	add esp, 4*2
	
	shr ebx, 2
	
	;parameter 2: c
	push edx;parameter
	push 0x00040002;cc=fdparameter
	int 0x30
	add esp, 4*2
	
	;parameter 3: h
	push ebx;parameter
	push 0x00040002;cc=fdparameter
	int 0x30
	add esp, 4*2
	
	;parameter 4: s
	push ecx;parameter
	push 0x00040002;cc=fdparameter
	int 0x30
	add esp, 4*2
	
	;parameter 5: 2
	push 2;parameter
	push 0x00040002;cc=fdparameter
	int 0x30
	add esp, 4*2
	
	;parameter 6: sector count (1)
	push 1;parameter
	push 0x00040002;cc=fdparameter
	int 0x30
	add esp, 4*2
	
	;parameter 7: 0x1b
	push 0x1b;parameter
	push 0x00040002;cc=fdparameter
	int 0x30
	add esp, 4*2
	
	;parameter 8: 0xff
	push 0xff;parameter
	push 0x00040002;cc=fdparameter
	int 0x30
	add esp, 4*2
	
	;Transfer data
	push 2
	call getarg;eax=?base
	add esp, 4*1
	mov ebx, eax;ebx=?base
	
	push 1
	call getarg;eax=?basesid
	add esp, 4*1
	
	push ebx;base
	push eax;basesid
	push 0x00040004;cc=fdo (Another critical system call. This will output 512 bytes to the FIFO. Code below this.)
	int 0x30;eax=ec
	add esp, 4*3
	
	mov rs_eax, eax;ec
	test eax, eax
	jnz .ret
	
	;result 1: st0
	push 0x00040005;cc=fdresult
	int 0x30;eax=result
	add esp, 4*1
	
	;result 2: st1
	push 0x00040005;cc=fdresult
	int 0x30;eax=result
	add esp, 4*1
	mov ebx, eax;ebx=st1
	
	;result 3: st2
	push 0x00040005;cc=fdresult
	int 0x30;eax=result
	add esp, 4*1
	
	;result 4: c
	push 0x00040005;cc=fdresult
	int 0x30;eax=result
	add esp, 4*1
	
	;result 5: h
	push 0x00040005;cc=fdresult
	int 0x30;eax=result
	add esp, 4*1
	
	;result 6: s
	push 0x00040005;cc=fdresult
	int 0x30;eax=result
	add esp, 4*1
	
	;result 7: 2
	push 0x00040005;cc=fdresult
	int 0x30;eax=result
	add esp, 4*1
	
	;Check st1	
	mov rs_eax, 0x00040001
	test bl, 00010000b
	jnz .ret
	
	mov rs_eax, 0x00040002
	test bl, 00000010b
	jnz .ret
	
	mov rs_eax, 0;ec
	
	.ret:
	ret

Below is the fdo system call. It outputs 512 bytes from a buffer to the FIFO register.

Code:

cc00040004:;fdo(cc, basesid, base), eax=ec
;Send data from base to FDC.
	
	push 1
	call getarg;eax=?basesid
	add esp, 4*1
	
	push eax;sid
	push 0x00020019;cc=sidds
	int 0x30;eax=ec, ds=segment (You just have to trust some functions. Including this one. This will load our base segment into ds)
	add esp, 4*2
	
	mov rs_eax, eax;ec
	test eax, eax
	jnz .ret
	
	push 2
	call getarg;eax=?base
	add esp, 4*1
	mov ebx, eax;ebx=?base
	
	mov ecx, 512;counter
	
	@@:
		
		mov dx, 0x03f4;select MSR
		in al, dx;get MSR
		
		test al, 00100000b;test NDMA
		jz @b;Jump if data not ready
		
		test al, 10000000b;test RQM
		jz @b;jump if data not ready
		
		mov dx, 0x03f5;select FIFO
		mov al, [ebx];get data
		out dx, al;set data
		
		inc ebx;update pointer
		dec ecx;update counter
		jnz @b
	
	mov rs_eax, 0;ec
	
	.ret:
	ret


Some of this code is not the best and may be optimized in the future. I just have to get it working before moving on to that task :wink:
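A note on the polling loop in fdo: it waits for NDMA and RQM but does not look at the DIO bit, and it has no exit if the controller leaves the execution phase early (NDMA then clears and the loop spins forever). A minimal C sketch of the two MSR conditions, assuming the standard Main Status Register layout. Both function names are hypothetical and only illustrate the bit tests.

```c
#include <stdint.h>

/* Main Status Register (port 0x3F4) bits relevant to a PIO data phase. */
enum {
    MSR_NDMA = 0x20, /* set while a non-DMA command is in its execution phase */
    MSR_DIO  = 0x40, /* 1: controller has a byte for the CPU, 0: it expects one */
    MSR_RQM  = 0x80  /* data register is ready for a transfer */
};

/* Hypothetical predicate: may the CPU write the next data byte to the FIFO? */
static int fdc_can_write_data(uint8_t msr)
{
    return (msr & (MSR_RQM | MSR_DIO | MSR_NDMA)) == (MSR_RQM | MSR_NDMA);
}

/* Hypothetical predicate: has the execution phase ended, with result bytes waiting? */
static int fdc_in_result_phase(uint8_t msr)
{
    return (msr & (MSR_RQM | MSR_DIO | MSR_NDMA)) == (MSR_RQM | MSR_DIO);
}
```

A write loop built on these would send a byte while `fdc_can_write_data` holds and stop as soon as `fdc_in_result_phase` holds, so that an aborted command is reported instead of hanging.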
I should also mention the state of the floppy:

IRQs: On
Drive Polling mode: On
FIFO: On
Threshold: 8
Implied seek: On
Precompensation: 0
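
For reference, the settings listed above correspond to the parameter bytes of the CONFIGURE command (0x13). A minimal C sketch, assuming the 82077AA bit layout for that command. `fdc_build_configure` is a hypothetical helper, not code from this thread. Two of the bits are inverted (they disable a feature when set), which is easy to get wrong.

```c
#include <stddef.h>
#include <stdint.h>

#define FDC_CMD_CONFIGURE 0x13

/* Hypothetical helper: fill out[4] with a CONFIGURE command.
 * threshold is assumed to be in the range 1..16. */
static size_t fdc_build_configure(uint8_t out[4], int implied_seek, int fifo_on,
                                  int polling_on, unsigned threshold, uint8_t precomp)
{
    out[0] = FDC_CMD_CONFIGURE;
    out[1] = 0x00;
    out[2] = (uint8_t)((implied_seek ? 0x40u : 0u)   /* EIS: 1 enables implied seek */
                     | (fifo_on ? 0u : 0x20u)        /* EFIFO: 1 DISABLES the FIFO */
                     | (polling_on ? 0u : 0x10u)     /* POLL: 1 DISABLES drive polling */
                     | ((threshold - 1u) & 0x0Fu));  /* FIFOTHR stores threshold - 1 */
    out[3] = precomp;                                /* PRETRK: precompensation start track */
    return 4;
}
```

With the state listed above (implied seek on, FIFO on, polling on, threshold 8, precompensation 0) this produces the bytes 0x13 0x00 0x47 0x00.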





2- Are you running on an emulator or on bare metal?
Bare metal.
Compaq Armada 1592 DT, to be exact.


3- If running on an emulator, which one?
Nope. No emulators are allowed for me. I believe in doing it the hard way :D


4- Can you provide details about what exactly happens in that "mysterious" moments at bootup?
POST runs, and then it just hangs. But I don't blame the poor thing: it has executed corrupted code. This happens when booting from a previously booted disk, that is, a disk that has undergone the OS's write operation and therefore has random patches of 0's in the code.



It seems to me you're running on bare metal. If so, please try to use an emulator, thus making debugging easier. Also, you'll be able to use the serial ports to report output, rather than the VGA (which can easily fail).

I'm sorry, but I am not running on any emulator. Neither am I planning to, unless ABSOLUTELY necessary.



I hope this will help. If you got any questions regarding the code, please feel free to ask me :)

b.zaar
Member
Member
Posts: 294
Joined: Wed May 21, 2008 4:33 am
Location: Mars MTC +6:00

Re: Writing to floppy causes data to be wiped!

Post by b.zaar »

danielbj wrote:I'm sorry, but I am not running on any emulator. Neither am I planning to, unless ABSOLUTELY necessary.
Using only real hardware is more hardcore, but writing to physical media and waiting through every boot is like coding machine code in hex: it can be done, but it's not the most efficient way to get to your end result. Proving that something works in an emulator first and then moving it to real hardware will save you a lot of time and can give you more feedback about any errors that are detected.

If you insist on using hardware first then fall back to an emulator for debugging errors like this, that is what would be considered ABSOLUTELY necessary.
"God! Not Unix" - Richard Stallman

Website: venom Dev
OS project: venom OS
Hexadecimal Editor: hexed

Re: Writing to floppy causes data to be wiped!

Post by danielbj »

I gave it a try, and ran an image of my floppy in Bochs. It did not like that. At all. In the console it tells me the following:

Code:

[BIOS] Booting from 0000:7c00
[FLOPPY] controller reset in software (This tells me that my floppy driver is starting its initialization) 
[FLOPPY] controller reset in software (This tells me that my floppy driver is specifying)
[FLOPPY] non DMA mode not fully implemented yet

It looks like Bochs does not support the good ol' fashioned way of doing things.
So well... that's all the information I can give you from running the code on an emulator :)


Re: Writing to floppy causes data to be wiped!

Post by b.zaar »

danielbj wrote:

Code:

   int 0x30;eax=ec, ds=segment (You just have to trust some functions. Including this one. This will load our base segment into ds)
   add esp, 4*2
   
   mov rs_eax, eax;ec
   test eax, eax
   jnz .ret
   
   push 2
   call getarg;eax=?base
Are you changing the data selector but still using local variables in that last section of code?

Re: Writing to floppy causes data to be wiped!

Post by danielbj »

b.zaar wrote: Are you changing the data selector but still using local variables in that last section of code?
If by

Code:

call getarg;eax=?base

you mean local variables, then no. getarg(argn) is a local function that returns an argument from the caller's stack, using some offset from [ss:ebp]. When priv 3 code calls the kernel (priv 0), the stack is switched, and getarg accounts for that as well.


Re: Writing to floppy causes data to be wiped!

Post by danielbj »

Some time has passed now. I have been playing around with the controller timing (HLT, HUT and SRT), and nothing seems to work.
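
For readers following along: HLT, HUT and SRT are set with the SPECIFY command (0x03), which also carries the ND bit that selects non-DMA (PIO) mode. A minimal C sketch of how the fields are packed, assuming the 82077AA layout. `fdc_build_specify` is a hypothetical helper, and the raw field values in the usage note are purely illustrative, not a recommendation for any particular drive.

```c
#include <stddef.h>
#include <stdint.h>

#define FDC_CMD_SPECIFY 0x03

/* Hypothetical helper: fill out[3] with a SPECIFY command.
 * srt and hut are 4-bit raw fields, hlt is a 7-bit raw field,
 * non_dma = 1 selects PIO transfers (the mode used in this thread). */
static size_t fdc_build_specify(uint8_t out[3], unsigned srt, unsigned hut,
                                unsigned hlt, int non_dma)
{
    out[0] = FDC_CMD_SPECIFY;
    out[1] = (uint8_t)(((srt & 0x0Fu) << 4) | (hut & 0x0Fu)); /* SRT high nibble, HUT low nibble */
    out[2] = (uint8_t)(((hlt & 0x7Fu) << 1) | (non_dma ? 1u : 0u)); /* HLT in bits 7-1, ND in bit 0 */
    return 3;
}
```

For example, raw values SRT=8, HUT=0, HLT=5 with ND=1 pack to the bytes 0x03 0x80 0x0B. How the raw values translate into milliseconds depends on the data rate, so the datasheet tables are the place to check.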

Help and ideas are still very welcome! :)


Re: Writing to floppy causes data to be wiped!

Post by KemyLand »

danielbj wrote:Now some time has passed, I have been playing around with the controller timing (HLT, HUT and SRT) and nothing seems to work.

Help and ideas are still very welcome! :)
Hi! Sorry for not responding to your replies to my questions two months ago. I forgot about the post :oops: .

I will still insist on an emulator. Try using Bochs' debugger, or Bochs + GDB, or even QEMU + GDB if you want to use "the good ol' way". A debugger and single-stepping will hopefully help you find the faulty code. I'm not sure, but I think Bochs has a command for seeing the floppy's contents. You may single-step with GDB, then look at the floppy's contents, so you can find the faulty code :wink: .

BTW, I think Bochs now supports "the good ol' way". Check this bug report. Maybe you were using an outdated version (from 2002?); that could be caused by your distro's packages being outdated. Try to compile Bochs from source if so!

Re: Writing to floppy causes data to be wiped!

Post by danielbj »

Tried in qemu. All works fine. Nothing to debug.

However I ran more tests on the physical device and made another discovery.
I tried writing 0xdb to a sector, and sure enough, the sector gets filled with 0xdb. But instead of nulling random sectors as well it 0xdb'ed them!
I suspect it has something to do with how the physical magnetic hardware is working. I have read the 82077AA documentation over and over again, but it does not go into any further detail on how the magnetic head is loaded/unloaded, or what that even means.

I will now play around with adding artificial (independent) delays between sending the last command byte and sending the first data byte, just to see if the drive is indeed starting to write while the head mechanism is still moving around.

EDIT: Another discovery I made: when reading, no errors are detected. But when writing, a buffer overrun/underrun occurs.
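
The overrun/underrun mentioned in this edit is reported in the ST1 result byte, which the fdwrsec code earlier in the thread tests with the masks 00010000b and 00000010b. A minimal C sketch that names the ST1 flags, assuming the 82077AA bit assignments. `fdc_st1_describe` is a hypothetical helper for illustration.

```c
#include <stdint.h>

/* ST1 result byte flags. */
enum {
    ST1_MA = 0x01, /* missing address mark */
    ST1_NW = 0x02, /* not writable: disk is write-protected */
    ST1_ND = 0x04, /* no data: requested sector ID not found */
    ST1_OR = 0x10, /* overrun/underrun: FIFO was not serviced in time */
    ST1_DE = 0x20, /* data error: CRC mismatch */
    ST1_EN = 0x80  /* end of cylinder: transfer tried to go past EOT */
};

/* Hypothetical helper: short description of the first flag found in st1. */
static const char *fdc_st1_describe(uint8_t st1)
{
    if (st1 & ST1_NW) return "not writable";
    if (st1 & ST1_OR) return "overrun/underrun";
    if (st1 & ST1_ND) return "no data";
    if (st1 & ST1_DE) return "data error";
    if (st1 & ST1_MA) return "missing address mark";
    if (st1 & ST1_EN) return "end of cylinder";
    return "no error flagged";
}
```

In PIO mode without a terminal-count signal, the end-of-cylinder flag can appear after a transfer that reached EOT, so a driver may want to treat it separately from the other flags.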

EDIT 2: I have now played around with the aforementioned artificial delays. After inserting a 100 ms delay before sending the data to write to the FIFO, the RQM-bit of the MSR will never become 1, meaning the controller is not ready to send nor receive data.

EDIT 3: The information in edit 2 is false. It seems that when reading from or writing to a sector above 18, the command is not executed, and therefore no bytes can be transferred to or from the FIFO. I presume that Implied Seek is not working.
Also, in addition to Edit 1, the write command seems to want to write more than one sector, hence the FIFO underrun. I now have the scattered zeroing under control. The specified sector is written correctly. After that, the last byte written to the first sector is written to the first byte of the next sector, followed by what I think is the contents of the FIFO, and then the rest is zero. If I send 1024 bytes (two sectors), the first two sectors are written correctly, and then the next (third) sector gets last byte + FIFO + nulls.
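
The "sector above 18" boundary lines up with the track length of a 1.44 MB floppy. A minimal C sketch of the LBA to CHS conversion, assuming the usual 1.44 MB geometry of 18 sectors per track and 2 heads. The getchs call in the thread is not shown, so `lba_to_chs` is a hypothetical stand-in rather than the author's implementation.

```c
#include <stdint.h>

#define SECTORS_PER_TRACK 18u /* assumption: 1.44 MB, 3.5 inch floppy */
#define HEADS              2u

struct chs {
    uint8_t c; /* cylinder, 0-based */
    uint8_t h; /* head, 0-based */
    uint8_t s; /* sector, 1-based */
};

/* Hypothetical stand-in for getchs: convert a 0-based LBA to CHS. */
static struct chs lba_to_chs(uint32_t lba)
{
    struct chs r;
    r.c = (uint8_t)(lba / (SECTORS_PER_TRACK * HEADS));
    r.h = (uint8_t)((lba / SECTORS_PER_TRACK) % HEADS);
    r.s = (uint8_t)(lba % SECTORS_PER_TRACK + 1u);
    return r;
}
```

Under this geometry LBA 17 is still cylinder 0, head 0, sector 18, LBA 18 is the first sector on head 1, and LBA 36 is the first one that needs the heads on a different cylinder.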


Re: [SOLVED] Writing to floppy causes data to be wiped!

Post by danielbj »

FINALLY!

It turns out that (for some reason) the FDC ignored Implied seek. I also suspect that the head assembly was misaligned to the track, and therefore the FDC would not know when the sector stops. Turning off implied seek, and seeking manually (by a separate command) solved the problem!
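
The manual seek described above can be sketched as follows. This is a minimal C illustration, assuming the standard SEEK (0x0F) and SENSE INTERRUPT (0x08) commands of the 82077AA. `fdc_build_seek` and `fdc_seek_succeeded` are hypothetical helpers, not the code that was actually used to fix the driver.

```c
#include <stddef.h>
#include <stdint.h>

#define FDC_CMD_SEEK            0x0F
#define FDC_CMD_SENSE_INTERRUPT 0x08

/* Hypothetical helper: fill out[3] with a SEEK command. */
static size_t fdc_build_seek(uint8_t out[3], unsigned drive, unsigned head,
                             uint8_t cylinder)
{
    out[0] = FDC_CMD_SEEK;
    out[1] = (uint8_t)(((head & 1u) << 2) | (drive & 3u)); /* head in bit 2, drive in bits 1-0 */
    out[2] = cylinder;                                     /* target cylinder */
    return 3;
}

/* Hypothetical check of the two SENSE INTERRUPT result bytes (ST0, PCN)
 * read after the seek has completed. */
static int fdc_seek_succeeded(uint8_t st0, uint8_t pcn, uint8_t wanted_cylinder)
{
    return (st0 & 0x20) != 0   /* SE: seek end */
        && (st0 & 0xC0) == 0   /* IC = 00: normal termination */
        && pcn == wanted_cylinder;
}
```

The sequence would be: send the three SEEK bytes, wait for completion, send FDC_CMD_SENSE_INTERRUPT, read ST0 and PCN, and only then issue the read or write command.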

Also, the EOT parameter of the read/write command is the number of the sector after which the operation stops, NOT the number of sectors to transfer!
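
To make the EOT point concrete, here is a minimal C sketch of the nine command bytes for a write, following the parameter order used by fdwrsec earlier in the thread (0xC5, head/drive, C, H, S, 2, EOT, 0x1b, 0xff). `fdc_build_write` is a hypothetical helper. It assumes the run of sectors stays on one track and takes EOT to be the last sector number to transfer, as concluded above.

```c
#include <stddef.h>
#include <stdint.h>

#define FDC_CMD_WRITE_MT_MFM 0xC5 /* command byte used in this thread */

/* Hypothetical helper: fill out[9] with a write command for `count`
 * sectors starting at sector `s` (1-based) on cylinder c, head h.
 * Assumes s + count - 1 does not exceed the sectors per track. */
static size_t fdc_build_write(uint8_t out[9], unsigned drive, uint8_t c,
                              uint8_t h, uint8_t s, uint8_t count)
{
    out[0] = FDC_CMD_WRITE_MT_MFM;
    out[1] = (uint8_t)(((h & 1u) << 2) | (drive & 3u)); /* head*4 + drive */
    out[2] = c;
    out[3] = h;
    out[4] = s;                         /* first sector to write */
    out[5] = 2;                         /* N = 2: 512 bytes per sector */
    out[6] = (uint8_t)(s + count - 1u); /* EOT: LAST sector number, not a count */
    out[7] = 0x1b;                      /* gap length, as in the thread */
    out[8] = 0xff;                      /* DTL, unused when N != 0 */
    return 9;
}
```

Writing one sector at sector 5 therefore uses EOT = 5, and writing two sectors starting there uses EOT = 6. Passing the count (1 or 2) as the sixth parameter is the mistake described above.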

Thank you all for your help! :D
