Reading the disk with AHCI.

Bonfra
Member
Member
Posts: 270
Joined: Wed Feb 19, 2020 1:08 pm
Libera.chat IRC: Bonfra
Location: Italy

Post by Bonfra »

With the updated printf function, this is the output I get (it also prints the address in CR2):
[Image: screenshot of the output]
Regards, Bonfra.
foliagecanine
Member
Member
Posts: 148
Joined: Sun Aug 23, 2020 4:35 pm

Post by foliagecanine »

Have you tried taking out the AHCI read sector call?
Otherwise, you could try placing a for(;;); at successive points in the code until you pinpoint the instruction that causes the fault.

One other thing...
I've found the Wiki's code for reading the disk to be somewhat unreliable (it overwrites memory past the end of the buffer).
Here's something similar to what I use (line numbers are based on the last git commit):

In configure_device, replace line 77 with:

Code:

cmdheader[i].prdtl = 1;

Remove line 161.

Replace lines 167-178 with:

Code:

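/* A single PRDT entry describes the whole transfer (prdtl = 1 above). */
cmdtable->prdt_entry[0].dba = (uint32_t)address;          /* low 32 bits of the buffer's physical address */
cmdtable->prdt_entry[0].dbau = (uint32_t)(address >> 32); /* high 32 bits */
cmdtable->prdt_entry[0].dbc = (count * 512) - 1;          /* byte count minus one; the field is 22 bits wide */
cmdtable->prdt_entry[0].i = 1;                            /* interrupt on completion */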
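
To make the arithmetic behind those assignments easier to check in isolation, here is a minimal, self-contained sketch. The struct layout is modeled on the PRDT entry described on the wiki's AHCI page; `prdt_entry_t` and `fill_prdt_entry` are hypothetical names, and the sketch assumes `address` is the 64-bit physical address of the buffer and that sectors are 512 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of one PRDT entry (names are illustrative). */
typedef struct {
    uint32_t dba;            /* data base address, low 32 bits */
    uint32_t dbau;           /* data base address, high 32 bits */
    uint32_t reserved0;
    uint32_t dbc       : 22; /* byte count minus one (at most 4 MiB per entry) */
    uint32_t reserved1 : 9;
    uint32_t i         : 1;  /* interrupt on completion */
} prdt_entry_t;

static_assert(sizeof(prdt_entry_t) == 16, "a PRDT entry is four dwords");

#define SECTOR_SIZE    512u
#define PRDT_MAX_BYTES (4u * 1024u * 1024u)

/* Fill a single entry for `count` sectors starting at `address`.
 * Returns 0 on success, -1 if the request cannot be described by one entry. */
static int fill_prdt_entry(prdt_entry_t *e, uint64_t address, uint32_t count)
{
    uint64_t bytes = (uint64_t)count * SECTOR_SIZE;

    /* Reject empty or oversized transfers and buffers that are not word aligned. */
    if (count == 0 || bytes > PRDT_MAX_BYTES || (address & 1u))
        return -1;

    e->dba       = (uint32_t)address;
    e->dbau      = (uint32_t)(address >> 32);
    e->reserved0 = 0;
    e->dbc       = (uint32_t)(bytes - 1);
    e->reserved1 = 0;
    e->i         = 1;
    return 0;
}
```

With one entry the transfer is capped at 4 MiB (8192 sectors); anything larger would need more entries and a matching `prdtl`.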

Post by Bonfra »

foliagecanine wrote:Have you tried taking out the AHCI read sector call?
Bonfra wrote:It's never a total win... I get a page fault on the real hardware if I try to execute this code.
The fun thing is that it is thrown when I call the sata_read function, not when I init the devices.
Yup, it is exactly the read call that directly causes the crash.
foliagecanine wrote: One other thing...
I've found the Wiki's code to read the disk to be somewhat unreliable (overwrites memory after the buffer).
Here's something similar to what I use (line numbers based on last git commit):
Thanks for the advice, your code is also easier to read. Sadly, it still causes the page fault / general protection fault.
Regards, Bonfra.
Octocontrabass
Member
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

Bonfra wrote:Technically my handler dumps a good portion of the stack, I've run it a few times but for some reasons, I always get one of these two screens with empty stack (if I manually raise the interrupt the stack is full of data):
Those CS and SS values are very obviously wrong. You need to spend some time debugging your exception handler. I notice that the stack is aligned wrong when you make this function call.
Bonfra wrote:I've added this bit of code to the handler to print the content of CR2:
That is some awful inline assembly. Try this instead:

Code:

asm volatile( "mov %0, cr2" : "=r"(cr2) );
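
Alongside CR2 (the faulting linear address), the error code pushed for #PF says what kind of access failed, so it is worth printing the two together. A small sketch of decoding the low bits of that error code follows; `pf_info_t`, `pf_decode` and `pf_access_name` are made-up names, not taken from the code discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the x86 page-fault error code. */
#define PF_PRESENT  (1u << 0) /* 0: page not present, 1: protection violation */
#define PF_WRITE    (1u << 1) /* 0: read access, 1: write access */
#define PF_USER     (1u << 2) /* 0: supervisor-mode access, 1: user-mode access */
#define PF_RESERVED (1u << 3) /* a reserved bit was set in a paging structure */
#define PF_FETCH    (1u << 4) /* the fault was caused by an instruction fetch */

typedef struct {
    bool present, write, user, reserved, fetch;
} pf_info_t;

/* Split the error code into one flag per bit. */
static pf_info_t pf_decode(uint64_t error_code)
{
    pf_info_t info;
    info.present  = (error_code & PF_PRESENT)  != 0;
    info.write    = (error_code & PF_WRITE)    != 0;
    info.user     = (error_code & PF_USER)     != 0;
    info.reserved = (error_code & PF_RESERVED) != 0;
    info.fetch    = (error_code & PF_FETCH)    != 0;
    return info;
}

/* Short human-readable name for the access type, for a debug print. */
static const char *pf_access_name(const pf_info_t *info)
{
    assert(info);
    if (info->fetch)
        return "instruction fetch";
    return info->write ? "write" : "read";
}
```

An error code of 0x2, for example, decodes as a supervisor-mode write to a page that is not present.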

Post by Bonfra »

Octocontrabass wrote: Those CS and SS values are very obviously wrong. You need to spend some time debugging your exception handler. I notice that the stack is aligned wrong when you make this function call.
Hmm, OK. That call can happen in two circumstances: a normal interrupt or what I called a special interrupt. For the normal one it could cause some problems around here, and for the special one I could find some problems around here, since I edit the content of the stack through the RSP register. Thanks, I'll let you know how it goes.
Octocontrabass wrote: That is some awful inline assembly. Try this instead:

Code:

asm volatile( "mov %0, cr2" : "=r"(cr2) );
I always forget how to use this syntax, thanks.
Regards, Bonfra.

Post by Octocontrabass »

Bonfra wrote:Hmm ok, that call can happen in two circumstances, a normal interrupt or what I called a special interrupt, for the normal interrupt I don't think that there is a problem but maybe around here I could find some problems since I edit the RSP register.
No, the stack is misaligned the same way for both kinds of interrupt. You should fix the stack alignment in the shared code path. (For example, you could subtract 16 instead of 8 from RSP here.)
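
Some background on why the amount subtracted matters: the System V x86-64 ABI expects RSP to be a multiple of 16 at the point of a `call`, and in 64-bit mode the CPU aligns the stack to 16 bytes before pushing the interrupt frame. Below is a rough way to work out the padding, assuming every stub ends up with the same number of 8-byte pushes. The helper names are illustrative and the register counts used in the example are hypothetical, not taken from Bonfra's stub.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_ALIGN 16u
static_assert((STACK_ALIGN & (STACK_ALIGN - 1u)) == 0,
              "alignment must be a power of two");

/* Pushed by the CPU in 64-bit mode: SS, RSP, RFLAGS, CS, RIP. */
#define IRET_FRAME_QWORDS 5u

/* Bytes to subtract from RSP so it is 16-byte aligned at the call,
 * given the total number of 8-byte values pushed since the CPU aligned it. */
static uint64_t call_padding(unsigned pushed_qwords)
{
    uint64_t used = 8ull * pushed_qwords;
    return (STACK_ALIGN - used % STACK_ALIGN) % STACK_ALIGN;
}

/* True if rsp satisfies the ABI requirement for a call. */
static int is_call_aligned(uint64_t rsp)
{
    return (rsp % STACK_ALIGN) == 0;
}
```

For instance, an interrupt frame plus an error code (real or dummy) plus 15 saved registers is 21 pushes, so `call_padding(21)` asks for 8 more bytes; with 16 saved registers no padding is needed. Any extra space reserved on top of that has to keep the total a multiple of 16.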

Post by Bonfra »

Octocontrabass wrote: No, the stack is misaligned the same way for both kinds of interrupt. You should fix the stack alignment in the shared code path. (For example, you could subtract 16 instead of 8 from RSP here.)
To prevent other errors, I've also added 16 instead of 8 here, but now whenever I hit a key I get a division by zero. I've also changed all the other code between these two operations that involves rsp, but it is still a division by zero.

EDIT:
I'm a bit dumb and I wrote

Code:

 mov rax, [rsp + 8 * 16]
instead of

Code:

 mov rax, [rsp + 8 * 18]
(line 49)
Originally it was 17, but since we changed the value of rsp by 8 this had to be adapted. Now it works with normal interrupts. I'm testing it with the page fault case on real hardware.
Regards, Bonfra.

Post by Octocontrabass »

It sounds like you didn't change the 17 on this line to 18.

...And I see you edited your post as I was replying.

Post by Bonfra »

Octocontrabass wrote: ...And I see you edited your post as I was replying.
Yes, I also changed line 63 from

Code:

lea rdi, [rsp + 8]   ; skip the MXCSR register.
to

Code:

lea rdi, [rsp + 16]   ; skip the MXCSR register.
to get the parameter right.

Oddly enough, if I raise the exception manually with asm("int 0xe") I get the CS and SS registers right (0x10 and 0x08), but if I let it happen naturally without forcing it I still get the wrong values.
Regards, Bonfra.

Post by Octocontrabass »

Bonfra wrote:Oddly enough if I raise the exception manually with asm("int 0xe") I get the CS and SS registers right (0x10 and 0x08) but if I let it happen naturally without forcing it I still get the wrong values.
The INT instruction doesn't push an error code, so your stack frame will definitely be wrong if that's how you test it. Try something like this:

Code:

uint8_t temp = *(volatile uint8_t *)0xfedcba9876543210;
This will cause #GP with an error code of 0. Pick a canonical address that isn't mapped to cause #PF instead.
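
The reason that address gives #GP rather than #PF is that it is not canonical. Assuming 48-bit virtual addresses (4-level paging; with 5-level paging the width would be 57), bits 63 down to 47 must all be equal. A small check, with `is_canonical` and `VA_BITS` as illustrative names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VA_BITS 48 /* assumption: 4-level paging */
static_assert(VA_BITS > 1 && VA_BITS < 64, "VA_BITS must leave some upper bits");

/* An address is canonical when bits 63..(VA_BITS-1) are all 0 or all 1. */
static bool is_canonical(uint64_t addr)
{
    uint64_t upper    = addr >> (VA_BITS - 1);       /* bits 63..47 for 48-bit VAs */
    uint64_t all_ones = UINT64_MAX >> (VA_BITS - 1); /* the same field, fully set */
    return upper == 0 || upper == all_ones;
}
```

To provoke #PF instead, pick an address for which this returns true but which your page tables leave unmapped.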

Post by Bonfra »

Octocontrabass wrote:The INT instruction doesn't push an error code, so your stack frame will definitely be wrong if that's how you test it.
Oh.
Octocontrabass wrote: Try something like this:

Code:

uint8_t temp = *(volatile uint8_t *)0xfedcba9876543210;
This will cause #GP with an error code of 0. Pick a canonical address that isn't mapped to cause #PF instead.
Bingo, wrong values everywhere. But why?

EDIT:
Wrong values also appear if I raise a normal interrupt like a division by zero:

Code:

int x = 0;
int y = 7 / x;
It could be a coincidence, but with this I get the value of CS in RIP and the value of SS in RSP, so it seems that they are shifted back by some amount.
Regards, Bonfra.

Post by Bonfra »

I've finally found the issue with the handlers: basically, I had assigned the handler meant for interrupts without an error code to the interrupts that push one, and vice versa. For some interrupts it made no difference (like the keyboard one), but for others it was causing this issue. Anyway, now that I have the exact values, I've been able to understand where the issue was: my page frame allocator was causing the problem again.
There is something really wrong with the code I use to align pages when I init or deinit some memory. For now I've stuck some +1 here and there to make it work, but I need to rework it properly.
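
Stray +1 adjustments like these can come from rounding a byte range to page boundaries by hand. One common pattern, shown only as a sketch (this is not the allocator from this thread; the 4 KiB page size and the helper names are assumptions), is to round the start down and the end up, then count the pages in between:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ull /* assumption: 4 KiB pages */
static_assert((PAGE_SIZE & (PAGE_SIZE - 1)) == 0,
              "page size must be a power of two");

/* Round an address down to the start of its page. */
static uint64_t page_align_down(uint64_t addr)
{
    return addr & ~(PAGE_SIZE - 1);
}

/* Round an address up to the next page boundary (unchanged if already aligned). */
static uint64_t page_align_up(uint64_t addr)
{
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Number of pages touched by the byte range [addr, addr + length).
 * Overflow near the top of the address space is not handled here. */
static uint64_t pages_spanned(uint64_t addr, uint64_t length)
{
    if (length == 0)
        return 0;
    return (page_align_up(addr + length) - page_align_down(addr)) / PAGE_SIZE;
}
```

A 0x20-byte range starting at 0x1ff0 straddles a boundary and so spans two pages, which is the kind of case a hand-written +1 tends to get right only some of the time.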
Even if my current fix is a bit clunky, it works as intended and I'm able to read the content of the disk. Now I just need to refactor a bit of everything to make it pretty.
Thanks to everyone for the help.
Regards, Bonfra.
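
For anyone hitting the same handler mix-up: only a handful of exception vectors push an error code, so the choice of stub can be driven by a small lookup instead of being wired up by hand. A sketch, assuming the usual two-stub design (one stub pushes a dummy error code, the other relies on the CPU's); `vector_pushes_error_code` is a made-up helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Exception vectors for which the CPU pushes an error code:
 * 8 #DF, 10 #TS, 11 #NP, 12 #SS, 13 #GP, 14 #PF, 17 #AC, 21 #CP, 29 #VC, 30 #SX. */
#define ERROR_CODE_VECTORS \
    ((1u << 8)  | (1u << 10) | (1u << 11) | (1u << 12) | (1u << 13) | \
     (1u << 14) | (1u << 17) | (1u << 21) | (1u << 29) | (1u << 30))

static_assert((ERROR_CODE_VECTORS & (1u << 14)) != 0, "#PF pushes an error code");

/* True if an exception delivered on this vector arrives with an error code. */
static bool vector_pushes_error_code(unsigned vector)
{
    /* Vectors 32 and above are IRQs and software interrupts: never an error code. */
    return vector < 32 && ((ERROR_CODE_VECTORS >> vector) & 1u) != 0;
}
```

Note that this only describes exceptions raised by the CPU. As pointed out earlier in the thread, a software `int` instruction does not push an error code even on one of these vectors.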