Posted: Mon Mar 24, 2008 1:26 am
by jzgriffin
I've made quite a few dumb mistakes. Here's one: Just finished writing around 2000 lines of brand new, quality, commented code, then I run `make` in the terminal. Guess where the object files went. On top of the source files. :-P

Posted: Mon Mar 24, 2008 11:07 am
by Brynet-Inc
Jeremiah Griffin wrote:I've made quite a few dumb mistakes. Here's one: Just finished writing around 2000 lines of brand new, quality, commented code, then I run `make` in the terminal. Guess where the object files went. On top of the source files. :-P
This is why I use SVN... ;)

If only for the ability to type "svn revert" when I screw things up.. :lol:

Posted: Mon Mar 24, 2008 11:54 am
by mystran
Brynet-Inc wrote:
Jeremiah Griffin wrote:I've made quite a few dumb mistakes. Here's one: Just finished writing around 2000 lines of brand new, quality, commented code, then I run `make` in the terminal. Guess where the object files went. On top of the source files. :-P
This is why I use SVN... ;)

If only for the ability to type "svn revert" when I screw things up.. :lol:
Source control (any system, SVN being my favourite as well) also protects against the other type of screw-up: you change some algorithm, then change a whole lot of other things, and then want to revert to the original algorithm while keeping the other changes. Just check the source control log for when the algorithm was changed to find the right revision immediately, pull a reverse diff, clean up the other stuff, and apply it.

Posted: Mon Mar 24, 2008 1:02 pm
by jzgriffin
Brynet-Inc wrote:
Jeremiah Griffin wrote:I've made quite a few dumb mistakes. Here's one: Just finished writing around 2000 lines of brand new, quality, commented code, then I run `make` in the terminal. Guess where the object files went. On top of the source files. :-P
This is why I use SVN... ;)

If only for the ability to type "svn revert" when I screw things up.. :lol:
I generally do use SVN, for larger projects. But I have so many small test projects that setting up a SVN repository for each of them would be a massive pain, even if the repository was local.

Posted: Mon Mar 24, 2008 5:47 pm
by mystran
Jeremiah Griffin wrote: I generally do use SVN, for larger projects. But I have so many small test projects that setting up a SVN repository for each of them would be a massive pain, even if the repository was local.
The trick is not to set up a different repository for each one. Instead, keep one for every project that doesn't have its own. One way to do this is to keep all random projects in the same directory, and make that directory into an SVN repository. Then when you add new test projects, you just add subdirectories like you would in a normal project.

Since you can check in/out any subdirectory in an SVN repo, this works quite nicely. The only downside is that revision numbers are no longer continuous for any specific project; rather, they are interleaved with revisions from other projects. But since you can easily query the revisions in which a given file (or subdirectory) was changed, this isn't as bad as it might sound.

[edit]

I in fact only keep one SVN repo for all my projects at home. :)

Posted: Fri Mar 28, 2008 6:12 am
by AJ
Hey - here's another one I've just spent hours debugging.

I was initialising my APs and all was going well until I got the AP to jump to some code in the higher-half. The AP kept triple-faulting. My first thought was 'paging' - and yes, CR2 contained the address of the first instruction I wanted to run.

So, as expected, I checked CR3 and CR0 and the paging structures on the AP were exactly the same as on the BSP. So, I pretty much rewrote my trampoline code from scratch. The odd thing was, the AP was already using a stack in the higher-half which was assigned by the BSP and was not page faulting.

I've just seen the glaring mistake. My boot loader uses large pages for the kernel and small pages are allocated for everything else (including the assigned stack). In the trampoline code, I don't even touch CR4, so PAE and PSE are disabled. D'oh!
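
A check like the following might have caught this sooner. It is only a sketch in plain C: the bit positions of CR4.PSE (bit 4) and CR4.PAE (bit 5) are architectural, but the helper names are made up for illustration, and actually reading or writing CR4 needs ring-0 code that is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR4 bit positions on x86. */
#define CR4_PSE (UINT32_C(1) << 4) /* page size extensions (4 MiB pages) */
#define CR4_PAE (UINT32_C(1) << 5) /* physical address extension */

/* Bits that change how the structures referenced by CR3 are walked. */
#define CR4_PAGING_BITS (CR4_PSE | CR4_PAE)

static_assert(CR4_PAGING_BITS == 0x30, "PSE and PAE are bits 4 and 5");

/* Hypothetical helper: nonzero if both CPUs would interpret the same
 * page tables the same way. */
static int cr4_paging_bits_match(uint32_t bsp_cr4, uint32_t ap_cr4)
{
    return ((bsp_cr4 ^ ap_cr4) & CR4_PAGING_BITS) == 0;
}

/* Hypothetical helper: the CR4 value the trampoline should load on the
 * AP before it enables paging, copying only the paging bits from the BSP. */
static uint32_t ap_cr4_from_bsp(uint32_t bsp_cr4, uint32_t ap_cr4)
{
    return (ap_cr4 & ~CR4_PAGING_BITS) | (bsp_cr4 & CR4_PAGING_BITS);
}
```

Comparing CR3 and CR0 alone is not enough: with CR4.PSE clear, a page-directory entry that the BSP treats as a large page is treated by the AP as a pointer to a page table.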

Cheers,
Adam

Posted: Sat Mar 29, 2008 5:31 am
by Krox
The silliest mistake I made needed several hours: I put the align statement just after the label and not before. After detecting that I put a BIG comment in that piece of code, so I'll never do that again :lol:

Posted: Sun Mar 30, 2008 2:20 pm
by Candy
Krox wrote:The silliest mistake I made needed several hours: I put the align statement just after the label and not before. After detecting that I put a BIG comment in that piece of code, so I'll never do that again :lol:
Most linkers fill with 0x90 bytes, making that mistake harder to detect (and less important - it'll cost you cycles though).

Posted: Mon Mar 31, 2008 7:08 am
by Krox
Makes sense. But in my case it was the multiboot header, which contains a reference to itself. That reference was broken, and GRUB only said "floppy not bootable" or something like that.
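
To see why the order matters, here is a small C model of what an assembler does with the location counter. It is a sketch, not real assembler behaviour in every detail, and the function names are invented for the example. A label takes the current offset; an align directive then moves the offset forward, so a label written before the directive points at the padding rather than at the data.

```c
#include <assert.h>
#include <stdint.h>

/* Round offset up to the next multiple of a power-of-two alignment. */
static uint32_t align_up(uint32_t offset, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Label first, align second: the label keeps the unaligned offset and
 * the data that follows starts somewhere else. */
uint32_t label_before_align(uint32_t offset, uint32_t alignment)
{
    (void)alignment; /* the padding is emitted after the label is bound */
    return offset;
}

/* Align first, label second: the label is bound where the data starts. */
uint32_t label_after_align(uint32_t offset, uint32_t alignment)
{
    return align_up(offset, alignment);
}
```

With an offset of 0x1002 and 4-byte alignment, the first form yields 0x1002 and the second 0x1004, so any field that stores the label's address ends up two bytes short of the real header.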

Posted: Tue Apr 15, 2008 12:48 pm
by Khumba
I was porting memory management code from an OS I had started to a console app. The code compiled fine, and I think it ran fine too, but as soon as I tried to do file I/O it would crash. Ran gdb to where it was segfaulting and saw the problem:

#0 0x???????? in free () at memory.c:???
#1 0x???????? in fclose () from /lib/libc.so.6
#2 0x???????? in main ()

The allocate function was called alloc(), and the free function was called free(). A global search and replace fixed that problem.
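
One way to avoid this clash when hosted code links against libc is to give the allocator a project prefix. The sketch below assumes a made-up `km_` prefix and a trivial bump allocator standing in for the real memory manager; none of these names come from the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prefix "km_": any name that cannot collide with libc's
 * malloc/free works just as well. */
#define KM_POOL_SIZE 4096

static _Alignas(max_align_t) unsigned char km_pool[KM_POOL_SIZE];
static size_t km_used;

void *km_alloc(size_t size)
{
    /* Round the request up so every block stays suitably aligned. */
    size_t align = _Alignof(max_align_t);
    size_t rounded = (size + align - 1) & ~(align - 1);

    /* Reject requests that overflowed or do not fit in what is left. */
    if (rounded < size || rounded > KM_POOL_SIZE - km_used)
        return NULL;

    void *block = &km_pool[km_used];
    km_used += rounded;
    return block;
}

void km_free(void *ptr)
{
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t lo = (uintptr_t)km_pool;

    /* Catch pointers that never came from km_alloc. */
    assert(ptr == NULL || (p >= lo && p < lo + KM_POOL_SIZE));
    (void)p;
    (void)lo;
    /* A bump allocator releases nothing; a real one would put the
     * block back on a free list here. */
}
```

Because nothing here is called `free`, libc's own `fclose` keeps calling libc's `free`, and the two allocators never see each other's pointers.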

Forgetting to recompile is always fun, especially when you've been making different adjustments to get something to work and need to retest them all.

Posted: Tue Apr 15, 2008 1:24 pm
by mystran
I made another dumb mistake today:

I had an array of strings, with divisions like say 7/12, and I needed those in another array as float values. The logical thing to do, was to copy the string array, and then apply some search/replace to the copy.

Result: the last entry was 1/24, and while every other entry came out right, the last one didn't get its .f suffix. So 1/24 was evaluated as integer division and came out as 0, and when that value was referenced I'd end up with a division by zero. Since this was in audio code (and I was testing with ASIO drivers), I ended up with a dead driver and had to reboot.

The stupid thing about this is that I actually survived the division-by-zero once, but 'cos I didn't realize where the bug was at first, I had to test again why it crashed, and it was the SECOND time that my driver decided to die.

Fortunately found the bug without further rebooting.

Lesson: don't copy-paste-search-replace too fast. :)
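
For anyone who hasn't been bitten by this yet, the difference fits in a few lines of C. The variable and function names below are invented for the example; only the 1/24 entry and the .f suffix come from the story.

```c
#include <assert.h>

/* Both operands are ints, so 1 / 24 is integer division and yields 0
 * before the result is ever converted to float. */
static const float ratio_missing_suffix = 1 / 24;

/* The .f suffix makes the divisor a float, so the division is done in
 * floating point. */
static const float ratio_with_suffix = 1 / 24.f;

/* Hypothetical helper: build the ratio from integer parts without
 * relying on a literal suffix being present. */
static float ratio_from_parts(int numerator, int denominator)
{
    assert(denominator != 0);
    return (float)numerator / (float)denominator;
}
```

Generating the float table from the integer pairs with a helper like `ratio_from_parts` would sidestep the search/replace step entirely, at the cost of computing the values at run time.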

Posted: Tue Apr 15, 2008 6:13 pm
by Zenith
I just made one stupid mistake right now:

As I was porting my OS to x86-64, I added some "#if defined"s to get my IDT functions to work for both x86 and x86-64. Since I hadn't implemented the panic function for x64 yet, I recompiled the x86 kernel and tested it. The panic was printing wacko values on screen, and I spent a lot of time trying to figure out why it wouldn't work.

I rewrote some code, became annoyed, and I couldn't figure out why it wouldn't work. So I just decided to take out the #if directives and start over.

Multiplatform code:

Code:

#if defined __x86__
	pusha
	
	push %ds
	push %es
	push %fs
	push %gs
	
	mov $KDATA_SEG, %ax
	mov %ax, %ds
	mov %ax, %es
	mov %ax, %fs
	mov %ax, %gs
#elif defined __x86_64__
...
#endif
I took out the directives, and the compiler reported an undefined reference to KDATA_SEG. And sure enough, I hadn't included the file which defined it, which incidentally also defined __x86__/__x86_64__ ... - so with neither macro defined, the preprocessor was silently dropping both branches, and the build proceeded to run my handler, which just went and read from the (invalid) stack as usual. :x
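
A cheap guard against this is a final #else branch that refuses to build. The sketch below uses made-up macro names (ARCH_X86, ARCH_X86_64) and defines one of them inline as a stand-in for the configuration header, purely so the example compiles on its own.

```c
#include <assert.h>

/* Hypothetical stand-in for the configuration header from the story.
 * In a real tree this line would be an #include of that header. */
#define ARCH_X86 1

#if defined(ARCH_X86)
#define ARCH_NAME "x86"
#define ARCH_WORD_BITS 32
#elif defined(ARCH_X86_64)
#define ARCH_NAME "x86-64"
#define ARCH_WORD_BITS 64
#else
/* Without this branch the preprocessor silently emits nothing at all. */
#error "no architecture macro defined: is the configuration header included?"
#endif

static_assert(ARCH_WORD_BITS == 32 || ARCH_WORD_BITS == 64,
              "unexpected word size");

const char *arch_name(void)
{
    return ARCH_NAME;
}

int arch_word_bits(void)
{
    return ARCH_WORD_BITS;
}
```

Deleting the `#define ARCH_X86 1` line turns the forgotten include into a build error instead of an empty handler.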

Oh, joy... :wink:

The premium mistake...

Posted: Wed Apr 16, 2008 6:40 am
by edfed
I wrote some code to erase drives using INT 13h,
and one day I wanted to reuse it after a long time without any editing or code review.

I wanted to wipe a floppy disk, and as I was too lazy to look for the instruction where the drive is selected, I simply wrote this in the init part of the code...
Oh my god, it was my worst mistake.

No one could beat this one.
(Don't mind the old and dumb asm design; I don't code like this at all now.)

Code:

        org 100h
        call mode13h
        call newint9
        mov ax,02000h
        mov gs,ax

;;;;;;;;;;;;  i inserted the instruction below
        mov [disk.drive],0
;;;;;;;;;;;; to delete a floppy drive

        mov [disk.segment],ax
        mov [color1],31
        jmp debut

;;;;;;;;;;;;;;;
....  ~ 300 lines of code
;;;;;;;;;;;;;;;

        mov [disk.segment],2000h
        mov [disk.drive],80h    ; <----- ARGHHHHH, it is overwritten there
        mov [disk.track],bx
        mov [disk.head],ah
        mov [disk.sector],al
        mov [disk.sectors],1
        shl ebx,16
        and eax,0ffffh
        or eax,ebx
        mov [datatype],4
        call hextoascii
        mov ax,2000h
        mov gs,ax
        mov edi,[numptr]
        add edi,2
        mov ebx,[topofbuffer]
        call copy_text
        call disk.write              ; <----------- and it writes to the drive there...
        mov ebx,[topofbuffer]
        mov [txt.x],100
        mov [txt.y],100
        call txt
        cmp eax,0
        jne @f
        cmp ebx,0
        jne @f
        mov [colorclear],40h
@@:
        mov al,0
        call refresh_clear
        test [status],2
        je @f
        mov [curcode],key.echap
@@:
        cmp [curcode],key.echap
        je exit
        mov al,[key+key.cur]
        mov [curcode],al
        jmp nextsectors
.end:
include 'exit.inc'
...
finally, it cleared the first 16 tracks of the primary hard drive with Windows 98 on it...

ARGHHHHHHH

Now the problem is solved: before writing, it shows the drive parameters and waits for a key. To continue, hold the space bar for 4 typematic repeats (4 scancode repetitions); press Escape to exit.
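
The same idea can be written as a single check placed right before the write. This is a C sketch of the logic only, not the actual fix described above: the struct and function names are hypothetical, and the one hard fact used is that INT 13h drive numbers from 80h upwards are hard disks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* INT 13h drive numbers: 00h-7Fh are floppies, 80h and up are hard disks. */
#define BIOS_FIRST_HARD_DISK 0x80u

/* Hypothetical request record filled in just before the write. */
struct wipe_request {
    uint8_t drive;       /* drive number about to be passed to INT 13h */
    bool user_confirmed; /* set only after the confirmation key press */
};

static bool is_hard_disk(uint8_t drive)
{
    return drive >= BIOS_FIRST_HARD_DISK;
}

/* Returns true only if the write may go ahead. */
bool wipe_allowed(const struct wipe_request *req, uint8_t intended_drive)
{
    assert(req != NULL);

    /* The drive was changed somewhere between init and the write. */
    if (req->drive != intended_drive)
        return false;

    /* Hard disks always need an explicit confirmation. */
    if (is_hard_disk(req->drive) && !req->user_confirmed)
        return false;

    return true;
}
```

In the story, the intended drive was 0 and the drive actually written was 80h, so the first comparison alone would have stopped the write.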

:D

Posted: Wed Apr 16, 2008 7:12 am
by Brynet-Inc
edfed wrote:finally, it cleared the first 16 tracks of the primary hard drive with Windows 98 on it...
This is why you use an emulator before running risky code on a real machine. Fortunately, Windows 98 isn't a big loss; your faulty code did you a favour. 8)

Re: Dumbest mistake you have ever made

Posted: Sat Dec 05, 2009 3:43 pm
by Combuster
(necro but well deserved imo)

I just spent the past three hours fixing a bug that was coming from the FreeBASIC library. In the end I traced it back to the implementation of a system call:

Code:

push ebx
push edx
push edi
(...)
pop ebx
pop edx
pop edi
The main reason it took so long was because I was debugging around all the other (wrong) places and edits, since this code was in my repo for 14 months without causing any problems...

I now have a bump on the bottom side of my desk...
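
For anyone who doesn't spot it at a glance: the pops are in the same order as the pushes, but the stack is last-in, first-out, so ebx and edi come back swapped while edx happens to survive. A small C model of the stack shows the effect; the struct and function names are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

struct regs { uint32_t ebx, edx, edi; };

struct stack { uint32_t slot[8]; int top; };

static void push(struct stack *s, uint32_t value)
{
    assert(s->top < 8);
    s->slot[s->top++] = value;
}

static uint32_t pop(struct stack *s)
{
    assert(s->top > 0);
    return s->slot[--s->top];
}

/* Mirrors the listing: pops in the same order as the pushes. */
struct regs restore_same_order(struct regs in)
{
    struct stack s = { {0}, 0 };
    struct regs out;

    push(&s, in.ebx);
    push(&s, in.edx);
    push(&s, in.edi);
    out.ebx = pop(&s); /* receives the saved edi */
    out.edx = pop(&s); /* middle value, correct by accident */
    out.edi = pop(&s); /* receives the saved ebx */
    return out;
}

/* Pops in reverse order, which is what a LIFO stack requires. */
struct regs restore_reverse_order(struct regs in)
{
    struct stack s = { {0}, 0 };
    struct regs out;

    push(&s, in.ebx);
    push(&s, in.edx);
    push(&s, in.edi);
    out.edi = pop(&s);
    out.edx = pop(&s);
    out.ebx = pop(&s);
    return out;
}
```

That also explains how such a bug can hide for months: it only shows up when a caller actually depends on ebx or edi being preserved across the call.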