I'm not sure primitives like DrawLine (or, god forbid, DrawCircle) are that useful in a driver if you want to do typical modern graphics.
For the most part, in a typical modern application, you want to BitBlit bitmaps onto the screen, possibly fill some rectangles, and then draw text on top of those. For anti-aliased text, you either need a hardware BitBlit that can do alpha blending, or it's faster to just draw the stuff into a software buffer and drop that onto the screen.
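To make the software path concrete, here's a rough sketch of blending an anti-aliased glyph into a software buffer. Everything in it is an assumption for illustration: the XRGB8888 pixel layout, the 8-bit coverage mask as the glyph format, and the names blend_channel and blend_glyph. It's not taken from any real driver or font library.

```c
#include <stdint.h>

/* Blend one 8-bit channel: (src*a + dst*(255-a)) / 255, rounded to nearest. */
static unsigned blend_channel(unsigned src, unsigned dst, unsigned a)
{
    return (src * a + dst * (255u - a) + 127u) / 255u;
}

/*
 * Draw a glyph into an XRGB8888 buffer (assumed layout: 0x00RRGGBB).
 * 'mask' holds one coverage byte per glyph pixel (0 = transparent,
 * 255 = fully covered), which is what an anti-aliasing rasterizer
 * would typically hand you. Pixels outside the buffer are skipped.
 */
void blend_glyph(uint32_t *dst, int dst_w, int dst_h, int x, int y,
                 const uint8_t *mask, int mask_w, int mask_h, uint32_t color)
{
    for (int my = 0; my < mask_h; my++) {
        int py = y + my;
        if (py < 0 || py >= dst_h)
            continue;
        for (int mx = 0; mx < mask_w; mx++) {
            int px = x + mx;
            if (px < 0 || px >= dst_w)
                continue;
            unsigned a = mask[my * mask_w + mx];
            uint32_t d = dst[py * dst_w + px];
            unsigned r = blend_channel((color >> 16) & 0xFFu, (d >> 16) & 0xFFu, a);
            unsigned g = blend_channel((color >> 8) & 0xFFu, (d >> 8) & 0xFFu, a);
            unsigned b = blend_channel(color & 0xFFu, d & 0xFFu, a);
            dst[py * dst_w + px] = ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
        }
    }
}
```

The point is that this reads the destination pixel for every blended pixel, which is cheap in main memory but tends to be slow if the destination is video memory.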
All kinds of fancy accelerated primitives were cool when processors were slow and memory (for software back-buffers / bitmap caches) was expensive. Nowadays it's not much of a problem to do anti-aliased drawing into a software back-buffer (say, GDI+ style), keep a bitmap cache in main memory (so you don't have to redraw if nothing really changed), and then just BitBlit the relevant regions to the screen.
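A minimal sketch of that back-buffer idea, assuming a single dirty rectangle is enough to track what changed. The Rect and BackBuffer types and the fill_rect / present functions are hypothetical names, and "screen" here is just another block of memory standing in for whatever the real BitBlit would target.

```c
#include <stdint.h>
#include <string.h>

/* Half-open rectangle [x0,x1) x [y0,y1); empty when x1 <= x0 or y1 <= y0. */
typedef struct { int x0, y0, x1, y1; } Rect;

typedef struct {
    uint32_t *pixels;   /* width * height pixels in main memory */
    int width, height;
    Rect dirty;         /* bounding box of everything drawn since last present */
} BackBuffer;

static int rect_empty(Rect r) { return r.x1 <= r.x0 || r.y1 <= r.y0; }

/* Fill a rectangle in the back-buffer and grow the dirty region to cover it. */
void fill_rect(BackBuffer *bb, Rect r, uint32_t color)
{
    if (r.x0 < 0) r.x0 = 0;
    if (r.y0 < 0) r.y0 = 0;
    if (r.x1 > bb->width) r.x1 = bb->width;
    if (r.y1 > bb->height) r.y1 = bb->height;
    if (rect_empty(r))
        return;

    for (int y = r.y0; y < r.y1; y++)
        for (int x = r.x0; x < r.x1; x++)
            bb->pixels[y * bb->width + x] = color;

    if (rect_empty(bb->dirty)) {
        bb->dirty = r;
    } else {
        if (r.x0 < bb->dirty.x0) bb->dirty.x0 = r.x0;
        if (r.y0 < bb->dirty.y0) bb->dirty.y0 = r.y0;
        if (r.x1 > bb->dirty.x1) bb->dirty.x1 = r.x1;
        if (r.y1 > bb->dirty.y1) bb->dirty.y1 = r.y1;
    }
}

/*
 * Copy only the dirty region to the screen, then mark the buffer clean.
 * Returns the number of pixels copied (0 if nothing changed).
 * screen_stride is in pixels; the screen is assumed to be at least as
 * large as the back-buffer.
 */
int present(BackBuffer *bb, uint32_t *screen, int screen_stride)
{
    Rect d = bb->dirty;
    if (rect_empty(d))
        return 0;

    int w = d.x1 - d.x0;
    for (int y = d.y0; y < d.y1; y++)
        memcpy(screen + y * screen_stride + d.x0,
               bb->pixels + y * bb->width + d.x0,
               (size_t)w * sizeof(uint32_t));

    bb->dirty = (Rect){0, 0, 0, 0};
    return w * (d.y1 - d.y0);
}
```

Calling present twice in a row without drawing in between copies nothing the second time, which is the "don't redraw if nothing changed" part. A real toolkit would probably keep a list of dirty rectangles rather than one bounding box.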
So... for an accelerated 2D driver, I personally wouldn't bother with anything but BitBlit (memory->screen, screen->screen, preferably with alpha blending), a routine to swap the front buffer in sync with the refresh (for double/triple buffering), and maybe some method to allocate offscreen bitmaps (for caching bitmaps in GPU memory). Pretty much anything else is IMHO better handled in software.
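For what it's worth, the whole driver interface I have in mind could look roughly like the sketch below. All names (Surface, Gfx2DDriver, make_sw_driver and the sw_* functions) are made up for illustration, and the implementations are plain software stand-ins so the thing compiles and runs: a real driver would program the blitter, wait for vertical blank before flipping, and allocate from video memory. Alpha blending is left out to keep it short, and the blit does no clipping.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint32_t *pixels;
    int width, height;
    int stride;                 /* row pitch in pixels */
} Surface;

/* The entire (hypothetical) 2D driver interface: blit, flip, allocate. */
typedef struct Gfx2DDriver {
    void (*blit)(Surface *dst, int dx, int dy,
                 const Surface *src, int sx, int sy, int w, int h);
    void (*flip)(struct Gfx2DDriver *drv);
    Surface *(*alloc_offscreen)(int width, int height);
    Surface *front, *back;
} Gfx2DDriver;

/* Software stand-in for a hardware blit. The caller must pass a rectangle
 * that lies inside both surfaces. Overlapping copies within one surface
 * (screen->screen) are handled by choosing the row order and using memmove. */
static void sw_blit(Surface *dst, int dx, int dy,
                    const Surface *src, int sx, int sy, int w, int h)
{
    int bottom_up = (dst->pixels == src->pixels) && (dy > sy);
    for (int i = 0; i < h; i++) {
        int row = bottom_up ? (h - 1 - i) : i;
        memmove(dst->pixels + (dy + row) * dst->stride + dx,
                src->pixels + (sy + row) * src->stride + sx,
                (size_t)w * sizeof(uint32_t));
    }
}

/* Stand-in for a refresh-synchronized flip: just swaps the two pointers.
 * Real hardware would change the scanout address during vertical blank. */
static void sw_flip(Gfx2DDriver *drv)
{
    Surface *tmp = drv->front;
    drv->front = drv->back;
    drv->back = tmp;
}

/* Stand-in for video memory allocation: zeroed heap memory.
 * Returns NULL on failure; the caller frees pixels and the Surface. */
static Surface *sw_alloc_offscreen(int width, int height)
{
    Surface *s = malloc(sizeof *s);
    if (!s)
        return NULL;
    s->pixels = calloc((size_t)width * (size_t)height, sizeof *s->pixels);
    if (!s->pixels) {
        free(s);
        return NULL;
    }
    s->width = width;
    s->height = height;
    s->stride = width;
    return s;
}

/* Build a driver object backed by the software stand-ins above. */
Gfx2DDriver make_sw_driver(Surface *front, Surface *back)
{
    Gfx2DDriver drv = { sw_blit, sw_flip, sw_alloc_offscreen, front, back };
    return drv;
}
```

Three entry points is the whole surface area. Lines, circles, text and so on would all be drawn in software into a back-buffer or cached bitmap and reach the screen through blit.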
ATI Graphics Card
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.