Page 1 of 1

[C Language] How to implement abstract data structures?

Posted: Mon Mar 10, 2008 1:32 pm
by Paw
I beg pardon for my beginner's question related to C, but I didn't touch strongly procedural/imperative programming languages for a long time and I'm actually more used to OOP, generic programming and dynamic data types.

I've currently got some problems with the implementation of "data type agnostic", abstract data structures in C -- for example a simple stack. I'd like to make it independent from specific data types to store, and specify the type later when making use of the stack.

I know there is the "void pointer" concept which can be used to point to any data structure/type. But how do I accomplish "void pointer arithmetic" without knowing the stored "target" data structure/type from the stack algorithms' point of view? I'd like to abstract my data structures as much as possible in order to gain a maximum of reusability.

A possible solution that comes to my mind, would be wrapping the data to store, in a struct like the following:

Code: Select all

struct stack_node {
    void* data;
    struct stack_node* next;
}
But I want to use my "data type agnostic" data structures in my kernel at a point where no dynamic memory allocation is available (e.g. for storing available pages). Furthermore I want to avoid that the code which makes use of the stack has to create these node (wrapper) data structures.

Are there any elegant techniques available or am I tied to using my above solution?


Thanks in advance,

Paw

Posted: Mon Mar 10, 2008 1:42 pm
by JamesM
Hi,

Unfortunately data-type agnostic ADTs or generically-typed ADTs are where C falls down. The void* mechanism is pretty much the only mechanism it has for type-independence, which of course raises the problem that the datatype can only be either sizeof(void*) long, or must exist in memory in the same scope as the ADT such that a pointer can be used.

In extreme cases it is possible to accomplish this with a statically sized array used as a primitive memory pool, for example:

Code: Select all

typedef struct
{
  int a;
  int b;
} mystruct_t;

int myfunc_which_needs_a_stack()
{
  mystruct_t pool[32];
  int poolIdx = 0; // <--- Make this point to the next item available from the pool.
  
  pool[poolIdx++] = {/*item 1*/};
  pool[poolIdx++] = {/*item 2*/};

  stack_push((void*)&pool[0]);
  stack_push((void*)&pool[1]);
  mystruct_t *popped = (mystruct_t*)stack_pop();
}
It's ugly, but it's pretty much the only method of accomplishing what you want without dynamic memory.

In short, C isn't really adapted to abstract ADTs, and heavy users of such things would be advised to step up to C++ (which despite pooh-pooh-ing from many people is perfectly useable for an OS project), which provides its templating mechanism for just this purpose.

You'll see time and time again in the linux kernel (which uses C) structures like linked lists being defined over and over again, just with different types, because of this deficiency in the language.

Unfortunate but true.

Hope this helps,

James

Posted: Mon Mar 10, 2008 1:49 pm
by Alboin
You could also try to restructure your program. Try to figure out different ways to do what you need to do. Often times, this results in faster and more condensed code. (However, you usually get rid of simplicity and expandability at the same time. :P )

Posted: Mon Mar 10, 2008 1:50 pm
by Paw
Thanks James, for clearing this up.
I think I'll swallow the pill and look how to get around these issues.
I initially planned writing my kernel in C++ but the initial setup overhead made me decide against it :-/ Maybe somewhen later on my route...

Edit:
(However, you usually get rid of simplicity and expandability at the same time. Razz )
LOL! Thanks for the encouragement...
Ah well, I wanted to get rid of my abstracting mindset anyway :-P

Posted: Mon Mar 10, 2008 2:20 pm
by Alboin
Paw wrote:Thanks James, for clearing this up.
I think I'll swallow the pill and look how to get around these issues.
I initially planned writing my kernel in C++ but the initial setup overhead made me decide against it :-/ Maybe somewhen later on my route...
C++ isn't actually that difficult to setup. I would suggest you at least try before getting hung up because of it.

Posted: Mon Mar 10, 2008 2:59 pm
by Colonel Kernel
You could always use C++ as a "better C". For example, use templates but not classes. That shouldn't require any extra run-time support.

Posted: Mon Mar 10, 2008 3:12 pm
by JamesM
Colonel Kernel wrote:You could always use C++ as a "better C". For example, use templates but not classes. That shouldn't require any extra run-time support.
I agree. The GNU C++ compiler also enforces more type-safety "out of the box", and allows nice things at no cost like operator overloading and being able to define loop variants inside the for statement, like so:

Code: Select all

for(int i = 0; i < bleh; i++) ;

EDIT: Just to show how easy setting up C++ can be, I've posted the ENTIRE contents of our cppsupport.cc file in project Pedigree. It ain't exactly huge ;) [ initial code written by me, neatened up by bluecode. ]

Code: Select all

/*
 * Copyright (c) 2008 James Molloy, James Pritchett, Jörg Pfähler, Matthew Iselin
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
#include <processor/types.h>
#include "cppsupport.h"

// Required for G++ to link static init/destructors.
void *__dso_handle;

// Defined in the linker.
uintptr_t start_ctors;
uintptr_t end_ctors;

/// Calls the constructors for all global objects.
/// Call this before using any global objects.
void initialiseConstructors()
{
  // Constructor list is defined in the linker script.
  // The .ctors section is just an array of function pointers.
  // iterate through, calling each in turn.
  uintptr_t *iterator = reinterpret_cast<uintptr_t*>(&start_ctors);
  while (iterator < reinterpret_cast<uintptr_t*>(&end_ctors))
  {
    void (*fp)(void) = reinterpret_cast<void (*)(void)>(*iterator);
    fp();
    iterator++;
  }
}

/// Required for G++ to compile code.
extern "C" void __cxa_atexit(void (*f)(void *), void *p, void *d)
{
}

/// Called by G++ if a pure virtual function is called. Bad Thing, should never happen!
extern "C" void __cxa_pure_virtual()
{
}

void *operator new (size_t size) throw()
{
  // TODO
  return 0;
}
void *operator new[] (size_t size) throw()
{
  // TODO
  return 0;
}
void *operator new (size_t size, void* memory) throw()
{
  return memory;
}
void *operator new[] (size_t size, void* memory) throw()
{
  return memory;
}
void operator delete (void * p)
{
  // TODO
}
void operator delete[] (void * p)
{
  // TODO
}
void operator delete (void *p, void *q)
{
  // TODO
}
void operator delete[] (void *p, void *q)
{
  // TODO
}

Posted: Mon Mar 10, 2008 3:25 pm
by Paw
Thanks for cheering me up. I'm just a little stubborn, because I want to "tame that C beast". I kind of like it and it's a welcome difference to me opposed to functional/declarative/object-oriented programming etc.

Well I understand now why those ADTs are implemented in a redundant manner in C, since (as far as I'm not mistaken) this automagically happens in C++ as well as soon as a template is "instantiated" with a type.

However, I tried to wrap my head around my re-usability problem in C and came up with following "solution prototype" for a type-agnostic stack, which is based on a macro:
#ifndef STACK_H_
#define STACK_H_

#define DEFSTACK(name, datatype) \
struct name##_struct { \
datatype* array; \
int size; \
}; \
typedef struct name##_struct name; \
void name##_init(datatype* array, name* st) { \
st->array = array; \
st->size = 0; \
} \
void name##_push(datatype* item, name* st) { \
st->array[st->size++] = *item; \
} \
datatype* name##_pop(name* st) { \
if(st->size) { \
return &st->array[--st->size]; \
} \
return 0; \
}

#endif /*STACK_H_*/
Instantiation and usage can be done as follows:

Code: Select all

#include "stack.h"

DEFSTACK(int_stack, int);

void main() {
    int array[100];
    int_stack stack;
    int_stack_init(&array[0], &stack);
    int a = 10, b = 7, c = 1000;
    int_stack_push(&a, &stack);
    int_stack_push(&b, &stack);
    int_stack_push(&c, &stack);

    c = *int_stack_pop(&stack);
    b = *int_stack_pop(&stack);
    a = *int_stack_pop(&stack);
}
It seems to work and I'm happy so far :-)
C++ will come as soon as I'm deeply missing inheritance...

Edit: Oh, I just noticed JamesM's addition, the C++-support file. That's really neat. I'll rethink my decision.

Posted: Mon Mar 10, 2008 3:29 pm
by Brynet-Inc
JamesM wrote:....and allows nice things at no cost like operator overloading and being able to define loop variants inside the for statement, like so:

Code: Select all

for(int i = 0; i < bleh; i++) ;
That's valid C99 as well.. :wink:

Posted: Mon Mar 10, 2008 3:58 pm
by JamesM
Brynet-Inc wrote:
JamesM wrote:....and allows nice things at no cost like operator overloading and being able to define loop variants inside the for statement, like so:

Code: Select all

for(int i = 0; i < bleh; i++) ;
That's valid C99 as well.. :wink:
Grr, I always forget to enable C99 mode on GCC.

Posted: Mon Mar 10, 2008 4:02 pm
by Alboin
JamesM wrote:
Brynet-Inc wrote:
JamesM wrote:....and allows nice things at no cost like operator overloading and being able to define loop variants inside the for statement, like so:

Code: Select all

for(int i = 0; i < bleh; i++) ;
That's valid C99 as well.. :wink:
Grr, I always forget to enable C99 mode on GCC.
Is there a reason that isn't enabled by default?

Did I miss that memo?

Posted: Mon Mar 10, 2008 4:19 pm
by JamesM
No idea. But it isn't!

Code: Select all

[22:19:00] ~/test $ gcc -o test test.c
test.c: In function ‘main’:
test.c:3: error: ‘for’ loop initial declaration used outside C99 mode
test.c:4: error: expected expression before ‘int’

Code: Select all

[22:19:04] ~/test $ cat test.c
int main(char argc, char **argv)
{
  for (int i =0; i < 6; i++)
  int j = 3/0;

}

Posted: Mon Mar 10, 2008 4:46 pm
by Brynet-Inc
It's not that much effort to add -std=c99 or -std=gnu99 in the command line, especially if you're using Makefiles like most sane people.

It's not entirely complete though... but, I don't think any compiler is 100% compliant.
http://gcc.gnu.org/c99status.html

Posted: Mon Mar 10, 2008 6:53 pm
by quok
Alboin wrote:
JamesM wrote: Grr, I always forget to enable C99 mode on GCC.
Is there a reason that isn't enabled by default?

Did I miss that memo?
It probably won't be enabled until support is complete. For now you'll probably want to use -std=gnu99 rather than -std=c99, as support for c99's static (or was it static inline?) keyword isn't complete (as of 4.2.3 anyway), and so GCC would warn on that. Using -std=gnu99 at least disables that warning, as well as enabling some other GNU-specific things, but I have no idea what they may be.