Sign in to follow this  
DevFred

generic containers in C

Recommended Posts

I'm brushing up on my C, and I just realized I never saw something like a C++ std::vector in C. Since C doesn't have templates, I thought of two ways I could implement a vector on my own: a) Implement a vector with element type pointer-to-void and do casting on the client side. Would it be possible to store integers directly as pointer-to-voids? Is (int)((void*)i) guaranteed to be i, assuming sizeof(void*) == sizeof(int)? Or could I only store pointers in such a vector? b) Write a header file that relies on a macro ELEMENT_TYPE to be defined before the client includes the header. I'm not sure if multiple clients in distinct compilation units needing vectors of the same type would pose a problem for the linker though... Surely there's an idiomatic C way to build "generic" data structures. Please enlighten me. Thanks! [Edited by - DevFred on August 20, 2008 3:44:40 AM]

Share this post


Link to post
Share on other sites
Quote:
Original post by DevFred
Surely there's an idiomatic C way to build "generic" data structures. Please enlighten me. Thanks!
Generic programming is not idiomatic in C :(

Quote:
a) Implement a vector with element type pointer-to-void and do casting on the client side. Would it be possible to store integers directly as pointer-to-voids? Is (int)((void*)i) guaranteed to be i, assuming sizeof(void*) == sizeof(int)? Or could I only store pointers in such a vector?
This is the approach that I've seen used before. Obviously if everything is done via pointers then performance will be worse than the C++ template version. Your sizeof hack should work to allow you to avoid using pointers for primitive types, but again, this extra check does introduce a performance overhead that wouldn't be present in a C++ template version.


Quote:
b) Write a header file that relies on a macro ELEMENT_TYPE to be defined before the client includes the header. I'm not sure if multiple clients in distinct compilation needing vectors of the same type would pose a problem for the linker though...
This would allow you to tailor the container to the element type at compile time (like a template), but it is a bit messy.

I'm imagining something like this, and now that I think about it, it's not that bad seeing it does give you the power of templates:

#define ELEMENT_TYPE int
#include "vector.h"
#undef ELEMENT_TYPE

void foo()
{
vector_int myVector;
vec_init( &myVector );
vec_push_back( &myVector, 42 );
}

------- vector.h -------
#define vector vector_##ELEMENT_TYPE

struct vector
{
ELEMENT_TYPE* data;
};

void vec_init( vector* v, unsigned int size )
{
v->data = malloc( sizeof(ELEMENT_TYPE)*size );
}

void vec_push_back( vector* v, ELEMENT_TYPE data )
{
...
}

#undef vector

Share this post


Link to post
Share on other sites
Quote:
Original post by Hodgman
I'm imagining something like this, and now that I think about it, it's not that bad seeing it does give you the power of templates:
*** Source Snippet Removed ***


What if I want to use a vector of ints and a vector of floats in the same program?

Share this post


Link to post
Share on other sites
Quote:
Original post by Gage64
What if I want to use a vector of ints and a vector of floats in the same program?
In theory, this could be achieved by adding a namespacing prefix to every defined function, and including the header different times with different ELEMENT_TYPE values.

In practice, as written right now, the substitution shouldn't happen (since substitutions never happen in preprocessor directives) and the header would always define a structure named vector_ELEMENT_TYPE.


Share this post


Link to post
Share on other sites
I heard that the first edition of The C++ Programming Language suggested using macros this way (when templates didn't exist). I think my college library has that book so maybe I'll take a look...

Share this post


Link to post
Share on other sites
Quote:
Original post by Gage64
What if I want to use a vector of ints and a vector of floats in the same program?


in my example the structures would be named vector_int and vector_float. The vec_init and vec_push_back are overloaded for these two types.

Share this post


Link to post
Share on other sites
Quote:
Original post by Hodgman
Quote:
Original post by Gage64
What if I want to use a vector of ints and a vector of floats in the same program?


in my example the structures would be named vector_int and vector_float. The vec_init and vec_push_back are overloaded for these two types.


But the question was about C. Maybe C99 supports function overloading, but AFAIK most compilers don't support C99 so relying on it is probably not a good idea.

Share this post


Link to post
Share on other sites
Quote:
Original post by ToohrVyk
this could be achieved by adding a namespacing prefix to every defined function

That's what I was thinking. The following first try obviously doesn't work, because ## can only be used in macro definitions (let's ignore parameters for now):

void push_##ELEMENT_TYPE(void)
{
}

The next step is to define a macro for the appending:

#define GENERIC(x,y) x##y

void GENERIC(push_,ELEMENT_TYPE)(void)
{
}

The problem is that the function is now literally called push_ELEMENT_TYPE. I assume the reason for this is that while replacing GENERIC, its parameters are treated literally. So we need another macro:

#define SUBSTITUTE(x,y) x##y
#define GENERIC(x,y) SUBSTITUTE(x,y)

void GENERIC(push_,ELEMENT_TYPE)(void)
{
}

This works, but I'm not exactly sure in which order macros get expanded. Can anyone explain in detail why we need this one extra level?

Share this post


Link to post
Share on other sites
Quote:
Original post by Hodgman
in my example the structures would be named vector_int and vector_float.

No, both would literally be named vector_ELEMENT_TYPE. Stupid preprocessor :)

It seems we need more macros to get the desired result:

#define ELEMENT_TYPE int

#define APPEND(x,y) x##y
#define SUBSTITUTE(x,y) APPEND(x,y)
#define GENERIC(x) SUBSTITUTE(x,ELEMENT_TYPE)

#define vector GENERIC(vector_)

struct vector
{
/* ... */
};

Share this post


Link to post
Share on other sites
Quote:
Original post by Hodgman
Quote:
Original post by Gage64
What if I want to use a vector of ints and a vector of floats in the same program?


in my example the structures would be named vector_int and vector_float. The vec_init and vec_push_back are overloaded for these two types.


Whether we are considering real overload or not, it will not help you here. Both those functions are referring to the same ELEMENT_TYPE. The "user" would have to redefine the same ELEMENT_TYPE for int and for float which would be not so easy to deal with... Aditionally, if the sizes of the types are different, push_back may not work.

[added: DevFred answered that first then me :)! he's fast :D ]

The approach I've seen most used is the first one
Quote:
Original post by DevFred
a) Implement a vector with element type pointer-to-void and do casting on the client side. Would it be possible to store integers directly as pointer-to-voids? Is (int)((void*)i) guaranteed to be i, assuming sizeof(void*) == sizeof(int)? Or could I only store pointers in such a vector?


where you carry a structure for the vector arround in each function. Something in the lines of:

typedef struct {
int sizeOfEachElement;
int maxCapacity;
int numberOfElements;
void *data;
}GenericVector;

void genericVector_Init(GenericVector *vector, int maxElems, int sizeOfEachElement)
{
vector->sizeOfEachElement = sizeOfEachElement;
vector->maxCapacity = maxElems;
vector->numberOfElements = 0;
vector->data = malloc(sizeOfEachElement * maxElems);
}

int genericVector_PushElement(GenericVector *vector, void *element)
{
if(vector->numberOfElements >= vector->maxCapacity)
{
/*Error condition. vector is full.*/
return ERROR_CODE;
}
else
{
memcpy();
memcpy(vector->data + vector->numberOfElements, element, vector->sizeOfEachElement);
vector->numberOfElements ++;
return 0;
}
}

void genericVector_Destroy()
{
vector->sizeOfEachElement = 0;
vector->maxCapacity = 0;
vector->numberOfElement = 0;
free(vector->data);
vector->data = NULL;
}


int main()
{
GenericVector intVect, doubleVect, elephantVect;

genericVector_Init(&intVect, 10, sizeof(int));
genericVector_Init(&doubleVect, 20, sizeof(double));
genericVector_Init(&elephantVect, 10, sizeof(Elephant));

...
}







[Edited by - wolverine on August 20, 2008 6:35:33 AM]

Share this post


Link to post
Share on other sites
Quote:
Original post by DevFred
This works, but I'm not exactly sure in which order macros get expanded. Can anyone explain in detail why we need this one extra level?

Hm, I did some reading up, and I think I understand now.

Macro arguments are themselves macro-expanded BEFORE being substited into the macro body, UNLESS they are used with # or ## (in which case they are treated literally). So, given the following code:

#define ELEMENT_TYPE int

#define APPEND(x,y) x##y
#define SUBSTITUTE(x,y) APPEND(x,y)
#define GENERIC(x) SUBSTITUTE(x,ELEMENT_TYPE)

#define vector GENERIC(vector_)

struct vector { /* ... */ };

I think this is what happens with the macro vector:

1. vector
2. GENERIC(vector_)
3. SUBSTITUTE(vector_,ELEMENT_TYPE)
4. SUBSTITUTE(vector_,int)
5. APPEND(vector_,int)
6. vector_##int
7. vector_int

The important step is step 4 which does not happen if we call APPEND directly. Right?

Share this post


Link to post
Share on other sites
all the C work I've ever done was on embedded systems. Both manufacturers provided their own 'standard' library with concepts appropriate for their domain. Similarly, win32 and linux both provides containers and other utility concepts for their developers. C is just so close to the hardware that I couldn't imagine using it for anything generalized - there would have to be better language choices for higher level development.

Share this post


Link to post
Share on other sites
Run some tests on those macros first. Iirc, there's an error in the generated code. I don't recall exactly where as it's been several years.

Check your compiler specs for the command line switch that outputs preprocessor code. Instance those macros in your test file, then output the preprocessor results and review that code to locate the error.

Share this post


Link to post
Share on other sites
Quote:
Original post by LessBread
Run some tests on those macros first. Iirc, there's an error in the generated code. I don't recall exactly where as it's been several years.

Check your compiler specs for the command line switch that outputs preprocessor code. Instance those macros in your test file, then output the preprocessor results and review that code to locate the error.


Quote:
Original post by DevFred
Quote:
Original post by LessBread
Check your compiler specs for the command line switch that outputs preprocessor code.

gcc -E


I've tried the code and made some changes.

The changes were:
* due to html, the operator "->" was posted as "-". I've replaced that on the code.
* added "#define DEF_SIZE 2" since it was needed by the code (it assumed the value existed). However, it's easy to change the code to eliminate this need.
* in the macro-function "int type##_AddElem(type##_Vector *pV, type Data)", I've replaced the first if from "if(pV-len = pV-size)" to "if(pV->len == pV->size)"
* in the macro-function "type type##_GetElemAt(type##_Vector *pV, int pos)", I've replaced the if "if(pos = 0 && pos <= pV-len)" to "if(pos >= 0 && pos < pV->len)"

Here's the modified code:

#include <stdio.h>
#include <stdlib.h>


/*Macro definitions from http://www.flipcode.com/archives/Faking_Templates_In_C.shtml*/
#define DEF_SIZE 2

#define CREATE_VECTOR_TYPE_H(type) \
typedef struct _##type##_Vector{ \
type *pArray; \
type illegal; \
int size; \
int len; \
} type##_Vector; \
void type##_InitVector(type##_Vector *pV, type illegal); \
void type##_InitVectorEx(type##_Vector *pV, int size, type illegal); \
void type##_ClearVector(type##_Vector *pV); \
void type##_DeleteAll(type##_Vector *pV); \
void type##_EraseVector(type##_Vector *pV); \
int type##_AddElem(type##_Vector *pV, type Data); \
type type##_SetElemAt(type##_Vector *pV, int pos, type data); \
type type##_GetElemAt(type##_Vector *pV, int pos);

#define CREATE_VECTOR_TYPE_C(type) \
void type##_InitVector(type##_Vector *pV, type illegal) \
{ \
type##_InitVectorEx(pV, DEF_SIZE, illegal); \
} \
void type##_InitVectorEx(type##_Vector *pV, int size, type illegal) \
{ \
pV->len = 0; \
pV->illegal = illegal; \
pV->pArray = malloc(sizeof(type) * size); \
pV->size = size; \
} \
void type##_ClearVector(type##_Vector *pV) \
{ \
memset(pV->pArray, 0, sizeof(type) * pV->size); \
pV->len = 0; \
} \
void type##_EraseVector(type##_Vector *pV) \
{ \
if(pV->pArray != NULL) \
free(pV->pArray); \
pV->len = 0; \
pV->size = 0; \
pV->pArray = NULL; \
} \

int type##_AddElem(type##_Vector *pV, type Data) \
{ \
type *pTmp; \
if(pV->len == pV->size) \
{ \
pTmp = malloc(sizeof(type) * pV->size * 2); \
if(pTmp == NULL) \
return -1; \
memcpy(pTmp, pV->pArray, sizeof(type) * pV->size); \
free(pV->pArray); \
pV->pArray = pTmp; \
pV->size *= 2; \
} \
pV->pArray[pV->len] = Data; \
return pV->len++; \
} \
type type##_SetElemAt(type##_Vector *pV, int pos, type data) \
{ \
type old = pV->illegal; \
if(pos = 0 && pos <= pV->len) \
{ \
old = pV->pArray[pos]; \
pV->pArray[pos] = data; \
} \
return old; \
} \
type type##_GetElemAt(type##_Vector *pV, int pos) \
{ \
if(pos >= 0 && pos < pV->len) \
return pV->pArray[pos]; \
return pV->illegal; \
}


/* this is supposed to go into an h file*/
CREATE_VECTOR_TYPE_H(int)
CREATE_VECTOR_TYPE_H(double)
/* this is supposed to go into an c file*/
CREATE_VECTOR_TYPE_C(int)
CREATE_VECTOR_TYPE_C(double)

/*main*/
int main()
{

// then you use the functions like
int_Vector iv;
double_Vector dv;
int_InitVector(&iv, -1); // -1 is the illegal
double_InitVector(&dv, -1.0); // -1 is the illegal

int_AddElem(&iv, 10);
int_AddElem(&iv, 11);
int_AddElem(&iv, 12); /*cool, it grows*/

double_AddElem(&dv, 3.1415);

printf("intvector at[%d]=%d [expected 10]\n",0,int_GetElemAt(&iv,0));
printf("intvector at[%d]=%d [expected 11]\n",1,int_GetElemAt(&iv,1));
printf("intvector at[%d]=%d [expected 12]\n",2,int_GetElemAt(&iv,2));
printf("intvector at[%d]=%d [expected -1]\n",3,int_GetElemAt(&iv,3));


printf("doublevector at[%d]=%e [expected 3.1415]\n",0,double_GetElemAt(&dv,0));
printf("doublevector at[%d]=%e [expected -1.0]\n",1,double_GetElemAt(&dv,1));
printf("doublevector at[%d]=%e [expected -1.0]\n",2,double_GetElemAt(&dv,2));

int_EraseVector(&iv);
double_EraseVector(&dv);

return 0;
}









I didn't had time to make extensive testing... I'll problably do it in a day or two and say something then.

EDIT: ooooooops this code appears as strange here on gamedev... it's because of the macros :(
EDIT2.1: Followed Sneftel sugestion and replaces the "\" by "\ ". If you copy and paste the code and try it, it will generate warnings on some compilers because of the space after the slash. Some other compilers, may even generate an error on this. If you have problems, just replace "\ " for "\"

[Edited by - wolverine on August 20, 2008 4:16:54 PM]

Share this post


Link to post
Share on other sites
Quote:
Original post by wolverine
EDIT: ooooooops this code appears as strange here on gamedev... it's because of the macros :(

Yeah, the forum seems to have his own preprocessor who doesn't like a backslash at the end of the line. Does this also happen in normal text?

first line followed by backlash second line

And maybe we can escape the backlash in the code box with another backslash?

first line followed by double-backlash \second line


Share this post


Link to post
Share on other sites
Indeed. If the line ends with a backslash, you'll need to put a space after it to keep the line break from being removed. Totally awesome, huh? Pretty much every thread about macros has this happen sooner or later.

Share this post


Link to post
Share on other sites
Quote:
Original post by Sneftel
Indeed. If the line ends with a backslash, you'll need to put a space after it to keep the line break from being removed. Totally awesome, huh? Pretty much every thread about macros has this happen sooner or later.


Yep, it worked :)

Share this post


Link to post
Share on other sites
Quote:
Original post by Gage64
Maybe C99 supports function overloading, but AFAIK most compilers don't support C99 so relying on it is probably not a good idea.
Yeah I forgot you can't do overloading in C. Thanks for reminding me ;)

Manual overloading (putting ELEMENT_TYPE into the func name) it is then.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this