
List functionality in C


Hello All, I know that C doesn't come with any sort of list functionality, but I have come across a situation where a list would be really handy. Does anybody know any tricks? Thanks in advance.

Well, I guess it depends on what you mean by list. If you mean something built in on par with C++'s std::list, no, of course not. But you can write your own, of course!

First, what do you mean by list?
Second, how much experience do you have? (Can you write your own?)
Lastly, are you coming from another language like Python that had something it called a list without requiring you to know what it really was?

Also, what's wrong with plain old array?
Also also, any reason not C++? Much better at this type of thing (the aforementioned std::list, but there's also std::vector (resizable array) and plenty more).

I've come from OO languages like Java, C# and C++. I'm using C at the moment because I have been instructed to by my tutor; there's no getting out of it, I'm afraid. I can write a list in OO languages easily enough. The only reason I wouldn't initially use an array over a list is a situation where the required dimensions of the array are unknown, i.e. loading data in from a file.

The keyword would be linked list. In C this involves having nodes with a pointer to the next node (and optionally to the previous one).

You could still use a dynamic array with malloc. Once the size needs to increase, malloc a new buffer of the size of the previous buffer plus the new data, copy everything over and free the old buffer.
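For example, a minimal singly linked list in C might look like this (a sketch; the names Node, list_push_front and list_length are just for illustration, not from anyone's actual code):

```c
#include <stdlib.h>

/* A singly linked list node: each node points to the next one. */
typedef struct Node {
    int value;          /* payload; could be anything */
    struct Node *next;  /* NULL marks the end of the list */
} Node;

/* Prepend a value; returns the new head, or NULL on allocation failure. */
Node *list_push_front(Node *head, int value)
{
    Node *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->value = value;
    node->next = head;
    return node;
}

/* Count the nodes by walking the chain. */
size_t list_length(const Node *head)
{
    size_t n = 0;
    while (head != NULL) {
        n++;
        head = head->next;
    }
    return n;
}
```

To release the list, walk it the same way, freeing each node after saving its next pointer.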

Quote:
Original post by Endurion
Once the size needs to increase, malloc a new buffer of the size of the previous buffer plus the new data, copy stuff over and free the old buffer.
Idiomatic C uses realloc() to do this in one step.



I'm having a bit of a nightmare implementing realloc:


// Initialized in a loop somewhere else

vectors[i].refs = malloc(sizeof(Poly *));

// Later on

Poly* polyArr;

ptr.v1 = &vectors[poly[one]];
polyArr = (Poly *) ptr.v1->refs;
size = sizeof(Poly *)*(ptr.v1->refCount+2);
polyArr[ptr.v1->refCount] = ptr;
vectors[poly[one]].refs = realloc(vectors[poly[one]].refs, size); // Crashes here
ptr.v1->refCount++;

HEAP[gp.exe]: Heap block at 00933290 modified at 009332C0 past requested size of 28
Windows has triggered a breakpoint in gp.exe.

Quote:
Original post by DarkMatter2008
I'm having a bit of a nightmare implementing realloc


vectors[poly[one]].refs = realloc(vectors[poly[one]].refs, size); // Crashes here


Does vectors[poly[one]].refs refer to memory that was allocated with malloc/calloc/realloc? I would recommend simplifying your code here as it's likely that you have slipped up in this regard.

Also, the way you're using realloc is not quite correct. If it returns NULL, you have no way of free()ing the memory that vectors[poly[one]].refs was previously pointing to. Some people dismiss malloc/realloc/calloc failures as more-or-less impossible and so ignore such things, but I'm not one of those people :)
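To illustrate that point, the usual pattern is to store realloc's result in a temporary first, so the original pointer survives a failure (a sketch; grow_buffer is a made-up name):

```c
#include <stdlib.h>

/* Grow an int buffer to new_count elements without leaking on failure.
   Returns 1 on success, 0 on failure (original buffer left intact). */
int grow_buffer(int **buf, size_t new_count)
{
    int *tmp = realloc(*buf, new_count * sizeof **buf);
    if (tmp == NULL)
        return 0;   /* *buf is still valid and can still be freed */
    *buf = tmp;
    return 1;
}
```

Note that realloc(NULL, n) behaves like malloc(n), so this also works for the very first allocation.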

Edd

Quote:
Original post by DarkMatter2008
HEAP[gp.exe]: Heap block at 00933290 modified at 009332C0 past requested size of 28
Windows has triggered a breakpoint in gp.exe.


This error is indicative of a buffer overrun. The debug heap will check the memory when you free it, or when you call realloc, to see whether you've written past the end of it - and it looks like that's what you've done here.

Believe me, I am trying to make it simple :)

Here's, in essence, what I'm trying to do:


struct Vector
{
    void * refs; // Needs to store Polys that have referenced this
    int refsCount;
};

struct Poly
{
    Vector * vector;
};

struct Model
{
    Poly * polys;
};

Vector vectors[10];

// Initialize Vectors
for(int i = 0; i < 10; i++)
{
    vectors[i].refs = malloc(sizeof(Poly *));
    vectors[i].refsCount = 0;
}

Model someModel;
someModel.polys = malloc(sizeof(Poly*)*10);

Poly * ptr = someModel.polys;

for(int i = 0; i < 10; i++)
{
    ptr->vector = &vectors[i];
    vectors[i].refs[vectors[i].refsCount] = ptr; // Need to record which poly referenced it
    vectors[i].refsCount++;
    vectors[i].refs = realloc(vectors[i].refs, sizeof(Poly *) * (vectors[i].refsCount + 1)); // Need to resize refs in case another poly references this vector

    ptr++;
}

Quote:
Original post by DarkMatter2008
struct Vector
{
void * refs; // Needs to store Polys that have referenced this;
int refsCount;
};


Why are you using void? Are you going to have references to structs other than Poly? If not, forward declare Poly and use pointers to Poly. Also, you need to decide between arrays and linked lists here. The typical way of doing a linked list in C would be to add a "Poly *next;" to the Poly struct and then refs should be a pointer to the first Poly in the list. If you're using arrays, then you'll want refs to be a pointer to pointer to Poly (Poly **refs;) so that you can point to an array of Poly structs.
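As a sketch of the array-of-pointers variant described above (the names vector_add_ref and refCount here are illustrative, not from the original code):

```c
#include <stdlib.h>

typedef struct Poly Poly;

typedef struct Vector {
    Poly **refs;   /* array of pointers to the Polys referencing this vector */
    int refCount;
} Vector;

struct Poly {
    Vector *vector;
};

/* Record that p references v, growing the pointer array by one.
   Returns 1 on success, 0 on allocation failure (v left unchanged). */
int vector_add_ref(Vector *v, Poly *p)
{
    Poly **tmp = realloc(v->refs, (v->refCount + 1) * sizeof *v->refs);
    if (tmp == NULL)
        return 0;
    v->refs = tmp;
    v->refs[v->refCount++] = p;
    return 1;
}
```

Because refs is declared Poly ** rather than void *, the compiler can check the element type and the [] indexing just works.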

And, for what it's worth, // comments are only valid in C++ and C99 not C89.

Quote:

struct Poly
{
Vector * vector;
};


This shouldn't compile in C since you don't get the automatic typedefs. You should write:
typedef struct Poly_ Poly;
struct Poly_ {
    // ...
};


You should actually put the typedef line before you define Vector to forward declare Poly for use in Vector.

Quote:

for(int i = 0; i < 10; i++)


Defining a variable in the for statement like this is, again, a feature of C++ and C99. If you're using C89 it shouldn't compile.

Quote:

vectors[i].refs = malloc(sizeof(Poly *));


It's generally a good idea to write:
var = malloc(sizeof *var);
since it's a bit more robust in the face of changing the type or typename of the variable var.

Quote:

Model someModel;


Mixing variable declarations with code is, again, only for C++ and C99, not C89.


Other than that, and a few typos, it compiles as C99 code, looks right (although it's a little strange to allocate for space+1 instead of allocating the +1 when you need it), and doesn't crash for me.

Quote:
Original post by DarkMatter2008
someModel.polys = malloc(sizeof(Poly*)*10);


sizeof(Poly *) returns the size of a pointer to a Poly, which will be 4 on 32-bit platforms. I'm guessing you want sizeof(Poly) here. It only works because your Poly consists of just one pointer as well -- if it had anything else in there, it wouldn't work.

As I said before, this error:

Quote:
Heap block at 00933290 modified at 009332C0 past requested size of 28


is indicative of a buffer-overrun, and that's exactly what using sizeof(Poly *) instead of sizeof(Poly) would do.

Quote:

for(int i = 0; i < 10; i++)
{
ptr->vector = &vectors[i];
vectors[i].refs[vectors[i].refsCount] = ptr; // Need to record which poly referenced it
vectors[i].refsCount++;
vectors[i].refs = realloc(vectors[i].refs, sizeof(Poly *) * (vectors[i].refsCount + 1)); // Need to resize refs in case another poly references this vector
ptr++;
}


As mentioned by Way Walker, this code is a little funny. It would make more sense to resize the array when you need to, not after you've finished (as you've got it, your array is always going to be one bigger than needed).

Also, you should probably allocate a chunk of memory up-front and only resize it when you run out of space, rather than reallocating every time. I guess it's not that performance critical since you'll only do this when you "load" your model, but still...


Quote:
Original post by Codeka
Quote:
Original post by DarkMatter2008
someModel.polys = malloc(sizeof(Poly*)*10);


sizeof(Poly *) returns the size of a pointer to a Poly, which will be 4 on 32-bit platforms. I'm guessing you want sizeof(Poly) here. It only works because your Poly consists of just one pointer as well -- if it had anything else in there, it wouldn't work.


Heh, didn't even notice that because I'm so used to using "sizeof *someModel.polys" instead of "sizeof(Poly)".

Quote:

Why are you using void? Are you going to have references to structs other than Poly? If not, forward declare Poly and use pointers to Poly.


The reason for the void instead of a Poly pointer is a matter of which comes first, the chicken or the egg? This here doesn't compile:


struct Vector
{
    Poly * refs;
};

struct Poly
{
    Vector * vector;
};


As Poly has not been declared yet. I can't swap the structs around either, because then the Poly struct will complain that the Vector struct has not been defined.

Quote:

Also, you need to decide between arrays and linked lists here. The typical way of doing a linked list in C would be to add a "Poly *next;" to the Poly struct and then refs should be a pointer to the first Poly in the list.


Initially I was going to go with a list as they are easier to write to, but I'll only be writing to them at the initialisation stage, and arrays are much faster to access. Then again, I'll be reading sequentially, so would there be any difference if I did use a list? Anyway, it is the vector that needs to store a record of which Polys have referenced it; I need that for Gouraud lighting calculations later, when working out the light intensity of the vector.

Quote:

If you're using arrays, then you'll want refs to be a pointer to pointer to Poly (Poly **refs;) so that you can point to an array of Poly structs.


I'm not sure what you mean here, is that a 2D array? I only want a 1D array of Poly pointers per Vector struct.

Quote:

And, for what it's worth, // comments are only valid in C++ and C99 not C89.


I'm using VS2008, is there any way to make it conform to C? It doesn't seem to have plain C in the create-new-project window.


Quote:

vectors[i].refs = malloc(sizeof(Poly *));


It's generally a good idea to write:
var = malloc(sizeof *var);
since it's a bit more robust in the face of changing the type or typename of the variable var.

Again, I'm not too sure what you mean here. Despite declaring refs as a void pointer, I only want it for Poly pointers; the type won't change. What I'm trying to do is set up an array of Poly pointers inside the Vector struct.

Quote:

Other than that, and a few typos, it compiles as C99 code, looks right (although it's a little strange to allocate for space+1 instead of allocating the +1 when you need it), and doesn't crash for me.


I'm always open to a better solution if one is available. I lived a sheltered life in the land of Java, so I am finding this a bit challenging to get a grasp of.

Quote:

sizeof(Poly *) returns the size of a pointer to a Poly, which will be 4 on 32-bit platforms. I'm guessing you want sizeof(Poly) here. It only works because your Poly consists of just one pointer as well -- if it had anything else in there, it wouldn't work.


I thought because I want references instead of actually storing a struct in an array, I'd use (Poly *); I'm trying to keep a record of which polygons have referenced each vector. Ideally the Poly would have 3 pointers to Vectors and other such information that a poly would require; I simplified it for the demonstration.

Apologies if I seem rude or anything like that, I'm just a little confused.

Quote:
Original post by DarkMatter2008
I lived a sheltered life in the land of Java so I am finding this a bit challenging to get a grasp of.


So true...

Such rich and flexible dynamic types are cumbersome in C. Linked lists are often simpler, but before going that route, consider whether you really need it.

If you have even the slightest chance of using fixed-size arrays, do it.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


#define MAX_POLYS 25
#define MAX_VECTORS 25

typedef struct VECTOR Vector;

typedef struct {
    Vector * p_vect;
    int foo;
} Poly;

struct VECTOR {
    char name[15];
    size_t n;
    Poly poly[MAX_POLYS];
};

int main(void) {
    char buf[20];
    Vector vectors[MAX_VECTORS];
    Vector * pv;
    size_t n_vectors;

    size_t i, j;

    n_vectors = (MAX_VECTORS / 2);
    for (i = 0; i < n_vectors; i++) {
        pv = &vectors[i];

        strcpy(pv->name, "Vector #");
        strcat(pv->name, _itoa((int)i, buf, 10));

        pv->n = rand() % MAX_POLYS;
        for (j = 0; j < pv->n; j++) {
            pv->poly[j].foo = (int)j;
            pv->poly[j].p_vect = pv;
        }
    }

    for (i = 0; i < n_vectors; i++) {
        printf("%20s: ", vectors[i].name);
        for (j = 0; j < vectors[i].n; j++) {
            printf("%d ", vectors[i].poly[j].foo);
        }
        printf("\n");
    }

    return 0;
}


If over-allocating seems problematic, ask yourself: how much memory would 1,000,000 Poly structures really consume?

And for numeric calculations, fixed allocations may be even better, since they lead to potentially simpler and faster code.

But unless you actually have to deal with dynamic data structures, don't reinvent them in C. It's often not needed; just make sure to double-check array sizes, because nobody will warn you if you write over them.

Quote:
Original post by DarkMatter2008
Quote:

Why are you using void? Are you going to have references to structs other than Poly? If not, forward declare Poly and use pointers to Poly.


The reason for the void instead of a Poly pointer is a matter of which comes first, the chicken or the egg?


What you wrote above doesn't compile in C, either, because you didn't typedef your structs (or, alternatively, write "struct Poly" instead of "Poly"). Just forward declare the struct:
typedef struct Vector Vector;
typedef struct Poly Poly;
struct Vector {
    Poly *refs;
};
struct Poly {
    Vector *vector;
};
The same "trick" can be useful in C++ as well.

Quote:

Initally I was going to go with a list as they are easier to write to, but i'll only be writing to them at the initialisation stage, arrays access much faster, hmmm but saying that I'll be reading sequentially so would there be any diffence if i did use a list?


It's not something I know much about, but I believe iteration of an array can be faster due to the array being a single block of memory, but I don't know how much of an effect that would have in the present case.

Quote:

Quote:

If you're using arrays, then you'll want refs to be a pointer to pointer to Poly (Poly **refs;) so that you can point to an array of Poly structs.


I'm not sure what you mean here, is that 2D array? I only want a 1D array of Poly pointers per Vector Struct.


Right, you want an array of pointers. Usually, you work with a pointer to the first element of an array instead of the array itself (e.g. malloc() essentially returns such a pointer). If the array contains pointers, that would be a pointer to pointer. A 2D array is something else entirely, but is often emulated by a pointer to pointer.

Quote:

Quote:

And, for what it's worth, // comments are only valid in C++ and C99 not C89.


I'm using VS2008, is there anyway to make it so it conforms to C? it doesn't seem to have just C in the create new project window.


I'm not familiar with VS2008, but it may not be a problem in your current case (these were common extensions to C89 which is why they're in the new version of C, C99). I was just pointing it out so that you'd know what's wrong if it does become a problem.

Quote:

Quote:

vectors[i].refs = malloc(sizeof(Poly *));


It's generally a good idea to write:
var = malloc(sizeof *var);
since it's a bit more robust in the face of changing the type or typename of the variable var.


Again, I'm not too sure what you mean here. Despite declaring refs as a void pointer, I only want it for Poly pointers; the type won't change. What I'm trying to do is set up an array of Poly pointers inside the Vector struct.

It's mostly a style guideline that I find can help to reduce problems down the line. It's also harder to mix up the type because you're basically letting the compiler figure it out for you. In another part of what you posted, you wrote "sizeof(Poly *)" when you meant "sizeof(Poly)", but if you'd used the "sizeof *var" form, it would've been a non-issue.

Quote:

I'm always open to a better solution if one is avalible, I lived a sheltered life in the land of Java so I am finding this a bit challenging to get a grasp of.


Well, for the allocation, just switch the order you do it. Just allocate the space you need right before you add another reference instead of allocating extra space after adding the reference.

The other thing to understand is the difference between pointers and variables, which in Java (I hope I get this right, it's been a while since I touched Java) is the difference between class references and primitive values, except that in Java only classes have reference semantics and only primitives have value semantics whereas in C (and C++) you can have a value of any type and a pointer to any type.

Quote:

Quote:

sizeof(Poly *) returns the size of a pointer to a Poly, which will be 4 on 32-bit platforms. I'm guessing you want sizeof(Poly) here. It only works because your Poly consists of just one pointer as well -- if it had anything else in there, it wouldn't work.


I thought because I want references instead of actually storing a struct in an array, I'd use (Poly *); I'm trying to keep a record of which polygons have referenced each vector. Ideally the Poly would have 3 pointers to Vectors and other such information that a poly would require; I simplified it for the demonstration.


Right, but you have to store the actual Poly structures somewhere (otherwise there's nothing to point to). In the code, it looked like you intended to store them in someModel.polys, so you need to store something the size of an actual Poly struct, not the size of a pointer to a Poly struct.

If you'd used the malloc(N*sizeof *var) format, the compiler would've seen that var is of type Poly *, so *var is of type Poly, and would've done the right thing.

Quote:
Original post by Way Walker

It's not something I know much about, but I believe iteration of an array can be faster due to the array being a single block of memory, but I don't know how much of an effect that would have in the present case.


You are correct. Adjacent elements are also loaded whenever you access something from RAM (from the stack or heap), and if you're using those elements (i.e. you're using an array), the variable will already have been loaded and access is much quicker. (Or at least, that's the theory.)

See: caching, L2 cache, memory loading, etc.

Even if this weren't true, compare accessing the fifth element of a list and of an array:

ListNode* ln = nodeBase.begin;
int i;
for(i = 0; i < 5; i++)
    ln = ln->next;

Compared to

ArrayElem* a = arrayBase.begin;
a += 5;


However, lists do grow faster. When a vector is resized, it has to allocate new memory, copy the old, and delete the old pointer. When a list grows, you just allocate one node at a time; for constant-time appends, your list base class can hold a pointer to the last node. If growing, or sorting, is more important than iterating, lists are usually better.
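In C terms, the constant-time append with a tail pointer looks something like this (a sketch; List and list_append are illustrative names):

```c
#include <stdlib.h>

typedef struct Node {
    int value;
    struct Node *next;
} Node;

typedef struct List {
    Node *head;
    Node *tail;  /* keeping a tail pointer makes append O(1) */
} List;

/* Append one node in constant time; returns 0 on allocation failure. */
int list_append(List *list, int value)
{
    Node *node = malloc(sizeof *node);
    if (node == NULL)
        return 0;
    node->value = value;
    node->next = NULL;
    if (list->tail != NULL)
        list->tail->next = node;  /* link after the current last node */
    else
        list->head = node;        /* list was empty */
    list->tail = node;
    return 1;
}
```

No existing elements are touched on append, which is the contrast being drawn with a vector's occasional reallocate-and-copy.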

Quote:
Original post by Splinter of Chaos
However, lists do grow faster. When a vector is resized, it has to allocate new memory, copy the old, and delete the old pointer. When a list grows, its last node, which for constant access time, your list base class can hold a pointer to, and you can just allocate one node at a time. If growing, or sorting, is more important than iterating, lists are usually better.
That's not the whole story. Containers such as std::vector actually grow by a percentage of the original size (e.g. doubling), resulting in amortised constant time push_back operations. This is required by the C++ standard. In practice they aren't going to be much slower than lists at all.

The primary reason to ever use a list is for insertion and removal from the middle, as well as to prevent iterators being invalidated by doing either of those things. Having a class with a very expensive copy-constructor is one other possible reason. Otherwise vector is a better choice.

[Edited by - iMalc on January 23, 2009 3:14:33 PM]

Quote:
That's not the whole story. Containers such as std::vector actually grow by a percentage of the original size (e.g. doubling), resulting in amortised constant time push_back operations. This is required by the C++ standard. In practice they aren't going to be much slower than lists at all.


I have read this before and it has always puzzled me. Assume that a std::vector doubles the size of its internal buffer on an overflow condition and that it initializes with a size of 1. Now if 1024 elements are appended using push_back() (without reserve), this will involve 10 buffer reallocations and a consequent copy of all elements in the buffer. Intuitively, it strikes me that push_back() of n elements will be O(n log n). At least it appears to be of a different order from random access of one element, e.g. O(1).

[Edited by - chairthrower on January 23, 2009 9:11:52 PM]

You're right, it's not exactly going to be constant time. It's going to usually be constant time. I don't really understand "amortized", but I think that's what it means.

Quote:
Original post by iMalc
Quote:
Original post by Splinter of Chaos
However, lists do grow faster. When a vector is resized, it has to allocate new memory, copy the old, and delete the old pointer. When a list grows, its last node, which for constant access time, your list base class can hold a pointer to, and you can just allocate one node at a time. If growing, or sorting, is more important than iterating, lists are usually better.
That's not the whole story. Containers such as std::vector actually grow by a percentage of the original size (e.g. doubling), resulting in amortised constant time push_back operations. This is required by the C++ standard. In practice they aren't going to be much slower than lists at all.
Your linked resource forgot to mention the amortized part on that page (but got it on another). I'm surprised that reference missed it.

Although, of interest might be this article, which half way down has a graph showing grow and traverse times, and a few other things as well. It looks like the list is actually slower to grow than the vector...but wait!
Quote:
in fairness I gave vector special treatment by calling vector::reserve(1000000) first to avoid any reallocations.
So, it seems the test might be fair for predictable sizes, but we're talking about not knowing the full size. True, vectors have a great advantage when the end size is known, but are incredibly inefficient otherwise.

According to GCC's interpretation (I'm reading the source):
    std::vector<int> x; // Capacity and size are zero.
x.push_back(1); // Capacity and size are one. Realloc happened.
x.push_back(1); // Capacity and size are two. Realloc happened.
x.push_back(1); // Capacity four, size three. Realloc happened.
x.push_back(1); // Capacity four, size four.
x.push_back(1); // Capacity eight, size five. Realloc happened.

G++ doubles the capacity, but it's worth noting (also checking the source) that MSVC only grows its vectors by 50 percent. Thus, gcc will waste space when the vector gets larger but reallocate less often; MSVC will spend more time reallocating but waste less space.

But, I unfortunately can't prove this because I tried the same test as the "guru" and I get around 400 clock ticks (using std::clock()) for the list (push_back()ing 1,000,000 ints) and 32 for the vector test, and same for when I reserve space. These numbers are outrageously low compared to the guru's results and I wonder if my tests are in some way wrong.

Quote:
The primary reason to ever use a list is for insertion and removal from the middle, as well as to prevent iterators being invalidated by doing either of those things. Having a class with a very expensive copy-constructor is one other possible reason. Otherwise vector is a better choice.


Quick sorting is another reason. A really good one, actually. And, really, according to my link, a deque would actually be best since its grow time is lower and its traverse time is reasonable. However, it's incredibly difficult to write one, so a list is a much better choice.

Quote:
Original post by Splinter of Chaos
You're right, it's not exactly going to be constant time. It's going to usually be constant time. I don't really understand "amortized", but I think that's what it means.
It's not so much about what it 'usually' is, but what the long term average of the time taken is. For example, pushing back 1024 elements should take about twice as long as pushing back 512, which should again take about twice as long as pushing back 256, etc. Going by those measurements one would have to assume each operation was constant time. It's only when you look at the time taken for each individual push_back that you notice that they don't all take the same amount of time.
Quote:
Quote:
Original post by iMalc
Quote:
Original post by Splinter of Chaos
However, lists do grow faster. When a vector is resized, it has to allocate new memory, copy the old, and delete the old pointer. When a list grows, its last node, which for constant access time, your list base class can hold a pointer to, and you can just allocate one node at a time. If growing, or sorting, is more important than iterating, lists are usually better.
That's not the whole story. Containers such as std::vector actually grow by a percentage of the original size (e.g. doubling), resulting in amortised constant time push_back operations. This is required by the C++ standard. In practice they aren't going to be much slower than lists at all.
Your linked resource forgot to mention the amortized part on that page (but got it on another). I'm surprised that reference missed it.
Yes, so am I.
Quote:
Although, of interest might be this article, which half way down has a graph showing grow and traverse times, and a few other things as well. It looks like the list is actually slower to grow than the vector...but wait!
Quote:
in fairness I gave vector special treatment by calling vector::reserve(1000000) first to avoid any reallocations.
So, it seems the test might be fair for predictable sizes, but we're talking about not knowing the full size. True, vectors have a great advantage when the end size is known, but are incredibly inefficient otherwise.
"incredibly inefficient" is far too strong a way of putting it. It's actually pretty darn good in reality. Even if the push_back operation takes slightly longer for a vector, iteration through the vector is much, much faster than that of a list, due to everything being in contiguous memory and hence more often in the cache. All good sources will tell you that your default choice of container should always be vector.
Quote:
Quote:
The primary reason to ever use a list is for insertion and removal from the middle, as well as to prevent iterators being invalidated by doing either of those things. Having a class with a very expensive copy-constructor is one other possible reason. Otherwise vector is a better choice.


Quick sorting is another reason. A really good one, actually. And, really, according to my link, a dequeue would actually be best since grow time is lower and traverse time is reasonably. However, it's incredibly difficult to write one, so a list is a much better choice.
Sorting a list is not too bad as long as you correctly use the list's sort() member function and not std::sort from the algorithm header. Merge sort on a list is pretty simple, is always O(n log n), and only requires O(log n) stack space. Given that it's also stable, it's fairer to compare it against stable_sort for an array. Then I think you'd find that they're about the same speed.

One reasonable way of implementing a deque is actually really easy: just use a circular buffer and keep track of the head and tail.
I don't think you're justified in saying that a list is an outright better choice; it depends on the things I mentioned earlier.

Quote:
Original post by chairthrower
I have read this before and it has always puzzled me. Assume that a std::vector doubles the size of its internal buffer on overflow condition and that it initializes with a size of 1. Now if 1024 elements are appended using push_back() (without reserve) this will involve 10 buffer reallocations and consequent copy of all elements in the buffer. Intuitively this strikes me that push_back() of n elements will be O( n log n). At least it appears to be of a different order from random access of one element eg O( 1).
You're forgetting that all of the items added since the last resize haven't had to be copied at all. Then those since the reallocation before that have only had to be copied once. etc.
Assuming doubling as growth strategy for a push_back of 1024 items:
That's 512 that didn't need copying
256 were copied once
128 were copied twice
64 were copied 3 times
32 were copied 4 times
16 were copied 5 times
8 were copied 6 times
4 were copied 7 times
2 were copied 8 times
1 was copied 9 times
Now sum them up:
512*0 + 256*1 + 128*2 + 64*3 + 32*4 + 16*5 + 8*6 + 4*7 + 2*8 + 1*9 = 1013 copies.
Very close to the number of push_back operations.
Now assume there were only half as many push_backs:
256*0 + 128*1 + 64*2 + 32*3 + 16*4 + 8*5 + 4*6 + 2*7 + 1*8 = 502 copies.
Very close to the number of push_back operations again.
Each item has, on average, been copied slightly less than once during all the push_back operations. Pretty neat, huh!

[Edited by - iMalc on January 24, 2009 1:58:15 AM]

Quote:
Quote:
Original post by chairthrower
I have read this before and it has always puzzled me. Assume that a std::vector doubles the size of its internal buffer on overflow condition and that it initializes with a size of 1. Now if 1024 elements are appended using push_back() (without reserve) this will involve 10 buffer reallocations and consequent copy of all elements in the buffer. Intuitively this strikes me that push_back() of n elements will be O( n log n). At least it appears to be of a different order from random access of one element eg O( 1).

You're forgetting that all of the items since the last resize haven't had to be copied at all. Then those since the reallocation before that have only had to be copied once. etc.
Assuming doubling as growth strategy:
That's 512 that didn't need copying
256 were copied once
128 were copied twice
64 were copied 3 times
32 were copied 4 times
16 were copied 5 times
8 were copied 6 times
4 were copied 7 times
2 were copied 8 times
1 was copied 9 times
Now sum them up: 512*0 + 256*1 + 128*2 + 64*3 + 32*4 + 16*5 + 8*6 + 4*7 + 2*8 + 1*9 = 1013 copies. Very close to the number of push_back operations. Now assume there were only half as many push_backs:
256*0 + 128*1 + 64*2 + 32*3 + 16*4 + 8*5 + 4*6 + 2*7 + 1*8 = 502 copies. Very close to the number of push_back operations again.


Ahhh, that's very clear - I just never managed to think it through properly. And not to throw anybody off, I agree that std::vector should normally be the first collection class to reach for.

Quote:
Original post by iMalc
Now sum them up:
512*0 + 256*1 + 128*2 + 64*3 + 32*4 + 16*5 + 8*6 + 4*7 + 2*8 + 1*9 = 1013 copies.
Very close to the number of push_back operations.
Now assume there were only half as many push_backs:
256*0 + 128*1 + 64*2 + 32*3 + 16*4 + 8*5 + 4*6 + 2*7 + 1*8 = 502 copies.
Very close to the number of push_back operations again.
Each item has on average being copied slightly less than once, during all the push_back operations. Pretty neat huh!


Ah, you're right, saying it's incredibly inefficient was short-sighted because I only looked at the first five pushes. I guess if there are more elements, it's not such a problem. If the type it's holding were expensive to copy, this would be a bigger problem, but here it's just pointers.

I'd still write a list myself for the lower code complexity, but I understand why vector isn't that bad.

(Although, deque would be better if we REALLY didn't care about complexity.)

Quote:
Original post by Way Walker
Original post by DarkMatter2008

Quote:

Quote:

Quote:

If you're using arrays, then you'll want refs to be a pointer to pointer to Poly (Poly **refs;) so that you can point to an array of Poly structs.


I'm not sure what you mean here, is that 2D array? I only want a 1D array of Poly pointers per Vector Struct.


Right, you want an array of pointers. Usually, you work with a pointer to the first element of an array instead of the array itself (e.g. malloc() essentially returns such a pointer). If the array contains pointers, that would be a pointer to pointer. A 2D is something else entirely, but is often emulated by a pointer to pointer.


It took a while for it to sink in, but I understand now, and it's solved my problem. Here's what I did.


/* Somewhere before; refsCount is a pre-counted number */
Poly ** refs = (Poly **) malloc(sizeof(Poly *) * vectors[i].refsCount);

/* Later on, when populating the array with pointers */
Poly * ptr = somemodel.polys;

/* In a loop */
ptr->v1 = &vectors[polys[firstVector]];
ptr->v1->refs[ptr->v1->refIndex++] = ptr;


What I seemed to be doing before was clumsily jamming a Poly struct into space made for pointers, eventually causing a crash. Another problem I found is that [] dereferences pointers, so your pointer-to-pointer technique solved that for me. Thank you very much :)

P.S. Does anybody know how to make a project in VS2008 conform to plain C? I would like my code to conform to C, not C++.

Quote:
Original post by Splinter of Chaos
I only looked at the first five pushes. I guess if there are more elements, it's not such a problem.
Yeah, that's one reason why some implementations start with a minimum capacity, e.g. 10. Windows allocators don't break memory up into anything smaller than a certain size either; I think it's something like 8 bytes?

[Edited by - iMalc on January 24, 2009 2:21:12 PM]

Quote:
Original post by DarkMatter2008
P.S. Does anybody know how to make a project in VS2008 conform to plain C? I would like my code to conform to C not C++;


If you change the file extension to *.c, by default it will try to compile it as C. You can change it in the project settings, too, but I tend to just change the extension.

Note that it's very definitely not a C99 compiler.
