linked lists and templates, Oh My!!

Started by
33 comments, last by webwraith 17 years, 4 months ago
Quote:Original post by rip-off
Quote:Original post by Deyja
I'm fairly certain that putting anything in namespace std is forbidden by the standard. Even writing "namespace std {}" is illegal. Only the standard library itself may open that namespace.


What about forward declaration?

Is it wrong to forward declare std classes/functions? I think I remember doing this at one stage...


Nope. That's why there's standard headers like <iosfwd>, which provide forward declarations for you.

See, there's no guarantee that your standard library's implementation is exactly as shown in the standard. For example, the implementation is allowed to provide additional (defaulted) template arguments.

The nasty bit is that all this standard header compilation can slow down a build. That's why every compiler vendor out there worth their salt has implemented precompiled headers, which pretty much obviate the need for the export keyword implementation in a way that actually implementing the export keyword could never do.

Stephen M. Webb
Professional Free Software Developer

Advertisement
@Nitage:

That comment was directed at a rather abusive AP, who's post has (thankfully) been deleted, now (thanks, Kylotan).

I have nothing against the standard library, and I know that this will probably get a few more people posting to tell me I'm wrong in doing this, but the other reason I wanted to roll my own was in an attempt to minimize the actual size of the code that would be produced. OK, OK, there's bound to be someone who's going to tell me that it doesn't matter for x number of reasons, fair enough. I'm not too bothered about all that, I'd just like to be able to make a linked list as a starting point
writing code is never wrong ... only using the code you've written can be wrong :)

seriously, if you boil down the std vs custom argument, that is the result of the way we on gamedev feel.

Anytime you are doing something that duplicates existing functionality and you know this, and are doing it on purpose, and don't want game dev people to waste all our time posting about how you shouldn't be doing it like that .. then just add statements like "this is a learning exercise" or "I know the (whatever) already does this, but I'm trying to improve my programing skills by doing it myself - will investigate / use the (whatever) after I've finshed my learning expercise".

Sure you shouldn't have to explain this much to just ask a question, but unfortunatly internet forums have almost no context, we don't know anything about you or your situation, so without such statements people cannot properly navigate the different ways they might respond to a question. (and answer to a beginner about MMOs would be different than an answer to a Blizzard employee about the scalability of a partciluar networking system for use in an MMO).

These C++ standards purist can get kinda nazi on you ... and so can the write-everything-themselves posters who are constantly asking questions about how to do X correctly, even though X is not necessary at all (either already done or a less than ideal solution in the first place).
@Xai: Thanks for that, will do from now on.
Right, so lets scratch the posts so far(with the exception of a few, notably Nitage and Xai) and start again.

*************************
* THIS IS A LEARNING EXERCISE *
*************************

and for those that (probably don't) want to know, most of my stuff ends up with "mgl" on the front for "my gaming library" (not much of one, ATM)

could someone pleaser explain to me why the following will compile and link, but then give me an error message?

Be polite, as stated above: THIS IS A LEARNING EXERCISE!!

main.cpp;
#include <iostream>#include "mgllist.h"int main(){    mglList<int> mylist;    return 0;}


mglList.h;
#ifndef MGL_LIST_H_#define MGL_LIST_H_template <class T>class mglList{    protected:        /* the node structure*/        struct node{            node* prev;            node* next;            T* data;            node(node*prev,node*next,T*data):prev(prev),next(next),data(data){if(next!=NULL)next->prev=this;if(prev!=NULL)prev->next=this;}            node* remove(){next->prev=prev;prev->next=next;return this;}        };        node* first;        node* last;        unsigned int _size;    public:        enum position{MLIST_NPOS,MLIST_FIRST,MLIST_LAST};        void push_back(T* data)        {            if(size==0)            {                first=new node(NULL,NULL,data);                last=first;                size++;            }            else            {                node* temp=new node(last,NULL,data);                last->next=temp;                last=temp;                size++;            }        }        void push_front(T* data)        {            if(size==0)            {                first=new node(NULL,NULL,data);                last=first;                size++;            }            else            {                node* temp=new node(NULL,first,data);                first->prev=temp;                first=temp;                size++;            }        }        T* pop_back()        {            T* temp=last->data;            erase(0,MLIST_LAST,1);            return temp;        }        T* pop_front()        {            T* temp=first->data;            erase(0,MLIST_FIRST,1);            return temp;        }        void insert(unsigned int offset,position pos,T* data)        {            unsigned int count=0;            node* temp,*ins;//temp is used to find the right node, ins is the node to be inserted            if(pos==MLIST_FIRST)            {                temp=first;                while(count!=offset || temp!=last)                {                    temp=temp->next;                    count++;                }                ins=new node(temp,temp->next,data);            }            else if(pos==MLIST_LAST)            {                temp=last;                count=offset;                while(count!=0 || temp!=first)                {                    temp=temp->prev;                    count--;                }                ins=new node(temp,temp->next,data);            }            size++;        }        void erase(int offset,position pos,unsigned int num=0)        {            node*temp;            node*t2;// t2 is used to delete a node after moving on to the next,new_last is set before deleting the nodes            int check;            if(pos==MLIST_FIRST)            {                temp=first;                check=0;            }            else if(pos==MLIST_LAST)            {                temp=last;                check=-offset;            }            while(check!=offset)            {                temp=temp->next;                check++;            }            if(num==0)// special case, delete all nodes from offset forward            {                node*new_last;                while(temp!=last)                {                    t2=temp;                    if(temp->prev!=NULL)// if the previous node isn't NULL(ie, start of list)..                        new_last=temp->prev;// ..new_last becomes the node prior to temps starting value                    else new_last=NULL;// don't use new_last                        temp->remove();// get the node to remove itself from the list                    temp=temp->next;                    delete t2;                }                temp=last;// time to delete the last node                if(new_last!=NULL)//check to see if new_last is being used                    last=new_last;// it is, so set last to point to its new node                delete temp;// delete the previously last node            }            else// delete a range. This range may reach, or even surpass the end of the list,so we have to check for this            {                bool reach_end=false;// set to true if last is reached                unsigned int check=_size+offset+num;                (check>=_size)?reach_end=true:reach_end=false;//if check is greater or equal to size, then the user is attempting to go past the end of the list.                if(reach_end)// the num reaches the end of the list                {                    erase(offset,pos);// call this function with num=0 to delete all from this point onwards                    return;// job done                }                // job not done, so..                while(num==0)                {                    t2=temp;                    temp=temp->next;                    t2->remove();                    delete t2;                    num--;                }            }        }        unsigned int size()        {            return _size;        }        ~mglList()        {            erase(0,MLIST_FIRST);        }};#endif //MGL_LIST_H_

I would appreciate any help with this topic. as I feel that I shouldn't try to program anything more complicated until I can program a linked list.

BTW, I am aware of std::list.
Where is your constructor? Your erase function is throwing memory access violations. I'm guessing it is because first and last have not been initialized to anything so when you do temp!=last, last could be pointing anywhere.
I'll do a nazi-mode commenting pass, which will probably cover whatever runtime error you're getting.

EDIT: Note that for the most part I'm only commenting on repeating problems the first time they crop up. I'm leaving it as an exercise to you, the reader, to implement the relevant pieces of advice throughout your entire code - e.g. I don't comment on "t2" being a bad name because I've already pointed out the reasons why "temp" is a bad name (which "t2" shares), and done my best to explain on a generalized level how to improve upon that which applies similarly as well.

Problem #1:
node* mglList< T >::node::remove(){next->prev=prev;prev->next=next;return this;}


This code assumes both next and prev can never be null. If this was your intent, you should explicitly add this preceeding statement: "assert( next && prev );" (with assert() declared in the standard header <cassert>), to immediately bring up an error box informing you that this precondition you've set has been violated should the problem crop up in debug mode. However in this case, the line above this one contains node's constructor, which clearly has cases for handling NULL next and prev pointers. As such, you'll want to check each of these (seperately) before trying to access them.

On windows, this will have the symptom of generating an access violation accessing at or near address 0x00000000.
On linux, this will result in a segfault.
A debugger in both cases should stop you on the line of the attempted access of these pointers. Note that if this occurs within a member function, this can include the implicit "this" pointer inherit in accessing any member variable.

It's also worth noting this should probably be a deconstructor as well - construction "inserts" into this list, deconstruction should "remove" it. Alternatively, if you want the ability to remove a node from one place in the list and insert it into another, the constructor's contents should instead be moved into an "insert" function, although your (de)constructors should likely automatically call these as well.

Note that in this case, remove() should probably assign it's member next and prev variables to NULL in order to indicate it dosn't have any neighbors, to prevent a node from "removing" itself again - clobbering it's former neighbors which may have been further updated in the meantime. Only in a destructor would I find it acceptable to skip this, as those variables are about to cease to exist anyways.

Suggestion #2:
void mglList< T >::push_back( T * data );


Conditionals are "evil". Well, not quite, but they increase the logical complexity of code a lot more than other kinds of statements - three if/else statements will lead to 8 possible paths that code can be executed. As such, keeping things to a minimum as much as possible - or at least more intuitively graspable - is a good idea. Similar advice applies to many of the following functions, but here's how I'd revise your current push_back:
void push_back(T* data){      last = new node(last,NULL,data);      if ( ! first ) first = last;      ++size;}

That's about half the code from the original. I've done my best to reduce special cases, or code repeated in multiple cases - I don't write how to make a new node, or to increment the size, more than once, since the logic between the two bits of code is identical. Or at least, already implemented in node's constructor, and thus in no need of repeating.

This is a form of "refactoring" - changes to code which don't affect the end result, but make it easier to understand, debug, or make further changes to. Simplifying things as much as possible is (almost) always the method to accomplish this, and may even help reduce EXE sizes and speed up your code (by minor amounts).

Nitpick #3.1:
T* mglList< T >::pop_front/back();


"Temp" is a horrible name for a variable which, being defined within the function itself, is already obviously temporary anyways. Call it data, or popped_data. Things describing what it represents.

Nitpick #3.2:
void mglList< T >::insert(unsigned int offset,position pos,T* data);


While it's good that you've gone and commented on the purpouse of "temp", that you felt the need to do this is indicative of a nasty underlying problem - "temp" is a horrible name which dosn't describe anything that's not already apparent, and above all, it dosn't describe what it represents. Similarly, abbreviations hurt code legibility - I'd use "insertee" instead of "ins".

Problem #4:
void mglList< T >::insert(unsigned int offset,position pos,T* data);


This one I'm going to shrowd in mistery, and try to help focus on identifying the telltales of the problem:

Hint #1: Ask yourself what happens when you insert into an empty list.
Hint #2: "ins" is assigned to, but never read from.
Hint #3: What are the values of "first" and "last" after inserting into an empty list?
Hint #4: std::list and other linked list implementations typically deal with this special case by implementing "first" and "last" in terms of a psuedo node - a node with a next and prev, which can be a neighbor to other nodes, but which does not itself have a data component.
Hint #5: This should also remove the need for any conditionals whatsoever inside of push_back.

Problem #5:
void mglList< T >::erase(int offset,position pos,unsigned int num=0);


Special case sushi! A lot of this repeats the work that should be done by remove() which shouldn't have to be explicitly called at all, being done automatically from node's destructor which will be invoked with the deletes. Centralize and simplify my friend! The logic should be very similar to insert's.


Hope this helps!
Right, I've got a working version, and it holds the data,but I want to know if there is anything I can do to make it more streamlined.

here is the class(abbreviated here to aid viewing, I'll ad the definitions to the end)

template <class T>class mglList{    protected:        /* the bounds_node structure */        struct bounds_node;        struct node;        bounds_node* first;        bounds_node* last;        unsigned int _size;        node* int_find(unsigned int offset);    public:        void push_back(T* data);        void push_front(T* data);        T* pop_back();        T* pop_front();        void insert(unsigned int offset,T* data);        void erase(int offset,unsigned int num=0);        unsigned int size()        {            return _size;        }        T* find(unsigned int num=1);        void ToConsoleOutput();        mglList():first(new bounds_node(NULL,last)),last(new bounds_node(first,NULL)),_size(0){}        ~mglList()        {            erase(0);        }


here are the structure difinitions;
struct bounds_node{      public:      bounds_node* next;      bounds_node* prev;      bounds_node(bounds_node* prev,bounds_node* next):next(reinterpret_cast<node*>(next)),prev(reinterpret_cast<node*>(prev)){}      ~bounds_node(){}};struct node:public bounds_node{      T* data;      node* remove()      {          if(bounds_node::next)              bounds_node::next->prev=bounds_node::prev;          if(bounds_node::prev)              bounds_node::prev->next=bounds_node::next;          bounds_node::next=bounds_node::prev=NULL;          return this;      }      node(bounds_node* prev,bounds_node* next,T*data):bounds_node(prev,next),data(data){next->prev=this;prev->next=this;}      ~node(){remove();}};


and the methods for the class/template;
node* int_find(unsigned int offset)        {            bounds_node* temp_search_node=first->next;            unsigned int is_offs_gt_size = (_size-offset);            if(is_offs_gt_size<=0)                return reinterpret_cast<node*>(last->prev);            else            {                for(unsigned int i=1;i<=offset;++i)                {                    temp_search_node=temp_search_node->next;                }                return reinterpret_cast<node*>(temp_search_node);            }        }    public:        void push_back(T* data)        {            node* insertee=new node((node*)last->prev,(node*)last,data);            last->prev=insertee;            _size++;        }        void push_front(T* data)        {            node* insertee=new node((bounds_node*)first,(bounds_node*)(first->next),data);            first->next=insertee;            _size++;        }        T* pop_back()        {            T* t_data_node=reinterpret_cast<node*>(last->prev)->data;            erase(0,1);            return t_data_node;        }        T* pop_front()        {            T* t_data_node=first.next->data;            erase(0,1);            return t_data_node;        }        void insert(unsigned int offset,T* data)        {            node* insertion_point;            if(offset>_size)                push_back(data);//add the data to the end            else            {                insertion_point=int_find(offset);                new node(insertion_point,insertion_point->next,data);                _size++;            }        }        void erase(int offset,unsigned int num=0)        {            node* erase_point=int_find(offset),*catch_point=reinterpret_cast<node*>(erase_point->next);            if (num==0)            {                while((bounds_node*)erase_point!=last)                {                    delete erase_point;                    erase_point=catch_point;                    catch_point=reinterpret_cast<node*>(erase_point->next);                    _size--;                }            }            while(erase_point!=last && num!=0)            {                delete erase_point;                erase_point=catch_point;                catch_point=reinterpret_cast<node*>(erase_point->next);                _size--;                num--;            }        }        unsigned int size()        {            return _size;        }        T* find(unsigned int num=1)        {            node* ret_val=int_find(num);            if(ret_val!=NULL)                return(ret_val->data);            else                return NULL;        }        void ToConsoleOutput()        {            node* reader=(node*)first->next;            bounds_node* node_addr=first;            int unit=0;            while(reader!=last)            {                cout << endl << "unit:\t\t" << unit << endl << "address:\t" << node_addr->next << endl << "\tprev:\t" << reader->prev << endl <<"\tnext:\t" << reader->next << endl << "\tdata:\t" << *(reader->data) << endl;                reader=(node*)reader->next;                node_addr=node_addr->next;                unit++;            }            cout << endl << "-------------------end of list-------------------" << endl;        }        mglList():first(new bounds_node(NULL,last)),last(new bounds_node(first,NULL)),_size(0){}        ~mglList()        {            erase(0);        }


Any help wuld be appreciated.
And another pass! Shorter this time.

mglList<T>::mglList() :
first should be initialized with new bounds_node(NULL,NULL) - at this point, last is uninitialized, resulting in an undefined pointer being passed to the second parameter. The construction of the bounds_node last is initialized with should initialize it's prev pointer correctly.

mglList<T>::erase(int, unsigned) :
This one is completely unobvious to me. I had to fully grok the source code - as well as seeing it's use in ~mglList and realizing it must delete first/last (or leak memory, which is what I was initially checking for). Defaulting to num=1 wouldn't make my brain hurt *quite* as much, although it would still really hurt quite a lot (I would consider describing comments manditory even with that change). The standard library equivilant would be a constant, npos, which AFAIK is typically the maximum number representable (e.g. 0xFFFFFFFF, or std::numeric_limits< unsigned int >::max()).

As is, I have to read the source code to understand what it does, which means it fails to abstract the action it does.

It also breaks the class. The following will fail:

mglList<T> list;list.erase(0); //deletes everything, first/last are now dangling pointerslist.push_back( new T ); //crashes, tries to dereference last multiple times


Special case sushi note: You basically have a big "if ( special_case ) ... else ...", although the else in this case is implicit (the while will never run if special_case is equal to true). That their contents are basically the same is another big hint that something is amiss. And now here's the kicker: you can entirely eliminate the preceeding if, if you use the max-representable npos I suggested earlier.

void mglList<T>::ToConsoleOutput()

Most of the time, people use operator<< for this, like all the other types normally accepted by streams use. Plus, it avoids hard coding yourself to cout. The equivilant would be:

friend std::ostream & operator<<( std::ostream & os , const mglList<T> & self );

std::ostream & operator<<( std::ostream & os , const mglList<T> & self ) {
...mostly as you already have, although member variables must be accessed with self._____, and you use "os" instead of "std::cout"...

return os;
}

General structure

Most linked list implementations will have only 1 main bounds_node member, and it won't be a pointer to one (it's more efficient, less error prone (you can't invalidate a pointer if it's not a pointer)). It won't be eraseable, but your current first/last shouldn't be because everything breaks every which way as is.

In this case, you can have (where head is of type bounds_node):

bounds_node* first() const { assert( ! empty() ); return head.next; }
bounds_node* last () const { assert( ! empty() ); return head.prev; }

Note that these will be the first/last nodes containing actual data (hopefully).



At this point, my pasta is done, so I must bid you adieu. Hope this helps.



EDIT: Dear me, there's more.

mglList<T>::int_find(unsigned):

if(is_offs_gt_size<=0)

Unsigned numbers like is_offs_gt_size cannot be less than zero. This screams "error". In this case, if offset > _size, it will underflow to some really big number. You shouldn't even be writing "is_offs_gt_size" (which is a misdominer) - you should just change the if statement to if ( offset > _size ) (which is your intent, and not a misdominer). It's clear that you were abbreviating "is offset greater than size" so it makes no sense for you not to write exactly that in your code. If you must use a temporary in such a situation, use: bool is_offset_gt_size = offset > size;

mglList<T>::pop_back/front:

You have these returning different things, but erasing the same thing. Actually, no, pop_front won't even compile, since you're using first.next (and first is a pointer). You didn't even put this through a compiler?

mglList<T>::find: Why the hell does find have a default?



Three things you should look into:

Unit testing. This is the process of writing automated tests for your code, thoroughly exercising it and doing your best to walk through all the corner cases. Or in this case, dead-center-of-normalicy cases.

Invariants. Developing classes, invariants should always be on your mind. What do I need to always be true (or else much code will break)?

Asserts. These are like mini unit tests of invariants (things that should always be true) that are inline to your code and have the ultimate use case: your actual program. I see 0 asserts in your code, and that makes me very nervous.

[Edited by - MaulingMonkey on November 7, 2006 12:34:15 AM]
Quote:Original post by MaulingMonkey
And another pass! Shorter this time.

mglList<T>::mglList() :
first should be initialized with new bounds_node(NULL,NULL) - at this point, last is uninitialized, resulting in an undefined pointer being passed to the second parameter. The construction of the bounds_node last is initialized with should initialize it's prev pointer correctly.

Whoops! corrected.

Quote:
mglList<T>::erase(int, unsigned) :
This one is completely unobvious to me. I had to fully grok the source code - as well as seeing it's use in ~mglList and realizing it must delete first/last (or leak memory, which is what I was initially checking for). Defaulting to num=1 wouldn't make my brain hurt *quite* as much, although it would still really hurt quite a lot (I would consider describing comments manditory even with that change). The standard library equivilant would be a constant, npos, which AFAIK is typically the maximum number representable (e.g. 0xFFFFFFFF, or std::numeric_limits< unsigned int >::max()).

I wanted it to delete all nodes from the current onwards if 'num' was set to 0.
If it had been set to 1, then by default, it would only delete the node at offset. Using the largest number is all well and good, but it is possible for the list to get that big, whereas if the list has no elements, then there's none to delete anyway. 0 seemed the best choice. Also, you said in your previous post that erase and insert should have similar code. I couldn't do that, because erase() can delete a range, while insert() can only add one node at a time.

Quote:
It also breaks the class. The following will fail:

mglList<T> list;list.erase(0); //deletes everything, first/last are now dangling pointerslist.push_back( new T ); //crashes, tries to dereference last multiple times


how so? I thought the while(){} loop wouldn't run if it found that erase_point pointed at last. Although I can see how it might delete first, I still don't see how last would be affected.

Quote:
Special case sushi note: You basically have a big "if ( special_case ) ... else ...", although the else in this case is implicit (the while will never run if special_case is equal to true). That their contents are basically the same is another big hint that something is amiss. And now here's the kicker: you can entirely eliminate the preceeding if, if you use the max-representable npos I suggested earlier.

Added a return in, at the end of the if statement block(inside it, before you ask, the code will reach the while loop, should the if statement prove false)
Quote:
void mglList<T>::ToConsoleOutput()

Most of the time, people use operator<< for this, like all the other types normally accepted by streams use. Plus, it avoids hard coding yourself to cout. The equivilant would be:

friend std::ostream & operator<<( std::ostream & os , const mglList<T> & self );

std::ostream & operator<<( std::ostream & os , const mglList<T> & self ) {
...mostly as you already have, although member variables must be accessed with self._____, and you use "os" instead of "std::cout"...

return os;
}


I'm sure they do, but I intend to remove this method when(if?) I finally finish correcting the class
Quote:
General structure

Most linked list implementations will have only 1 main bounds_node member, and it won't be a pointer to one (it's more efficient, less error prone (you can't invalidate a pointer if it's not a pointer)). It won't be eraseable, but your current first/last shouldn't be because everything breaks every which way as is.

In this case, you can have (where head is of type bounds_node):

bounds_node* first() const { assert( ! empty() ); return head.next; }
bounds_node* last () const { assert( ! empty() ); return head.prev; }

Note that these will be the first/last nodes containing actual data (hopefully).

I had thought of it(except, I still used both first and last instead of one 'head' bounds_node), but couldn't get it to work for some reason...
...I'll have another go at it...
Quote:
EDIT: Dear me, there's more.

...aheh... [embarrass]
Quote:
mglList<T>::int_find(unsigned):

if(is_offs_gt_size<=0)

Unsigned numbers like is_offs_gt_size cannot be less than zero. This screams "error". In this case, if offset > _size, it will underflow to some really big number. You shouldn't even be writing "is_offs_gt_size" (which is a misdominer) - you should just change the if statement to if ( offset > _size ) (which is your intent, and not a misdominer). It's clear that you were abbreviating "is offset greater than size" so it makes no sense for you not to write exactly that in your code. If you must use a temporary in such a situation, use: bool is_offset_gt_size = offset > size;

Point, that's what I get for trying to write code at 3:00 AM...

Quote:
mglList<T>::pop_back/front:

You have these returning different things, but erasing the same thing. Actually, no, pop_front won't even compile, since you're using first.next (and first is a pointer). You didn't even put this through a compiler?

Actually, yeah, I did, and it compiled fine. The only thing is that I didn't happen to test that particular method. the whole first.next bit was during my attempt to make first and last actual bounds_nodes as opposed to pointers. Not sure what you mean by "returning different things", though...
Quote:
mglList<T>::find: Why the hell does find have a default?

I was thinking about this one at work today, and realised it functioned more like the []operator. I'm gonna rewrite the find method, I'll probably use this code for the aforementioned operator, and make int_find() a public method(and drop the 'int_')

This topic is closed to new replies.

Advertisement