Sign in to follow this  

How to start using Data Structures in your code

This topic is 2849 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hi, I've done a basic course on Data Structures And Algorithms where I studied about Data Structures like binary trees, Red Black Trees, Heaps, graphs, etc. My problem is, that I don't really know where to use these Data Structures in my programs. I've learned them , so I'm itching to actually use them in my code, but I don't know how to start using them. Any ideas where to start?

Share this post


Link to post
Share on other sites
Data structures are tools, you shouldn't make an application just to use them. Most of them are also probably already implemented in the standard libraries of your preferred language, so you will rarely need to do it yourself.
If you have learned data structure, you have probably also learned when and how to use them. In a real application there are a lot of opportunity to use some kind of data structure, just start implementing one and you will be able to see it by yourself.

Share this post


Link to post
Share on other sites
Quote:
Original post by kvsingh
My problem is, that I don't really know where to use these Data Structures in my programs.
In general, you don't.

On a day-to-day basis you will use the implementations provided by the language developers (i.e. std::list, std::map, etc.). Only when solving complex and unusual problems will you need to implement these data structures yourself.

However, data structures are an essential part of a computer scientist's toolkit. It is important to understand the performance/memory behaviour of each data structure, and how that relates to the implementations you are using (for instance, std::map is typically implemented as a red-black tree), so that you can choose the right tool to solve a given problem.

That, and I hear that data structure implementation often shows up as 'gimme' questions on job interviews [smile]

Share this post


Link to post
Share on other sites
You use them to solve the problems that they are useful in solving. Don't have the problem? Oh well. Don't artificially create a problem (except to practise, in a separate project). Don't know what problem is solved? Well, then you didn't really learn the material, hmm? ;)

Data structures don't really do anything on their own. They're just ways to... structure data. In the same way, cleaning up your room doesn't do anything by itself. It does, however, make it easier to look for things (apply an algorithm that is compatible with the data structure). Correspondingly, a heap, for example, makes it easy to find large (or small, or whatever other property) elements, which in turn makes it easy to sort them by size (or whatever other property) - i.e. HeapSort - or process them in order, without doing all of the sorting ahead of time.

Share this post


Link to post
Share on other sites
Yeah,

In practice it's definitely rare that you actually create your own data structure; most modern languages provide common implementations for you. Your knowledge of Data Structures will allow you to know that (a) you need one and (b) which is the best candidate. The "Algorithms" part of that class is also very important. Being able to evaluate and reduce the complexity of your algorithms is a supremely important part of getting a game to run at framerate.

However, here's a couple of simpler examples off the top of my head:
Texture manager: hashmap of name -> textureID so you only have to load textures once
A* pathfinding: binary heap to store the open node list
AI behavior: stack to represent complex behaviors as compositions of simple ones
General: queue for various types of event passing/processing
Interviewing: you'll be asked to design a data structure for some random task in almost every technical interview.

But generally you'll encounter those interview-type situations commonly. You have some weird set of data that you need to access in very specific ways: design or choose a data structure and algorithm to most efficiently handle those conditions.

-me

Share this post


Link to post
Share on other sites
Quote:
Original post by kvsingh
Hi, I've done a basic course on Data Structures And Algorithms where I studied about Data Structures like binary trees, Red Black Trees, Heaps, graphs, etc.

My problem is, that I don't really know where to use these Data Structures in my programs. I've learned them , so I'm itching to actually use them in my code, but I don't know how to start using them.

Any ideas where to start?


There's plenty of places to use them in computer games, just think of what you want to do and keep in mind your tools in case you need them.

Quote:
Original post by apatriarca
Data structures are tools, you shouldn't make an application just to use them. Most of them are also probably already implemented in the standard libraries of your preferred language, so you will rarely need to do it yourself.


Not for business/database/gui coding or scripting of any kind but for games in C++ using stl is a joke for a graphics engine or computation heavy tools or anything that has very specific needs (which seems to come up all the time).

Share this post


Link to post
Share on other sites
Quote:
Original post by thatguyfromthething
for games in C++ using stl is a joke for a graphics engine or computation heavy tools or anything that has very specific needs (which seems to come up all the time).
Unfortunately, that myth is still very widespread...
Maybe if you're using codewarrior or MSVC6 it's still good advice, but it's possible to get perfect performance out of the STL these days.

std::vector+std::sort over malloc+free+realloc+qsort+memset any day...

Share this post


Link to post
Share on other sites
Quote:
Original post by thatguyfromthething
Quote:
Original post by apatriarca
Data structures are tools, you shouldn't make an application just to use them. Most of them are also probably already implemented in the standard libraries of your preferred language, so you will rarely need to do it yourself.


Not for business/database/gui coding or scripting of any kind but for games in C++ using stl is a joke for a graphics engine or computation heavy tools or anything that has very specific needs (which seems to come up all the time).

My comment was quite general, C++ isn't the only language he can be interested in and games aren't the only applications he can write. But as Hodgman said, STL can be used in games and if this is a problem in some part of code, there are a lot of already available libraries you can try before writing your own.

Share this post


Link to post
Share on other sites
Quote:
Original post by Hodgman
Quote:
Original post by thatguyfromthething
for games in C++ using stl is a joke for a graphics engine or computation heavy tools or anything that has very specific needs (which seems to come up all the time).
Unfortunately, that myth is still very widespread...
Maybe if you're using codewarrior or MSVC6 it's still good advice, but it's possible to get perfect performance out of the STL these days.

std::vector+std::sort over malloc+free+realloc+qsort+memset any day...


Yeah, it's extremely disappointing that this advice is so widespread. At a previous company we had a few developers who refused to use STL and wrote their own containers because "STL was too slow". I did some performance tests and showed that in various cases, the STL was faster, sometimes considerably. There was not a single case where theirs were faster than STL. It's definitely possible to outperform STL, but not so simple and often not worthwhile. The myth gets on my nerves too. Working in C++ is frustrating from the point of view that this mentality is so rampant, when in 97% of cases, it's utterly unnecessary.

Share this post


Link to post
Share on other sites
Quote:
Original post by Rydinare
Quote:
Original post by Hodgman
Quote:
Original post by thatguyfromthething
for games in C++ using stl is a joke for a graphics engine or computation heavy tools or anything that has very specific needs (which seems to come up all the time).
Unfortunately, that myth is still very widespread...
Maybe if you're using codewarrior or MSVC6 it's still good advice, but it's possible to get perfect performance out of the STL these days.

std::vector+std::sort over malloc+free+realloc+qsort+memset any day...


Yeah, it's extremely disappointing that this advice is so widespread. At a previous company we had a few developers who refused to use STL and wrote their own containers because "STL was too slow". I did some performance tests and showed that in various cases, the STL was faster, sometimes considerably. There was not a single case where theirs were faster than STL. It's definitely possible to outperform STL, but not so simple and often not worthwhile. The myth gets on my nerves too. Working in C++ is frustrating from the point of view that this mentality is so rampant, when in 97% of cases, it's utterly unnecessary.


Sure, it will be faster than what someone who is a novice programmer writes, but how many times have you seen people using stl drop down to a straight c array? If you need to do that more than one in a thousand times you suffer all the penalty of if you didn't use it at all, plus the penalty of huge learning curve and million problems and quirks.

To write a good vector class as an example, it's really an exercise in memory management. That's too much for the average guy to tackle in a realistic timeframe, but the gains you can make are enormous even for vector class if you know what you are doing. Like if you might want your memory aligned properly by default or to have some SIMD get generated out (and no, this will never EVER happen automatically in your code).

And that's just easily measurable speed you can improve, not the insidious problems caused by code size bloat when you have everything defined as multiple templates defined in terms of each other, which template on multiple items.

Not to mention where is the stl kdtree, hashtable, octree, bsp, cover tree, ball tree, rtree, btree, rtree, spatial index etc. etc. etc.? I mean, not even a hash table? Seriously? So how far can you go with game programming with just a vector and a map?

Also not mentioning crossplatform compatibility. Linux? Who cares? Has anyone ever made a cent supporting linux? Mac support is nice I guess but then there's what? 5000 other platforms? Wouldn't you like to have code work on more than 1% of those platforms without additional tweaks? I have sat next to guys on multiple occasions who were trying to port stl over to other platforms, and for the time spent you may as well make your own - especially since there's not a hell of a lot in there anyway. Even if an implementation is lame you will be able to change it any time you want easily. I've gone through millions of lines of other people's code and hacked away but there's no way I could go into stlport source and figure out thing one of what's going on without searching around for hours.

A template library written almost entirely in macros! The irony of that is astounding. That's exactly what templates are there to replace, and if you need or want to resort to macros chances are you are seriously abusing templates and have a poor design, which I strongly believe is the case.

So we get this kind of lame vector class, some math utilities that are far from optimal, a sort that is slow and has a ridiculously cumbersome interface using function points. Iostreams? Probably worst implementation idea in history. Except iterators, of course, which is the other big 'selling point' that is supposedly 'object oriented'.

But, templates are the opposite of OOP. Instead of unifying your interface you are shattering it completely and making it impossible to tell what code is actually generated. WHAT GETS GENERATED? This is very hard to track down and is mostly left up to the people making the compiler, and for anything where you care about performance this is an issue. And if you care about more than one compiler, you get to track it down more than once, and you get things working great on one but not the other.

I have gotten surprised countless times at the stupid things compilers will do, especially with templates. Which is not to say not to use them at all because with C++ you are somewhat forced into it, but when you template on serveral items and do all the cute tricks you see in stl and boost, it is almost totally hopeless to ever track down what's really going on at a low level and you will see sudden performance drops for seemingly no reason. From the stupid comments I see all the time it's clear most people using stl and boost don't even know how templates really work or how the compiler works at all.

So obviously it's terrible if you really care about performance, and it doesn't have the data structures you would care about for making anything but some kind of business app without resorting to another API. And it's it's the antithesis of OOP because instead of simplifying interfaces and and types it exponentially grows them, and instead of decoupling interface and implementation it again exponentially increases the coupling effect between files.

So it's a terrible idea from either of the two important perspectives. If you are doing like a large business or database or general GUI app, you should care about the OOP aspect. If you are doing like a game or graphics tool then the performance aspect for most tasks and the OOP for things like events and GUI. If you don't care about either of these things, why are you using C++ and not perl or something?

So no, I am not spreading lies or misinformation. I used to think more like you, but years and years of struggling with C++ to make it behave properly have finally made me realize why this is such a poor approach.

Share this post


Link to post
Share on other sites
Quote:
Original post by Hodgman
Quote:
Original post by thatguyfromthething
for games in C++ using stl is a joke for a graphics engine or computation heavy tools or anything that has very specific needs (which seems to come up all the time).
Unfortunately, that myth is still very widespread...
Maybe if you're using codewarrior or MSVC6 it's still good advice, but it's possible to get perfect performance out of the STL these days.

std::vector+std::sort over malloc+free+realloc+qsort+memset any day...


The irony here is that you use code warrior to compile for the wii (messed up isn't it??)

Share this post


Link to post
Share on other sites
I think we had this particular flame war a bit too recently. It's offtopic and unenlightenting. Anyone below this post either criticizing or defending the performance of STL containers will be moderated harshly.

Share this post


Link to post
Share on other sites
Quote:
Original post by Atrix256
The irony here is that you use code warrior to compile for the wii (messed up isn't it??)
Yeah that's why I mentioned it. The Nintendo and older Sony platforms are the last bastions of horrible C++ compilers ;) (at least for console/PC games).
You know the 360 has a good compiler, coz it's microsoft!
The PS3 is another kettle of fish, because you've got to be so data-centric to be nice to it's CPU that most of your existing C++ code probably is going to be slow no matter what containers/libraries you've used in the past.

Share this post


Link to post
Share on other sites
I'll avoid the topic of performance specifically, but I will say that my day-job has me looking at code from a variety of AAA and Live Arcade titles from developers of all sizes. Very, very rarely do we see bespoke implimentations of data structures that are present in the C++ standard library, and the majority of those cases are in interacting with a 3rd party library developed outside of the primary development shop (eg, for Havok), I'd imagine that the majority of other cases we see stem from legacy code.

Granted that there are "holes" in the standard library from a gaming-centric worldview -- its probably going to be awhile before we see std::octree -- but the features it does provide are being used successfully bu game studios -- even on consoles, even in AAA games, even in cutting edge technology.

If it's good enough for them, its good enough for you and I.


I'm posting this not as a STL-vs-NO-STL debate, but to publicize the fact that STL is used successfully in real, cutting-edge games today.

To bring this back around to the original topic. Don't be affraid to use *all* the features and facilities the C++ standard library provides. I daresay many so-called C++ developers don't know half the depth of all of it. Don't be afraid of Boost either, many of the projects we see are using that as well.

Implimenting correct, performant data structures from scratch is a non-trivial undertaking. You'll be much more efficient using the standard library correctly as-is, and where necessary implimenting new structures in terms of existing ones as much as possible.

Probably the two big things to be aware of (in any case, though this has typically served as ammunition against C++SL adoption in the past) are to watch out for extraneous copies (for some great examples, read up on the new r-value references that are being added to C++) and to learn how to write and supply your own Allocators to the standard containers.

Share this post


Link to post
Share on other sites
When you took your data structures course you learned the algorithmic complexity (both in time and space for each algorithm), as well as the details of each Abstract Data Type, right?

When you need to solve a problem, first pick the structure with the most convenient ADT. Then, if performance becomes an issue, pick the data structure with the complexity characteristics that best suit the problem at hand.

To get a feel for this, pull out your algorithms text, and flip to your favorite chapter. A good text on this will discuss how the data needs to be stored to support the algorithm. If you haven't taken algorithms yet, relax, these details will be made very clear there :).

In most pragmatic programming you will deal with ordered sequences, queues, and lookup tables (implementation varies) more often than anything else. You're either processing lists of things, or looking things up. Traversing trees is not uncommon, but it's still rare compared to these tasks.

As mentioned earlier, game programming and simulations programming tends to be where the fancy data structures come out and play. In these cases, real time processing is crucial, and data structures and algorithms that minimize complexity are needed.

Share this post


Link to post
Share on other sites
The problem is, I'm not really interested in core engine programming (where I'm guessing most of the data structures are used). I'm actually interested in gameplay programming. Are Data Structures used often in gameplay programming?

Also, most of the programs I've made till now really don't have any performance issues(since they are really small), maybe that's why I haven't got a feel for using Data Structures yet.

Another question : My Algorithms course didn't really involve any implementation of the Data Structures studied .. as in we didn't have any Labs. Is it a good idea to implement them once on my own?

Share this post


Link to post
Share on other sites
Quote:
Original post by kvsingh
Another question : My Algorithms course didn't really involve any implementation of the Data Structures studied .. as in we didn't have any Labs. Is it a good idea to implement them once on my own?


Good heavens. Did you actually implement the algorithms, or just talk about them?

Data structures primarily exist in order to have something to which to apply the algorithms.

Share this post


Link to post
Share on other sites
Quote:
Original post by kvsingh
The problem is, I'm not really interested in core engine programming (where I'm guessing most of the data structures are used). I'm actually interested in gameplay programming. Are Data Structures used often in gameplay programming?

Also, most of the programs I've made till now really don't have any performance issues(since they are really small), maybe that's why I haven't got a feel for using Data Structures yet.

Another question : My Algorithms course didn't really involve any implementation of the Data Structures studied .. as in we didn't have any Labs. Is it a good idea to implement them once on my own?


Data structures are used in about everything you will ever do, or generally should be. It's typical that a degree will make you implement many of them at some point. Usually an algorithms analysis class will focus on big o complexity which is important to understand, but you should be implementing many of them as you go in other classes I'd hope. But what really goes on in a degree for computer science varies a lot more than in other sciences.

I do think it's helpful to understand how they work to actually make them, but if you try to do it on your own outside of class as someone just learning you will spend a ton of time on this and your end result won't likely be all that great.

If you want to make games, I suggest to look for an engine first. Whatever it uses for data structures is what you will mainly use. It would be silly to make your own matrix class for example, compared to what someone who writes render code for a living will make it's probably a waste of time.

But there's no remotely cheap game engines out there that provide implementation of every single thing you might possibly need. At the least you'll probably need to do something for AI and pathfinding, but there's also products for just about everything you might want. Unfortunately they tend to cost a lot of money or have some other big issue.

I'd suggest to just look at game engines and to try to make a project and then solve each problem as it comes up in the way that seems the best. As a learning programmer it will be hard to make progress but you will learn a lot as you go.

edit:
I wonder if I get banned when my rating reaches 0.

Share this post


Link to post
Share on other sites
Quote:
Original post by thatguyfromthething
I wonder if I get banned when my rating reaches 0.
There are many people who have lived long and fulfilling* lives after hitting 0 ;)





* Whatever your definition of "fulfilling" is...

Share this post


Link to post
Share on other sites
To the OP,

I think you're over-analyzing everything. In its simplest sense, a data structure is exactly what it's name suggests - a way to structure your data. Certainly every non-trivial application in existence has a need to structure their data in certain ways, otherwise how can you get anything done?

An array is an example of a data structure, surely you have experience with these. Often they're really convenient, because with little to no effort you can trivially access any item in the array. Arrays are also convenient when you know in advance exactly how many items you need.

Essentially the point of *any* data structure is just to organize your collection in various ways. Linked lists organize them sequentially, which is nice when you only never need to access them in order. Trees organize them hierarchically, which is nice when there is a natural hierarchical relationship (imagine windows explorer directory structure, for example).

If your program is a game like civilization, for example, where you have lots of cities that are connected by trade routes, maybe a graph would be a good data structure to apply so you can represent the relationship that actually exists with a similar structure in your code.

I'm surprised you didn't have to implement any data structures in your school program, but yes implementing them will help you to understand them better. After you understand them, you can delete all your implementations and use existing ones instead since they will be much better tested, more robust, and probably faster.

Share this post


Link to post
Share on other sites

This topic is 2849 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this