

Why do we need to iterate data structures when they are already in RAM?



#1 Algorithmic_Dream();   Members   -  Reputation: 118


Posted 22 June 2014 - 12:20 AM

First off, I hope I placed this in the appropriate section.  If so...

 

 

I've been programming for a while, but this may be a simple question to some (or many).

 

My Question: If a data structure already exists directly in RAM, then why do we need to spend the processing power to iterate in order to find the right data? (Side note: pointers still exist in managed languages; they're just handled for you.)

 

My theory is that in order to save memory, the pointer is the only actual data in RAM, so iterating expands the content within the memory address.

 

Basically if I have a list:

 

list(list name,value)

 

Index 0: list(mylist, 10);

Index 1: list(mylist, 15);

Index 2: list(mylist, 5);

 

All that would actually be in RAM is:

 

index 0, index 1, index 2

 

Under my theory, in order to find and access any of its contents, it has to do some form of parent-to-child linear search to expand all the data to find '5'? (The parent being the index's pointer to its physical address, and the data structure's value being the child pointer loaded into RAM as it searches):

 

Index 0: --> GET VALUE '10'; FALSE; Next Address!

Index 1: --> GET VALUE '15'; FALSE; Next Address!

Index 2: --> GET VALUE '5'; TRUE; RETURN Value!
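To make the question concrete, here is a minimal C++ sketch of that linear scan (the container and names are hypothetical, purely for illustration):

#include <cstddef>
#include <vector>

// Walk the values front to back until the target is found.
// Returns the index of the match, or -1 if it isn't present.
int linearSearch(const std::vector<int>& values, int target) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] == target) {
            return static_cast<int>(i);   // TRUE: return the position
        }
        // FALSE: move on to the next element
    }
    return -1;
}

// linearSearch({10, 15, 5}, 5) inspects index 0, then index 1, then index 2.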

 

This is all just my theory, which is why I'm asking: why does iteration exist for data and data structures that are already in RAM, and why can't I just pull a value or any other attribute out of the data structure directly?

 

It would save so much processing power and development time if I never needed to (or rarely needed to) iterate within RAM.

 

Thanks for your time and responses. :)


Edited by Algorithmic_Dream();, 22 June 2014 - 02:01 AM.



#2 Bacterius   Crossbones+   -  Reputation: 9300


Posted 22 June 2014 - 12:49 AM

It depends on the data structure. For an array, you can work out the memory address the data is located at by simple arithmetic and then read it directly, which does not require iteration. But if you have a simple linked list, then you do have to search for the data you need, because you can't just instantly "jump" anywhere in the linked list; you have to walk the list through the "next" pointers, i.e. from node to node.

 

The property you are looking for is called "random access" and not all data structures offer it.
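A minimal C++ sketch of the contrast (hypothetical types, just to illustrate the two access patterns):

#include <cstddef>

struct Node {
    int   value;
    Node* next;   // walking the list means following these pointers
};

// Array: the address of element i is computed directly, no stepping.
int arrayLookup(const int* array, std::size_t i) {
    return array[i];                      // base + i * sizeof(int)
}

// Linked list: there is no way to jump to the i-th node, so we walk.
int listLookup(const Node* head, std::size_t i) {
    const Node* node = head;
    for (std::size_t k = 0; k < i && node != nullptr; ++k) {
        node = node->next;                // one hop per element
    }
    return node != nullptr ? node->value : -1;   // -1 if the list is too short
}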


The slowsort algorithm is a perfect illustration of the multiply and surrender paradigm, which is perhaps the single most important paradigm in the development of reluctant algorithms. The basic multiply and surrender strategy consists in replacing the problem at hand by two or more subproblems, each slightly simpler than the original, and continue multiplying subproblems and subsubproblems recursively in this fashion as long as possible. At some point the subproblems will all become so simple that their solution can no longer be postponed, and we will have to surrender. Experience shows that, in most cases, by the time this point is reached the total work will be substantially higher than what could have been wasted by a more direct approach.

 

- Pessimal Algorithms and Simplexity Analysis


#3 Algorithmic_Dream();   Members   -  Reputation: 118


Posted 22 June 2014 - 01:53 AM

In data structures that don't offer random access, is it because the publicly visible pointers only point to the physical addresses of the indices, and anything else, such as a value, can only be reached through a private pointer inside that address as a second step?

 

If I displayed an inventory system on screen that didn't use a random-access data structure, and I clicked directly on a value to manipulate it, would that spatial information speed up the iteration (or maybe even remove the need for iterating)?



#4 Tribad   Members   -  Reputation: 887


Posted 22 June 2014 - 02:05 AM

All data and code is located in RAM. To access data or code you need an address, and at some point all of it is stepped through. In the case of code, the processor iterates through the instructions with the program counter.

Data, too, is accessed by an address. Using a pointer mostly means that you have stored somewhere the address where the data resides, but even that pointer is itself stored at an address.

If you organize your data in an array you still iterate through memory, but because you know the structures are laid out in a successive manner, you can calculate the address of any structure in the array from the start address, the index, and the size a single structure occupies.
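A small C++ sketch of that calculation (illustrative only; the struct is made up):

#include <cstddef>
#include <cstdint>
#include <cstdio>

struct Item { int id; float weight; };

int main() {
    Item items[8] = {};
    std::size_t index = 5;

    // start address + index * size of a single structure
    std::uintptr_t start    = reinterpret_cast<std::uintptr_t>(&items[0]);
    std::uintptr_t computed = start + index * sizeof(Item);

    // The compiler performs exactly this arithmetic for items[index].
    std::printf("%p == %p\n",
                reinterpret_cast<void*>(computed),
                static_cast<void*>(&items[index]));
    return 0;
}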

Using linked lists, hash tables, or whatever other management structure only moves the iteration process somewhere else.



#5 Chris_F   Members   -  Reputation: 2467


Posted 22 June 2014 - 02:15 AM

I'm a bit confused by what you are asking. Do you want to know how linked lists work? You don't need to theorize about it; simply look at the wiki article or a basic implementation.

 

Better yet, consider not using a linked list.


Edited by Chris_F, 22 June 2014 - 02:18 AM.


#6 Algorithmic_Dream();   Members   -  Reputation: 118


Posted 22 June 2014 - 03:15 AM

Thanks everybody who posted so far.

 

I guess I'm a little caught up on the idea that because I can click on an index/value directly in the UI, it shouldn't need to iterate over 100 addresses to find it in memory. With that being said, hopefully I can use spatial data structures to access such data in a less iterative way.

 

@Chris_F: The topic wasn't about how linked lists work in the typical implementation sense. I'm curious about why data structures need to iterate if everything is in RAM, and how to reduce their iteration cycle to (hopefully) a single point-A-to-point-B step. I had the thought that if it's in RAM, all the data could be accessed in one simple want-it/grab-it function. For example, if the mouse can click it, the data structure already knows where to go; if the player can see it on screen, the data structure should already know how it's organized and therefore shouldn't need to re-iterate to reorganize. I'm sure I'm on the right track to something there, but as different data structures iterate differently, I guess I just have to find the right one for the job. That is in contrast to my thought that RAM could work like how someone would view an image or a movie: it would have a wide scope and hold the data's big picture, while having no problem picking up on the smaller nuances and details.

 

Hopefully what I said made enough sense to you. I feel it might need a bit more premeditated thought and analysis to be clear and concise for everybody. If it made sense to you, what are your thoughts? If it didn't, let me know. I feel that I should take some time to truly learn how RAM works, and then learn some algorithms for reducing the iteration/search time. Who knows, maybe I'll eventually get it down to a one-step iteration cycle.



#7 tonemgub   Members   -  Reputation: 1164


Posted 22 June 2014 - 03:37 AM

RAM has nothing to do with iteration, or vice-versa.

Iteration is simply one of the many algorithms which may or may not be required for a specific task in a program. For your tree-item example, the program would most likely use a map, which is a data structure optimised for quick searches: each tree item gets a unique identifier, and that identifier can be directly translated to the item's memory address, without any iteration involved (the algorithm is called "mapping"). Iteration is usually only required with data structures and algorithms for which no such direct relationship between the items of the data and their location in memory can be established...
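A sketch of that kind of keyed lookup, here using std::unordered_map (the item type and identifier are made up):

#include <string>
#include <unordered_map>

struct TreeItem { std::string label; };

int main() {
    std::unordered_map<int, TreeItem*> itemsById;   // unique identifier -> item

    TreeItem apple{"apple"};
    itemsById[42] = &apple;                         // register the item under its id

    // The key is hashed straight to a bucket, so on average no scan of
    // the whole container is needed to reach the item.
    TreeItem* found = itemsById.at(42);
    (void)found;
    return 0;
}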

 

The best example would be sorting a list of random numbers. Since they are random, the computer does not know at what position in the list each number is located before sorting, and while sorting, it has to put each item at a specific position in the output list. To do that, you first tell it to "iterate" over the whole list and set aside the smallest number it finds during iteration, then move it into the lowest position of the output list. Then you tell it to iterate over the list again to find the second-smallest item and put it in the second-lowest position in the output list. And so on until all of the items are sorted.
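That procedure is essentially selection sort. A minimal in-place C++ sketch of the same idea (the description above uses a separate output list, but the principle is identical):

#include <cstddef>
#include <utility>
#include <vector>

// Repeatedly find the smallest remaining number and move it into place.
void selectionSort(std::vector<int>& numbers) {
    for (std::size_t i = 0; i < numbers.size(); ++i) {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < numbers.size(); ++j) {   // iterate over the rest
            if (numbers[j] < numbers[smallest]) {
                smallest = j;
            }
        }
        std::swap(numbers[i], numbers[smallest]);   // move it into its final position
    }
}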

 

Computers cannot simply look at the list of random numbers as a whole, and arrange them in order from smallest to biggest like us humans can.

 

As for RAM vs. file storage, the data can be stored anywhere... Files have file offsets, which are similar to memory addresses (file offsets are also sometimes called pointers). Iterating over data from a file is no different from iterating over memory, so you cannot say that iteration is specific to RAM data...


Edited by tonemgub, 22 June 2014 - 04:08 AM.


#8 Tribad   Members   -  Reputation: 887


Posted 22 June 2014 - 03:45 AM

RAM has nothing to do with iteration, or vice-versa.

Iteration is simply one of the many algorithms which may or may not be required for a specific task in a program. For your tree-item example, the program would most likely use a map, which is a data structure optimised for quick searches: each tree item gets a unique identifier, and that identifier can be directly translated to the item's memory address, without any iteration involved.

Even a map iterates through a tree. As I said before, iteration is done somewhere, but it is mostly hidden behind some function or encapsulated in a class/template or however you name it.

 

To my understanding, iteration is the step-by-step walkthrough of an organizational unit: a map, list, queue, whatever.



#9 Bacterius   Crossbones+   -  Reputation: 9300


Posted 22 June 2014 - 03:53 AM

 

RAM has nothing to do with iteration, or vice-versa.

Iteration is simply one of the many algorithms which may or may not be required for a specific task in a program. For your tree-item example, the program would most likely use a map, which is a data structure optimised for quick searches: each tree item gets a unique identifier, and that identifier can be directly translated to the item's memory address, without any iteration involved.

Even a map iterates through a tree. As I said before, iteration is done somewhere, but it is mostly hidden behind some function or encapsulated in a class/template or however you name it.

 

To my understanding, iteration is the step-by-step walkthrough of an organizational unit: a map, list, queue, whatever.

 

 

Iteration means repeating something; it usually refers to executing the body of a loop repeatedly (because it is a loop). It doesn't mean executing an algorithm. And a map data structure doesn't need to be implemented as a tree.




#10 Tribad   Members   -  Reputation: 887


Posted 22 June 2014 - 04:16 AM


Iteration means repeating something; it usually refers to executing the body of a loop repeatedly (because it is a loop). It doesn't mean executing an algorithm. And a map data structure doesn't need to be implemented as a tree.

A loop is an algorithm too. It's not very complex, but it describes how something has to be done.

 

I didn't say that the map is implemented as a tree. But even the traversal of a tree is done in steps that work the same way each time, so it is an iteration whether you use a loop inside the traversal function or a recursive call.

Maps may be implemented as hash tables. But even there you only get a good guess at where to start the search; afterwards you still iterate through a collection of data to find the right one.
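A toy C++ sketch of a chained hash table along those lines (simplified, not production code): the hash supplies the "good guess" (the bucket), and the bucket itself is then scanned.

#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <vector>

struct Entry { std::string key; int value; };

struct ToyHashTable {
    ToyHashTable() : buckets(16) {}               // 16 buckets, each an empty chain

    void insert(const std::string& key, int value) {
        std::size_t b = std::hash<std::string>{}(key) % buckets.size();
        buckets[b].push_back({key, value});
    }

    int* find(const std::string& key) {
        std::size_t b = std::hash<std::string>{}(key) % buckets.size();
        for (Entry& e : buckets[b]) {             // short scan within one bucket
            if (e.key == key) return &e.value;
        }
        return nullptr;                           // not present
    }

    std::vector<std::list<Entry>> buckets;
};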



#11 tonemgub   Members   -  Reputation: 1164


Posted 22 June 2014 - 04:26 AM


Even a map iterates through a tree. As I said before, iteration is done somewhere, but it is mostly hidden behind some function or encapsulated in a class/template or however you name it.

I was not talking about trees as a data structure, but as a graphical control... neither was the OP.

This is also why I said that there's a direct relationship between the map keys and their values... If you use the pointers to the values as keys, there's no iteration during lookup. In my experience, this is how most tree-controls are implemented... they use pointers (or handles) to the data.
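A sketch of that idea with hypothetical types: the control-side node stores a pointer straight to the data it represents, so handling a click needs no search.

struct InventoryItem {
    int id;
    int count;
};

// Hypothetical UI-side node: it carries a direct pointer to its data,
// so a click handler can reach the item without any lookup or iteration.
struct TreeControlNode {
    const char*    label;
    InventoryItem* item;
};

void onNodeClicked(TreeControlNode& node) {
    node.item->count -= 1;   // direct access through the stored pointer
}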

 

The graphical tree control is free to use whatever implementation it wants in order to speed up the lookups...


But even the traversal of a tree is done in steps that work the same way each time, so it is an iteration whether you use a loop inside the traversal function or a recursive call.

You do not need to traverse the whole map just to get to the item you want... That's the whole point of maps.


Edited by tonemgub, 22 June 2014 - 04:31 AM.


#12 Bacterius   Crossbones+   -  Reputation: 9300


Posted 22 June 2014 - 04:31 AM

 


Iteration means repeating something; it usually refers to executing the body of a loop repeatedly (because it is a loop). It doesn't mean executing an algorithm. And a map data structure doesn't need to be implemented as a tree.

A loop is an algorithm too. It's not very complex, but it describes how something has to be done.

 

I didn't say that the map is implemented as a tree. But even the traversal of a tree is done in steps that work the same way each time, so it is an iteration whether you use a loop inside the traversal function or a recursive call.

Maps may be implemented as hash tables. But even there you only get a good guess at where to start the search; afterwards you still iterate through a collection of data to find the right one.

 

 

But it is not true. There are data structures which can perform constant time random access, notably arrays as mentioned earlier in the thread. To look up an element in the array from its position (index), you take the base address of the array, add to it the index of the element to be accessed multiplied by the element size in bytes, and that gives you the address of the element you are looking for, which you can then read. Directly. If you want to argue that this process consists of multiple steps and is implemented in assembly as a sequence of a handful of instructions and is thus a form of "iteration", you are free to do so, and you will have successfully redefined iteration to something different than what it means to every other programmer on the planet, and nobody will understand you.




#13 Algorithmic_Dream();   Members   -  Reputation: 118


Posted 22 June 2014 - 04:43 AM

I'm going to call it a night after this post, but I'll be back.

 

I'm working on a solution. I'm a little tired right now, so I'll keep it short, sweet, and rough around the edges. I was thinking I'd use a geometric data structure that could morph from one data structure type to another. For example, a list can morph into a tree, a tree can morph into a graph, and a graph can morph into an x/y grid. The second thing about it is that the data structure can adjust its level of detail, or index resolution. That way the computer doesn't have to move across a high-resolution data structure: it can zoom out to travel long distances with less iteration, and zoom in at the right moment to line up on the desired index. There are other tricks that can be done to make it even more efficient, but figuring out how the data structures can best adapt to their environment is imperative to their survival. ;)  The most important trick is to keep it all simple, and yet give it plenty of depth. If I can do that, I'll have a low memory footprint, a low processor footprint, and a low time/effort development footprint. Happy outcomes all around.



#14 tonemgub   Members   -  Reputation: 1164


Posted 22 June 2014 - 04:47 AM

:|



#15 Tribad   Members   -  Reputation: 887


Posted 22 June 2014 - 04:47 AM

 

 


Iteration means repeating something; it usually refers to executing the body of a loop repeatedly (because it is a loop). It doesn't mean executing an algorithm. And a map data structure doesn't need to be implemented as a tree.

A loop is an algorithm too. It's not very complex, but it describes how something has to be done.

 

I didn't say that the map is implemented as a tree. But even the traversal of a tree is done in steps that work the same way each time, so it is an iteration whether you use a loop inside the traversal function or a recursive call.

Maps may be implemented as hash tables. But even there you only get a good guess at where to start the search; afterwards you still iterate through a collection of data to find the right one.

 

 

But it is not true. There are data structures which can perform constant time random access, notably arrays as mentioned earlier in the thread. To look up an element in the array from its position (index), you take the base address of the array, add to it the index of the element to be accessed multiplied by the element size in bytes, and that gives you the address of the element you are looking for, which you can then read. Directly. If you want to argue that this process consists of multiple steps and is implemented in assembly as a sequence of a handful of instructions and is thus a form of "iteration", you are free to do so, and you will have successfully redefined iteration to something different than what it means to every other programmer on the planet, and nobody will understand you.

 

 

Mmmh.

I think you misread something from my very first post:

 

 

If you organize your data in an array you still iterate through memory, but because you know the structures are laid out in a successive manner, you can calculate the address of any structure in the array from the start address, the index, and the size a single structure occupies.

 

 

Same as you said.

Yes, a map is something that may be organized as a simple array.

I did not talk about assembly language and the like, so I don't understand what makes you so upset. I only wanted to say that "iteration" means doing something the same way over and over again. The only thing I don't agree with is that iteration is limited to loops; iteration is anything that is done over and over again until the goal of execution is reached.

And the other thing I said: loops are algorithms too.

 

That's all.



#16 Zaoshi Kaba   Crossbones+   -  Reputation: 4644


Posted 22 June 2014 - 06:29 AM

I'm not sure I understand correctly. Are you asking why we have to look at all values in the array while searching for the one we need?

Imagine a stack of paper pages with just one word on each, and you need to find the page with a specific word. You cannot pull it out on the first try, so you look at the pages one by one until you find what you need. The same principle applies here.



#17 Vortez   Crossbones+   -  Reputation: 2704


Posted 22 June 2014 - 07:48 AM

The thing with lists and trees is that each time you create a node, you allocate it with new or malloc, which returns some memory address. Those addresses could be anything; they don't follow a pattern like an array, where you can calculate the address of the next element. Even worse, lists and trees might have their nodes moved wherever the programmer wants, so there is no linearity to those addresses; they are more or less random locations, and that's why you have to traverse the list to find the element you need.

 

Your question is like asking: why do I need a street and house number to find building X in a city, doesn't the city already know where it is? Unless the building is very popular or the town is very small, you won't get any result asking the mayor for the address you're looking for.

 

An array is like a bunch of houses on the same street; everything is one after the other. A list or tree is more like picking a bunch of houses at random.

 

Enough with the metaphors. On Win32, every process has 4 GB of (virtual) memory. Your CPU can't realistically store all of this in its tiny cache (1 MB or whatever the number is), so it has to fetch it from RAM. It also grabs the next few bytes while doing so, so iterating over an array is faster because you don't need to constantly fetch from RAM into the CPU cache. With a list or tree, since the data is at more or less random locations, you have to find the address first before loading the data (into the CPU cache or registers) and doing something with it.
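A small C++ sketch that makes the layout difference visible by printing element addresses (the exact addresses will vary from run to run):

#include <cstdio>
#include <list>
#include <vector>

int main() {
    std::vector<int> contiguous = {1, 2, 3, 4};
    std::list<int>   scattered  = {1, 2, 3, 4};

    // Contiguous: each address is exactly sizeof(int) past the previous one.
    for (const int& v : contiguous)
        std::printf("vector element at %p\n", static_cast<const void*>(&v));

    // Each node was allocated separately, so the addresses follow no pattern.
    for (const int& v : scattered)
        std::printf("list element at   %p\n", static_cast<const void*>(&v));
    return 0;
}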

 

It's a trade-off between efficiency (array) and flexibility (list, tree, etc.).


Edited by Vortez, 22 June 2014 - 08:08 AM.


#18 SeraphLance   Members   -  Reputation: 1457


Posted 22 June 2014 - 10:12 AM

Knowing something is in memory, and knowing where something is in memory are very different things.  Data structures all have very specific hard guarantees to fit their usage cases, and this is where you have to make your assumptions.

 

Arrays are contiguous by definition (though they can be indirect arrays of pointers instead of local object data).  This means that if you know the relative offset of your object (say, by index), you can jump straight to the object you want by pointer arithmetic.  This is what people mean by "random access".

 

Linked lists are not guaranteed to be contiguous.  You can certainly make one that is, but enforcing that constraint pretty much defeats the purpose of using one altogether.  What this means is that your item (e.g. item 3) is somewhere in memory, but you have no idea where.  Traversing a linked list is sort of like those breadcrumb-following riddles, where each destination tells you where to go next.

 

As for your other question -- no, objects are probably in RAM in their entirety, even when you're working with a pointer to them. I say "probably" because RAM is a hardware concept, and memory in software-land can constitute more than RAM (e.g. your hard disk).



#19 swiftcoder   Senior Moderators   -  Reputation: 10430


Posted 22 June 2014 - 11:39 AM


Knowing something is in memory, and knowing where something is in memory are very different things.

QFE.


Tristam MacDonald - Software Engineer @Amazon - [swiftcoding]


#20 Fredericvo   Members   -  Reputation: 485


Posted 23 June 2014 - 06:30 AM

First off, I hope I placed this in the appropriate section.  If so...

 

 

I've been programming for a while, but this may be a simple question to some (or many).

 

My Question: If a data structure already exists directly in RAM, then why do we need to spend the processing power to iterate in order to find the right data? (Side note: pointers still exist in managed languages; they're just handled for you.)

 

 

All that would actually be in RAM is:

 

index 0, index 1, index 2

 

 

Index 0: --> GET VALUE '10'; FALSE; Next Address!

Index 1: --> GET VALUE '15'; FALSE; Next Address!

Index 2: --> GET VALUE '5'; TRUE; RETURN Value!

 

 

 

It doesn't work like this. RAM is like one huge array of bytes, e.g. char MyWholeRAM[4294967296] for 4 GB.

Every byte (char) has a unique address, numbered from 0x00000000 to 0xFFFFFFFF.

I'll spare you details about paging and kernel space/user space.

A pointer is just the address of one location stored at another location. The CPU can only load a few values at once into its registers, so iterating through RAM means loading data into registers, doing a calculation or comparison, and storing the value back.

Sometimes it needs to know that another value is at a particular place in RAM, and that's when a pointer tells it where that is.

A linked list node would be stored as a pointer to the next node (or NULL if there are no more nodes) plus the actual data. The next node would again contain a pointer (really just a 4-byte value like any other, or 8 bytes in the case of x64).
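A sketch of such a node in C++ (field sizes depend on the platform, as noted above):

#include <cstdio>

struct Node {
    Node* next;   // address of the following node, or nullptr (NULL) at the end
    int   data;   // the actual payload stored alongside the pointer
};

int main() {
    Node second = { nullptr, 15 };
    Node first  = { &second, 10 };   // "next" holds the address of the other node
    (void)first;

    // The pointer member itself is just a 4-byte value on x86, 8 bytes on x64.
    std::printf("sizeof(Node*) = %zu bytes\n", sizeof(Node*));
    return 0;
}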

Attached image: ram.png




