Thoughts on double link list strings?

Started by
13 comments, last by slicer4ever 11 years, 6 months ago
I'm not sure if I am the one being accused of not knowing what the STL is. In fact, I do know. I just thought that the statement that rope is part of the STL, although technically correct, might mislead people into thinking that it is part of the C++ standard library, because of the ambiguity Servant of the Lord described.

Anyway, the original topic is much more interesting than this side discussion. Has anybody know what other data structure serious text editors use? Is this contiguous-buffer-with-a-gap method the standard approach?
Advertisement

Anyway, the original topic is much more interesting than this side discussion. Has anybody know what other data structure serious text editors use? Is this contiguous-buffer-with-a-gap method the standard approach?


Context is one of my favorites, and the source code is available to download, so it might not be a bad idea for me to check it out, although to be fair, i'm not particularly looking for heavy duty editor solutions, since most common usage of my UI Text editor is for changing some one line piece of information, or writing a description for something.


I was thinking that a link list of characters would be much faster on insertion/removal/modifications to the editor, since it comes down to changing but a few pointers around to do anything to the list. but i'm not certain if the overhead of cache corherency loss makes it worth it. perhaps in an enivronment like an stand-alone editor it wouldn't matter for cache corherency, since your not trying to run at any particular frame rate, but i'm not really sure if the overhead of insertion for my current method would outweigh the cost of what i lose by using a link list?
Check out https://www.facebook.com/LiquidGames for some great games made by me on the Playstation Mobile market.
If you really have to optimize your solution, I'd suggest this approach:

Organizing memory
Use fixed size memory blocks of chars as nodes in your rope. This allows you to pool allocate those blocks, minimizing calls to new and malloc.
When you run out of allocated blocks, allocate a bigger buffer. Don't simply memcpy your blocks to the new location but push_back each character in your new structure, filling up each block completely, starting at the beginning of the buffer (thus making the doubly linked list cache friendly again, in case you want to traverse it).

Finding insert position
For inserting text into the string, I assume you need to know where a line starts. I see multiple solutions for this:
a) Store the number of newlines a node contains in the node (and keep that number updated). Iterate over nodes to find the right block.

  • easy to implement
  • low memory usage
  • inserting newlines is O(1)
  • finding newlines is O(n) and needs to touch a lot of memory

b) Maintain a dynamic array of pointers to nodes. The pointer at index n shall point to the node containing the line with index n

  • higher memory usage
  • inserting newlines is O(n)
  • finding newlines is O(1)

c) Combination of a) and b). Use pointers to find approximate location of your line (for example, pointer n points to line 10 * n).

  • most difficult to implement
  • otherwise, same characteristics as b)
A more practical balance between a wasteful doubly-linked list of characters and a complicated rope could be std::list<std::string> (a list of lines). That's probably where I would start, until it becomes clear that it's not good enough.

A more practical balance between a wasteful doubly-linked list of characters and a complicated rope could be std::list<std::string> (a list of lines). That's probably where I would start, until it becomes clear that it's not good enough.

yes, a link list of char lines is probably a pretty decent comprise, fast new line insertion, but same char speed insertion as my current implementation i do like it.


If you really have to optimize your solution, I'd suggest this approach:

Organizing memory
Use fixed size memory blocks of chars as nodes in your rope. This allows you to pool allocate those blocks, minimizing calls to new and malloc.
When you run out of allocated blocks, allocate a bigger buffer. Don't simply memcpy your blocks to the new location but push_back each character in your new structure, filling up each block completely, starting at the beginning of the buffer (thus making the doubly linked list cache friendly again, in case you want to traverse it).

if i'm pushing back into an array, arn't i pretty much memcpy'ing one array into another, because each char is in an array i can't see how i maintain the double link list for individual characters, but i might be misunderstanding what your saying?.


I think in the end, i'll probably use alvaro's implementation when i do get around to rebuilding the component, thanks for the helpful input guys=-)
Check out https://www.facebook.com/LiquidGames for some great games made by me on the Playstation Mobile market.

This topic is closed to new replies.

Advertisement