Sign in to follow this  

Growth of a C# List

This topic is 4076 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I'm adding message logging to my MMORPG server. However, having each message get added to the database separately may be a bottleneck (dispite the fact that the database is locally accessible). My idea right now is to store all the messages in a list (C# List), and then have the list unload into the database when the server goes down for its daily server-save. However, if the computation speed growth of a C# List is O(n), this is probably a terrible idea, so before I go and implement it, I'm wondering if someone could tell me if C# does its lists as arrays, linked-lists, BSTs, etc. Also, does it matter if I type safety it? Does it make a difference if it's value or reference typing?

Share this post


Link to post
Share on other sites
Quote:
Original post by Thevenin
However, if the computation speed growth of a C# List is O(n), this is probably a terrible idea, so before I go and implement it, I'm wondering if someone could tell me if C# does its lists as arrays, linked-lists, BSTs, etc.

A List<T> is implemented internally with an array. The array is created with some initial size (there's a constructor overload to specify it, IIRC), and when the list is filled to capacity, a new array is created that's twice as big as the old one, and the contents of the old copied to the new one.

In practice, this performs very well.

Quote:

Also, does it matter if I type safety it?

In general, a type safe generic list is always preferrable to a non-generic one.

Quote:
Does it make a difference if it's value or reference typing?


I can't parse this.

Share this post


Link to post
Share on other sites
What good is a log if logged messages may not be available? What does your log tell you when the server crashes? And when do you need your log, if not when the server crashes? In other words, if you want a useful log, make sure that messages are persisted immediately.

Share this post


Link to post
Share on other sites
Quote:
Original post by Thevenin
I'm adding message logging to my MMORPG server. However, having each message get added to the database separately may be a bottleneck (dispite the fact that the database is locally accessible).

My idea right now is to store all the messages in a list (C# List), and then have the list unload into the database when the server goes down for its daily server-save.

However, if the computation speed growth of a C# List is O(n), this is probably a terrible idea, so before I go and implement it, I'm wondering if someone could tell me if C# does its lists as arrays, linked-lists, BSTs, etc.

Also, does it matter if I type safety it? Does it make a difference if it's value or reference typing?


since C# has bounds checking on all arrays the worst case scenario is probably O(n) (with jagged arrays).
the real issue however is memory footprints, you really want the array to fit into the cache if you intend to traverse it often enough to bother about it being O(n).

a LinkedList generally allows for faster operations (add/delete etc) but suffer from a much larger memory footprint which could result in far worse performance.

modern cpu:s are extremely fast but can loose around 80% of their performance simply by constantly needing data that isn't in the cache.

in what way is the log going to be used ?
if its viewable by admins and GMs only i would suggest throwing it into the DB as soon as the list reaches a predetermined size (since old data won't get accessed extremely often). your server should be multithreaded anyway and the DB server most definitly is so the game shouldn't freeze while you store data. by doing the storing when the list reaches a certain size you're able to control your cache and the query complexity.

if its viewable by players aswell (as a personal messagelog) i suggest you simply store a local copy at their computers (managed by the client) and simply discard it when they sign off. they can load a new size limited log from the server when they sign on. (last 100 messages for example)

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Don't bother saving log messages to a database. Use a text file instead.

I suggest saving messages in a List and then saving them to the text file every so often. You could try different frequencies and see which interval works best. But I really wouldn't worry too much about something like this. The overhead involved in writing to a file is minimal compared the overhead of the MMORPG itself.

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
Don't bother saving log messages to a database. Use a text file instead.

I suggest saving messages in a List and then saving them to the text file every so often. You could try different frequencies and see which interval works best. But I really wouldn't worry too much about something like this. The overhead involved in writing to a file is minimal compared the overhead of the MMORPG itself.


it depends a bit on the nature of the log.

a system log used to find the cause of an unexpected crash for example should probably be sent to stderr by default.

the OS can then easily pipe it into a file if you want that.

chat message logs used to detect abusive players etc are better stored in a DB for easy access.

Share this post


Link to post
Share on other sites
Quote:
Original post by Arild Fines
A List<T> is implemented internally with an array. The array is created with some initial size (there's a constructor overload to specify it, IIRC), and when the list is filled to capacity, a new array is created that's twice as big as the old one, and the contents of the old copied to the new one.

In practice, this performs very well.


Performs very well, relative to what, not performing? This sounds absolutely terrible.

Quote:
Original post by Arild Fines
I can't parse this.


<int>, <short>, <byte> VS <Int32> <Short> <Byte>

Share this post


Link to post
Share on other sites
Quote:
Original post by Thevenin
Quote:
Original post by Arild Fines
I can't parse this.


<int>, <short>, <byte> VS <Int32> <Short> <Byte>


C# isn't java. As far as I know, int and Int32 are synonyms, as are short and Short, byte and Byte, ...

Share this post


Link to post
Share on other sites
Quote:
Original post by SamLowry
As far as I know, int and Int32 are synonyms, as are short and Short, byte and Byte, ...


Interesting, I never knew that. Regardless though, the point I'm trying to make should be apparent -- does it make a difference to a C# List's performance if the template is a reference type, or a value type?

Share this post


Link to post
Share on other sites
Quote:
Original post by Thevenin
Quote:
Original post by SamLowry
As far as I know, int and Int32 are synonyms, as are short and Short, byte and Byte, ...


Interesting, I never knew that. Regardless though, the point I'm trying to make should be apparent -- does it make a difference to a C# List's performance if the template is a reference type, or a value type?


There should be no real difference. The .NET runtime machinery will instantiate separate class implementations for generics when used with value types, and will have reference types share one implementation, meaning you won't have to worry about boxing/unboxing (something which java suffers from).

Of course, the typical performance difference between value types and reference types still apply, but that has nothing to do with generics.

Share this post


Link to post
Share on other sites
Quote:
Original post by Thevenin
Interesting, I never knew that. Regardless though, the point I'm trying to make should be apparent -- does it make a difference to a C# List's performance if the template is a reference type, or a value type?
The only difference is that with value types, you're passing the value, and with reference types, you're passing a pointer to the object. You shouldn't have to worry about performance as either way you're still passing something. If you're using a structure type and it contains a large amount of data, then you should consider switching to a class....

Also, computers today are blazingly fast. The only real thing that slows them down are rendering complex graphics and calculating massive equations (AI, pathfinding, physics, etc). You shouldn't have any problems with working with a List object and a bunch of strings.

Share this post


Link to post
Share on other sites
Quote:
Original post by Thevenin
Quote:
Original post by Arild Fines
A List<T> is implemented internally with an array. The array is created with some initial size (there's a constructor overload to specify it, IIRC), and when the list is filled to capacity, a new array is created that's twice as big as the old one, and the contents of the old copied to the new one.

In practice, this performs very well.


Performs very well, relative to what, not performing? This sounds absolutely terrible.

std::vector is implemented exactly the same way.

I'd think any book on data structures worth its salt should give a detailed analysis of the performance implications of implementing a list in this way.


Share this post


Link to post
Share on other sites
Quote:
Original post by Thevenin
However, having each message get added to the database separately may be a bottleneck (dispite the fact that the database is locally accessible).

Have you actually profiled this? This whole thread smacks of newbieish premature optimization.

I suspect that even if the growth cost of List<T> was O(n^2)(it isn't), you wouldn't even notice the difference.

Bottom line: List<T> is the best you'll get. Microsoft can't do any better, nor can you.

Share this post


Link to post
Share on other sites
Quote:
Original post by Rob Loach
The only difference is that with value types, you're passing the value, and with reference types, you're passing a pointer to the object. You shouldn't have to worry about performance as either way you're still passing something.


Thats hardly what the MSDN library is leading me to believe...

Quote:
MSDN Library
Performance Considerations

In deciding whether to use the List or ArrayList class, both of which have similar functionality, remember that the List class performs better in most cases and is type safe. If a reference type is used for type T of the List class, the behavior of the two classes is identical. However, if a value type is used for type T, you need to consider implementation and boxing issues.


... inaddition, in list theory it seems pretty acceptable to assume that adds can be done in constant time. And subsequently disappointed to find out that adds can now (at apparent randomness) take significant durations of time to complete.

Quote:
Wikipedia
Some languages may instead implement lists using other data structures, such as arrays. However, it is generally assumed that elements can be inserted into a list in constant time, while access of a random element in a list requires linear time; this is to be contrasted with an array (or vector), for which the time complexities are reversed.

Share this post


Link to post
Share on other sites
Quote:
Original post by Arild Fines
Quote:
Original post by Thevenin
However, having each message get added to the database separately may be a bottleneck (dispite the fact that the database is locally accessible).

Have you actually profiled this? This whole thread smacks of newbieish premature optimization.

I suspect that even if the growth cost of List<T> was O(n^2)(it isn't), you wouldn't even notice the difference.

Bottom line: List<T> is the best you'll get. Microsoft can't do any better, nor can you.


I am pretty sure I could do better, infact, I know I could do better. Becuase I don't need remove() functionality, and I don't need any fast method to gather random pieces. The only time this chat log list would get requested to return random data entries would be at the end of the server's uptime duration, and this is where performance no longer matters.

This thread does not smack at all of premature optimization.

Share this post


Link to post
Share on other sites
Quote:
Original post by Thevenin
And subsequently disappointed to find out that adds can now (at apparent randomness) take significant durations of time to complete.

You think a single memcpy once or twice in the lifetime of your list really amounts to "significant durations of time"? Sheesh.

Quote:
Wikipedia
Some languages may instead implement lists using other data structures, such as arrays. However, it is generally assumed that elements can be inserted into a list in constant time, while access of a random element in a list requires linear time; this is to be contrasted with an array (or vector), for which the time complexities are reversed.

Nothing in your original post indicated that you needed to perform insertions at random locations in the list; in fact, you pretty much stated that all insertions would happen at the end.

Share this post


Link to post
Share on other sites
Quote:
Original post by Thevenin
I am pretty sure I could do better, infact, I know I could do better. Becuase I don't need remove() functionality,

What does that have to do with the price of beer in Paris? The Remove method on List isn't costing you anything in terms of insertion.
Quote:

and I don't need any fast method to gather random pieces.

Hm, not sure what to say here. I guess I'll quote it anyway and let it stand for itself.

Share this post


Link to post
Share on other sites
Quote:
Original post by Arild Fines
Quote:
Original post by Thevenin
I am pretty sure I could do better, infact, I know I could do better. Becuase I don't need remove() functionality,

What does that have to do with the price of beer in Paris? The Remove method on List isn't costing you anything in terms of insertion.
Quote:

and I don't need any fast method to gather random pieces.

Hm, not sure what to say here. I guess I'll quote it anyway and let it stand for itself.


Both of these may very well likey affect the implementation used for a list. For instance, if I know that I didn't need to have my implementation of a list allow for segments of it to be removed, I could very well boost performance by slashing features and taking shortcuts (though, I have no ideas atm of what shortcuts I'd take).

It's honestly not that hard to write specialized code that will outperform Microsoft's code, so stop it with the belated pessimism.

Share this post


Link to post
Share on other sites
Quote:
Original post by Thevenin
(though, I have no ideas atm of what shortcuts I'd take).

Exactly.
Quote:

Its honestly not that hard to write specialized code that will outperform Microsoft's code, so stop it with the belated pessimism.

Let's see; there are more or less exactly two ways you can design a dynamically expandable list.

Share this post


Link to post
Share on other sites
Quote:
Original post by Arild Fines
Quote:
Wikipedia
Some languages may instead implement lists using other data structures, such as arrays. However, it is generally assumed that elements can be inserted into a list in constant time, while access of a random element in a list requires linear time; this is to be contrasted with an array (or vector), for which the time complexities are reversed.

Nothing in your original post indicated that you needed to perform insertions at random locations in the list; in fact, you pretty much stated that all insertions would happen at the end.


I think you need to reread my post, and the citation off Wikipedia, because I do not need to do insertions in random parts of my list. They are all at the end.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
I think you are better off writing those log entries to a file and write that file to a database when the server goes down for its daily server-save.

I mean a log file can grow realy huge during a single stressful day. If you log that much that it may be a performance issue then keeping it in the working memory is definitely a bad idea.

Imo you would even be better off by inserting it into the database just in time.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Quote:
Original post by Thevenin
It's honestly not that hard to write specialized code that will outperform Microsoft's code, so stop it with the belated pessimism.


Or you could just use a Queue when you need a Queue...

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
Imo you would even be better off by inserting it into the database just in time.


Thats definently the most straight-forward approach, however, it will probably be tragic -- as interactions with the MySQL database take quite a toll on the performance of my game, I'm assuming it's because my code is entirely synchronis and that the MySQL Connector is blocking (it would have to be for 'reads' I don't know about 'writes' though).

I'm honestly not too worried about server crashes causing all the chat logs to be erased (if I decide to store it in memory), the server use to crash on an hourly basis back when I had it all written out in Procedural-C, but it doesn't anymore. I'm putting in message logging mostly so that I can handle player harassment claims.

Share this post


Link to post
Share on other sites
So dump the list periodically, every 10 seconds or something, and clear it.

That way the list will quickly level off to some good size, and neither data loss nor performance should be a major issue.

Share this post


Link to post
Share on other sites

This topic is 4076 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this