What is the ideal solution for a repository in my case?

Started by
7 comments, last by Prot 9 years, 3 months ago

I am trying to implement something similar to Scrabble or Letter Quest. I am still using XNA with C#. The game I am writing has to be able to check whether the user's input is a valid word in the dictionary, so I wonder what the most efficient way to implement a repository for the dictionary would be. I can think of two things:

  1. I could save all the valid words in XML files
  2. I could make the user install and use a Service-based database (SQL)

I think XML files have the disadvantage of poor lookup performance compared to SQL. Note that this is just an assumption of my own; I do not know it for sure. On the other hand, SQL adds some overhead during installation of the game: if the user has no SQL Server Express installed, he will be forced to download it. The download starts automatically, but it still requires an internet connection.

So what would be a good solution for this? Is there maybe another way I have not thought of? I am open to any suggestion. Also if you think that I got something wrong in my thoughts above, please let me know.

Btw I also was wondering how people fill dictionaries like the one in Letter Quest. I mean, are they sitting there for months inserting every single word they know into a repository?

Thanks in advance!


I could make the user install and use a Service-based database (SQL)

Well, this has the advantage that you won't need to provide any support, because nobody would play your game... Please don't force the user to install and configure some (uncommon) third-party software just to play a game.


I could save all the valid words in XML files

This would be OK. As an example, German has around 5 million words; assuming an average of 10 letters each, that is roughly 50 MB of data at one byte per letter, which is nothing to worry about holding in memory. With a simple search structure you will get really good lookup performance (for a start, even a plain binary search over a sorted list would do). So keep it simple and stupid :)
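To illustrate the binary search idea, here is a minimal sketch that assumes a plain text file with one word per line. The class name `SortedWordList` and its methods are hypothetical, not something from XNA or the framework; only `File.ReadAllLines`, `Array.Sort` and `Array.BinarySearch` are standard .NET calls.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; the name and the file layout are assumptions.
public static class SortedWordList
{
    // Reads one word per line and sorts with ordinal comparison, so the
    // same comparer can be used for the binary search below.
    public static string[] Load(string path)
    {
        string[] words = File.ReadAllLines(path);
        Array.Sort(words, StringComparer.Ordinal);
        return words;
    }

    // Array.BinarySearch returns a non-negative index when the value is found.
    public static bool Contains(string[] sortedWords, string candidate)
    {
        return Array.BinarySearch(sortedWords, candidate, StringComparer.Ordinal) >= 0;
    }
}
```

Load the array once at startup and call `SortedWordList.Contains(words, input)` for each lookup. The search is only valid if the array was sorted with the same comparer that is passed to `Array.BinarySearch`.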


Btw I also was wondering how people fill dictionaries like the one in Letter Quest. I mean, are they sitting there for months inserting every single word they know into a repository?

I'm sure there are databases around from which you can get whole dictionaries. You may need to pay for them, and at the very least you will need to convert them into your own data format.
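The conversion step could be as small as the following sketch. It assumes the raw source has one entry per line and that the game only wants lowercase words made purely of letters; both the rules and the `WordListCleaner` name are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical offline cleanup step; the filtering rules are assumptions.
public static class WordListCleaner
{
    public static List<string> Clean(IEnumerable<string> rawLines)
    {
        return rawLines
            // Normalize whitespace and case first so duplicates collapse.
            .Select(line => line.Trim().ToLowerInvariant())
            // Drop empty lines and entries containing digits, apostrophes, etc.
            .Where(word => word.Length > 0 && word.All(c => char.IsLetter(c)))
            .Distinct()
            // Ordinal order keeps the file usable for a later binary search.
            .OrderBy(word => word, StringComparer.Ordinal)
            .ToList();
    }
}
```

You would run this once, outside the game, and write the result back to a text file (for example with `File.WriteAllLines`), so the game itself only ever loads the cleaned list.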

Simplest thing:

1) a simple text file for the words (XML is way overkill), with a word on each line. Then,

2) in your code, read it line by line and put the words into a C# HashSet

Only if step 2 (which runs when your game starts) turns out to be too slow should you look at more complex options. One is to convert the text file offline to a binary format (say, one giant char array plus a second array holding the start index of each word) that you can load directly into memory. If the words are sorted, you can then use a binary search to look up a word. Or use a trie if you want to get really fancy: http://crpit.com/confpapers/CRPITV62Askitis.pdf
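For reference, a plain trie can be sketched in a few lines. This is a basic dictionary-per-node version written for illustration, not the more elaborate structure from the linked paper, and the `WordTrie` name and its members are hypothetical.

```csharp
using System.Collections.Generic;

// Minimal trie sketch: one node per character, a flag marking complete words.
public sealed class WordTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsWord;
    }

    private readonly Node root = new Node();

    public void Add(string word)
    {
        Node node = root;
        foreach (char c in word)
        {
            Node next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new Node();
                node.Children.Add(c, next);
            }
            node = next;
        }
        node.IsWord = true;
    }

    // True only if the exact word was added.
    public bool Contains(string word)
    {
        Node node = Find(word);
        return node != null && node.IsWord;
    }

    // True if any added word starts with the given prefix.
    public bool HasPrefix(string prefix)
    {
        return Find(prefix) != null;
    }

    // Walks the trie along the text; returns null when a character is missing.
    private Node Find(string text)
    {
        Node node = root;
        foreach (char c in text)
        {
            if (!node.Children.TryGetValue(c, out node))
            {
                return null;
            }
        }
        return node;
    }
}
```

The main thing a trie offers over a HashSet is the prefix query, which can be handy if the game later needs to know whether the letters entered so far can still lead to a valid word. For a plain "is this a word" check it is unlikely to be worth the extra memory.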

What about SQLite? Would that save me the internet connection part and increase the lookup performance?

I don't understand... do you need more functionality than just looking up words to see if they exist? If not, why are you looking for a more complex solution?

Here's the C# code to read a text file of words into a HashSet. Lookups take constant time on average, so it will be extremely fast.


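            // Requires: using System.Collections.Generic; using System.IO;
            // Run this once at startup and keep wordList around for all lookups.
            HashSet<string> wordList = new HashSet<string>();
            using (StreamReader wordlistFile = new StreamReader("wordlist.txt"))
            {
                string line;
                // ReadLine returns null at the end of the file.
                while ((line = wordlistFile.ReadLine()) != null)
                {
                    wordList.Add(line);
                }
            }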

Here's a word list:

http://www-personal.umich.edu/~jlawler/wordlist.html

You absolutely don't need a database, nor do you need XML. Just use a text file with one word per line, sorted in lexicographical order. Either load each word into a hash set, or put them all into one massive character array with the words separated by a 0 (zero) character and do a binary search over it. Either option will be plenty fast.
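The character array variant might look roughly like the sketch below. It assumes no word is empty or contains the 0 character itself, and it sorts the words with ordinal comparison on load; the `PackedWordList` name and its members are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Sketch: all words in one char buffer, each followed by a '\0' terminator.
public sealed class PackedWordList
{
    private readonly char[] buffer;

    public PackedWordList(IEnumerable<string> words)
    {
        // Sort with ordinal comparison so it matches CompareAt below.
        List<string> sorted = new List<string>(words);
        sorted.Sort(StringComparer.Ordinal);

        StringBuilder builder = new StringBuilder();
        foreach (string word in sorted)
        {
            if (word.Length == 0) continue; // empty entries would break the scan-back step
            builder.Append(word).Append('\0');
        }
        buffer = builder.ToString().ToCharArray();
    }

    public bool Contains(string candidate)
    {
        int lo = 0;             // always the first character of a word
        int hi = buffer.Length; // always a word start or the end of the buffer
        while (lo < hi)
        {
            int start = lo + (hi - lo) / 2;
            // Walk back to the first character of the word containing 'start'.
            while (start > lo && buffer[start - 1] != '\0') start--;
            int end = Array.IndexOf(buffer, '\0', start); // terminator of that word

            int order = CompareAt(start, end, candidate);
            if (order == 0) return true;
            if (order < 0) lo = end + 1; // stored word sorts before the candidate
            else hi = start;             // stored word sorts after the candidate
        }
        return false;
    }

    // Ordinal comparison of buffer[start..end) against the candidate string.
    private int CompareAt(int start, int end, string candidate)
    {
        int length = end - start;
        int shared = Math.Min(length, candidate.Length);
        for (int i = 0; i < shared; i++)
        {
            int diff = buffer[start + i] - candidate[i];
            if (diff != 0) return diff;
        }
        return length - candidate.Length;
    }
}
```

Compared with a HashSet of strings this avoids one object per word, which mostly matters for memory use and load time; for lookup speed alone the hash set is the simpler choice.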

If you want to get really fancy, you could likely optimize your search times using information like this.



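// Requires: using System; using System.Collections.Generic; using System.IO;
// Words is assumed to be a HashSet<string> field that is filled once at startup.
Words = new HashSet<string>(File.ReadLines("words.txt"), StringComparer.OrdinalIgnoreCase);

// later (do not load the file once per word lookup, obviously)

var wordIsValid = Words.Contains("whatever");

Ship it.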

Thanks a lot! I think I will go with your approach (the text file read into a HashSet); it sounds reasonable!


The File.ReadLines one-liner also sounds plausible!

This topic is closed to new replies.
