I'm toying with the idea of building a trivia game, but instead of making the questions myself, or outsourcing them, I'm thinking of having the game go out onto the internet and create the questions itself.
I haven't decided whether the questions will be multiple choice (choose A, B, C, or D), fill in the blanks (you have to actually type the answer), or some other question type. Multiple choice makes for a better game (IMO), but I have no idea where I would get the 3 fake answers from. Fill in the blanks would be much easier to program, but it makes for a slower game because people have to type. It's also tricky because there are usually multiple ways to spell things, and it would be nice to allow minor spelling mistakes.
Also, instead of using the entire internet as a question source, I might restrict the game to Wikipedia only. That way I can make certain assumptions about the data format, etc., which might make things a lot easier. It would also give me a really easy way to have categories (e.g. Arts, History, Geography, etc.) because Wikipedia articles are already grouped that way.
So for example I'm on Wikipedia now and I just clicked on "Random Article". This got me the page on Heathrow Airport:
https://en.wikipedia.org/wiki/Heathrow_Airport
Let's look at the first sentence in the article:
Heathrow Airport (IATA: LHR, ICAO: EGLL) is a major international airport in west London, England.
Suppose I wanted to use this for a trivia question. There are lots of ways I could make a question from this sentence, some ways no doubt being harder to accomplish than others.
Example 1: multiple choice
________ Airport (IATA: LHR, ICAO: EGLL) is a major international airport in west London, England.
A:) Edinburgh
B:) Bristol
C:) Heathrow
D:) Newcastle
How could I accomplish something like this? My question generator would have to know that Heathrow is an airport, and would have to go and look up 3 fake airports to display along with the correct answer. Even better, the fake answers here are all airports in the UK, which makes the question a bit trickier. I have no idea how to write a program to do something like this. Would this require some fancy artificial intelligence, or am I missing something?
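The only part I can picture is the mechanical bit. If I somehow already had a list of other items in the same category, picking the fake answers would just be a random sample. Here is a rough sketch of that step only; the hard-coded `UK_AIRPORTS` list and the `build_choices` function are names I made up for illustration, and in a real game the list would presumably have to come from something like a Wikipedia category.

```python
import random

# Hypothetical category data, hard-coded purely for illustration.
# A real game would need to obtain this list from somewhere.
UK_AIRPORTS = ["Heathrow", "Gatwick", "Edinburgh", "Bristol", "Newcastle", "Stansted"]

def build_choices(answer, category_members, n_fake=3, rng=random):
    """Return a shuffled list containing the answer plus n_fake distractors."""
    # Distractors are other members of the same category, never the answer itself.
    pool = [m for m in category_members if m != answer]
    if len(pool) < n_fake:
        raise ValueError("not enough category members to build distractors")
    choices = rng.sample(pool, n_fake) + [answer]
    rng.shuffle(choices)
    return choices

choices = build_choices("Heathrow", UK_AIRPORTS)
for label, choice in zip("ABCD", choices):
    print(f"{label}:) {choice}")
```

This does nothing to solve the real problem of knowing which category "Heathrow" belongs to; it only shows that the distractor step is easy once that is known.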
Example 2: fill in the blanks
________ Airport (IATA: LHR, ICAO: EGLL) is a major international airport in west London, England.
-> Player must type "Heathrow" to win
This seems much easier to program, but it's not as fun a game in my opinion. Also, I would need an algorithm that accepts minor spelling mistakes, so for example "Hethrow" should be accepted. I don't think this algorithm is that difficult, but it is a consideration. There could also be scenarios where a completely different answer with completely different spelling is also correct. For example, if Heathrow also had a second unofficial name, I might want to accept that answer as well.
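As a minimal sketch of the spelling tolerance, assuming a similarity ratio is good enough, Python's standard `difflib.SequenceMatcher` could compare the guess against a list of accepted answers. The function name `is_close_enough`, the 0.8 threshold, and the "London Heathrow" alias are my own assumptions for illustration; the threshold in particular would need tuning.

```python
from difflib import SequenceMatcher

def is_close_enough(guess, accepted_answers, threshold=0.8):
    """Return True if the guess is similar enough to any accepted answer."""
    guess = guess.strip().lower()
    return any(
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they differ.
        SequenceMatcher(None, guess, answer.lower()).ratio() >= threshold
        for answer in accepted_answers
    )

# The second entry is an assumed alternative name, included only as an example.
accepted = ["Heathrow", "London Heathrow"]

print(is_close_enough("Hethrow", accepted))   # True
print(is_close_enough("Gatwick", accepted))   # False
```

Keeping a list of accepted answers per question would also cover the case of a second name with completely different spelling, as long as something supplies that list.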
Backing up a bit, how do I even choose "Heathrow" as the guessword from this sentence? How could the question generator know that Heathrow is the most interesting and fun word to blank out? After all, words are just words to the program. What if, instead of Heathrow, it chose "major" as the guessword? That would be a really stupid and frustrating question. I'm starting to think this program might be way too difficult to build.
Would I need a database or library that classifies each word as a noun, adjective, verb, etc? That might help with choosing which word to blank out.
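One shortcut I can think of that might sidestep word classification, at least for sentences like the Heathrow one: the article title itself appears in the first sentence, so the program could blank out the title instead of guessing which word is interesting. Below is a rough sketch under that assumption. The `blank_out_title` function and the hand-maintained `generic_words` list (words like "Airport" that should stay visible) are hypothetical, and it simply gives up when the title is not found in the sentence.

```python
import re

def blank_out_title(sentence, title, generic_words=("Airport",)):
    """Blank the distinctive title words in the sentence.

    Returns (question, answer), or None if the title words are not found.
    """
    # Keep generic words such as "Airport" visible; hide the rest of the title.
    answer = " ".join(w for w in title.split() if w not in generic_words)
    if not answer:
        return None
    pattern = re.compile(r"\b" + re.escape(answer) + r"\b")
    if not pattern.search(sentence):
        return None
    # Replace only the first occurrence with a blank.
    question = pattern.sub("________", sentence, count=1)
    return question, answer

sentence = ("Heathrow Airport (IATA: LHR, ICAO: EGLL) is a major "
            "international airport in west London, England.")
print(blank_out_title(sentence, "Heathrow Airport"))
```

This would never pick "major" as the guessword, but it only ever produces one kind of question per article, so it probably is not the whole answer.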
Any ideas on this topic would be appreciated. I'm hoping I don't need a PhD in linguistics to pull this off!