Hi
Some friends and I are programming a server for the game Minecraft that has a Lua plugin system, styled after GMod Lua. We're coding it in C++. It even has hooks.
We've hit some problems in the past 4 days, though, that we can't seem to fix.
Before I show you any code, allow me to briefly explain how it works. There is the main thread, plus one thread for each connected client. The Lua implementation loads several independent "plugins" into the same Lua instance, isolated from each other in different environments. When a client thread receives a packet (e.g. a chat message), it calls PluginManager::CallPlayerHook(const char *hook, Player *playerObj). Here's what that function does:
Here's the pastebin for easier reading
bool PluginManager::CallPlayerHook(const char *hook, Player *playerObj) {
    WaitForSingleObject(this->pluginsMutex, INFINITE);
    if(this->isDoingHook)
        return false;
    else
        this->isDoingHook = true;
    if(lua_gettop(Lua) > 0) {
        Console::PrintText("Stack is > 0 for hook '%s'", hook);
        return false;
    }
    if(lua_gettop(Lua)) {
        Console::PrintText("Waiting for Lua: %s", hook);
        while(lua_gettop(Lua));
        Console::PrintText("Done waiting for Lua: %s", hook);
    }
    std::list<Plugin*>::iterator iter;
    for(iter = this->plugins.begin(); iter != this->plugins.end(); ++iter) {
        if((*iter)->CallPlayerHook(hook, playerObj) == false) {
            //lua_settop(Lua, 0);
            this->isDoingHook = false;
            ReleaseMutex(this->pluginsMutex);
            return false;
        }
    }
    this->isDoingHook = false;
    ReleaseMutex(this->pluginsMutex);
    return true;
}
The code is quite cluttered with several attempts to fix the numerous problems, but none of them work. The function returns a bool: true if everything was OK, false otherwise. The client thread calls it like this: while(!pluginManager->CallPlayerHook("OnPlayerChat", this, NULL, NULL)); So what PluginManager::CallPlayerHook() does is loop through a std::list of Plugin objects and call a similar function on each one.
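One thing that may be worth ruling out in the function above: the early `return false` paths before the loop leave without calling ReleaseMutex, and the second one also leaves isDoingHook set. Below is a minimal sketch of scope-based locking that releases the lock on every return path. It assumes C++11 is available and swaps in std::mutex for the Win32 mutex handle (note that std::mutex, unlike a Win32 mutex, is not recursive). HookDispatcher, CallHook and RunClients are made-up names for illustration, not part of the project.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for PluginManager; only the locking is shown.
class HookDispatcher {
public:
    // Returns false if the hook was rejected, true if it ran.
    bool CallHook(bool reject) {
        // lock_guard unlocks in its destructor, so every return path
        // below (and any exception) releases the mutex.
        std::lock_guard<std::mutex> lock(mutex_);
        if (reject)
            return false;              // early return still unlocks
        ++callsInFlight_;
        assert(callsInFlight_ == 1);   // only one thread is ever in here
        ++completed_;
        --callsInFlight_;
        return true;
    }

    int Completed() {
        std::lock_guard<std::mutex> lock(mutex_);
        return completed_;
    }

private:
    std::mutex mutex_;
    int callsInFlight_ = 0;
    int completed_ = 0;
};

// Simulates `threads` client threads, each firing `callsPerThread` hooks;
// every odd-numbered call takes the early-return path.
int RunClients(int threads, int callsPerThread) {
    HookDispatcher dispatcher;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&dispatcher, callsPerThread] {
            for (int i = 0; i < callsPerThread; ++i)
                dispatcher.CallHook(i % 2 == 1);
        });
    }
    for (auto &th : pool)
        th.join();
    return dispatcher.Completed();
}
```

With four client threads firing 100 hooks each, RunClients(4, 100) finishes with 200 completed hooks and no thread left waiting on the lock, even though half the calls return early.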
So let's look at Plugin::CallPlayerHook(const char *hook, Player *playerObj). What this function does is loop through a local std::list of structs, each containing char*'s for a hooked event name and the name of the function to call in that plugin's Lua script, just like GMod's hook system. If it finds a match, it checks that the Lua stack is empty (no functions are running), then pushes the function and the Player object onto the stack and pcalls it. The code is very cluttered with try{}catch(){}'s so that I could pinpoint the error. Here's the code, and then I will describe exactly what the problem is.
Here's the pastebin for easier reading
bool Plugin::CallPlayerHook(const char *hook, Player *playerObj) {
    std::list<Hook*>::iterator iter;
    try {
        for(iter = Hooks.begin(); iter != Hooks.end(); ++iter) {
            if(strcmp((*iter)->eventName, hook))
                continue;
            if(lua_gettop(Lua))
                return false;
            try {
                lua_getfield(Lua, LUA_GLOBALSINDEX, "PLUGINS");
                lua_pushnumber(Lua, this->id);
                lua_gettable(Lua, -2);
                lua_getfield(Lua, -1, (*iter)->functionName);
            } catch(...) {
                Console::PrintText("Exception with getting function for hook '%s' and plugin '%s'", hook, this->Name);
                lua_settop(Lua, 0);
                return false;
            }
            try {
                this->SetEnv();
            } catch(...) {
                Console::PrintText("Exception with SetEnv() for hook '%s' and plugin '%s'", hook, this->Name);
                lua_settop(Lua, 0);
                return false;
            }
            try {
                if (playerObj != NULL) {
                    tolua_pushusertype(Lua, playerObj, "Player");
                } else {
                    Console::PrintText("Player is NULL for hook '%s' and plugin '%s'", hook, this->Name);
                    lua_settop(Lua, 0);
                    return false;
                }
            } catch(...) {
                Console::PrintText("Exception with pushing params for hook '%s' and plugin '%s'", hook, this->Name);
                lua_settop(Lua, 0);
                return false;
            }
            try {
                if(lua_pcall(Lua, 1, 0, 0)) {
                    Console::PrintText("LUA ERROR: %s", lua_tostring(Lua, -1));
                    lua_pop(Lua, 1);
                }
            } catch(...) {
                Console::PrintText("Exception with hook pcall for hook '%s' and plugin '%s'", hook, this->Name);
                lua_settop(Lua, 0);
                return false;
            }
            lua_settop(Lua, 0);
        }
    } catch(...) {
        Console::PrintText("CallHook failed at '%s'. Retrying.", hook);
        lua_settop(Lua, 0);
        return false;
    }
    return true;
}
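Stripped of the Lua calls, the lookup that Plugin::CallPlayerHook performs can be sketched as follows. This is only an illustration of the matching logic: HookEntry, DispatchHook and CountChatHandlersRun are hypothetical names, and a std::function stands in for the Lua function that the real code looks up by name and pcalls.

```cpp
#include <cassert>
#include <cstring>
#include <functional>
#include <list>
#include <string>

// Hypothetical stand-in for the Hook struct: an event name plus the
// handler to run (in the real plugin this is a Lua function name).
struct HookEntry {
    const char *eventName;
    std::function<void(const std::string &)> handler;
};

// Calls every handler registered for `hook`; returns how many ran.
int DispatchHook(const std::list<HookEntry> &hooks, const char *hook,
                 const std::string &playerName) {
    int called = 0;
    for (const HookEntry &entry : hooks) {
        if (std::strcmp(entry.eventName, hook) != 0)
            continue;                  // registered for a different event
        entry.handler(playerName);
        ++called;
    }
    return called;
}

// Small demo: two handlers hook OnPlayerChat, one hooks OnPlayerJoin.
int CountChatHandlersRun() {
    int chatCalls = 0;
    std::list<HookEntry> hooks;
    hooks.push_back({"OnPlayerChat",
                     [&chatCalls](const std::string &) { ++chatCalls; }});
    hooks.push_back({"OnPlayerJoin", [](const std::string &) {}});
    hooks.push_back({"OnPlayerChat",
                     [&chatCalls](const std::string &) { ++chatCalls; }});
    int ran = DispatchHook(hooks, "OnPlayerChat", "Steve");
    assert(ran == chatCalls);          // every matching handler ran once
    return ran;
}
```

In this demo only the two OnPlayerChat entries run; the OnPlayerJoin entry is skipped by the strcmp check, the same way the posted loop uses `continue`.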
So finally, here's what happens. When one person joins the server, the hooks work fine for the most part. It calls the hooks for OnPlayerJoin, OnPlayerChat, OnPlayerMove, etc. The Lua code is executed, and the plugins function as expected. However, a few times a minute, at random, one of the messages in the catch(){} statements above is displayed. It's always a random one. Usually it's around pcall, but sometimes it's a problem with pushing params.
Now the really bad part is when a second (or even third) client joins. Sometimes the clients will freeze up, and no packets will go through. Other times it'll work for a while (while displaying lots of random caught exceptions), and then it will crash due to an "uncaught exception: longjump executed", even though the point of the exception was IN a try{} statement! Other times the situation starts out the same, but instead of a longjump exception, the server will just freeze up, spamming a Lua-generated error: "C STACK OVERFLOW", or "Tried to call table object!", or even that an object is nil (which means the right things weren't pushed onto the stack from C++ when they should have been).
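On the crash that escapes the try{} blocks: assuming the Lua library here is compiled as C (its default), Lua raises errors with longjmp rather than with C++ exceptions, so a try/catch around Lua API calls would not be expected to intercept them. The sketch below shows only that general C++ behaviour and does not use Lua; FailWithLongjmp and RunGuardedCall are made-up names.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf recoveryPoint;

// Stand-in for a C library call that reports an error with longjmp
// (hypothetical; not an actual Lua function).
void FailWithLongjmp() {
    std::longjmp(recoveryPoint, 1);
}

// Returns 0 if the call finished normally, 1 if catch(...) ran,
// 2 if the longjmp skipped the catch block entirely.
int RunGuardedCall() {
    volatile bool caught = false;
    if (setjmp(recoveryPoint) != 0) {
        // Control arrives here straight from longjmp.
        return caught ? 1 : 2;
    }
    try {
        FailWithLongjmp();
    } catch (...) {
        caught = true;   // never reached: longjmp is not a C++ exception
    }
    return 0;
}
```

With GCC or Clang, RunGuardedCall() returns 2: control comes back through setjmp and the catch(...) body never runs.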
It's really confusing us, and it all seems so random, since it behaves differently almost every time.
Could it be caused by thread conflicts? Maybe two client threads are trying to call a Lua hook at the same time, or maybe one is trying to call it while another hook is currently processing? Could it be a mutex problem?
I would like to thank you for just reading this far and thinking about it. This problem is very troubling, and the project will be near completion once we fix this. We assume that all of these problems are probably caused by one or two simple programming mistakes. This is a major roadblock, and a potential show-stopper for our project, even though we are so close to completion.
Thank you for your time,
Drew