It appears that LLVM's function inlining optimizations cause the GC to go slightly insane. I have yet to ascertain the exact interaction between the inliner and the GC, but disabling inlining entirely makes the GC run beautifully.
I also remembered a lesson that I should have had in mind a long time ago: deferred operations and garbage collection are a nasty mix. My particular problem was using PostMessage() to send a string to a widget. In between the call to PostMessage() and the message pump actually dispatching the message, the GC would run. It would detect that nothing it could see referenced the string anymore, and (correctly, from its point of view) free it.
Of course, when the message pump dispatched the notification in question, the control would try to read from the now-freed string, and bogosity would result.
Fixing that was just a matter of a find/replace and swapping in SendMessage(), which doesn't return until the message has been handled, since I didn't really need the deferred behavior of PostMessage() in the first place.
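To make the hazard concrete, here is a minimal sketch in portable C++ that models the same sequence of events. Everything in it (ToyHeap, ToyQueue, ToyWidget, DeliversIntact) is a hypothetical stand-in I made up for illustration; it is not the Win32 API and not the Epoch runtime. The toy heap uses integer handles and an explicit root set, and the only assumption that matters is that the collector cannot see handles sitting in the message queue.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Everything here is a hypothetical toy model, not Win32 or Epoch runtime code.
using Handle = int;

// Toy collected heap: a string survives Collect() only while it is a root.
class ToyHeap {
public:
    Handle Allocate(std::string text) {
        Handle h = next_++;
        objects_.emplace(h, std::move(text));
        return h;
    }

    void AddRoot(Handle h) {
        assert(objects_.count(h) != 0);  // only live objects can be rooted
        roots_.insert(h);
    }

    void RemoveRoot(Handle h) { roots_.erase(h); }

    // Returns nullptr if the object has already been collected.
    const std::string* Lookup(Handle h) const {
        auto it = objects_.find(h);
        return it == objects_.end() ? nullptr : &it->second;
    }

    // Frees every object that no root refers to.
    void Collect() {
        for (auto it = objects_.begin(); it != objects_.end();) {
            if (roots_.count(it->first) == 0) it = objects_.erase(it);
            else ++it;
        }
    }

private:
    Handle next_ = 1;
    std::unordered_map<Handle, std::string> objects_;
    std::unordered_set<Handle> roots_;
};

// Toy control that copies the payload string when its message is handled.
struct ToyWidget {
    std::string text;
    bool sawFreedPayload = false;

    void OnSetText(const ToyHeap& heap, Handle payload) {
        const std::string* s = heap.Lookup(payload);
        if (s == nullptr) { sawFreedPayload = true; return; }  // payload is gone
        text = *s;
    }
};

// Toy message queue. The collector cannot see handles parked in here.
class ToyQueue {
public:
    // Deferred delivery: the handler runs later, when Pump() is called.
    void Post(Handle payload) { pending_.push(payload); }

    // Immediate delivery: the handler runs before Send() returns.
    void Send(ToyWidget& w, const ToyHeap& heap, Handle payload) {
        w.OnSetText(heap, payload);
    }

    void Pump(ToyWidget& w, const ToyHeap& heap) {
        while (!pending_.empty()) {
            w.OnSetText(heap, pending_.front());
            pending_.pop();
        }
    }

private:
    std::queue<Handle> pending_;
};

// Returns true if the widget ended up with the text intact.
bool DeliversIntact(bool deferred) {
    ToyHeap heap;
    ToyQueue queue;
    ToyWidget widget;

    Handle text = heap.Allocate("hello");
    heap.AddRoot(text);      // the caller's local variable keeps the string alive
    if (deferred) queue.Post(text);
    else          queue.Send(widget, heap, text);
    heap.RemoveRoot(text);   // the caller returns; no visible reference remains
    heap.Collect();          // the GC runs before the message pump does
    queue.Pump(widget, heap);
    return widget.text == "hello" && !widget.sawFreedPayload;
}
```

In this model DeliversIntact(false) returns true, because the handler runs while the caller still holds its reference, and DeliversIntact(true) returns false, because the collection happens between Post() and Pump(). The other way out, if the deferred behavior were actually needed, would be to keep the payload rooted until the handler has run.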
So my GC bug list is shrinking once again. I hope that I'm actually fairly close this time and not just shrugging off a "small" issue that turns out to require a major rewrite of something. I really don't want to ship without function inlining support, since that's a huge performance win in a language like Epoch where everything is a tiny function. However, I desperately need to find out why inlined functions cause the GC to vomit, and I suspect that's going to take a lot of time and patience. It may be worth taking the perf hit temporarily to get working software out the door. We shall see.
Now that things seem to be behaving themselves, and soak tests of memory-intensive programs reveal no issues, I have a few things I'd like to get back to working on besides garbage collection.
One is shoring up the Era IDE so that it's actually functional enough to use on a consistent basis. That'll likely involve making random additions and refinements over time, but there's some low-hanging fruit I'd like to hit sooner rather than later (mostly surrounding tabbed editing for now).
After Era reaches a point where I'm comfortable spending a few hours in it at a time, I want to go back to working on the self-hosting compiler and getting the parser and semantic analysis code up to snuff. The major semantic features left center on overload resolution and pattern matching; the parser needs to support basically the full language grammar instead of the trivialized subset it recognizes now.
Despite the lengthy diversion, I still feel like I'm on track to finish self-hosting by the end of the year. It may be a tight squeeze, and there may well be a number of bugs to sort out before it's 100%, but I think I can get the majority of the port done by then.
Once self-hosting is reached, there's a lot of simple but tedious optimization to do in the runtime itself. One major thing I've mentioned in the past is moving away from storing Epoch "VM bytecode" in the .EXE file and directly storing LLVM bitcode instead. That'll cut back on the dependencies needed to run Epoch programs, and reduce launch times substantially as well.
From there, it's mostly a matter of introducing language and runtime features. I suspect that once the compiler is done there will be a lot of crufty code that can be dramatically improved with some minor language tweaks. There's also the obvious stuff like separate compilation support, better data structures, parallelism...
Obviously there's still quite a lot to be done, but things are getting pretty exciting. I'm starting to have semi-realistic daydreams about people joining the project as it gains momentum and helping to smooth out some of the many remaining rough edges... hint hint :-P
Mostly, though, I hope that reaching self-hosting will be enough to gain some serious attention. The language is still very, very far from ready for prime-time, but the more people start playing with it, the sooner it'll get there.