GameDev.net has the most down time of any site I regularly visit.

60 comments, last by jjd 12 years, 3 months ago
After the new site launched last year, I expressed my displeasure with the amount of downtime that had been constantly occurring before and especially after that bold move. I assumed that the problem would diminish with time, but it didn't. To me, one of the primary goals of any website should be its uptime. Downtime creates lost revenue, prevents the community from accessing valuable information, and causes us to question whether this site is really worth investing our time in.

Now don't get me wrong, all the features are nice, but features are not why any of us come here! We are here for the community and the wealth of knowledge this site contains. Now I understand that it was an unforeseen hardware failure which caused this latest outage, but you guys should have been prepared for this. All these features are nice, but they should not come at the expense of this site's uptime.

I've placed this thread in the lounge in an attempt to get as many votes as possible in order to make my point clear. Perhaps I am wrong and this site is not as bad as other popular sites, but in my opinion it is far worse, and the length of the downtimes is ridiculous.

After the new site launched last year, I expressed my displeasure with the amount of downtime that had been constantly occurring before and especially after that bold move. I assumed that the problem would diminish with time, but it didn't.

Yes, before the upgrade gamedev.net was having some trouble keeping up. However, I didn't expect it to change after the upgrade. Active topics not loading? Downtime? Check back in 10 minutes. It's fixed. No new interesting threads (in the past hour); shrug.


Now I understand that it was an unforeseen hardware failure which caused this latest outage, but you guys should have been prepared for this.

There is a point where "preparing for the unforeseen" becomes financially unfeasible...
I've noticed the same thing as Steve. It's probably because part of my job is ensuring that our servers are up as much as possible that I notice how often other sites have problems, but GDNet has by far the most downtime.


There is a point where "preparing for the unforeseen" becomes financially unfeasible...


Trust me GDNet is nowhere close to the financially unfeasible range of preparing against failure unless their budget is almost non-existent.

Trust me GDNet is nowhere close to the financially unfeasible range of preparing against failure unless their budget is almost non-existent.


Do we give off an impression that we are rolling in money here or something? =) Almost all of our ads are generic and Amazon sales are poor (not to mention their affiliate program has paid less and less each year that they develop "exciting new payment tiers"), so we depend a lot on GDNet+ subscriptions. This site is a labor of love.

Over the last year we've had to learn how to work in a way we hadn't before, depending on a lot of open source software and moving to MUCH cheaper cloud hosting. With so many changes it was a first for us, but it was our only option if we wanted to be able to afford to keep running the site. Prior to 2011 we had our own rack and our own servers running Windows Server operating systems; we went from that to a LAMP stack for the first time in our existence. In some ways I feel 2011 was an adapt-or-die year for us, for real.

To be fair, I made an upgrade a few months ago that has left us with next to zero downtime since then. This one hit us out of the blue like a nuclear bomb. Our provider actually corrupted our file system by running fsck on the parent server we are on without backing us up first. Their response? "We're sorry..." followed by our options.

I fully expect, barring some fluke like yesterday morning, to have next to no downtime in the foreseeable future with this setup. Though what we do when our provider literally destroys us... I don't know. It wasn't intentional, but wow. I really thought we would be good.
Are you guys running a virtual machine on their server? I don't understand how they could have screwed your filesystem up with a fsck, unless the data portion is mapped to a volume on one of their SANs or something. Either way, that's unacceptable on their part.

I help run the IT shop for the computer science department at my university, so I understand what it's like to have unforeseen hardware failures. It's the worst feeling ever! I'm glad your backups worked. I'm always nervous that the backups will be corrupt.

Zach.
I say we have a GameDev fundraiser, if funds are an issue!
A lack of funds in a community of developers. Is it just me or does that sound weird?

I usually don't notice the downtimes; most of them take 5-10 minutes to be resolved.

Are you guys running a virtual machine on their server? I don't understand how they could have screwed your filesystem up with a fsck, unless the data portion is mapped to a volume on one of their SANs or something. Either way, that's unacceptable on their part.



Yep, and the corruption occurred somehow on the host file system that housed our virtual server. Because of the size of our instance we were the only one on that particular server, so when they ran fsck, on our end we saw all sorts of screwed up stuff. The Apache config, for example, opened up as some random PHP file from our website.

I say we have a GameDev fundraiser, if funds are an issue!


Well, we tried a donation run back in 2010, and honestly the response was barely lukewarm. Perhaps we didn't grovel enough or do a good enough job getting the word out, although it was a posted forum announcement for several months. Fundraising is something that does require effort, though, and perhaps we just didn't put enough into it the first time around.

Drew Sikora
Executive Producer
GameDev.net


Are you guys running a virtual machine on their server? I don't understand how they could have screwed your filesystem up with a fsck, unless the data portion is mapped to a volume on one of their SANs or something. Either way, that's unacceptable on their part.

Yep, and the corruption occurred somehow on the host file system that housed our virtual server. Because of the size of our instance we were the only one on that particular server.. so when they ran fsck on our end we saw all sorts of screwed up stuff. The apache config, for example, opened up as some random PHP file from our website.

Unbelievable. That's the whole point of running a virtual server: to have a buffer of security between the guest and a hardware failure. I'm assuming they run a VMware cluster of some kind, in which case they may have had a SAN failure or a disk array failure. At any rate, they should have had your stuff backed up for you at the virtual machine level.

This topic is closed to new replies.
