I'm planning on upgrading the server. Perhaps more experienced server admins here can chime in and offer suggestions.
I don't think CPU usage is the issue, since it usually hovers at around 20%. On the other hand, iowait can get as high as 80%. I also don't believe bandwidth is the bottleneck, since I'm only using 10-15 Mbps of a 20 Mbps line.
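For reference, here's roughly how I'm watching it (iostat comes with the sysstat package):

    # extended per-device stats every 5 seconds; high %iowait together
    # with high await/%util on the data drive points at disk seeks
    iostat -x 5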
I'm considering upgrading the RAM from 2GB to 4GB. The applications themselves only use 1GB, but I would like to dedicate more memory to the disk cache. I'm not sure how useful an extra 2GB of cache is going to be with Danbooru's access patterns, though.
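As I understand it, Linux already gives whatever RAM the applications don't use to the page cache, so the real question is whether 4GB covers enough of the working set. The current split is easy to see:

    # the "buffers"/"cached" columns are memory the kernel is using as
    # disk cache; it gets reclaimed automatically when the apps need it
    free -m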
I will also be installing a second hard drive and setting up software RAID 1. I could get dual 10k RPM drives instead, but that's $75/month compared to an extra $20/month. Maybe it would be worth it, though.
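On the software side I'm expecting it to be just mdadm; a rough sketch, assuming the new disk shows up as /dev/sdb and everything lives on one partition:

    # build the mirror degraded so the live disk is untouched, copy the
    # data onto /dev/md0 and switch mounts, then attach the old partition
    mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1
    mdadm --add /dev/md0 /dev/sda1
    cat /proc/mdstat   # watch the rebuild progress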
I'm not sure the disk cache is really useful; caching the first pages is a good idea, but beyond that everything is too dynamic. So I would go for the 10k RPM drives; I think they would be worth it.
Also, I sent you a PM with some details, in case you ever consider moving away from Hivelocity.
Wow. I didn't know Rails spawned completely separate processes for handling concurrent requests--no wonder it uses so much memory. (I spawned just 5 processes and mongrel used 140 megs of memory--that stuff should be mostly shared.) I was wondering how Danbooru could possibly use that much memory (other than a huge memcached)--what a terrible architecture. (Rails--or Mongrel, at least--not Danbooru itself.)
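(For reference, the mongrel_cluster setup I tested with was something like this; just a sketch, not necessarily what Danbooru runs:)

    # config/mongrel_cluster.yml -- each of the 5 servers is a complete
    # Rails process, and almost none of that memory is shared between them
    port: 8000
    servers: 5
    environment: production
    address: 127.0.0.1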
Anyway: just doing curl 'http://danbooru.donmai.us/stylesheets/default.css' is taking anywhere from 0.2 to 2.5 seconds. 50ms ping, no packet loss, and the TCP connection consistently opens immediately. That won't even touch Rails, and even if the disk is thrashing serving random images, a static stylesheet should still be served out of cache and take no time at all, as long as it's not swapping or anything.
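(If you want to reproduce it, this is roughly what I was timing; curl's time_total is the full request time in seconds:)

    # fetch the stylesheet 20 times, printing only the total time of each
    for i in $(seq 20); do
      curl -s -o /dev/null -w '%{time_total}\n' \
        'http://danbooru.donmai.us/stylesheets/default.css'
    done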
Any clue what's up there? What kind of drives is it using? If IDE and Linux, is UDMA on? (Don't know if you can access that on your host, and hopefully the provider is competent enough to get it right, but it's easy to miss.)
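Checking is quick if you have root and it's the classic IDE driver; /dev/hda is just a guess at the device name:

    # -d reports whether DMA is enabled; -i lists the supported transfer
    # modes, with the active one marked by a star
    hdparm -d /dev/hda
    hdparm -i /dev/hda | grep -i dma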
petopeto, you said that fetching the stylesheets takes resources, but if that's so, does it mean your configuration proxies every request to Mongrel?
I've seen a lot of configuration tips for putting Apache in front of Mongrel, with exception rules that make Apache handle all the static files, and I think that kind of configuration can lower the CPU load.
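For example, with Apache the exceptions go before the catch-all proxy rule; a sketch, with the port and paths just illustrative:

    # let Apache serve the static directories itself...
    ProxyPass /stylesheets !
    ProxyPass /images !
    # ...and forward everything else to the Mongrel backend
    ProxyPass / http://127.0.0.1:3000/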
I don't know if this helps, but Moe runs off of 2GB of RAM and an E6550, with one SATA drive dedicated to just Moe and another drive for the other websites.
We're using lighttpd 1.4.19 with mod_fastcgi. All logging is sent to /dev/null, as are all logs produced by Rails. It runs a minimum of 5 processes up to a max of 8, but most of the time only 1 process is busy, hovering around 8-10% of CPU.
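The relevant part of the lighttpd config looks roughly like this (paths simplified):

    # hand dynamic requests to Rails over FastCGI, 5 to 8 processes
    fastcgi.server = ( ".fcgi" => ((
        "socket"    => "/tmp/moe.fcgi.socket",
        "bin-path"  => "/var/www/moe/public/dispatch.fcgi",
        "min-procs" => 5,
        "max-procs" => 8
    )) )
    # and the access log goes nowhere
    accesslog.filename = "/dev/null"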
Memory usage of the busiest ruby process hits about 120MB, while the other idle ones use about 50MB each.
I haven't seen any iowait issues. The drive dedicated to Moe pushes out about 1.6MB/s constantly, but I also have a second server to push out images; that one has just one drive and pushes about 1MB/s out on average.
Cache hit rate on memcached is about 92%, if that helps...
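(That number comes straight from memcached's stats command; the hit rate is get_hits / (get_hits + get_misses). Assuming the default port:)

    # dump just the hit/miss counters from the running memcached
    printf 'stats\r\nquit\r\n' | nc 127.0.0.1 11211 | grep -E 'get_hits|get_misses'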
albert, it sounds more like you have seek-time issues on your HDs, most probably because of too many users hitting a single drive. In fact, switching to RAID 0 would just make this worse; sure, you get a higher read speed, but that only holds as long as what you're reading is one big contiguous file. RAID 1 won't really change it either, except for slower writes, though it's granted a lot more secure for keeping the data.
You may be better off just having 2 HDs with the site's data files split between them.
Seek times don't explain files like default.css taking a long time; that should be in cache anyway. I'd figure that out before buying new hardware.
Is Danbooru using the same nginx configuration as the one it outputs? I'd try increasing worker_processes: a worker blocked on I/O (your iowait) can't concurrently service other requests from cache. 8 or 16 seems worth a try.
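In nginx.conf that's a single top-level directive; a sketch:

    # with only a couple of workers, one blocked on a slow disk read stalls
    # every request it owns; more workers keep cached responses flowing
    worker_processes  16;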
MugiMugi said: RAID 1 won't really change it either, except for slower writes, though it's granted a lot more secure for keeping the data.
RAID 1 should definitely improve read performance significantly; the array can read two files at once without contention. It just seems like there's something more than disk access causing a problem here; buying more hardware without knowing what the problem is seems like optimizing without profiling.
Unfortunately, Rails (ActionPack specifically) isn't thread-safe, so each concurrent request essentially needs an entire process. That's one of many reasons why I've been thinking of migrating to Merb (http://merbivore.com/) in the long term.
The server right now has one SATA drive, currently using UDMA2.
I will try raising the worker process count in Nginx. I didn't even think of that.