We previously announced that Chris would be visiting the datacenter over the Memorial Day weekend to upgrade servers. The four goals of these upgrades are to:
- Improve the reliability of the forums by resolving the issue that causes the forums server to disconnect from the Internet for ten minutes at a time
- Increase coin daemon server capacity to support SHA-256 mining
- Increase the number of workers the mining server can support
- Increase the resiliency of the system to power issues
The maintenance will begin on May 26 and will continue periodically throughout the weekend. Chris will announce the start of each task on Twitter. Here's a summary of the work to be performed:
- Chris will replace the failing network card in the forums server with a new card that will not disconnect the system periodically. This will result in 30 to 60 minutes of downtime, for the forums only.
- Chris will assemble and install a new piece of dedicated hardware that will host only the mining server. The system will have the fastest single-threaded processor available today, which will double the performance of the mining server without any code changes. We expect to be able to support 6000 workers after this upgrade, up from the 3200 tested so far. When combined with software changes Chris plans to put out tonight, we can probably get the total up to 8000 concurrent workers. There will be 30 minutes of mining downtime while the virtual machine is copied to the new server and brought online.
- Chris will install a surplus HP server he purchased that just came off lease from a large corporation, which we estimate can support about 150 additional coin daemons on its 32 cores. No downtime is expected.
- Chris will install eight solid state disks in the existing and new coin daemon servers. Each of the four servers will have two additional SSDs installed. The servers are disk-limited, and these extra disks will provide for lower orphan rates, more up-to-date block explorers, and support for the 175 SHA-256 coins. Each server will be offline for an hour while its disks are installed, during which the coins on that server will be unavailable for mining and payouts. Mining will continue, but profitability will fall by a few percent during that hour.
- Chris will install an additional uninterruptible power supply. The power outage last weekend caused a surge when external power was restored and the generator turned off. He will balance the servers between the two UPSs, making it less likely that one of them will be overloaded. Because all servers have two power cords, there will be no impact from this maintenance.
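For anyone who wants the expected downtime at a glance, the figures in the list above can be tabulated with a small Python sketch. The task labels, the dictionary, and the helper functions are made up for illustration; only the minute values and the server and SSD counts come from the announcement.

```python
# Downtime per task in minutes as (minimum, maximum), taken from the list above.
# The task labels are illustrative, not official names.
DOWNTIME_MINUTES = {
    "forums network card": (30, 60),
    "mining server migration": (30, 30),
    "HP coin daemon server install": (0, 0),
    "SSD install, per coin daemon server": (60, 60),
    "second UPS install": (0, 0),
}

COIN_DAEMON_SERVERS = 4  # existing and new servers receiving SSDs
SSDS_PER_SERVER = 2      # additional SSDs installed in each server


def format_window(low, high):
    """Render a downtime range as readable text."""
    if high == 0:
        return "none expected"
    if low == high:
        return f"{low} min"
    return f"{low}-{high} min"


def summarize(downtime):
    """Return one summary line per task, in the order given."""
    return [
        f"{task}: {format_window(low, high)}"
        for task, (low, high) in downtime.items()
    ]


if __name__ == "__main__":
    for line in summarize(DOWNTIME_MINUTES):
        print(line)
    # Four servers with two SSDs each accounts for the eight disks mentioned.
    print("SSDs to install:", COIN_DAEMON_SERVERS * SSDS_PER_SERVER)
```

The SSD entry is per server, so only the coins hosted on the server being worked on are affected during its hour.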
Chris will be investing $2500 in these upgrades to provide an even higher level of service to our customers. After the upgrades, the server room will have 120 cores, 0.55TB of memory, 7TB of solid state disks, 8TB of hard disks, two massive switches, and two UPSs. Feel free to ask questions or comment.