Upcoming system downtime
Posted: Fri May 28, 2021 9:55 am
It's the time of year for the major system upgrade. Right now, we're targeting Monday, May 31 at 11:00pm EDT to Tuesday, June 1, at 3:00am EDT.
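For anyone outside the Eastern time zone, here is a minimal sketch that converts the planned window to UTC and checks its length. It assumes EDT is a fixed UTC-4 offset and that the dates above do not change. The names `EDT`, `start`, `end`, and `to_utc` are illustrative only and are not part of any of our systems.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EDT is UTC-4. A fixed offset avoids needing a time zone database.
EDT = timezone(timedelta(hours=-4), "EDT")

# Planned window from the announcement (still subject to change).
start = datetime(2021, 5, 31, 23, 0, tzinfo=EDT)
end = datetime(2021, 6, 1, 3, 0, tzinfo=EDT)


def to_utc(moment):
    """Return the same instant expressed in UTC."""
    return moment.astimezone(timezone.utc)


# Length of the window in hours.
duration_hours = (end - start).total_seconds() / 3600

print("Start (UTC):", to_utc(start).strftime("%Y-%m-%d %H:%M"))
print("End (UTC):", to_utc(end).strftime("%Y-%m-%d %H:%M"))
print("Duration (hours):", duration_hours)
```

To get your own local time, replace `timezone.utc` in `to_utc` with your own offset.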
We need to perform these upgrades about once every one or two years to bring our database up to the latest version of PostgreSQL. At the same time, we take the opportunity to make other major upgrades so the downtime is limited to one episode per year. The last upgrade occurred on May 15, 2020, and that upgrade involved moving the servers to a new location and upgrading to PostgreSQL 12.
This year, we are going to upgrade PostgreSQL to version 13 and replace the third-generation database server, which has been running continuously since 2013, with a fifth-generation server that has twice as many (80) cores, more and faster RAM, and disks that are three times as fast. More importantly, the motherboards of these newer servers are more reliable, and we think this upgrade might avert a sudden failure of the obsolete equipment at some point in the future. Besides the motherboard, it's also possible that the wear on the solid-state disks in the current server might approach one exabyte of writes per disk by the next upgrade window.
While the current server is sufficient to handle demand, we want to upgrade during the bear market to be ready for the next surge in customers. The new server will likely triple the amount of work the database can handle. We decided to perform the upgrade on short notice because it is possible that a significant number of ethash miners are going to join the pool next week, and we don't want to perform this upgrade two or three months from now and inconvenience them too.
Vance suggested that we investigate creating a replication server using identical equipment, so that downtime could be limited to 30 minutes instead of four hours. The longest portion of the downtime is copying data to a backup disk, and then reformatting the original disk and copying the data back. We considered buying additional disks so that only one copy operation would be required. However, we ultimately determined that it would cost more than $3,000 for just this one upgrade because of Chia mining. Chia mining has pushed up the cost of disks so high - disks now cost more than they did in 2015 - that it is no longer feasible for anyone to have excess server capacity.
All services except for E-Mail and the forums will be offline during this upgrade. Chris will post updates over the weekend once the upgrade plan is finalized.