Downtime this morning
Posted: Tue Jul 18, 2017 1:08 pm
The pool went offline this morning due to server overload. I was asleep at the time, and although I have alerts set up on my phone to wake me when the server goes offline, the phone was accidentally muted, so I was not woken to fix the issue.
The underlying issue is that we are CPU-limited in the process of assigning coins to miners, and the server load is roughly proportional to the number of coins being assigned to miners. As we gain more hashrate, we need to distribute miners over more coins to prevent each network from finding blocks too frequently, which increases the mining server's CPU load. This morning, the markets were all around the same profitability, causing lots of coins to be mined at once. At the same time, 700+ GH/s of X11 hashrate came online. The server reached 100% CPU usage and was no longer able to process shares frequently enough. Eventually the server crashed because it ran out of memory to store additional shares.
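As a rough illustration of why more hashrate means more coins and therefore more load, here is a minimal sketch. It is not the pool's actual code: the per-coin hashrate cap, the function names, and the numbers are all made-up assumptions, and the load is simply taken to be proportional to the coin count as described above.

```python
import math


def coins_needed(total_hashrate_ghs: float, max_per_coin_ghs: float) -> int:
    """Smallest number of coins such that no coin receives more than the cap.

    max_per_coin_ghs is a hypothetical per-network limit, standing in for
    "do not let any one network find blocks too frequently".
    """
    if max_per_coin_ghs <= 0:
        raise ValueError("max_per_coin_ghs must be positive")
    return math.ceil(total_hashrate_ghs / max_per_coin_ghs)


def relative_load(total_hashrate_ghs: float, max_per_coin_ghs: float,
                  cost_per_coin: float = 1.0) -> float:
    """Assumed load model: CPU cost grows linearly with the number of coins."""
    return coins_needed(total_hashrate_ghs, max_per_coin_ghs) * cost_per_coin


if __name__ == "__main__":
    # Hypothetical figures: a 100 GH/s cap per coin, before and after
    # an extra 700 GH/s comes online.
    before = relative_load(500, 100)
    after = relative_load(1200, 100)
    print(before, after)
```

Under these assumptions, a sudden jump in hashrate pushes the coin count, and with it the load on a single-threaded process, up by the same factor.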
I am willing to compensate miners for the part of the time when the server was online and accepting shares, but I do not have exact data on when the server was accepting shares without recording them versus when it was completely offline. If someone could send me a private message with hashrate data from an external service, I would appreciate the information.
Steve is working on an architectural overhaul that will infinitely parallelize the mining server, which is currently predominantly single-threaded. However, that will not be complete for weeks, if not a few months, so I am currently trying to find ways to reduce server load in the meantime.
I know these issues are frustrating to you, and they are equally frustrating to me. With the growth in mining, managing server load occupies a majority of Steve's time and a lot of mine as well. Please realize we are working to resolve this issue but can't provide an immediate solution. I appreciate your patience while we work to improve stability.