Memory leak present
Posted: Fri Dec 08, 2017 4:51 pm
We discovered a memory leak in the mining servers that is causing them to crash every few hours. When a mining server crashes, the system automatically restarts it, and the affected miners see "no valid shares yet" messages. Previously, the load-balancing algorithm would then leave the restarted server at 25% capacity and overload the others, but we fixed that.
The problem is that the number of blocks in memory keeps increasing over time, as if some of them aren't being garbage collected due to excess references. I'm investigating the problem and hope to have more information on it tomorrow, although memory leaks are very difficult to reproduce and resolve. Until then, you might see disconnects every six hours or so followed by one minute of downtime while a particular server restarts.
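For anyone curious what "not garbage collected due to excess references" looks like in practice, here is a minimal sketch in Python. It is not the server's code: the `Block` class, the `recent_by_height` table, and `process_block` are all hypothetical names made up for illustration, and the language is only an assumption. The idea is to count live objects through weak references, which do not keep anything alive, so a count that only ever climbs points at something still holding on to old blocks.

```python
import gc
import weakref


class Block:
    """Hypothetical stand-in for a block held in server memory."""

    # A WeakSet does not keep its members alive, so it only ever
    # contains blocks that something else still references.
    _live = weakref.WeakSet()

    def __init__(self, height):
        self.height = height
        Block._live.add(self)

    @classmethod
    def live_count(cls):
        return len(cls._live)


# Hypothetical "excess reference": a lookup table that is never pruned.
recent_by_height = {}


def process_block(height):
    block = Block(height)
    recent_by_height[height] = block  # the leak: this entry is never removed
    # ... the real work on the block would happen here ...


for h in range(1000):
    process_block(h)

gc.collect()
print(Block.live_count())  # 1000: every block is still reachable

recent_by_height.clear()   # drop the stray references
gc.collect()
print(Block.live_count())  # 0: the blocks can now be collected
```

Logging a counter like `Block.live_count()` periodically is one way to confirm that the block count grows without bound, and `gc.get_referrers(obj)` from the standard library can then help find which container is still holding a given object.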