Status as of Sunday, November 6
Posted: Sun Nov 06, 2016 9:42 am
Greetings! Here's an update on what's happening now:
- The last seven days have been a whirlwind of non-project activity for us. On each of those days, I worked other jobs for between 11.5 and 18 hours, and Chris was away on trips half the time. We finally got back to a normal schedule today. There are still a few physical tasks left, like cleaning bathrooms and raking leaves, but after that I plan to focus completely on seeing whether any stability improvements can be made.
- The first issue that occurred yesterday was that a share was not inserted correctly into the database for an unknown reason, which caused the database to fall behind for 30 minutes. Chris restarted the server, and about 30 minutes of shares were lost. He credited everyone for the lost work. This is a rare issue that also happened about a month ago, and later today I'll try to figure out why it recurred. For now, everything is working normally. No money has been lost during these issues, although the "hashrate" graph may have missing data.
- My investigations into the A4 issues have led me to believe that there are bugs in the A4 firmware involving cooling. Because of the way scrypt works, a lot of work is performed right after a work restart, and the results of those calculations are then used in hashing with lower power usage. I think that issuing more work restarts causes the A4s, which were poorly tested, to overheat, because that's when the most work is performed. The problem is that LTC profitability has declined to 2.6 cents while we can still earn between 4 and 5 cents, so simply instructing people to use "c=litecoin" is not feasible because it loses a lot of money. The best currently available resolution is to use "g=off h=50 n=[some name]", which will earn you about 3% less than the average pool miner but will still beat litecoin by 60%, provided your miner doesn't overheat even at those settings.
- The overheating bug is this: once a core shuts down, it should restart after it cools to a safe temperature, but that doesn't appear to happen; it just crashes and never restarts. Most likely, when Innosilicon "tested" these miners in quality control, they connected to the litecoin network only and, once they got litecoin working, decided that the miners were ready for release. So, of course, what wasn't tested doesn't work well. Innosilicon has released new firmware, and I'm hoping that they addressed this issue in the latest release. Comments from miners who have installed the new firmware are appreciated.
- I discovered a major bug in daemon-network-management, the program we wrote to manage upload bandwidth usage. Without such a program, the 200 daemons we have installed would eat up all available upload bandwidth sending blocks to other nodes. daemon-network-management prioritizes coins that are submitting found blocks to the network, but it had not been running correctly. That explains why our orphan rates were much higher than those of other pools. The data from networks with larger blocks, like litecoin's, was being throttled. The forums and website were also being throttled. We fixed this issue on Tuesday, and because the orphan rates are calculated over the past month, profitability is increasing every day as fewer orphaned blocks are counted in the rates. We expect profitability to rise by 3% over the next three weeks as a result of this fix. Some networks, like dogecoin and litecoin, may see increases of up to 6% by month's end.
- I got the first version of the multiple algorithm server to run and accept connections, but work on that version has been delayed pending the resolution of Innosilicon A4 troubleshooting.
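
For anyone who wants to check the A4 numbers above, here is a rough sketch of the arithmetic in Python. It uses only the figures quoted in the post (2.6 cents for litecoin, 4 to 5 cents for the pool, and a 3% penalty for the "g=off h=50" settings). The function and constant names are made up for illustration, and the assumption that the 3% penalty applies directly to the 4 to 5 cent figure is mine.

```python
# Figures quoted in the post; the unit is simply "cents" as written there.
LTC_ONLY = 2.6            # litecoin-only profitability
POOL_RANGE = (4.0, 5.0)   # what the pool can still earn
PENALTY = 0.03            # "g=off h=50" earns about 3% less than the average pool miner


def advantage_over_ltc(pool_earnings, penalty=PENALTY, ltc=LTC_ONLY):
    """Fractional gain of the reduced-restart settings over litecoin-only mining."""
    return pool_earnings * (1 - penalty) / ltc - 1


for earnings in POOL_RANGE:
    gain = advantage_over_ltc(earnings)
    print(f"pool at {earnings:.1f} cents: {gain:+.0%} vs litecoin")
# pool at 4.0 cents: +49% vs litecoin
# pool at 5.0 cents: +87% vs litecoin
```

Under these assumptions the quoted 60% sits inside the resulting range and corresponds to pool earnings of about 4.3 cents.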
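
The gradual rise in profitability after the bandwidth fix follows from the orphan rates being averaged over the past month: the bad days only leave the average one at a time. The sketch below illustrates that effect. The 30-day window, both orphan rates, and the assumption of an equal number of blocks per day are hypothetical placeholders, not measured pool data.

```python
WINDOW_DAYS = 30     # the post says "the past month"; 30 days is an assumption
RATE_BEFORE = 0.08   # hypothetical daily orphan rate before the fix
RATE_AFTER = 0.02    # hypothetical daily orphan rate after the fix


def windowed_orphan_rate(days_since_fix, window=WINDOW_DAYS,
                         before=RATE_BEFORE, after=RATE_AFTER):
    """Average orphan rate over the trailing window, assuming equal block counts per day."""
    good_days = min(max(days_since_fix, 0), window)  # days in the window after the fix
    bad_days = window - good_days                    # days in the window before the fix
    return (bad_days * before + good_days * after) / window


for day in (0, 5, 15, 30):
    print(f"day {day:2d}: reported orphan rate {windowed_orphan_rate(day):.1%}")
```

With these placeholder numbers the reported rate falls in a straight line from the old value to the new one and only fully reflects the fix once the whole window postdates it.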