
Status as of Monday, October 2, 2017

Posted: Mon Oct 02, 2017 9:05 am
by Steve Sokolowski
Good morning!
  • Yesterday, we successfully completed a release. The release addressed three main issues. First, the mining server will now automatically recover if it shuts down due to a memory leak or some other issue. Second, the "duplicate worker name" error will now be applied to the existing session, rather than the new session, so that when a miner disconnects and reconnects before the original session timed out, the new session will be able to continue using that name.
  • Third, we significantly reduced CPU usage on the WAMP server by modifying the output to the website for coin status data. I wrote a library to calculate the differences between the previous data and the current data, and another one to reconstruct the data from the differences. That means that only the difference data needs to be transmitted to subscribers. The CPU usage was reduced from 60% to 40%, and bandwidth usage was reduced from 10Mbps to 5Mbps. There were no API changes in this release.
  • We think that we can optimize even further by doing the same for miner data, but the interface for miner data is a public API. We'll be investigating how many customers are using bots that would be affected by a change in the miner status API.
  • You can probably see that we've gotten ahead of the performance issues and that the releases aren't critical anymore. Therefore, we will be reducing the frequency of releases to once per week or two weeks to reduce downtime, unless urgent bugs are found.
  • We determined that the spam filtering with Microsoft accounts was caused by an SPF record that pointed to an incorrect IP address. Once we corrected the record, the ticketing system stopped receiving rejections, so we think that issue has been resolved.
  • The next task will be to improve the performance of the payout operations. The current process is to start a transaction, execute the INSERT commands, hold the transaction open while the coin daemon is contacted, and then either commit or roll back based on the result. Unfortunately, there are so many coins now that the database can fall behind while the tables are locked. There are a number of solutions we'll investigate. One is a risky one where we do the INSERTs and then roll back, execute the payouts, and then do the INSERTs again and commit. Another is to delay the payouts so that the 480 coins are paid throughout the day, which would still meet the time guarantee while letting the system catch up easily. The best option is to partition the balance tables by coin, which would also allow greater parallelism in the share inserters when payouts are not occurring.
  • I'm starting to think that the network connectivity issues are almost entirely unrelated to network connectivity and instead are caused by the system working as designed. For example, some coins have "charity blocks" or special format blocks that would be unprofitable for us to support, so those coins go into error periodically. But when we ask customers what password arguments they are using, they often don't reply. Bear in mind that if you submit a support ticket about network connectivity, you'll have to be willing to answer questions so we can get to the bottom of the issue. Unfortunately, we have to close tickets where the only line is "I can't connect" or "I keep getting disconnected." The stratum protocol does not provide for an error message to be returned in the authorization function.
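
For readers curious about the difference-based approach to coin status data described above, here is a minimal sketch of the idea in Python. It is not the library mentioned in the post, whose code and data format are not shown here. The function names `diff_status` and `apply_diff`, the flat dictionary layout, and the sample values are all assumptions made for illustration.

```python
from typing import Any, Dict


def diff_status(previous: Dict[str, Any], current: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the entries that changed between two snapshots (hypothetical format)."""
    changed = {
        key: value
        for key, value in current.items()
        if key not in previous or previous[key] != value
    }
    removed = [key for key in previous if key not in current]
    return {"changed": changed, "removed": removed}


def apply_diff(previous: Dict[str, Any], delta: Dict[str, Any]) -> Dict[str, Any]:
    """Rebuild the current snapshot from the previous one plus a delta."""
    rebuilt = dict(previous)
    for key in delta["removed"]:
        rebuilt.pop(key, None)
    rebuilt.update(delta["changed"])
    return rebuilt


# Made-up sample data: three coins, of which only one changes between snapshots.
old = {
    "litecoin": {"status": "ok", "hashrate": 120},
    "verge": {"status": "error", "hashrate": 0},
    "dogecoin": {"status": "ok", "hashrate": 45},
}
new = {
    "litecoin": {"status": "ok", "hashrate": 120},
    "verge": {"status": "ok", "hashrate": 30},
    "dogecoin": {"status": "ok", "hashrate": 45},
}

delta = diff_status(old, new)
# Only the "verge" entry needs to be sent; the subscriber rebuilds the rest.
assert apply_diff(old, delta) == new
```

In a scheme like this, a subscriber would need one full snapshot when it first connects and could then apply each delta in order. A missed delta would leave it out of sync, so a real implementation would probably also need some way to request a fresh snapshot.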

Re: Status as of Monday, October 2, 2017

Posted: Mon Oct 02, 2017 9:29 am
by dog1965
hmmmmm this all remains to be seen. Keep up the good work.

Re: Status as of Monday, October 2, 2017

Posted: Mon Oct 02, 2017 10:02 am
by gestalt
Good morning!

It's good to see that the WAMP changes resulted in so much improvement! One idea, though it may not be ideal, is to implement the changes in the public API as new functions and subscriptions instead of replacing the old ones. You can then update the docs to reflect the new API so that future integration will use the new stuff. Then you can immediately switch the website to using the new API and possibly get some improvement there. When usage of the old API falls low enough or after you have given enough time for people to switch, then you can shut off the old API. It also solves the problem of trying to figure out how much of the use is from 3rd party applications.
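
A rough sketch of what I mean, in Python. This is not the pool's actual API: the topic names `miner_status` and `miner_status_diff`, the `TopicRegistry` class, and the sample payloads are all made up for illustration. The old and new topics are served side by side, every request is counted, and the old topic is only listed as safe to shut off once its usage is low enough.

```python
from collections import Counter
from typing import Any, Callable, Dict, List, Set


class TopicRegistry:
    """Serves old and new topics side by side and counts requests for each."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], Any]] = {}
        self._deprecated: Set[str] = set()
        self.usage: Counter = Counter()

    def register(self, topic: str, handler: Callable[[], Any], deprecated: bool = False) -> None:
        self._handlers[topic] = handler
        if deprecated:
            self._deprecated.add(topic)

    def fetch(self, topic: str) -> Any:
        # Count every request so that use of the old topic can be watched over time.
        self.usage[topic] += 1
        return self._handlers[topic]()

    def safe_to_retire(self, max_requests: int) -> List[str]:
        # Deprecated topics whose request count is at or below the threshold.
        return sorted(t for t in self._deprecated if self.usage[t] <= max_requests)


registry = TopicRegistry()
# Old topic: full miner data on every update (hypothetical payload).
registry.register("miner_status", lambda: {"worker1": 100, "worker2": 95}, deprecated=True)
# New topic: only the differences (hypothetical payload).
registry.register("miner_status_diff", lambda: {"changed": {"worker2": 95}, "removed": []})

registry.fetch("miner_status_diff")  # the website switches to the new topic right away
registry.fetch("miner_status")       # an older bot still uses the old topic
print(registry.usage)
print(registry.safe_to_retire(max_requests=0))
```

The counts also answer the question of how many third-party bots depend on the old interface, without having to ask customers.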

Re: Status as of Monday, October 2, 2017

Posted: Mon Oct 02, 2017 1:38 pm
by jaybizz
I can say personally, that my L3's were having negligible connection issues until after the DDoS issues. Since those connection issues were consistently causing me to have ~90% or less efficiency (as opposed to ~99% before the DDoS issues) I've since pointed them elsewhere, as that's a pretty hefty hit when you include pool fees. I used no special password arguments, only names for my miners. There definitely seems to be something in the new network structure that is causing disconnects - I'd imagine having to do with the VPN solution to the DDoS - because I have a hard time believing it's a coincidence that my miners had so many issues (disconnect/failing over to backup pools) ever since then.

Re: Status as of Monday, October 2, 2017

Posted: Mon Oct 02, 2017 2:06 pm
by Steve Sokolowski
jaybizz wrote:I can say personally, that my L3's were having negligible connection issues until after the DDoS issues. Since those connection issues were consistently causing me to have ~90% or less efficiency (as opposed to ~99% before the DDoS issues) I've since pointed them elsewhere, as that's a pretty hefty hit when you include pool fees. I used no special password arguments, only names for my miners. There definitely seems to be something in the new network structure that is causing disconnects - I'd imagine having to do with the VPN solution to the DDoS - because I have a hard time believing it's a coincidence that my miners had so many issues (disconnect/failing over to backup pools) ever since then.
I hope that it's that easy. If it is, then the network issues will magically go away if we decide to expand once the lawyer replies.

Still, I'm not confident that the VPN is entirely responsible. There are likely a number of issues, many of which aren't related to network connectivity at all. There are many customers who have reported issues with network connectivity, only to help us find that the problem was case sensitivity of usernames, or coin errors, or some other problem.

Re: Status as of Monday, October 2, 2017

Posted: Mon Oct 02, 2017 2:53 pm
by Kfedorek
All seems to be good // better for me now.

Re: Status as of Monday, October 2, 2017

Posted: Mon Oct 02, 2017 3:25 pm
by vhmanu
Posted the same in the news section:

There are several good coins in error mode, and only doge_x trading pairs are being listed (only visually, it seems).

To name just a few coins currently in error mode:
Bridgecoin
Verge
Giantbirdcoin
Bitdeal
Gokucoin
Litebitcoin

Re: Status as of Monday, October 2, 2017

Posted: Mon Oct 02, 2017 7:10 pm
by simonjbcmm
Steve and Chris. Good work gentlemen! You guys are quite spectacular.

-Jonathon

Re: Status as of Monday, October 2, 2017

Posted: Tue Oct 03, 2017 9:54 am
by ruptan
Great job guys!