
Status as of Tuesday, July 7

Posted: Tue Jul 07, 2015 9:20 am
by Steve Sokolowski
Here's today's status:
  • We made a significant breakthrough yesterday when we discovered that the mining server could block under heavy database load. The problem began whenever all 60 of the server's database connections were in use because the database was under heavy load. The code that inserts a new share row would then block while it waited for a connection. To resolve this, I moved that code into a new thread. The fix is now deployed, and it resolved the issue where acceptance of shares was delayed. A downside to this solution is that the number of threads can grow large if it takes minutes to insert shares into the database.
  • There was a problem earlier today where the mining server deadlocked with itself. We're working on that problem and will try to fix it tonight.
  • Chris found yesterday that the Fedoracoin network was 51% attacked, costing money. Most likely, the attack will result in the discontinuation of the coin.
  • We may be able to reduce the risk of deadlocks by upgrading our disks. We were fortunate enough to find two blocks of litecoins above what we needed for payouts yesterday. I was wondering if rootdude was willing to weigh in with a comment on whether it is safer to use 2x 1TB SSD in a RAID 0, compared to a single 2TB SSD. One would think that they have the same number of chips in both configurations, so the risk of the entire array failing is similar. But I wonder if the fact that there are two controller cards makes it more risky to use two disks. At the very least, it seems that the 1TB disks, striped, would be faster.
  • Thank Chris for upgrading these forums to the latest version. I'm curious to see what long-time users like kires look like, should they choose to edit their avatars.
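
For anyone curious about the first item, here is a minimal sketch of the general idea, not the actual mining server code. It assumes a blocking connection pool and hands each share insert to a worker so the caller never waits for a connection. Every name here (FakeConnection, accept_share, POOL_SIZE, MAX_WORKERS) is hypothetical, and it uses a fixed-size worker pool rather than one new thread per share, so the thread count stays capped.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2     # hypothetical stand-in for the server's 60 connections
MAX_WORKERS = 4   # hypothetical cap on insert threads


class FakeConnection:
    """Hypothetical stand-in for a real database connection."""

    def __init__(self):
        self.rows = []

    def insert_share(self, share):
        self.rows.append(share)


# A Queue works as a simple blocking connection pool.
connections = [FakeConnection() for _ in range(POOL_SIZE)]
pool = queue.Queue()
for conn in connections:
    pool.put(conn)


def insert_share(share):
    """Runs on a worker thread; may block waiting for a free connection."""
    conn = pool.get()  # blocks while every connection is in use
    try:
        conn.insert_share(share)
    finally:
        pool.put(conn)  # always hand the connection back


executor = ThreadPoolExecutor(max_workers=MAX_WORKERS)


def accept_share(share):
    """Called from the mining loop; returns without waiting for the insert."""
    return executor.submit(insert_share, share)


futures = [accept_share({"id": i}) for i in range(10)]
for future in futures:
    future.result()  # wait here only so the example can check the result

inserted = sorted(row["id"] for conn in connections for row in conn.rows)
executor.shutdown()
```

The fixed worker count trades one problem for another: threads no longer pile up, but the executor's internal queue is unbounded, so pending shares can still accumulate in memory if the database stays slow for minutes.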

Re: Status as of Tuesday, July 7

Posted: Tue Jul 07, 2015 12:02 pm
by kires
I like the new forum now that I found the option for the dark theme. The bright one was too... well, bright. As for the RAID question, I'll gladly defer to RootDude's opinion, but I've always adhered to three basic rules. 1. Be nice to people who might be alone with your food. 2. Never do anything else when you have to pee. And: 3. Never ever EVER use RAID 0 in prod. I think the increased (close enough to doubled as makes no difference) probability of it failing would be a deal breaker, if it were my kit. RAID 1 is the way to go, as the performance is good, and the failure of a single disk won't bring anything down. A single drive absolutely will fail at some point; RAID 1 just means you don't have to be down while you replace it. (okay, maybe a reboot, but still)
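
To put a rough number on "close enough to doubled", here is a back-of-the-envelope sketch. It assumes each disk fails independently with the same probability over some period, and the 3% figure is a made-up illustration, not a measured SSD failure rate. It also ignores rebuild windows, controller faults, and whether a single 2TB SSD really has the same failure rate as a 1TB one.

```python
def raid0_failure_probability(p_disk, n_disks=2):
    """RAID 0 is lost if ANY member disk fails (independence assumed)."""
    return 1 - (1 - p_disk) ** n_disks


def raid1_failure_probability(p_disk, n_disks=2):
    """RAID 1 is lost only if ALL mirrors fail (independence assumed)."""
    return p_disk ** n_disks


P_DISK = 0.03  # hypothetical per-disk failure probability for the period

single = P_DISK
striped = raid0_failure_probability(P_DISK)
mirrored = raid1_failure_probability(P_DISK)

print(f"single disk: {single:.4f}")
print(f"2x RAID 0:   {striped:.4f}")
print(f"2x RAID 1:   {mirrored:.4f}")
```

Under those assumptions the striped pair comes out at about 5.9%, just under twice the single-disk figure, while the mirrored pair drops to about 0.09%.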

Re: Status as of Tuesday, July 7

Posted: Tue Jul 07, 2015 12:13 pm
by rootdude
Hey guys -

I was out of touch most of the day yesterday, so I didn't see the bounce until this morning - shame I wasn't paying attention since yesterday was a signature prohashing bounty.

Speaking to the one versus two disks question - no doubt that two 1TB drives striped in RAID 0 will be faster and less prone to deadlocks or (write) cache hits than a single, larger disk. Essentially you are doubling the data bandwidth by spreading the load over two disks... so no doubt that two smaller disks in RAID 0 will perform better, and be much, much cheaper as well. No need for RAID 1, of course, unless there is no backup wallet running on another server.

Re: Status as of Tuesday, July 7

Posted: Tue Jul 07, 2015 1:03 pm
by Steve Sokolowski
No, we need these disks for the database server.

The strategy we would consider is the RAID 0 array for the data itself. This is risky, but we would write transaction logs out to a different disk that's a standard 4TB hard drive. Transaction logs are sequential writes, and data is never overwritten, so the hard drive would just be wiped every time a backup is made and it can start writing from the middle again. If one of the RAID disks ever failed, we would restore the database using the data on the hard drive. There would be an additional backup offsite.

The major questions with this setup are whether we might run out of space with only 2TB, and whether a two-disk RAID 0 array is too likely to fail. The chance of data loss is close to zero, since there would be both an onsite and an offsite backup. The problem is that a recovery from a failed SSD would likely require several days, because it takes a lot of CPU to run through the transaction logs, and also because we can't afford to have a 1TB SSD depreciating as a hot spare.

Re: Status as of Tuesday, July 7

Posted: Tue Jul 07, 2015 4:08 pm
by JarBinks
IMHO

I agree that as long as you can withstand losing whatever is on the volume, using 2x1TB striped disks will be significantly faster than 1x2TB disk.
It may be even better to do 4x512GB, or 8x256GB.

Related Things:
- Some SSDs are better than others, especially for enterprise applications such as this
- For best performance the RAID should be set up on the HD controller, not in the OS
- The size of the stripe being used can also have a significant impact. Since this appears to be DB-oriented, the stripe size should be aligned with the typical DB write size
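
As a rough illustration of the stripe-size point, this sketch counts how many stripe units a single write touches. The 8 KiB write size and the stripe sizes are hypothetical examples, not the actual database or controller settings, and the model ignores controller caching.

```python
def stripe_units_touched(offset, size, stripe_unit):
    """Count the stripe units covered by a write of `size` bytes at `offset`."""
    first = offset // stripe_unit
    last = (offset + size - 1) // stripe_unit
    return last - first + 1


KIB = 1024
DB_WRITE = 8 * KIB  # hypothetical typical DB write size

# Aligned write inside a 64 KiB stripe unit: stays on one disk.
aligned = stripe_units_touched(0, DB_WRITE, 64 * KIB)

# Same write starting 60 KiB in: straddles a stripe-unit boundary.
straddling = stripe_units_touched(60 * KIB, DB_WRITE, 64 * KIB)

# Stripe unit smaller than the write: every write is split.
split = stripe_units_touched(0, DB_WRITE, 4 * KIB)

print(aligned, straddling, split)
```

A write that crosses a boundary has to go to more than one disk, which is why a stripe unit that is a multiple of the usual write size, with writes starting on those boundaries, tends to be the safer choice.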