
Status as of Monday, September 3, 2018

Posted: Mon Sep 03, 2018 8:40 am
by Steve Sokolowski
Good morning!
  • Now that we have seven algorithms, our major goal this week is to go through another round of coin maintenance. Chris has been hard at work adding new coins and repairing issues with existing ones. For example, he added images for most of the Neoscrypt coins, and he fixed some coins that the system believed were orphaned because the wrong hash was being used to detect whether blocks were present on the blockchain.
  • There appears to be a lingering issue with Ethash involving DAG generation. I'm hoping that the customer who reported it will reply so that we can determine whether there is any way to prevent miners from unnecessarily generating a new DAG every time coins are switched. There is no reason the DAGs can't be stored on disk and simply loaded, and we are wondering whether popular mining software has a switch that instructs it to do that (a sketch of the store-and-reload idea follows this list). For now, the excessive DAG creation time is factored in when deciding whether to switch coins.
  • Last night, the system went into a failsafe mode after it was unable to obtain prices from exchanges. After some investigation, I determined that the cause was that the server querying the exchanges had run out of disk space. When issues like this happen, we always add code to ensure that the particular issue can never occur again. Therefore, today I'm going to research whether there are free Linux utilities that can notify people through a text message or Telegram when disk space is low, to prevent the issue from recurring (a sketch of such a watchdog also follows this list).
  • I've implemented Quark mining, and the algorithm is ready for testing. However, as noted above, our bottleneck is the speed at which we can research new coins, so we won't release Quark until all the existing algorithms have their coins installed and their issues worked out.
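
To illustrate the store-and-reload idea from the Ethash item above, here is a minimal, hypothetical Python sketch. generate_dag() merely stands in for a miner's real generation routine, and the cache directory is a placeholder; whether any given mining software actually exposes a switch like this is exactly what we still need to find out.

    # dag_cache.py -- hypothetical sketch of caching generated DAGs on disk,
    # keyed by epoch, so that switching back to a coin skips regeneration.
    import os

    CACHE_DIR = os.path.expanduser("~/.dag_cache")  # placeholder location

    def generate_dag(epoch):
        # Placeholder: a real DAG takes minutes to generate and is GBs in size.
        return ("dag bytes for epoch %d" % epoch).encode()

    def load_or_generate_dag(epoch):
        os.makedirs(CACHE_DIR, exist_ok=True)
        path = os.path.join(CACHE_DIR, "epoch-%d.dag" % epoch)
        if os.path.exists(path):          # cache hit: load instead of rebuilding
            with open(path, "rb") as f:
                return f.read()
        dag = generate_dag(epoch)         # cache miss: generate once...
        with open(path, "wb") as f:       # ...and persist for the next switch
            f.write(dag)
        return dag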
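
Similarly, for the disk space issue, here is a minimal sketch of the kind of watchdog utility we're looking for, using only Python's standard library and the Telegram Bot API. The bot token, chat ID, threshold, and mount points are all placeholders; a real deployment would run something like this from cron.

    # disk_alert.py -- minimal disk-space watchdog (sketch; run from cron).
    import shutil
    import urllib.parse
    import urllib.request

    BOT_TOKEN = "123456:ABC-DEF"            # placeholder Telegram bot token
    CHAT_ID = "987654321"                   # placeholder chat ID
    THRESHOLD = 0.80                        # alert when a mount is >80% full
    MOUNTS = ["/", "/var/lib/blockchains"]  # placeholder mount points

    def send_telegram(text):
        # Telegram Bot API: https://api.telegram.org/bot<token>/sendMessage
        params = urllib.parse.urlencode({"chat_id": CHAT_ID, "text": text})
        url = "https://api.telegram.org/bot%s/sendMessage?%s" % (BOT_TOKEN, params)
        urllib.request.urlopen(url, timeout=10)

    for mount in MOUNTS:
        usage = shutil.disk_usage(mount)
        fraction = usage.used / usage.total
        if fraction >= THRESHOLD:
            send_telegram("Disk alert: %s is %.0f%% full" % (mount, fraction * 100))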

Re: Status as of Monday, September 3, 2018

Posted: Mon Sep 03, 2018 12:42 pm
by djliss
Have you looked at Webmin for Linux servers? Maybe there is a plugin that can do this. PRTG is also a good piece of monitoring software, free for up to 100 sensors, and it can query through SNMP or SSH.

Re: Status as of Monday, September 3, 2018

Posted: Mon Sep 03, 2018 10:58 pm
by bsock
SNMP???? Free. Been around since dirt. C'mon guys, it's disk space; you should have alerts across the board when usage goes over 70%, and a PO going out at 80% capacity in case there are shipping or stocking issues. At least that's how it works in the real world of SAN storage. This is a terrible excuse for a pool that has a track record of outages. If anything, monitor the disk space and the network. I love the pool and appreciate your work and efforts, and I stay at this pool in hopes that this would someday be in the past... but it isn't; it's been lingering since 2017. If you Google "linux script disk space monitoring," there is more than enough to go by. The blockchains are not that big.

"Plugging holes as we go" is how this post comes off - I really hope that's not how problems are being fixed over at PH. I'm not sure what context to take that in given this is a forum. You eventually run out of fingers plugging holes.
Steve Sokolowski wrote: "When issues like this happen, we always add code to ensure that the particular issue can never occur again. Therefore, today I'm going to research whether there are free Linux utilities that can notify people through a text message or Telegram when disk space is low, to prevent the issue from recurring."
Email phone-number@mms.att.net, or whatever the gateway is for your carrier. I send HTML MMS messages to my iPhone all the time, and it works great.
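
For what it's worth, here's a minimal sketch of that approach using Python's standard smtplib. The SMTP relay, credentials, and recipient number are placeholders for whatever your mail provider and carrier actually use (AT&T's MMS gateway shown).

    # mms_alert.py -- sketch of alerting through a carrier's email-to-MMS gateway.
    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "alerts@example.com"        # placeholder sender
    msg["To"] = "5551234567@mms.att.net"      # placeholder number at AT&T's gateway
    msg["Subject"] = "Disk alert"
    msg.set_content("Disk on the exchange-query server is 85% full.")

    with smtplib.SMTP("smtp.example.com", 587) as smtp:  # placeholder relay
        smtp.starttls()
        smtp.login("alerts@example.com", "app-password")  # placeholder credentials
        smtp.send_message(msg)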

Before the pool was ever close to fully stable, 10 more algos were added, and last I checked, one employee was added (or was it two?). Now you are "beta" testing Scrypt in production to "benefit the miners" by 3-5%, then another 5% (I didn't fully understand whether that was added to the original 3-5% gain, and it's been a while since I looked), when in turn the miner (at least the smaller guys) never benefits from the increase due to outages. Over time, as we all know, difficulty rises, and by the time it's resolved, it probably wouldn't have mattered.

Now, enough bitching from me, because I do think you all COULD have the best mining pool out there, at least in potential. Please don't bury yourselves by trying to do too much. It creates a ripple effect for those of us that aren't renting 100 miners out of a warehouse in another city; taking down a single L3 makes a difference to those of us that have fewer than 10. I've been mining with you guys for about a year straight, have seen a ton of growth and development, and definitely enjoy mining on PH. I'll say it again: please don't bury yourselves. It seemed like you were doing that a little while back when trying to make the site more aesthetically pleasing and provide a bunch of stats other pools do not, when in the end, it's about the payouts. Not knocking the stats or the graphics; indeed, PH is hands down the best looking. OK, I've said enough; hopefully this was more constructive than critical. Thanks PH!

Re: Status as of Monday, September 3, 2018

Posted: Tue Sep 04, 2018 8:08 am
by Steve Sokolowski
Actually, the blockchains are pretty big. It takes about 2TB to store all of them, but more importantly, they need to be stored on 960 PRO disks or better. ETH, for example, simply doesn't run on a hard disk or on any disk slower than that; it will never finish downloading its blocks.

In regards to the disk space issue, I think you'll agree that the easy part is doing a search and installing whatever we find. The hard part is thinking about what issues could occur in the first place, which is one of the many reasons that most situations cannot be tested in Dev. In this case, a coin sent much larger block templates than predicted; a debug logger, which happened to be running that day to research an unrelated issue, wrote them to disk; the disk filled up; and that crashed the WAMP server running on the same machine. Three very rare circumstances had to coincide for the WAMP server to crash. I'm sure this happens all the time with most sites; it simply isn't written about.
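
Incidentally, the specific failure here (an unbounded debug log filling the disk) is the sort of thing a size-capped logger contains by construction. Here's a minimal sketch using Python's standard logging.handlers, where the logger name, path, and sizes are all hypothetical:

    # Sketch: rotate a debug log so an unexpectedly chatty coin can't fill the disk.
    import logging
    from logging.handlers import RotatingFileHandler

    logger = logging.getLogger("block_templates")    # hypothetical logger name
    handler = RotatingFileHandler(
        "/var/log/pool/block_templates.log",         # example path
        maxBytes=100 * 1024 * 1024,                  # 100 MB per file
        backupCount=5,                               # keep at most 5 old files
    )
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)

    # Worst case, the log now occupies about 600 MB instead of growing unbounded.
    logger.debug("template received: %d bytes", 12345)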

What's great about this is that all the fixes made in response to past bugs worked successfully, proving that the system has become dramatically more stable over time. For example, the system disconnected miners when the prices stopped coming across WAMP, as designed. Shares from before then were flushed to the database, eliminating the need for share corrections; that behavior was added in response to a different bug in the past. Finally, as soon as Chris deleted a file from the problem server, everything returned to normal automatically, without restarts. Basically, over time, the problems we encounter become more and more unusual and unlikely, because every one is fixed so that it can never happen again.

In regards to the payout comment, you might be surprised to learn that we've pretty much concluded that profitability is not the only, or even the most significant, factor in customer retention. If it were, one would expect scrypt hashrate to have increased dramatically over the past three weeks, as we increased relative profitability by 10%. There was an increase, but it was short-lived. I'm thinking that a poll might be a good idea: it would ask customers what factors are most important to them in a pool, so that we can find out what, other than profitability, is attracting them here.

Re: Status as of Monday, September 3, 2018

Posted: Wed Sep 05, 2018 2:00 am
by bsock
Thanks for the thought-out answer, with supporting examples, Steve. That caught me off guard, so I wanted to clarify for myself. Much appreciated. 2TB these days isn't much if we really get down to it, especially for all of the blockchains, but you have your operational costs and your own ways of maintaining the mining pool(s) to get the best profitability and/or payouts for your company and the miners. Payouts and profitability seem more like a correlation, since the higher the profitability, the higher the payout, MOST of the time, but these are variables that cannot be pre-determined. It's a chicken vs. egg thing, I guess. I don't know what you are referring to with 960 PRO disks, as I am unfamiliar with them. Sounds like server hardware disks, hopefully flash cards or SSDs! I agree: ETH's blockchain, let alone any blockchain these days, should not AND cannot run on spindles and be efficient for the pool. I use the old-school Fusion-io (old company name; not sure who owns them now) all-flash cards in PCI-Express 8x or 16x slots, directly attached within the server. These are for our heavier backup workloads that contain millions of small files for indexing. I know they come in at least 3.4 TB capacities each. Just thought I would share. Perhaps the blockchains could be split across different devices, so that if one goes down, the others do not. [Albeit I don't know a lick about setting up a mining pool.]

Issues for Disk Space:
- Definitely monitor it more closely or expand capacity, which we both agree on, as well as scripting something into the code
- Corruption >> Are the disks that contain the blockchain in any type of RAID protection level?
- Capacity >> This sucks, but the easiest thing to do is create a spreadsheet of all the data you have and need on the servers (or accessible by them) and start adding up the capacities. Yep, it sucks. You guys could also code a function that not only monitors capacity (or any resource metric) at specific thresholds, but also knows the approximate sizes of the blockchains and their growth trends. Once you have enough data, it becomes easier to spot the anomalies. For instance, if the ETH blockchain tripled in size today, for some reason or in error, that would be a great indicator to send an alert that the blockchain sizes are growing at a quicker rate than the normal trend has shown over, hypothetically, the last two years.
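
Something like this minimal sketch is what I have in mind; the data directories, the history file, and the 3x factor are all illustrative.

    # trend_alert.py -- record each chain's on-disk size daily and alert when
    # today's growth far exceeds the recent average (sketch; run from cron).
    import json
    import os

    HISTORY = "sizes.json"                             # illustrative history file
    CHAINS = {"ETH": "/data/eth", "LTC": "/data/ltc"}  # illustrative data dirs

    def dir_size(path):
        total = 0
        for root, _, files in os.walk(path):
            for name in files:
                total += os.path.getsize(os.path.join(root, name))
        return total

    history = {}
    if os.path.exists(HISTORY):
        with open(HISTORY) as f:
            history = json.load(f)

    for chain, path in CHAINS.items():
        samples = history.setdefault(chain, [])
        size = dir_size(path)
        if len(samples) >= 2:
            deltas = [b - a for a, b in zip(samples, samples[1:])]
            average_growth = sum(deltas) / len(deltas)
            if average_growth > 0 and size - samples[-1] > 3 * average_growth:
                print("ALERT: %s grew abnormally fast today" % chain)
        samples.append(size)

    with open(HISTORY, "w") as f:
        json.dump(history, f)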

Maybe it's not that easy. Just thought I could offer some insight. As for WAMP, that's all you guys, hahaha. Maybe split up the services between different hosts? Redundancy is always a beautiful thing. Tomorrow, the 6th, the site will be down; with redundancy, it wouldn't have to be. Again, all of this takes money, time, resources, etc., and you guys seem up to your ears in it. Best of luck. I still do the majority of my mining here and plan to keep pointing new miners here in the future.

Re: Status as of Monday, September 3, 2018

Posted: Wed Sep 05, 2018 8:12 am
by Steve Sokolowski
Oh, there are many times that we take servers offline without anyone knowing about it; the system does have quite a bit of redundancy. In this case, though, we need to take the database server offline. Theoretically, we could program a way to have multiple database servers that synchronize with each other, but the issue would be cost. We decided that it's better to take the system offline 2-3 times per year than to spend $30,000 to hire someone to program and test all the software needed to support a live upgrade; dealing with parallel databases is very difficult and causes bugs. The losses to the pool from the downtime are only about $20,000, so we added redundant coin servers and stopped there, which is the maximum level of redundancy that makes sense economically.

Likewise, the coin servers don't use RAID. The blockchains are stored on single disks, and the hardware is cheap and old. We simply back up the executables every weekend; the keys are already backed up at creation time. If a disk ever fails, we'll simply copy the configuration to a new disk and let the blockchains redownload, and everything will be back up in a day or two. Surprisingly, that's never happened, as these disks are very good at slowing down and making it obvious that failure is imminent before they fail completely. The database uses RAID 1+0 for the earnings data and RAID 0 for the block explorer data, because the explorer data can also be easily recreated.

Re: Status as of Monday, September 3, 2018

Posted: Wed Sep 05, 2018 11:57 am
by sherm77
Steve, I have experience with monitoring tools. Nagios is a nice free app that you can install to monitor anything you want on the target system. I'm a Splunk architect, and I can tell you that the free version of Splunk (500 MB per day), coupled with a universal forwarder on the servers you need to monitor and the Splunk_TA_nix add-on deployed (assuming and hoping they aren't Windows), lets you easily monitor many of the system processes as well as df / disk space. I wouldn't trust the add-on's load average script, though; it's not right, and we had to create our own shell script to get load average. There are many monitoring scripts available in that add-on, and you control how frequently they run and which ones get triggered.
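
For reference, on Linux/Unix the load average can also be read without a shell script; a minimal Python sketch:

    # loadavg.py -- read the 1/5/15-minute load averages directly.
    # os.getloadavg() works on Linux/Unix (on Linux it reflects /proc/loadavg).
    import os

    one, five, fifteen = os.getloadavg()
    print("load average: %.2f (1m) %.2f (5m) %.2f (15m)" % (one, five, fifteen))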

If you need help with Splunk, I'd be happy to help. It's been my main job for 5 years, and I built and maintain the environment for a multi-billion-dollar retail food chain (employed there 10 years this month!).