Page 1 of 1

Long running tasks?

Posted: Thu Jul 30, 2015 3:13 pm
by rootdude
System status
A long-running task has delayed the recording of shares since 9 minutes ago. Statistics will be out-of-date until the task has completed and all shares have been recorded to the database.
Been seeing a lot of this the last two days guys - any thoughts?

Re: Long running tasks?

Posted: Thu Jul 30, 2015 3:25 pm
by Steve Sokolowski
We discovered an issue with the trader. The code opens a transaction, attempts to insert a row into the trade table, and ensures that no constraints are violated (i.e. if there was an accounting error where the balance would be less than zero of a coin). Then, it attempts to execute the trade. If executed successfully, it commits; if the trade fails, it rolls back. That way, we are ensured that trades cannot occur if a bug has caused balances to go haywire, and the database is definitely able to be updated if the trade succeeds.

But because we update the sell balance, and then update the buy balance, deadlocks can occur if a block is found for the buy coin (which increases the buy coin's balance) after the sell coin's balance is updated. To get around that, about a year ago someone added a "LOCK TABLE IN SHARE ROW EXCLUSIVE MODE" to the trigger, which worked fine with Cryptsy at the time. That wasn't a problem until we added exchanges like Ccex, which throttle sells to as slow as 1 per second. Therefore, no shares can be inserted for 1s while this lock was held.

We didn't figure this out until today. We were able to get rid of that lock by issuing that query like the following:

Code: Select all

	UPDATE status_pool_balance SET balance = CASE
		WHEN coin_id = NEW.buy_coin_id THEN balance + NEW.buy_balance
		WHEN coin_id = NEW.sell_coin_id THEN balance - NEW.sell_balance
	END,
	date_created=NOW()
	WHERE coin_id IN (NEW.buy_coin_id, NEW.sell_coin_id);
As you can see, this code updates two rows in one statement. Only the rows are locked, instead of the entire table. Unless the coin being traded is the same one as the share being inserted, which is much rarer than any coin being traded, this statement no longer blocks all share inserts.

You should be aware that these messages mean that the system is working as designed: it's simply queueing shares to be inserted later. The same amount of money is earned; it just doesn't display in real-time until it catches up.

Re: Long running tasks?

Posted: Thu Jul 30, 2015 3:48 pm
by rootdude
Quite an elegant solution (updating two rows) to elimnate DB blocking. Carry on! Love the transparency and the explanation.
Steve Sokolowski wrote:We discovered an issue with the trader. The code opens a transaction, attempts to insert a row into the trade table, and ensures that no constraints are violated (i.e. if there was an accounting error where the balance would be less than zero of a coin). Then, it attempts to execute the trade. If executed successfully, it commits; if the trade fails, it rolls back. That way, we are ensured that trades cannot occur if a bug has caused balances to go haywire, and the database is definitely able to be updated if the trade succeeds.

But because we update the sell balance, and then update the buy balance, deadlocks can occur if a block is found for the buy coin (which increases the buy coin's balance) after the sell coin's balance is updated. To get around that, about a year ago someone added a "LOCK TABLE IN SHARE ROW EXCLUSIVE MODE" to the trigger, which worked fine with Cryptsy at the time. That wasn't a problem until we added exchanges like Ccex, which throttle sells to as slow as 1 per second. Therefore, no shares can be inserted for 1s while this lock was held.

We didn't figure this out until today. We were able to get rid of that lock by issuing that query like the following:

Code: Select all

	UPDATE status_pool_balance SET balance = CASE
		WHEN coin_id = NEW.buy_coin_id THEN balance + NEW.buy_balance
		WHEN coin_id = NEW.sell_coin_id THEN balance - NEW.sell_balance
	END,
	date_created=NOW()
	WHERE coin_id IN (NEW.buy_coin_id, NEW.sell_coin_id);
As you can see, this code updates two rows in one statement. Only the rows are locked, instead of the entire table. Unless the coin being traded is the same one as the share being inserted, which is much rarer than any coin being traded, this statement no longer blocks all share inserts.

You should be aware that these messages mean that the system is working as designed: it's simply queueing shares to be inserted later. The same amount of money is earned; it just doesn't display in real-time until it catches up.

Re: Long running tasks?

Posted: Thu Jul 30, 2015 4:59 pm
by Chris Sokolowski
Another thing to mention is that we are running our blockchain indexer at the maximum possible speed to try to get all the block explorers up to date as soon as possible, which is causing high load on the database. Once the chains are indexed, then the system will enter a steady state where it only needs to index new blocks, and database load will be reduced.