Tuesday, December 29, 2015

On ProxySQL, MaxScale, Persistent Connection, response time, and bugs

Few days ago I came across the announcement that MaxScale 1.3 finally supports Persistent Connection.
ProxySQL supports persistent connection since it was a prototype (sometime back in 2013), therefore I am very happy that the MaxScale Team finally introduced a long waited feature.
Although, MaxScale implementation of persistent connection has a serious drawback and I would consider it as a serious bug (more details at the end of the article). A bug so serious that shouldn't reach any GA release.

Since I like running benchmark, and due the new feature in MaxScale, I thought it is a good time to compare again ProxySQL vs MaxScale, around 6 months after a previous benchmark .

Benchmark : ProxySQL vs MaxScale


Benchmark setup


The benchmark setup is very simple:
- a physical server (20 CPU cores) running sysbench , proxysql and maxscale
- a physical server (20 CPU cores) running 3 mysqld instances (1 master and 2 slaves) launched using MySQL Sandbox

Software version used:
- MySQL 5.6.28
- 0.4.12
- ProxySQL v1.1.0 stable
- MaxScale 1.3.0-beta

Configuration files can be found at the follow URLs:
ProxySQL
MaxScale

ProxySQL configuration was completed running these commands through the admin interface:

INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply) VALUES(1,1,'^SELECT.*FOR UPDATE$', 1, 1);
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply) VALUES(2,1,'^SELECT', 2, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;


Preliminary benchmarks confirm the conclusion of my previous blog post : MaxScale is very CPU intensive, therefore to make a fair comparison between the two proxies I ran both of them with only 1 worker thread.


Benchmarks were executed running sysbench against the local proxy (either ProxySQL or MaxScale) , using this command:

sysbench --max-requests=0 --test=oltp --mysql-user=rcannao --mysql-password=rcannao \
--mysql-db=test --oltp-table-size=1000000 --oltp-read-only=on --oltp-point-selects=1 \
--oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 \
--oltp-skip-trx=off --db-ps-mode=disable --max-time=60 \
--oltp-reconnect-mode=transaction --mysql-host=10.1.1.164 \
--num-threads=$thr  --mysql-port=$port run

What is important to to note is that the workload is read-only , it performs only point selects, and it will reconnect at the end of each transaction : this is a workload meant to check the performance benefit of Persistent Connection .


The benchmark will compare:
- ProxySQL configured with read/write split
- MaxScale with readwritesplit module (RW)
- MaxScale with readconnroute module (RR)
- MaxScale with readwritesplit module and persistent connection (RW-PC)
- MaxScale with readconnroute module and persistent connection (RR-PC)


Benchmark result



Here the graph of the benchmark result about throughput:



There are a lot of information, but also some unanswered questions.
At very low concurrency, ProxySQL is slightly slower.
At 64 connections, ProxySQL and MaxScale RW-PC have very similar throughput, and that is great since these two configurations have similar behaviors.
Always at 64 connections it seems that MaxScale without Persistent Connection has reached its maximum throughput: as throughput with Persistent Connection is higher, we can already conclude that this feature is indeed useful and improves performance. MaxScale RR-PC will continue giving more throughput than the others, but this is expected as this routing module is very simple.

At 256 connections, throughput of ProxySQL and MaxScale RR-PC are the only two that continue growing. That means that the other configurations have saturated 1 core and are unable to scale anymore, while ProxySQL continues providing all its feature and scales with just 1 core.

At 1024 connections, all proxies configurations have a drop in performance. Although the drop in performance in ProxySQL is marginal, the drop in performance in MaxScale is severe.
This confirms ProxySQL's ability to scale.


What about response time?





From this graph of response time we can note that at high concurrency ProxySQL is able to provide the better response time. Let's remove the response time for 1024 connections and compare at lower concurrency:



What about maximum response time? This is really interesting:



No blue columns ... did I forgot to add the response time of ProxySQL? No, I didn't forget, but the max response time of MaxScale is too high for a proper comparison.
Starting at 64 connections, the maximum response time of MaxScale becomes so high that it reaches 60 seconds: this matches the max time in sysbench, therefore we should conclude that at least one (or perhaps more) connection created by sysbench isn't able to complete a transaction until the end of the benchmark.
That needs to be further validated by its developers, but it seems that at high concurrency (even if 64 connections shouldn't be considered "high") MaxScale is only processing a subset of connections while completely ignoring others. If that's correct, this should be considered as a serious bug.

For further testing, I rerun sysbench with 256 connections (not a lot, at all!) against MaxScale for 600 seconds , and max response times was 600 seconds : that is, at least one transaction (maybe more) wasn't processed until all the other transactions were completed.
I am sure nobody wants a transaction stuck for and undefined amount of time because the proxy is busy processing other transactions.

For reference, here the graph of max response times without the off the charts values:





Severe bug in Persistent Connection


The release note of MaxScale 1.3.0 hides a very important detail that is available only in the Administration Tutorial , that I report here for reference:

Please note that because persistent connections have previously been in use, they may give a different environment from a fresh connection. For example, if the previous use of the connection issued "use mydatabase" then this setting will be carried over into the reuse of the same connection. [...] In exceptional cases this feature could be a problem.

If I read it correctly, this is not a feature but a severe series of bugs.
More details below.

MaxScale doesn't track current schema


$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE()" test
+------------+
| DATABASE() |
+------------+
| test       |
+------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE()" mysql
+------------+
| DATABASE() |
+------------+
| test       |
+------------+

I assume nobody wants this to happen : the second client believes to connect to schema "mysql" , but in reality it is connected to schema "test" .
Unless your application is using only one schema, I strongly discourage the use of persistent connection.

MaxScale doesn't track charset and returns incorrect encoding


Let's try to identify another possible problem:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE(), @@session.character_set_client" mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | utf8                           |
+------------+--------------------------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SET NAMES latin1" mysql

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE(), @@session.character_set_client" mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | latin1                         |
+------------+--------------------------------+

In this example the current database is always incorrect (as already pointed out previously), but also the character set is compromised/corrupted. This can be a serious issue for many application, as MaxScale is ignoring the charset as specific by the client.
That is, MaxScale is ignoring schemaname and charset as specific during the initial handshake.


MaxScale doesn't track autocommit


Same applies for autocommit ...

$ mysql -u rcannao -prcannao -h 10.1.1..164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SET autocommit=0"

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+


Here we can see another major issue/bug : an application could issue statements assuming autocommit=ON (the default) while in reality another client could have change it.

MaxScale doesn't track transactions


I think this is perhaps the most serious bugs of how Persistent Connection are implemented in MaxScale.
Without MaxScale, when a client disconnects its transaction should be rolled back.
Let's see what happens with MaxScale and Persistent Connection.

First, we create a transaction the way many applications do: SET autocommit=0 , followed by any DML :

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+
$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SET autocommit=0"

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+

So far, we have modified autocommit in a Persistent connection.
Second, let's run some SELECT statement:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT COUNT(*) FROM sbtest" test
ERROR 1046 (3D000) at line 1: No database selected

Ops, error ... I forgot that MaxScale ignores my request for a default schema ... Now I must specify it in the query itself!

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT COUNT(*) FROM test.sbtest" test
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT COUNT(*) FROM test.sbtest WHERE id < 1000" test
+----------+
| COUNT(*) |
+----------+
|      999 |
+----------+

All looks good so far. Let me check if there are active transactions:
 
$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE
$

No active transactions, that's good. Now, let's run a DML statement ...
 
$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "DELETE FROM test.sbtest WHERE id < 1000"
$

Let me check again if there are active transactions ...

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE
---TRANSACTION 2253315, ACTIVE 29 sec
$

Here is the bug! The client that issued the DML statement and started the transaction is gone/disconnected, yet MaxScale is holding a transaction open.
The bad news is that MaxScale doesn't track transaction no matter if they are started due to autocommit or an explicit START TRANSACTION . Here an example (after restarting maxscale) :

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "START TRANSACTION; SELECT 1 FROM test.sbtest LIMIT 1" test
+---+
| 1 |
+---+
| 1 |
+---+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE
---TRANSACTION 2253317, ACTIVE 2 sec

$


ProxySQL provides a safe environment with Persistent Connetion


By comparison, ProxySQL has a more mature implementation of Persistent Connection, and keeps track of the environment set by the client, ensuring that the environments of backend and frontend match.

ProxySQL tracks current schema


ProxySQL isn't affected by the same bug of MaxScale , and correctly tracks the schema as specific by the client:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE()" test
+------------+
| DATABASE() |
+------------+
| test       |
+------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE()" mysql
+------------+
| DATABASE() |
+------------+
| mysql      |
+------------+

ProxySQL tracks character set


ProxySQL isn't affected by the same bug of MaxScale , and correctly tracks the character set as specific by the client:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE(), @@session.character_set_client" mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | utf8                           |
+------------+--------------------------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SET NAMES latin1" mysql

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE(), @@session.character_set_client" mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | utf8                           |
+------------+--------------------------------+

ProxySQL tracks autocommit


Also in this case, ProxySQL isn't affected by the same bug of MaxScale , and correctly tracks the value of autocommit as specific by the client:

$ mysql -u rcannao -prcannao -h 10.22.20.164 -P6033 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SET autocommit=0"

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+  

ProxySQL tracks transactions


Also in this case, ProxySQL isn't affected by the same bug of MaxScale , and correctly tracks transactions, terminating them if required:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "START TRANSACTION; SELECT 1 FROM test.sbtest LIMIT 1" test
+---+
| 1 |
+---+
| 1 |
+---+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE

$

 

Why ProxySQL implements better Persistent Connection?


Since its initial implementation 2 years ago, ProxySQL was designed to handle frontends (clients) and backends (servers) as different entities, only connects them when needed, and remove the link between them as soon as possible.
When a client connects to ProxySQL no connection to any backend is established. When a client issues a request ProxySQL determines if the request needs a connection to the backend or not, and only if required it forwards the request to a backend. As soon as the request is completed, ProxySQL determines if the connection to the backend is still required, and if not it returns it to a connection pool.
That is: the connection to the backend is returned to the connection pool not when the client disconnects, but when the request is completed, assuming that it can be reused by other clients (for example if there are no active transactions).
Similarly, when a backend connection is linked to a client connection, ProxySQL will ensure that the environment is set correctly : schema, charset, autocommit, etc .
In other words, ProxySQL doesn't just implement Persistent Connection, but also Connection Multiplexing. In fact, ProxySQL can handle hundreds of thousands of clients, yet forward all their traffic to few connections to the backend.

As a final note, I invite everybody to try ProxySQL , now GA , and feel free to contact me for any question.