Poor count() performance with MySQL compared to PostgreSQL with large JSON datasets

Yesterday I was moving an app from PostgreSQL to MySQL. After successfully migrating data the app was working fine, but some requests were slow. Some quick debugging showed that a mundane query that took seconds when using MySQL 8.0, rather than milliseconds (that it did on PostgreSQL 9.6):

SELECT count(t0_.id) AS sclr_0 FROM trip t0_

After some basic tuning on the MySQL server, performance stayed the same. I just left it was odd, so I asked for some help on Twitter, got some good ideas but nothing made a difference. Ultimately I found that the JSON blobs I was using for unstructured data was killing count() performance in MySQL 8.0.

Twitter is great, but it's also a sinkhole for information, so I though I'd unroll here. The rundown:

  • Same hardware running MySQL 8.0 and Postgres 9.6
  • ~1.7GB dump size, ~1GB of which was JSON blobs
  • Around 1.5 million rows in problematic table
  • Schema and queries automatically generated by Doctrine ORM
  • Query duration on pgsql: ~one second on first run, 0.3 on repeat
  • Query duration on mysql: consistently from 1.5 to 2.5 seconds

Schema and query execution screenshots:

Screenshot of MySQL client displaying database schemaScreenshot of MySQL client running slow query

I tried a number of things in the linked thread (dropping duplicate indexes, doubling RAM, increasing innodb_buffer_pool_size, modifying SQL query to be more explicit, switching to MyISAM…), but query execution duration remained roughly the same. So around five times that of PostgreSQL.

I gave the schema another look and realized the JSON field is probably huuuuge. It was (I later checked it was ~1GB of the total of 1.7GB in the dump). I dropped the column and boom, a result in 0.1 seconds!

Screenshot of MySQL client running fast query (with JSON field dropped)

I could modify my app not to use this field, look into creating an index for the JSON field in MySQL or just stick with Postgres. I'm not in a hurry, and for now just wanted to share my experience. Which in a nutshell is: It seems that out-of-the-box PostgreSQL 9.6 is able to handle JSON blobs more efficiently than MySQL 8.0 does on the same resources, even for something that looks like a trivial query.

I hope this helps someone struggling with similar issues. Tread carefully when using JSON in MySQL.

-- Jani Tarvainen, 25/05/2020

P.S. Any discussion on Hacker News thread: https://news.ycombinator.com/item?id=23298691