Hey everyone!
I have a rather perplexing question. Our production app has been generating the following error message:
(MyXQL.Error) (1267) (ER_CANT_AGGREGATE_2COLLATIONS) Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8mb4_general_ci,COERCIBLE) for operation '='
That’s not terribly surprising, as the underlying DB was originally created on MySQL 5.6 (now on 5.7) and the latin1 defaults were what we used. We’re now processing events that contain UTF-8 data, like محافظه البحي
.
We’re not quite ready to make an overall character set and collation change to the DB - that may come in the following weeks. For now, we’re just looking to capture those specific errors and get a sense of how often they happen. Only, when attempting to run some test cases on this, we can’t replicate the behavior!
We’ve eliminated some possibilities that were obviously impacting the tests. Local MySQL was getting the Ecto defaults for UTF-8 initially on each run, so character set and collation were overridden to get back to what we have in prod. We disabled the SQL sandbox using unboxed_run(repo, function)
to allow these specific tests to persist their data so we could inspect it afterward. Somewhere, at a level we don’t understand yet, the tests are converting that value to what appears to be some other encoding.
MySQL client hitting the test DB shows this value being inserted for the above Arabic string:
Ù…ØØ§Ùظه البØÙ
We’ve no idea where or how that’s happening, and more-so, it’s rather annoying how it seems to pass silently on test but raises errors in production. Any thoughts or ideas to follow up on?