I’m trying to debug a sporadic test failure issue. This is my current setup:
- Elixir 1.6.4
- Postgres 9.6.10
- Ecto 2.2.10
- Postgrex 0.13.5
While the specific query is complex (and uses a lateral join fragment), I’m not convinced that the error is directly related to the query. Plus I’ll need to take a little bit of time to reconstruct the query.
The issue manifests itself as an ordering issue, the rows that I request do not always come back in the order that I expect. The distribution of the errors makes me think that there is something odd going on.
To help reproduce the issue I tried two mechanism. One was running the query multiple times and the second was duplicating the test. Running the query multiple times was not helpful, if the query worked the first time it would work the 1000th time too. However, duplicating the test was more fruitful.
for i <- 1..400 do
test "query/2 ordering #{i}" do
# test the code here
end
end
Running this gave me some interesting results:
Run Number | # failures |
---|---|
1 | 19/401 |
2 | 0/401 |
3 | 0/401 |
4 | 29/401 |
5 | 100/401 |
6 | 137/401 |
7 | 67/401 |
8 | 0/401 |
9 | 1/401 |
10 | 0/401 |
11 | 0/401 |
12 | 0/401 |
13 | 0/401 |
14 | 0/401 |
15 | 0/401 |
16 | 0/401 |
17 | 0/401 |
18 | 0/401 |
19 | 0/401 |
20 | 123/401 |
21 | 157/401 |
22 | 0.00 |
23 | 0.00 |
24 | 271/401 |
It’s odd to me how there are so many runs that are completely successful and then the ones with failures usually have 20+ errors.
Does anyone have any ideas of what could be causing this?