I have a resource that has the following fields with regexes:
  attribute :cnpj_basico, :string do
    public? true
    allow_nil? false
    constraints min_length: 8, max_length: 8, match: ~r/^[A-Za-z0-9]{8}$/
  end

  attribute :cnpj_sufixo, :string do
    public? true
    allow_nil? false
    constraints min_length: 6, max_length: 6, match: ~r/^\d{6}$/
  end

  attribute :cnpj_formatado, :string do
    public? true
    allow_nil? false
    constraints min_length: 18, max_length: 18, match: ~r/^\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}$/
  end
I need to bulk insert millions of rows into the database using this resource.
I noticed that casting the input is much slower when the match constraints are in place.
Here are the most expensive calls with match enabled, profiled with eprof:
#                                        CALLS       %  TIME  µS/CALL
Total                                     3676   100.0  2155     0.59
...
:crypto.strong_rand_bytes_nif/1              1    0.79    17    17.00
Ecto.Type.cast_fun/1                        13    0.84    18     1.38
Ash.Changeset.force_change_attribute/3      10    1.07    23     2.30
Enum."-map/2-lists^map/1-1-"/2              80    1.16    25     0.31
:lists.member/2                             47    1.16    25     0.53
:lists.keyfind/3                            40    1.16    25     0.63
:ets.match_object/2                          4    1.30    28     7.00
:re.import/1                               495   24.41   526     1.06
Core.Cnpj.Estabelecimento.persisted/0       41   38.61   832    20.29
And here are the most expensive calls when I remove the match constraints:
#                                                      CALLS       %  TIME  µS/CALL
Total                                                   3160   100.0   592     0.19
Spark.Dsl.Extension.persisted!/3                          43    1.69    10     0.23
Ash.Changeset.do_change_attribute/4                        7    1.86    11     1.57
:erlang.module_loaded/1                                   30    1.86    11     0.37
anonymous fn/1 in Ash.Changeset.expand_upsert_fields/2    37    2.20    13     0.35
Enum."-map/2-lists^map/1-1-"/2                            80    2.36    14     0.18
:crypto.strong_rand_bytes_nif/1                            1    2.53    15    15.00
Ash.Changeset.force_change_attribute/3                    10    2.70    16     1.60
:erlang.binary_to_atom/2                                   8    2.70    16     2.00
:ets.match_object/2                                        4    4.22    25     6.25
As you can see, the total time for the call went from 2155 µs to 592 µs.
The difference is even more pronounced when processing the data in bulk: with chunks of 10_000 rows, my times went from ~6 seconds to 0.3-0.5 seconds.
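To get a rough sense of how much the raw pattern matching costs on its own, here is a minimal standalone sketch that times only the three regexes over one 10_000-row chunk, outside Ash. The module name `RegexCostSketch` and the sample values are made up for illustration, and it does not reproduce Ash's casting path, so the number it prints is not comparable one-to-one with the profiles above.

```elixir
# Standalone micro-benchmark (no Ash involved).
# `RegexCostSketch` and the sample row values are hypothetical.
defmodule RegexCostSketch do
  # Build `n` identical sample rows that satisfy all three patterns.
  def rows(n) do
    for _ <- 1..n do
      %{
        cnpj_basico: "12ABC345",
        cnpj_sufixo: "000190",
        cnpj_formatado: "12.345.678/0001-90"
      }
    end
  end

  # Apply the same three patterns used in the resource's `match` constraints.
  def valid?(row) do
    Regex.match?(~r/^[A-Za-z0-9]{8}$/, row.cnpj_basico) and
      Regex.match?(~r/^\d{6}$/, row.cnpj_sufixo) and
      Regex.match?(~r/^\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}$/, row.cnpj_formatado)
  end

  # Returns {elapsed_microseconds, all_rows_valid?} for one chunk.
  def time_chunk(rows) do
    {micros, results} = :timer.tc(fn -> Enum.map(rows, &valid?/1) end)
    {micros, Enum.all?(results)}
  end
end

{micros, all_valid?} =
  10_000
  |> RegexCostSketch.rows()
  |> RegexCostSketch.time_chunk()

IO.puts("3 regex checks x 10_000 rows: #{micros} µs (all valid: #{all_valid?})")
```

If this takes far less than the ~5.5 seconds I lose per chunk, that would suggest the overhead is in how the constraint is looked up and prepared on each cast (the `:re.import/1` and `persisted/0` entries in the first profile) rather than in the matching itself.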
So, is there some way to optimize these checks in the Ash.Changeset calls during casting?
If not, is there some way for me to disable the regex check during bulk insertion to make it faster?