How to prevent recompilation when copying in artifacts?

Hi,

What I’m trying to do is prevent some expensive Erlang source file generations and recompilations.

In our project we have ASN.1 protocols that blow up to 1.25M lines of Erlang. Even the ASN.1 files themselves are the result of a generation.

So our original process was:

  • Generate/extract .asn files from source documents.
  • Use asn1ct to generate Erlang codecs for them.
  • Compile the Erlang files alongside the Elixir sources.

The thing is, these files barely ever change.

Even with an optimized build order the Erlang files took up to two minutes to compile in parallel, so we basically copied .beam files in instead - this worked. The compiler detected their presence and did not recompile them. (We have a separate version check to detect if this copy is allowed.)

But generating the .asn files takes another 30+ seconds, so I wanted to copy both the .asn and the .erl files in. But now the Erlang compilation happens again, even if we copy the sources first and then the binaries - so even if the .beam files have a newer timestamp than the sources.

What am I missing?

Thank you.

Which build tools do you use for the individual steps?

Hi, @NobbZ.

In general I rely on mix for triggering everything.

The original process:

  • The ASN.1 file generation is handled by a mix task that calls shell/Awk scripts.
  • These shell scripts also invoke asn1ct.
  • The Erlang compilation is handled by mix. I invoke a custom compiler through mix, but it’s the same as the regular Erlang compiler support except that I compile in parallel and sort the files by size. (It still respects the regular stale file handling; I left everything else as is.)

The modified process:

  • I compare a version file by invoking “cmp” on the shell.
  • Inside the mix task I copy the sources and binaries with File.cp!.
  • Neither file generation nor asn1ct takes place.
  • Compilation would be handled again by mix.
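The modified process above can be sketched roughly as follows. This is only an illustration under assumptions: the module name `ArtifactCopy`, its function names, and the idea of passing `{source, destination}` pairs are hypothetical and not taken from the actual project. The version comparison is a pure-Elixir stand-in for calling `cmp` on the shell.

```elixir
defmodule ArtifactCopy do
  # Hypothetical helper, not the project's real task.

  # Returns true only when both version files exist and have identical
  # content (roughly what `cmp` reports on the shell).
  def same_version?(path_a, path_b) do
    case {File.read(path_a), File.read(path_b)} do
      {{:ok, a}, {:ok, b}} -> a == b
      _ -> false
    end
  end

  # Copies each {source, destination} pair in the given order, creating
  # destination directories as needed. File.cp!/2 copies unconditionally.
  def copy_all(pairs) do
    Enum.each(pairs, fn {src, dest} ->
      File.mkdir_p!(Path.dirname(dest))
      File.cp!(src, dest)
    end)
  end
end
```

Because the pairs are processed in list order, a caller can put the sources first and the binaries last, as described in the question.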

Is this the information you need? 🙂

Details might be really important here.

IIRC, mix relies on timestamps (mtime, I think) and content hashes to decide whether it has to recompile an ex/erl file into a beam file. Your custom tasks and the tools they invoke might use different metrics, though.
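To make the timestamp idea concrete, here is a minimal sketch of an mtime-based staleness check. It is a simplification written for illustration, not mix's actual logic, which may take further inputs into account. The module name `StaleCheck` is hypothetical.

```elixir
defmodule StaleCheck do
  # Hypothetical helper: treats the target as stale when it is missing
  # or when the source has a strictly newer mtime than the target.
  def stale?(source, target) do
    case File.stat(target, time: :posix) do
      {:ok, %File.Stat{mtime: target_mtime}} ->
        %File.Stat{mtime: source_mtime} = File.stat!(source, time: :posix)
        source_mtime > target_mtime

      {:error, _reason} ->
        true
    end
  end
end
```

Running something like `StaleCheck.stale?("src/foo.erl", "ebin/foo.beam")` on the copied files (the paths here are placeholders) shows whether the timestamps alone would explain a recompilation.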

Do you perhaps have some showcase project that can be used to reproduce and debug the issue?

Hi, @NobbZ.

I made this reproducer: DerKastellan/erl_recompilation on GitHub.

I played around with a few scenarios of deleting and copying and the behavior is a bit odd. See README.md for what my outcomes were.

Thanks!

File.cp!/2 copies unconditionally. There is no check in advance that would conditionally copy only what’s necessary or newer.
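If a conditional copy is wanted, it has to be written by hand. A minimal sketch, assuming that comparing file contents is an acceptable criterion; the module `CondCopy` and its function are hypothetical names, not part of Elixir's standard library:

```elixir
defmodule CondCopy do
  # Hypothetical helper: copies src to dest only when dest is missing or
  # its content differs from src. An unchanged dest is left untouched,
  # so its mtime is not modified.
  def copy_if_changed(src, dest) do
    if File.read(dest) == {:ok, File.read!(src)} do
      :unchanged
    else
      File.mkdir_p!(Path.dirname(dest))
      File.cp!(src, dest)
      :copied
    end
  end
end
```

This reads both files fully into memory, which is simple but may be slow for very large generated files.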

That is okay.

What I’m really mystified by is how it is decided that an Erlang file needs to be recompiled even though its associated binary is newer than it.

Erlang itself doesn’t really decide that; erlc simply compiles the module it is given, and so does whatever else calls the underlying Erlang functions.

That’s why it is important to have a reproducer that works exactly the same way as your problematic code base does, so that we can take a look at what is used when.

Sorry, I just realized I didn’t update this.

So, basically I had a missing file. With that fixed, it works just fine.
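For anyone hitting the same thing, a check along these lines might catch a missing file before compilation starts. It is only a sketch: the module name `BeamCheck` is hypothetical, and the assumption that every `.erl` file in one directory should have a same-named `.beam` in another directory may not match your layout.

```elixir
defmodule BeamCheck do
  # Hypothetical helper: returns the sorted base names of all .erl files
  # in src_dir that have no same-named .beam file in ebin_dir.
  def missing_beams(src_dir, ebin_dir) do
    src_dir
    |> Path.join("*.erl")
    |> Path.wildcard()
    |> Enum.map(&Path.basename(&1, ".erl"))
    |> Enum.reject(&File.exists?(Path.join(ebin_dir, &1 <> ".beam")))
    |> Enum.sort()
  end
end
```

An empty list means every source has a binary next to it; anything else names the modules that would be compiled again.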