Dialyzex - an alternative Mix task for dialyzer

seancribbs · November 28, 2017, 5:51pm

Today I released a new dialyzer Mix task as the dialyzex package! At the time we started writing this task, the existing dialyzer integrations for Mix were missing some features that we needed, or operated in ways we didn’t prefer. If you’re using one of the existing tasks, I encourage you to try ours out and give us feedback.

The primary distinguishing features of this task vs. existing options are/were:

It builds separate PLTs for Erlang, Elixir, and your project dependencies automatically, without combining them into a massive PLT that has to be repeatedly rebuilt.
It calls the dialyzer OTP library directly rather than shelling out to the executable.
It uses a stricter set of warnings by default.
You can ignore particular emitted warnings by specifying match patterns in your project configuration. This is especially useful if an upstream library has a bug that you cannot fix immediately, but you want your project to build cleanly.
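To illustrate the idea behind pattern-based ignoring (the last item above), here is a minimal Python sketch. It is not dialyzex's implementation or configuration format: the warning shape `(tag, location, detail)`, the `ANY` wildcard, and the function names are all assumptions made for illustration.

```python
# Hypothetical sketch: warnings are modelled as nested tuples and an ignore
# pattern is a tuple of the same shape in which ANY matches anything.
ANY = object()  # wildcard marker (an assumption, not a dialyzex name)


def matches(pattern, value):
    """Return True if value fits pattern, treating ANY as a wildcard."""
    if pattern is ANY:
        return True
    if isinstance(pattern, tuple) and isinstance(value, tuple):
        return len(pattern) == len(value) and all(
            matches(p, v) for p, v in zip(pattern, value)
        )
    return pattern == value


def filter_warnings(warnings, ignored_patterns):
    """Keep only the warnings that match none of the ignore patterns."""
    return [
        w for w in warnings
        if not any(matches(p, w) for p in ignored_patterns)
    ]


# Made-up example data: one warning from an upstream library, one from our code.
upstream = ("warn_contract_types", ("deps/upstream/lib/upstream.ex", 10),
            ("invalid_contract", "Upstream.run/1"))
ours = ("warn_return_no_exit", ("lib/mine.ex", 3),
        ("no_return", "Mine.loop/0"))

# Ignore every invalid_contract warning regardless of location or function.
ignored = [("warn_contract_types", ANY, ("invalid_contract", ANY))]
remaining = filter_warnings([upstream, ours], ignored)
```

With these inputs, only the warning from `lib/mine.ex` remains, so an upstream bug no longer fails the build while the project's own warnings still do.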

Here’s all the usual stuff:

I wrote a little announcement on Twitter as well: https://twitter.com/seancribbs/status/935532065445044224

christhekeele · November 28, 2017, 6:39pm

I was just playing with adding dialyxir to a CI build; now I’m evaluating this instead (for that layered PLT caching)!

QQ: I’m running mix dialyzer --check=false on my project but I’m seeing warnings coming from my Elixir install and project dependencies themselves. I thought the check flag was meant to suppress that behaviour. Do I have it wrong?

benwilson512 · November 28, 2017, 6:44pm

The thing I think all of these dialyzer packages could use is a “Getting Started” guide that handles a bunch of common issues that show up when using it with Elixir. I’ve tried to use it on various projects before and it raises all kinds of issues about protocols, various functions that are missing, etc.

I can get used to reading the Erlang output of the errors themselves, I just don’t know what to do with all the “missing function x” stuff where x definitely exists.

christhekeele · November 28, 2017, 7:02pm

Ah, I just found this behaviour (the persistent dependency warnings despite --check=false being set) explained under the caveats section.

Perhaps an additional mix dialyzer.clean task could assist with this? I’ve wished for a similar task through dialyxir before too. It’d be nice to not have to know how the specific dialyzer package in use has built its PLTs and where it’s stashed them to get a clean read.

seancribbs · November 28, 2017, 7:24pm

That’s a great idea! Would you file it as a feature request on the GitHub repo?

seancribbs · November 28, 2017, 7:27pm

Thanks for that comment, Ben. One thing that I personally forget sometimes is that I’ve worked for a long time with Dialyzer and understand most of the warnings. I’ll prepare a guide to them soon.

seancribbs · November 28, 2017, 7:29pm

--check=false disables validation of the existing PLTs. You’ll still have to build them at least once. In a CI build, I would not use that flag.

christhekeele · November 28, 2017, 7:43pm

Done, and cross-referenced!

So should I separate building my PLTs from checking my project in a CI environment, somehow?

My understanding is that if I invoke mix dialyzer as a CI job (without the flag) and persist the ~/.cache folder between runs, the first run will build three PLTs: one each for Erlang, Elixir, and the deps. Then it will perform checks, but since the PLTs were built without the check flag, upstream warnings will fail my build.

When I set this to false, it will never fail my build because of issues in upstream PLTs, whether or not this is the first time creating them, which seems like the desired behaviour during CI. Is there a more correct way to set this up that you recommend?

benwilson512 · November 28, 2017, 7:44pm

That would be incredibly useful, thank you!

christhekeele · November 28, 2017, 7:50pm

I would love this. What do you think about placing such a guide as its own page, alongside the typespec docs, within Elixir docs itself?

It doesn’t seem efficient to have every dialyzer package owner author their own guide, since the warnings are identical across packages; it makes more sense to me if they all could just link to an upstream Elixir guide.

seancribbs · November 28, 2017, 7:58pm

That’s an excellent idea. I will start authoring one myself, but it could ultimately live in the Elixir documentation.

jeremyjh · November 28, 2017, 11:12pm

Hey, thanks for contributing in this space. I wish you’d consider contributing to dialyxir, but competition is healthy too.

Still, there are only two other dialyzer solutions for Elixir that I know of, and one of them is dormant and redirects to Dialyxir, which I maintain. So, when you say in your README “Existing Solutions may not…”, it’s hard to think of it as referring to anything else; but maybe that’s just my perspective.

Anyway, for the record:

These features you mention exist in Dialyxir:

ANSI-colored output.
Exits non-zero when dialyzer produces warnings (good for continuous integration usage).
Calls the dialyzer OTP library directly rather than shelling out to the executable (it used to shell out, but hasn’t since 0.5, which was released in February).

These do not:

Defaults to the strictest set of warnings available in Dialyzer, except for the few that are overly expensive. - This is a non-goal, and in fact it is how Dialyxir behaved for its first couple of years of life. Fish’s argument convinced me it’s not the best for the community (and some of the costs are externalized from the project maintainer to Stack Overflow, IRC, and the forums). Of course you can turn on more flags, and I could see adding a strict option to turn that set on all together, but if you want a different default, maybe we need two different packages…
Layered PLT files - Dialyxir maintains separate Elixir and Erlang PLTs but combines them with the application dependencies in the project directory. Using multiple separate files at runtime is a better idea and would be worth adding.
Ability to ignore acceptable warnings based on match patterns. - This is probably worth adding to Dialyxir. The current string-based ignore matching is easy to understand and use but not as flexible.

Finally, I’d encourage you to read Jose’s issue on Dialyxir. Interpretation and explanation of the error messages produced by Dialyzer is one place where there is really a lot of room for improvement and innovation in this space.

mbuhot · November 29, 2017, 12:23am

I love the automatic incremental dialyzer support in Elixir-LS. It basically removes all the ‘dialyzer is slow’ pain.

Any way to get something like that into the mix tasks? E.g. using the output of git ls-files -m and only analyzing those modules?

jeremyjh · November 29, 2017, 12:59am

I like the idea and have long thought it is needed. Still, there are lots of details to work out. The warning could get fixed in another module - e.g. in the function head rather than the call site, so I don’t think just looking at the time-stamps would quite do it. Also should it go on reporting warnings in files that haven’t changed? I would guess a language server doesn’t have to worry about that because the editor can maintain the list of previously reported warnings, but probably a mix task should go on reporting them.

Honestly, I had not kept up with Elixir-LS; it’s really come quite far. Editor integration is how Dialyzer should ideally be used for interactive purposes.

The mix tasks are good for CI but maybe we need to focus on getting Elixir-LS into every editor and rally around that.

NobbZ · November 29, 2017, 8:36am

I do not like this feature, as it may break at any time: it uses an internal API of dialyzer. I’d be much happier with it if @JakeBecker had taken the necessary steps to harden the API upstream. I fear the point when OTP 23 is current and the internal dialyzer API has changed, but I still have to maintain a project on OTP 20. I would then have to work against my editor, tricking the auto-update mechanism into using a three-year-old plugin instead of the current one for just this piece of old software, which would also keep me from using newer features of the plugin.

fishcakez · November 29, 2017, 9:05am

Somehow the forum sent me an email about this thread… Anyway since it did I am chuffed to see a new effort for dialyzer and Elixir.

I believe the multi-layer PLT approach used by this library sacrifices analysis for speed. I haven’t tried the task, only quickly skimmed the code. It should be easy to measure the speed increases; the loss of analysis, however, is harder to quantify. The Elixir PLT has no context of the OTP PLT and will treat unknown types, arguments and returns as any term. Similarly, the deps PLT doesn’t have the context of either of those PLTs. When running success typing with multiple PLTs, dialyzer does not merge the PLTs. Therefore the analysis runs with PLTs that have weaker type specifications. However, the analysis will still be valid and will not have false positives, because it will still only warn when something is always wrong; it can just do this less often. I think the hashed name for the deps PLT is a great way to keep the task simple and fast for the majority of runs.

My gut feeling is that this approach will be much nicer for local usage, but if just using it in CI I think a stronger PLT approach would be more appropriate. I’ll try to follow up about this tradeoff in an issue when I have time to show some examples.

Please reconsider the 3rd feature, strict by default, because it will be such a time sink for inexperienced dialyzer users that the tool can become detrimental for them. It takes quite a lot of time to understand what underspecifications are, let alone when they should and should not be ignored.
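As a rough illustration of why a hashed file name keeps the deps PLT cheap to manage, here is a minimal Python sketch. It assumes the hash covers dependency names and versions; what dialyzex actually hashes is not stated in this thread, and `deps_plt_name` and the file-name layout are hypothetical.

```python
import hashlib


def deps_plt_name(deps, prefix="deps"):
    """Derive a PLT file name from a {dependency name: version} mapping.

    Sorting makes the name independent of the order the dependencies are
    listed in, so the same dependency set always maps to the same file.
    """
    canonical = "\n".join(
        f"{name}@{version}" for name, version in sorted(deps.items())
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}-{digest}.plt"


# Made-up dependency sets for illustration.
before = deps_plt_name({"plug": "1.4.3", "poison": "3.1.0"})
reordered = deps_plt_name({"poison": "3.1.0", "plug": "1.4.3"})
after_upgrade = deps_plt_name({"plug": "1.4.4", "poison": "3.1.0"})
```

Under this scheme an unchanged dependency set finds its existing PLT by name alone, while any upgrade yields a new name and therefore a fresh build, with no need to inspect or invalidate the old file.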

christhekeele · November 29, 2017, 2:19pm

To be fair, the feature’s great–it’s the implementation that’s lacking. I agree with all your pain points though.

JakeBecker · November 29, 2017, 7:26pm

Erlang moves pretty glacially. I doubt OTP 23 will be breaking down your door anytime soon. For what it’s worth, as long as I’m coding in Elixir I plan to keep it updated for future OTP releases.

I thought about trying to contribute the Dialyzer changes upstream, but Dialyzer is written in Erlang which I find much harder to write than Elixir, and I haven’t wanted to put in that effort. The changes needed are not all that complex – because of the OTP conventions Erlang and Elixir promote, Dialyzer is written almost as though it were meant to be a long-running server, but the APIs it exposes are not quite enough to actually run it as one. If anyone is actually good at writing Erlang and wants to try, I’m happy to assist.

JakeBecker · November 29, 2017, 7:52pm

Great to hear that you’re liking ElixirLS!

Dialyzer is really so close to being usable out-of-the-box in an incremental way. When run, it internally maintains a callgraph which is included in the PLT file it writes, so when a module is changed, it can determine which modules need to be re-analyzed. ElixirLS uses the timestamps on the beam files to get an initial list of changed modules, then also checks whether the md5 has changed (since re-analyzing modules is expensive and should be avoided when possible). It writes a manifest file that is basically like a PLT file but also includes the warnings.
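The two-stage change check described above (timestamp first, md5 only when the timestamp differs) can be sketched in Python as follows. This is an illustration under assumptions, not ElixirLS's code: the manifest is modelled as a plain dict, and `changed_modules` and its fields are hypothetical names.

```python
import hashlib
import os


def file_md5(path):
    """Return the md5 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def changed_modules(beam_paths, manifest):
    """Return the paths whose contents changed since the last run.

    manifest maps path -> {"mtime": float, "md5": str} from the previous
    run and is updated in place. The timestamp is the cheap first filter;
    the md5 is computed only when the timestamp differs, so a file that was
    merely rewritten with identical contents is not re-analyzed.
    """
    changed = []
    for path in beam_paths:
        previous = manifest.get(path)
        mtime = os.path.getmtime(path)
        if previous is not None and previous["mtime"] == mtime:
            continue  # timestamp unchanged: skip the md5 entirely
        digest = file_md5(path)
        if previous is None or previous["md5"] != digest:
            changed.append(path)
        manifest[path] = {"mtime": mtime, "md5": digest}
    return changed
```

A real tool would then use the callgraph to extend this list with the modules that depend on the changed ones; that step is omitted here.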

ElixirLS’s dialyzer looks at the modules’ abstract code to find any unknown modules and includes them in the analysis. It does this recursively to try to avoid doing any analysis that references “unknown” modules. The resulting manifest file ends up being rather large, often a few megabytes. I thought about decomposing this out into a Mix task, but reading and writing several megabytes to disk every time seemed slow enough that keeping it as a long-running server seemed the better option.

seancribbs · December 2, 2017, 7:01pm

Thank you for your comments, James. I’ll take them into consideration for future versions of this library.