Audit - a library for understanding changes in large data structures

My work at Lob involves a program called Dora that performs address verification. It validates addresses, repairs them into something valid, or rejects them and tells you why. It’s basically a very long pipeline that runs an Address struct through lots of rules and steps, gradually enriching the data structure as it passes along.

The original authors have since moved on, so we had to come up with techniques to understand why Dora makes the decisions and modifications it does.

To that end, I came up with Audit (hex, github).

It operates by adding an extra field to the data structure of interest: __audit_trail__.
You then wrap your updates with the audit macro.
This will record the current file and line number (hence the use of a macro) and store the current version of the data structure.
The __audit_trail__ field is effectively a linked-list history of every version of the data structure you wrapped in audit.
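
As a rough illustration of that mechanism (this is not Audit's actual source), a macro along these lines could capture the call site through `__CALLER__` and chain snapshots together. `AuditSketch`, `SketchAddress`, `SketchDemo` and the tuple shape of the trail are all made up for this example.

```elixir
defmodule AuditSketch do
  # Hypothetical stand-in for the real library, for illustration only.
  # A macro is needed because __CALLER__ is only available at compile time.
  defmacro audit(value) do
    file = __CALLER__.file
    line = __CALLER__.line

    quote do
      snapshot = unquote(value)
      # The snapshot still carries its own trail, so the versions chain
      # together like a linked list.
      %{snapshot | __audit_trail__: {snapshot, unquote(file), unquote(line)}}
    end
  end
end

defmodule SketchAddress do
  defstruct street_name: nil, primary_number: nil, __audit_trail__: nil
end

defmodule SketchDemo do
  import AuditSketch

  def run do
    address = %SketchAddress{}
    address = %SketchAddress{audit(address) | street_name: "BERNICE"}
    %SketchAddress{audit(address) | primary_number: "18"}
  end
end
```

Walking `__audit_trail__` from the result of `SketchDemo.run()` back to `nil` visits each earlier version along with the file and line of the update that replaced it.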

The to_string function takes a struct with an __audit_trail__ and pretty-prints the list of deltas, each with the line of code that caused it (as a GitHub link, a filename/line number, and a code snippet from the actual line in the file).
Some example output:

github url: https://github.com/lob/dora/tree/leverage_audit_hex_package/lib/dora/.../standard.ex#L234
local path: lib/dora/.../standard.ex:234
code: address = %Address{audit(address) | primary_number: primary}
diff: [{[:primary_number], {:add, "18"}}]
=====
github url: https://github.com/lob/dora/tree/leverage_audit_hex_package/lib/dora/.../standard.ex#L228
local path: lib/dora/.../standard.ex:228
code: address = %Address{audit(address) | street_name: street}
diff: [{[:street_name], {:add, "BERNICE"}}]
=====
github url: https://github.com/lob/dora/tree/leverage_audit_hex_package/lib/dora/.../standard.ex#L217
local path: lib/dora/.../standard.ex:217
code: address = %Address{audit(address) | street_suffix: suffix}
diff: [{[:street_suffix], {:add, "ST"}}]

To kick the tires, add the following to your deps:

      {:audit, "~> 0.1.6"},

Add the __audit_trail__ field to your struct.
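
For example, a struct definition might look something like the sketch below. The module name and the address fields are hypothetical, and the `nil` default for `__audit_trail__` is an assumption rather than something the library is known to require.

```elixir
defmodule MyApp.Address do
  # Hypothetical struct: only the __audit_trail__ field name comes from Audit.
  # The nil default is an assumption.
  defstruct primary_number: nil,
            street_name: nil,
            street_suffix: nil,
            __audit_trail__: nil
end
```
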
And modify your instances of:

%struct{ v | ... }

to be:

%struct{ audit(v) | ... }

then do the following to your resulting data structure:

r 
|> Audit.to_string()
|> IO.puts

That is very cool and the usage seems quite nice.

Can you please edit the main post and include the GitHub Repo and a link to hex.pm? :wink:

Is there also a way to diff 2 structures, after the change has already happened?

Fixed.

Would adding a last_only: true option to to_string do what you want?

r |> Audit.to_string(last_only: true) |> IO.puts
github url: https://github.com/lob/dora/tree/leverage_audit_hex_package/lib/dora/.../standard.ex#L217
local path: lib/dora/.../standard.ex:217
code: address = %Address{audit(address) | street_suffix: suffix}
diff: [{[:street_suffix], {:add, "ST"}}]

i.e. it just stringifies the last diff.

You can access the diff engine directly with:

Audit.Delta.delta(a, b)

but you have to null out the __audit_trail__ field using Audit.unaudit_fun before passing in the structs.
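
To give a feel for what such an after-the-fact diff involves, here is a minimal standalone sketch. It is not Audit.Delta: `DeltaSketch` is a made-up module, only the `{:add, value}` shape mirrors the output shown above, and the `:update` and `:remove` tags are assumptions. It drops the trail itself instead of relying on Audit.unaudit_fun.

```elixir
defmodule DeltaSketch do
  # Hypothetical field-level diff, not the library's diff engine.
  # Only {:add, value} matches the output in this thread; the other
  # two tags are assumed for illustration.
  def delta(a, b) do
    a = strip(a)
    b = strip(b)

    (Map.keys(a) ++ Map.keys(b))
    |> Enum.uniq()
    |> Enum.sort()
    |> Enum.flat_map(fn key ->
      case {Map.get(a, key), Map.get(b, key)} do
        {same, same} -> []
        {nil, new} -> [{[key], {:add, new}}]
        {old, nil} -> [{[key], {:remove, old}}]
        {old, new} -> [{[key], {:update, old, new}}]
      end
    end)
  end

  # Ignore the struct tag and the audit history so that only real
  # field changes show up in the diff.
  defp strip(map), do: Map.drop(map, [:__struct__, :__audit_trail__])
end

before = %{street_name: "BERNICE", street_suffix: nil, __audit_trail__: :old}
changed = %{street_name: "BERNICE", street_suffix: "ST", __audit_trail__: :new}

IO.inspect(DeltaSketch.delta(before, changed))
# prints [{[:street_suffix], {:add, "ST"}}]
```

Stripping the trail first matters because two otherwise identical structs would differ in their history alone.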