Possible to create a @type for ets match result?

kseg · May 24, 2017, 1:05pm

The result of an :ets.match is an array of arrays. If the “rows” were tuples, one could write a @type spec like:

@type row :: {integer, binary, integer}

But since they are arrays, and the following doesn’t seem valid, I’m wondering if there’s an alternative (which doesn’t involve iterating over all rows and converting them):

# gives: "unexpected list in typespec" compilation error
@type row :: [integer, binary, integer]

peerreynders · May 24, 2017, 1:48pm

Lists, not arrays (there actually is an array type).

> # gives: "unexpected list in typespec" compilation error
> @type row :: [integer, binary, integer]

From a typing perspective the type in the list needs to be homogeneous - hence:

@type row :: [term()]

as specified in :ets.match/3
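
To make the `[term()]` suggestion concrete, here is a minimal sketch of how the type and a spec might sit around a match call. The module name, function name, and three-column table layout are assumptions for illustration, not something from the original question.

```elixir
defmodule RowTypes do
  # Hypothetical module; the three-element pattern is an assumed table layout.
  @type row :: [term()]

  @spec all_rows(:ets.tab()) :: [row()]
  def all_rows(table) do
    # Each match is a list of the bound variables in order: [[$1, $2, $3], ...]
    :ets.match(table, {:"$1", :"$2", :"$3"})
  end
end

table = :ets.new(:rows_demo, [:set])
:ets.insert(table, {1, "a", 10})
RowTypes.all_rows(table)
```

The spec documents that each row is a list, but it cannot say that the first element is an integer and the second a binary.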

zambal · May 24, 2017, 1:49pm

I think the closest you can get is something like

[integer | binary]

but that just means a list of any length whose elements are each an integer or a binary; it says nothing about how many elements there are or in which order.
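
A small sketch of what that union type does and does not express. The module and function names are hypothetical.

```elixir
defmodule LooseRow do
  # Hypothetical module showing how loose a union-typed list is.
  @type row :: [integer() | binary()]

  # The intended shape fits the type...
  @spec intended() :: row()
  def intended, do: [1, "a", 10]

  # ...but so do lists of any other length or ordering, including the empty list.
  @spec also_valid() :: [row()]
  def also_valid, do: [[], ["a"], [1, 2, 3, 4], ["a", 1, "b"]]
end
```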

May I ask why you want to specify the type of an :ets.match result? Although not a strict rule, most Elixir libraries and applications only specify types for public functions, and a match result from ets doesn’t seem to be a very API-friendly data type.

kseg · May 24, 2017, 3:14pm

Thanks.

It isn’t a library, it’s an application. And, as such, what is a “public api”? Either nothing is or everything is. We get all the benefits of annotating these functions and types, namely better readability and documentation.

Yes, we expose the match data more broadly than we’d like, but the only alternative [I can think of] is to iterate through the results and convert each one to something more meaningful (and then iterate through it again to actually process the results). We’re keen to avoid doubling the iteration (which can range in the millions) and the extra allocations.

peerreynders · May 24, 2017, 3:26pm

Maybe it depends on how you are doing things.

If you wrap the data in a tuple or struct the first time you have contact with it then there would be no double iteration - just an additional function in the transformation pipeline.
If you are processing millions of records I presume your application isn’t just one single BEAM process, so as the data moves from one process to the next, allocations happen all the time anyway - it doesn’t matter if the data is held in a list, tuple, or struct.
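
A rough sketch of the "additional function in the pipeline" idea, assuming the `{integer, binary, integer}` row shape from the first post. The module and function names are made up, and summing the third column is just a stand-in for whatever processing the application really does. The second function is a different technique that was not mentioned above: a match spec passed to :ets.select/2 whose result body builds the tuple, so the rows come back as tuples rather than lists.

```elixir
defmodule RowPipeline do
  # Hypothetical row shape, taken from the first post.
  @type row :: {integer(), binary(), integer()}

  # Converts and processes each match in the same pass over the result list.
  @spec total(:ets.tab()) :: integer()
  def total(table) do
    table
    |> :ets.match({:"$1", :"$2", :"$3"})
    |> Enum.reduce(0, fn match, acc -> acc + count(to_row(match)) end)
  end

  @spec to_row([term()]) :: row()
  defp to_row([id, name, count]), do: {id, name, count}

  @spec count(row()) :: integer()
  defp count({_id, _name, count}), do: count

  # Alternative: the result body {{...}} tells ETS to construct a tuple,
  # so :ets.select/2 returns [{id, name, count}, ...] directly.
  @spec rows(:ets.tab()) :: [row()]
  def rows(table) do
    :ets.select(table, [{{:"$1", :"$2", :"$3"}, [], [{{:"$1", :"$2", :"$3"}}]}])
  end
end
```

With the second form the `@type row` tuple spec from the original question applies as written, without a conversion step in Elixir code.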

zambal · May 24, 2017, 3:30pm

With public api I actually meant any function defined in a module as def as opposed to defp, but I understand your point of not wanting to transform the results from match for efficiency reasons. However, I’m afraid you won’t be able to exactly specify the type of the function’s results in that case.