GSoC 2018: StreamData generators out of type specifications.

Hi everyone!

I’m Nikola and I’ll be working on the following google summer of code project.
I plan on updating you folks in this thread to check out the progress and maybe return feedback :slight_smile:

Week 1 13.05-19.05

I started working on generation of the basic types. Progress for this week can be seen in this branch.
Added most basic types with the exceptions of the likes of functions/pids/ports and some builtin types.
When I say basic types, I mean the basic/built-in and literals that elixir supports, check them out here.

Week 2 20.05 - 26.05

For this week - I mostly experimented with different approaches to recursive data types reading some papers and how I would implement function generation.
Some stuff that might be interesting:

And a branch where I experimented on recursive data types.

Week 3 27.05 - 02.06

Set up a new mix project, so that tests and reviews can be easier, you can find the project here.
Also create function generation, you can (git) check out the branch.

Week 4 03.06 - 03.10

Mostly getting help from mentor to get the basic types merged and review what we’ve been doing.
I plan on adding support for pids/ports this week.

10 Likes

Hi folks, another update from me :slight_smile:

Week 5-6 11.06 - 24.06

Busy with personal trips, university finals, so mostly reading code and learning statistics with R.
Began with the basics union type support, stuff like:
integer | atom, detecting those types and separating them from recursive types such as:
list :: nil | {integer, list}

Week 7 25.06 - 01.07

Start working on recursive types - added support for basic stuff like unions:
data = nil | {integer, data} for example.

A great help was this thread by alfert bow:

Other work includes doing some support stuff, for example raise when people try to
reference protocol types. So if you try generating a stream for Enum.t, you’ll get an error
that will tell you to use the data that implements the protocol instead(lists for example).
The reason this is done is because the Enum.t written by elixir after expansion is equivalent to
any() which is in no way useful. You can check the PR here: Raise on protocols. by NJichev · Pull Request #15 · NJichev/type-generators · GitHub

Another thing is raising when you have a pid/port type in your types, reason for this is simple,
we can’t know what kind of processes you need, hence just show an error message how you could achieve what you want.

Week 8 02.07 - 08.07

Finish work on recursive types(without parameters).
Let’s take this type as an example:

@type operator :: :+ | :- | :* | :/
@type expr :: integer() | {expr(), operator(), expr()}

Here the expr type will have 3 user defined types in the type signature it returns.
The user types will look something like this:

union:
  integer
  tuple:
    {user_type, ..., expr, }
    {user_type, ..., operator, }
    {user_type, ..., expr, }

Because the operator is not inlined as a union of the 4 operations, this caused problems

  • the fix was to just manually inline all types until only those with the name expr remain.
    That way the type tree only has user types such as {:user_type, ..., expr, ...} making it easy
    to detect the recursive ‘nodes’ later on.
  • a bonus is that the module name doesn’t have to be passed around after that

Other work was for other kinds of recursive types, the working example is the following one:

@type forest :: {integer, [forest]}

After inlining - checking for such recursive types is something like: okay it’s not a union recursive type, let’s check if there is any user_type inside, if so it’s of this kind.
The reason this is a valid recursive types is that the list definition is recursive in itself: list = cons a list | [], which means that our types would look something like:

{-4, [
    {0, [{1, [{1, []},
    {2, [{1, [{-5, []},
    {-68, []}]}]},
    {-3, []}]}]}
  ]
}

So in a way the leaves in the forest example are the implicit nils from the list type that works as a container.

Also I experimented with how the API for the function validation would work,
You can check the WIP branch here: WIP · NJichev/type-generators@8c2871a · GitHub

In a nutshell, what would be the rest of the work:
a function’s type signature in the beam code is something like:

product:
  argument types
  return type

The solution’s algorithm is the following:

  • map type generation to get stream data generators out of the argument types
  • get a function that checks if a certain type is of the return type
  • do a check all that feeding the arguments returns the return type.

A possible API could be something like:

  • check_function_specification(&ModuleName.function_name/arity, opts \ )
  • check_function_specification(ModuleName.function_name/arity, opts \ )
  • check_function_specification({ModuleName, function_name, arity}, opts \ )

I like the tuple one the most, the name is TBD :smiley:

This will be a macro that will expand to:

is_return_type = generate_member_function_for(return_type)
check all x_1 <- arg_1, ..., x_n <- arg_n, expand(opts) do
  assert is_return_type(Module.function_name(x_1, ..., x_n))
end

Leave a comment if you have suggestions for what the API could look like.

4 Likes