What Elixir framework is best for building a web chat app?

Help me choose an Elixir framework.


Phoenix with Phoenix Channels and Phoenix PubSub. Or Phoenix with Absinthe on top and subscriptions. All kinds of options, but those are the two I'd pick. :slight_smile:
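For a taste of what the Channels route looks like, here is a minimal sketch (the module, topic and event names are just illustrative, not from any particular app):

defmodule MyAppWeb.RoomChannel do
  use Phoenix.Channel

  # clients join a per-room topic like "room:lobby"
  def join("room:" <> _room_id, _params, socket) do
    {:ok, socket}
  end

  # broadcast!/3 fans the message out to every subscriber of this topic via PubSub
  def handle_in("new_msg", %{"body" => body}, socket) do
    broadcast!(socket, "new_msg", %{body: body})
    {:noreply, socket}
  end
end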


I have used Absinthe once, and it gave me a few problems. First, you need to validate every request (extra database queries), unlike with WebSockets (the Phoenix Channels and PubSub you suggested). Of course this only matters with a large number of queries, but it's not as scalable; it's definitely not good when you think of complex queries and a large number of entries accessed really frequently. Second, you need to write much more code: basically a duplicate of every Ecto context, schema and changeset.

While I like the GraphQL idea, I think there should be a library which translates Ecto schemas into an Absinthe schema. Such a translation could simply be configured by a few optional pattern-matching modules declared in the library configuration. I even started working on one, but I have stopped for now since I have other priorities. Recently I answered in a topic where somebody asked about generating schemas, so it's not just my personal feeling.


What do you mean it needs extra database queries? The big point of Absinthe is to aggregate calls together, which it does very well (plenty of injection points for optimizing your calls). In general, using Absinthe should reduce the number of database calls while also reducing the actual data sent from the DB over the wire.

Oh, absolutely not, I'd say. If someone is just translating Ecto schemas to Absinthe schemas then they are using Absinthe horribly inefficiently. Absinthe is an API binding layer: you define something in it kind of like an RPC call, where the user passes in args and you return data (in the format the user wants, to minimize data transfer), and that call could potentially make one query, no query, many queries, optimized joins, whatever else like that. :slight_smile:

Absinthe schemas are on the same level as a 'context' (to use Phoenix terminology); they are not a mapping to Ecto schemas. You can outright replace your contexts with Absinthe schemas (plus a simple internal call layer, which is what I've done in one of my projects) and then you get a unified interface for internal context calls, web calls, and websocket calls. :slight_smile:
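A sketch of what I mean by that internal call layer: Absinthe.run/3 is the real entry point, but the schema module, field and variable names here are just illustrative.

{:ok, %{data: data}} =
  Absinthe.run(
    "query ($id: ID!) { account(id: $id) { username } }",
    MyAppWeb.Schema,
    variables: %{"id" => account_id},
    context: %{current_user: current_user}
  )

The same schema then answers internal calls, HTTP calls and websocket calls through one interface.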


Before you touch any real data you need to validate the query, which means fetching some other data from the database. Not a big problem when you only have guest vs. logged-in, but it becomes more complicated when you have another model (like Company), a many-to-many relation between User and Company, plus roles on top of that relation. Imagine that for even the simplest query you need to fetch at least 3 records and validate 2 tokens (one per model), plus role authorization.
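Roughly the kind of per-request load I mean, written as one query (the Membership schema, its associations and the Repo are assumptions for illustration):

import Ecto.Query

def authorize(user_id, company_id) do
  # one round-trip joining the User, the Company and the join row that carries the role
  from(m in Membership,
    join: u in assoc(m, :user),
    join: c in assoc(m, :company),
    where: u.id == ^user_id and c.id == ^company_id,
    select: {u, c, m.role}
  )
  |> Repo.one()
end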

Yeah, so take a look at this issue:

As you can see, it was opened on 25 June 2017 and is still not closed.

Definitely not. Absinthe has its own conventions (I'm not saying whether they're good or bad). You need to return the correct success or error response. Rewriting contexts means rewriting all their tests, and changes how they are used in iex.

Here is how it works:
absinthe parser → absinthe schema → absinthe resolver → ecto context → ecto schema → after all of that, the Absinthe middlewares defined in the Absinthe schema are called
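A minimal sketch of that flow inside a schema (the field, context call and middleware module are illustrative):

field :user, :user do
  arg :id, non_null(:id)

  # the resolver calls into the Ecto context, which uses the Ecto schema
  resolve fn %{id: id}, _resolution ->
    {:ok, MyApp.Accounts.get_user(id)}
  end

  # middleware declared after resolve runs once the resolver has returned
  middleware MyApp.Middleware.TranslateErrors
end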

Of course it could be much simpler, but just take a look at this pull request:

Again, it has been open for almost a year. Nested errors are an edge case. Absinthe works well for simple cases, but more advanced usage still needs some work.

Note that we have an Ecto schema, an Ecto migration and an Absinthe schema. They look really similar (in a simple REST-style app). After some changes, migrations could be created automatically (like creating tables), with optional code for changing database data. The same goes for Absinthe: if you want to create a simple delete/get/list/show/update schema, you basically copy-paste the Ecto schema with really small changes. All of that could again be generated by a library, where the few differences are resolved by an optional module defining rules with simple pattern matching.

That's no queries at all on my system for the average call. On initial access the cache (ETS) is checked; if the entry exists it uses the pre-aggregated permission data, and if not it populates it. Updates to the related permission data clear the cache of anything that references it, so it will be out of sync for at most a few milliseconds. That's not an issue for subscriptions either, as those are set up once and then pushed to thereafter with no further checks (I kill a user's socket whenever permissions change so it can be set back up with the new permissions). Even then, building the aggregated permissions structure is a single database call and one LDAP call here, not multiple database calls (unsure why multiple database calls would even be needed?).

Yep, that's the absinthe_ecto addon, which is absolutely not something I recommend, ever. ^.^

That's what the call layer is for. I have many of my context calls migrated to an Absinthe layer without any change to the original context API. If I want something super specific then I can just write the GraphQL query string, but those cases are very rare and far between (mostly because I usually promote them to a full call for shared access).

And those are changeset errors, which brings us back to mapping Ecto schemas to Absinthe schemas, which is the wrong way to go. You don't return a changeset error via HTTP, for example; you massage it first into an appropriate error code and message on an HTML page. It doesn't make sense to return it straight (how would you even serialize that across a socket?). GraphQL interfaces shouldn't map to database tables; they are the API layer, not the DB layer. It's nothing more than a more efficient version of old RPC.

Though I also consider the usual REST style a significant, horribly inefficient mis-design as well, so… ^.^;

That is way too low-level and prevents more situational uses. For an API to get something like, oh, an 'account' record (to use my system as an example): in REST you'd just get the account data back (which could be as simple as nothing but an account_id mapping, or should it return the over 400 possible related account values that the system holds, or something else…?). Via Absinthe, you give it a variety of query information, such as the access token of the current user trying to get it (else anonymous, which won't give you much), an account_id, banner_id, pidm, last activity, etc., and you request back information like username, legal_name, ssn, image_url, phone numbers, etc. The backend request will make two database calls (to two different databases) and an LDAP call; some permission data is encoded into the queries for early-outs to minimize data returns; a full permission test is performed on each returned value for sanitization; if no default for something is allowed then a full error is returned, otherwise something like nil is returned for the disallowed parts; etc. It is a full context-like call, not a simple Repo.doSomething call, nor would it map cleanly to one at all. No mapping layer I've seen would ever be able to handle something like this.

In addition, if the user suddenly wants/needs more data related to an account, like their classes for X semester, their course schedule, or their grades, all of that is a single extra thing to add to the GraphQL request and still causes no extra DB queries; it gets cleanly joined on properly and all. With REST this would be either a monstrous record returned back full of data the requester doesn't want, at a backend cost higher than what is needed 99.999% of the time, or lots of little REST calls, each with its own setup, tear-down and validation, and lots and lots of little DB queries when it's needed.


This means that I would need to cache every logged-in User schema + every Company (which the User created or joined) + all the many-to-many relations. How long does such a cache live? With WebSockets it's simple: the User disconnects and the connection with its assigns is closed. But somebody could go for a cup of tea for "5 min". Tokens would need really short lifetimes, or I would need to cache lots of database entries for a long time even if the User is already offline.

The problem is that both absinthe_ecto and absinthe_relay were required in the project … While personally I could do pagination myself, sometimes you are just stopped by project requirements …

Sorry, I did not get it. If you have migrated an Ecto context to an Absinthe resolver, then you no longer have the Ecto context. Or did your Ecto context API remain unchanged even so?

I wanted to just return the changeset error to the front-end, which handles errors on its own. kronky helped with that until the front-end started using nested relations in the insert/update API.

After you add filters, orders and some extra API it should be enough.

Looks like you did not get it. For x models you have exactly the same API (REST is just a simple example to visualize it), but with different fields and a different Ecto schema. I was not talking about any custom queries.

For example, if you have a schema then you can create its object and input_object versions; you just need an extra type mapping.

defmodule MyApp.Generator.Rules.Type do
  # maps an Ecto field type to the Absinthe input_object field type
  def input(_object_name, _field_name, _ecto_type), do: …

  # maps an Ecto field type to the Absinthe object field type
  def output(_object_name, _field_name, _ecto_type), do: …
  # for example:
  def output(_, _, :string), do: :string
  def output(:model_name, :field_name, _), do: :custom_output_type
  def output(_, _, :other_custom_type), do: :output_version_of_other_custom_type
end

defmodule MyAppWeb.Schema do
  # …

  use MyApp.Generator, model: MyApp.SchemaName, except: [:mutation_or_query_name]

  # definition of custom types …

  # …
end

config :my_app, generator_rules: [
  type: MyApp.Generator.Rules.Type
]

Sorry, by REST I mean a REST-like API, i.e. for Company: create_company, delete_company and update_company mutations + get_company and list_companies queries. All of that repeats for every model, and it could easily be generated instead of writing the same things multiple times.
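To make the repetition concrete, this is roughly what gets hand-written per model today (the resolver modules are placeholders):

field :get_company, :company do
  arg :id, non_null(:id)
  resolve &MyAppWeb.Resolvers.Company.get/2
end

field :list_companies, list_of(:company) do
  resolve &MyAppWeb.Resolvers.Company.list/2
end

# …and the same get/list (plus create/update/delete) block again for every other model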

Just imagine that you want to change list_#{plural_schema_name} to (for example) all_#{plural_model_name}, for all schemas. Try to find a mistake (like a simple typo) looking at 4 "copies" of the same code (Ecto schema, Ecto migration, Absinthe object and Absinthe input_object). If you write 4 times less code then you have less chance of making a typo.

Sure, as long as people are paying me I could write even thousands of Ecto schema copies for thousands of libraries, but I think there is no sense in that from the business side.

It is considered bad practice to expose your internal DB design in an external GraphQL schema. You should treat them as separate, unrelated entities which merely resemble each other. It leads to better API design overall. Check out this tutorial by the Shopify team.

Sad but true.

System-dependent thing, I'd think. In my system a user doesn't go more than 5 minutes without doing 'something', so the cache stays fresh, and I'd be fine setting it to 5 minutes (I have it set to 1 hour, or to evict the oldest entries if it gets too full). Every cache access keeps an entry fresh so it's good to go, and it's force-cleared for any related entries if any permissions change is made in the database (a super rare event, and it's optimized to a single query as well).

I highly recommend using Cachex, it's wonderfully functional for such work, and it's pluggable if you want it distributed as well (I prefer per-node caches; the pubsub will distribute the clear command regardless for my use). :slight_smile:
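The pattern in Cachex terms, as a sketch (the cache name and loader function are assumptions):

# read-through: return the cached permissions, or run the fallback and store its result
case Cachex.fetch(:permissions, user_id, fn id ->
       {:commit, load_permissions_from_db(id)}
     end) do
  {:ok, perms} -> perms      # was already cached
  {:commit, perms} -> perms  # freshly loaded and now cached
end

# on a permissions change, clear the related entry
# (the pubsub broadcast would trigger this on each node)
Cachex.del(:permissions, user_id)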

Even if someone goes for a cup of tea for 5 minutes and your timeout is 4 minutes, it's not like it's a biggie to reload a single entry.

The tokens themselves time out after 12 hours regardless (one working day), but if they are early-outed (session deleted by an admin, logged out, whatever) then the token gets denied access on the server side via the cache (with fallback to a database lookup). It's still a super fast check in 99.999% of cases, and the slow cases are only a few ms longer (which are then cached for repeated checks).

WebSockets are not quite that simple; you also have to deal with a variety of situations. What if they log out via another tab, or are force-logged-out by an admin? You could of course just kill the socket, but then it would reconnect with the same token, which would grant access again, so you have to check the token too. And what if they went through a tunnel and lost connection for a minute? If the token's lifetime is too short then they'd be forced to reload the page, losing whatever work they were doing, instead of it just transparently reconnecting, etc., etc.

If a user is offline for an extended period then the janitor will clean out their cache entry after a time; when they come back it will repopulate into the cache as normal. :slight_smile:

These are all things I've had to deal with; the WiFi at work is very spotty in some areas (media room, machining building, etc.). :slight_smile:

As an aside for anyone reading: always remember to paginate off some stable index for the pagination cursor. Don't just paginate off all the data, or it will shift around annoyingly for the user as data is added and removed while they paginate. ^.^
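In Ecto terms, a keyset-pagination sketch (the Message schema and Repo are illustrative):

import Ecto.Query

def page(cursor_id, page_size) do
  # page off a stable index (here the row id), not an offset into the whole set,
  # so inserts/deletes during pagination don't shift the pages around
  from(m in Message,
    where: m.id < ^cursor_id,
    order_by: [desc: m.id],
    limit: ^page_size
  )
  |> Repo.all()
end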

Oh, I was speaking of a Phoenix-style context, i.e. a normal namespaced module interface. I'm not actually sure what an 'Ecto context' is, but a Phoenix-style context (a normal module namespace) shouldn't be leaking its internals (like using Ecto or whatever else) outside of its interface anyway.

What happens when you suddenly need more than one changeset, or errors from some remote web call or an LDAP server? When they're not supplied via a proper API layer, you now have multiple things to handle on the front-end, which shouldn't care about this stuff; it should only care about the simple API interface, displaying things to the user, and passing query data in in the expected API format. I don't let Ecto leak into my Phoenix controllers at all; it's a crossing of concerns. The database and view layers should never intermingle, only interact via specific API bounds where details don't leak across. This is probably more the Erlang method of programming than what most do, but it has served me very well at work through multiple refactors without APIs needing to change. :slight_smile:

So… recreate GraphQL? ^.^
Except without a standard query style or an easy way to extend the queries? ^.^

Yeah, this is what I have an issue with, though: the database format/schema/implementation/etc. should never leak outside of any API bounds. You shouldn't even really be able to determine what the storage is like internally, or how it's accessed, from across module API bounds. Any such leakage makes it exceptionally hard to refactor the internals when the need arises, or ties you too tightly to the implementation rather than the API, etc., etc.

I'm not actually sure what object or input_object is here, or what the example is trying to show; more information please? ^.^;

What's changing here? What schema or model?

If I want to change a word, even a misspelling, in an API interface, I just Don't, hard Don't. Once I've deployed an API it never changes. It can be deprecated and eventually removed if it can no longer be implemented via other methods, but it will never Ever be changed; new interfaces are made instead. One of my absolute hard rules in anything deployed as widely as what I have at work is that APIs are immutable once created. They Never Ever Change.

That would be the wrong pattern anyway; Ecto schemas should never leak outside of module API interfaces. :slight_smile:

Yes this exactly! :smiley:
I like that link from a quick cursory glance, may need to add it to my lists. ^.^

But yep, in general GraphQL is the 'API' to the system; it does not expose how the storage works, tables, schemas, or anything else like that, rather it encodes 'actions'. Same as a proper (Phoenix-like) context module should do. :slight_smile:

Yeah, it could do with a great internal access layer. I have a bit of a horror of macros that, although they work, are not something I want to touch or debug. ^.^;

  1. All the information about the cache is correct, but it does not change what I said. You still need to re-validate everything, rather than just fetching the whole session context once and dealing with it.

  2. I know how many cases come up with WebSockets, but even so, WebSockets (especially in Phoenix) are pretty easy to use. Of course handling multiple tabs/devices is a more complex scenario (in both XHR and WebSocket), so we should not go too deep. :smiley:
    I recommend writing a scraper for an old ASP.NET website with lots of data and session time limits; you would see how problematic it is.

  3. Yup, I recommend reading:
    We need tool support for keyset pagination
    and similar posts about it.

  4. We are talking about the same thing, so yeah, Phoenix contexts. :slight_smile: I was just curious how you migrated X to Y and kept X unchanged. :077: If it is migrated then it no longer exists, right?

  5. Hmm, the front-end just requested the server to do something. The server validated it and returned a changeset error. Personally, instead of changeset messages I would send something like %{error_code: :validation_error_code, field_name: :field_name, field_type: :string}, so the front-end could pattern match on the error code and display a proper error. Note that this is not how I actually solve things; I am just giving an example. My ideas are a bit more … interesting. :smiley:

  6. Looks like you did not get it. I wrote about adding filters and orders arguments to the list_model_name query, so I could have something more than Repo.all(ModelName).

  7. Simple example:

defmodule MyApp.MyContext.ObjectName do
  use Ecto.Schema

  schema "table_name" do
    field :field_name, :ecto_type
  end
end
# …
object :object_name do
  field :field_name, :absinthe_type
end
# …
input_object :object_name_input do
  field :field_name, :absinthe_type
end
# …
  1. Naming convention. If you are going to rename a field you need to:
    a) change the Ecto schema
    b) write a migration
    c) change the Absinthe object
    d) change the Absinthe input object
    e) change extra fields, like field arguments in filters and orders

    So many changes for just one typo. Instead you could:
    a) change the Ecto schema
    b) call mix ecto.gen.migration --auto (diffing schemas and creating a simple column-rename migration)
    c) all the Absinthe fields from points c), d) and e) are generated from a), so there is no need to change them
    d) add a git hook checking that the database is up to date with the schemas

    For me that looks much simpler.
  2. There are 3 differences between an Ecto schema and the Absinthe API:
    a) notation: defmodule MyApp.MyContext.ObjectName do … schema "table_name" do … end … end vs object :object_name do … end and input_object :object_name_input do … end
    b) hidden internal fields, which are simply not visible to GraphQL (like password_hash)
    c) types

    The rest, i.e. the schema name and the field names, is just copied. Company becomes :company and :company_input (plain underscoring). Maybe you would pick different words for every field, but I do not see any sense in that.

Look, I'm not talking about exposing internal fields like User.password_hash, but notice that every model like User or AnyOther becomes a :user object, a :user_input input object, an :any_other object, an :any_other_input input object, etc. I just don't see the value in copying all the fields just to hide a few of them. Better to use a generator and tell it which fields should not be exposed in the API. A typical REST-like GraphQL mutation and query API could easily be generated from just a few rules.

defmodule MyApp.Generator.Rules.Objects do
  def input_object(ecto_schema), do: :"#{object(ecto_schema)}_input"

  # of course just to visualize
  def object(MyApp.FirstContext.SameName), do: :same_name
  def object(MyApp.SecondContext.SameName), do: :second_same_name
  def object(ecto_schema), do: module_to_name(ecto_schema)

  defp module_to_name(module),
    do: module |> Module.split() |> List.last() |> Macro.underscore() |> String.to_atom()
end
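Hypothetical usage of those rules (MyApp.Accounts.User is just for illustration):

MyApp.Generator.Rules.Objects.object(MyApp.FirstContext.SameName)
#=> :same_name
MyApp.Generator.Rules.Objects.object(MyApp.Accounts.User)
#=> :user (via the module_to_name/1 fallback)
MyApp.Generator.Rules.Objects.input_object(MyApp.Accounts.User)
#=> :user_input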

There's a lot here, but I'll just jump in and say that doing batching with Absinthe.Relay is exactly no harder than manually writing out a lateral join or window function, which is what you'd have to do anyway. Which is to say, in 95% of the cases where you're loading stuff, Dataloader can be basically as efficient as ordinary custom code without your having to actually do all of the work, and in the 5% of cases where it can't be used, you can just write the same gnarly SQL you'd need anyway.
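For anyone following along, the standard Dataloader wiring in an Absinthe schema looks roughly like this (the :db source name and the Repo are placeholders):

def context(ctx) do
  loader =
    Dataloader.new()
    |> Dataloader.add_source(:db, Dataloader.Ecto.new(MyApp.Repo))

  Map.put(ctx, :loader, loader)
end

def plugins do
  [Absinthe.Middleware.Dataloader] ++ Absinthe.Plugin.defaults()
end

# a field can then batch its association loads automatically:
#   resolve dataloader(:db, :posts)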


The problem with that solution is that it's not generic at all. In my plans there is something bigger which should help a lot, but it's not a big priority for my use cases now. In short, I plan to combine all the pros of WebSockets (for all requests, including mutations and queries) + streaming and generator-based rules, so you only need to declare what to do in edge cases, or optionally change the naming convention, like company vs. get_company as the query name.

Before you say it: I know all of Absinthe's features and its WebSocket support. It's still not enough for me. :077:

I don't understand what you mean by not generic. FWIW, now that Ecto window functions are in, Dataloader is 1 PR away from doing this seamlessly. I just don't have the time.

I mean what I wrote earlier. Of course, don't misunderstand me: I'm not saying Absinthe isn't done well, but it has a few problems. There are some generic things which, as I said, with good configuration could easily be generated to handle most use cases without writing tons of lines of Absinthe schema. The rest (like custom queries) could be handled manually, as the current Absinthe API allows.

Look, Absinthe standalone is awesome, but notice how often libraries require you to define something that looks so similar, like comparing an Ecto schema with an Absinthe object and input object. Both Ecto and Absinthe are awesome, but copying almost the same code over and over is exactly what developers avoid by writing generator macros. There is nothing to hide: people are just lazy, especially developers. If there is any way to automate something, then it's only a matter of when I will have time to automate it.

Doing lots of things manually causes stupid mistakes. If you type get/list queries lots of times (for all models) then it's just a matter of time before you make a stupid typo like lits_users vs list_users. Therefore, for more complex apps with 10+ or 20+ models (especially with arguments that add advanced filtering) it's not a good solution, because the schema is going to be huge, and instead of working on the next features we are expanding the schema. I believe there are lots of apps with 100+ schema-like models. I don't see the point of writing everything so many times manually to handle such generic things.

I believe a configurable generator could be written which covers exactly your 95% of Absinthe schema code. Look: we write an Ecto schema, and after that we write its object and input_object versions, which look basically like the schema but are written with different macros and types. Things like naming or type conversion can be trivially handled by simple pattern matching.

As I said, I have already started on advanced generator code. I have tried it and it works perfectly (internally; it needs some cleanup to turn it into a library). I just need to finish it, make it much more configurable, clean it up, etc., but as I said I have other priorities for now.

With 1.5 it should be very easy to programmatically generate Absinthe objects from Ecto schemas without any macro magic, FYI. With that, and two Dataloader PRs (one for limits, one for top-level queries), I think those issues will improve a lot. Unfortunately I've had to step back from continuous development on those fronts for a bit myself, but PRs are always welcome.


That's really interesting … this would make my library even easier to implement. I will take a look soon.

btw. Here I mentioned 1 issue and 1 PR in absinthe_* libraries which have not been resolved for about a year. The last time I took a look at your rework milestone in Absinthe, the estimate also did not look promising. How are things going there?

As mentioned in the issue, the blocker was Ecto. Ecto only received window support at the end of October. Adding this feature to Dataloader is a task I have always said is best left to a dedicated contributor. There's nothing particularly Absinthe-specific about it; it can be accomplished by anyone with a good grasp of SQL and Ecto.

I'd spent quite a while in October and November getting Absinthe to the point where 1.5 dynamic schema generation works well. Its alpha status right now really refers to the new SDL support. We may punt full SDL support to a later release so that the core internal changes within 1.5 can go out.

EDIT: I'm a little unclear on what you mean by "validating takes extra db calls". Validating the structural correctness of the GraphQL document doesn't involve a database at all, and on 1.5 it doesn't even involve copying any memory from the schema, since that's all on the shared heap. Any validation you do about "can the current user do this thing" is exactly the same amount of loading you'd do to answer that question whether using GraphQL or not.
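That "can the current user do this" check typically lives in a middleware, sketched here (the module name and error message are assumptions):

defmodule MyAppWeb.Middleware.Authorize do
  @behaviour Absinthe.Middleware

  # a logged-in user in the context lets resolution continue on to the resolver
  def call(%{context: %{current_user: %{}}} = resolution, _opts), do: resolution

  # otherwise short-circuit with an error before any data is loaded
  def call(resolution, _opts),
    do: Absinthe.Resolution.put_result(resolution, {:error, "unauthorized"})
end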

As I said, it's a project requirement. There can be multiple roles based on User status (guest, logged in, etc.) plus a role field in the many-to-many relation with Company. So before you access any model you basically need to fetch 3 entries (separately or with inner joins). Since Absinthe mutations and queries happen in XHR calls, you are forced to always build the full context (or load it from a cache), which means extra database queries for every request that a full WebSocket session does not need.

How would that not be the case for a REST API? Notably, if you do this over WebSockets, your context loading happens just once on connect, not on every call.
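With absinthe_phoenix, that once-on-connect loading looks roughly like this (the token verification call is illustrative):

defmodule MyAppWeb.UserSocket do
  use Phoenix.Socket
  use Absinthe.Phoenix.Socket, schema: MyAppWeb.Schema

  # the context is built once here and reused for every query/mutation on the socket
  def connect(%{"token" => token}, socket) do
    case MyApp.Auth.verify(token) do
      {:ok, user} ->
        {:ok, Absinthe.Phoenix.Socket.put_options(socket, context: %{current_user: user})}

      _ ->
        :error
    end
  end

  def id(_socket), do: nil
end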

Yes, yes, it's simpler, because there are no extra calls for every Absinthe query; only on login and when selecting the current Company.