Using GraphQL (and perhaps Absinthe) in messages?

Let’s say that I’m building a set of BEAM processes that will serve requested information. It looks like GraphQL’s Schema Definition Language (SDL) could be used to define and document available (sets of) items and GraphQL could be used to request and transfer them.

Although GraphQL is generally used for web-based APIs (e.g., via JSON), this seems to be an implementation detail. So, I could easily imagine using Elixir’s Structs instead.

Has anyone tried using GraphQL (and perhaps Absinthe) in BEAM messages? Alternatively, is there some other approach I should investigate? Inquiring gnomes need to mine…

-r

1 Like

Are you after the stricter typing validations that SDL could give you?

Are you after the stricter typing validations that SDL could give you?

Only in a fairly general sense. As I read it, a GraphQL server defines its API in terms of available information. The SDL schema looks a lot like JSON; the result is like an Elixir Struct on steroids. Like a Struct, it’s a named, scripting-friendly data structure. Unlike a Struct, however, a GraphQL schema:

  • can be defined and published dynamically
  • can define root operation types such as Mutation and Subscription
  • can give fine-grained control over data types, using type modifiers
  • can use built-in scalar and enumeration types

In short, GraphQL and SDL seem to have a lot of useful semantics, of which stricter typing validations are only a part. (Aren’t you glad you asked? :-})

In use, an SDL schema defines and names the server’s offerings. A client can request any desired subset(s) of a server’s schema. If the server can, it will supply the data, using the names and structure specified in the schema.
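To make that concrete, here’s a toy Absinthe rendering of such an “offerings” schema. This is only a sketch: the module, object, and field names are invented for illustration, and Absinthe expresses SDL’s ideas as Elixir macros rather than SDL text.

```elixir
defmodule MyApp.Schema do
  use Absinthe.Schema

  # One "offering": per-process stats. non_null/1 and list_of/1 are the
  # type modifiers mentioned above.
  object :process_stats do
    field :name, non_null(:string)
    field :memory, :integer
    field :message_queue_len, :integer
  end

  query do
    field :processes, list_of(:process_stats) do
      # Stub resolver; a real one would harvest live data here.
      resolve fn _args, _resolution -> {:ok, []} end
    end
  end
end
```

A client can then ask for just `{ processes { name } }` and receive only the names.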

As a use case, assume that I’ve decided to update a server process, extending its offerings. The server can publish the extended version of the schema and clients can start using the added information immediately. Works for me…

-r

I don’t seem to be getting much traction on this topic; after several days, I’ve only received one inquiry. So, maybe I need to clarify my question a bit more. That way, maybe someone will be able to give me a clue…

Basically, I’d like to use GraphQL (and preferably Absinthe) for communication between Elixir processes. I know that I could do this using HTTP (etc), but that seems both awkward and inefficient.

Is there a way that I could transmit Absinthe queries and responses as BEAM messages? This would let my Actors request and supply particular sets of data, using a well-documented protocol. (It might be more convenient and efficient to use Erlang terms, rather than JSON, but that’s not a gating issue…)

I don’t think that would give you much value over structs or maps, or simply using term_to_binary.

I am on my phone or I’d elaborate more :slight_smile:

OK but what’s wrong with the normal Elixir process communication? Why is it deficient for your needs?

If you need a bit stronger typing then you can reach for something that implements e.g. Protocol Buffers or Cap’n Proto, or even MessagePack, which doesn’t require you to have a schema beforehand but still lets you assert on what you receive.
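For instance, with the Msgpax library (one of the Elixir MessagePack implementations; this is just a sketch), you can round-trip data with no up-front schema and still assert on what arrives:

```elixir
# Pack a map to MessagePack bytes and unpack it again; no schema is
# required, but the receiver can still pattern-match on the shape.
payload =
  %{"name" => "proc_a", "memory" => 1200}
  |> Msgpax.pack!()
  |> IO.iodata_to_binary()

%{"name" => name, "memory" => memory} = Msgpax.unpack!(payload)
# name   => "proc_a"
# memory => 1200
```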

A lot of the complexity of GraphQL is to reduce the number of network round-trips. If you use GraphQL for inter-process communication within the BEAM, you are paying for the complexity without the benefit.

1 Like

OK but what’s wrong with the normal Elixir process communication? Why is it deficient for your needs?

I like “normal Elixir process communication” just fine, but it doesn’t have some of the nifty semantics I discussed in my initial reply. To motivate the discussion, here’s a more specific use case than I described above (apologies in advance for the SciFi aspects :-).

Background

Observer

Erlang’s Observer is an amazing piece of work which collects a lot of interesting information. However, it’s a bit of an information silo:

  • The information is only available via a (rather restrictive) GUI:
    • The API is limited to start, start_and_wait, and stop.
    • I haven’t found any way to tweak the presentation format.
    • I’m pretty sure it wouldn’t play nicely with a screen reader.

Observer CLI

@zhongwencool’s Observer CLI library is also very nifty, and definitely much less of an information silo. So, for example, one could:

  • request a particular report
  • parse the text and terminal control sequences
  • extract and reorganize the desired information

However, this seems more than a little roundabout. So, it would be nice if Observer CLI could output reports as serialized data.

Discussion

Let’s say that we wanted to take advantage of the data collected by these sorts of tools. For example, I could imagine:

  • logging it for later analysis
  • using it in an exploratory Livebook
  • feeding it into an AI wizard of some sort
  • making it accessible via a screen reader
  • and a pony…

Problem is, we don’t really want to receive a firehose of data for most of these use cases. So, we need a way to obtain specified subsets of the available data, possibly in a dynamic fashion. And that is an area in which GraphQL appears to excel…

GraphQL’s Complexity

@derek-zhou said:

A lot of the complexity of GraphQL is to reduce the number of network round-trips. If you use GraphQL for inter-process communication within the BEAM, you are paying for the complexity without the benefit.

Interesting; might you be able to point me to (or provide) a discussion of this? For example, does this added complexity impact either performance or usage difficulty? (I’d agree that reducing the number of round trips isn’t all that interesting in the BEAM environment.)

I don’t know man, all this simply sounds like one OpenTelemetry collector away from being a solved problem.

Maybe you can clarify further on what are these nifty semantics about? I am not seeing them in your OP.

“simply”? Yes, in the sense that creating and maintaining an OpenTelemetry collector which provides all the data found in Observer & Observer CLI is just a “simple matter of software” :-}.

However, let’s not get lost in that aspect of the use case. I think the basic question I’m trying to answer is: What is the best way to set up a data harvesting and serving process so that its clients can specify desired subsets of the available information?

1 Like

Still OpenTelemetry: you can filter at the receiver side. :man_shrugging:t3: I know that Honeycomb and OpenObserve can do it.

Also, I don’t know about :observer in particular, but you can have a background process periodically reporting resource usage. I agree it’s not very simple though! :smiley:

In any case, either I’m systematically missing your idea and goals, or you’re over-fixated on a particular solution, so it’s probably best for me to bow out and give others a chance to chime in.

Can you elaborate on your use case a bit more? I am having difficulty thinking of a scenario where BEAM message communication and HTTP are both roughly on the table. Are these processes not on the same node?

Just checking if I understand correctly what OP is talking about. Would a naive solution simply be to send GraphQL document strings and handle them by calling Absinthe.run/3?
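Something like this minimal sketch, say, where the GraphQL document travels as an ordinary BEAM message and MyApp.Schema is assumed to be an Absinthe schema module:

```elixir
defmodule GraphQLServer do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok), do: {:ok, %{}}

  # The "wire format" is just a tuple carrying the GraphQL document string;
  # the reply is the plain Elixir map that Absinthe.run/3 produces.
  @impl true
  def handle_call({:graphql, document, variables}, _from, state) do
    {:reply, Absinthe.run(document, MyApp.Schema, variables: variables), state}
  end
end

# Client side:
#   {:ok, pid} = GraphQLServer.start_link()
#   GenServer.call(pid, {:graphql, "{ processes { name memory } }", %{}})
#   #=> {:ok, %{data: %{"processes" => [...]}}}
```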

It is a really cool idea because the big win is the classic “the client defines what it needs”. So you can keep (response) messages as small as possible on a case-by-case basis. This would definitely be useful if you’re working with LiveView.

I have been working on a GraphQL API where much of the data is retrieved from long-running processes rather than the database. It was easy enough to write generic dataloader “resolve” functions that call named processes. That is not what you want but it is similarly…deviant.

3 Likes

I don’t know enough about GraphQL (let alone Absinthe) to know whether @slouchpie’s “naive solution” would work. Might someone else know?

Can you elaborate on your use case a bit more?

I could, but I’ve probably elaborated this use case far too much already (:-). So, just assume that a “server process” has a whole lot of information and assorted “client processes” should be able to ask for specified subsets.

I am having difficulty thinking of a scenario where BEAM message communication and HTTP are both roughly on the table.

As @slouchpie says: “… the big win [in GraphQL] is the classic ‘the client defines what it needs’.” From that perspective, the use of HTTP and JSON are “simply” implementation details.

Are these processes not on in the same node?

They might or might not be, depending on the use case involved. If data is being collected from a remote node, it might make sense for the relevant server to reside there. OTOH, it appears that a local Observer instance is able to monitor various nodes, so YMMV…

At a high level, the overhead introduced by GraphQL isn’t a huge trade-off when compared to an HTTP request; in particular, an HTTP request over the wire already has to serialize and deserialize data.

However, between Elixir processes, the overhead of building a query, validating that query, executing it, deserializing it, etc. is substantially more heavyweight than plain message passing. For Elixir processes on the same node I would strongly suggest a more “elixir native” solution.

If you have processes on different nodes and you’re using the distribution protocol in lieu of a more traditional solution, then sure, you could send a GraphQL query to the other process, call Absinthe.run on it, and send the result back.

1 Like

Points taken, but what might a more “elixir native” solution look (and act) like? Hmmmmm.

Let’s assume that the server process publishes an “offerings” Struct, specifying the structure and naming of its available data.

To get started, a prospective client would send the server a “request” Struct (basically, a subset of the offerings Struct). The server would validate this once, then return a token.

After that, the client could request the subset by supplying the token. Upon receipt of a valid token, the server would copy the specified subset into a temporary data structure and send that back to the client.

If (as in @slouchpie’s use case) the server needs to call functions or even interrogate other processes, things could get a lot trickier. But we can start by assuming that all of the server’s data is nicely cached.

Does Elixir have some sort of pattern-matching magic (or whatever) that could efficiently support this sort of functionality? ELI5…
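Here’s roughly what I have in mind, as a minimal sketch (names invented; all of the server’s data is assumed to be cached in a plain map):

```elixir
defmodule SubsetServer do
  use GenServer

  def start_link(data), do: GenServer.start_link(__MODULE__, data)

  @impl true
  def init(data), do: {:ok, %{data: data, requests: %{}}}

  # Validate the requested keys once, then hand back an opaque token.
  @impl true
  def handle_call({:subscribe, keys}, _from, state) do
    if Enum.all?(keys, &Map.has_key?(state.data, &1)) do
      token = make_ref()
      {:reply, {:ok, token}, put_in(state.requests[token], keys)}
    else
      {:reply, {:error, :unknown_keys}, state}
    end
  end

  # Later requests just present the token; the server copies out the subset.
  def handle_call({:fetch, token}, _from, state) do
    case state.requests do
      %{^token => keys} -> {:reply, {:ok, Map.take(state.data, keys)}, state}
      _ -> {:reply, {:error, :bad_token}, state}
    end
  end
end

# {:ok, pid} = SubsetServer.start_link(%{memory: 1200, reductions: 99, gc: 7})
# {:ok, token} = GenServer.call(pid, {:subscribe, [:memory, :reductions]})
# GenServer.call(pid, {:fetch, token})  #=> {:ok, %{memory: 1200, reductions: 99}}
```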

So if I understand it right, the goal is to be able to ‘query’ processes, similar to Observer? Every process can hold certain data, for example in the form of Structs, and by defining it in SDL you’d be able to request a subset of it?

I guess you could use GraphQL to achieve that, but keep in mind that every application that needs to use this also has to adhere to that standard (so dropping it into an everyday project is not possible without big changes).

Since you already mentioned Observer, have you looked at where/how it gets its information? Observer’s state inspection works for processes that implement the gen_server behaviour, but won’t work for bare-bones processes :wink: . For example, when you click a process in Observer to inspect its state, it uses :sys.get_state, sending that process a message to inquire about its state. If that process doesn’t adhere to the gen_server behaviour, there’s a good chance that call will hang until it times out.
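For example (the timeout behavior is the point here):

```elixir
# An Agent is a GenServer under the hood, so :sys.get_state/1 works:
{:ok, agent} = Agent.start_link(fn -> %{count: 0} end)
:sys.get_state(agent)
#=> %{count: 0}

# A bare process doesn't speak the system-message protocol, so this call
# gets no reply and eventually exits with a timeout error:
plain = spawn(fn -> Process.sleep(:infinity) end)
# :sys.get_state(plain)  # exits with a timeout after the default interval
```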

But let’s say you have a way of doing that structured querying: will it solve your (which?) problem? Even in the field of data science, having structure in your data isn’t sufficient; you still need to understand what it represents to turn data into information.

If you only want the structured extraction, you could pull the way Observer gets its info out into a new lib and put it in some repository, but even then you have to massage the data before using it as information.

Regarding your Observer example I think there’s a misconception about how things work.

Observer is not the tool providing all the available information. It’s just an aggregator of information. Just like Phoenix LiveDashboard or Observer CLI.

At least the first two, if not all three, use the same underlying functions OTP provides to get access to information. If you want to use the information for something else, you can use those very same functions. So if you want to tackle the tasks you mentioned, it’s not really a problem of “how do I extract those things from Observer”, as all the information (or at least the underlying data) is available to you directly as well. Skip dealing with Observer and go to the source.
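For example, a few of the raw OTP calls these tools build on:

```elixir
# Node-wide figures:
:erlang.memory()                     # memory breakdown by category
:erlang.system_info(:process_count)  # number of live processes
:erlang.statistics(:reductions)      # {total, since_last_call}

# Per-process details (Process.info/2 returns nil for dead processes):
for pid <- Process.list() do
  Process.info(pid, [:registered_name, :memory, :message_queue_len, :reductions])
end
```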

2 Likes

Oh… on the question about “some sort of pattern-matching magic”, you could look into ETS and match specs :wink:
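For example, a match spec lets a server answer “give me just this subset” queries over an ETS table without copying the whole table (a tiny sketch; keys and values invented):

```elixir
table = :ets.new(:metrics, [:set, :public])

:ets.insert(table, [
  {{:proc_a, :memory}, 1200},
  {{:proc_a, :reductions}, 99},
  {{:proc_b, :memory}, 3400}
])

# "Return {name, value} for every :memory entry above 1_000."
spec = [{{{:"$1", :memory}, :"$2"}, [{:>, :"$2", 1000}], [{{:"$1", :"$2"}}]}]
:ets.select(table, spec)
#=> [proc_a: 1200, proc_b: 3400] (order not guaranteed)
```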

2 Likes