GraphQL Abstraction Breakdown?

mmport80 · December 12, 2018, 2:19pm

Hi All,

I know there are a lot of smart GraphQL users here, so I thought I’d ask…

We have an API, and I stumbled across an issue which doesn’t sit right with me.

There are two “endpoints”, alarms and events. Each alarm belongs to an event. E.g.

query alarms { …, event { … } }

You get the idea.

We have a lot of alarms, so we also use Relay pagination.

We want to sort and filter alarms.

All good.

Thing is, sometimes we want to sort or filter alarms using the associated underlying event data.

What I have done, is

add filter / sort arguments to the alarms connection query, which refer to the event data.

E.g. I sort alarms by the ‘priority’ field which is found in an event.

Behind the scenes, I use a SQL join to implement.
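A minimal sketch of what such a join might look like, using Python's built-in sqlite3 module rather than the Ecto code the post refers to. The table names, column names, sample rows, and the `max_priority` filter are all assumptions made for illustration; the only detail taken from the thread is that alarms carry a foreign key to events and that `priority` lives on the event.

```python
import sqlite3

# Hypothetical schema and sample data: all names here are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT, priority INTEGER);
    CREATE TABLE alarms (
        id INTEGER PRIMARY KEY,
        raised_at TEXT,
        event_id INTEGER REFERENCES events(id)
    );
    INSERT INTO events VALUES (1, 'door_open', 2), (2, 'fire', 1);
    INSERT INTO alarms VALUES
        (10, '2018-12-12T09:00', 1),
        (11, '2018-12-12T09:05', 2),
        (12, '2018-12-12T09:10', 1);
""")


def alarms_sorted_by_event_priority(conn, max_priority=None):
    """Return alarm rows ordered by a field that lives on the joined event."""
    sql = """
        SELECT a.id, a.raised_at, e.name, e.priority
        FROM alarms AS a
        JOIN events AS e ON e.id = a.event_id
        WHERE (:p IS NULL OR e.priority <= :p)  -- optional filter on event data
        ORDER BY e.priority ASC, a.id ASC       -- sort on event data
    """
    return conn.execute(sql, {"p": max_priority}).fetchall()


rows = alarms_sorted_by_event_priority(conn)
```

The resolver behind an alarms connection could run a query of this shape and hand the rows to the pagination layer; the sort and filter arguments simply name event columns instead of alarm columns.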

While this works, it doesn’t seem to be a ‘natural fit’.

I.e. I am mixing ‘event’ related functionality with alarm functionality.

Perhaps I should:

create a new “event-alarm” endpoint, which works more like a Rest endpoint, i.e. it is more monolithic, less ‘normalised’
Another idea is to add another layer of abstraction, à la Relay, but Relay is complicated enough…

Any other ideas? Neither solution really feels very idiomatic!

peerreynders · December 12, 2018, 2:54pm

In the spirit of ubiquitous language you need to first give a precise definition of what you understand alarm and event to be. Somebody else may have a different perspective of what those are - and next thing there’ll be a conversation using the same words but where the participants are just talking past one another.

To the point:

Each alarm belongs to an event.

I found that statement surprising.

In my mind you set an alarm and events capture information whenever those settings are being met (i.e. the alarm is being triggered) - so I would say events belong to alarms (or more precisely the settings of the alarm at the time of the creation of the event).

mmport80 · December 12, 2018, 3:12pm

Correct - the nomenclature is a little misleading.

In the database - we have a description of each possible “event”.

Then when an alarm is triggered, it is of type “event” - i.e. it has an event_id foreign key in the alarms table, etc…

peerreynders · December 12, 2018, 3:27pm

While you are asking a GraphQL question (a domain in which I claim less than little expertise), I’m sensing the possibility that you may have some inconsistencies on the detail level of the underlying model that are manifesting themselves when you are trying to formulate a higher level abstraction.

Right now those details are clear as mud.

So far it seems that an alarm is an instance of an event or possibly an event is part of an alarm.

I am mixing ‘event’ related functionality with alarm functionality.

By your description both concepts are coupled.

Is it that all alarms are events but not all events are alarms OR
An event is part-of an alarm - if so what are alarm’s non-event parts (if any)?

create a new “event-alarm” endpoint, which works more like a Rest endpoint, i.e. it is more monolithic, less ‘normalised’

Is there any value to exposing your notion of an “event” via the GraphQL interface? You may have good reason to only expose an API to a concrete and segregated set of event types, alarm being one of them without ever exposing a generic event interface.

Then there is the separate decision of representation, e.g. does alarm have an embedded “event” field or do you denormalize the event data into the general alarm data.

About Face 3: The Essentials of Interaction Design (2007), p.31

Your database tables are simply a projection of the “domain model” into the relational realm, possibly adjusted to accommodate non-functional requirements (e.g. performance with reference to the queries and updates being issued). The design of your GraphQL interface should likely be guided by the “domain model” - not the currently implemented datamodel in your database.

hlx · December 12, 2018, 3:34pm

Maybe I’m missing the point here but if you want to filter alarms based on events then why not filter events and then load the associated alarms?

hlx · December 12, 2018, 3:43pm

Side note: I think you might also benefit from https://hexdocs.pm/absinthe/Absinthe.Resolution.Helpers.html#dataloader/1

mmport80 · December 12, 2018, 4:34pm

why not filter events and then load the associated alarms?

oooh, that’s v smart…

the thing is, on the frontend we have a grid which uses both event & alarm data, joined into one table.

i should be able to sort & filter by data which belongs to either alarms or events.

so, your answer would mean querying like

query { alarms { ..., event { ... } } }

if I am filtering / sorting via data found in alarms.

and doing

query { events { ..., alarms { ... } } }

if I am doing event related things.

And then put the result of either query into the same grid (in principle the same data is returned from both queries, just in slightly different forms).
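To make the "same data, slightly different forms" point concrete, here is a small sketch that flattens both response shapes into identical grid rows. The field names (`id`, `priority`) and the sample payloads are hypothetical, and Relay's edges/node wrapping is left out to keep the shapes readable.

```python
def rows_from_alarms(alarms):
    """Flatten the alarms { ..., event { ... } } shape into grid rows."""
    return [
        {"alarm_id": a["id"], "event_id": a["event"]["id"], "priority": a["event"]["priority"]}
        for a in alarms
    ]


def rows_from_events(events):
    """Flatten the events { ..., alarms { ... } } shape into the same grid rows."""
    return [
        {"alarm_id": a["id"], "event_id": e["id"], "priority": e["priority"]}
        for e in events
        for a in e["alarms"]
    ]


# Hypothetical responses for the two query shapes.
alarms_response = [
    {"id": 10, "event": {"id": 1, "priority": 2}},
    {"id": 11, "event": {"id": 2, "priority": 1}},
    {"id": 12, "event": {"id": 1, "priority": 2}},
]
events_response = [
    {"id": 2, "priority": 1, "alarms": [{"id": 11}]},
    {"id": 1, "priority": 2, "alarms": [{"id": 10}, {"id": 12}]},
]

by_alarm = lambda r: r["alarm_id"]
grid_a = rows_from_alarms(alarms_response)
grid_b = rows_from_events(events_response)
same_rows = sorted(grid_a, key=by_alarm) == sorted(grid_b, key=by_alarm)
```

Both functions yield the same set of rows; only the order in which the server delivered them differs, and that order is exactly what the sort arguments control.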

Problem is, if I am sorting / filtering with event & alarm related fields simultaneously…

mmport80 · December 12, 2018, 4:47pm

All good questions.

In my mind, GraphQL suits a normalised representation of data.

And therefore fits some of our database schema well. The mapping in these cases is so natural.

But I am uneasy about this “normalisation” hypothesis.

I think the nub of my problem is: the SQL idea of “joining” data, doesn’t fit the GraphQL abstraction well.

Then again, maybe this problem just tells me that my normalisation hypothesis is wrong!

At the end of the day, I am happy with the API endpoint the Front End sees, I am offering the functionality to display the data with ease.

I am unhappy with how GraphQL doesn’t fit this particular case as neatly as others… Maybe this is too academic…

mmport80 · December 12, 2018, 5:08pm

I just found the solution, a little more convoluted than the usual query.

But for posterity, if you need to sort / filter “joined” data in an idiomatic GraphQL way, you should probably do something /roughly/ along these lines:

query {
  ...
  eventConnection(
    orderBy: [...],
    filterBy: [...]
  ) {
    ...,
    alarmConnection(
      orderBy: [...],
      filterBy: [...]
    ) {
      event ...
    }
  }
}

Which means, I can probably rip up the Ecto related work I have done this week, and just use this query instead : )

Not sure whether I should be happy or sad.

gregvaughn · December 12, 2018, 5:11pm

Is this an example of object-relational impedance mismatch that has had much written about it?

peerreynders · December 12, 2018, 5:23pm

But I am uneasy about this “normalisation” hypothesis.

There are different degrees of normalization - however aspects of 3NF are often denormalized to accommodate real world constraints. That’s how some databases have a 3NF logical model but have a physical model that is partially denormalized for performance reasons.

So you have to ask yourself - does that “normalization” add value in your GraphQL model (even if there is value for it in the database)?

To answer that you have to have a clear vision of the purpose of the GraphQL model which may give it a different shape than your data model.

While it is convenient to simply expose the data model, it couples the client tightly to your data model - so whenever you change the data model you have an end to end ripple effect (consumer to implementation coupling).

By properly decoupling the GraphQL model from the data model you are establishing a fire break for that ripple effect. But that decoupling is more work as you have to implement the model transformation layer.
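A rough sketch of such a transformation layer, with hypothetical names throughout (`AlarmView`, `to_alarm_view`, and the row keys are invented for illustration): the client-facing shape is declared once, and a single mapping function is the only code that knows the database column names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmView:
    """Client-facing shape (hypothetical): what the API schema promises."""
    id: int
    raised_at: str
    priority: int


def to_alarm_view(row: dict) -> AlarmView:
    """Map a joined DB row (hypothetical column names) onto the client-facing shape."""
    return AlarmView(
        id=row["alarm_id"],
        raised_at=row["raised_at"],
        priority=row["event_priority"],  # lives on the event table in the DB
    )


# A joined row as the persistence layer might return it.
row = {"alarm_id": 10, "raised_at": "2018-12-12T09:00", "event_priority": 2, "event_id": 1}
view = to_alarm_view(row)
```

If the priority column is later renamed or moved to another table, only `to_alarm_view` has to change; clients keep seeing `priority` on the alarm, which is the fire break described above.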

Why Data Models Shouldn’t Drive Object Models (And Vice Versa) - 2002

Consumer-Driven Contracts: A Service Evolution Pattern

OvermindDL1 · December 12, 2018, 5:46pm

Always be happy when deleting code, that means there is less to maintain! ^.^

mmport80 · December 12, 2018, 6:14pm

I think I jumped the gun, but it just might suit my particular use case. Have to experiment tomorrow…

mmport80 · December 12, 2018, 6:27pm

Yeah, I ask myself too whether a normalised GraphQL API is the way to go.

But, I suppose:

with data transfer (e.g. mobile) normalisation helps nudge FE devs to grab only what they need
helps with consistency of API
one source of truth
it feels natural with GraphQL (I’d love to see a survey on how others approach this though!)

But the joining of data does jar (still)…

I suppose GraphQL still has a way to go…

peerreynders · December 12, 2018, 6:45pm

If you google around you’ll see the notion of GraphQL Schema != DB Schema appear repeatedly (example).

So the fact you are joining data shouldn’t jar you - the fact it jars you should be of concern because it suggests that you believe that the DB schema is the ultimate, pristine representation of your information.

It’s not.

It’s simply a relational representation (hopefully) optimized for the use of the RDBMS engine.

Bonus: Forget About Data, Know Your Domain!

This one is a bit less practical but I still believe it’s one of the most important things we need to care about when building any API that will last and will be great to use by our integrators. When designing the shape of your GraphQL schema, try to truly understand what you’re trying to model, and understand your domain the best you can.

With GraphQL having a type system, we see a lot of tools appearing these days that try to generate GraphQL types from databases, ActiveRecord models, or a REST API. While this is tempting to use, and definitely useful at times, by copying our data model or an existing API, we forget that GraphQL lets us really shape the interface we want to our domain. Try to use that power instead of shaping your API using your data’s shape as inspiration (Avoid Anemic GraphQL).

By doing that, implementation details can change but your API should stay stable(r) as long as we modeled our domain correctly!

GraphQL Schema Design: Building Evolvable Schemas

suazi · December 16, 2018, 2:17pm

@mmport80 Sounds like you’re writing an IDS in elixir?

mmport80 · December 17, 2018, 12:31pm

what’s an IDS? : )

mmport80 · December 17, 2018, 12:41pm

The examples in the link remind me of how I approached things a year or more ago.

Of course they make sense - until you get some better intuition for GraphQL - and the blog is 100% correct, that is what you should avoid.

The problematic difference in semantics boils down to “joining”.

A) In SQL joining is like a flatmap / monadic kind of thing. Everything stays table-like.

B) In GQL, ‘joining’ isn’t really joining at all. It’s traversal and returning back a hierarchy of JSON objects…
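The difference between A) and B) can be sketched in a few lines of plain Python. The data and field names are made up; the point is only that a flat join can be sorted globally by any column, while a traversal orders parents first and children only within each parent.

```python
# Hypothetical data: two events and three alarms referencing them.
events = [{"id": 1, "priority": 2}, {"id": 2, "priority": 1}]
alarms = [
    {"id": 10, "event_id": 1, "raised_at": "09:00"},
    {"id": 11, "event_id": 2, "raised_at": "09:05"},
    {"id": 12, "event_id": 1, "raised_at": "09:10"},
]


def sql_style_join(events, alarms):
    """A) Join as a flat-map: every (event, alarm) match becomes one flat row."""
    return [
        {"alarm_id": a["id"], "raised_at": a["raised_at"], "priority": e["priority"]}
        for e in events
        for a in alarms
        if a["event_id"] == e["id"]
    ]


def graphql_style_traversal(events, alarms):
    """B) Traversal: events ordered by priority, each with its own sorted alarms."""
    return [
        {
            **e,
            "alarms": sorted(
                (a for a in alarms if a["event_id"] == e["id"]),
                key=lambda a: a["raised_at"],
            ),
        }
        for e in sorted(events, key=lambda e: e["priority"])
    ]


# Flat rows can be sorted globally by an alarm field...
flat = sorted(sql_style_join(events, alarms), key=lambda r: r["raised_at"])
flat_order = [r["alarm_id"] for r in flat]  # [10, 11, 12]

# ...while the hierarchy keeps alarms grouped under their event.
tree = graphql_style_traversal(events, alarms)
tree_order = [a["id"] for e in tree for a in e["alarms"]]  # [11, 10, 12]
```

The two orders differ, which is why a sort that mixes alarm and event fields is easy to state against the flat table and awkward to state against the nested shape.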

As for Domain Driven Development, I have been thinking recently about balancing polymorphism with this…

You want to strike a balance between coupling too tightly with specific uses of the API (a la REST) and too closely to the Database (as you pointed out)…

andre1sk · December 17, 2018, 1:46pm

Intrusion detection system

peerreynders · December 17, 2018, 4:43pm

The problematic difference in semantics boils down to “joining”.

The “joining” I’m referring to is purely in the service of data transformation or more specifically data mapping. My starting point:

The DB Schema and GraphQL Schema are distinct representations even if to some degree they contain similar data.
The DB Schema is optimized for the application or service that it is supporting.
The GraphQL Schema is optimized for the client(s) that it is supporting.
When moving between these separate representations data mapping has to be performed.

While tools can certainly algorithmically transform a DB Schema into a functionally equivalent GraphQL schema, the resulting schema will be neither decoupled nor shaped with the client’s needs in mind.

So a client-oriented GraphQL schema will always be more work because it has to be independently designed based on client needs, which means that at the very least the data mappings have to be defined manually and at worst those mappings have to be implemented manually.

I’m still under the impression that you are in pursuit of some all encompassing, canonical, harmonized data schema.

SOA in practice (2007) - 4.2.2 Heterogeneous Data Types p.38:

(the whole section can be viewed via books.google)

Even when it comes to DDD bounded contexts there is duplication of representations to keep the clients of the context decoupled from the context’s internal implementation. For example the sales and support contexts will have separate and distinct internal representations of customer which themselves still differ from the customer representation that is being exchanged between the two (and unifying these representations polymorphically (or worse through inheritance) could lead to undesirable levels of coupling). This flies in the face of DRY, but when it comes to distributed systems, dogmatic DRY can lead to tight coupling.