GraphQL Abstraction Breakdown?

Hi All,

I know there are a lot of smart GraphQL users here, so I thought I’d ask…

We have an API, and I stumbled across an issue which doesn’t sit right with me.

There are two “endpoints”, alarms and events. Each alarm belongs to an event. E.g.

query alarms { …, event { … } }

You get the idea.

We have a lot of alarms, so we also use Relay pagination.

We want to sort and filter alarms.

All good.

Thing is, sometimes we want to sort or filter alarms using the associated underlying event data.

What I have done, is

  1. add filter / sort arguments to the alarms connection query, which refer to the event data.

E.g. I sort alarms by the ‘priority’ field which is found in an event.

Behind the scenes, I use a SQL join to implement.
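
For readers who want the join made concrete, here is a minimal sketch of that approach. It uses Python's built-in `sqlite3` rather than the Ecto code from this thread, purely for illustration. Only the `alarms` table, its `event_id` foreign key and the event's `priority` field come from the discussion; the other columns, the sample rows, the helper name `alarms_sorted_by_event_priority`, and the convention that a lower number means higher priority are all assumptions.

```python
import sqlite3

# In-memory database with a hypothetical, minimal schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT, priority INTEGER);
CREATE TABLE alarms (
    id INTEGER PRIMARY KEY,
    event_id INTEGER REFERENCES events(id),
    triggered_at TEXT
);
INSERT INTO events VALUES (1, 'door_open', 2), (2, 'fire', 1);
INSERT INTO alarms VALUES
    (10, 1, '2020-01-01'), (11, 2, '2020-01-02'), (12, 1, '2020-01-03');
""")


def alarms_sorted_by_event_priority(conn, max_priority=None):
    """Return alarm rows ordered by a field that lives on the joined event.

    max_priority is an optional filter on the event's priority
    (assumed convention: 1 is the most urgent).
    """
    sql = """
        SELECT a.id, a.triggered_at, e.name, e.priority
        FROM alarms AS a
        JOIN events AS e ON e.id = a.event_id
        WHERE (:max_priority IS NULL OR e.priority <= :max_priority)
        ORDER BY e.priority ASC, a.id ASC
    """
    return conn.execute(sql, {"max_priority": max_priority}).fetchall()


all_rows = alarms_sorted_by_event_priority(conn)
urgent_rows = alarms_sorted_by_event_priority(conn, max_priority=1)
```

The sort and filter arguments exposed on the alarms connection would simply feed the `ORDER BY` and `WHERE` clauses of a query shaped like this one.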

While this works, it doesn’t seem to be a ‘natural fit’.

I.e. I am mixing ‘event’ related functionality with alarm functionality.

Perhaps I should:

  1. create a new “event-alarm” endpoint, which works more like a Rest endpoint, i.e. it is more monolithic, less ‘normalised’

  2. Another idea is to add another layer of abstraction, a la Relay, but Relay is complicated enough…

Any other ideas? Neither solution really feels v idiomatic!


In the spirit of ubiquitous language you need to first give a precise definition of what you understand alarm and event to be. Somebody else may have a different perspective of what those are - and next thing there’ll be a conversation using the same words but where the participants are just talking past one another.

To the point:

Each alarm belongs to an event.

I found that statement surprising.

In my mind you set an alarm and events capture information whenever those settings are being met (i.e. the alarm is being triggered) - so I would say events belong to alarms (or more precisely the settings of the alarm at the time of the creation of the event).


Correct - the nomenclature is a little misleading.

In the database - we have a description of each possible “event”.

Then when an alarm is triggered, it is of type “event” - i.e. it has an event_id foreign key in the alarms table, etc…

While you are asking a GraphQL question (a domain in which I claim less than little expertise), I’m sensing the possibility that you may have some inconsistencies at the detail level of the underlying model that are manifesting themselves when you try to formulate a higher-level abstraction.

Right now those details are clear as mud.

So far it seems that an alarm is an instance of an event or possibly an event is part of an alarm.

I am mixing ‘event’ related functionality with alarm functionality.

By your description both concepts are coupled.

  • Is it that all alarms are events but not all events are alarms OR
  • An event is part-of an alarm - if so what are alarm’s non-event parts (if any)?

create a new “event-alarm” endpoint, which works more like a Rest endpoint, i.e. it is more monolithic, less ‘normalised’

Is there any value to exposing your notion of an “event” via the GraphQL interface? You may have good reason to only expose an API to a concrete and segregated set of event types, alarm being one of them without ever exposing a generic event interface.

Then there is the separate decision of representation, e.g. does alarm have an embedded “event” field or do you denormalize the event data into the general alarm data.


About Face 3: The Essentials of Interaction Design (2007), p.31

Your database tables are simply a projection of the “domain model” into the relational realm, possibly adjusted to accommodate non-functional requirements (e.g. performance with reference to the queries and updates being issued). The design of your GraphQL interface should likely be guided by the “domain model” - not the data model currently implemented in your database.


Maybe I’m missing the point here but if you want to filter alarms based on events then why not filter events and then load the associated alarms?


Side note: I think you might also benefit from https://hexdocs.pm/absinthe/Absinthe.Resolution.Helpers.html#dataloader/1


why not filter events and then load the associated alarms?

oooh, that’s v smart…

the thing is, on the frontend we have a grid which uses both event & alarm data, joined into one table.

I should be able to sort & filter by data which belongs to either alarms or events.

so, your answer would mean querying like

query { alarms { ..., event { ... } } }

if I am filtering / sorting via data found in alarms.

and doing

query { events { ..., alarms { ... } } }

if I am doing event related things.

And then put the result of either query into the same grid (in principle the same data is returned from both queries, just in slightly different forms).

Problem is, if I am sorting / filtering with event & alarm related fields simultaneously…

All good questions.

In my mind, GraphQL suits a normalised representation of data.

And therefore fits some of our database schema well. The mapping in these cases is so natural.

But I am uneasy about this “normalisation” hypothesis.

I think the nub of my problem is: the SQL idea of “joining” data, doesn’t fit the GraphQL abstraction well.

Then again, maybe this problem just tells me that my normalisation hypothesis is wrong!

At the end of the day, I am happy with the API endpoint the front end sees; it offers the functionality to display the data with ease.

I am unhappy with how GraphQL doesn’t fit this particular case as neatly as others… Maybe this is too academic…

I just found the solution, a little more convoluted than the usual query.

But for posterity, if you need to sort / filter “joined” data in an idiomatic GraphQL way, you should probably do something /roughly/ along these lines:

query {
  ...
  eventConnection(
    orderBy: [...],
    filterBy: [...]
  ) {
    ...,
    alarmConnection(
      orderBy: [...],
      filterBy: [...]
    ) {
      event ...
    }
  }
}
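
To see what a nested query of this shape would actually return, here is a rough in-memory Python sketch of the resolver behaviour. It is not the Absinthe/Ecto implementation from this thread; the sample data, the field names `priority` and `triggered_at`, and the helpers `nested_connections` and `to_grid_rows` are all hypothetical. The point it illustrates is that the alarm ordering applies within each event, so the flattened grid ends up sorted by the event field first and the alarm field second.

```python
from operator import itemgetter

# Hypothetical sample data; field names are assumptions, not the thread's schema.
EVENTS = [
    {"id": 1, "name": "door_open", "priority": 2},
    {"id": 2, "name": "fire", "priority": 1},
]
ALARMS = [
    {"id": 10, "event_id": 1, "triggered_at": "2020-01-03"},
    {"id": 11, "event_id": 2, "triggered_at": "2020-01-02"},
    {"id": 12, "event_id": 1, "triggered_at": "2020-01-01"},
]


def nested_connections(event_order_by, alarm_order_by, alarm_filter=lambda a: True):
    """Mimic eventConnection { alarmConnection { ... } } in memory."""
    result = []
    # Outer level: order the events.
    for event in sorted(EVENTS, key=itemgetter(event_order_by)):
        # Inner level: filter and order only the alarms of this one event.
        alarms = [
            a for a in ALARMS
            if a["event_id"] == event["id"] and alarm_filter(a)
        ]
        alarms.sort(key=itemgetter(alarm_order_by))
        result.append({**event, "alarms": alarms})
    return result


def to_grid_rows(nested):
    """Flatten the hierarchy into the joined rows a frontend grid would show."""
    return [
        {"alarm_id": a["id"], "triggered_at": a["triggered_at"], "priority": e["priority"]}
        for e in nested
        for a in e["alarms"]
    ]


rows = to_grid_rows(nested_connections("priority", "triggered_at"))
```

With this data the grid shows alarms 11, 12, 10. A single global sort by `triggered_at` alone would give 12, 11, 10 instead, so the nested form fits when event fields are the primary sort key, and probably needs something else when an alarm field should win.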

Which means, I can probably rip up the Ecto related work I have done this week, and just use this query instead : )

Not sure whether I should be happy or sad.


Is this an example of the object-relational impedance mismatch, about which much has been written?

But I am uneasy about this “normalisation” hypothesis.

There are different degrees of normalization - however aspects of 3NF are often denormalized to accommodate real world constraints. That’s how some databases have a 3NF logical model but have a physical model that is partially denormalized for performance reasons.

So you have to ask yourself - does that “normalization” add value in your GraphQL model (even if there is value for it in the database)?

To answer that you have to have a clear vision of the purpose of the GraphQL model which may give it a different shape than your data model.

While it is convenient to simply expose the data model, it couples the client tightly to your data model - so whenever you change the data model you have an end to end ripple effect (consumer to implementation coupling).

By properly decoupling the GraphQL model from the data model you are establishing a fire break for that ripple effect. But that decoupling is more work as you have to implement the model transformation layer.

Why Data Models Shouldn’t Drive Object Models (And Vice Versa) - 2002



Consumer-Driven Contracts: A Service Evolution Pattern


Always be happy when deleting code, that means there is less to maintain! ^.^


I think I jumped the gun, but it just might suit my particular use case. Have to experiment tomorrow…

Yeah, I ask myself too whether a normalised GraphQL API is the way to go.

But, I suppose:

  1. with data transfer (e.g. mobile) normalisation helps nudge FE devs to grab only what they need

  2. helps with consistency of API

  3. one source of truth

  4. it feels natural with GraphQL (I’d love to see a survey on how others approach this though!)

But the joining of data does jar (still)…

I suppose GraphQL still has a way to go…

If you google around you’ll see the notion of GraphQL Schema != DB Schema appear repeatedly (example).

So the fact you are joining data shouldn’t jar you - the fact it jars you should be of concern because it suggests that you believe that the DB schema is the ultimate, pristine representation of your information.

It’s not.

It’s simply a relational representation (hopefully) optimized for the use of the RDBMS engine.


GraphQL Schema Design: Building Evolvable Schemas


@mmport80 Sounds like you’re writing an IDS in Elixir?

what’s an IDS? : )

The examples in the link remind me of how I approached things a year or more ago.

Of course they make sense - until you get some better intuition for GraphQL - and the blog is 100% correct, that is what you should avoid.

The problematic difference in semantics boils down to “joining”.

A) In SQL joining is like a flatmap / monadic kind of thing. Everything stays table-like.

B) In GQL, ‘joining’ isn’t really joining at all. It’s traversal and returning back a hierarchy of JSON objects…
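
A small Python sketch may make the contrast between A) and B) easier to see. The data and field names are made up for illustration. The first expression behaves like a SQL join (a flat-map whose result is still a flat table), the second like a GraphQL traversal (each alarm carries its event as a nested object).

```python
# Hypothetical sample data; not the schema from this thread.
events = [{"id": 1, "priority": 2}, {"id": 2, "priority": 1}]
alarms = [
    {"id": 10, "event_id": 1},
    {"id": 11, "event_id": 2},
    {"id": 12, "event_id": 1},
]

# A) SQL-style join: flat-map over both collections; every result is a flat row
#    of (alarm id, event id, event priority).
joined = [
    (a["id"], e["id"], e["priority"])
    for a in alarms
    for e in events
    if a["event_id"] == e["id"]
]

# B) GraphQL-style traversal: follow the edge from each alarm to its event and
#    return a hierarchy of objects instead of rows.
events_by_id = {e["id"]: e for e in events}
traversed = [{"id": a["id"], "event": events_by_id[a["event_id"]]} for a in alarms]
```

Sorting `joined` by the priority column is a one-liner because every row has it; sorting `traversed` by priority means reaching into the nested `event`, which is roughly where the awkwardness discussed here comes from.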

As for Domain Driven Development, I have been thinking recently about balancing polymorphism with this…

You want to strike a balance between coupling too tightly with specific uses of the API (a la REST) and too closely to the Database (as you pointed out)…

Intrusion detection system


The problematic difference in semantics boils down to “joining”.

The “joining” I’m referring to is purely in the service of data transformation or more specifically data mapping. My starting point:

  • The DB Schema and GraphQL Schema are distinct representations even if to some degree they contain similar data.
  • The DB Schema is optimized for the application or service that it is supporting.
  • The GraphQL Schema is optimized for the client(s) that it is supporting.
  • When moving between these separate representations data mapping has to be performed.
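
As a rough illustration of that last point, a data mapping layer can be as small as one function that knows both shapes. This is only a sketch in Python: the class names `AlarmRow` and `AlarmView`, their fields, and the `alarm:` id prefix are all hypothetical, and the choice to denormalise the event's priority into the alarm is just one of the representation options mentioned earlier in the thread.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AlarmRow:
    """Shape of a joined database row (hypothetical column names)."""
    alarm_id: int
    event_id: int
    event_priority: int
    triggered_at: str


@dataclass(frozen=True)
class AlarmView:
    """Client-facing shape; the event's priority is folded into the alarm."""
    id: str
    priority: int
    triggeredAt: str


def to_view(row: AlarmRow) -> AlarmView:
    # The only place that knows both representations. If the tables change,
    # this function changes and the client-facing shape can stay the same.
    return AlarmView(
        id=f"alarm:{row.alarm_id}",
        priority=row.event_priority,
        triggeredAt=row.triggered_at,
    )


view = to_view(AlarmRow(alarm_id=10, event_id=1, event_priority=2, triggered_at="2020-01-01"))
payload = asdict(view)
```

The extra function is the "more work" mentioned above; what it buys is that clients never see `event_id` or the join at all.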

While tools can certainly algorithmically transform a DB Schema into a functionally equivalent GraphQL schema, the resulting schema will be neither decoupled nor shaped with the client’s needs in mind.

So a client-oriented GraphQL schema will always be more work because it has to be independently designed based on client needs, which means that at the very least the data mappings have to be defined manually and at worst those mappings have to be implemented manually.

I’m still under the impression that you are in pursuit of some all encompassing, canonical, harmonized data schema.

SOA in practice (2007) - 4.2.2 Heterogeneous Data Types p.38:

(the whole section can be viewed via books.google)

Even when it comes to DDD bounded contexts there is duplication of representations to keep the clients of the context decoupled from the context’s internal implementation. For example, the sales and support contexts will have separate and distinct internal representations of customer, which themselves still differ from the customer representation that is being exchanged between the two (and unifying these representations polymorphically, or worse through inheritance, could lead to undesirable levels of coupling). This flies in the face of DRY, but when it comes to distributed systems, dogmatic DRY can lead to tight coupling.
