A good way to design a GraphQL schema for a heterogeneous entity?

andre1sk · February 4, 2018, 4:14am

Being a GraphQL noob, I need to come up with a way to represent “msgs” from various sources (FB, Twitter, etc.) as a single entity. There are obviously shared properties across all types, but each type can have specific properties. Could anyone with more experience recommend a good approach to representing this via GraphQL?

benwilson512 · February 4, 2018, 4:08pm

Hey @andre1sk! This seems like a good opportunity for an interface. An interface will let you express fields that are common to each message type, while still making it possible to request specifics about each message if desired.

An interface is like an Elixir Behaviour or a Java interface: it defines a contract that each object adhering to the interface has to fulfil.

Here’s a vague sketch of what your schema might look like:

@desc """
Core characteristics of a message
"""
interface :message do
  field :body, :string
  field :posted_at, :datetime
  field :posted_by, :user
end

object :facebook_message do
  interface :message
  # fields common to all messages
  field :body, :string
  field :posted_at, :datetime
  field :posted_by, :user, resolve: dataloader(Facebook, :user)

  # fields specific to facebook
  field :like_count, :integer
end

object :twitter_message do
  interface :message
  # fields common to all messages
  field :body, :string
  field :posted_at, :datetime
  field :posted_by, :user, resolve: dataloader(Twitter, :user)

  # fields specific to twitter
  field :retweet_count, :integer
end
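One thing the sketch above leaves out: for an interface, Absinthe also needs a way to decide which concrete object a given value is, typically a resolve_type function on the interface. A minimal sketch of such a function in plain Elixir follows. The FacebookPost and Tweet structs and the MessageTypes module are hypothetical stand-ins for whatever your data layer actually returns.

```elixir
# Hypothetical structs standing in for the values returned by the data layer.
defmodule FacebookPost do
  defstruct [:body, :posted_to_timeline_at, :like_count]
end

defmodule Tweet do
  defstruct [:body, :tweeted_at, :retweet_count]
end

defmodule MessageTypes do
  # Maps a runtime value to the identifier of its concrete GraphQL object.
  # The second argument is the resolution info Absinthe passes in; unused here.
  def resolve_type(%FacebookPost{}, _info), do: :facebook_message
  def resolve_type(%Tweet{}, _info), do: :twitter_message
end

# Inside the schema this would be wired up roughly as:
#
#   interface :message do
#     field :body, :string
#     field :posted_at, :datetime
#     field :posted_by, :user
#     resolve_type &MessageTypes.resolve_type/2
#   end
```

Pattern matching on the struct keeps the type decision in one place, so adding a new source means adding one clause and one object.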

Then you might have a root messages field that returns a list of messages, each of which may be of either type:

query do
  field :messages, list_of(:message) do
    resolve fn _, _, _ ->
      {:ok, Messages.get()}
    end
  end
end

If someone wants to query your list of messages and they don’t care about any of the specifics they can just do:

{
  messages {
    body
    postedBy { name }
  }
}

If you also want some stuff specific to each type you can do:

{
  messages {
    __typename
    body
    postedBy { name }
    ... on FacebookMessage { likeCount }
    ... on TwitterMessage { retweetCount }
  }
}

Note the use of the __typename introspection field so that each message will also come back annotated with its concrete type, so you can handle them as desired.
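On the consuming side, handling each message by its concrete type could look something like the sketch below. It assumes the response has already been decoded into maps with string keys. The MessageSummary module and the sample data are made up for illustration.

```elixir
defmodule MessageSummary do
  # Each clause matches on the "__typename" string returned by the server.
  def describe(%{"__typename" => "FacebookMessage", "body" => body, "likeCount" => likes}) do
    "#{body} (#{likes} likes)"
  end

  def describe(%{"__typename" => "TwitterMessage", "body" => body, "retweetCount" => retweets}) do
    "#{body} (#{retweets} retweets)"
  end

  # Fallback for message types this client does not know about:
  # rely only on the fields shared through the interface.
  def describe(%{"body" => body}), do: body
end

# Hypothetical decoded response data for the query above.
messages = [
  %{"__typename" => "FacebookMessage", "body" => "hello", "likeCount" => 3},
  %{"__typename" => "TwitterMessage", "body" => "hi", "retweetCount" => 7},
  %{"__typename" => "SomeNewMessage", "body" => "hey"}
]

Enum.each(messages, fn message -> IO.puts(MessageSummary.describe(message)) end)
```

The fallback clause is the design choice worth noting: a client written this way keeps working if the server later adds another type that implements the interface.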

Does this help give you a high level direction?

peerreynders · February 4, 2018, 7:00pm

(Just out of curiosity)

At this point I know next to nothing about GraphQL schemas but I also noticed that it supports unions. With interface the polymorphic path seemed the most obvious solution - but I also remembered Commonality/Variability analysis and the option to find what varies and encapsulate it.

So there is also the possibility to define service specific objects (e.g. :facebook_info with a :like_count field; :twitter_info with a :retweet_count field) and union them into a :service_info_type and have a :service_info field in the :message object.

What are the potential problems with that approach?
(It seems more “compositional”).

The more I think about it, the more I wonder about the use of the term polymorphism in the context of GraphQL. The LSP is about behavioural subtyping, but we’re dealing with data here; there is no behaviour. So is this just another form of data inheritance hiding behind the more positive “polymorphism” moniker?

andre1sk · February 4, 2018, 7:06pm

@benwilson512 thank you so much for the very detailed answer, this looks like a perfect approach to this problem. BTW, I’m really looking forward to the release of the paper version of Craft GraphQL APIs in Elixir with Absinthe. The Elixir community is very lucky to have such dedicated and knowledgeable contributors!

benwilson512 · February 4, 2018, 7:57pm

Unions are indeed the other “abstract type” as GraphQL calls them, and your hypothetical schema structure would totally work. To close the loop, here’s what the GraphQL queries would look like:

{
  messages {
    body
    postedAt
    serviceInfo {
      ... on FacebookInfo { likeCount }
      ... on TwitterInfo { retweetCount }
    }
  }
}

Notably, the ... on clauses for serviceInfo are not optional. You must always specify the concrete type when working with unions.

Whether or not there is behaviour happening here is an interesting question. Let’s consider the interface option again, and something like the postedAt field. The thing to keep in mind for GraphQL is that every field is responsible for running whatever code is necessary to produce the value for that field, and the code for doing this is called the “resolver function”. Like a Java interface (or what I remember of Java interfaces) the GraphQL interface does not constitute an implementation, it lacks any resolver function.

It’s up to the fields on the facebook_message and twitter_message objects to actually accomplish something. By default a field’s resolver just does a Map.get in Elixir, but let’s flesh that out a bit so we can see a difference.

object :facebook_message do
  interface :message
  field :posted_at, :datetime do
    resolve fn facebook_message_struct, _, _ ->
      {:ok, facebook_message_struct.posted_to_timeline_at}
    end
  end
  # other fields
end

object :twitter_message do
  interface :message
  field :posted_at, :datetime do
    resolve fn twitter_message_struct, _, _ ->
      {:ok, twitter_message_struct.tweeted_at}
    end
  end
  # other fields
end

The main thing to note here is that our hypothetical facebook_message_struct and twitter_message_struct values have different internal Elixir structures, and consequently we need slightly different code to get a value for what we’re calling the posted_at field. Thus when you run a GraphQL query and do

{
  messages { postedAt }
}

Absinthe is going to run one resolver function or the other depending on the actual concrete type of any given message. To me this seems like it qualifies as real polymorphism, since the functions that sit on each field meaningfully constitute behaviour.
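Stripped of Absinthe, the dispatch I’m describing amounts to something like the plain Elixir below. This is only a mental model, not how Absinthe is implemented. The FacebookPost and Tweet structs and the PostedAt module are hypothetical.

```elixir
# Hypothetical structs with differently named timestamp fields.
defmodule FacebookPost do
  defstruct [:posted_to_timeline_at]
end

defmodule Tweet do
  defstruct [:tweeted_at]
end

defmodule PostedAt do
  # One logical field, two code paths chosen by the concrete type of the value.
  def resolve(%FacebookPost{posted_to_timeline_at: at}), do: {:ok, at}
  def resolve(%Tweet{tweeted_at: at}), do: {:ok, at}
end

values = [
  %FacebookPost{posted_to_timeline_at: ~U[2018-02-04 16:08:00Z]},
  %Tweet{tweeted_at: ~U[2018-02-04 19:00:00Z]}
]

# The caller asks for "posted at" without knowing which clause will run.
Enum.each(values, fn value -> IO.inspect(PostedAt.resolve(value)) end)
```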

Thoughts?

Thanks! We’re looking forward to it being out as well, I’m very happy with where we’ve gotten with it. I think it’ll be a really great help for folks looking to explore what I think is an increasingly central technology.

peerreynders · February 4, 2018, 10:51pm

I would classify that as a major limitation for unions. It seems that:

A client query can’t express “I don’t care about the Twitter info (or any other for that matter) but I do want the Facebook info”.
A client would likely be negatively impacted if the server added a ThirdInfo type to the ServiceInfoType.

Thoughts?

As far as I understand it, your description focuses largely on the implementation details of the resolvers, the adapters that map the physical data structures to the logical model expressed by the interface specification. I personally wouldn’t classify that as behaviour in the OO sense, as ultimately the client is only served structured data, not capability that is collocated with that data.

I guess in the context of GraphQL “polymorphism” refers to structural subtyping.

And it also seems that with interfaces a client can be more selective as to which type implementations it couples itself to (e.g. narrow on FacebookMessage (to grab the likeCount) while only using the Message interface for all the other possible types). With unions it’s coupled to all existing union types even if it is only interested in some.

Thank You for indulging my curiosity.

benwilson512 · February 4, 2018, 11:36pm

I think I worded this badly. By “always specify the concrete type” I do not mean that you must always exhaustively list all concrete types. I simply mean that if foo is a union field you can’t just write foo { bar }; you need to write something like foo { ... on A { bar } }, where A is some type with a bar field.

Interfaces let you ask for certain fields without caring about which sub-type the value of a field is. In order to get any child fields out of a union field you need to specify the variant you care about, but you do not need to specify all variants. This addresses at least one of the concerns you mentioned: if clients had to list every variant, adding a type to the union would break all existing clients that use that field.
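To make that concrete from the client’s side: a query such as { messages { body serviceInfo { ... on FacebookInfo { likeCount } } } } names only one variant, and any serviceInfo of another variant should simply come back as an empty object. A minimal sketch of consuming such a response, assuming it has been decoded into maps with string keys (the Likes module and the sample data are hypothetical):

```elixir
defmodule Likes do
  # Returns {body, like_count} pairs for messages whose serviceInfo carried
  # a likeCount. Variants the query did not select arrive as empty maps,
  # fail the pattern in the generator, and are skipped.
  def facebook_likes(messages) do
    for %{"body" => body, "serviceInfo" => %{"likeCount" => likes}} <- messages do
      {body, likes}
    end
  end
end

# Hypothetical decoded response data.
messages = [
  %{"body" => "hello", "serviceInfo" => %{"likeCount" => 3}},
  %{"body" => "hi", "serviceInfo" => %{}}
]

IO.inspect(Likes.facebook_likes(messages))
```

A client written like this is unaffected when the server adds a further variant to the union.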

Fair, although when I’m talking about GraphQL I’m referring to the entire specification, which encompasses not just the document format but also an execution model. Really it’s the output format that’s the implementation detail: you can have it as JSON or protobuf or whatever. I would think of GraphQL less as a data transform mapping an input string to a JSON blob and more as a full query language with its own specified semantics. In this view, a client sending { messages { body ... on FacebookMessage { likeCount } } } meaningfully counts, I think, as the client choosing to run behaviour that walks it to a polymorphic object and then runs code to get results depending on the type.

However, I’ll be the first to say that my formal education here is limited, and I could be entirely wrong.

peerreynders · February 5, 2018, 4:31pm

I’m sure that your account is in line with the specification and Facebook’s wording - I just wish these specifications came with glossaries that you could reference, i.e. ‘what do you exactly mean by “X”?’

The October 2016 spec does try to explain what an object is:

GraphQL Objects represent a list of named fields, each of which yield a value of a specific type. Object values should be serialized as ordered maps, where the queried field names (or aliases) are the keys and the result of evaluating the field is the value, ordered by the order in which they appear in the query.

… though it doesn’t seem to use the word polymorphism. It’s merely alluded to under interface:

GraphQL objects can then implement an interface