I’m currently working on an application for work that fits within our continuous delivery cycle. The application uses GitHub extensively: it receives webhooks from GitHub and GETs and POSTs data to GitHub’s API.
At present, we are working with GitHub data as string-keyed maps directly decoded from the received payload (e.g. Poison.decode!(github_payload)
). I’m currently thinking through the pros and cons of mapping that data to structs for representing the data internally in the application. I’m weighing the option of using Ecto.Schema
to define these structs and nested relationships. At present, none of this data is persisted on our end.
As one example, when the “status” of a commit changes in GitHub, our application receives a StatusEvent
from GitHub’s webhook. The code below is an abbreviated example of using Ecto to map this data to structs:
defmodule Example.Github.StatusPayload do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key false
  embedded_schema do
    field :sha, :string
    field :description, :string
    field :state, :string
    embeds_one :commit, Example.Github.Commit
    embeds_many :branches, Example.Github.Branch
    embeds_one :repository, Example.Github.Repository
  end

  def from_json(data) when is_binary(data) do
    data |> Poison.decode!() |> from_json()
  end

  def from_json(data) when is_map(data) do
    %__MODULE__{}
    |> cast(data, [:sha, :description, :state])
    |> cast_embed(:commit)
    |> cast_embed(:branches)
    |> cast_embed(:repository)
    |> apply_changes()
  end
end
defmodule Example.Github.Commit do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key false
  embedded_schema do
    field :sha, :string
    field :url, :string
  end

  def changeset(struct, data) do
    cast(struct, data, [:sha, :url])
  end
end
# etc.
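For completeness, the elided `Branch` and `Repository` modules would follow the same pattern. Here is a sketch of what they might look like, based only on what the test below asserts (`%Branch{commit: %Commit{}}`); the exact field lists are assumptions, not our real code:

```elixir
defmodule Example.Github.Branch do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key false
  embedded_schema do
    # Field names here are illustrative guesses at the webhook payload shape.
    field :name, :string
    embeds_one :commit, Example.Github.Commit
  end

  # cast_embed/2 in the parent schema invokes this changeset/2 by default.
  def changeset(struct, data) do
    struct
    |> cast(data, [:name])
    |> cast_embed(:commit)
  end
end

defmodule Example.Github.Repository do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key false
  embedded_schema do
    field :name, :string
    field :full_name, :string
  end

  def changeset(struct, data) do
    cast(struct, data, [:name, :full_name])
  end
end
```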
Here is a test to show how this might be used in the application.
defmodule Example.Github.StatusPayloadTest do
  use ExUnit.Case

  alias Example.Github.{StatusPayload, Commit, Repository, Branch}

  setup do
    json_file = "test/support/fixtures/status_event.json"
    binary = File.read!(json_file)
    data = Poison.decode!(binary)
    {:ok, data: data, binary: binary}
  end

  test "parsing to structs from map", ctx do
    payload = StatusPayload.from_json(ctx.data)
    assert_payload_parsed_to_structs(payload)
    assert_branches_parsed_nested_structs(payload.branches)
  end

  def assert_payload_parsed_to_structs(payload) do
    assert %{
             commit: %Commit{},
             repository: %Repository{},
             branches: [_branch | _other]
           } = payload
  end

  def assert_branches_parsed_nested_structs(branches) do
    for branch <- branches do
      assert %Branch{commit: %Commit{}} = branch
    end
  end
end
My question: is this overkill for working with external JSON data? Each time I go down this route and begin setting up all the schema code for the various GitHub resources we use, I feel like I’m overdoing it. On the other hand, I can see advantages to having the data represented this way. What do you think?