Hey y’all, so I’m working on a project that takes in Ecto schemas and generates JSONSchemas for them. Part of the project requires that the user be able to document descriptions on the Ecto fields so I can put them in the JSONSchema.
I’ve done some spelunking and I’m stuck. I was wondering if y’all could point me in the right direction.
I was thinking of leveraging something like module docs, or somehow retrieving the extra field opts at runtime. I’ve looked at the Ecto.Schema.field/3 source code and can’t find any leads there. Similarly, it seems like the @doc attribute can only be attached to module-level definitions such as functions and macros, not to individual fields.
Ideally I’d like to provide a code experience something like this:
defmodule SpamPrediction do
  use Ecto.Schema

  @primary_key false
  schema "predictions" do
    @doc "I think ideally i'd put field descriptions as @doc attributes"
    field(:class, Ecto.Enum, values: [:spam, :not_spam])
    field(:reason, :string, description: "But I could tolerate field descriptions here")
    field(:confidence_score, :float)
  end
end
Any ideas? Maybe there’s a better approach that’s easier to implement?
From the Ecto.Schema docs I don’t see a great way to do something like this.
If you wanted a solution that would get you close for not a whole lot of effort, you could create a module attribute for field docs that accumulates across the module, then do with the collected values what you will in a @before_compile callback. To keep this simple you’d have to include the field names in the module attribute to link each description to its field without messing with the calls to field/3.
defmodule SpamPrediction do
  use Ecto.Schema
  @before_compile MyLib

  @primary_key false
  schema "predictions" do
    @field_doc class: "I think ideally i'd put field descriptions as @doc attributes"
    field(:class, Ecto.Enum, values: [:spam, :not_spam])
    @field_doc reason: "But I could tolerate field descriptions here"
    field(:reason, :string)
    field(:confidence_score, :float)
  end
end
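To make the accumulate-then-collect idea concrete, here is a minimal sketch of what the MyLib side could look like. FieldDocs and __field_docs__/0 are hypothetical names made up for illustration, and the demo module leaves out Ecto entirely so the snippet runs without dependencies; it assumes @field_doc behaves the same way when set inside a schema block. One detail the example above glosses over: the attribute has to be registered with accumulate: true, otherwise each @field_doc overwrites the previous one.

```elixir
defmodule FieldDocs do
  # Hypothetical helper standing in for `MyLib`; not part of Ecto.
  defmacro __using__(_opts) do
    quote do
      # With `accumulate: true`, every `@field_doc ...` is appended to a list
      # instead of replacing the previous value.
      Module.register_attribute(__MODULE__, :field_doc, accumulate: true)
      @before_compile FieldDocs
    end
  end

  defmacro __before_compile__(env) do
    # Each entry is a keyword list such as [class: "..."], so the accumulated
    # value is a list of keyword lists; flatten it into a single map.
    docs =
      env.module
      |> Module.get_attribute(:field_doc)
      |> Enum.concat()
      |> Map.new()

    quote do
      # Expose the collected descriptions at runtime.
      def __field_docs__, do: unquote(Macro.escape(docs))
    end
  end
end

defmodule FieldDocsDemo do
  use FieldDocs

  # In a real schema these would sit next to the matching field/3 calls.
  @field_doc class: "Whether or not the email is spam"
  @field_doc reason: "A short rationale for the classification"
end
```

A JSONSchema generator could then call FieldDocsDemo.__field_docs__() and look up each field name when emitting that property's description.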
Killing the need to include the field name in the module attribute would be a bit harder. You could potentially create your own field/3, belongs_to/3, has_many/3, etc. macros that take the current module attribute, let’s again call it @field_doc, and translate it into an accumulating module attribute, let’s call it @field_doc_acc, before falling back to the wrapped Ecto.Schema macro.
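A rough sketch of that attribute hand-off, with loud caveats: DescribedFields, described_field and __field_docs__/0 are hypothetical names, and the macro deliberately uses a different name than field/3 and does not call into Ecto, so the snippet stays dependency-free. A real wrapper would delegate to Ecto.Schema.field/3 at the marked spot and would have to deal with the import clash if it reused the name field.

```elixir
defmodule DescribedFields do
  # Hypothetical wrapper module; nothing here is part of Ecto.
  defmacro __using__(_opts) do
    quote do
      import DescribedFields,
        only: [described_field: 1, described_field: 2, described_field: 3]

      Module.register_attribute(__MODULE__, :field_doc_acc, accumulate: true)
      @before_compile DescribedFields
    end
  end

  defmacro described_field(name, _type \\ :string, _opts \\ []) do
    quote do
      # Pick up a pending @field_doc, if one was set just before this call.
      doc = Module.get_attribute(__MODULE__, :field_doc, nil)

      if doc do
        # Pair the description with the field name, then clear the pending
        # attribute so it does not leak onto the next field.
        Module.put_attribute(__MODULE__, :field_doc_acc, {unquote(name), doc})
        Module.delete_attribute(__MODULE__, :field_doc)
      end

      # A real version would fall back to the wrapped macro here, e.g.
      # Ecto.Schema.field(name, type, opts) with the macro's arguments.
    end
  end

  defmacro __before_compile__(env) do
    docs = env.module |> Module.get_attribute(:field_doc_acc) |> Map.new()

    quote do
      def __field_docs__, do: unquote(Macro.escape(docs))
    end
  end
end

defmodule DescribedFieldsDemo do
  use DescribedFields

  @field_doc "Whether or not the email is spam"
  described_field(:class, :string)

  # No @field_doc before this one, so it gets no description.
  described_field(:confidence_score, :float)

  @field_doc "A short rationale for the classification"
  described_field(:reason, :string)
end
```

The trade-off is that every Ecto macro you want to document (belongs_to/3, has_many/3, and so on) needs its own wrapper, which is why the explicit class: "..." form is the lower-effort option.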
These are great suggestions. In my particular use case I found an easier way to go about it.
I’m working on instructor_ex (thmsmlr/instructor_ex on GitHub), a library for structured outputs for LLMs in Elixir, and the JSONSchema is getting sent to an LLM. What I realized is that the LLM should be resilient to whether the description is at the schema level or the field level. So I decided to just do something like this:
defmodule SpamPrediction do
  use Ecto.Schema
  use Instructor.Validator

  @doc """
  ## Field Descriptions:
  - class: Whether or not the email is spam
  - reason: A short, less than 10 word rationalization for the classification
  - score: A confidence score between 0.0 and 1.0 for the classification
  """
  @primary_key false
  embedded_schema do
    field(:class, Ecto.Enum, values: [:spam, :not_spam])
    field(:reason, :string)
    field(:score, :float)
  end
end
In some sense, for my use case, this is even more flexible because you can write whatever you want about the semantics of this schema in the schema-level @doc and the AI can make the appropriate associations, since you know… it’s AI or whatnot.
If I come back to this under other circumstances I’ll definitely explore these solutions. Thanks y’all for your help!