How to read column from calculated index after write?

voger · September 13, 2017, 8:32pm

I have a schema named UserProfile. That schema has a field :birth_date which is a date. That schema has also a field named :zodiac which is calculated by :birth_date.

  schema "user_profiles" do
    # Other fields
    field :birth_date, Timex.Ecto.Date
    field :zodiac, :string

To keep things as up to date and consistent as possible I have set a index on expression in the database to calculate the :zodiac value. So the :zodiac is calculated on the database itself, not in elixir code.

execute("""
CREATE OR REPLACE FUNCTION zodiac(user_profiles)
    RETURNS text AS $$
    SELECT
    CASE
    WHEN (date_part('month', $1.birth_date) = 3 AND date_part('day', $1.birth_date) >= 21) OR (date_part('month', $1.birth_date) = 4 AND date_part('day', $1.birth_date) <= 19) THEN 'ari'
    --- ... More conditions
    END
    $$ STABLE LANGUAGE SQL;
    """)
    execute("CREATE INDEX zodiac_index ON user_profiles(zodiac(user_profiles));")

But now when I create a user_profile the returned struct still has the :zodiac set to nil.

test "list_users/1 returns all users" do
    users = for _ <- 1..1 do
      fixture(:user)
    end
    assert Accounts.list_users() |> Repo.preload(:user_profile) == users
  end

fails with

     left:  [%Mysite.Accounts.User{
             #... 
             user_profile: %Mysite.Accounts.UserProfile{
                      #...
                      zodiac: "tau"}}]
     right: [%Mysite.Accounts.User{
                 #...,  
                 user_profile: %Mysite.Accounts.UserProfile{
                     # ..., 
                      zodiac: nil}}]

I tried setting

field :zodiac, :string, read_after_writes: true

but it complains

 ** (Postgrex.Error) ERROR 42703 (undefined_column): column "zodiac" does not exist

So how can I have the field available in the struct right after creation?

wfgilman · September 13, 2017, 9:05pm

Does fixture(:user) create the user_profile? I don’t see user_profile getting created in your test.

voger · September 13, 2017, 9:13pm

Yes. fixture(:user) uses Accounts.create_user/1 which creates the user and the user_profile schemas. Then associates the profile with the user through a one_to_one relationship and finally writes all this in the database.

here is the fixture code

  def fixture(:user, attrs \\ @create_attrs_atom) do
    {:ok, user} = Accounts.create_user(create_attrs_atom())
    user
  end

where create_attrs_atom/0 just returns a map with the create attributes and here is the Accounts.create_user/1 function

  def create_user(params) do
    profile = UserProfile.registration_changeset(%UserProfile{}, params)

    user_cs =
      %User{}
      |> User.registration_changeset(params)
      |> put_assoc(:user_profile, profile)

    with {:ok, user} <- Repo.insert(user_cs) do
      Log.info(%Log{user: user.id, message: "user created"})
      send_welcome(user)
      {:ok, user}
    end
  end

aseigo · September 13, 2017, 9:32pm

Does this cast the zodiac entry from params?

voger · September 13, 2017, 10:34pm

No. The zodiac field is calculated from a function in the database. It doesn’t even exist as a column in the table.

andre1sk · September 13, 2017, 11:05pm

What is the point of doing it this way?
you’d need to run a manual select
SELECT birthdate, zodiac(user_profiles.*)
or have a view
or …
birthdate does not exactly change so why go through all the pain?

gregvaughn · September 14, 2017, 12:05am

Well, there’s your problem. Thanks for introducing me to index on expression but the zodiac field isn’t on the table, so Ecto can’t really do anything with it automatically. To access it, you’ll have to issue your own query with a select fragment to call the db function.

Though I agree with @andre1sk that I’m not seeing what this gains you. Birthdates don’t change, so your initial Ecto insert could calculate it from the birthdate and write it to a literal field in the table, which Ecto knows how to handle moving forward. Sure, that may cost a bit more disk space, but it’s not clear that’s really a problem you’re facing.

voger · September 14, 2017, 7:20pm

@andre1sk, @gregvaughn
I was thinking leaving this field editable for administrative tasks. Someone may want to correct the birth date.

The reason I am using the database to calculate the field is mostly for consistency of the data. Instead of having to check every time in a changeset for this change I leave it to the database to perform the checking and updating. I am thinking of it something like a database constraint/trigger.

Postgresql does provide tools for this kind of things. Why not use them?

Also from what I have tested Ecto can handle this transparently like a field in a schema.

iex(1)> Repo.get(UserProfile, 4)
[debug] QUERY OK source="user_profiles" db=46.0ms decode=27.3ms queue=0.2ms
SELECT u0."user_id", u0."first_name", u0."last_name", u0."birth_date", u0."area_of_residence_type", u0."area_of_residence", u0."zodiac", u0."country_code", u0."inserted_at", u0."updated_at" FROM "user_profiles" AS u0 WHERE (u0."user_id" = $1) [4]
%Mysite.Accounts.UserProfile{__meta__: #Ecto.Schema.Metadata<:loaded, "user_profiles">,
 area_of_residence: "Grand Canyon", area_of_residence_type: 0,
 birth_date: ~D[1965-02-02],
 country: #Ecto.Association.NotLoaded<association :country is not loaded>,
 country_code: "US", first_name: "John",
 inserted_at: ~N[2017-09-14 21:48:54.723199], last_name: "Doe",
 updated_at: ~N[2017-09-14 21:48:54.723199],
 user: #Ecto.Association.NotLoaded<association :user is not loaded>, user_id: 4,
 zodiac: "aqu"}
iex(2)>

iex(8)> query = from p in UserProfile, select: [p.zodiac]
#Ecto.Query<from u in Mysite.Accounts.UserProfile, select: [u.zodiac]>
iex(9)> Repo.all query
[debug] QUERY OK source="user_profiles" db=39.6ms decode=0.1ms queue=0.2ms
SELECT u0."zodiac" FROM "user_profiles" AS u0 []
[["tau"], ["tau"], ["aqu"]]
iex(10)>

Also schema less

iex(10)> from p in "user_profiles", select: %{zodiac: p.zodiac}
#Ecto.Query<from u in "user_profiles", select: %{zodiac: u.zodiac}>
iex(11)> Repo.all query
[debug] QUERY OK source="user_profiles" db=33.2ms decode=0.2ms queue=0.2ms
SELECT u0."zodiac" FROM "user_profiles" AS u0 []
[["tau"], ["tau"], ["aqu"]]
iex(12)>

I honestly don’t understand what use cases can go wrong with this approach.

The only negative that I see so far is that it doesn’t return that field (properly?) on Repo.insert and I suppose on update too.

I guess I could create a zodiac column in that table and maintain it with database triggers or even maintain it vigilantly with elixir code.

I really appreciate your input. I only have been studying SQL since this August so all this things are really new to me. Maybe I am too eager to use fancy features.

P.S.: In SQL the result is returned as expected

INSERT INTO user_profiles (user_id,
			first_name, 
			last_name,
			birth_date,
			country_code, 
			area_of_residence_type, 
			area_of_residence, 
			inserted_at, 
			updated_at) 
VALUES (4, 'John', 'Doe', '1965-02-02', 'US', 0, 
   'Grand Canyon', current_timestamp, current_timestamp)
RETURNING user_profiles.birth_date, user_profiles.zodiac;

birth_date |zodiac |
-----------|-------|
1965-02-02 |aqu    |

gregvaughn · September 14, 2017, 7:42pm

It’s all about tradeoffs. Using these lesser known postgres features is cool, but the more you go out of the “mainstream” the less help generic tools like Ecto become. I understand what you say about ensuring data integrity within the db, however that means more to me if multiple apps are accessing the db, and less if my elixir app is the only accessor.

And in all honesty the “vigilant” approach with Elixir/Ecto is probably fewer lines of code than doing it in the db. I’d add an extra step in my changeset function that looks to see if birth_date is being updated, and if so update the zodiac.

voger · September 14, 2017, 7:57pm

Thanks. This adds a perspective I didn’t think of.

andre1sk · September 14, 2017, 10:31pm

There is no point in having low cardinality indexes the planner will never use them for starters . But Big bonus points for getting into fairly isoteric areas so quickly . I think many people who used postgress for a lot longer do not have a clue about these features. I highly recommend all talks by Bruce Momjian https://momjian.us/ to get a better understanding of PG.

shamanime · September 15, 2017, 12:34am

Update your schema, set read_after_writes.

field :zodiac, read_after_writes: true

Docs.

Edit: nevermind. Completely missed the point here. OP said he tried this and it’s an index.

andre1sk · September 15, 2017, 2:03am

if zodiac is just an Index that will not help much