Synchronizing external media with changing API resources

I have some resources Entry that have the relationship :media pointing to EntryMedia. I periodically pull in all current Entrys from an API, and get a list of urls that I populate related EntryMedia with. I fetch those files from the urls and upload them to a cdn, storing the resulting new url.

The next time I pull in all Entrys from the API, I’m running into a problem: I want to process the urls conditionally. I.e., if the returned list of urls has some new ones, or is missing some, I want to add or remove those, and if some have changed (each has an integer index; they’re shaped like %{index => _, url => _ } ) I want to update those as well. But because adding, removing, and changing involves fetching remote images and uploading them to my cdn, as well as deleting, I’d like to do it only when necessary. In the vast majority of cases, it won’t be necessary, since the Entrys probably won’t change their EntryMedia very often. But they might, and I have to handle that.
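To make the three cases concrete, here is a minimal sketch in plain Elixir (no Ash involved) of the comparison described above. `MediaDiff` and its return keys are hypothetical names for illustration, and it assumes the maps use atom keys (`%{index: _, url: _}`) and that the url is what identifies a media item.

```elixir
defmodule MediaDiff do
  @moduledoc """
  Hypothetical helper (not part of Ash): compares stored media with the
  list returned by the API, keyed by url.
  """

  # `existing` and `incoming` are lists of maps shaped like %{index: _, url: _}.
  def diff(existing, incoming) do
    existing_by_url = Map.new(existing, fn m -> {m.url, m} end)
    incoming_by_url = Map.new(incoming, fn m -> {m.url, m} end)

    # urls the API returned that are not stored yet: these need fetch + upload.
    to_add = Enum.reject(incoming, fn m -> Map.has_key?(existing_by_url, m.url) end)

    # Stored urls the API no longer returns: these need deleting from the cdn.
    to_remove = Enum.reject(existing, fn m -> Map.has_key?(incoming_by_url, m.url) end)

    # Same url, different index: a metadata-only update, no file transfer.
    to_reindex =
      Enum.filter(incoming, fn %{url: url, index: index} ->
        case Map.fetch(existing_by_url, url) do
          {:ok, %{index: old_index}} -> old_index != index
          :error -> false
        end
      end)

    %{add: to_add, remove: to_remove, reindex: to_reindex}
  end
end

existing = [%{index: 0, url: "a.jpg"}, %{index: 1, url: "b.jpg"}]
incoming = [%{index: 0, url: "b.jpg"}, %{index: 1, url: "c.jpg"}]
result = MediaDiff.diff(existing, incoming)
```

With this keying, a url that changes at the same index shows up as one removal plus one addition, so only genuinely new urls trigger a download and upload.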

I’m searching for the recommended way to handle this in an Ash-like way. Should this be logic in custom changes? Is this pattern a solved thing already? Am I thinking about this correctly? How best to pass this information to manage_relationship, etc.?

thanks :grimacing:

It’s hard to say without seeing the specifics, but generally speaking you should be able to do what you want using things like after action hooks and/or manage_relationship. manage_relationship effectively handles the process of ignoring existing things, and/or updating them with a given action. Let’s say you had something like this on EntryMedia:

update :reimport do
  change fn changeset, _ ->
    # changeset.data is the stored record; get_attribute returns the incoming value
    if changeset.data.index < Ash.Changeset.get_attribute(changeset, :index) do
      # fetch some stuff, make some modifications to the changeset
      changeset
    else
      # no need to do anything
      Ash.Changeset.set_result(changeset, {:ok, changeset.data})
    end
  end
end

then on Entry you could have:

create :upsert do
  upsert? true

  # here we customize the behavior when something is found that already exists
  # in the relationship to use the `reimport` action
  change manage_relationship(:entry_media, type: :direct_control, on_match: {:update, :reimport})
end

You can read more about manage_relationship here: Ash.Changeset — ash v2.17.3

There is a lot of information there, but it is very useful to understand.

Keep in mind at the end of the day manage_relationship is a convenience around your own hooks, so you could easily do something like:

create :create do
  argument :entry_media, {:array, :map}

  change fn changeset, _ ->
    Ash.Changeset.after_action(changeset, fn changeset, result ->
      entry_media = changeset.arguments.entry_media
      # do whatever you like, this is happening in the same transaction as the main `insert`.
      # the hook has to return {:ok, result} (or {:error, error})
      {:ok, result}
    end)
  end
end

Okay, I’ve found that I needed:

changeset
|> Changeset.manage_relationship(:media, new_media_urls,
  type: :direct_control,
  use_identities: [:original_url] # <--
)
|> Changeset.change_attribute(:media_fetched, true)

Re-reading the docs for use_identities, I see that it only uses the primary key by default, which in my case is just some uuid. All I had to do was tell Ash to look at the original_url instead and it was happy to. (I originally zoomed in on :identity_priority and misread it with an implicit assumption that Ash was looking at all the identities by default).
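For anyone landing here later: as far as I understand it, use_identities takes identity names rather than attribute names, so the call above assumes EntryMedia declares an identity called :original_url. A fragment along these lines would provide it (the identity name is my assumption, inferred from the option value used above; this is only the relevant part of the resource, not a complete module):

```elixir
# Inside the EntryMedia resource definition (fragment only).
identities do
  # Assumed name :original_url, matching `use_identities: [:original_url]`.
  # The second argument lists the attributes that must be unique together.
  identity :original_url, [:original_url]
end
```

With only the primary key in play, incoming maps that carry no id cannot match any existing record, so they are treated as new and the create runs into the uniqueness check on original_url. That would explain the "has already been taken" error.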

I now see the correct actions being called for creates and updates. Looks like it should be good. Thank you :sweat_smile:

old: rubberducking with the reply box

I think manage_relationship is exactly what I want, but I might not be holding it correctly yet.

So, all of the following occurs based on an ash_oban trigger on an Entry, when the Entry meets certain criteria (usually right after initial import of the Entry)

I’ve restructured so the EntryMedia have original_url as an identity, expecting manage_relationship to match on that and then:

if there’s an EntryMedia match on original_url, update fields other than the original_url (currently just the index), to avoid pointless file-shuffling if the only thing that’s changed is metadata.

if there’s no match, use the create action to make a new EntryMedia and then mark it as needing further processing by another ash_oban trigger.

I see in the documentation that direct_control works this way:

[
  on_lookup: :ignore,
  on_no_match: :create,
  on_match: :update,
  on_missing: :destroy
]  

In my Change I’m calling:

defp set_media(changeset) do
  new_media_urls =
    Changeset.get_attribute(changeset, :source_id)
    |> MyApp.ERTP.get_entry_images()

  changeset
  |> Changeset.manage_relationship(:media, new_media_urls, type: :direct_control)
  |> Changeset.change_attribute(:media_fetched, true)
end

And on my EntryMedia I have the relevant actions marked as primary:

# primary create for entry's manage_relationship
create :create do
  primary? true
  # ...upload this image or mark for later uploading through oban triggers
end

# primary update for entry's manage_relationship
update :update do
  primary? true
  # original_url is an identity so this should only change the index, if the index has changed
  accept [:index]
end

I also have a destroy marked as primary that should run a change to delete the reuploaded media from s3. But we’re not there yet.

When it attempts to run an update over existing EntryMedia, I see:

"** (Ash.Error.Invalid) Input Invalid\n\n* Invalid value provided for original_url: has already been taken.\n\nnil\n\n