Handling Data with Ecto That Persists Partially in S3

Hi, folks. I’m pretty fresh with Elixir, but experienced in other languages, so I’m seeking your guidance on the most “elixir-ish” approach to the problem I’m working on.

I’ve got a system that takes in data from an API. This data consists of metadata and user-generated data. The user data can potentially be large-ish, and I won’t need to query it using a database, so I’d like to store it in S3, while I store the rest of the metadata in Postgres.

I’m familiar with the workflow of getting a map of params from the user, running it through a changeset which casts it to an Ecto struct, and then passing that struct to a Repo to persist.

But in this case, the “user_data” attribute should be in-memory while I’m performing application logic. Right now I have the data type on the Ecto struct as a “string” to represent the path where it’s saved on S3. But during most of the application logic I’d like the user data to be a map, only turning it into a string right after saving to S3 and before saving it to the database. I’d also like to load it from S3 right after loading the metadata from the database.

What’s a good idiomatic approach to this? Here are the ideas I have right now:

  1. Change the type of user_data on the Ecto struct to be a map, so that changeset validation works, and write a small save function that takes the struct, saves the data to S3, swaps out the path, and continues.

  2. Create two structs. One which isn’t connected to a repo, only for in-memory work with a map as the “user_data” attribute, and one for the database with a string as the attribute.
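For what it's worth, idea 2 can be sketched in plain Elixir without touching Ecto at all. Everything named here is hypothetical: the `Sketch.*` modules, the `title` field, and the `put`/`get` functions that stand in for whatever S3 client ends up being used. Serializing with `:erlang.term_to_binary/1` is also just an assumption to keep the sketch dependency-free; JSON would work the same way.

```elixir
defmodule Sketch.Record do
  # In-memory shape used by application logic: user_data is a map.
  defstruct [:id, :title, user_data: %{}]
end

defmodule Sketch.StoredRecord do
  # Shape that would back the Postgres table: only the S3 key is kept.
  defstruct [:id, :title, :user_data_path]
end

defmodule Sketch.Storage do
  alias Sketch.{Record, StoredRecord}

  # `put` is a hypothetical function: (key, binary) -> :ok | {:error, reason}
  def to_stored(%Record{} = record, key, put) do
    body = :erlang.term_to_binary(record.user_data)

    case put.(key, body) do
      :ok ->
        {:ok, %StoredRecord{id: record.id, title: record.title, user_data_path: key}}

      {:error, reason} ->
        {:error, reason}
    end
  end

  # `get` is a hypothetical function: key -> {:ok, binary} | {:error, reason}
  def from_stored(%StoredRecord{} = stored, get) do
    with {:ok, body} <- get.(stored.user_data_path) do
      # [:safe] refuses to create new atoms from the stored binary.
      user_data = :erlang.binary_to_term(body, [:safe])
      {:ok, %Record{id: stored.id, title: stored.title, user_data: user_data}}
    end
  end
end
```

Passing `put` and `get` in as functions keeps the conversion testable with an in-memory fake, and the real S3 calls can be swapped in later.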

My brain has been steeped in statically typed languages for years, so I’m having a hard time fitting my head around modeling this problem in a dynamic environment. Something’s telling me it can be a lot easier than I’m tending to make it.

Thanks!

Hi!

I implemented this in my project:

It integrates S3 with Ecto schemas.

Hey, @karlosmid. Thanks for the recommendation! That looks really helpful. From browsing the docs I’m not sure whether it’s only compatible with Plug upload structs, or if it’d work for generic serialized data, but that’s a great direction to head in. Thank you.

I suggest using a transaction with Ecto.Multi.

I use this approach to write to the database and save to an S3 bucket. If any operation fails, the database changes are rolled back.

It's important to mention that the transaction can't undo an S3 upload, so you must handle S3 failures (and any clean-up) yourself. The Multi will tell you which operation failed.
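Roughly, the shape could look like the sketch below. It assumes Ecto 3.x, and everything passed in is hypothetical: `repo` is your application's Ecto.Repo module, `upload` is any function `(key, body) -> :ok | {:error, reason}` (for example a thin wrapper around an S3 client), and `build_changeset` turns the S3 key into the changeset you want to insert.

```elixir
defmodule Sketch.Persist do
  alias Ecto.Multi

  # Builds (but does not run) a Multi with two named steps.
  def save_multi(key, body, upload, build_changeset) do
    Multi.new()
    |> Multi.run(:upload, fn _repo, _changes ->
      # Multi.run expects {:ok, value} or {:error, value} back.
      case upload.(key, body) do
        :ok -> {:ok, key}
        {:error, reason} -> {:error, reason}
      end
    end)
    # The result of the :upload step is available under its name.
    |> Multi.insert(:record, fn %{upload: path} -> build_changeset.(path) end)
  end

  def save(repo, key, body, upload, build_changeset) do
    case repo.transaction(save_multi(key, body, upload, build_changeset)) do
      {:ok, %{record: record}} ->
        {:ok, record}

      # failed_step is :upload or :record; nothing was committed to the database.
      {:error, failed_step, reason, _changes_so_far} ->
        {:error, failed_step, reason}
    end
  end
end
```

With this ordering, a failed insert leaves an object behind in S3, so the `:record` error branch is where you would delete it or schedule a clean-up.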

Check this link for more information on Multi: https://hexdocs.pm/ecto/composable-transactions-with-multi.html

Hope this helps

Best regards

Oh… and you can also check this library for AWS: https://hexdocs.pm/ex_aws/1.1.4/ExAws.html
