devilray

devilray

What's the most balanced way to store multiple text formats?

While building small in-house LiveView applications, I’ve asked myself the same question with each new entity created: what’s the best way to store formatted text fields?

For example, many developers like to write in Markdown, most general users will want to write in HTML (using a GUI WYSIWYG widget). I prefer the power of a richer markup language such as ASCIIDOC or Org-Mode, and may even like to support others, including plain text, pointing to the need to accept multiple input formats, which need to be stored and presented back to users to later edit in the same language, as required.

For the benefit of fast server response, and low server overhead, I translate Markdown in a validation function, to HTML, saving this in a separate field. And, for the benefit of clean (clean of formatting characters) search results, I want to save a plain-text version to be used for full-text indexing and results presentation.

This makes the database schema quite messy as even with the minimal three choices, there is a variable amount of storage needed, from one to possibly three fields:

Input format Output format(s)
HTML plain text
Markdown/Org-mode/Asciidoc/other markup HTML, plain text
Plain text n/a

So, if the user fills in a textarea field in plain text, there is no need for further formats to be saved; if they write in HTML, only an additional plain text field should be stored; if Markdown, an HTML translation and a plain text one should be stored, for three fields in total.

While I could hardcode a markup, HTML and plain text field in every table, this seems wasteful as they won’t always be used. If I delegate them to an external table with a one-to-many relationship, I’m imposing an additional join for every query. Arguably the cleanest and most balanced solution would be to denormalise, creating an Ecto virtual field, storing a JSON column to store any translation as necessary, supporting any number of translations according to need.

I’m sure many of you have come across the same issue and have dealt with it in different ways. I’d love to learn how some of you have decided to handle this in your use case and why, with regards to performance, data redundancy, storage tradeoffs mentioned above?

Most Liked

garrison

garrison

From your post I got the impression that you were storing multiple formats (i.e. Markdown, plus HTML, plus Asciidoc) in different columns in every table.

What I was proposing is to store the raw content in one column (regardless of its type) and then render it to your desired output on-demand. Then keep a second column (an Enum, say) to store the type, and a third column in plaintext (for search). But that would be it, no more columns if you add more types.

Yeah, this is why I wouldn’t go that route - I would be afraid of destroying the original content during an e.g. (Markdown → HTML → Markdown) conversion, which is unlikely to be lossless.

There are definitely cases where you would just store HTML, but for something which can be edited I wouldn’t.

garrison

garrison

Here’s what I would do:

Store the raw input from the user in a :content column, and the type (:plaintext, :markdown, :asciidoc, etc) in a :content_type column. Compute and store a :content_plaintext column which contains plaintext optimized for search.

On inserts, you send the original content (say markdown), the type, and the markdown with formatting stripped for search.

Then on render you pull the markdown and render it for the user. You could cache the rendered HTML if you need to, but I don’t think you will if this is an internal app. If this is LiveView you would render once and store it in assigns of course.

For search, simply search on the plaintext column.

derek-zhou

derek-zhou

I store text in (sanitized) HTML. Markdown etc. are formats primarily for authoring. It makes a bad choice as an inter-change format.

Where Next?

Popular in Questions Top

mgjohns61585
Could someone help me? I'm making my first elixir program, number guessing game. I can't figure out how to convert the user's guess from ...
New
9mm
I am constructing a JSON object (map) and I need to conditionally set a field. I’m trying to write proper elixir-way code… and I’m at a l...
New
skosch
To my knowledge, put_in, Map.update etc. all have the one limitation of not automatically creating intermediate keys when needed (for exa...
New
greenz1
I have a phoenix application from which a user can download multiple(5-6) files of size 1MB. I couldn’t find anything related to sending ...
New
JulienCorb
I am trying to implement my new.html.eex file to create new posts on my website. new.html.eex: <h1>Create Post</h1> <...
New
electic
Hi, I am new to Elixir. I am trying to use the DateTime component to insert a date into MySQL however the there seems to be no way to fo...
New
johnnyicon
Hi all, I've just started learning Elixir and Phoenix Framework, so please pardon my n00bness at this stage. I'm trying to use Postg...
New
vonH
When I run the Plug and I recompile I wind up having to use Ctrl C to quit iex and start again. Witht the help of rlwrap I can use the cu...
New
sergio_101
I am VERY much an elixir newbie. I have taken one elixir course and one phoenix course on Udemy. During that course, I saw the instructor...
New
PeterCarter
There are pre-rolled solutions for other frameworks that do work. However, Phoenix does not seem to have these. Have people had good expe...
New

Other popular topics Top

danschultzer
None of the current solutions worked well for me, so I went ahead and built a user management system from scratch. This project took far...
548 29305 241
New
stefanchrobot
What’s the safe way to decode a JSON string into a struct? I want to avoid calling String.to_atom. Jason.decode can give me a map with st...
New
JeremM34
Hello, how can I check the Phoenix version ? Thanks !
New
ovidiubadita
Hey all, I discovered Elixir and I love it. I always wanted to learn a functional programming and I intended to go for Haskell, but afte...
New
grych
Hi folks, Few months ago I have announced the proof-of-concept of the library to manipulate the browsers DOM objects directly from Elixi...
639 52238 488
New
Lily
In templates/appointment/index.html.eex: <%= for appointment <- @appointments do %> <tr> <td><%= appoi...
New
fayddelight
I tried installing elixir 1.11.2 erlang 23.3.4 via asdf in my zsh shell. Enabled the versions locally and globally. When I list them ...
New
AngeloChecked
What learn first? Rust or Elixir Hi Elixir community! I’m here because i want learn a new language. I’m a junior developer and mainly i ...
New
klo
Got a question about when to concat vs. prepending items to list then reversing to achieve appending. So i know lists boil down to [1 | ...
New
joaquinalcerro
Hi there, I am working with Ecto-Postgresql and I need to call all of the records from a specific table but the table has 40,000 record...
New

We're in Beta

About us Mission Statement