String HTML to correct HTML syntax

So I have users write some HTML in a text box. I can retrieve it and it displays fine, but the special characters and unnecessary spacing are kept. I tried Phoenix.HTML but it doesn’t seem to clean up the text. I want to do this just so that I use less space in the database.

I know I could do this with Regex but I’d really rather not.

Is there a function or dependency for this? I didn’t find anything.

As a potential user I would hate a service that strips or corrupts whatever I entered. User content, IMSO, must be kept exactly as the user submitted it, no matter what.

What kind of special characters and spacing do you mean?

Anyway, I’m with @mudasobwa here: user-submitted stuff should be persisted as entered, which also makes it easier for the user when editing.

Also, most naive implementations of an HTML whitespace stripper will also remove the whitespace inside pre and friends, which is probably not what you want.

I agree that submitted HTML should be saved as is, but when you display it, it should absolutely be cleaned and sanitized. I would store two fields: html_raw and html_sanitized. You can use https://github.com/rrrene/html_sanitize_ex
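A minimal sketch of the two-field idea, assuming Elixir 1.12+ (for Mix.install) and network access to fetch html_sanitize_ex. The EmailTemplate struct and from_input/1 are hypothetical names standing in for a real schema; HtmlSanitizeEx.basic_html/1 is one of the library's built-in scrubbers, and which scrubber suits email HTML is left open here.

```elixir
# Assumption: run as a script; Mix.install fetches the dependency on first run.
Mix.install([{:html_sanitize_ex, "~> 1.4"}])

defmodule EmailTemplate do
  # Hypothetical struct standing in for a database row with the two suggested fields.
  defstruct [:html_raw, :html_sanitized]

  # Keep the user's input untouched and store a sanitized copy next to it.
  def from_input(user_html) when is_binary(user_html) do
    %__MODULE__{
      html_raw: user_html,
      html_sanitized: HtmlSanitizeEx.basic_html(user_html)
    }
  end
end

template = EmailTemplate.from_input("<h1>Hello</h1><script>alert(1)</script>")
IO.inspect(template.html_raw, label: "raw")
IO.inspect(template.html_sanitized, label: "sanitized")
```

Note that sanitizing is about safety (dropping tags such as script), not about whitespace, so on its own it would not shrink the indentation.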

The first half is correct! I don’t want to edit it when it is saved or when the user edits it.

It’s basically HTML for emails, and when a user writes it, it contains runs of spaces (like 5 spaces in a row because of indentation) and characters the end user never really sees (\r\n, which could be replaced by just \n).

Let’s say someone writes:

<body>
        <h1>text</h1>
</body>

It would be saved as: <body>\r\n(8 spaces)<h1>text</h1>\r\n</body> (when I write text between ` and ` the multiple spaces are removed, so I had to put the count in brackets). And it’s okay to save it that way for the user. But when the email is sent, I don’t want it to go out like that but rather as: <body><h1>text</h1></body>.
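To make the target transformation concrete, here is a naive, regex-free sketch using only the standard library. NaiveMinify and its functions are hypothetical names, not part of any library. It trims every line and glues the lines together, so it is exactly the kind of stripper that would also flatten the contents of pre, and it would join words that are separated only by a line break; treat it as an illustration of the goal, not something to ship.

```elixir
defmodule NaiveMinify do
  # Hypothetical helper: turn Windows line endings into plain \n.
  def normalize_newlines(html), do: String.replace(html, "\r\n", "\n")

  # Hypothetical helper: trim each line and concatenate the lines.
  # Caveat: this ignores HTML structure, so whitespace inside <pre>
  # (or between words split across lines) is lost as well.
  def strip_indentation(html) do
    html
    |> normalize_newlines()
    |> String.split("\n")
    |> Enum.map(&String.trim/1)
    |> Enum.join()
  end
end

stored = "<body>\r\n        <h1>text</h1>\r\n</body>"
IO.puts(NaiveMinify.strip_indentation(stored))
# prints: <body><h1>text</h1></body>
```

The stored value is never modified; the compact string would only be built at send time.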

You could use Floki to write a "normalize" function:

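def normalize(html_string) do
  # Floki.parse/1 is deprecated in recent Floki releases;
  # parse_document/1 replaces it and returns {:ok, tree}.
  {:ok, tree} = Floki.parse_document(html_string)

  # Serialize the parsed tree back into an HTML string.
  Floki.raw_html(tree)
end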

Floki has other functions for manipulating HTML as well.
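A hedged sketch of how this could be wired in at send time, assuming Elixir 1.12+ (for Mix.install), network access, and a recent Floki release where Floki.parse_document/1 is available. EmailHtml.compact/1 is a hypothetical name. How much whitespace actually disappears depends on the parser Floki is configured with, so it is worth comparing the byte sizes on real templates (including ones with pre) before relying on it.

```elixir
# Assumption: run as a script; Mix.install fetches Floki on first run.
Mix.install([{:floki, "~> 0.36"}])

defmodule EmailHtml do
  # Hypothetical helper: parse the stored HTML and serialize it again.
  # Falls back to the original string if parsing fails.
  def compact(html) when is_binary(html) do
    case Floki.parse_document(html) do
      {:ok, tree} -> Floki.raw_html(tree)
      {:error, _reason} -> html
    end
  end
end

# The stored value stays as the user wrote it; only the outgoing copy changes.
stored = "<body>\r\n        <h1>text</h1>\r\n</body>"
outgoing = EmailHtml.compact(stored)

IO.inspect(byte_size(stored), label: "stored bytes")
IO.inspect(byte_size(outgoing), label: "outgoing bytes")
IO.puts(outgoing)
```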
