Customize Elixir Earmark p tag output

BillBryson · December 6, 2020, 10:29pm

I’m trying to customize the outputted p tag when using Earmark.as_html!/2.

For example, I’d like this input:

Earmark.as_html!("Hello world")

To return: <p style="color: #000000;">Hello world</p>. Here the color is applied dynamically. I looked through the docs but did not see anything that would fit my needs. Perhaps I missed it? Any help would be appreciated.

outlog · December 7, 2020, 1:08am

I’m using the code/gist from @neuone (post #3 in the thread “Earmark - add my own class names to the default set of markdown tags”) to customize Earmark output.

The only caveat is that it stopped working on the latest Earmark, so I’m currently locked at {:earmark, "1.4.4"} and haven’t had time to figure out what the issue/solution is. Most likely it’s something trivial…

You use it like this (my version of the module follows the snippet):

    some_mark_down
    |> Earmark.as_ast()
    |> Myapp.Journal.Parser.parsing()
    |> Earmark.as_html!()

defmodule Myapp.Journal.Parser do
  # defp parse_attr(:body, value) do
  #  value
  #  |> Earmark.as_ast()
  #  |> Myapp.Journal.Parser.parsing()
  #  |> Earmark.as_html!()
  # end

  @moduledoc """
  This module will parse the body of a blog post and update the Markdown
  attributes with custom HTML and css attributes.

  This module is used within the Portal.Blog
  """
  def parsing({:ok, results, _option}) do
    Enum.map(results, fn item ->
      parse(item)
      |> Earmark.Transform.transform()
    end)
  end

  # @doc """
  # Customize your own css_styles by
  # providing your own tuple of css attributes
  # """
  @css_style %{
    "img" => [{"class", ""}],
    "p" => [{"class", "py-2"}],
    "h1" => [{"class", "mw7 lh-copy"}],
    "h2" => [{"class", "mw7 lh-copy"}],
    "h3" => [{"class", "mw7 lh-copy"}],
    "ul" => [{"class", "mw7 mb4 lh-copy"}],
    "ol" => [{"class", "mw7 lh-copy"}],
    "blockquote" => [{"class", "mw7 lh-copy"}]
  }

  @doc """
  Markdown wraps all <img> tags in a <p> tag. This function will pattern match any
  <p> tags that might contain a nested <img> tag.

  Then it will take that <img> node and pass it through with some css style if it exists.

  If no img tag is found the <p> is passed through.
  """
  def parse({"p", attributes, children_nodes} = _node) do
    first_child = List.first(children_nodes)

    case first_child do
      {"img", img_attr, img_child_nodes} ->
        # IO.inspect(img_attr)
        parse({"img", img_attr ++ attributes, img_child_nodes})

      _no_img_tag ->
        {"p", merge_attributes(attributes, Map.get(@css_style, "p")), children_nodes}
    end
  end

  @doc """
  If the `tag` exists in the @css_style this function will merge the existing attributes with
  the new `css_style` attributes.

  If no `tag` exists in the @css_style the node will just pass through.
  """

  def parse({tag, attributes, children_nodes}) do
    # The earlier "img" branch only printed debug output and discarded its
    # result, so every tag (including "img") ends up on this same path.
    {tag, merge_attributes(attributes, Map.get(@css_style, tag)), children_nodes}
  end

  @doc """
  If the css_style is nil just return the attributes
  """
  def merge_attributes(attributes, css_style) when is_nil(css_style) do
    attributes
  end

  @doc """
  Will concat two list of tuples into one list. Then will merge
  any tuples that have a similar key value in the first index.

  Given the following params these are the expected results.

  # Params example 1
    attributes  = [{"class", "f1"}]
    css_style   = [{"class", "mw7"}]

  iex > merge_attributes(attributes, css_style)
  iex > [{"class", "f1 mw7"}]

  # Params example 2
    attributes    = [{"class", "f1"}, {"id", "headline"}]
    css_style     = [{"class", "mw7"}, {"id", "red"}]

  iex > merge_attributes(attributes, css_style)
  iex > [{"class", "f1 mw7"}, {"id", "headline red"}]

  """
  def merge_attributes(attributes, css_style) do
    (attributes ++ css_style)
    |> merge_attributes
  end

  @doc """
  With one list of tuples some of the items will have duplicate values in the first index.

  This function will merge the duplicates and return a list of tuples.

  # Params
    attributes = [{"class", "f1"}, {"class", "mw7"}, {"id", "foo"}, {"title", "something"}, {"id", "header"}]

  iex > merge_attributes(attributes)
  iex > [{"class", "f1 mw7"}, {"id", "foo header"}, {"title", "something"}]
  """
  def merge_attributes(attributes) do
    group = Enum.group_by(attributes, fn {key, _value} -> key end)

    keys = Map.keys(group)

    results =
      Enum.map(keys, fn key ->
        Enum.reduce(group[key], "", fn {_key, value}, acc ->
          (acc <> " " <> value)
          |> String.trim()
        end)
      end)

    Enum.zip(keys, results)
  end
end

BillBryson · December 7, 2020, 5:26am

Thank you - this is interesting. I’m trying to use this too, but the parse function doesn’t seem to work for me as-is. I’ve had to update it to match a tuple with a 4th element.

  def parse({"p", attributes, children_nodes, foo} = _node) do

Maybe I’m not doing something correctly.
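One possible explanation, offered as an assumption rather than something confirmed in this thread: newer Earmark releases appear to emit four-element AST nodes, `{tag, attributes, children, meta}`, matching the `{"p", [], ["hello"], %{}}` shape in the test quoted later in the thread, while the gist was written against three-element nodes. A minimal sketch that accepts both shapes (the module name `NodeCompat` and the hard-coded `py-2` class are illustrative only):

```elixir
defmodule NodeCompat do
  # Hypothetical helper, not part of Earmark or of the gist above.

  # Four-element node: keep the meta map untouched.
  def parse({"p", attributes, children, meta}) do
    {"p", attributes ++ [{"class", "py-2"}], children, meta}
  end

  # Three-element node (the shape the gist expects): same change, no meta.
  def parse({"p", attributes, children}) do
    {"p", attributes ++ [{"class", "py-2"}], children}
  end

  # Anything else (other tags, plain text) passes through unchanged.
  def parse(other), do: other
end
```

Matching on the tuple size explicitly keeps the code working regardless of which Earmark version produced the AST.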

neuone · December 7, 2020, 1:40pm

I haven’t looked at this code for a while.

What I remember is that I wanted to be able to supply my own CSS class names on certain HTML tags. And this was global, meaning any HTML tags that matched would get updated.

This was all so I could control the formatted output of a Markdown blog system.

Examples
If it encounters an <h1> tag, just add the CSS class "mw7 lh-copy" to all <h1> tags.
If it encounters any <img> tags, add the CSS class ".lazy" to all <img> tags.
If it encounters any <p> tag, add the CSS class "py-2" to all <p> tags.
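The global behaviour described in these examples can be sketched as a recursive walk over the AST. This is only an illustration under stated assumptions: it assumes four-element nodes `{tag, attributes, children, meta}`, the module name `ClassMap` is made up, and the image class is written as `lazy` (without the leading dot) because it ends up in a `class` attribute.

```elixir
defmodule ClassMap do
  # Hypothetical module; class names are taken from the examples above.
  @classes %{"h1" => "mw7 lh-copy", "img" => "lazy", "p" => "py-2"}

  # Walk a list of nodes and tag every match, at any nesting depth.
  def apply_classes(nodes) when is_list(nodes), do: Enum.map(nodes, &apply_classes/1)

  def apply_classes({tag, attributes, children, meta}) do
    {tag, add_class(attributes, Map.get(@classes, tag)), apply_classes(children), meta}
  end

  # Plain text leaves are returned as-is.
  def apply_classes(text) when is_binary(text), do: text

  # No class configured for this tag: leave the attributes alone.
  defp add_class(attributes, nil), do: attributes

  # Append to an existing "class" attribute, or add a new one.
  defp add_class(attributes, class) do
    case List.keytake(attributes, "class", 0) do
      {{"class", existing}, rest} -> [{"class", existing <> " " <> class} | rest]
      nil -> [{"class", class} | attributes]
    end
  end
end
```

Unlike a single `Enum.map/2` over the top-level nodes, this version also reaches tags nested inside other tags, such as an `<img>` inside a `<p>`.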

Your example
It looks like you want the ability to pass in some specific String “Hello World” and have it wrapped in a specific tag <p> with a specific CSS attribute style="color: #000000;"?

Is my assumption correct?

BillBryson · December 7, 2020, 3:30pm

Yes, that’s correct.

It looks like this will work, only in my app the parse function breaks unless it’s a tuple with 4 elements.

neuone · December 7, 2020, 5:43pm

Sounds like you want to pattern match for the sentence "Hello World" and then apply some custom CSS style around that sentence. This parser is not intended to do that type of work.

BillBryson · December 7, 2020, 6:18pm

Sorry, I misread. Not a specific string of “hello world”. I want to apply a style to all p tags. This solution works; I just needed a 4-element tuple in the parse function in place of the 3-element one used in the example provided.

Thank you for your help.
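For readers who land here with the original question, the outcome can be sketched as follows. This is not the code BillBryson ended up with (the thread does not show it); it is a minimal illustration that assumes four-element AST nodes and uses a made-up module name, `PStyle`, with the color passed in at call time so it can be chosen dynamically.

```elixir
defmodule PStyle do
  # Hypothetical helper; it works on plain AST tuples and calls no Earmark API.

  # Add a style attribute with the given color to every "p" node.
  def color_paragraphs(ast, color) when is_list(ast) do
    Enum.map(ast, &color_node(&1, color))
  end

  defp color_node({"p", attributes, children, meta}, color) do
    {"p", [{"style", "color: #{color};"} | attributes], color_paragraphs(children, color), meta}
  end

  # Other tags are kept, but their children are still visited.
  defp color_node({tag, attributes, children, meta}, color) do
    {tag, attributes, color_paragraphs(children, color), meta}
  end

  # Plain text leaves pass through unchanged.
  defp color_node(text, _color) when is_binary(text), do: text
end
```

The transformed AST could then be rendered in the same way as in the module posted earlier, for example with `Earmark.Transform.transform/1`, assuming the installed Earmark version accepts this node shape.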

RobertDober · December 8, 2020, 11:49am

Sorry for chiming in so late, but I needed some time to get my focus back on Earmark.

As a matter of fact, the split of EarmarkParser was also motivated by giving Earmark more freedom to create code that is not needed by ex_doc and its zillions of users.

Therefore I have started to implement an AST walker and the option to integrate an AST postprocessor.
A good starting point on how to use it is here:

github.com/pragdave/earmark/blob/i398_ast_post_processing/test/acceptance/postprocessor_test.exs

defmodule Acceptance.PostprocessorTest do
  use ExUnit.Case


  describe "nop" do
    test "empty edge case" do
      assert post("", id())  == {:ok, [], []}
    end
    test "nop on ast" do
      assert post("hello", id())  == {:ok, [{"p", [], ["hello"], %{}}], []}
    end
  end

  describe "adding an attribute to all 'p' tags" do
    test "one level only" do
      assert post("hello", add_attr("p", :class, "classy")) ==
        {:ok, [{"p", [class: "classy"], ["hello"], %{}}], []}
    end
  end

(This file has been truncated.)
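The helpers `post`, `id` and `add_attr` are defined in the truncated part of the file, so their real definitions are not visible here. Purely as a guess at the idea behind `add_attr("p", :class, "classy")`, a helper like this could return a function that rewrites a single node; the module name `PostSketch` and the implementation are assumptions, not Earmark code.

```elixir
defmodule PostSketch do
  # Hypothetical reconstruction; the real helper lives in the truncated test file.

  # Returns a function that adds {name, value} to nodes with the given tag
  # and leaves every other node untouched.
  def add_attr(tag, name, value) do
    fn
      {^tag, attributes, children, meta} ->
        {tag, [{name, value} | attributes], children, meta}

      other ->
        other
    end
  end
end
```

Applied to the node `{"p", [], ["hello"], %{}}`, the returned function yields the same node shape the "one level only" test expects.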

As this is WIP, please do not hesitate to give me some input on what might be better or more useful.

Sorry no doc yet, but I need the feature to be stable first.

Here is the implementation: the postprocess option and the AST walker.

RobertDober · December 8, 2020, 6:49pm

Starting to remove recursion from transforming the AST; first step: an iterative version of _walk.

Just in case someone spotted the danger in the code.
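What an "iterative" walk can look like in Elixir is sketched below. This is not Earmark's `_walk`; it is an illustration of the general technique, assuming four-element nodes and a made-up module name `IterWalk`. Instead of recursing into children and returning, it keeps an explicit stack of parent frames, so every call to `loop/4` is a tail call.

```elixir
defmodule IterWalk do
  # Hypothetical sketch, not the Earmark implementation.

  # Applies `fun` to every node (parents before children) and rebuilds the tree.
  def walk(ast, fun) when is_list(ast), do: loop(ast, [], [], fun)

  # loop(todo, done, stack, fun)
  #   todo  - siblings still to process
  #   done  - processed siblings, in reverse order
  #   stack - frames {{tag, atts, meta}, parent_todo, parent_done}

  # Nothing left and no parent frame: the whole tree is finished.
  defp loop([], done, [], _fun), do: Enum.reverse(done)

  # Children finished: rebuild the parent and resume with its siblings.
  defp loop([], done, [{{tag, atts, meta}, parent_todo, parent_done} | stack], fun) do
    node = {tag, atts, Enum.reverse(done), meta}
    loop(parent_todo, [node | parent_done], stack, fun)
  end

  # Text leaf: keep as-is.
  defp loop([text | rest], done, stack, fun) when is_binary(text) do
    loop(rest, [text | done], stack, fun)
  end

  # Element node: transform it, push a frame, then descend into its children.
  defp loop([{_, _, _, _} = node | rest], done, stack, fun) do
    {tag, atts, children, meta} = fun.(node)
    loop(children, [], [{{tag, atts, meta}, rest, done} | stack], fun)
  end
end
```

The explicit stack grows on the heap rather than through nested function calls, which is the usual motivation for this kind of rewrite.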