fuelen

HTML2Text - extract readable plain text from HTML using a Rust NIF

HTML2Text extracts readable plain text from HTML content. It calls Rust's html2text crate through a NIF for fast parsing and text extraction, and it preserves the logical structure and readability of the content.

https://github.com/fuelen/html2text

fuelen

v0.3.0 is out with two major additions: annotated rich text output and an inspectable HTML container.

Rich text (convert_rich/2)

Returns structured {text, [annotation]} tuples instead of formatted strings, so you can build your own renderer (Slack, Discord, email, etc.):

HTML2Text.convert_rich("<p>Hello <strong>world</strong></p>")
#=> {:ok, [[{"Hello ", []}, {"world", [:strong]}]]}

HTML2Text.convert_rich(~s(<a href="https://example.com"><em>click</em></a>))
#=> {:ok, [[{"click", [{:link, "https://example.com"}, :emphasis]}]]}

Annotations: :strong, :emphasis, :strikeout, :code, {:link, url}, {:image, src}, {:preformat, bool}, {:colour, {r, g, b}}, {:bg_colour, {r, g, b}}.
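
As a rough illustration of the "build your own renderer" idea, here is a minimal sketch that turns the line/segment structure shown above into Slack-style mrkdwn. The SlackRenderer module and its render/1 function are hypothetical and not part of HTML2Text; the sketch also assumes that the first annotation in a segment's list is the outermost one, which matches the link/emphasis example above but is not otherwise confirmed.

```elixir
defmodule SlackRenderer do
  # Hypothetical module for illustration; not part of HTML2Text.

  # `lines` is the list inside {:ok, lines}: one inner list of
  # {text, annotations} segments per output line.
  def render(lines) do
    lines
    |> Enum.map(fn segments -> Enum.map_join(segments, "", &render_segment/1) end)
    |> Enum.join("\n")
  end

  # Wrap the text with the innermost annotation first, so the first
  # annotation in the list ends up outermost (an assumption, see above).
  defp render_segment({text, annotations}) do
    annotations
    |> Enum.reverse()
    |> Enum.reduce(text, &wrap/2)
  end

  defp wrap(:strong, text), do: "*" <> text <> "*"
  defp wrap(:emphasis, text), do: "_" <> text <> "_"
  defp wrap(:strikeout, text), do: "~" <> text <> "~"
  defp wrap(:code, text), do: "`" <> text <> "`"
  defp wrap({:link, url}, text), do: "<" <> url <> "|" <> text <> ">"
  # Images, preformat flags and colours have no direct mrkdwn
  # equivalent in this sketch, so the text passes through unchanged.
  defp wrap(_other, text), do: text
end
```

With the html2text dependency installed, the lines from a successful convert_rich call can be passed straight to SlackRenderer.render/1. A real renderer would also need to escape characters that are special in the target format.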

CSS colour extraction is supported via use_doc_css: true (parses <style> tags and inline styles).

HTML container (HTML2Text.HTML)

A struct that renders HTML as formatted text when inspected in IEx — bold, italic, clickable links (OSC 8), CSS true color, and more:

%{subject: "Alert", body: HTML2Text.HTML.new(email_html)}

# In IEx you see formatted text instead of raw tags:
# %{subject: "Alert", body: #HTML2Text.HTML<
#     Dear customer,
#     Your order has been shipped.
#   >}

Short content stays inline: #HTML2Text.HTML<bold>. to_string/1 returns the original HTML.

Other changes

  • empty_img_mode option: :ignore (default), {:replace, text}, or :filename
  • Updated html2text Rust crate from 0.15.1 to 0.16.7 (rowspan support, bug fixes)
  • HTML2Text.Error custom exception for bang functions
fuelen

html2text v0.2.0 released — breaking changes and a cleaner API!

This version introduces a new, more consistent API with proper error handling and an optional keyword list for configuration.


What changed

Before (v0.1.x):

HTML2Text.convert(html, width)
# => returns plain string

  • The second argument was required and only accepted a width (integer or :infinity)
  • Errors (like width too narrow) would raise directly

Now (v0.2.0):

HTML2Text.convert(html, opts)
# => {:ok, result} | {:error, reason}

HTML2Text.convert!(html, opts)
# => result (raises on error)

  • convert/2 now returns {:ok, text} or {:error, reason}
  • convert!/2 is a new function that raises on error (for convenience)
  • The second parameter is now an optional keyword list of options
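
For code migrating from v0.1.x, the new return shape means callers have to handle both tuples. The following is a minimal sketch of one way to do that; ConvertResult and text_or/2 are hypothetical names invented for this example, and the helper only pattern-matches on the {:ok, text} and {:error, reason} tuples described above.

```elixir
defmodule ConvertResult do
  # Hypothetical helper for illustration; not part of HTML2Text.
  # `result` is the value returned by HTML2Text.convert/2.
  def text_or(result, fallback) do
    case result do
      {:ok, text} -> text
      {:error, _reason} -> fallback
    end
  end
end

# Usage (requires the html2text dependency):
# html |> HTML2Text.convert([]) |> ConvertResult.text_or("")
```

When a failure should crash the caller, as it did in v0.1.x, convert!/2 is the closer replacement.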
