Parsing tokens out of a string

What from the established Elixir ecosystem would you guys suggest for parsing out tokens embedded in a string? In this particular case the tokens are delimited by double curly braces. Something like:

"This is a string with {{token0}} and {{token1}} embedded"

But suggestions for a generic approach would be appreciated too.

You could take a look at gettext, which does exactly that for replacements in translatable strings. It’s essentially using a handful of functions from the :binary API to handle the job.
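To make the :binary idea concrete, here is a minimal sketch of splitting a string on "{{" and "}}" with :binary.split/2. It is not gettext's actual code; the module name TokenSplit and the {:text, _} / {:token, _} tuple shape are made up for illustration, and an unclosed "{{" is simply kept as literal text.

```elixir
defmodule TokenSplit do
  # Hypothetical helper (not taken from gettext).
  # Returns a list of {:text, binary} and {:token, binary} tuples.
  def parse(string) when is_binary(string), do: parse(string, [])

  defp parse("", acc), do: Enum.reverse(acc)

  defp parse(rest, acc) do
    # Without the :global option, :binary.split/2 splits at the first match only,
    # so the result has either one or two parts.
    case :binary.split(rest, "{{") do
      [text] ->
        # No opening braces left: everything remaining is plain text.
        Enum.reverse(push_text(acc, text))

      [text, after_open] ->
        case :binary.split(after_open, "}}") do
          [name, after_close] ->
            parse(after_close, [{:token, name} | push_text(acc, text)])

          [_unclosed] ->
            # Assumption: an unclosed "{{" is treated as literal text.
            Enum.reverse(push_text(acc, rest))
        end
    end
  end

  # Skip empty text segments so the output has no {:text, ""} entries.
  defp push_text(acc, ""), do: acc
  defp push_text(acc, text), do: [{:text, text} | acc]
end
```

Calling TokenSplit.parse/1 on the example string from the question gives alternating text and token tuples, which you can then filter for tokens or fold back into a string with substitutions.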

Old but good: Tokenizing and parsing in Elixir with yecc and leex – Andrea Leopardi


Can’t speak for best practices, but here’s a quick and dirty regex:

iex(1)> string = "This is a string with {{token0}} and {{token1}} embedded"
"This is a string with {{token0}} and {{token1}} embedded"
iex(2)> Regex.scan(~r/{{(.*)}}/U, string, capture: :all_but_first)
[["token0"], ["token1"]]
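If the goal is to substitute the tokens rather than just list them, the same kind of regex can be passed to Regex.replace/3 with a function. This is only a sketch: the TokenFill module and the string-keyed values map are assumptions for illustration, the braces are escaped and a lazy `.*?` is used instead of the U modifier, and unknown tokens are left untouched.

```elixir
defmodule TokenFill do
  # Hypothetical helper: replaces {{name}} with values[name].
  # Assumes `values` is a map with string keys and string values.
  def fill(string, values) when is_binary(string) and is_map(values) do
    Regex.replace(~r/\{\{(.*?)\}\}/, string, fn whole, name ->
      # `whole` is the full match (e.g. "{{token0}}"), `name` the captured group.
      # Tokens without a value are kept as they were.
      Map.get(values, name, whole)
    end)
  end
end
```

For example, TokenFill.fill("Hi {{a}} and {{b}}", %{"a" => "X"}) replaces only {{a}} and leaves {{b}} in place.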

Also check out this thread: Help to parse a template with NimbleParsec


NimbleParsec (NimbleParsec v1.2.3) is one more option for this.

Depending on what you’re trying to do, there’s always ExMustache (ex_mustache v0.1.0).

Thank you guys for all the suggestions. Shall throw them at my brain cells and see what sticks :wink:

Hehe - yes, rolling out a regexp or two, while it doesn’t scale well with complexity, is often the best/simplest choice.