Challenge: Roman numbers parser?

I was looking at recursion, Sasa Juric’s talk on parsers and combinators, and was curious how a parser from roman to arabic numbners could look like (not the other way around, I’ve seen some examples of that here).

I feel like this could be just a 5-10 lines recursive function in elixir, but it’s a bit beyond my skills. In other languages, I’ve seen only kinda “brute force” approaches where the roman number was sliced and compared to some dictionary like “X”: 10, “IX”: 9, etc. I guess there should be a much more elegant way to implement that tricky bit of subtraction logic in roman numbers.

Did anyone try this exercise before? Sounds like a nice job interview question :slight_smile:

:wave:

https://rosettacode.org/wiki/Roman_numerals/Decode#Elixir might be interesting for you.

And

  def decode(<<x>>), do: to_value(x)

  def decode(<<h1, h2, rest::bytes>>) do
    case {to_value(h1), to_value(h2)} do
      {v1, v2} when v1 < v2 -> v2 - v1 + decode(rest)
      {v1, _} -> v1 + decode(<<h2, rest::bytes>>)
    end
  end

  def decode(""), do: 0

  defp to_value(?M), do: 1000
  defp to_value(?D), do: 500
  defp to_value(?C), do: 100
  defp to_value(?L), do: 50
  defp to_value(?X), do: 10
  defp to_value(?V), do: 5
  defp to_value(?I), do: 1

if you want to decode binaries instead of lists.

3 Likes

bingo! that was exactly the subtraction logic I was hoping for, and I was right it’s probably gonna be just a few lines of code :slight_smile: not much of a challenge, after all :smiley: thank you!

You could avoid subtraction entirely and rely on pattern-matching instead:

defmodule RomanParser do
  def parse(input, total \\ 0)

  def parse("M" <> rest, total), do: parse(rest, total + 1000)
  def parse("CM" <> rest, total), do: parse(rest, total + 900)
  def parse("D" <> rest, total), do: parse(rest, total + 500)
  def parse("CD" <> rest, total), do: parse(rest, total + 400)
  def parse("C" <> rest, total), do: parse(rest, total + 100)
  def parse("XC" <> rest, total), do: parse(rest, total + 90)
  def parse("L" <> rest, total), do: parse(rest, total + 50)
  def parse("XL" <> rest, total), do: parse(rest, total + 40)
  def parse("X" <> rest, total), do: parse(rest, total + 10)
  def parse("IX" <> rest, total), do: parse(rest, total + 9)
  def parse("V" <> rest, total), do: parse(rest, total + 5)
  def parse("IV" <> rest, total), do: parse(rest, total + 4)
  def parse("I" <> rest, total), do: parse(rest, total + 1)

  def parse("", total), do: total
end

RomanParser.parse("MMMDXVI")

(nimble_parsec can produce similar code with less typing, fwiw)

Both this solution and the other one up-thread have a similar bug/feature: they accept incorrect numbers and do their best to render them. For instance, "IIIIVM" is parsed to 1006 by both and "IM" is parsed to 1001 by this solution and 999 by the subtraction method.

2 Likes

And You can do it with magic of meta programming:

defmodule RomanParser do  
  def parse(input), do: parse(input, 0)

  numerals = [
    {"M", 1000},
    {"CM", 900},
    {"D", 500},
    {"CD", 400},
    {"C", 100},
    {"XC", 90},
    {"L", 50},
    {"XL", 40},
    {"X", 10},
    {"IX", 9},
    {"V", 5},
    {"IV", 4},
    {"I", 1}
  ]

  for {roman, decimal} <- numerals do
    defp parse(unquote(roman) <> rest, total), do: parse(rest, total + unquote(decimal))
  end

  defp parse("", total), do: total
end

RomanParser.parse("MMMDXVI") #=> 3516
1 Like

We have is_map_key/2 guard now!:

defmodule Roman do

  @numerals %{
    "M" => 1000, "CM" => 900,
    "D" => 500, "CD" => 400,
    "C" => 100, "XC" => 90,
    "L" => 50, "XL" => 40,
    "X" => 10, "IX" => 9,
    "V" => 5, "IV" => 4,
    "I" => 1}

  def parse(<<num::binary-size(2)>> <> rest) when is_map_key(@numerals, num), do: parse(rest) + @numerals[num]
  def parse(<<num::binary-size(1)>> <> rest) when is_map_key(@numerals, num), do: parse(rest) + @numerals[num]
  def parse(""), do: 0

end

Roman.parse("MMMDXVI") #=> 3516
Roman.parse("MMCMVI") #=> 2906
2 Likes