Enum.map/2 behavior on charlist and "list of chars"

Hi!

I’m working on a simple exercise that consists of replacing characters in a charlist. It’s a DNA-to-RNA conversion like

ATCG -> UAGC

Being fairly new to Elixir, I tried this code:

  def to_rna(dna) do
    Enum.map(dna, fn b -> complement(b) end)
  end

  def complement(base) when base == 'A', do: 'U'
  def complement(base) when base == 'T', do: 'A'
  def complement(base) when base == 'C', do: 'G'
  def complement(base) when base == 'G', do: 'C'
  def complement(_base) do "E" end

But I’m getting different results when calling to_rna/1 with a charlist and with a list of chars:

iex>to_rna('ATCG')
["E", "E", "E", "E"]

iex>to_rna(['A', 'T', 'C', 'G'])
['U', 'A', 'G', 'C']

I’m probably misunderstanding something fundamental about the way Enum.map/2 and charlists work together. I’d appreciate some pointers in the right direction!

Thank you all

A list of chars is the same as a charlist. Your second example is a list of charlists.

'ACGT' is equivalent to [?A, ?C, ?G, ?T] which is [65, 67, 71, 84], so you need to match accordingly in complement/1.
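
The equivalence can be checked directly in a script or in iex. This is only a small sketch; it uses the ~c"..." sigil, which is another spelling of the single-quoted charlist used in this thread, and the variable name is arbitrary.

```elixir
# ~c"ACGT" is the sigil spelling of the single-quoted charlist 'ACGT'.
charlist = ~c"ACGT"

# A charlist is just a list of integer codepoints.
IO.inspect(charlist == [?A, ?C, ?G, ?T])
IO.inspect(charlist == [65, 67, 71, 84])

# Enum.map/2 therefore passes each integer to the function, never a charlist.
IO.inspect(Enum.map(charlist, &is_integer/1))

# A one-element charlist is still a list; the ?A literal is a plain integer.
IO.inspect(is_list(~c"A"))
IO.inspect(is_integer(?A))
```

This is why a guard such as base == 'A' never succeeds inside Enum.map/2: the function receives 65, while 'A' is the list [65].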

You’re right! I changed complement/1 to

  def complement(base) when base == 65, do: 85  # A -> U
  def complement(base) when base == 84, do: 65  # T -> A
  def complement(base) when base == 67, do: 71  # C -> G
  def complement(base) when base == 71, do: 67  # G -> C
  def complement(_base) do "E" end

And it worked! Thanks

Is there a way to keep the letters ATCG in complement/1? Since it’s easier to see what is happening with letters than with their codepoints, I think it would make for more readable code.

See my example again. 'A' = [?A] = [65], so ?A = 65. The question mark denotes a character literal, similar to enclosing a character in single quotes in C.

Also, instead of using guards, you should match directly in the function head. Your mentor will probably mention that after submission.
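
A minimal sketch of both suggestions together, with the letters kept via ?-literals and each base matched directly in the function head. The module name ComplementSketch is made up for illustration, and the "E" fallback is carried over from the original post.

```elixir
defmodule ComplementSketch do
  # Each clause matches one codepoint literal directly in the function head.
  def complement(?A), do: ?U
  def complement(?T), do: ?A
  def complement(?C), do: ?G
  def complement(?G), do: ?C
  # Fallback kept from the original post: anything else becomes "E".
  def complement(_base), do: "E"

  # Map every codepoint of the charlist to its complement.
  def to_rna(dna), do: Enum.map(dna, &complement/1)
end

IO.inspect(ComplementSketch.to_rna(~c"ATCG") == ~c"UAGC")
```

Because every clause returns a codepoint, the result is itself a charlist rather than a list of one-element charlists.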

It worked! Thanks.

Hah! That choice between guards and matching was also in my mind. I’ve just submitted it.

Thanks again!

Just a general design guideline:

You may want to consider removing the fall-through clause that returns “E”, so that passing an invalid argument makes the function fail and you can see the error.

"Let it crash" is typically a pretty good match for Elixir.
https://wiki.c2.com/?LetItCrash
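
A small sketch of what that looks like in practice, assuming a hypothetical StrictComplement module with no catch-all clause. The try/rescue is there only to make the failure visible in a script; normally the error would simply be left to propagate.

```elixir
defmodule StrictComplement do
  # No catch-all clause: an unknown base matches nothing and raises.
  def complement(?A), do: ?U
  def complement(?T), do: ?A
  def complement(?C), do: ?G
  def complement(?G), do: ?C
end

result =
  try do
    StrictComplement.complement(?X)
  rescue
    # Rescued only to show which function failed to match.
    e in FunctionClauseError -> {:crashed, e.function, e.arity}
  end

IO.inspect(result)
```

The invalid input surfaces as a FunctionClauseError at the point of the call instead of travelling on as an "E" hidden in the result.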

My solution:

defmodule RnaTranscription do
  @doc """
    Given a DNA strand, return its RNA complement (per RNA transcription).
    Both DNA and RNA strands are a sequence of nucleotides.
    The four nucleotides found in DNA are adenine (A), cytosine (C), guanine (G) and thymine (T).
    The four nucleotides found in RNA are adenine (A), cytosine (C), guanine (G) and uracil (U).
    Given a DNA strand, its transcribed RNA strand is formed by replacing each nucleotide with its complement:

    G -> C
    C -> G
    T -> A
    A -> U
  """
  def to_rna(?G), do: ?C
  def to_rna(?C), do: ?G
  def to_rna(?T), do: ?A
  def to_rna(?A), do: ?U
  def to_rna(dna), do: Enum.map(dna, &to_rna/1)
end

I think you need a when is_list(dna) guard on the last clause.
Also, where do you fall back to “E” or ?E when the character is something other than C, G, T, or A?

Hi @eksperimental, thanks :smiley:

I implemented it guided by the tests. There is no test case where I need a guard to check whether the argument is a list, but that should work too.

About the fallback: 'E' is not a valid value, so, as @LukeWood said:

"Let it crash" is typically a pretty good match for Elixir.
https://wiki.c2.com/?LetItCrash

I see. That was his way of returning :error.
Anyway, without a guard, your implementation will call Enum.map/2 on a term that is not an enumerable, for example ?E, and it will raise an error saying that the Enumerable protocol is not implemented for it.
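
A sketch of the guarded variant, using a hypothetical module name GuardedRna so it does not clash with the solution above. With the guard, a stray codepoint such as ?E no longer reaches Enum.map/2 and fails with a FunctionClauseError instead.

```elixir
defmodule GuardedRna do
  def to_rna(?G), do: ?C
  def to_rna(?C), do: ?G
  def to_rna(?T), do: ?A
  def to_rna(?A), do: ?U
  # Only lists reach Enum.map/2; any other term matches no clause.
  def to_rna(dna) when is_list(dna), do: Enum.map(dna, &to_rna/1)
end

IO.inspect(GuardedRna.to_rna(~c"ACGT") == ~c"UGCA")

outcome =
  try do
    GuardedRna.to_rna(?E)
  rescue
    # Rescued only to show which error is raised for an invalid base.
    FunctionClauseError -> :function_clause_error
  end

IO.inspect(outcome)
```

The error now points at to_rna/1 itself rather than at the Enumerable protocol, which makes the invalid input easier to trace.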

Ah, you are right. I forgot this point. :joy: Thanks!
