Some beginner questions regarding the Nucleotide problem of exercism.io

I’ve read the Basics of elixirschool only, so I have some questions regarding the Nucleotide problem of exercism.io.

Please take a look at https://paste.ofcode.org/37ZCKNYPTJbpFwcVLdMYRan

  1. What’s the purpose of @nucleotides under the defmodule declaration?
  2. Why does that tag mention A, C, G, T preceded by a question mark?
  3. What is wrong, or how to correct my counthelper implementation?
  4. What is the best source where I can find a more detailed explanation than in elixirschool (i.e. with examples)?

https://elixir-lang.org/getting-started/module-attributes.html#as-constants
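
A minimal sketch of the "as constants" usage that the link describes. The module name AttributeSketch and the helper valid?/1 are hypothetical and only serve the illustration; the attribute itself is the one from the exercise.

```elixir
defmodule AttributeSketch do
  # A module attribute used as a compile-time constant:
  # the list is inlined wherever @nucleotides appears below.
  @nucleotides [?A, ?C, ?G, ?T]

  # Hypothetical helper (not part of the exercise) that reads the constant.
  def valid?(nucleotide), do: nucleotide in @nucleotides
end

IO.inspect(AttributeSketch.valid?(?A)) # true
IO.inspect(AttributeSketch.valid?(?X)) # false
```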

  • Why does that tag mention A, C, G, T preceded by a question mark?

https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html#utf-8-and-unicode

i.e.
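
A quick sketch of what the ? prefix evaluates to: the integer code point of the character that follows it. The charlists: :as_lists option is only there so that the list prints as plain integers.

```elixir
# ?A is the code point of the character "A", i.e. an ordinary integer.
IO.inspect(?A)               # 65
IO.inspect(?A == 65)         # true
IO.inspect(is_integer(?T))   # true

# The attribute from the exercise is therefore just a list of integers.
IO.inspect([?A, ?C, ?G, ?T], charlists: :as_lists) # [65, 67, 71, 84]
```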

  • What is the best source where I can find a more detailed explanation than in elixirschool (i.e. with examples)?

The “best” varies from person to person. But I don’t think you can go wrong with

First pass:

# file: nucleotide_count.exs
defmodule NucleotideCount do

  def counthelper(c, strand, nucleotide) do
    [hd | tl] = strand
    cond do
      hd == nucleotide -> counthelper(c + 1, tl, nucleotide)
      length(tl) != 0 -> counthelper(c, tl, nucleotide)        # 2.) Need (length tl) or length(tl)
      true -> c
    end
  end

  def count(strand, nucleotide) do
    counthelper(0, strand, nucleotide)
  end
end # 1.) added missing module "end"

IO.inspect(NucleotideCount.count('AATAA', ?A))
IO.inspect(NucleotideCount.count('AATAA', ?T))

But still:

$ elixir nucleotide_count.exs
** (MatchError) no match of right hand side value: []
    nucleotide_count.exs:5: NucleotideCount.counthelper/3
    nucleotide_count.exs:18: (file)
    (elixir) lib/code.ex:677: Code.require_file/2
$

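i.e. the match [hd | tl] = strand fails when strand is the empty list. With 'AATAA' and ?A the last element equals the nucleotide, so the first cond branch recurses with tl == [] and the code never gets to length(tl) != 0.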
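
The failing match can be reproduced in isolation. This is only a sketch: the anonymous function split is a hypothetical stand-in for the first line of counthelper/3.

```elixir
# Hypothetical stand-in for the destructuring done at the top of counthelper/3.
split = fn strand ->
  [head | tail] = strand
  {head, tail}
end

# A one-element list still matches: the tail is the empty list.
IO.inspect(split.([?A])) # {65, []}

# The empty list has no head, so the match raises MatchError.
result =
  try do
    split.([])
  rescue
    e in MatchError -> {:match_error, e.term}
  end

IO.inspect(result) # {:match_error, []}
```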

quick fix: Pattern Matching

# file: nucleotide_count.exs
defmodule NucleotideCount do

  def counthelper(c, [], _) do
    c # ran out of strand - return count
  end
  def counthelper(c, strand, nucleotide) do
    [hd | tl] = strand
    cond do
      hd == nucleotide -> counthelper(c + 1, tl, nucleotide)
      length(tl) != 0 -> counthelper(c, tl, nucleotide)
      true -> c
    end
  end

  def count(strand, nucleotide) do
    counthelper(0, strand, nucleotide)
  end
end

IO.inspect(NucleotideCount.count('AATAA', ?A))
IO.inspect(NucleotideCount.count('AATAA', ?T))

$ elixir nucleotide_count.exs
4
1
$

But we can do better …

# file: nucleotide_count.exs
defmodule NucleotideCount do

  def counthelper(c, [], _),
    do: c                                                        # ran out of strand - return count
  def counthelper(c, [hd|tl], nucleotide) when hd == nucleotide, # 1. move pattern match to function head
    do: counthelper(c + 1, tl, nucleotide)                       # 2. extend pattern with a guard
  def counthelper(c, [_|tl], nucleotide),                        # 3. here we know "hd" isn't the "nucleotide"
    do: counthelper(c, tl, nucleotide)                           # we don't care about the "hd" so we use "_" instead

  def count(strand, nucleotide),
    do: counthelper(0, strand, nucleotide)

end

IO.inspect(NucleotideCount.count('AATAA', ?A))
IO.inspect(NucleotideCount.count('AATAA', ?T))

guards
named functions and do/end blocks

The whole thing may look a bit more familiar this way:

# file: nucleotide_count.exs
defmodule NucleotideCount do

  def counthelper(c, strand, nucleotide) do
    case strand do
      [x|xs] when x == nucleotide ->
        counthelper(c+1,xs,nucleotide)
      [_|xs] ->
        counthelper(c,xs,nucleotide)
      _ ->
        c
    end
  end

  def count(strand, nucleotide),
    do: counthelper(0, strand, nucleotide)

end

IO.inspect(NucleotideCount.count('AATAA', ?A))
IO.inspect(NucleotideCount.count('AATAA', ?T))

And then there is

# file: nucleotide_count.exs
defmodule NucleotideCount do

  def count(strand, nucleotide) do
    n_accumulate = fn
      x,c when x == nucleotide -> c + 1
      _,c -> c
    end
    List.foldl(strand, 0, n_accumulate)
  end

end

IO.inspect(NucleotideCount.count('AATAA', ?A))
IO.inspect(NucleotideCount.count('AATAA', ?T))

List.foldl/3
Enum.reduce/3

Defining Multiple Clauses In An Anonymous Function
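
For comparison, a sketch of the same fold written with Enum.reduce/3, plus a variant using Enum.count/2. The module name ReduceSketch and the function name count_with_enum_count are made up for this example; ~c"AATAA" denotes the same charlist as the single-quoted 'AATAA' used above.

```elixir
defmodule ReduceSketch do
  # Same idea as List.foldl/3: the callback receives (element, accumulator).
  def count(strand, nucleotide) do
    Enum.reduce(strand, 0, fn
      x, acc when x == nucleotide -> acc + 1
      _other, acc -> acc
    end)
  end

  # Enum.count/2 counts the elements for which the function returns a truthy value.
  def count_with_enum_count(strand, nucleotide) do
    Enum.count(strand, fn x -> x == nucleotide end)
  end
end

IO.inspect(ReduceSketch.count(~c"AATAA", ?A))                 # 4
IO.inspect(ReduceSketch.count(~c"AATAA", ?T))                 # 1
IO.inspect(ReduceSketch.count_with_enum_count(~c"AATAA", ?A)) # 4
```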


Maybe you can replace cond with pattern matching in the function heads.

defmodule NucleotideCount do
  @nucleotides [?A, ?C, ?G, ?T]
  
  def counthelper(c, [], _nucleotide), do: c
  def counthelper(c, [head], nucleotide) when head == nucleotide, do: c + 1
  def counthelper(c, [head], nucleotide) when head != nucleotide, do: c
  def counthelper(c, [head | tail], nucleotide) when head == nucleotide do 
    counthelper(c + 1, tail, nucleotide)
  end
  def counthelper(c, [head | tail], nucleotide) when head != nucleotide do 
    counthelper(c, tail, nucleotide)
  end
  
  def count(strand, nucleotide) do
    counthelper(0, strand, nucleotide)
  end
end

iex(1)> IO.inspect(NucleotideCount.count('AATAA', ?A))
4
4
iex(2)> IO.inspect(NucleotideCount.count('AATAA', ?T))
1
1

@nucleotides is not used. You might add a guard clause like this, placed before the other [head | tail] clauses so that it is tried first:

def counthelper(_c, [head | _tail], _nucleotide) when head not in @nucleotides, do: raise "error"
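
One way the validating clause could fit into a complete module is sketched below. The module name GuardedCount, the use of ArgumentError and the error message are assumptions made for the example, not something the exercise prescribes.

```elixir
defmodule GuardedCount do
  @nucleotides [?A, ?C, ?G, ?T]

  # Only accept a nucleotide we know about; anything else fails to match.
  def count(strand, nucleotide) when nucleotide in @nucleotides,
    do: counthelper(0, strand, nucleotide)

  defp counthelper(c, [], _nucleotide), do: c

  # Validation clause: must come before the general [head | tail] clauses.
  defp counthelper(_c, [head | _tail], _nucleotide) when head not in @nucleotides,
    do: raise(ArgumentError, "invalid nucleotide in strand")

  defp counthelper(c, [head | tail], nucleotide) when head == nucleotide,
    do: counthelper(c + 1, tail, nucleotide)

  defp counthelper(c, [_head | tail], nucleotide),
    do: counthelper(c, tail, nucleotide)
end

IO.inspect(GuardedCount.count(~c"AATAA", ?A)) # 4

# A strand containing an unknown character raises instead of being counted.
outcome =
  try do
    GuardedCount.count(~c"AAXA", ?A)
  rescue
    ArgumentError -> :invalid_strand
  end

IO.inspect(outcome) # :invalid_strand
```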

Excellent @peerreynders, excellent!
Thank you very much for your thorough answer!
I guess I should have read the official documents first as they’re more complete than elixirschool’s contents.

I don’t want to abuse your patience, but several more questions have come up:

  1. It seems that every function parameter is the result of some pattern matching. Is this right? I searched but couldn’t find documentation on this; maybe I searched with the wrong keywords. What I’ve found is something like this:

    add_subscription(user, subscription)
    
    def add_subscription(%User{subscription: nil} = user, subscription) do
      #Add subscription here using user
    end
    
    def add_subscription(_user, _subscription) do
      raise "This user already has a subscription"
    end
    

What is this %User{subscription: nil}? Is it a map, or some shorthand for pattern matching?
I’ve also learnt from the source of the example https://medium.com/rebirth-delivery/how-to-use-elixir-pattern-matched-functions-arguments-a793733acc6d that there can be assignments in function headers… Elixir is powerful!

  2. Here are two more attempts to solve this problem. They seem close but are not working, and I’d like to know why:

    def count(strand, nucleotide) do
      c = for l <- strand, l == nucleotide, into: [], do: l
      length c
    end

    def count(strand, nucleotide) do
      c = for l <- strand, l = nucleotide, into: [], do: l
      length c
    end

add_subscription(user, subscription)

No pattern matching.

def add_subscription(_user, _subscription) do

No pattern matching. Leading underscores simply indicate to the compiler that the arguments (i.e. data) are deliberately being ignored - otherwise the compiler would warn about unused parameters (i.e. names).

def add_subscription(%User{subscription: nil} = user, subscription) do

There are two things going on here

%User{subscription: nil}

Yes, this is a map pattern match (on a struct, which is more rigid than a plain map). Map matching “feels” a bit different because the match is “partial”. In this particular case the argument has to be a User struct and the value under the subscription key has to be nil, but the match places no constraints on the remaining keys in the structure, whatever they are.

%User{subscription: nil} = user

This binds the user argument to the whole matched structure. This is useful if you want a particular match but you intend to use the matched structure as a whole (i.e. you don’t have to “recreate it” when using it in another function invocation or returning it as is).
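
A self-contained sketch of both points, the partial match and the binding of the whole struct. The name field, the Subscriptions module and the :monthly and :yearly values are invented for the example; only add_subscription/2 and the subscription key come from the quoted code.

```elixir
defmodule User do
  # Hypothetical struct: only the subscription key appears in the quoted code.
  defstruct name: nil, subscription: nil
end

defmodule Subscriptions do
  # Matches any User whose subscription is nil; name is not constrained.
  # "= user" binds the whole struct so it can be reused below.
  def add_subscription(%User{subscription: nil} = user, subscription) do
    %{user | subscription: subscription}
  end

  # Fallback clause: reached only when the first head did not match.
  def add_subscription(_user, _subscription) do
    raise "This user already has a subscription"
  end
end

free_user = struct(User, name: "Ann")
subscribed = Subscriptions.add_subscription(free_user, :monthly)
IO.inspect(subscribed.name)         # "Ann"
IO.inspect(subscribed.subscription) # :monthly

# The subscription is no longer nil, so the second clause runs and raises.
second_try =
  try do
    Subscriptions.add_subscription(subscribed, :yearly)
  rescue
    e in RuntimeError -> e.message
  end

IO.inspect(second_try) # "This user already has a subscription"
```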

seems close but it’s not working

Which test case is failing?

# file: nc.exs
defmodule NucleotideCount do

  def count(strand, nucleotide) do
    c = for l <- strand, l == nucleotide, into: [], do: l
    length c
  end

end

IO.inspect(NucleotideCount.count('AATAA', ?A))
IO.inspect(NucleotideCount.count('AATAA', ?T))
IO.inspect(NucleotideCount.count('', ?A))
IO.inspect(NucleotideCount.count('AAAAA', ?A))

$ elixir nc.exs
4
1
0
5
$

I wonder why checking

IO.inspect(NucleotideCount.count('AATAA', ?A))

is not the same as checking:

IO.inspect(NucleotideCount.count('AATAA', 'A'))

Checking against ?A is not the same as checking against 'A': ? yields the code point of a character (an integer), whereas a charlist is nothing more than a list of code points. See https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html
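
A small sketch of how that difference shows up in the comprehension-based count. CompareSketch is a hypothetical module name, and ~c"A" is the same charlist as the single-quoted 'A'.

```elixir
defmodule CompareSketch do
  def count(strand, nucleotide) do
    length(for l <- strand, l == nucleotide, do: l)
  end
end

# Each element l of the strand is an integer code point.
IO.inspect(CompareSketch.count(~c"AATAA", ?A))    # 4

# A one-character charlist is a list, and 65 == [65] is false for every element.
IO.inspect(CompareSketch.count(~c"AATAA", ~c"A")) # 0

# Taking the head of the charlist gives back the code point.
IO.inspect(hd(~c"A") == ?A)                       # true
```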

Thank you for your help!

Sorry @peerreynders.

The test with == is not failing. The reason is that I was checking against 'A', not against ?A. Please see my newer answer to @kokolegorille

Shouldn’t the expression with the pattern matching also work?
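
One possible explanation, offered as a sketch rather than a definitive answer: inside a for comprehension, an expression that is not a generator acts as a filter, and an element is kept whenever the filter evaluates to a truthy value. l = nucleotide is a match that rebinds l and evaluates to the nucleotide's code point, which is always truthy, so nothing is filtered out.

```elixir
strand = ~c"AATAA"
nucleotide = ?T

# == compares each element with the nucleotide, so only matching elements pass.
with_comparison = for l <- strand, l == nucleotide, do: l

# = rebinds l and returns 84 (?T), which is truthy, so every element passes.
with_match = for l <- strand, l = nucleotide, do: l

IO.inspect(length(with_comparison)) # 1
IO.inspect(length(with_match))      # 5
```

With = the count therefore equals the length of the whole strand, whichever nucleotide is asked for.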

IO.inspect(Kernel.is_binary("string"))   # true
IO.inspect(Kernel.is_list("string"))     # false
IO.inspect(Kernel.is_binary('charlist')) # false
IO.inspect(Kernel.is_list('charlist'))   # true
IO.inspect(Kernel.is_list('A'))          # true
IO.inspect(Kernel.is_integer('A'))       # false
IO.inspect(Kernel.is_list(?A))           # false
IO.inspect(Kernel.is_integer(?A))        # true

Because ?A is 65 while 'A' is [65].
