Comparison of single-quoted and double-quoted strings

Can anybody explain how the comparison of strings and characters works in Elixir for single-quoted and double-quoted values? I’m using the term “characters” because, as far as I understand from the “Programming Elixir 1.6” book:

When Elixir library documentation uses the word string (and most of the time it uses the word binary), it means double-quoted strings.

So, if I got it right, single-quoted strings are stored as char lists:

iex(3)> is_list 'AZERTY'
true

iex(4)> is_list 'A'     
true

In this case, how can we compare a character (a letter) from the above example with another character (letter)? That is, why can’t I just compare:

defp convert(letter) when 'C' == letter do
    IO.puts("+++++++ equal!!! ++++++++++ ")
end

However, it works in iex session:

iex(5)> letter = 'C'
'C'
iex(6)> 'C' == letter
true

What am I missing?
Thank you

The problem with your compare function is that it will crash as soon as you pass something that is not 'C'.

Apart from that, it should work. What do you see instead?

In most programming languages a character is represented exactly like an unsigned integer.
The same is true for Elixir.

So 'AZERTY' is a list of integers that merely looks like letters (or a string). For example, you can append one more element to the list like this:

iex> 'AZERY' ++ [0]
[65, 90, 69, 82, 89, 0]

As expected, you now see a list of integers, because 0 is not a printable character and the shell can no longer display the list as text.
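
A minimal sketch of roughly the check behind that display decision, assuming the standard library's `List.ascii_printable?/1`. The `~c"..."` sigil is just another spelling of a single-quoted charlist, and the variable names are illustrative only.

```elixir
# ~c"AZERY" is the same value as the single-quoted 'AZERY'.
letters = ~c"AZERY"

# Every element is a printable ASCII code, so the list can be shown as text.
printable_before = List.ascii_printable?(letters)

# Appending 0 (a non-printable code) still gives a plain list of integers,
# but that list can no longer be displayed as text.
with_zero = letters ++ [0]
printable_after = List.ascii_printable?(with_zero)

# The first element is the integer code of the letter A.
first_code = hd(letters)
```

Here `printable_before` is `true`, `printable_after` is `false`, and `first_code` is `65`.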

As you can see here, A = 65. So, what is 'A'? Of course, it’s a list with one element:

iex> 'A' == [65]   
true

Now, the question is:

How can we represent char, if we don’t want to look at the ASCII table every time, to see that A is 65?

Using ? syntax sugar!

iex> ?A
65

Now, you can do:

defp convert(letter) when ?C == letter do
    IO.puts("+++++++ equal!!! ++++++++++ ")
end

I’m trying to catch the letters as follows:

def to_rna(dna) do
    Enum.each(dna, &convert(&1))
  end

  defp convert('G') do
    'C'
  end

  defp convert('C') do
    'G'
  end

  defp convert('T') do
    'A'
  end

  defp convert('A') do
    'U'
  end

  defp convert(letter) do
    IO.puts("Unknown letter")
  end

And the test:

assert RNATranscription.to_rna('G') == 'C'

fails as follows:

Assertion with == failed
     code:  assert RNATranscription.to_rna('G') == 'C'
     left:  :ok
     right: 'C'
     stacktrace:
       rna_transcription_test.exs:13: (test)

Can’t understand what’s really wrong :frowning:

Enum.each returns :ok, you need Enum.map here.
It’s a good idea to ask these questions of your Exercism mentor rather than on the forum :slight_smile:
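
A small sketch of the difference, using illustrative variable names. `~c"GCTA"` is the same value as the single-quoted 'GCTA'.

```elixir
dna = ~c"GCTA"

# Enum.each/2 runs the function only for its side effects and always returns :ok.
each_result = Enum.each(dna, fn code -> code end)

# Enum.map/2 collects whatever the function returns into a new list.
# Adding 32 to an uppercase ASCII code gives the lowercase letter.
map_result = Enum.map(dna, fn code -> code + 32 end)
```

`each_result` is `:ok`, which is exactly the `left: :ok` in the failing assertion, while `map_result` is the charlist 'gcta'.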

'C' == letter

is simply an expression that evaluates to either true or false

when 'C' == letter

Here that expression is part of a guard, i.e. if

  • true -> match
  • false -> match fails
def convert(letter) when 'C' == letter do
    IO.puts("+++++++ equal!!! ++++++++++ ")
end

Is equivalent to

def convert('C') do
  IO.puts("+++++++ equal!!! ++++++++++ ")
end

Given the absence of other function clauses a failed match will result in a program crash. To avoid that:

def convert('C') do
  IO.puts("+++++++ equal!!! ++++++++++ ")
end
def convert(_) do
  IO.puts("+++++++ not equal!!! ++++++++++ ")
end

This function convert/1 is one function with two function clauses (function heads).
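
To see both behaviours side by side, here is a hedged sketch. `GuardDemo` and its function names are hypothetical, and `~c"C"` is the sigil spelling of 'C'.

```elixir
defmodule GuardDemo do
  # A single clause with a guard: any argument other than 'C' has no clause to match.
  def only_c(letter) when letter == ~c"C", do: :matched

  # Two clauses of one function: the second catches everything else.
  def with_fallback(~c"C"), do: :matched
  def with_fallback(_other), do: :not_matched
end

# Calling the single-clause function with 'G' raises FunctionClauseError.
crashed =
  try do
    GuardDemo.only_c(~c"G")
    false
  rescue
    FunctionClauseError -> true
  end

fallback_result = GuardDemo.with_fallback(~c"G")
```

With the catch-all clause in place, `fallback_result` is `:not_matched` instead of a crash.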

OK, it is getting clearer now. Thank you very much for such a detailed explanation, really useful :smile:
Taking into account that Enum.each returns just :ok and map returns a list, I think a better solution would be to return the corresponding matching letter, so in pseudocode it would read:

if 'C' => return 'G'
if 'G' => return 'C'
etc...

Look at case do

Also remember that everything is an expression (there are no statements) so everything evaluates to some kind of value.

This is why there is no return statement - functions have to evaluate to a value just like expressions, so they automatically “return” the last value they evaluated.

But you’re still comparing lists of 1 character, and returning lists of 1 character, not single characters.
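
A short sketch of both points, assuming a hypothetical `ExpressionDemo` module: the `case` is the last expression of the function, so its value is what the function returns, and a one-element charlist is not the same thing as the integer inside it.

```elixir
defmodule ExpressionDemo do
  # No return keyword: the value of the case expression is the function's result.
  def complement(code) do
    case code do
      ?G -> ?C
      ?C -> ?G
      other -> other
    end
  end
end

# ?G is the integer 71, so the first branch matches.
single = ExpressionDemo.complement(?G)

# ~c"G" (the same as 'G') is the list [71], so only the catch-all branch matches.
wrapped = ExpressionDemo.complement(~c"G")
```

`single` is `?C` (67), while `wrapped` comes back unchanged as the list `[71]`.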


As I’m getting a list of characters (either of just one, A, or a sequence like AZERTY), I think I have to

  • match them one by one and translate each to another character according to the translation table (e.g. A should be replaced with T, Z with U, etc.)
  • build another chain of translated characters and return it.

Yeah, it is like in Ruby methods where the last evaluated expression is returned from a method.

Thank you for your response. When trying to use ? as follows:

def to_rna(dna) do
    Enum.map_join(dna, &convert(&1))
  end

  defp convert(?G) do
    'C'
  end

the test:

assert RNATranscription.to_rna('G') == 'C'

fails with:

Assertion with == failed
     code:  assert RNATranscription.to_rna('G') == 'C'
     left:  "C"
     right: 'C'
     stacktrace:

I think it’s because of Enum.map_join joiner :frowning:

Enum.map_join/3:

map_join(enumerable, joiner \\ "", mapper)
map_join(t(), String.t(), (element() -> String.Chars.t())) :: String.t()

i.e. it returns a String.t(), you’re looking for charlist() (or [char()]).
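
A minimal sketch of what happened in the failing test, with illustrative variable names. `~c"G"` is the sigil spelling of 'G'.

```elixir
# The mapper returns a charlist, but map_join converts every result to a
# string and concatenates them, so the overall result is a binary.
joined = Enum.map_join(~c"G", fn ?G -> ~c"C" end)

# Converting back is possible, but it is an extra step.
as_charlist = String.to_charlist(joined)

# Enum.map/2 with a mapper that returns an integer yields a charlist directly.
mapped = Enum.map(~c"G", fn ?G -> ?C end)
```

`joined` is the double-quoted "C", whereas `as_charlist` and `mapped` are both the single-quoted 'C'.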

Enum.map/2:

map(enumerable, fun)
map(t(), (element() -> any())) :: list()

returns list() or [any()] or more accurately a list of the type returned by the mapping function, e.g.:

map([char()], (char() -> char())) :: [char()]

something like

map([typeA()], (typeA() -> typeB())) :: [typeB()]

It’s just that Enumerable.t() is anything that has an implementation of the Enumerable protocol (which list() does).

  • charlist() is literally a list of character valued integers
  • String.t() is a binary(), a contiguous chunk of bytes holding UTF-8 encoded code points (each code point takes one to four bytes).
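
The two representations can be compared directly. This is a sketch with illustrative variable names, using only standard library functions.

```elixir
# Double-quoted: a UTF-8 encoded binary.
text = "hełło"

# The same text as a list of integer code points (a charlist).
chars = String.to_charlist(text)

# ł needs two bytes in UTF-8, so the byte count exceeds the character count.
byte_count = byte_size(text)
char_count = String.length(text)

# Converting the charlist back gives the original binary.
round_trip = List.to_string(chars)
```

`chars` is `[104, 101, 322, 322, 111]`, `byte_count` is 7 and `char_count` is 5.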

Would one of the solutions be as follows, or are there other hidden functions in Enum, String, Enumerable, etc.?

def to_rna(dna) do
    Enum.map(dna, &convert(&1))
  end

  defp convert(?G) do
    ?C
  end

test "transcribes guanine to cytosine" do
    assert RNATranscription.to_rna('G') == 'C'
  end

By the way, where does this ? sign come from? I can’t find it anywhere in the official guides.

The complete module:

@spec to_rna([char]) :: [char]
  def to_rna(dna) do
    Enum.map(dna, &convert(&1))
  end

  defp convert(?G) do
    ?C
  end

  defp convert(?C) do
    ?G
  end

  defp convert(?T) do
    ?A
  end

  defp convert(?A) do
    ?U
  end

  defp convert(letter) do
    IO.puts(:stderr, "Couldn't find matching for #{letter}")
    letter
  end

Ah, I found it. Some words about the use of the ? sign:

UTF-8 requires one byte to represent the characters h, e, and o, but two bytes to represent ł. In Elixir, you can get a character’s code point by using ?

Thank you guys, really helpful !

In Elixir, you can get a character’s code point by using ?

https://hexdocs.pm/elixir/String.html#module-integer-codepoints

There are a couple of ways to retrieve a character integer codepoint. One may use the ? construct:
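
A short sketch of the ? construct in use. The variable names are illustrative, and ł is chosen only because the quote above mentions it.

```elixir
# ? followed by a character literal is the integer code point of that character.
lower_a = ?a
l_stroke = ?ł

# The same code point can also be extracted from a binary with the utf8 modifier.
<<from_binary::utf8>> = "ł"
```

`lower_a` is 97, and both `l_stroke` and `from_binary` are 322.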


or there are other Enum, String, Enumerable, etc. hidden functions ?

I’m not sure what you are asking …

https://hexdocs.pm/elixir/master/syntax-reference.html#integers-in-other-bases-and-unicode-code-points


I asked because it looked a little weird to me to compare two characters by using the ? sign instead of === or == or the equals that exists in other languages. Moreover, I also have to return the replacement character using the same ? sign.

? simply generates the integer (code point) that is equivalent to the character, which in the case of ASCII will fit into 8 bits. What you call a comparison is actually a pattern match.

  • A pattern match is a conditional construct.
  • When a pattern match in a function head succeeds that function clause is evaluated
  • When the pattern match fails the next function head is attempted.
  • When you run out of function heads (i.e. each head failed the match) the program crashes.
  defp convert(letter) do
    # Each branch is a pattern match against an integer code point.
    case letter do
      ?G ->
        ?C

      ?C ->
        ?G

      ?T ->
        ?A

      ?A ->
        ?U

      _ ->
        IO.puts(:stderr, "Couldn't find matching for #{letter}")
        letter
    end
  end
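
Putting the pieces of this thread together, a hedged sketch of the whole transcription. `RnaDemo` is a hypothetical module name, unknown codes are simply passed through, and `~c"GCTA"` is the sigil spelling of 'GCTA'.

```elixir
defmodule RnaDemo do
  # Takes a charlist and returns a charlist, so Enum.map/2 is enough.
  def to_rna(dna), do: Enum.map(dna, &convert/1)

  # The value of the case expression is the value of the function.
  defp convert(code) do
    case code do
      ?G -> ?C
      ?C -> ?G
      ?T -> ?A
      ?A -> ?U
      other -> other
    end
  end
end

rna = RnaDemo.to_rna(~c"GCTA")
```

`rna` is the charlist 'CGAU', which is what a test comparing against a single-quoted value expects.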