Beginner: Ok or not ok, cyclic function calls?

mschwartau · May 31, 2016, 7:34pm

The following is my solution to one of the exercises of “Programming Elixir 1.2” and I wondered if it is considered an anitpattern to have two functions (upcase_first_char and downcase_till_point) which call each other - or if it is ok if there are not too many functions which call each other. Other solutions use a parameter (true/false) to know if they have to upcase or downcase a character but I think the code is better readable if I use different method names instead.

The exercise was:

Write a function to capitalize the sentences in a string. Each sentence is terminated by a period and a space. Right now, the case of the characters in the string is random.

 iex> capitalize_sentences("oh. a DOG. woof. ")
  "Oh. A dog. Woof. "

My solution is:

defmodule MyStrings do

    def capitalize_sentences(string), do: upcase_first_char(string)

    defp upcase_first_char(<< " ", tail :: binary >>), do: " " <> upcase_first_char(tail)

    defp upcase_first_char(<< head :: utf8, tail :: binary >>), do: String.upcase(<<head>>) <> downcase_till_point(tail)

    defp upcase_first_char(<<>>), do: ""

    defp downcase_till_point(<< ".", tail :: binary >>), do: "." <> upcase_first_char(tail)

    defp downcase_till_point(<< head :: utf8, tail :: binary >>), do: String.downcase(<<head>>) <> downcase_till_point(tail)

    defp downcase_till_point(<<>>), do: ""

end

I know that it’s easier to do it using String.split und Enum.join but I wanted to get familiar with binary pattern matching.

Ok or not ok?

Regards
Meinert

sztosz · May 31, 2016, 8:03pm

I’m not an expert, but for me it’s readable and clever Pattern matching in all it’s glory.

Qqwy · May 31, 2016, 8:12pm

In general, it is completely fine to have pairs of functions that call each other, or have functions that call themselves. This is called recursion and it is one of the fundamental principles of functional programming: Dividing a problem in simple steps that are trivially to solve individually.

One thing to keep in mind is the difference of ‘normal’ recursion and tail recursion: With normal recursion, you might at some point get a stack-overflow error, as none of the functions can return until the functions they call internally are finished.

tail recursion is what happens if you call, at the end of your function, a single function recursively. This enables languages that support tail recursion (and most functional ones, including Elixir do) to simply ‘replace’ the original function on the function-call-stack with the new one, which makes it work even for very large inputs.

Tail recursion can often be obtained by using a so-called accumulator to keep track of the results you’ve obtained so far. This lets you flip the order of operations, which allows you to keep the recursive function as the final operation in a function.
In this case, you call the <> concatenation operator in every function, which is the last step of the function. To make your function heads tail recursive, you could do something like:

defmodule MyStrings do

    def capitalize_sentences(string), do: upcase_first_char(string, "")

    defp upcase_first_char(<< " ", tail :: binary >>, acc), do: upcase_first_char(tail, acc <> " ")

    defp upcase_first_char(<< head :: utf8, tail :: binary >>, acc), do: downcase_till_point(tail, acc <> String.upcase(<<head>>) )

    defp upcase_first_char(<<>>, acc), do: acc

    defp downcase_till_point(<< ".", tail :: binary >>, acc), do: upcase_first_char(tail, acc <> ".")

    defp downcase_till_point(<< head :: utf8, tail :: binary >>, acc), do: downcase_till_point(tail, acc <>  String.downcase(<<head>>))

    defp downcase_till_point(<<>>, acc), do: acc

end

mkunikow · May 31, 2016, 8:50pm

I don’t know exacly how Erlang handles not tail recursions functions and I would like to know.
Ok I know check answer below

From Java/JVM machine perspective you will get stack overflow for no tail recursive functions for big stack. So the solution for languages based on JVM is to write recursion as tail recursion with special syntax and the compilator will convert it to for loop (possible in Clojure, Scala).

Note: not every problem you can write as tail recursion and every tail recursion you can convert to for loop. You can’t convert no tail recursion function to for loop.

In general always prefer tail recursion function over not tail recursion.

The 2 functions call each other are called trampoline. And yes on JVM in Scala and Clojure there is way to handle this without stack overflow. Does Erlang VM has some optimization for trampoline … ?

Ok I think I found some answer about Erlang VM

You are not getting a stack overflow because the Erlang “stack” is very large. Indeed, stack overflow usually means the processor stack overflowed, as the processor’s stack pointer went too far away. Processes traditionally have a limited stack size which can be tuned by interacting with the operating system. See for example POSIX’s setrlimit.

So you will get stack overflow when you reach out of memory what is quite hard. This is much better compared to JVM where there is specific stack limit (about 400k) and you can hit it very easy .

rvirding · June 1, 2016, 2:37pm

Erlang, and elixir, supports not only tail-recursion but also tail-call optimisation, or sometimes called last call optimisation. This means that not only recursive tail calls but ALL tail calls are optimised away into “jumps”. So in this case have a number of functions mutually calling each other will get the same optimisation as one being tail-recursive.

The compiler happily does this for you, the general tail-call optimisation creates no extra work for it. It would actually be more work to explicitly handle tail-recursion.

This means that there is no support for trampolines. There is no need for it. Trampoline is just a hack you need when you don’t have TCO.

Strangely enough you can do TCO on the JVM, but scala and clojure have chosen not to do it. There is an implementation of erlang running on the JVM, erjang, which supports full erlang with TCO. It’s quite an impressive system, and fast.

The only thing you have to be careful about is to make sure that you actually are doing proper tail-calls. It is not enough that they are syntactically last, they have to be operationally last for the effect to kick-in. @Qqwy discusses this and gives some tips on how you can make your functions have proper tail-calls.

As a final point I just want to say that while TCO is nice there are only a few places where you need to make sure you have them. Most common is the top-loop of a process which sits and waits for input messages, processes the input, then calls itself recursively to wait for the next message. This has to be proper tail calls otherwise stack will build up and you will crash the system. This also means that stacks seldom get deep in erlang/elixir processes, at least not when you get back to the top function.

Having TCO is common in functional languages so it is nothing we invented here.

Robert

peerreynders · June 1, 2016, 5:43pm

Given that your motivation was, as an exercise, to “get familiar with binary pattern matching” your use of recursion is justifiable. However when looking for “the right tool for the job” I don’t believe that “easier to do” was the sole motivation of using String.split, Enum.map, and Enum.join in Dave Thomas’ possible solution (though it certainly is a valuable bonus).

I personally suspect that it has something to do with what Michael Fogus discusses in his blog post Recursion is a low-level operation - the essence of which seems to be “Prefer use of higher order functions over recursion”. Now granted the blog is in the context of Clojure and the JVM but it seems to align with the general notion that higher level abstractions tend to lead to better solutions. So while it is certainly important to know how to effectively use recursion, it is just as important to know when not to use it. Don’t let recursion become your golden hammer; attempt to seek out better options whenever possible.

mkunikow · June 2, 2016, 8:35pm

This is soooo cool

Is it or will be possible to add to some annotation to Dialyxir that this function should be tail recursive and if compilator can’t optimize it the Dialyxir will throw an error? I can do something like this in Scala adding @tailrec annotation over function. This can be understand as some kind of contract.

mkunikow · June 2, 2016, 8:47pm

I think this is correct but you should always start to learn from bottom and know how to things work underneath.

rvirding · June 3, 2016, 3:34pm

It depends what you mean by “better solutions”. Yes, higher level abstractions can lead to better solutions but it very much depends on you choosing the right abstraction; if you choose the wrong one then your solution will be worse. You do need to understand how the abstractions work to be able to choose the right one and avoid the wrong ones, and to know when no abstraction does what you need and it is better to do it yourself.

I think it is something like driving a car. Yes, you can drive a car without knowing how it works but you will never become a good driver if you don’t know how it works. And I am not talking about driving fast here.

rvirding · June 3, 2016, 3:35pm

As far as I know dialyzer has no such feature, but I am no dialyzer expert, nor fan for that matter.