Codewars Tortoise racing

Fl4m3Ph03n1x · October 18, 2018, 10:59am

Background

I am learning Elixir by making exercises in Codewars as much as I can. I have stumbled upon one specific exercise that I think may be incorrect and I could use a hand in checking out my solution.

Exercise

Two tortoises named A and B must run a race. A starts with an average speed of 720 feet per hour . Young B knows she runs faster than A , and furthermore has not finished her cabbage.

When she starts, at last, she can see that A has a 70 feet lead but B 's speed is 850 feet per hour . How long will it take B to catch A ?

More generally: given two speeds v1 ( A 's speed, integer > 0) and v2 ( B 's speed, integer > 0) and a lead g (integer > 0) how long will it take B to catch A ?

The result will be an array [hour, min, sec] which is the time needed in hours, minutes and seconds (round down to the nearest second) or a string in some languages.

If v1 >= v2 then return nil

Examples:

race(720, 850, 70) => [0, 32, 18] 
race(80, 91, 37)   => [3, 21, 49]

My Solution

Spoiler

    defmodule Tortoise do
      def race(v1, v2, _) when v1 >= v2, do: nil
      def race(v1, v2, lead) do
        #Step 1
        slow_speed = v1 / 3600  #v1 feets per second
        fast_speed = v2 / 3600  #v2 feets per second
        
        #Steps 2 and 3
        seconds_to_catchup(lead, 0, slow_speed, fast_speed, 0)
        |> to_compound_date()
      end

      defp seconds_to_catchup( slow_distance, fast_distance, _, _, secs ) when fast_distance >= slow_distance, do: secs-1
      defp seconds_to_catchup( slow_distance, fast_distance, slow_speed, fast_speed, secs ) do
        seconds_to_catchup( slow_distance+slow_speed, fast_distance+fast_speed, slow_speed, fast_speed, secs+1)
      end

      defp to_compound_date(total_seconds, hours\\0, minutes\\0, seconds\\0) do
        cond do
          minutes >= 60       -> to_compound_date(total_seconds, hours + 1, 0, 0)
          seconds >= 60       -> to_compound_date(total_seconds, hours, minutes + 1, 0)
          total_seconds === 0 -> [hours, minutes, seconds]
          true                -> to_compound_date(total_seconds - 1, hours, minutes, seconds + 1)
        end
      end

    end

Tests

  dotest(720, 850, 70, [0, 32, 18])
  dotest(80, 91, 37, [3, 21, 49])
  dotest(820, 81, 550, nil)

Problem

The problem here is that I believe the tests to be wrong.

My solution is based on the following algorithm:

Convert feets/hour to feets/second
Iterate over each second adding the distanced traveled to the total distance
Convert number of total seconds to a compound duration

However, when I run the tests I get errors like:

Testing : 720 850 70
1939
[0, 32, 19]
[0, 32, 18]

I believe this happens because the author’s solution rounds and truncates numbers, while my solution does not. I believe my solution to be mathematically more correct because it never loses precision.

Question

However I can’t convince the author and he thinks my solution is incorrect. I can’t see how. Could anyone give me a hint?

NobbZ · October 18, 2018, 11:25am

I’m not sure about your approach.

Basically we have this

d = v * t + d0

Where d0 is the starting distance, v is the speed, t the time passed and d the distance after t.

So we know this:

d_a = v_a * t + d0_a
d_b = v_b * t # As we know she is the pivotal point and never moved at t = 0

And now we want to know what t we need to make d_a == d_b.

v_a * t + d0_a = v_b * t # use wolfram alpha to solve for t
t = - d0_a / (v_a - v_b)

This formula will give you a canonical result. Applying it to the initial example of 720, 850 and 70 we get t = -70/(720 - 850) = -70/-130 = 0,5384615384615385 hours. Converting to seconds is 1938,461538461538, then truncating and converting into that list gives [0, 32, 18].

So who ever of you two came to this conclusion is correct. From the example output you gave I’m not sure who that is.

But incremantally increasing values carries errors around. Especially as you got your input in feet/hour and you convert it feet/second and therefore create a floating point number you increase a lot of potential error here.

Fl4m3Ph03n1x · October 18, 2018, 12:31pm

Wow, indeed compared to your solution, my solution is not as precises and therefore not as good.
great input !

Edit

After manually doing the calculations I got the following formula:

t = d0_a / (v_b - v_a )

In the it is the same, but I think it is easier this way because I am not inverting the order of the sum.
Still, great answer!

Edit Edit

Using this approach I was able to greatly minimize my code:

Spoiler

    defmodule Tortoise do
      def race(v1, v2, _) when v1 >= v2, do: nil

      def race(v1, v2, lead) do
        #Solving the following equation for 't': v_a*t + d0_a = v_b*t
        #Yeilds the formula: t = d0_a / (v_b - v_a)
        time_hours = lead/(v2 - v1)
        time_seconds = trunc( time_hours * 3600 )
        to_compound_date(time_seconds)
      end

      defp to_compound_date(total_seconds, hours\\0, minutes\\0, seconds\\0) do
        cond do
          minutes >= 60       -> to_compound_date(total_seconds, hours + 1, 0, 0)
          seconds >= 60       -> to_compound_date(total_seconds, hours, minutes + 1, 0)
          total_seconds === 0 -> [hours, minutes, seconds]
          true                -> to_compound_date(total_seconds - 1, hours, minutes, seconds + 1)
        end
      end
    end

However, I still get one off errors with the some tests:

Testing : 642 669 111
[4, 6, 39]
[4, 6, 40]

It is now my firm belief that there is something definitely wrong with the exercise itself that the author refuses to acknowledge.

Replacing round with trunc fails on different tests. Overall without knowing the authors algorithm I think it’s rather impossible to pass the tests.

peerreynders · October 18, 2018, 3:35pm

If you keep rerunning attempts with your code it will eventually pass.

Same with this one:

Spoiler

defmodule Tortoise do
  def race(v1, v2, _) when v1 >= v2,
    do: nil
  def race(v1, v2, lead),
    do: to_list(lead/(v2 - v1))

  defp to_list(hours),
    do: to_list(trunc(hours * 3600),2,[])

  defp to_list(all, 0, rest) do
    [all|rest]
  end
  defp to_list(all, n, rest) do
    all
    |> div(60)
    |> to_list(n - 1, [rem(all, 60)|rest])
  end
end

IO.inspect Tortoise.race(720,850,70)
IO.inspect Tortoise.race(80,91,37)
IO.inspect Tortoise.race(820,81,550)
IO.inspect Tortoise.race(642,669,111)
IO.inspect Tortoise.race(716,740,98)
IO.inspect Tortoise.race(480,588,117)
IO.inspect Tortoise.race(444,504,123)

so the random testing is unstable.

NobbZ · October 18, 2018, 4:03pm

I passed it on the first try.

For those who already solved the kata a link to my solution:

https://www.codewars.com/kata/reviews/5b5f313a6d5fe186df00134c/groups/5bc8ac9a40ecc7fc1a001b6f

The faulty testcase you showed here gives a result of [4, 6, 39] for me.

I’m not sure whether that is your expectation or your actual… It would be nice if you were pasting the full output of the failing test.

edit and PS: I’d really prefer if this exercise were giving speed in feet/second to reduce the chance of rounding errors.

peerreynders · October 18, 2018, 4:15pm

# trunc
IO.inspect Tortoise.race(716,740,98)  # [4,4,59] Codewars expected [4,5,0]
IO.inspect Tortoise.race(480,588,117) # [1,4,59] Codewars expected [1,5,0]
IO.inspect Tortoise.race(444,504,123) # [2,2,59] Codewars expected [2,3,0]
IO.inspect Tortoise.race(311,391,82)  # [1,1,29] Codewars expected [1,1,30]
# round
IO.inspect Tortoise.race(720,850,37)  # [0,17,5] Codewars expected [0,17,4] test race will always fail
IO.inspect Tortoise.race(642,680,142)  # [3,44,13] Codewars expected [3,44,12]
IO.inspect Tortoise.race(419,504,65)  # [0,45,53] Codewars expected [0,45,52]

when changing to

defp to_list(hours),
    do: to_list(round(hours * 3600),2,[])

the test race will always fail (sample result will be [4,6,40]) and the odd random test would fail the other way

I’d really prefer if this exercise were giving speed in feet/second to reduce the chance of rounding errors.

Agreed as the cumulative error on successive floating point operations can mount pretty fast - the above code only uses one floating point division and one floating point multiplication - if the random tests use more than that there is always the chance of deviating results - there is a reason floating point results should always be tested with an epsilon (or delta in assert_in_delta/4).

This one seems to consistently pass:

Spoiler

defmodule Tortoise do
  def race(v1, v2, _) when v1 >= v2,
    do: nil
  def race(v1, v2, lead),
    do: to_list((lead*3600)/(v2 - v1))

  defp to_list(seconds),
    do: to_list(trunc(seconds),2,[])

  defp to_list(all, 0, rest) do
    [all|rest]
  end
  defp to_list(all, n, rest) do
    all
    |> div(60)
    |> to_list(n - 1, [rem(all, 60)|rest])
  end
end

IO.inspect Tortoise.race(720,850,70)  # [0,32,18]
IO.inspect Tortoise.race(80,91,37)    # [3,21,49]
IO.inspect Tortoise.race(820,81,550)  # nil
IO.inspect Tortoise.race(642,669,111) # [4,6,40]
IO.inspect Tortoise.race(716,740,98)  # [4,5,0]
IO.inspect Tortoise.race(480,588,117) # [1,5,0]
IO.inspect Tortoise.race(444,504,123) # [2,3,0]
IO.inspect Tortoise.race(311,391,82)  # [1,1,30]
IO.inspect Tortoise.race(720,850,37)  # [0,17,4] test race
IO.inspect Tortoise.race(642,680,142) # [3,44,12]
IO.inspect Tortoise.race(419,504,65)  # [0,45,52]

similarly this works:

Spoiler

  def race(v1, v2, lead) do
    #Solving the following equation for 't': v_a*t + d0_a = v_b*t
    #Yeilds the formula: t = d0_a / (v_b - v_a)
    time_seconds = trunc((lead*3600)/(v2 - v1))
    to_compound_date(time_seconds)
  end

What Every Computer Scientist Should Know About Floating-Point Arithmetic (pdf)

Basically by multiplying lead before the division you are losing less precision. So maybe the exercise should mention that it also contains a “floating point arithmetic” lesson.

NobbZ · October 18, 2018, 5:30pm

The test author does something similar to this: seconds = div(3600 * g, v2 - v1), so he doesn’t even use floating point arithmetic at all.

peerreynders · October 18, 2018, 5:53pm

Cute - didn’t see that “substitution” for getting rid of all the floating point operations.

And given that all the inputs and outputs are integral - maybe it is supposed to serve as a devious warning to avoid floating point if at all possible. (Maybe a reminder that rem/2 and div/2 exist would be helpful).

sribe · October 18, 2018, 8:47pm

Reminds me of a long, long thread on a programming forum where someone was complaining about “bugs” in the platform’s floating-point software. And it turned out that, as you might expect, he was utterly clueless about rounding issues. But worse, he was developing accounting software. And was unable to understand what he was being told by people trying to help. The thread ended with “please tells us the name of the product you’re working on, so we can be sure that none of our clients ever use it.”

Fl4m3Ph03n1x · October 19, 2018, 7:15am

A reminder for Elixir users that those exist would in deed be very useful, given the author’s resistance in fixing the exercise.

However, I am against nudging the description so that it gives you a hint in this specific case. The way I see it, there is not a single right way to solve this issue. Mathematics are complex and the same problem can be reached by using a plenitude of different paths.

It is my strong opinion that many of the solutions rejected by the random tests were in fact valid and that a good exercise should not force you to think or even use the same techniques as the creator, but to entice you into searching for your own path and your own solution to the problem.

This exercise, most often than not, is a guess game at which algorithm the author used…

peerreynders · October 19, 2018, 1:40pm

I think something generic like “Note: Use (any) floating point operations at your own risk” at the end would be helpful in any language just as a pointer to those individuals who may not fully realize that floating point calculations are a compromise.***

The way I see it, there is not a single right way to solve this issue.

The exercise is meant to be solved in an optimal way which has less to do with coding in any particular programming language but more with problem solving within the capabilities and limitations of most current computer hardware.

The “cute background story” is a bit of a distraction

You are supposed to realize that each of two competitors for the purposes of the problem can be represented via the distance as a linear function of time (from the point of view of the delayed competitor). By equating the functions for both competitors and solving for time you are doing most of the “hard work” up front.
You are supposed to be paying attention to the “types”(, units) and the required precision of the problem. The required precision is exactly down to a whole single second - all the inputs and all the outputs are specified as integers. (Type awareness can be a bit of an issue for people used to dynamic languages or languages with type coercion).
Equipped with the right formula and the realization that all quantities are integers you are supposed to take a good, hard look at the calculations involved to see if there is an optimal way of organizing the calculations to minimize or possibly eliminate any possible loss of precision. So in fact the author’s solution that uses only integer operations organized in such a way that sub-second precision is knowingly sacrificed only once is optimal.

is a guess game at which algorithm the author used

No - it’s realizing how to design an optimal solution given certain constraints.

It is my strong opinion that many of the solutions rejected by the random tests were in fact valid

No - the exercise falls flat because of it’s reliance on random testing that can occasionally pass suboptimal solutions.

The typical response would be to increase the number of random test from 50 to 500, 5000 or more. However that would significantly increase the load on the exercise server.

What would make more sense is to “slightly sabotage” the optimal solution (by using one or two floating point operations in the typical fashion) and use random tests to discover the cases that tend to fail suboptimal solutions and to collect those into a suite of static tests (acting more or less as a suite of regression tests).

***PS: Or simply state “this exercise is looking for an optimal solution, not just one that is good enough.”