Translate code from c++ to Elixir

Hello everyone!

I’m newbie on Elixir programming, actually I’m trying to figure out how translate a code from c++ to elixir, I can not achieve all these conditions on a loop, and also the manipulate of variables inside those loops, here the code on c++:

https://github.com/forrestthewoods/lib_fts/blob/master/code/fts_fuzzy_match.h

Actually I see Enum functions like reduce, reduce_while but in this case have 2 conditions on this loop, what I have (Just a really base code) is this:

defmodule Okamiserver.Searcher.Search do
# fts fuzzy match port from https://github.com/forrestthewoods/lib_fts/blob/master/code/fts_fuzzy_match.h

# Score consts
@adjacency_bonus = 5                # bonus for adjacent matches
@separator_bonus = 10               # bonus if match occurs after a separator
@camel_bonus = 10                   # bonus if match is uppercase and prev is lower
@leading_letter_penalty = -3        # penalty applied for every letter in str before the first match
@max_leading_letter_penalty = -9    # maximum penalty for leading letters
@unmatched_letter_penalty = -1      # penalty for every letter that doesn't matter

def fuzzy_search(pattern, str) do
    
end
defp _fuzzy_search([], [], out_score, _, _, _, _, _, _) do
    {false, out_score}
end
defp _fuzzy_search(_, _, out_score, _, _, _, _, _, recursion_count) 
    when recursion_count >= 10 #recursion limit
do
    {false, out_score}
end
defp _fuzzy_search([shead | stail], [phead | ptail], out_score,
    str_begin, str_matches, matches, max_matches, next_match,
    recursion_count)
do
    recursive_match = false
    best_recursive_matches = Enum.to_list 0..256
    best_recursive_score = 0
    _fuzzy_while(true, pattern, str)
end
defp _fuzzy_while(first_match, pattern, str) do
    if String.downcase pattern == String.downcase str do
        [_ | pattern] = pattern
    end
    [_ | str] = str
    _fuzzy_while(first_match, pattern, str)
end
end

I hope someone can guide me to achieve this!

Good day all!

PD: I’m sorry if my English isn’t proficient, I still learning

2 Likes

I haven’t looked closely at the code but I think your best bet is to fully understand the code and implement it in elixir as opposed to translating c++ to elixir. I believe it could limit the notion of ‘matching’ the c++ code which might help you to think about it more functionally as opposed to OO. Because it doesn’t have to match the code, it just has to solve the problem using elixir in a functional way. Just my opinion and a random thought.

Paul

3 Likes

I can translate it if you want, but you really should try it first. :slight_smile:

The way you have started is good. Just remember that the conditions of your loop as well as data carried between loop iterations just become arguments to your recursive functions. Keeping that in mind then direct translations become much easier. :slight_smile:

1 Like

One small point. You don’t want to bind a variable inside a block and expect to use it outside the block. Try this instead.

defp _fuzzy_while(first_match, pattern, str) do
    pattern = if String.downcase pattern == String.downcase str do
        tl pattern
    else
        pattern
    end
    [_ | str] = str
    _fuzzy_while(first_match, pattern, str)
end

or even even a little less verbose:

defp _fuzzy_while(first_match, pattern, [_ | str_t] = str) do
    if String.downcase pattern == String.downcase str, do: tl(pattern), else: pattern
        _fuzzy_while(first_match, pattern, str_t)
    end
end

You may want to consider using a map for all your state. This will reduce the number of parameters you need to pass around and still give you the flexibility to pattern match.

Something like:

defp _fuzzy_search([shead | stail], [phead | ptail], state) do
    str_begin, str_matches, matches, max_matches, next_match,
    recursion_count)
do 
   state = 
     Enum.into [
         recursive_match: false, 
         best_recursive_matches: Enum.to_list(0..256), 
         best_rescursive_score: 0], state

   # the line below is a problem. You are setting/updating variables above, but 
   # they are not passed to the recursive function. So, the assignments are not 
   # used. You probably need to add the state to _fuzzy_while, have it update the 
   # state and return it.
    _fuzzy_while(true, pattern, str)
end
2 Likes

Thanks guys for your answers!

@paulsullivanjr sure you’re right I try todo my best, I come from c# never explore a lot c++ but this code it’s very similar in certain way, and about the solution on elixir I know it but I like “match” the code for the moment until I still learning (I’m sure I will find more cool things on elixir to achieve this on a better way thats my goal)

@OvermindDL1 thanks for your comment I do the best I can at this time, but I really don’t achieve some parts cause I don’t understand the c++ work on the documentation, hope you ca help me on that

@smpallen99 thanks for your answer, I hange some things on the code (I hope I implemented it in a correct way) based on your recommendation!

Well thats all what I achieve to “match” from the c++, I really don’t understand how works the char* ++ and memcpy (it’s the same to asing the variable?) or thing like “str - strBegin” (char* - char* euh?), here it’s my actual code:

defmodule Okamiserver.Searcher.Search do
    # fts fuzzy match port from https://github.com/forrestthewoods/lib_fts/blob/master/code/fts_fuzzy_match.js

    # Score consts
    @sequential_bonus = 5               # bonus for adjacent matches
    @separator_bonus = 10               # bonus if match occurs after a separator
    @camel_bonus = 10                   # bonus if match is uppercase and prev is lower
    @first_letter_bonus = 15            # bonus if the first letter is matched

    @leading_letter_penalty = -5        # penalty applied for every letter in str before the first match
    @max_leading_letter_penalty = -15   # maximum penalty for leading letters
    @unmatched_letter_penalty = -1      # penalty for every letter that doesn't matter

    def fuzzy_search(pattern, str) do
        
    end
    defp _fuzzy_search([], [], out_score, _, _, _, _, _, _) do
        {false, out_score}
    end
    defp _fuzzy_search(_, _, out_score, _, _, _, _, _, recursion_count) 
        when recursion_count >= 10 #recursion limit
    do
        {false, out_score}
    end
    defp _fuzzy_search(pattern, str, out_score,
        str_begin, str_matches, matches, max_matches, next_match,
        recursion_count)
    do
        recursion_count = recursion_count + 1
        recursive_match = false
        best_recursive_matches = Enum.to_list 0..256
        best_recursive_score = 0
        case _fuzzy_while(true, pattern, str) do
            {false, out_score} ->
                {false, out_score}
            {true, state} ->
                [
                    recursive_match, 
                    best_recursive_matches, 
                    best_recursive_score, 
                    str_begin,
                    next_match, 
                    max_matches, 
                    recursion_count,
                    pattern,
                    str
                ] = state

                # Determine if full pattern was matched
                matched = pattern == ""

                #Calculate score
                if matched do
                    # Iterate str to end | I think it's the same to str = ""
                    # I'm not really sure if ++ a char in c++ it's equal to iterate
                    str = ""

                    # Initialize Score
                    out_score = 100
                    penalty = @leading_letter_penalty * Enum.at(matches, 0)
                    if penalty < @max_leading_letter_penalty do
                        penalty = @max_leading_letter_penalty
                    end
                    out_score = out_score + penalty

                    # Apply unmatched penalty
                    unmatched = 0 #TODO
                    out_score = out_score + @unmatched_letter_penalty * unmatched

                    # TODO: Apply ordering bonuses

                # Return best result
                cond do
                    recursiveMatch and (not matched or best_recursive_score > out_score) ->
                        # Recursive score is better than "this"
                        #memcpy TODO
                        out_score = best_recursive_score
                        # & in c++ argument is by ref so return it
                        {true, out_score, recursion_count}
                    matched ->
                        # & in c++ argument is by ref so return it
                        {true, out_score, recursion_count}
                    true -> 
                        # & in c++ argument is by ref so return it
                        {false, out_score, recursion_count}
                end
            end
        end
    end
    defp _fuzzy_while(_, recursive_match, best_recursive_matches, best_recursive_score,
        pattern, [], str_begin, next_match, max_matches, recursion_count, src_matches) do
        state = 
            Enum.into [recursive_match, 
                best_recursive_matches, 
                best_recursive_score, 
                str_begin, 
                next_match, max_matches, 
                recursion_count,
                src_matches,
                pattern,
                ""], []
        {true, state}        
    end
    defp _fuzzy_while(_, recursive_match, best_recursive_matches, best_recursive_score,
        [], str, str_begin, next_match, max_matches, recursion_count, src_matches) do
        state = 
            Enum.into [recursive_match, 
                best_recursive_matches, 
                best_recursive_score, 
                str_begin,
                next_match, max_matches, 
                recursion_count,
                src_matches,
                "", # "" for str and pattern
                str], []
        {true, state}        
    end
    defp _fuzzy_while(first_match, recursive_match, best_recursive_matches, best_recursive_score,
        pattern, [_ | str_t] = str, str_begin, next_match, max_matches, recursion_count, src_matches) do
        if String.downcase pattern == String.downcase str do 
            tl(pattern)
            if nextMatch >= maxMatches do
                {false, :error} #return
            else
                # TODO: "Copy-on-Write" src_matches into matches
                if first_match && src_matches do
                    # memcpy(matches, src_matches, nextMatch); (?)
                    first_match = false;
                end                  

                # Recursive call that "skips" this match
                recursive_matches = Enum.to_list 0..256
                # if recursion_count >= 10 we add 1 before function to check it
                recursion_count = recursion_count + 1
                # 0 for recursive_score
                case _fuzzy_search(pattern, str, 0, str_begin, matches, #TODO: str + 1 WTF?
                    recursive_matches, 256, nextMatch, recursion_count) do
                    {true, recursive_score, recursion_count} ->
                        #TODO: Pick best recursive score
                        if not recursive_match or recursive_score > best_recursive_score do
                            # memcpy (?)
                            best_recursive_score = recursive_score
                        end
                        recursive_match = true
                end
                # Advance 
                next_match = next_match + 1
                # TODO: matches[nextMatch++] = (uint8_t)(str - strBegin); (?)
                pattern_t = tl(pattern)
                _fuzzy_while(first_match, pattern_t, str_t, next_match, max_matches)
            end
        else
            _fuzzy_while(first_match, pattern, str_t, nextMatch, maxMatches)
        end
    end
end

Hope can help me, thanks guys!

This is a pretty complex algorithm to start with when learning Elixir, especially if you don’t know C++ very well. Perhaps you want to start with a simpler problem?

First, you need to be careful porting code from an imperative language. The functional solution often looks much different because of pattern matching, immutable state, etc.

Here are few issue that stand out in your solution

The tl(pattern) does nothing below. Since variables are immutable, you need to alway capture the return of function all, or pipe it into another function. In the case below tl(pattern) returns the tai of the pattern. But since you don’t assign that to a variable, the statement does nothing

tl(pattern)
            if nextMatch >= maxMatches do
                {false, :error} #return

You should not assign a variable in a for, else, case, cond block and use it outside the block:

            if first_match && src_matches do
                # memcpy(matches, src_matches, nextMatch); (?)
                first_match = false;
            end   

You need to assign the result of the if and else blocks to the variable like this:

first_match = 
  if first_match && src_matches do
    false
  else
    first_match
  end

As for the memcpy(matches, src_matches, nextMatch); call statement, this is where your going to need to update to some state variable which you need to pass to the recursive function.

You don’t need to use variables if they are not needed later in the code. For example, the following code:

            next_match = next_match + 1
            # TODO: matches[nextMatch++] = (uint8_t)(str - strBegin); (?)
            pattern_t = tl(pattern)
            _fuzzy_while(first_match, pattern_t, str_t, next_match, max_matches)

can be replaced with:

            next_match = next_match + 1
            # TODO: matches[nextMatch++] = (uint8_t)(str - strBegin); (?)
            # pattern_t = tl(pattern) don't need this
            _fuzzy_while(first_match, tl(pattern), str_t, next_match, max_matches)

I don’t think you even need to track a next_match. This is typically done in an imperative language as an index to an array. In Elixir, we just keep prepending items to a list and then reverse the list when the iterations are done. We prepend because its much more efficient than appending.

That small piece of code could look like this:

# Probably don't need max_matches. you get get that by length(matches)
# The (uint8_t)(str - strBegin) is not Elixir code. I just didn't translate it
_fuzzy_while(first_match, tl(pattern), str_t, [ (uint8_t)(str - strBegin) | matches])

In the following code, you are putting a list into a list:

    state = 
        Enum.into [recursive_match, 
            best_recursive_matches, 
            best_recursive_score, 
            str_begin,
            next_match, max_matches, 
            recursion_count,
            src_matches,
            "", # "" for str and pattern
            str], []

This can be replaced with:

    state = 
         [recursive_match, 
            best_recursive_matches, 
            best_recursive_score, 
            str_begin,
            next_match, max_matches, 
            recursion_count,
            src_matches,
            "", # "" for str and pattern
            str]

Generally, your if statements are way to long. In Elixir, statements like those are broken into smaller functions. The pipe operator is used to chain these smaller functions together. Functional programming is about transforming data. So, designs typically have small functions that take some data in and return a transformation of the output.

Keep at it. Learning function programming can be tough at first, but you eventually learn to think functionally. It takes some practice. But once you make the transformation, you may not want to program imperatively anymore. That was my experience.

4 Likes

@smpallen99 So many thanks for your help! a lot of errors omg but well you’re right It’s a complex algorithm for a begginer on elixir/c++ but hum I like challenges hahahahaha

This is what I achieve (I’m sure it’s not the elixir way if someone can correct those error I will be very happy) :slight_smile:

  defmodule Okamiserver.Searcher.Search do
    # fts fuzzy match port from https://github.com/forrestthewoods/lib_fts/blob/master/code/fts_fuzzy_match.js

    # Score consts
    @sequential_bonus 5               # bonus for adjacent matches
    @separator_bonus 10               # bonus if match occurs after a separator
    @camel_bonus 10                   # bonus if match is uppercase and prev is lower
    @first_letter_bonus 15            # bonus if the first letter is matched

    @leading_letter_penalty -5        # penalty applied for every letter in str before the first match
    @max_leading_letter_penalty -15   # maximum penalty for leading letters
    @unmatched_letter_penalty -1      # penalty for every letter that doesn't matter

    def fuzzy_search(pattern, str) do
        _fuzzy_search(pattern |> String.codepoints, 
        str |> String.codepoints, 0, str, [], [], 1)
    end
    defp _fuzzy_search([], [], out_score, _, _, _, _, _, _) do
        {false, out_score}
    end
    defp _fuzzy_search(_, _, out_score, _, _, _, _, _, recursion_count)
        when recursion_count >= 10 #recursion limit
    do
        {false, out_score}
    end
    defp _fuzzy_search(pattern, str, out_score,
        str_begin, str_matches, matches,
        recursion_count)
    do
        case _fuzzy_while(true, false, [], 0, pattern, str,
            str, [], recursion_count, str_matches) do
            {false, out_score} ->
                {false, out_score}
            {true, state} ->
                [
                    recursive_match,
                    best_recursive_score,
                    str_begin,
                    matches,
                    recursion_count,
                    pattern,
                    str
                ] = state

                # Determine if full pattern was matched
                matched = pattern === [] and length(matches) > 0

                #Calculate score
                if matched do
                    str = []

                    # Initialize Score
                    out_score = 100
                    penalty = @leading_letter_penalty * Enum.at(matches, 0)
                    if penalty < @max_leading_letter_penalty do
                        # first letter don't match
                        penalty = @max_leading_letter_penalty
                    end
                    out_score = out_score + penalty

                    # Apply unmatched penalty
                    unmatched = String.replace_suffix(Enum.join(str_begin, ""), Enum.join(str, ""), "")
                    |> String.length |> Kernel.-(length(matches)) #TODO
                    out_score = out_score + @unmatched_letter_penalty * 
                        unmatched

                    # TODO: Apply ordering bonuses, I was thinking replace with something like head | tail but don't find a way to it
                    for i <- 0..(length(matches) - 1) do
                        curr_value = Enum.at matches, i

                        if i > 0 do
                            # Secuential
                            prev_value = Enum.at matches, (i - 1)

                            if curr_value === (prev_value + 1) do
                                out_score = out_score + @sequential_bonus
                            end
                        end

                        # Check for bonuses based on neighbor character value
                        if curr_value > 0 do
                            # Camel case
                            neighbor = Enum.at str_begin, (curr_value - 1)
                            curr = Enum.at str_begin, curr_value
                            if String.downcase(neighbor) === neighbor and
                                String.downcase(curr) === curr do
                                out_score = out_score + @camel_bonus
                            end

                            # Separator
                            if neighbor === "_" or neighbor === " " do
                                out_score = out_score + @separator_bonus
                            end
                        else
                            # First letter
                            out_score = out_score + @first_letter_bonus
                        end
                    end
                end
                # Return best result
                IO.inspect best_recursive_score
                IO.inspect out_score
                cond do
                    recursive_match and (not matched or best_recursive_score > 
                        out_score) ->
                        # Recursive score is better than "this"
                        out_score = best_recursive_score
                        # & in c++ argument is by ref so return it
                        {true, out_score, recursion_count}
                    matched ->
                        # & in c++ argument is by ref so return it
                        {true, out_score, recursion_count}
                    true -> 
                        # & in c++ argument is by ref so return it
                        {false, out_score, recursion_count}
                end
        end
    end
    defp _fuzzy_while(_f_m, recursive_match, _b_r_m, best_recursive_score,
        pattern, [], str_begin, matches, recursion_count, _s_m) do
        state =
            [recursive_match,
                best_recursive_score,
                str_begin,
                Enum.reverse(matches),
                recursion_count,
                pattern,
                []]
        {true, state}
    end
    defp _fuzzy_while(_f_m, recursive_match, _b_r_m, best_recursive_score,
        [], str, str_begin, matches, recursion_count, _s_m) do
        state =
            [recursive_match,
                best_recursive_score,
                str_begin,
                Enum.reverse(matches),
                recursion_count,
                [], # [] for str and pattern
                str]
        {true, state}
    end
    defp _fuzzy_while(first_match, recursive_match, best_recursive_matches,
        best_recursive_score, [pattern_h | pattern_t] = pattern,
        [str_h | str_t] = str, str_begin, matches, recursion_count,
        src_matches) do
        if String.downcase(pattern_h) === String.downcase(str_h) do
            if length(matches) >= 255 do
                {false, :error} #return
            else
                # "Copy-on-Write" src_matches into matches
                first_match =
                    if first_match && src_matches do
                        # memcpy(matches, src_matches, next_match);
                        matches = src_matches
                        false
                    else
                        first_match
                    end

                # Recursive call that "skips" this match
                recursive_matches = []
                # if recursion_count >= 10 we add 1 before function to check it
                recursion_count = recursion_count + 1
                # 0 for recursive_score
                # str + 1 its equal to get the tail on elixir
                case _fuzzy_search(pattern, str_t, 0, str_begin, matches,
                    recursive_matches, recursion_count) do
                    {true, recursive_score, recursion_count} ->
                        # Pick best recursive score
                        if not recursive_match or recursive_score >
                            best_recursive_score do
                            #memcpy it's equal to asing
                            best_recursive_matches = recursive_matches
                            best_recursive_score = recursive_score
                        end
                        recursive_match = true
                    {_, _, recursion_count} ->
                        IO.inspect "keeping the recursion_count"
                end
                # Advance 
                # matches[nextMatch++] = (uint8_t)(str - strBegin); equal to:
                # String.replace_suffix "League of Legends", "Legends", ""
                # "League of " |> String.length = 10
                _fuzzy_while(first_match, recursive_match, best_recursive_matches,
                    best_recursive_score, pattern_t, str_t, str_begin,
                    [String.replace_suffix(Enum.join(str_begin, ""), Enum.join(str, ""), "") 
                    |> String.length | matches],
                    recursion_count, src_matches)
            end
        else
            _fuzzy_while(first_match, recursive_match, best_recursive_matches,
                    best_recursive_score, pattern_t, str_t, str_begin, matches,
                    recursion_count, src_matches)
        end
    end
end

I check how work code with a c++ debugger and fix many aspects on code, but this while loop it’s making me crazy :confused:

Good day all! And again thanks for your help

A while loop of this complexity is very difficult to covert to Elixir because of all the state changes (local variables) that occur.

You are still missing the point I’ve made on 2 previous posts on this thread. You don’t want to use a variable that is set inside a block. Although it might still work with the current version of Elixir, it will spit out a lengthy compile warning. I suggest you try it on on a small piece of code.

defmodule Mine do
  def test(new_val) do
    var = 1
    if new_val > 1 do
      var = new_val
    end
    IO.puts "var = #{var}"
  end
end

Then try this

def PleaseDo do
  def test(new_val) do
    var = 1
    var = 
      if new_val > 1 do
        new_val
      else
        var
      end
    IO.puts "var = #{var}"
  end

  def even_better_test(new_val) do
    1
    |> check_and_set_var(new_val)
    |> puts
  end

  defp check_and_set_var(_val, new_val) when new_val > 1, do: new_val
  defp check_and_set_var(val, _new_val), do: val
  defp puts(var), do: IO.puts("var = #{var}")
end

Do you understand this? Can you see the issues with your code where you are setting variables inside your for comprehension and if statements?

You appear to be use for the same as you would for other imperative languages. That is wrong. The for comprehension takes a list (or range) as input, and returns a list. You need to either return the results of a for, assign it to a local variable for later use, or pipe it into another function that expects its first argument to be a list or an Enumerable.

In Elixir, we don’t generally use indexing functions like you would use in imperative languages. You can do it this way, but most of the time, its better just to enumerate over a collection. In the event that you do need an index for a list, you can use Enum.with_index. For example:

Interactive Elixir (1.4.2) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> defmodule Special do
...(1)>   def square_every_fifth(list) do
...(1)>     list
...(1)>     |> Enum.with_index
...(1)>     |> Enum.map(fn {item, count} ->
...(1)>       if count == 0 or rem(count + 1, 5) == 0 do
...(1)>         item * item
...(1)>       else
...(1)>         item
...(1)>       end
...(1)>     end)
...(1)>   end
...(1)> end
{:module, Special,
 <<70, 79, 82, 49, 0, 0, 6, 228, 66, 69, 65, 77, 69, 120, 68, 99, 0, 0, 0, 162,
   131, 104, 2, 100, 0, 14, 101, 108, 105, 120, 105, 114, 95, 100, 111, 99, 115,
   95, 118, 49, 108, 0, 0, 0, 4, 104, 2, ...>>, {:square_every_fifth, 1}}
iex(2)> Special.square_every_fifth(2..20)
[4, 3, 4, 5, 36, 7, 8, 9, 10, 121, 12, 13, 14, 15, 256, 17, 18, 19, 20]
iex(3)>

Here is a more involved example showing how to structure an Elixir program for complex state management.

defmodule Example do
  def start do
    inital_state()
    |> update_state 
  end

  def inital_state do
    %{var1: 1, var2: 2, var3: 3, list: [2,3,4,5,6]} 
  end

  def update_state(state) do
    Enum.reduce(state[:list], state, fn item, %{var1: var1} = acc -> 
      case item do
        2 when var1 == 1 -> 
          Map.put(acc, :var1, 2)
        3 -> 
          if var1 == 1 and acc[:var3] == 3 do
            acc
            |> Map.put(:var3, 0)
            |> Map.put(:var2, 42)
          else
            acc
          end
        _ -> 
          acc   # do nothing
      end
    end)
    |> handle_var2_conditions
    |> handle_var3_conditions
  end

  defp handle_var2_conditions(state) do
    if state[:var2] == 1 and hd(state[:list]) * 2 == 8 do
      Map.update(state, :var2, &(&1 + 1))  # increment var2 if true
    else
      state  # do nothing otherwise
    end 
  end

  defp handle_var3_conditions(state) do
    if rem(state[:var3], 2) == 0 or Enum.at(state[:list], state[:var3]) == 0 do
      Map.put(state, :var3, 0)
    else
      state
    end
  end
end

and the output:

iex(10)> Example.start()
%{list: [2, 3, 4, 5, 6], var1: 2, var2: 2, var3: 3}
iex(11)>

Note that I’m trying to give you examples on how to structure your program. I could port the code for you, but I think you will learn much more by taking my examples and applying them yourself.

I would suggest that you try again, with only one of the fancy conditions and try to get something working on the C++ and Elixir site. This way you can test inputs and outputs on both and compare the results. It also allows you to get some compiled Elixir code working, trouble shoot the compile errors and warns. I would make sure you resolve all elixir compile warnings before posting the code here for review.

Let me know us know if you have specific questions about this post.

2 Likes

@smpallen99 Really! So many thanks!!! Your examples help me a LOT, I rewrite all the code again I think now it’s more “Elixir way” but if you can confirm that hahahaha, here the code

defmodule OkamiServer.Search.Searcher do

# Score consts
@sequential_bonus 5               # bonus for adjacent matches
@separator_bonus 10               # bonus if match occurs after a separator
@camel_bonus 10                   # bonus if match is uppercase and prev is lower
@first_letter_bonus 15            # bonus if the first letter is matched

@leading_letter_penalty -5        # penalty applied for every letter in str before the first match
@max_leading_letter_penalty -15   # maximum penalty for leading letters
@unmatched_letter_penalty -1      # penalty for every letter that doesn't matter

def search(pat, str) do
  _search(pat, str, 0, str, 1)
end
defp _search(pat, str, score, str_begin, rec_count) do
  case _while(pat, str, str_begin, rec_count) do
    {:ok, state, pat} ->
      _calculate_score(state, str_begin, pat)
    false ->
      {:false, score, []}
  end
end

defp _calculate_score(state, str_begin, pat) do
  100
  |> _calculate_penalty(Enum.at(state[:matches], 0))
  |> _cal_unmatched(str_begin, state)
  |> _iter_matches(state[:matches], str_begin)
  |> _calculate_return(state, pat)
end
defp _calculate_return({_, score}, state, pat) do
  cond do
    state[:recursive_match] and
      (pat != [] or state[:best_rec_score] > score) ->
        {true, state[:best_rec_score], state[:best_rec_matches]}
    pat === [] ->
      {true, score, state[:matches]}
    true ->
      {false, score, state[:matches]}
  end
end
defp _calculate_penalty(out_score, match_score) do
  if @leading_letter_penalty * match_score < @max_leading_letter_penalty do
    out_score + @leading_letter_penalty * match_score
  else
    out_score
  end
end
defp _cal_unmatched(out_score, str_begin, state) do
  out_score + @unmatched_letter_penalty *
    (String.replace_suffix(Enum.join(str_begin, ""), "", "")
    |> String.length |> Kernel.-(length(state[:matches])))
end

defp _iter_matches(out_score, matches, str_begin) do
  matches
  |> Enum.with_index
  |> Enum.map_reduce(out_score, fn({item, count}, score) ->
    {item, score
    |> _iter_sequential(matches, count)
    |> _iter_bonuses(str_begin, item)}
  end)
end
defp _iter_sequential(score, matches, count) do
  if count > 0 and count == Enum.at(matches, count - 1) do
    score + @sequential_bonus
  else
    score
  end
end
defp _iter_bonuses(score, str_begin, item) do
  if item > 0 do
    neighbor = Enum.at(str_begin, item - 1)
    score
    |> _camel_case(str_begin, item, neighbor)
    |> _separator(neighbor)
  else
    score + @first_letter_bonus
  end
end
defp _camel_case(score, str_begin, count, neighbor) do
  if neighbor == String.downcase(neighbor) and
      Enum.at(str_begin, count) === String.downcase(Enum.at(str_begin, count))
  do
    score + @camel_bonus
  else
    score
  end
end
defp _separator(score, neighbor) do
  if neighbor === "_" or neighbor === " " do
    score + @separator_bonus
  else
    score
  end
end

defp _while(pat, str, str_begin, rec_count) do
  state = %{
    rec_match: false, #rec_match
    best_rec_matches: [], # best_rec_matches
    best_rec_score: 0, # best_rec_score
    first_match: true, # first_match
    matches: [], # matches
    src_matches: nil # src_matches
  }
  _while(state, pat, str, str_begin, rec_count)
end
defp _while(state, [], _, _, _) do
  {:ok,state,[]}
end
defp _while(state, pat, [], _, _) do
  {:ok,state,pat}
end
defp _while(state,[pat_h | pat_t] = pat,[str_h | str_t] = str,
  str_begin, rec_count) do
  #can't call String.downcase on a def when so here use cond
  pat =
    if String.downcase(pat_h) === String.downcase(str_h) do
      rec_count = rec_count + 1 #TODO:how to fix this unsafe?
      state =
        state
        |> check_first_match
        |> check_recursive(pat, str_t, str_begin, rec_count)
        |> update_matches(str, str_begin)
      pat_t
    else
      pat
    end
  _while(state, pat, str_t, str_begin, rec_count)
end

defp check_first_match(state) do
  if state[:first_match] and state[:src_matches] != nil do
    state
    |> Map.put(:first_match, false)
  else
    state
  end
end

defp check_recursive(state, pat, str, str_begin, rec_count) do
  case _search(pat, str, 0, str_begin, rec_count) do
    {true, score, matches} ->
      recursive_matches state, score, matches
    _ ->
      state
  end
end

defp recursive_matches(state, score, matches) do
  state =
    if state[:recursive_match] or score > state[:best_rec_score] do
      state
      |> Map.put(:best_rec_matches, matches)
      |> Map.put(:best_rec_score, score)
    else
      state
    end
  state
  |> Map.put(:recursive_map, true)
end

defp update_matches(state, str, str_begin) do
  state
  |> Map.put(:matches,
    [String.replace_suffix(Enum.join(str_begin, ""), Enum.join(str, ""), "")
    |> String.length | state[:matches]])

end
end

What think about the new code? :smiley:

2 Likes

Wow, that looks much better! One problem area

defp _while(state,[pat_h | pat_t] = pat,[str_h | str_t] = str,
  str_begin, rec_count) do
  #can't call String.downcase on a def when so here use cond
  pat =
    if String.downcase(pat_h) === String.downcase(str_h) do
      rec_count = rec_count + 1 #TODO:how to fix this unsafe?
      state =
        state
        |> check_first_match
        |> check_recursive(pat, str_t, str_begin, rec_count)
        |> update_matches(str, str_begin)
      pat_t
    else
      pat
    end
  _while(state, pat, str_t, str_begin, rec_count)
end

You are assigning state in the if block and using it later. Try this instead:

defp _while(state,[pat_h | pat_t] = pat,[str_h | str_t] = str,
  str_begin, rec_count) do
  #can't call String.downcase on a def when so here use cond
  {pat, state} =
    if String.downcase(pat_h) === String.downcase(str_h) do
      rec_count = rec_count + 1 #TODO:how to fix this unsafe?
      state =
        state
        |> check_first_match
        |> check_recursive(pat, str_t, str_begin, rec_count)
        |> update_matches(str, str_begin)
      {pat_t, state}
    else
      {pat, state}
    end
  _while(state, pat, str_t, str_begin, rec_count)
end

I have really only look for obvious errors and have not verified the logic. I don’t see any other glaring errors.

There still might be an opportunity to optimize implement in more of an elixir way. But you can look at that after you get the algorithm compiling and passing some tests.

2 Likes

Again thanks @smpallen99 your code works totally fine I just add the rec_count parameter, with this all compiler errors are deleted, also I check again all the code and fix a lot of little mistakes on translate, this is the new one:

     defmodule OkamiServer.Search.Searcher do

  # Score consts
  @sequential_bonus 5               # bonus for adjacent matches
  @separator_bonus 10               # bonus if match occurs after a separator
  @camel_bonus 10                   # bonus if match is uppercase and prev is lower
  @first_letter_bonus 15            # bonus if the first letter is matched

  @leading_letter_penalty -5        # penalty applied for every letter in str before the first match
  @max_leading_letter_penalty -15   # maximum penalty for leading letters
  @unmatched_letter_penalty -1      # penalty for every letter that doesn't matter

  def search(pat, str) do
    _search(pat |> String.codepoints ,
      str |> String.codepoints , 0, str |> String.codepoints, 1)
  end
  defp _search(_, _, score, _, rec_count) when rec_count >= 10 do
    {:false, score, []}
  end
  defp _search([], _, score, _, _) do # pattern empty
    {:false, score, []}
  end
  defp _search(_, [], score, _, _) do # or str empty
    {:false, score, []}
  end
  defp _search(pat, str, score, str_begin, rec_count) do
    case _while(pat, str, str_begin, rec_count) do
      {:ok, state, pat} ->
        _calculate_score(state, str_begin, pat, score)
      false ->
        {:false, score, []}
    end
  end

  defp _calculate_score(state, str_begin, pat, score) do
    if pat == [] do
      100
      |> _calculate_penalty(Enum.at(state[:matches], 0))
      |> _cal_unmatched(str_begin, state)
      |> _iter_matches(state[:matches], str_begin)
      |> _calculate_return(state, pat)
    else
      _calculate_return({:ok, score}, state, pat)
    end
  end
  defp _calculate_return({_, score}, state, pat) do
    cond do
      state[:rec_match] && (pat != [] || state[:best_rec_score] > score)
        ->
        {true, state[:best_rec_score], state[:best_rec_matches]}
      pat === [] ->
        {true, score, state[:matches]}
      true ->
        {false, score, state[:matches]}
    end
  end
  defp _calculate_penalty(out_score, match_score) do
    penalty =
      if match_score == nil do
        0
      else
        @leading_letter_penalty * match_score
      end
    if penalty < @max_leading_letter_penalty do
      out_score + @max_leading_letter_penalty
    else
      out_score + penalty
    end
  end
  defp _cal_unmatched(out_score, str_begin, state) do
    out_score + @unmatched_letter_penalty *
      (String.replace_suffix(Enum.join(str_begin, ""), "", "")
      |> String.length |> Kernel.-(length(state[:matches])-1))
  end

  defp _iter_matches(out_score, matches, str_begin) do
    matches
    |> Enum.with_index
    |> Enum.map_reduce(out_score, fn({item, count}, score) ->
      {item, score
      |> _iter_sequential(matches, count, item)
      |> _iter_bonuses(str_begin, item)}
    end)
  end
  defp _iter_sequential(score, matches, count, item) do
    if count > 0 && item == (Enum.at(matches, count - 1) + 1) do
      score + @sequential_bonus
    else
      score
    end
  end
  defp _iter_bonuses(score, str_begin, item) do
    if item > 0 do
      neighbor = Enum.at(str_begin, item - 1)
      score
      |> _camel_case(neighbor, Enum.at(str_begin, item))
      |> _separator(neighbor)
    else
      score + @first_letter_bonus
    end
  end
  defp _camel_case(score, neighbor, curr) do
    if neighbor == String.downcase(neighbor) && curr === String.upcase(curr)
    do
      score + @camel_bonus
    else
      score
    end
  end
  defp _separator(score, neighbor) do
    if neighbor === "_" || neighbor === " " do
      score + @separator_bonus
    else
      score
    end
  end

  defp _while(pat, str, str_begin, rec_count) do
    state = %{
      rec_match: false, #rec_match
      best_rec_matches: [], # best_rec_matches
      best_rec_score: 0, # best_rec_score
      first_match: true, # first_match
      matches: [], # matches
      src_matches: nil # src_matches
    }
    _while(state, pat, str, str_begin, rec_count)
  end
  defp _while(state, [], _, _, _) do
    {:ok,
    state |> Map.put(:matches, Enum.reverse(state[:matches])),
    []}
  end
  defp _while(state, pat, [], _, _) do
    {:ok,
    state |> Map.put(:matches, Enum.reverse(state[:matches])),
    pat}
  end
  defp _while(state, pat,[_ | str_t] = str,
    str_begin, rec_count) do
    #can't call String.downcase on a def when so here use cond
    case _check_while(state, pat, str, str_begin, rec_count) do
      {pat, state, rec_count} ->
        _while(state, pat, str_t, str_begin, rec_count)
      false ->
        false
    end
  end
  defp _check_while(state,[pat_h | pat_t] = pat,[str_h | str_t] = str,
    str_begin, rec_count) do
    if String.downcase(pat_h) === String.downcase(str_h) do
      if length(state[:matches]) >= 255 do
        false
      else
        rec_count = rec_count + 1
        state =
          state
          |> check_first_match
          |> check_recursive(pat, str_t, str_begin, rec_count)
          |> update_matches(str, str_begin)
        {pat_t, state, rec_count}
      end
    else
      {pat, state, rec_count}
    end
  end

  defp check_first_match(state) do
    if state[:first_match] && state[:src_matches] != nil do
      state
      |> Map.put(:matches, state[:src_matches])
      |> Map.put(:first_match, false)
    else
      state
    end
  end

  defp check_recursive(state, pat, str, str_begin, rec_count) do
    case _search(pat, str, 0, str_begin, rec_count) do
      {true, score, matches} ->
        recursive_matches state, score, matches
      _ ->
        state
    end
  end

  defp recursive_matches(state, score, matches) do
    state =
      if state[:rec_match] === false ||
        score > state[:best_rec_score] do
        state
        |> Map.put(:best_rec_matches, matches)
        |> Map.put(:best_rec_score, score)
      else
        state
      end
    state
    |> Map.put(:rec_match, true)
  end

  defp update_matches(state, str, str_begin) do
    state
    |> Map.put(:matches,
      [String.replace_suffix(Enum.join(str_begin, ""), Enum.join(str, ""), "")
      |> String.length | state[:matches]])

  end
end

The code runs normally, and return the values, but this are wrong compared with the c++ code, for example:

The pattern “LoL” and str “League of Legends” in c++ give : 161 of score but in my code gives 131 of score, I already check all code many times but i really can’t figure out what is the problem, can someone give me a hand to know what is the error?

Many thanks all! :smiley:

I believe you have the C++ version working in a debugger. You might want to try https://github.com/qhool/quaff It should allow you to set breakpoints and single step through your Elixir code. I have only briefly tried that package, so I comment on how well it works.

The other option is to add logging statements in your code to compare the execution of the Elixir version vs the C++ version.

@smpallen99 Thanks to answer, thats exactly what I’m doing actually, but I’m facing a problem directly with Elixir apending numbers, here the problem:

If I append a 10 yo a empty list it return as char (I know this is not a error from elixir but crash my code), here an example

 iex(45)> OkamiServer.Search.Searcher.search("LoL", "League of Legends")
"enter"
 %{best_rec_matches: [], best_rec_score: 0, first_match: false, matches: '\n',
  rec_match: false, src_matches: []}
 %{best_rec_matches: [], best_rec_score: 0, first_match: true,
  matches: [0, '\n'], rec_match: false, src_matches: nil}
 %{best_rec_matches: [], best_rec_score: 0, first_match: true,
   matches: [7, 0, '\n'], rec_match: false, src_matches: nil}
 %{best_rec_matches: [], best_rec_score: 0, first_match: true,
   matches: [10, 7, 0, '\n'], rec_match: false, src_matches: nil}
 ** (ArithmeticError) bad argument in arithmetic expression
(okami_server) lib/okami_server/search/searcher.ex:70: OkamiServer.Search.Searcher._calculate_penalty/2
(okami_server) lib/okami_server/search/searcher.ex:46: OkamiServer.Search.Searcher._calculate_score/4
 iex(45)> [10 | []]
'\n'

In this case I’m calculating the score of matches but @leading_letter_penalty * match_score (in this case match_score be ‘n’ give the aritmetic error) I research but I just found how its fixed on IO.inspect adding char_list: false to the options but in this case I cant figure out how to fix it :confused:

For note [0, '\n'] === [0, [10]], so you have a list inside of a list, is that what you intended to do?

Inspecting lists is a source of confusion for many people learning elixir. So, let me explain. First, [10] is a one element list with the integer 10. Always. The inspect protocol looks at a list and if it contains all printable characters, it prints it as a char list. that is \n == [10].

Looks like you may have missed a list prepend | operator somewhere. Your building the list correctly, but as @OvermindDL1 says, your starting with [0, ‘\n’] == [0, [10]] and building on that [7, 0, ‘\n’], [10, 7, 0, ‘\n’], …

Oh no no this was just an example, here the real code:

[String.replace_suffix(Enum.join(str_begin, ""), Enum.join(str, ""), "")
  |> String.length | state[:matches]]

In this case str_begin = “League of Legends” and str = “Legends” gives 10 but the state[:matches] in empty on first case, as you can see on the previous post the matches:

%{best_rec_matches: [], best_rec_score: 0, first_match: true,
 matches: [10, 7, 0, '\n'], rec_match: false, src_matches: nil}

Oh yes but if you see the logs the first is giving me this :

%{best_rec_matches: [], best_rec_score: 0, first_match: false, matches: '\n',
 rec_match: false, src_matches: []}

this ‘\n’ is the result of [10 | []] (see the my answer to @OvermindDL1), the 0 on the next list are a append of the same funcion show before

@smpallen99 @OvermindDL1 You’r right, I check and find the problem I was doing this[matches | state[:matches]] instead of this List.flatten([matches | state[:matches]]) on another function, I post complete code when its ready, thanks

If your sure matches and state[:matches] are both lists, then I would to matches ++ state[:matches]. That is, if you sure they are both flat lists. I don’t believe there would be any performance issue.

++ involves rebuilding the left side to store the right side at the end of it, it is not a cheap operation.

Flatten would work to fix it, however the question is how you are storing a list into a list in the first place? (I’ve not had a moment to look close at the code yet… ^.^;)