Playing with DeepSeek for code generation

vkryukov · December 27, 2024, 8:40am

I’m playing with DeepSeek V3, a new open-source 671B-parameter model released by a Chinese LLM provider. It is said to be close in quality to the leading closed-source models (Claude and GPT-4o), and in my experience it’s pretty good (and ~10-15X cheaper than the competition when you use their API, with no restrictions).

One thing that surprised me: when I asked it to fix the KV tutorial code (essentially, to implement “release buckets on exit”), instead of creating a map of refs as in the tutorial, it simply used the pid from the :DOWN message, arriving at the following solution:

  @impl true
  def handle_cast({:create, name}, names) do
    if Map.has_key?(names, name) do
      {:noreply, names}
    else
      {:ok, bucket} = KV.Bucket.start_link([])
      # Monitor the bucket process
      Process.monitor(bucket)
      {:noreply, Map.put(names, name, bucket)}
    end
  end

  @impl true
  def handle_info({:DOWN, _ref, :process, pid, _reason}, names) do
    # Remove the bucket from the registry when it exits
    names = Enum.into(Enum.reject(names, fn {_name, bucket_pid} -> bucket_pid == pid end), %{})
    {:noreply, names}
  end

Of course, nothing is ever rosy with LLM-produced code - its original suggestion didn’t have Enum.into, and since Enum.reject returns a list of {name, pid} tuples, it was silently replacing the names map with a list whenever a bucket was removed (but DeepSeek was able to resolve the issue once I supplied it with a stack trace). And of course with a large number of processes, using a second map for refs will be more performant than linearly scanning every entry of the names map. Notice also that most Elixir developers(?) would probably write

  names =
    names
    |> Enum.reject(...)
    |> Enum.into(%{})

to make the code more readable.
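
To make the map-vs-list pitfall concrete, here is a small sketch you can paste into iex. The atoms :pid_a and :pid_b are just stand-ins for real bucket pids, and the Map.reject/2 variant is my own addition (it assumes Elixir 1.13 or later), not something DeepSeek suggested:

```elixir
# Hypothetical registry state; atoms stand in for real bucket pids.
names = %{"shopping" => :pid_a, "todo" => :pid_b}

# Enum.reject/2 treats the map as an enumerable of {key, value} tuples
# and returns a list, not a map.
as_list = Enum.reject(names, fn {_name, pid} -> pid == :pid_a end)

# Piping the result into Enum.into/2 restores the map shape.
as_map =
  names
  |> Enum.reject(fn {_name, pid} -> pid == :pid_a end)
  |> Enum.into(%{})

# Map.reject/2 (Elixir 1.13+) keeps the result a map in one step.
direct = Map.reject(names, fn {_name, pid} -> pid == :pid_a end)
```

All three variants still walk every entry of the map, so this only fixes the shape of the result, not the linear cost.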

But I’m still impressed - I personally didn’t think of that solution when I was going through the docs for the first or second time.
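
For comparison, here is a minimal sketch of the refs-based bookkeeping mentioned above, roughly along the lines of the tutorial. The module name RefsRegistrySketch is made up, and a plain Agent stands in for KV.Bucket so the snippet runs on its own:

```elixir
defmodule RefsRegistrySketch do
  use GenServer

  # Client API (hypothetical names, for illustration only)
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)
  def create(server, name), do: GenServer.cast(server, {:create, name})
  def lookup(server, name), do: GenServer.call(server, {:lookup, name})

  @impl true
  def init(:ok) do
    # State is {names, refs}: name => pid and monitor ref => name.
    {:ok, {%{}, %{}}}
  end

  @impl true
  def handle_call({:lookup, name}, _from, {names, _refs} = state) do
    {:reply, Map.fetch(names, name), state}
  end

  @impl true
  def handle_cast({:create, name}, {names, refs} = state) do
    if Map.has_key?(names, name) do
      {:noreply, state}
    else
      # Stand-in for KV.Bucket.start_link([])
      {:ok, bucket} = Agent.start_link(fn -> %{} end)
      # Remember which name this monitor ref belongs to
      ref = Process.monitor(bucket)
      {:noreply, {Map.put(names, name, bucket), Map.put(refs, ref, name)}}
    end
  end

  @impl true
  def handle_info({:DOWN, ref, :process, _pid, _reason}, {names, refs}) do
    # Two map operations instead of a scan over all names
    {name, refs} = Map.pop(refs, ref)
    {:noreply, {Map.delete(names, name), refs}}
  end

  def handle_info(_msg, state), do: {:noreply, state}
end
```

With this layout each :DOWN message costs a Map.pop and a Map.delete, at the price of carrying a second map in the state. For a handful of buckets the pid-scanning version is probably just as good.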