How would I write this algorithm in Elixir?

So I have this text file here containing a lot of passwords
https://raw.githubusercontent.com/danielmiessler/SecLists/master/Passwords/Common-Credentials/10-million-password-list-top-1000000.txt

I want a way to check a password against this list. I would fetch that URL with HTTPoison, and the response body would contain all of the passwords, but how should I compare them?
The user enters a password, and I then check whether that password is within this “list”. I know I could use a contains check, but are there any recommendations?

Is it for making sure users aren't using common passwords? I would probably convert that txt file to SQLite, store it on the server, and query it with Ecto.

Can’t you just put the passwords as keys in an ets table?

Uh, I'm not sure. I want to keep it to a basic defmodule with a function.

the most brutal way would be

"passwords.txt"
|> File.read!()
|> String.split("\n")
|> Enum.into(%{}, &{&1, true})

and check with Map.has_key?/2

You can create the map at compile time by making it a module attribute.
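A minimal sketch of the module-attribute idea, assuming the list has already been saved to a local file. The module name `PasswordList`, the function `common?/1`, and the tiny sample file written at the top are hypothetical stand-ins so the snippet runs on its own. It uses a MapSet instead of a map of `true` values, which is a substitution on my part, not something suggested above.

```elixir
# Write a tiny stand-in list so the sketch is self-contained.
# In a real project this would be the downloaded password file.
path = Path.join(System.tmp_dir!(), "common_passwords_sample.txt")
File.write!(path, "123456\npassword\nqwerty\n")

defmodule PasswordList do
  # Ask Mix to recompile this module when the file changes
  # (only has an effect inside a Mix project).
  @external_resource path

  # Evaluated once, at compile time; the set is embedded in the module.
  @passwords path
             |> File.stream!()
             |> Stream.map(&String.trim_trailing(&1, "\n"))
             |> MapSet.new()

  # Returns true when the password appears in the list.
  def common?(password), do: MapSet.member?(@passwords, password)
end

IO.inspect(PasswordList.common?("123456"), label: "common?")
```

With the full list, expect a noticeably slower compile and a larger compiled module, since every entry is baked into the beam file.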

You don’t want to download and parse that file each time you have a password to check. I’d probably store the list of passwords in :persistent_term and regularly update it with quantum/Parent.Periodic/Oban/a gen_server/…. Then, when it’s time to check a password, pull out the list and compare the password to its contents. That is, if you want to avoid needing to do a release to get an updated list of bad passwords. Otherwise what @Sebb suggested works at compile time.
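A rough sketch of the :persistent_term approach, assuming the passwords have already been fetched and split into lines by whatever periodic job is in use. The module name `CommonPasswords`, its functions, and the key are hypothetical; only `:persistent_term.put/2` and `:persistent_term.get/2` come from OTP.

```elixir
defmodule CommonPasswords do
  # Hypothetical key; a tuple with the module name avoids clashes.
  @key {__MODULE__, :passwords}

  # Call this from the periodic job with any enumerable of passwords.
  def put(passwords) do
    set =
      passwords
      |> Stream.reject(&(&1 == ""))
      |> MapSet.new()

    :persistent_term.put(@key, set)
  end

  # Falls back to an empty set if nothing has been loaded yet.
  def common?(password) do
    @key
    |> :persistent_term.get(MapSet.new())
    |> MapSet.member?(password)
  end
end

CommonPasswords.put(["123456", "password", ""])
IO.inspect(CommonPasswords.common?("123456"), label: "common?")
```

Reads from :persistent_term are cheap, but replacing a term is comparatively expensive for the VM, so this fits a list that is refreshed rarely rather than continuously.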

Example script

Depending on your use case you can use this code in many ways:

Mix.install(~w[httpoison http_stream]a)
Application.put_env(:http_stream, :adapter, HTTPStream.Adapter.HTTPoison)

defmodule Example do
  @url "https://raw.githubusercontent.com/danielmiessler/SecLists/master/Passwords/Common-Credentials/10-million-password-list-top-1000000.txt"

  def sample do
    password = "123456"
    path = url_to_path(@url)
    IO.inspect(Example.find(@url, password), label: "Example.find(url, password)")
    IO.inspect(File.exists?(path), label: "File.exists?/1")
    IO.inspect(Example.save(@url, path), label: "Example.save/3")
    IO.inspect(File.exists?(path), label: "File.exists?/1")
    IO.inspect(Example.find(path, password), label: "Example.find(path, password)")
    File.rm(path)
    IO.puts("Removed file before next save")
    IO.inspect(Example.save_and_find(@url, path, password), label: "Example.save_and_find/3")
    IO.inspect(File.exists?(path), label: "File.exists?/1")
    IO.puts("Cleanup (removed saved file)")
    File.rm(path)
  end

  def save_and_find(url, path \\ nil, text, opts \\ []) do
    path = ensure_path(path, url)
    url |> ensure_saved(path, opts) |> find(text)
  end

  def find(path_or_url, text, opts \\ []) do
    opts = Keyword.put(opts, :flatten, true)
    path_or_url |> stream(opts) |> Enum.find(&(&1 == text))
  end

  defp ensure_saved(url, path, opts) do
    force_save = opts[:force] || false
    if force_save or not File.exists?(path), do: save(url, path, opts), else: path
  end

  def save(url, path \\ nil, opts \\ []) do
    path = ensure_path(path, url)
    File.touch(path)
    file_stream = stream(path, Keyword.put(opts, :raw_stream, true))
    url |> stream(opts) |> Stream.into(file_stream) |> Stream.run()
    path
  end

  defp ensure_path(nil, url), do: url_to_path(url)
  defp ensure_path(path, _url), do: path

  def url_to_path(url), do: url |> URI.parse() |> Map.get(:path) |> Path.basename()

  def stream(path_or_url, opts \\ []) do
    host = URI.parse(path_or_url).host

    if is_nil(host) do
      raw_stream = opts[:raw_stream] || false
      stream = File.stream!(path_or_url)
      if raw_stream, do: stream, else: Stream.map(stream, &String.trim_trailing(&1, "\n"))
    else
      flatten = opts[:flatten] || false
      stream = HTTPStream.get(path_or_url)
      if flatten, do: Stream.flat_map(stream, &String.split(&1, "\n")), else: stream
    end
  end
end

Example.sample()

Notes

In some cases you may want to keep the downloaded file in memory instead of writing it to disk. In that case you can easily modify the above code to use, for example, an ets table instead of File.stream!/1, as mentioned already.

As mentioned already, you can also use said stream at compile time to generate one function clause per password and rely on pattern matching, for example:

defmodule Password do
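  # `url` was not bound in the original snippet; the Example module above must be compiled first.
  url = "https://raw.githubusercontent.com/danielmiessler/SecLists/master/Passwords/Common-Credentials/10-million-password-list-top-1000000.txt"
  stream = Example.stream(url, flatten: true)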

  for password <- stream do
    def valid?(unquote(password)), do: true
  end

  def valid?(_invalid_password), do: false
end

Helpful resources

Articles:

  1. Download Large Files with HTTPoison Async Requests
  2. Elixir Streams to process large HTTP responses on the fly

Dependencies:

  1. httpoison
  2. http_stream

Documentation:

  1. Application.put_env/3
  2. Enum.find/2
  3. File.exists?/1
  4. File.rm/1
  5. File.stream!/1
  6. File.touch/1
  7. HTTPStream.get/1
  8. IO.inspect/2
  9. IO.puts/1
  10. Keyword.put/3
  11. Map.get/2
  12. Mix.install/1
  13. Stream.flat_map/2
  14. Stream.into/2
  15. Stream.map/2
  16. String.trim_trailing/2
  17. URI.parse/1

I would just parse them only once and put them in a DB and would then just query against it.

While many answers are good, none of them brought up the point that you haven’t provided an algorithm to translate to Elixir. Instead, you’ve given a vague goal without making clear what your constraints are. So you’ve got plenty of solutions, each as good as it can be in this context.
However, some of the solutions imply traversing the whole list of 1_000_000 records (the linked file is the top 1,000,000 of the 10-million list). Is that acceptable? Is it fast enough?
Some (e.g. ets) imply storing those records in RAM (I’d suspect the ets table would grow to 10M, which may or may not be acceptable).
Some imply macros, which will lead to a larger application size…
All of this to say, what you want is unclear. Had you provided an actual algorithm to port, maybe the trade-offs would have been clear.
If you can spare the RAM, then the ets solution with the password as key will probably be the fastest here (provided you use :ets.lookup/2).

could you link a source to ets?

Also, here is an example :ets-based script:

Mix.install(~w[httpoison http_stream]a)
Application.put_env(:http_stream, :adapter, HTTPStream.Adapter.HTTPoison)

defmodule Example do
  @url "https://raw.githubusercontent.com/danielmiessler/SecLists/master/Passwords/Common-Credentials/10-million-password-list-top-1000000.txt"

  def fetch(table) do
    @url
    |> HTTPStream.get()
    |> Stream.flat_map(&String.split(&1, "\n"))
    |> Stream.map(&:ets.insert(table, {&1}))
    |> Stream.run()
  end

  def find(table, text) do
    case :ets.lookup(table, text) do
      [{^text}] -> text
      [] -> nil
    end
  end
end

table = :ets.new(:table_name, [:set])
Example.fetch(table)
Example.find(table, "123456")
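One caveat with splitting each HTTP chunk on "\n", as both scripts do: as far as I can tell, chunks are not guaranteed to end on a line boundary, so a password that straddles two chunks would be stored as two broken halves. A possible fix is to carry the unfinished tail of each chunk over to the next one. The sketch below assumes a hypothetical helper module `Lines` and uses only `Stream.concat/2`, `Stream.transform/3`, `String.split/2`, and `Enum.split/2`; the chunks are hard-coded here so it runs without a network.

```elixir
defmodule Lines do
  # Turns a stream of arbitrary binary chunks into a stream of complete lines.
  def from_chunks(chunks) do
    chunks
    # Append a marker so the last unfinished line can be flushed.
    |> Stream.concat([:eof])
    |> Stream.transform("", fn
      :eof, "" ->
        {[], ""}

      :eof, rest ->
        {[rest], ""}

      chunk, rest ->
        # Everything except the last piece is a complete line;
        # the last piece may continue in the next chunk.
        {lines, [new_rest]} =
          (rest <> chunk)
          |> String.split("\n")
          |> Enum.split(-1)

        {lines, new_rest}
    end)
  end
end

["123", "456\npass", "word\nqwe", "rty\n"]
|> Lines.from_chunks()
|> Enum.to_list()
|> IO.inspect(label: "lines")
```

In the scripts above, this would take the place of the `Stream.flat_map(&String.split(&1, "\n"))` step, with the HTTPStream.get/1 result as the chunk source.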