First Elixir Project (Probably terrible code)

I just finished the first iteration of my first Elixir project. I’m still super new to all of this and would love feedback, pull requests, etc. The code is available here GitHub - nix2intel/builtwith: Builtwith API wrapper written in Elixir and the documentation is here on hex builtwith. I would love to know if I’m doing dumb things, if there are different ways I should approach the problems, etc. I love elixir so far but still trying to grok it as python has been my main programming language for years.

3 Likes

To start with, I’d use Req and not HttpPoison.

Also the Map.get function calls you could just replace with pattern-matching IMO.

1 Like

Maybe Req is now the cool kid in town, however HttpPoison is a battle-tested library that was long before req was created, I use it a lot even to this day. Personally I would just decouple the http client and have a default implementation with whatever is easier to use these days.

2 Likes

Could you describe more what this would look like in this case? right now it kind of looks like this:
Builtwith.make_request(domain: example.com bwpass: api_token_here)

and then it handles building the string in the backend in this case using httpoison. I just want to make sure i’m writing elixiy code and not python code in elixir :smiley:

Are there any good resources you could point me to for pattern matching? This is a fairly new concept to me. I’d love to not have to do map finds in enum statements to get a map that I can pull out top level items from the json.

Just out of curiosity - what is your argument for one, not the other?

Instead of using HttpPoison directly in your codebase, you can define a generic http client interface that is fully configurable.

Here is an example of a interface: lib/ssl_moon/network/http/http_client.ex · main · SSL MOON / SSL MOON · GitLab
And implementation: lib/ssl_moon/network/http/http_client_impl.ex · main · SSL MOON / SSL MOON · GitLab

If you plan on using that code, don’t mind the logic related to redirects as that is irrelevant in your case.

2 Likes

on the topic of pattern matching, from the docs:
https://hexdocs.pm/elixir/pattern-matching.html

And personally, I really like this source as well:

1 Like

AFAIK it’s not advisable to check in the doc directory. It is ignored by the .gitignore generated by ElixirLS; gitignore/Elixir.gitignore at main · github/gitignore · GitHub.

Some other general feedback (not Elixir specific, hope that’s ok):

Improve names

  • Method names like make_request can feel obvious in the moment you are writing the code, but don’t truly communicate what they are doing. Instead, consider lookup_by_domain.

Do one thing

  • Most of the get_ methods are doing more than one thing, for example:
 def get_subdomains(builtwithjson) do
      get_results(builtwithjson)
      |> Enum.find(fn map -> Map.has_key?(map, "Result") end)
      |> case do
        nil ->
          nil
        subdomains ->
        Map.get(subdomains, "Result")
        |> Map.get("Paths")
        |> Enum.map(fn x -> Map.get(x, "SubDomain") end)
        |> Enum.reject(&(&1 == ""))
      end
  end

You are getting the first result from a list of results and getting the SubDomain from it. I know most likely the first result of the list is probably the correct one, otherwise you wouldn’t have built it this way. The issue though is that your library is making a choice that’s hidden from the user. Namely, which one of the results is “correct.” What if my lookup by domain had multiple results and I actually wanted the second one? Instead, make these get_ methods only work with a Result, not a list of Results, e.g. change the above to:

 def get_subdomains(builtwithjson_result) do
      builtwithjson_result
          |> Map.get(subdomains, "Result")
          |> Map.get("Paths")
          |> Enum.map(fn x -> Map.get(x, "SubDomain") end)
        |> Enum.reject(&(&1 == ""))
  end

Now, you do force the library user to pick which result they want, and they could get an error when using the get_ methods, but your current library has the following behavior:

# Imagine the list of results has, in order:
# 1. A result with a "Result" field, but no "Attributes" field.
# 2. A result with an "Atttribute" field.
results = Builtwith.lookup_by_domain("google.com", my_api_key)

# Reads the "SubDomain" field from the first entry in the list of results. 
subdomains = results |> Builtwith.get_subdomains() 

# Read the Attributes field from the second entry in the list of results.
attributes = results |> Builtwith.get_attributes() 

The user now has a list of subdomains and a list of attributes, but they aren’t from the same result. This could be very bad for some APIs. This exact scenario may be impossible with the Builtwith API, but it could be possible with another API with a different data model and could have very bad consequences if you get this mixed up.

Micro optimization: Avoid double Enum calls

This is a small one and someone may correct me if I’m incorrect about this. In get_subdomains:

|> Enum.map(fn x -> Map.get(x, "SubDomain") end)
|> Enum.reject(&(&1 == ""))

Each call to Enum will basically allocate a new list (which will increase memory usage). Instead you can use the Stream API:

|> Stream.map(fn x -> Map.get(x, "SubDomain") end)
|> Enum.reject(&(&1 == ""))

This will lead to a slight optimization to the memory footprint of your code. Note, Streams are lazy - the output of a Stream method is always a Stream, so you will always need to end with an Enum method to get the type you actually want at the end.

2 Likes

BTW Req is inspired by requests, a Python HTTP client…

1 Like

Not to speak for @dimitarvp but Andrea L. has a good write-up from a year ago now:

Despite HTTPoinson not being listed in the mini table of contents, he talks about it near the end.

1 Like

While I have been using HTTPoison for most of my HTTP clients, I am now switching to Req, it has some cool stuff baked in (retry…)

1 Like

Pattern-matchings is blowing my mind, thank you! goodbye python, this is so much cooler!

3 Likes

i’ll look into this in the future, I can already tell a major refactoring going to happen here with some of the above and as I get to know elixir better.

this is good to know thank you, what about priv? i’m trying to find all those edge cases.

thank you so much! these are definetly the problems I want to avoid, i’ll be refactoring based on these comments. Elixir seems like such a beautiful and amazing language and community.

I have priv committed in some of my Phoenix projects, because it contains non-machine generated code (e.g., migrations, javascript, etc).

thank you so much! these are definetly the problems I want to avoid, i’ll be refactoring based on these comments. Elixir seems like such a beautiful and amazing language and community.

I am glad the comments were helpful. And I am glad you are enjoying the language and the community. I feel the same way. The community is one of the best I have seen. :slight_smile:

1 Like

priv is the canonical place to store external resources, ie, non-Elixir files. Stuff in assets gets copied into it which is ignored by default if you use phx_new. You can also store such things in lib if you use the @external_resource module attribute though that’s a bit more advanced!

1 Like