I just finished the first iteration of my first Elixir project. I’m still super new to all of this and would love feedback, pull requests, etc. The code is available here GitHub - nix2intel/builtwith: Builtwith API wrapper written in Elixir and the documentation is here on hex builtwith. I would love to know if I’m doing dumb things, if there are different ways I should approach the problems, etc. I love elixir so far but still trying to grok it as python has been my main programming language for years.
To start with, I’d use Req
and not HttpPoison
.
Also the Map.get
function calls you could just replace with pattern-matching IMO.
Maybe Req
is now the cool kid in town, however HttpPoison
is a battle-tested library that was long before req was created, I use it a lot even to this day. Personally I would just decouple the http client and have a default implementation with whatever is easier to use these days.
Could you describe more what this would look like in this case? right now it kind of looks like this:
Builtwith.make_request(domain: example.com bwpass: api_token_here)
and then it handles building the string in the backend in this case using httpoison. I just want to make sure i’m writing elixiy code and not python code in elixir
Are there any good resources you could point me to for pattern matching? This is a fairly new concept to me. I’d love to not have to do map finds in enum statements to get a map that I can pull out top level items from the json.
Just out of curiosity - what is your argument for one, not the other?
Instead of using HttpPoison
directly in your codebase, you can define a generic http client interface that is fully configurable.
Here is an example of a interface: lib/ssl_moon/network/http/http_client.ex · main · SSL MOON / SSL MOON · GitLab
And implementation: lib/ssl_moon/network/http/http_client_impl.ex · main · SSL MOON / SSL MOON · GitLab
If you plan on using that code, don’t mind the logic related to redirects as that is irrelevant in your case.
on the topic of pattern matching, from the docs:
https://hexdocs.pm/elixir/pattern-matching.html
And personally, I really like this source as well:
AFAIK it’s not advisable to check in the doc
directory. It is ignored by the .gitignore
generated by ElixirLS; gitignore/Elixir.gitignore at main · github/gitignore · GitHub.
Some other general feedback (not Elixir specific, hope that’s ok):
Improve names
- Method names like
make_request
can feel obvious in the moment you are writing the code, but don’t truly communicate what they are doing. Instead, considerlookup_by_domain
.
Do one thing
- Most of the
get_
methods are doing more than one thing, for example:
def get_subdomains(builtwithjson) do
get_results(builtwithjson)
|> Enum.find(fn map -> Map.has_key?(map, "Result") end)
|> case do
nil ->
nil
subdomains ->
Map.get(subdomains, "Result")
|> Map.get("Paths")
|> Enum.map(fn x -> Map.get(x, "SubDomain") end)
|> Enum.reject(&(&1 == ""))
end
end
You are getting the first result
from a list of results
and getting the SubDomain
from it. I know most likely the first result of the list is probably the correct one, otherwise you wouldn’t have built it this way. The issue though is that your library is making a choice that’s hidden from the user. Namely, which one of the results is “correct.” What if my lookup by domain had multiple results and I actually wanted the second one? Instead, make these get_
methods only work with a Result
, not a list of Results
, e.g. change the above to:
def get_subdomains(builtwithjson_result) do
builtwithjson_result
|> Map.get(subdomains, "Result")
|> Map.get("Paths")
|> Enum.map(fn x -> Map.get(x, "SubDomain") end)
|> Enum.reject(&(&1 == ""))
end
Now, you do force the library user to pick which result they want, and they could get an error when using the get_
methods, but your current library has the following behavior:
# Imagine the list of results has, in order:
# 1. A result with a "Result" field, but no "Attributes" field.
# 2. A result with an "Atttribute" field.
results = Builtwith.lookup_by_domain("google.com", my_api_key)
# Reads the "SubDomain" field from the first entry in the list of results.
subdomains = results |> Builtwith.get_subdomains()
# Read the Attributes field from the second entry in the list of results.
attributes = results |> Builtwith.get_attributes()
The user now has a list of subdomains and a list of attributes, but they aren’t from the same result. This could be very bad for some APIs. This exact scenario may be impossible with the Builtwith API, but it could be possible with another API with a different data model and could have very bad consequences if you get this mixed up.
Micro optimization: Avoid double Enum
calls
This is a small one and someone may correct me if I’m incorrect about this. In get_subdomains
:
|> Enum.map(fn x -> Map.get(x, "SubDomain") end)
|> Enum.reject(&(&1 == ""))
Each call to Enum
will basically allocate a new list (which will increase memory usage). Instead you can use the Stream
API:
|> Stream.map(fn x -> Map.get(x, "SubDomain") end)
|> Enum.reject(&(&1 == ""))
This will lead to a slight optimization to the memory footprint of your code. Note, Streams are lazy - the output of a Stream
method is always a Stream
, so you will always need to end with an Enum
method to get the type you actually want at the end.
BTW Req is inspired by requests, a Python HTTP client…
Not to speak for @dimitarvp but Andrea L. has a good write-up from a year ago now:
Despite HTTPoinson not being listed in the mini table of contents, he talks about it near the end.
While I have been using HTTPoison for most of my HTTP clients, I am now switching to Req, it has some cool stuff baked in (retry…)
Pattern-matchings is blowing my mind, thank you! goodbye python, this is so much cooler!
i’ll look into this in the future, I can already tell a major refactoring going to happen here with some of the above and as I get to know elixir better.
this is good to know thank you, what about priv? i’m trying to find all those edge cases.
thank you so much! these are definetly the problems I want to avoid, i’ll be refactoring based on these comments. Elixir seems like such a beautiful and amazing language and community.
I have priv
committed in some of my Phoenix projects, because it contains non-machine generated code (e.g., migrations, javascript, etc).
thank you so much! these are definetly the problems I want to avoid, i’ll be refactoring based on these comments. Elixir seems like such a beautiful and amazing language and community.
I am glad the comments were helpful. And I am glad you are enjoying the language and the community. I feel the same way. The community is one of the best I have seen.
priv
is the canonical place to store external resources, ie, non-Elixir files. Stuff in assets
gets copied into it which is ignored by default if you use phx_new
. You can also store such things in lib
if you use the @external_resource
module attribute though that’s a bit more advanced!