HexDocs Search - find function, type and callback documentation for (almost) all hexdocs.pm packages

Update: HexDocs Search is no longer available. Please see this message for more information.

Hi! Over the last weeks, we have been working on HexDocs Search, a cross-package function/type/callback documentation search interface containing data for (almost) all packages currently on hexdocs.pm.

To give it a try, visit hexdocs.pandosearch.com and just start typing.

Why this project?

We are Pandosearch, a Dutch company offering search engines for websites and web applications. We have been working with Elixir since 2018 and love the language and its ecosystem.

We regularly try to give back by contributing to Elixir open source projects, but felt we could do more.

Seeing a community need for cross-package HexDocs search, we thought: this is our call!

What can be found?

HexDocs Search currently contains around 433K functions, types and callbacks. Most HexDocs packages are included, but not all. See “Known limitations” for more details on this.

How is data collected?

We collect data by crawling hexdocs.pm twice a day. We use the root sitemap.xml index file on hexdocs.pm as a starting point and use all the URLs found there.

The lastmod values from the XML sitemaps are used to ensure we only recrawl changed pages. This enables us to do a full recrawl in less than ten minutes. This strategy also has the added benefit of avoiding massive request loads on hexdocs.pm.
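The incremental recrawl described above can be sketched roughly as follows. This is a minimal illustration, not our actual crawler: the module name `SitemapDiff`, its functions, and the `seen` map of previously stored lastmod values are hypothetical, and the regex assumes each sitemap entry lists `<loc>` directly followed by `<lastmod>`.

```elixir
defmodule SitemapDiff do
  @moduledoc """
  Hypothetical sketch: select sitemap entries whose lastmod value
  differs from the one stored during the previous crawl.
  """

  # Extracts {loc, lastmod} pairs from a sitemap index or URL set.
  # Assumption: <loc> is immediately followed by <lastmod> in each entry.
  def entries(xml) do
    pattern = ~r{<(?:sitemap|url)>\s*<loc>([^<]+)</loc>\s*<lastmod>([^<]+)</lastmod>}

    for [_full, loc, lastmod] <- Regex.scan(pattern, xml) do
      {loc, lastmod}
    end
  end

  # Returns the URLs that are new or whose lastmod string changed.
  # `seen` maps each URL to the lastmod recorded at the previous crawl.
  def changed(entries, seen) do
    for {loc, lastmod} <- entries, Map.get(seen, loc) != lastmod, do: loc
  end
end
```

Comparing the raw lastmod strings is enough to detect a change; parsing them into `DateTime` values would only be needed to order entries by age.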

Project status

The current project status is probably best described as “beta”. Response times should be quick and the project should generally be up 24/7, but we are still actively working on some details, so a bit of downtime is to be expected every once in a while.

Feedback wanted!

Our first goal after posting this announcement is to gather feedback from the community. This means we need you, as you are probably an Elixir developer reading this!

Any feedback is greatly appreciated. If possible, just use this forum thread, but you can also send a private message if deemed more appropriate. We’ll get back to you as quickly as possible.

Future plans

A few days before posting this announcement, we offered @josevalim early access to gather some initial feedback. The reason for contacting him was this recently added GitHub issue, which outlines a set of features similar to what our project aims to offer.

We are currently in the process of determining if (parts of) HexDocs Search could be integrated into ExDoc/HexDocs for cross-package search. As the outcome of this process is still highly uncertain, José encouraged us to wait no longer and go live with what we have right now. In other words: here we are :).

For future features, we are very much open to suggestions, as our main goal is to maximize value to the community. As such, we deliberately do not have a fixed feature roadmap right now.

Known limitations

No module documentation or non-module pages included yet

We deliberately decided to limit our scope to function/type/callback documentation for now. This means no module documentation, guides and/or other ExDoc-generated HTML pages are included in the results yet.

An important reason for this is that in contrast to single-package search, including these content types for all of HexDocs is quite challenging in terms of correctly ranking and identifying search results.

In addition, due to the more free-form nature of ExDoc guide pages and module documentation, there aren’t any hard guarantees about the exact structure of the HTML output, especially as Markdown also allows raw HTML to be used. This gives us some additional challenges when crawling and rendering these types of content.

Only ExDoc 0.20.0 and up

Due to ExDoc HTML changes over the years, searchable content is limited to HTML generated by ExDoc versions starting from v0.20.0 (released April 2, 2019). If your package is missing, update your ExDoc dependency version before releasing a new package version. Our crawler should pick up the newly generated documentation within 24 hours.
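As an illustration of such a version cutoff, a crawler could check the generator tag of a documentation page. This sketch assumes the page carries a `<meta name="generator" content="ExDoc v…">` tag; the `ExDocVersion` module is hypothetical and is not taken from the HexDocs Search code.

```elixir
defmodule ExDocVersion do
  @moduledoc """
  Hypothetical sketch: decide whether a docs page was generated by
  ExDoc 0.20.0 or later, based on its generator meta tag.
  """

  @min_version "0.20.0"

  # Returns true only when a generator tag is found and its version
  # parses as >= 0.20.0; pages without the tag are treated as unsupported.
  def supported?(html) do
    with [_full, raw] <- Regex.run(~r/<meta name="generator" content="ExDoc v([^"]+)"/, html),
         {:ok, version} <- Version.parse(raw) do
      Version.compare(version, @min_version) != :lt
    else
      _ -> false
    end
  end
end
```

Package authors do not need any of this; raising the `:ex_doc` requirement in `mix.exs` to 0.20.0 or later and publishing a new version is enough.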

iframe documentation rendering

Also due to ExDoc HTML/CSS changes over the years, documentation HTML is loaded in an <iframe> using the original hexdocs.pm CSS. In most cases, this ensures styling is the same as on hexdocs.pm, but some edge cases may have some minor rendering issues.

Note that JavaScript used on hexdocs.pm is not loaded in the <iframe> at all. The main reason is that we are only rendering parts of the original hexdocs.pm HTML layout, which would cause JavaScript element selectors to raise exceptions due to certain elements not being present on the page.

Limited dark/light mode support

Dark and light mode are supported, but only based on system preferences. Manually selected theme preferences set on hexdocs.pm cannot be respected due to localStorage being scoped to a domain.

That’s it for now. Looking forward to your feedback!

– Edward (@edwardsmit), Roeland (@roeland) and Floris (@florish)


How about this one? https://hexpm.docs.apiary.io

See linked guide for more information:

Doesn’t it apply to all documentation, including callbacks, functions and types? Not sure how it has been solved in hex, but especially at the beta stage you could simply ignore HTML, which should be pretty easy …

markdown
|> Floki.parse_fragment!()
|> Enum.filter(&is_binary/1)
|> Enum.join()

The other limitation I found is that you do not support linking to a specific section in the documentation, for example: https://hexdocs.pm/elixir/Kernel.SpecialForms.html#quote/2-quote-and-macros. Search works nicely and fast, but developers still need to find the specific section they were searching for.

There is one important problem with the search hints shown while typing. The suggested hints are terrible for commonly used names like render: the top 3 results are unfortunately useless and there is nothing for phoenix. I’m not saying you should always suggest the most popular packages, but in this specific case it makes much more sense. At least while a query is being typed (especially when the search term does not contain a space), I suggest rejecting all labels that do not contain the search term, which would make it possible to find a specific callback/function/type by its name or part of it. After that I would add some condition like … if there are too many results with the highest score (which indicates generic naming), then I would add an extra sort on package downloads, like

def sorter(score, doc_chunk) do
  {score, doc_chunk.package.downloads}
end
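
A hedged sketch of how the two suggestions above (rejecting hints whose label lacks the search term, then ranking by score with downloads as a tie-breaker) might fit together. The `HintRanking` module and the `label`, `score` and `downloads` keys are hypothetical; the flat `downloads` key stands in for `doc_chunk.package.downloads`.

```elixir
defmodule HintRanking do
  @moduledoc """
  Hypothetical sketch: filter autocomplete hints by label and rank
  them by score, using package downloads to break ties.
  """

  # Keeps only hints whose label contains the term (case-insensitive).
  def filter(hints, term) do
    needle = String.downcase(term)
    Enum.filter(hints, &String.contains?(String.downcase(&1.label), needle))
  end

  # Sorts highest first. Tuples compare element by element, so the
  # download count only matters when two scores are equal.
  def rank(hints) do
    Enum.sort_by(hints, &{&1.score, &1.downloads}, :desc)
  end
end
```

Usage would look like `hints |> HintRanking.filter("render") |> HintRanking.rank()`. Note that the `:desc` option of `Enum.sort_by/3` requires Elixir 1.10 or later.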

Thanks for the feedback!

Our search engine is built to index HTML content, and https://hexdocs.pm has nicely structured HTML and a sitemap for us to consume, which works well. I see no immediate advantage to using Apiary, but I might be missing something?

We do include the documentation for callbacks, functions and types. If you search for x, you get the pin operator as the first result due to the large number of x’s in the examples.

Floris is referring to the HTML not having a structure that lets us correctly pick up the sections in the documentation, which you refer to in the next part of your reply. We are certainly interested in including this, but for now it is out of scope as @florish mentioned.

This is definitely true. We are already boosting the Elixir namespace; using the number of downloads as an additional ranking factor might be interesting. Thanks!

Note that you can include the module or package in your query to get the relevant results. For example https://hexdocs.pandosearch.com/?q=phoenix+ren or https://hexdocs.pandosearch.com/?q=LiveView.r.


happy to see another entry in cross-package search, since my hexdocs.krister.ee is buggy and has problems (usage seems very low so no motivation to deal with other use cases).

I’m quite used to searching for guides and whatever titles are in the hexdocs already so it’s too bad you’re not indexing that. plus the results aren’t really as readable as I would like them to be (hexdocs.pm does that pretty well, it’s why I copied them). so I guess I’ll just stick to using my own tool for now. will keep an eye on this though.


Thanks for the feedback!

I’m quite used to searching for guides and whatever titles are in the hexdocs already so it’s too bad you’re not indexing that.

Good to know, thanks! To be clear: it is already on our feature list, we are just not sure what is possible without tighter integration into HexDocs/ExDoc in terms of navigating from search results to the actual docs content and rendering everything correctly (HTML + CSS). We’ll update this topic after releasing new features.

plus the results aren’t really as readable as I would like them to be (hexdocs.pm does that pretty well, it’s why I copied them).

Oh that’s too bad! Just to understand what you mean: so the function/type/callback documentation HTML/CSS in the main screen area does not look like it does on hexdocs.pm? It should, so maybe that is a bug. You’re welcome to DM me with one or more screenshots and I’ll take a closer look!

On hexdocs.krister.ee I see now that you are loading the full hexdocs.pm HTML page with the original URL anchor. This is an approach we also tried during development, but full remote hexdocs.pm page loading had some drawbacks in terms of loading and rendering speed, especially combined with keyboard (ArrowUp/ArrowDown) navigation, which is why we decided to locally load only the relevant parts of the HTML in the iframe instead.

We will have to reconsider our current approach anyway when working on including guides pages and top-level module documentation, so good to know about your preferences on this.

Thanks again!

sorry, I can be so vague sometimes. by “results” i meant the menu (or search results listing).

when searching on your app I can’t tell what package the function belongs to. in my app I basically kept the hexdocs official structure, because it just makes sense to me. i’ve already learned to use that tree structure they have.

i like it so much in fact that in my app you can hide the search results and see the actual official menu behind it (find the almost invisible arrow). this also helps a lot when I’m searching by proximity (say you want something like Enum.at/2 but one that returns multiple items, what was that function called again? is there more than one option? search “enum at”, hide menu, look at all the functions).


by “results” i meant the menu (or search results listing).

Thanks for clarifying :+1: I’ll refer to this as “autocomplete results” in the rest of my reply.

when searching on your app I can’t tell what package the function belongs to.

While working on the project, our conclusion was that the module name (second line in each autocomplete result in the sidebar) almost always tells you what package a function belongs to. Good to know that in practice this doesn’t work for you.

Similar to adding context to the iframe content (see previous reply), adding the package name to the autocomplete results is also something we will have to consider once guide pages, arbitrary page titles and section headings are included.

i like it so much in fact that in my app you can hide the search results and see the actual official menu behind it (find the almost invisible arrow).

:slight_smile: Today I Learned! Thanks for explaining.

Hi, based on a combination of limited usage and recent information on the current direction for HexDocs/ExDoc multi-package search, we’ve decided to take down our HexDocs Search service.

Thanks for all the feedback in this thread!

Do you plan to publish the archived project?

No, in the current state it is not possible for us to publish the project code as open source.