Hey everyone, I’d like to share a project I’ve been working on together with three other students for one of our university’s courses, ‘Requirements Engineering and Software Startups’.
It is called Artfacer, and it is a search engine that crawls the different graphical art services out there, to inspire visitors.
The reason I am sharing this here is that it has been built using Elixir and Phoenix. Right now we do use some Postgres (but not much of the Postgres-specific or even SQL-specific stuff, so we might move this to Mnesia at some point, when there is Ecto 2 support for it). For the fuzzy text searching, we use ElasticSearch. (On one hand it feels a bit unfortunate to bring a JVM in here as well. On the other, as far as I know there is no other tool that makes content as discoverable as ElasticSearch does.)
But the most important part is of course the Elixir code: the business logic, the crawlers and the web-facing user interface have all been written using Elixir in a nice supervision tree. (Which we want to improve further; crawler errors can currently bring the visitor-facing app down, which is not what we want. Maybe we’ll change it to an umbrella setup. So many possibilities… hooray for proper separation of concerns!)
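One way the crawler isolation mentioned above could look is sketched below. This is not Artfacer's actual code: every module name (`Sketch.Web`, `Sketch.Crawler`, `Sketch.CrawlerSupervisor`, `Sketch.TopSupervisor`) is a hypothetical stand-in, and the workers are empty GenServers. The idea is simply to give the crawlers their own supervisor, so that a crashing crawler is restarted there without touching the web-facing process.

```elixir
defmodule Sketch.Web do
  # Hypothetical stand-in for the visitor-facing part of the app.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(opts), do: {:ok, opts}
end

defmodule Sketch.Crawler do
  # Hypothetical stand-in for a single crawler process.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(opts), do: {:ok, opts}
end

defmodule Sketch.CrawlerSupervisor do
  # Crawlers live under their own supervisor, with their own restart budget.
  use Supervisor

  def start_link(opts), do: Supervisor.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    Supervisor.init([Sketch.Crawler], strategy: :one_for_one)
  end
end

defmodule Sketch.TopSupervisor do
  # Top of the tree: the web process and the crawler subtree are siblings.
  use Supervisor

  def start_link(opts), do: Supervisor.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    children = [Sketch.Web, Sketch.CrawlerSupervisor]
    # :one_for_one restarts only the child that died; siblings are left alone.
    Supervisor.init(children, strategy: :one_for_one)
  end
end
```

With this layout, killing `Sketch.Crawler` makes `Sketch.CrawlerSupervisor` start a fresh crawler while `Sketch.Web` keeps its pid. It is not a complete firewall: if the crawler subtree keeps crashing faster than the supervisors' restart limits allow, the failure still travels up the tree, so those limits would need tuning for a real crawler.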
I will go into a little more detail about our stack over the next couple of days, and I would also love to answer questions.
And all feedback on the site (all of the interface, the usability and our architectural choices) is of course extremely welcome!
~Qqwy/Wiebe-Marten and the others of Team Artfacer
I like that it is fast and accessible too
Have you thought about putting some lists on the homepage - such as most common searches, most clicked images etc?
How are you storing the images? I expect over time it will require some hefty disk space (or do you delete old images routinely?)
Thank you!
Yes, we have! This feature is actually due to be released in a first version today (of course, after that, the styling might be refined multiple times).
Images are fetched and stored using the Arc library. Amongst other things, this wraps calling ImageMagick to create the different lower-resolution versions of the images for us (for use on mobile, tablet, etc.).
Indeed, images can take up storage space quite quickly. Right now, the ~100_000 images in the application together take up about a quarter of a terabyte. I have no idea if there is a better way to manage this. Requesting all images only when they are shown is probably too slow (also, most of the services dislike hotlinking), so we need to store them in one way or another.
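To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes the quarter of a terabyte means 250 GB in decimal units and that the total includes the resized versions; the module name `StorageEstimate` and the one-million-image projection are purely illustrative.

```elixir
defmodule StorageEstimate do
  # Average storage per image in MB, given a total in GB (decimal units).
  def average_mb_per_image(total_gb, image_count) do
    total_gb * 1000 / image_count
  end

  # Projected total in GB if the average size per image stays the same.
  def projected_gb(average_mb, image_count) do
    average_mb * image_count / 1000
  end
end

average = StorageEstimate.average_mb_per_image(250, 100_000)
IO.puts("Average per image: #{average} MB")
IO.puts("Projected for 1_000_000 images: #{StorageEstimate.projected_gb(average, 1_000_000)} GB")
```

Under those assumptions each image costs about 2.5 MB including its smaller versions, so ten times as many images would need roughly 2.5 TB.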
Wouldn’t there be copyright concerns with storing images on your own server or do you handle that concern after crawling?
We store a copy of the images so we can show smaller versions in the interface, which is a requirement for mobile phones, and the services we crawl want you to do that to reduce their bandwidth usage.
We do not alter the images in any way (other than making smaller, resized versions to reduce load for mobile visitors), nor do we claim them to be our own: we mention the original author and the location each image came from, and people can go to that location directly by clicking on one of the images.
As far as we know (of course, we’re four Computing Science students, so our law experience is very limited), what we do falls under the Dutch/European Citaatrecht (right to quote) and the Fair Use doctrine in the United States (and I believe other American countries have similar rules).
Artfacer’s functionality is very similar to that of any other search engine, so we expect the same rules to apply.
The homepage now contains links to common tags as well as a ‘show all’ button.
There is now also a maturity filter, by popular request.