Basically I’m trying to implement a very minimal autocomplete using channels and ETS. I’m not too keen on pulling in Elasticsearch, as the number of records is only about 1000. I’m not exactly sure how to go about this.
Some ideas I had were:
Insert them using sorted charlists, e.g. "Elixir" would be inserted as `:ets.insert(:cache, {"key", 'eiilrx', "Elixir"})`, and then do an ETS match spec: `:ets.fun2ms(fn {_, ['eii' | _], v} -> v end)`. The only problem here is that it doesn’t work, because `['eii' | _]` tries to pattern match as `[[101, 105, 105] | _]`, which obviously won’t match `[101, 105, 105, ...]` because of the nested list. So my question with this option is: is there a way to pattern match a charlist like I am trying above, without it being nested?
Create keys using word partials, e.g. if I had "elixir" and "phoenix" stored:
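For what it’s worth, one way to flesh out the word-partials idea is to insert every prefix of each word as its own key in a `duplicate_bag` table, so the lookup is a single key read. A minimal sketch (the table and word list here are made up for illustration):

```elixir
# Sketch of the "word partials as keys" idea: every prefix of each
# word becomes a key in a duplicate_bag table, so autocomplete is a
# single :ets.lookup on whatever the user has typed so far.
table = :ets.new(:autocomplete, [:duplicate_bag])

words = ["Elixir", "Erlang", "Phoenix"]

for word <- words do
  down = String.downcase(word)

  for n <- 1..String.length(down) do
    :ets.insert(table, {String.slice(down, 0, n), word})
  end
end

# One key lookup; cost depends on the number of matches found, not
# on the total table size:
matches = for {_prefix, word} <- :ets.lookup(table, "el"), do: word
# matches == ["Elixir"]
```

The trade-off is storage: each word is inserted once per character, which is fine at ~1000 records.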
Oh, I came up with the same question a few months ago. I don’t think ETS is the best solution to your problem; I think it’s much better to use something like a trie. These are blazingly fast and easy to implement.
The drawback with a plain trie is that you can’t make any typos in your search (and based on your post you need to handle users’ typos), but you can use the trie to implement other cool things like Levenshtein automata. Here are a few good articles:
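To show how small a trie can be in Elixir, here is a minimal map-based sketch (my own illustration, not code from the linked articles):

```elixir
defmodule Trie do
  # Each node is a map from grapheme to child node; the :end? key
  # marks that a complete word terminates at this node.
  def insert(trie, ""), do: Map.put(trie, :end?, true)

  def insert(trie, word) do
    {h, rest} = String.split_at(word, 1)
    Map.update(trie, h, insert(%{}, rest), &insert(&1, rest))
  end

  # Walking a prefix touches one node per grapheme, so lookup cost
  # depends on the prefix length, not on how many words are stored.
  def prefix?(_trie, ""), do: true

  def prefix?(trie, prefix) do
    {h, rest} = String.split_at(prefix, 1)

    case Map.fetch(trie, h) do
      {:ok, child} -> prefix?(child, rest)
      :error -> false
    end
  end
end

trie = Enum.reduce(["elixir", "phoenix"], %{}, fn w, t -> Trie.insert(t, w) end)
Trie.prefix?(trie, "eli")  # => true
Trie.prefix?(trie, "elm")  # => false
```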
Hey, thanks, that actually looks really interesting! My only concern with this is that it would require loading the entire trie into memory in order to traverse it, which might not be as efficient? Unless I’m missing something here.
You’d have to keep the ETS table in memory as well. Of course you can serialize each subtrie to disk, say one per starting letter, or maybe you can think of some nested scheme. But you should only do that if you experience problems with keeping it in memory, or if you can tell in advance that it won’t fit.
Also, matching on your prefix should be possible like this: `prefix = 'eii'; match([prefix | _])`.
Oops, yeah, I meant ETS is in memory as well, but my thought was (I could be wrong here) that with the trie you would have to bring the WHOLE structure into memory before performing some operation on it, whereas with ETS you would only be copying the one object from memory.
Also, are you able to provide an example of that match? I can’t seem to get it to work:
Oh right, is there a way to do it without `:ets.fun2ms`? Or are you saying this can’t be done at all? Basically I just need a way to turn `['abc' | _]`, which is really `[[97, 98, 99] | _]`, into `[97, 98, 99 | _]`.
With such a small set, perhaps it would make more sense to ship all the entries to the browser and do autocomplete there, so you could avoid constant round trips?
Which browser? I can’t find anything in the context of the question that would imply the existence of one.
Just to play around with the possibility, can you also provide an example where the actual value to match on is dynamic, as in the question?
Disclaimer: I’m still asking and trying to solve this out of curiosity. There are much more efficient ways to solve this, such as the already-mentioned trie (where we find all matches in time proportional to the length of the given prefix). In an ETS table, to get the same result we have to scan the complete table every time, so response time is proportional to the size of the table.
The minimalist solution is an HTML5 `datalist` and call it a day. If you aren’t JavaScript-averse, take your pick; for example, Pixabay’s autoComplete can search local data provided in an array.
Yeah, I was thinking about that, but the size of the actual data may be quite a bit larger than that (not 100% sure at the moment). What I’ve come up with currently is this:
I’ve modified the trie example from @wolfdan above and added the ability to store an ID at the end, and it seems to be working fairly well: roughly 20 microseconds with 2,000 items up to around 10,000 microseconds with 250,000 items, depending on the number of objects in there and the number of results for the search.
So basically the trie stores values with a structure of `{term, id}`.
For a first cut it seems like it works. Once I have a real idea of the numbers etc. I’ll make a decision on which direction to go; if it doesn’t look like there is going to be a heap of objects, then I will most likely just send it all to the browser.
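In case it helps anyone reading along, collecting every `{term, id}` pair under a prefix can be sketched like this. The node layout and names are my own guess at the shape described, not the actual modified code:

```elixir
defmodule IdTrie do
  # Each node is a map from grapheme to child node; a :hit key holds
  # the {term, id} tuple for a word terminating at that node.
  def insert(trie, "", term, id), do: Map.put(trie, :hit, {term, id})

  def insert(trie, word, term, id) do
    {h, rest} = String.split_at(word, 1)
    Map.update(trie, h, insert(%{}, rest, term, id), &insert(&1, rest, term, id))
  end

  # Walk down to the subtree for the prefix, then gather every
  # {term, id} stored underneath it.
  def search(trie, ""), do: collect(trie)

  def search(trie, prefix) do
    {h, rest} = String.split_at(prefix, 1)

    case Map.fetch(trie, h) do
      {:ok, child} -> search(child, rest)
      :error -> []
    end
  end

  defp collect(node) do
    Enum.flat_map(node, fn
      {:hit, pair} -> [pair]
      {_grapheme, child} -> collect(child)
    end)
  end
end

trie =
  %{}
  |> IdTrie.insert("elixir", "Elixir", 1)
  |> IdTrie.insert("erlang", "Erlang", 2)

IdTrie.search(trie, "e") |> Enum.sort()
# => [{"Elixir", 1}, {"Erlang", 2}]
```

The walk costs one map lookup per grapheme of the prefix; the collect step is proportional to the number of results, which matches the timing behaviour described above.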
Wait, you want to tell me that the time to find completions for a given prefix rises with the number of entries? Either you have implemented it wrong, or you are not using the prefixes as keys but something else and are just doing a linear scan.
If you use the prefix as the key, the number of elements should not matter, only the size of the key.
If you just do a linear scan, you can simply use a list…
The symbol TreeMap is even safe by design to encode into the beam file itself, by unquoting it into position, which makes it even faster if it is compile-time static!
You could of course just use my TreeMap directly too; it is not tied to the parser parts at all, though they do make it easy to parse things out safely and easily.
You can always encode them into a database as well.
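The compile-time idea can be illustrated without ExSpirit at all. Module attributes are evaluated at compile time, so a static word list can be baked into the beam file as a literal prefix map (the module and names below are my own, purely for illustration):

```elixir
defmodule Completions do
  # Assumption: the word list is known at compile time. The prefix
  # map is computed once during compilation and stored in the beam
  # file as a literal, so runtime lookups are just a Map.get.
  @words ["elixir", "erlang", "phoenix"]

  @by_prefix @words
             |> Enum.flat_map(fn w ->
               for n <- 1..String.length(w), do: {String.slice(w, 0, n), w}
             end)
             |> Enum.group_by(fn {p, _w} -> p end, fn {_p, w} -> w end)

  def complete(prefix), do: Map.get(@by_prefix, prefix, [])
end

Completions.complete("e")
# => ["elixir", "erlang"]
```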
`[unquote_splicing('abc') | _]` works if the prefix is known at compile time; otherwise there are other workarounds that are situation-dependent, if a full example is given. And manually building match specs is pretty easy too; `:ets.fun2ms` is entirely unneeded.
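To make the manual approach concrete for the dynamic case: a match pattern (as used by `:ets.match/2`) can be built at runtime as an improper list ending in `:_`, which gives exactly the `[97, 98, 99 | _]` shape asked for above. A sketch, reusing the table layout from the first post (the helper name is my own):

```elixir
defmodule PrefixPattern do
  # Turn a charlist prefix into an improper-list match pattern,
  # e.g. 'eii' becomes [101, 105, 105 | :_].
  def build([]), do: :_
  def build([h | t]), do: [h | build(t)]
end

table = :ets.new(:cache, [:set])
:ets.insert(table, {"k1", 'eiilrx', "Elixir"})
:ets.insert(table, {"k2", 'ehinopx', "Phoenix"})

# Match every row whose sorted charlist starts with the prefix:
results = :ets.match(table, {:_, PrefixPattern.build('eii'), :"$1"})
# results == [["Elixir"]]
```

Note that `:ets.match` still scans the whole table for patterns like this; it only avoids `fun2ms`, not the linear cost.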
> Wait, you want to tell me that the time to find completions for a given prefix rises with the number of entries? Either you have implemented it wrong, or you are not using the prefixes as keys but something else and are just doing a linear scan.
Yeah, precisely. If you want an actual trie, then try out my ExSpirit TreeMap as shown above; it is quite well tested and used.