@mischov: When parsing speed is not that important, code safety matters more than dependencies. True, it does not require compiling Rust NIF code, but now it requires compiling C NIF code instead.
Of course ANY rule has its own edge cases, but in scraping, generally nobody cares which NIF you are using unless the project has a specific requirement like code safety.
Dependencies are compiled rarely (compared to the main project), so I at least don't care about it - especially after finding the asdf-rust plugin for asdf, so I do not need to compile Rust from source.
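For reference, setting up a Rust toolchain through asdf looks roughly like this (a sketch; the asdf-rust plugin fetches official prebuilt toolchains, and exact version names may differ in your setup):

```shell
# Add the Rust plugin to asdf (uses prebuilt toolchains, no building Rust from source)
asdf plugin add rust

# Install and select a Rust toolchain
asdf install rust latest
asdf global rust latest

# Verify the compiler is available for NIF builds
rustc --version
```

After this, a project with a Rust NIF dependency can compile it in the background without any manual Rust setup.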
Anyway, as a developer, I prefer compiling Rust + a NIF and having a safe environment that won't confuse me in small 5-minute home projects over any faster parser (even a 100x faster one) in any other language. Developers always have their hands full, so one or two compilations in the background is nothing surprising, and they usually don't care about them unless those compilations eat up so many resources that they can't continue their work.
As I mentioned, this changes when a normal end user is using a given project. The user does not care what the developer uses; it should just work and be as fast as possible. So parser speed and project dependencies do not matter unless you are providing a solution for end users.
Consider that most Windows users will keep using that OS for years. There are lots of awesome projects and a lot of really hard work already done. We, as developers, understand their motivations and really appreciate their work and the skills they have trained. We could wait for the next releases, but… when everything reaches the end user, all of that suddenly stops mattering. When you are making a specific project for a client, you need to choose dependencies that match their needs. They don't care whether your project requires Rust or not - that is an advantage only for us, not for end users.
From that point of view, I can see that a really similar project - for example, a spreadsheet parser - could genuinely interest end users, because their documents are parsed faster and they have more time; that is especially important when working with far more than a few spreadsheet documents. I know lots of people who use spreadsheets every day, importing and exporting them to many apps. Here speed really matters, because it's not a home project - you have thousands of documents from hundreds of users, or even more. Here every additional second means exactly one lost second, because users depend on the result of that work before they can continue their own.
What I wanted to say is that starting with (again, only as an example) a fast spreadsheet parser could be a better idea, because it could be tested by a bigger number of interested people, and your project might even be tested in a production environment - that is a really big advantage. Once your skills have grown and you have received lots of support, another extra parser - even one used only in home projects - is both much more profitable and just a matter of time. Such a project is much easier even if you get no support for it, because you already have experience with a similar project that is used by more people.
Ah, by the way, we already talked about compiling Rust.
I have already used your Rust HTML parser and it works great. I have automatically parsed lots of small pages, and personally I don't feel that I need a faster parser. It's already fast, and I don't know of any scraping project - nor can I imagine any future private project of mine - that would require a faster parser, especially when that parser does not guarantee the same stability as yours.