PRQL vs/with Ecto (PRQL is a modern language for transforming data)

dimitarvp · January 29, 2023, 1:59pm

I am skeptical as well but you’re being needlessly harsh IMO.

It’s OK to just shrug it off as “not for me” and move on. Personally I’ve not seen any satisfactory elaboration from @kasvith but that can be due to a number of factors. It’s also true that most people struggle to express the selling points of something they hold dear to their heart.

Probably the best skill I have fought tooth and nail to acquire as a programmer is to become eloquent / expressive / articulate. Everything else you can pretty much get to 95% in maximum 10 years of experience. Being well- and clear-spoken however, I feel we can never truly master well.

So give the guy a break, he has a passion project but maybe can’t sell it well. If it takes off we’ll know.

In the meantime I’m in your skeptical camp. Plus, having in mind the literal hundreds of tools I had to learn and apply well during my career, I am not in a rush to be an early adopter of… almost anything really.

D4no0 · January 29, 2023, 2:11pm

Probably the best skill I have fought tooth and nail to acquire as a programmer is to become eloquent / expressive / articulate.

Very true. I have witnessed a few very good projects go down the drain because of this communication issue; it is very sad when you can’t find common ground and your arguments cannot be understood.

The thing that makes me afraid of these kinds of libraries is that someday I will wake up at a company where someone pushed this technology without asking anyone, and I will have to deal with it. It has happened countless times already when I had to deal with frontends in JS: some of them had to be thrown in the garbage after a few months of work only because they used bleeding-edge, hip libraries that were poorly designed, bug-ridden, or unmaintained.

LostKobrakai · January 29, 2023, 2:17pm

That shouldn’t be held against the creators or maintainers of such libraries, but rather against the people who decided in favor of using them, and even that might be unfair given the additional context that comes from judging in hindsight.

D4no0 · January 29, 2023, 2:58pm

The thing I want to point out is that this library seems to be misleading about what it delivers, and the arguments in this thread only consolidated my opinion. Such claims lead most people, who don’t bother to check whether it is a viable choice, to use the library without a second thought.

Yes, it is true that the engineers making the choice are fully responsible for such failures; however, only a few companies have the privilege of having top-class engineers capable of making the correct decision in context. The reality is that you could hire someone with 5+ years of experience and they would still fall for these kinds of traps.

kasvith · January 29, 2023, 7:45pm

PRQL’s objectives are clearly defined on the website, with examples.

I hate to repeat this, but it’s a tool still in active development.

I wrote the Elixir bindings for this project with the aim of bringing it to the Elixir world, in case someone liked PRQL and wanted to try it with Elixir (with Livebook, for example).

It’s an OSS project done by passionate people around the world to solve a problem they faced.
If it doesn’t align with your view, it’s safe to ignore it. It is not a threat.

I was sad to see the negative comments here for bringing a tool into the Elixir world.

That’s all from me.

hubertlepicki · January 29, 2023, 8:28pm

I looked into PRQL, and I really liked it. But for the same purpose we can use “WITH” queries directly.

The nice thing about PRQL is that you can create these logical chains easily, but you can do the same with “WITH” queries (see “Literate SQL using the WITH clause”), and it appears PRQL compiles to just that.

I actually don’t know if you can build such queries, and then chain them together with Ecto. It’d be super nice. Some of the reporting queries in my projects would definitely benefit, as it’s one of these things where you may not understand your own code in 6 months.
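As a minimal sketch of the chained “WITH” style described above, here is a query run against an in-memory SQLite database through Python's standard sqlite3 module. The orders table, its columns, and the CTE names (done_orders, per_customer) are made up for illustration; nothing here is PRQL output or Ecto API.

```python
import sqlite3

# Hypothetical sample data, only to make the query runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, status TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('ann', 'done', 40), ('ann', 'done', 70),
        ('bob', 'done', 20), ('bob', 'open', 90);
""")

# Each CTE is a named step that reads only from the step before it.
query = """
WITH done_orders AS (
    SELECT customer, amount FROM orders WHERE status = 'done'
),
per_customer AS (
    SELECT customer, SUM(amount) AS total FROM done_orders GROUP BY customer
)
SELECT customer, total FROM per_customer WHERE total > 50 ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each named step can be read and tested on its own, which is the property that helps when revisiting a reporting query months later.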

dmitriid · January 30, 2023, 1:36pm

Also, anything that hides away the windowing functions behind a sane syntax is a win.

hubertlepicki · January 30, 2023, 1:37pm

readable syntax

zachallaun · January 30, 2023, 2:22pm

Try not to get too down about it! Vocal minority and all. Keep it up.

max-sixty · February 2, 2023, 9:59am

I’m here a couple days late again.

Did you look at the website?

Check out the showcase — there are a dozen examples there. Or did you look but not think they’re simpler?

D4no0 · February 2, 2023, 3:18pm

Finally, something that should have been pointed out from the very start. I’ve looked at the page, but it seems I missed the showcase, thanks! Let’s do a nice, structured review by looking at some examples:

Friendly Syntax

PRQL

from order               # This is a comment
filter status == "done"
sort [-amount]           # sort order

SQL

SELECT
  order.*
FROM
  order
WHERE
  status = 'done'
ORDER BY
  amount DESC

I don’t see any improvement over the original SQL; filter is just a wrapper over the WHERE clause. On the other hand, the == operator comes from the development world, and -amount is a very peculiar way to dictate the order: if I didn’t know the SQL context of the sort operation, I would be inclined to think that this is an arithmetic operation.

Orthogonality

PRQL

from employees
# Filter before aggregations
filter start_date > @2021-01-01
group country (
  aggregate [max_salary = max salary]
)
# And filter after aggregations!
filter max_salary > 100_000

SQL

SELECT
  country,
  MAX(salary) AS max_salary
FROM
  employees
WHERE
  start_date > DATE '2021-01-01'
GROUP BY
  country
HAVING
  MAX(salary) > 100000

Which one is easier to understand in your opinion (especially if we talk about people who don’t have experience in development)? I would always go for the SQL statement, since it is clearly structured and communicates the intent clearly, while PRQL uses this group syntax, and it is very confusing what it is trying to achieve. The catch on the SQL side, of course, is that you need to know the precedence of WHERE and HAVING, but once you understand them you are good to go.

Joins

PRQL

from employees
join b=benefits [==employee_id]
join side:left p=positions [p.id==employees.employee_id]
select [employees.employee_id, p.role, b.vision_coverage]

SQL

SELECT
  employees.employee_id,
  p.role,
  b.vision_coverage
FROM
  employees
  JOIN benefits AS b ON employees.employee_id = b.employee_id
  LEFT JOIN positions AS p ON p.id = employees.employee_id

This one is the cherry on top of the cake. In what world is the PRQL join statement more readable than the SQL one? Once again it uses peculiar operators that you can only understand by reading the documentation.

max-sixty · February 2, 2023, 11:13pm

Great, thanks for checking those out, appreciate that we’re engaging on something concrete now.

I won’t do some point-by-point rebuttal — I agree Joins aren’t much clearer, and the sort syntax is new if you’re not familiar with R. Whether or not folks prefer = or == isn’t really the focus of PRQL, folks should make a call on whether that’s a big deal to them.

But take the orthogonality example:

Knowing the order of the SQL operations is not trivial (check out “A Beginner’s Guide to the True Order of SQL Operations”)! PRQL’s pipelines make it very clear how each transform changes each intermediate result, and this lets us have a single filter transform rather than WHERE & HAVING & QUALIFY.
…this makes more of a difference as the query size grows, and we start needing to abstract parts of the SQL out into CTEs or nested subqueries.
The group syntax allows for an expression inside a group to operate on groups as it would across the whole table outside of a group. Combined with an aggregate, we can know the shape of the result at parse-time; whereas that’s actually quite difficult in SQL — check out the “What’s going on with this aggregate function?” in our FAQ. So that lets / will let us type-check much earlier, rather than waiting for the DB to fail a query execution.
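To make the WHERE-versus-HAVING point concrete, here is a small sketch using Python's standard sqlite3 module. The sample rows and the CTE names (recent, by_country) are invented for illustration, and the second query is a hand-written staged version, not actual PRQL compiler output.

```python
import sqlite3

# Hypothetical sample data for the employees example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, country TEXT, start_date TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('a', 'US', '2022-03-01', 150000),
        ('b', 'US', '2020-05-01', 300000),
        ('c', 'DE', '2021-06-01',  90000),
        ('d', 'FR', '2023-01-15', 120000);
""")

# Classic form: WHERE runs before grouping, HAVING runs after it.
where_having = conn.execute("""
    SELECT country, MAX(salary) AS max_salary
    FROM employees
    WHERE start_date > '2021-01-01'
    GROUP BY country
    HAVING MAX(salary) > 100000
    ORDER BY country
""").fetchall()

# Staged form: every step only sees the previous step's result,
# so both filters are plain WHERE clauses.
staged = conn.execute("""
    WITH recent AS (
        SELECT * FROM employees WHERE start_date > '2021-01-01'
    ),
    by_country AS (
        SELECT country, MAX(salary) AS max_salary FROM recent GROUP BY country
    )
    SELECT country, max_salary FROM by_country
    WHERE max_salary > 100000
    ORDER BY country
""").fetchall()

print(where_having == staged, staged)
```

Employee 'b' has the highest US salary but started before 2021, so the reported US maximum is 150000 in both forms; that is exactly the kind of ordering detail the single-filter pipeline makes explicit.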

I’m sure there are bad parts of PRQL, we’re very open to feedback, and still making changes. That said, it’s unlikely it’s all bad, which someone might characterize as your view — 6K people have starred the repo, so unless we’re bamboozling them with our color scheme, there’s something that’s resonating with folks.

(Happy to discuss anything more specific, though not looking into getting into a back & forth on whether everything is bad)

stevensonmt · February 2, 2023, 11:36pm

Daniel C.:

PRQL

from order               # This is a comment
filter status == "done"
sort [-amount]           # sort order

SQL

SELECT
  order.*
FROM
  order
WHERE
  status = 'done'
ORDER BY
  amount DESC

As someone who does not know enough SQL to be dangerous I would say the PRQL syntax is easier to parse due to more similarity with other programming languages. Specifically filter replacing WHERE and sort replacing ORDER BY are more natural to me. Of course these kinds of preferences are entirely subjective, I’m just offering the perspective of someone who is not innately comfortable with SQL already.

hauleth · February 5, 2023, 5:29pm

I wonder how PRQL handles cases where you want a defined NULL ordering (SQL has NULLS LAST and NULLS FIRST), or how it handles ties in LIMITs.
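For readers unfamiliar with the NULL-ordering issue, here is a small sketch of the SQL behaviour in question, using Python's standard sqlite3 module. It says nothing about how PRQL handles it; the orders table is made up, and the `amount IS NULL` sort key is a portable stand-in for NULLS LAST (which SQLite itself only accepts from version 3.30). Ties in LIMIT are not covered.

```python
import sqlite3

# Hypothetical sample data with one NULL amount.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, amount INTEGER);
    INSERT INTO orders VALUES (1, 30), (2, NULL), (3, 10);
""")

# SQLite treats NULL as smaller than any other value,
# so a plain ascending sort puts the NULL row first.
default_order = [row[0] for row in conn.execute(
    "SELECT id FROM orders ORDER BY amount")]

# Sorting on "amount IS NULL" first (0 for values, 1 for NULL)
# pushes NULL rows to the end, emulating NULLS LAST.
nulls_last = [row[0] for row in conn.execute(
    "SELECT id FROM orders ORDER BY amount IS NULL, amount")]

print(default_order, nulls_last)
```

The default placement of NULLs differs between databases, which is why an explicit ordering matters for any language that compiles to several SQL dialects.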

D4no0 · February 6, 2023, 1:00am

Now this is the conversation we were all striving for, even though GitHub is no longer the open platform for folks, so the star-count argument doesn’t count (everybody knows this, but let’s not get into details).

The composability of queries seems to be the most important feature missing from SQL. As someone who has used Ecto for years, I would find it a huge inconvenience not to have composable queries when writing raw SQL.

In general, the thing that started this aggressive conversation is the fact that PRQL promises too much. In the Elixir world we try to minimize this marketing-driven approach; I guess this is why so many skilled people are attracted to this ecosystem, a truly open-source ecosystem. But in the case of PRQL, more than half of the claims are for marketing purposes only, which is disappointing, because software is much more than just a commercial product you make money from…

max-sixty · February 6, 2023, 4:05am

Without addressing the full diatribe, I’ll respectfully make one point — PRQL is strongly open-source. From the front page of the website, again:

PRQL will always be fully open-source and will never have a commercial product. … We’re a welcoming community for users, contributors, and other projects.

…notably we see open-source as both the license and the community…

dmitriid · February 6, 2023, 2:40pm

As others mentioned, what works in PRQL’s favor is that you basically go top to bottom, and every part just works on the result of the previous part. Basically pipes, and they are called pipes in PRQL documentation IIRC.
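A rough way to picture that top-to-bottom flow is a chain of plain Python functions, each taking the previous step's rows. This is only an analogy for the employees example quoted earlier in the thread: the helper names (filter_rows, max_salary_by_country) and the sample rows are invented, and this is not how PRQL is implemented.

```python
from datetime import date

# Hypothetical sample rows for the employees example.
employees = [
    {"country": "US", "start_date": date(2022, 3, 1), "salary": 150_000},
    {"country": "US", "start_date": date(2020, 5, 1), "salary": 300_000},
    {"country": "DE", "start_date": date(2021, 6, 1), "salary": 90_000},
    {"country": "FR", "start_date": date(2023, 1, 15), "salary": 120_000},
]

def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]

def max_salary_by_country(rows):
    """Group rows by country and keep the highest salary per group.

    Assumes salaries are non-negative (0 is used as the starting value).
    """
    best = {}
    for row in rows:
        best[row["country"]] = max(best.get(row["country"], 0), row["salary"])
    return [{"country": c, "max_salary": s} for c, s in best.items()]

# Each step works only on the result of the previous one.
step1 = filter_rows(employees, lambda r: r["start_date"] > date(2021, 1, 1))
step2 = max_salary_by_country(step1)
result = filter_rows(step2, lambda r: r["max_salary"] > 100_000)
print(result)
```

The same filter_rows function is used both before and after the aggregation, which mirrors having one filter transform instead of separate WHERE and HAVING clauses.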

SQL is often … weird. Especially when it comes to more advanced things like windowing functions.

In what world the PRQL join statement is more readable than the SQL one?

Funnily enough it doesn’t look dissimilar to Ecto. What I do dislike is [==employee_id] which is hard to parse.

AstonJ · February 6, 2023, 5:29pm

Hi all, let’s close this thread as everyone has had a chance to have their say and the original query “PRQL vs Ecto” has been answered: PRQL is not trying to compete with Ecto as it’s mainly a query language for analysts and data engineers (there’s no insert/update/delete).

When you get a moment @kasvith, maybe you could post a thread in the Libraries section detailing your bindings, perhaps linking to this thread and making the above obvious as well so there’s no further misunderstanding

Thanks everyone!

AstonJ · February 7, 2023, 8:00am

This topic was automatically closed after 14 hours. New replies are no longer allowed.