iex> query = from c in Category,
…> select: c.name
iex> Repo.all query
First, we create a query. In this case:
from is a macro that builds a query. c in Category means we’re pulling rows (labeled c) from the Category schema.
select: c.name means we’re going to return only the name field.
Although I am surprised mapping layers or even db engines aren’t intelligent enough to infer that a count on a table does not require loading entire row into memory…
The “*” in SELECT * FROM ... is expanded to all columns. Consequently, many people think that using count(*) is inefficient and should be written count(id) or count(1) instead.
But the “*” in count(*) is quite different, it just means “row” and is not expanded at all (actually, that is a “zero-argument aggregate”). Writing count(1) or count(id) are actually slower than count(*), because they have to test if the argument IS NULL or not (count, like most aggregates, ignores NULL arguments).
A note about count(1) vs count(*). One might think that count(1) would be faster because count(*) appears to consult the data for a whole row. However the opposite is true. The star symbol is meaningless here, unlike its use in SELECT *. PostgreSQL parses The expression count(*) as a special case taking no arguments. (Historically the expression ought to have been defined as count().) On the other hand count(1) takes an argument and PostgreSQL has to check at every row to see that ts argument, 1, is indeed still not NULL.
For the other two queries, SQL Server would use the following rule. To perform a query like SELECT COUNT(*), SQL Server will use the narrowest non-clustered index to count the rows. If the table does not have any non-clustered index, it will have to scan the table.
InnoDB processes SELECT COUNT(*) statements by traversing the smallest available secondary index unless an index or optimizer hint directs the optimizer to use a different index. If a secondary index is not present, InnoDB processes SELECT COUNT(*) statements by scanning the clustered index.
I’m def no db expert, just trying to understand the rationale behind count(u.id) rather than count(u)
Not really, why would a db engine load every column in a row to count number of rows in a table? I’d be surprised if count(*) was ever implemented this way