Is Ecto slower than Rails ActiveRecord?

pillaiindu · December 4, 2017, 7:47pm

This TechEmpower benchmark suggests that Ecto’s performance for multiple queries is half that of Rails ActiveRecord on Puma.

If this benchmark is correct, then how does Phoenix perform multiple times better than Rails?

OvermindDL1 · December 4, 2017, 7:54pm

It’s not; their benchmark code is poorly written.

It’s pretty well proven nowadays that TechEmpower is a joke. The lack of decent code across nearly all of their tests is just reckless and shows that they are not to be taken seriously.

acrolink · December 4, 2017, 7:56pm

My benchmarks showed that Phoenix is at least 6 times faster than Rails. I did not pay special attention to Ecto / ActiveRecord; I measured requests per second.

Yes, Elixir is much faster than Ruby. Elixir is compiled to bytecode before execution, whereas Ruby is interpreted at run time. But there are other factors which make Ruby slow by nature.

ryh · December 4, 2017, 9:38pm

If I had to guess, I’d say that it’s the JSON serialization that’s causing that.

I was worried that @michalmuskala had quit the Ecto project because there was a month of inactivity in his commits. It turns out he was working on a JSON lib. I wonder what would happen if Antidote were used instead of Poison.

jeremyjh · December 4, 2017, 11:04pm

Interpreted vs. compile-time bytecode is irrelevant: once Rails has finished booting, all the code has been compiled to the bytecode that runs in the Ruby VM, and that won’t happen again. Since 1.9, MRI Ruby has been based on the YARV architecture and implementation. This presentation is old but should shed some light on the internals if you are curious. Ruby is slow for lots of reasons, but being interpreted isn’t one of them.

benwilson512 · December 5, 2017, 2:48am

Tough crowd…

ryh · December 5, 2017, 8:02am

There’s obviously nothing wrong with being inactive for any period of time and I definitely appreciate the privilege of getting access to a quality codebase.

I felt sad thinking about the possibility he had quit the project. I look to Ecto as a source of inspiration and it would be a downer to see a major contributor leave.

josevalim · December 5, 2017, 11:50am

I would still say this kind of relationship is unhealthy for both sides. We definitely shouldn’t be sad if a contributor is leaving a project as long as they are enjoying themselves on whatever they are doing and the project is well-maintained.

michalmuskala · December 5, 2017, 3:03pm

While I’m flattered by the concern about me, Ecto is a strong and healthy project with many contributors and great maintainers. I don’t think there are any risks concerning the project even with some major contributors leaving. That said, I didn’t leave, I just needed some time off.

Coming back to the topic: it is possible there are some bottlenecks in Ecto, though the real-world experiences we have suggest otherwise. The TechEmpower benchmarks are problematic, and in general I think they are not very representative of a framework. I’m pretty sure the current performance of Phoenix could be improved with some time spent on tuning the implementation and turning some of the VM knobs.

Furthermore, when it comes to the future of Ecto performance, for 3.0 @fishcakez is doing great work on improving transaction handling, which has some potential to further improve the performance and simplify the implementation at the same time.
For overall Phoenix performance, the work of the Phoenix team and especially @Gazler on bringing Cowboy 2 and HTTP/2 to Phoenix is quite promising, as is the work of the OTP team on improving the scalability of the whole IO system of Erlang.
As you can see, even though the performance is, at least, satisfactory, there’s a lot of work going on at various levels to improve it even further.

Finally, if somebody is looking to contribute in the space of Elixir/Ecto/Phoenix performance, I think the biggest help for maintainers of libraries would be an easy way to run performance suites and benchmarks. Something similar to https://rubybench.org/ or http://perf.rust-lang.org/ would be an extremely powerful tool for the whole ecosystem. With https://spawnfest.github.io/ coming up, maybe that’s a good idea for a project?

ardhitama · December 24, 2017, 10:28am

According to this, even Go is slower than MRI. Maybe something is off there.

brightball · December 24, 2017, 4:22pm

One of the things I’ve wondered about regarding the benchmarks:

Does ActiveRecord hold a single database connection for every operation on the request, while Ecto checks one out and immediately hands it back on a per-query basis in case it’s needed elsewhere on the BEAM?

IMO that would make sense.
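
To make the two strategies in that question concrete, here is a toy sketch in Python. It does not claim to show how ActiveRecord or Ecto actually manage connections; `Pool`, `handle_request_holding` and `handle_request_per_query` are hypothetical names made up for illustration. It only counts how often a connection is checked out when a request holds one connection for all its queries versus checking one out per query.

```python
import queue
from contextlib import contextmanager


class Pool:
    """Toy connection pool (hypothetical); connections are just labels."""

    def __init__(self, size):
        self._free = queue.Queue()
        for n in range(size):
            self._free.put(f"conn-{n}")
        self.checkouts = 0  # how many times a connection was handed out

    @contextmanager
    def checkout(self):
        conn = self._free.get()  # blocks if every connection is in use
        self.checkouts += 1
        try:
            yield conn
        finally:
            self._free.put(conn)  # always hand the connection back


def handle_request_holding(pool, n_queries):
    # Strategy 1: check out once and keep the connection for the whole request.
    with pool.checkout() as conn:
        return [f"{conn}: query {i}" for i in range(n_queries)]


def handle_request_per_query(pool, n_queries):
    # Strategy 2: check out and return the connection around every query.
    results = []
    for i in range(n_queries):
        with pool.checkout() as conn:
            results.append(f"{conn}: query {i}")
    return results


holding_pool = Pool(size=2)
handle_request_holding(holding_pool, 20)

per_query_pool = Pool(size=2)
handle_request_per_query(per_query_pool, 20)

print(holding_pool.checkouts, per_query_pool.checkouts)
```

With 20 queries in a request, the first strategy pays the checkout cost once and the second pays it 20 times, but the second frees the connection for other work between queries. Whether that difference explains the benchmark numbers is exactly the open question.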

I’ve also taken some time to dive into the code for those benchmarks, and the implementations are all over the place. Most of the Go examples, for instance, are tuned to the level of presorting inserts for the database.

OvermindDL1 · December 27, 2017, 7:25am

TechEmpower is entirely debunked at this point; no one should ever take anything they say or test as truth.

NobbZ · December 27, 2017, 6:56pm

This is a known fact in the community of developers and programmers, but the people in companies who get to decide things often do so merely because of the numbers from TE or other benchmarks, without even knowing the difference between C++ and C or JavaScript and Java…

jeremyjh · January 4, 2018, 4:30am

It is known, in the Dothraki sense, but I think what TE is trying to do is along the right lines and shouldn’t be dismissed so easily.

Where is the data that is more exhaustive than TE’s, that is more balanced, that is available freely to the public, that has implementations with more contributions from capable framework contributors etc? Without that public data, nothing more than what TE tells us is truly known.

Competent and well-funded teams compile their own data about their own application in their own environments which is much more useful, but if you are evaluating comparable frameworks that you don’t use today, what can you look at?

gon782 · January 4, 2018, 6:18am

When it comes to a benchmark, all that is secondary (by a massive margin) to the integrity of the code running in the benchmark. I can provide free helicopter rides to the public and that’s cool and all, but if I’m really just putting people in a helicopter that I just keep on the ground, while making helicopter noises, I think most people would recognize it wasn’t exactly the actual experience of a helicopter ride.

jeremyjh · January 4, 2018, 11:34pm

Everything I described impacts code quality and the integrity of the results that are reported from the code. The question can be rephrased if you like: where is the better benchmark code?

cmkarlsson · January 5, 2018, 12:18am

It is internal to your application. Every benchmark must be done with your specific requirements in mind. Are there any better benchmarks out there for everyone to see and compare? No, not that I know of. But the TE benchmarks are less than ideal. I think they have the right idea in trying to measure more than plainly reading and writing to a socket, which is what a lot of HTTP benchmarks do. They add some computation and database IO in the background. I think they fail on how the tests are executed and benchmarked.

They also state themselves that the benchmarks cannot be used to compare different frameworks and/or technologies and that they only show the “maximum” a framework can reach with the wind at its back, a slight downhill slope and lots of luck.

They measure and take the best throughput run out of a number of tries. They disregard maximum and/or 99th-percentile latency, and the throughput they report is the maximum. They also fall into the “coordinated omission” problem, which is a real problem if the client side is open to the internet, as opposed to an internal API between two closed servers (where you can limit connections and have backpressure).
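
As a small illustration of why the maximum and 99th-percentile figures matter, here is a Python sketch with made-up latency samples (the numbers are hypothetical, not taken from any TE run). It uses a simple nearest-rank percentile and does not attempt to model coordinated omission.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    return {
        "mean": sum(latencies_ms) / len(latencies_ms),
        "p50": percentile(latencies_ms, 50),
        "p99": percentile(latencies_ms, 99),
        "max": max(latencies_ms),
    }


# Hypothetical run: 98 requests at 10 ms and 2 requests that stall for 2 s.
latencies = [10.0] * 98 + [2000.0] * 2
summary = summarize(latencies)
print(summary)
```

In this made-up run the median is 10 ms and the mean is just under 50 ms, while the 99th percentile and the maximum are both 2 seconds. A report that only keeps throughput or averages would never show those two stalled requests.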

I’ve done a fair bit of benchmarking for our internal application in Erlang, Go and Java (and Python, but that was discarded quickly). The numbers in the TE benchmarks don’t stack up in our scenario and are often misleading, especially as we are very concerned with maximum and high-percentile latency.

We are doing heaps of crypto, calls out to external HTTP servers and some database IO (but most of it is cached). In our tests Go is the fastest (but only by 10-20%), then Erlang and finally Java. On the other hand, Erlang is the most stable and gives the most even latency. Even under 95% CPU load we still manage to keep the maximum latency within reasonable numbers (average 15 ms, max 200 ms).

Go starts behaving worse with latency under those circumstances, and Java goes off at very low load (i.e. some requests take seconds!).

According to the TE benchmarks, Go and Java should completely outshine Erlang, but for our application it is much closer, and the throughput makes so little difference that other factors (stability and fault tolerance) are more important.

Trying to create a better benchmark is obviously possible, but to make it more accurate the tests would need to run for a longer time, and I’d think that would make it economically unfeasible for such a large number of frameworks.

axelson · January 5, 2018, 12:39am

That really doesn’t make any sense to me. Who cares about the max if 99% of the time you’re not getting that speed?

cmkarlsson · January 5, 2018, 12:49am

Some applications require high throughput and can live with high worst-case latency, for example any race which requires you to be first. If you lose the auction 1 time out of 100, it doesn’t matter whether that request took 0.5 ms or 150 seconds.

Most standard web applications should, however, pay lots of respect to max latency.

mgwidmann · January 5, 2018, 1:53am

Here’s the Elixir implementation vs the Ruby vs the Go. They all seem to be looping and finding a random database record, so in that respect it’s fair. However, I don’t believe this to be any kind of real-world example, as this would likely only ever happen if you had an N+1 type bug in your code. I can bet that if that code instead generated a list of ids to fetch once, you’d see an inversion of these results. I don’t believe Ecto is optimized for this kind of usage, so it’s not surprising, given how message passing works in the Erlang VM, that this is slower.

github.com

TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Elixir/phoenix/web/controllers/page_controller.ex#L38


      x when x < 1    -> 1
      x when x > 500  -> 500
      x               -> x
    end
  rescue
    ArgumentError -> 1
  end


  conn
  |> put_resp_content_type("application/json", nil)
  |> send_resp(200, Jason.encode_to_iodata!(for _ <- 1..q, do: Repo.get(World, :rand.uniform(10000))))
end


def fortunes(conn, _params) do
  additional_fortune = %Fortune{
    id: 0,
    message: "Additional fortune added at request time."
  }


  fortunes = [additional_fortune | Repo.all(Fortune)]

github.com

TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Ruby/rails/app/controllers/hello_world_controller.rb#L24


end


def query
  queries = params[:queries].to_i
  queries = 1 if queries < 1
  queries = 500 if queries > 500


  results = (1..queries).map do
    World.find(Random.rand(1..10000))
  end


  render json: results
end


def fortune
  @fortunes = Fortune.all.to_a
  @fortunes << Fortune.new(id: 0, message: 'Additional fortune added at request time.')
  @fortunes = @fortunes.sort_by(&:message)
end


def update

github.com

TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Go/gin/hello.go#L87


	c.JSON(200, &world)
}


/// Test 3: Multiple database queries
func dbs(c *gin.Context) {
	numQueries := parseQueries(c)


	worlds := make([]World, numQueries)
	for i := 0; i < numQueries; i++ {
		err := worldStatement.QueryRow(rand.Intn(worldRowCount)+1).Scan(&worlds[i].Id, &worlds[i].RandomNumber)
		if err != nil {
			c.AbortWithError(500, err)
			return
		}
	}
	c.JSON(200, &worlds)
}


/// Test 4: Fortunes
func fortunes(c *gin.Context) {
	rows, err := fortuneStatement.Query()
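
For readers who want to see the difference between the looped lookups quoted above and the "generate a list of ids and fetch once" idea from this post, here is a rough Python sketch against an in-memory SQLite database. Everything in it is an assumption for illustration: the `world` table layout is a guess based on the excerpts, the function names are hypothetical, and in-memory SQLite has no network round trip, so it shows the shape of the two approaches rather than their relative speed. Note also that batching changes what the benchmark measures.

```python
import random
import sqlite3


def make_db(rows=10000):
    """Create an in-memory stand-in for the benchmark's World table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE world (id INTEGER PRIMARY KEY, randomnumber INTEGER)")
    conn.executemany(
        "INSERT INTO world (id, randomnumber) VALUES (?, ?)",
        ((i, random.randint(1, 10000)) for i in range(1, rows + 1)),
    )
    return conn


def fetch_one_by_one(conn, ids):
    # One statement per id, like the loops in the quoted implementations.
    return [
        conn.execute(
            "SELECT id, randomnumber FROM world WHERE id = ?", (i,)
        ).fetchone()
        for i in ids
    ]


def fetch_batched(conn, ids):
    # One statement for all ids, then restore the requested order
    # (including duplicates, since random ids can repeat).
    unique_ids = list(set(ids))
    placeholders = ",".join("?" * len(unique_ids))
    rows = conn.execute(
        f"SELECT id, randomnumber FROM world WHERE id IN ({placeholders})",
        unique_ids,
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    return [by_id[i] for i in ids]


conn = make_db(rows=100)
ids = [random.randint(1, 100) for _ in range(20)]
looped = fetch_one_by_one(conn, ids)
batched = fetch_batched(conn, ids)
```

Both functions return the same rows in the same order; the first issues 20 statements and the second issues one. Against a real database over a network, that difference in round trips is what the post expects to change the ranking.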