Huge Google search algo leak that involves their Elixir open-source repo

Some SEO details here: An Anonymous Source Shared Thousands of Leaked Google Search API Documents with Me; Everyone in SEO Should See Them - SparkToro

Elixir code involved here: feat: Automated regeneration of ContentWarehouse client (#11378) · googleapis/elixir-google-api@078b497 · GitHub

Fireship channel made a video about it here:

Anyway, that (unfortunately) doesn’t mean Google uses Elixir for this stuff: the code looks to have been generated by some bot and added to their open-source Elixir API client by mistake.

3 Likes

The code looks generated all right, most probably from a spec.

Yes, I doubt they would leave the source code for their engine anything less than tightly secured. I guess we will see in the near future whether this is a fluke or real, as people will definitely try to abuse it.

2 Likes

So the leaked code shared with the author of the sparktoro.com article is for nonexistent modules that must have been generated from an internal API document and then pushed to the repo and hexdocs?

If true, I assume the hex maintainers would want to keep things private, or may already have purged them from the servers.

Or maybe it’s just local ExDoc files that never got published.

2 Likes

Yep, the code generation appears to happen here.

3 Likes

The comment at the top of that mentions a “Google Discovery document”, which TIL is a thing. Similar vibe to OpenAPI / Swagger / etc at a glance.

I read through a handful of the generated files, but all I found were Poison defimpls and comments generated from the document.
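To illustrate why spec-driven output looks like that, here is a minimal sketch in Python of how a generator might turn one schema entry into an Elixir model stub. The schema below is hand-written and hypothetical (`ExampleSignal`, `score`, `label` and the `Example` namespace are made up for illustration), it only mimics the general `schemas` / `properties` shape of a Discovery document, and the emitted Elixir is a loose imitation rather than the real generator's output.

```python
import json

# Hypothetical fragment in the general shape of a Discovery document.
# None of these names come from the actual ContentWarehouse spec.
DISCOVERY_FRAGMENT = json.loads("""
{
  "schemas": {
    "ExampleSignal": {
      "id": "ExampleSignal",
      "type": "object",
      "description": "A made-up schema used only for this sketch.",
      "properties": {
        "score": {"type": "number", "description": "Made-up numeric field."},
        "label": {"type": "string", "description": "Made-up text field."}
      }
    }
  }
}
""")


def render_model(namespace, schema):
    """Render one schema as Elixir source text: docs, fields, encoder stub."""
    module = f"{namespace}.Model.{schema['id']}"
    props = sorted(schema.get("properties", {}).items())

    lines = [
        f"defmodule {module} do",
        '  @moduledoc """',
        f"  {schema.get('description', '')}",
        "",
        "  ## Attributes",
        "",
    ]
    # The per-field documentation is copied straight from the spec.
    for name, spec in props:
        lines.append(
            f"  *   `{name}` (*type:* `{spec.get('type', 'any')}`) - "
            f"{spec.get('description', '')}"
        )
    lines += ['  """', ""]
    # One field declaration per property; no behaviour, just data shape.
    for name, _spec in props:
        lines.append(f"  field(:{name})")
    lines += [
        "end",
        "",
        # Placeholder encoder body: the real generated code may differ.
        f"defimpl Poison.Encoder, for: {module} do",
        "  def encode(value, options) do",
        "    # delegate to a shared encoder (assumed)",
        "  end",
        "end",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    for schema in DISCOVERY_FRAGMENT["schemas"].values():
        print(render_model("Example", schema))
```

Everything in the output is a field name, a type, or a description taken from the spec; no ranking logic is involved at any point.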

Calling this an “algo leak” seems massively overblown.

2 Likes

Would it surprise any of us that Google use Elixir? :upside_down_face:

Just as Erlang has been one of tech’s best-kept secrets, I am convinced Elixir is too. I would happily bet that Apple, Google and Microsoft all use Erlang or Elixir (we already know that Apple does), and I think it’s safe to say that for the foreseeable future it will remain the secret sauce of many companies - they’re not going to give away their competitive advantage if they can help it :lol:

Re the leaks, I have not looked at them in any great detail but I think many of us have suspected for a long time that they manually/artificially push/throttle results - whether that’s because of advertising partnerships or them just helping their mates. Not many people buy their canned response of “that’s what users want”. As someone who has been interested in SEO for 2 decades, I certainly don’t!

4 Likes

Looking at the sorry state of the technology we have for software development nowadays, I would go as far as saying that there is a strong interest in keeping software development out of reach for many people, as unreliable and obscure as ever.

There is no other way to explain the fact that even to this day 95% of programming languages don’t know how to deal with concurrency, a problem Erlang solved 30 years ago for the general use case, and this is just one of the many aspects of making development accessible to everyone.

2 Likes

It’s huge in that it revealed hundreds of factors, metrics, and signals taken into account to rank a page, the data they collect, and how they fight spam. In general, it provided a lot of information about the inner workings of the algorithm (some insights here: Google Algo Leak). This is a goldmine for SEO specialists and Google’s competition. Equally important, the leak revealed that they lied about a lot of things they have been actively denying for years.

2 Likes