Workspace - A set of tools for working with Elixir monorepos

pnezis · May 15, 2024, 7:23am

edit: it’s a powershell thing I guess. Works in cmd

Strange, I will need to find a windows machine to debug

cmo · May 15, 2024, 7:28am

mix workspace.run -t deps.update --% --all

kudos to the powershell developers for making the behaviour different to the other shells

With the common deps folder, I was thinking something that outlines what you need to be aware of/do if you choose to go down that path, e.g. the checks in workspace.exs to ensure everything is the same version.

mindreader · May 20, 2024, 11:16am

This could solve a lot of problems for us. A lot of our code ended up in one large project, and smaller projects tend to bitrot.

One thing I’d like to see is if we could set up validation that one message or datatype emitted by one project is compatible with another project to prevent breakages between projects that communicate with each other. Sort of a test that spans multiple projects.

That may be beyond the scope of this project.

pnezis · May 20, 2024, 11:40am

One thing I’d like to see is if we could set up validation that one message or datatype emitted by one project is compatible with another project to prevent breakages between projects that communicate with each other. Sort of a test that spans multiple projects.

This should be handled by your tests. If you have configured your CI properly then when a project changes, all projects depending on it will also be tested. So if a breaking change is introduced and the proper tests are in place then the CI suite will break.

pnezis · May 21, 2024, 5:00pm

@mindreframer This is a sample workspace project with some boundaries rules enabled (all projects are mix hello world projects)

mindreframer · May 23, 2024, 8:10am

Awesome, this looks like a great example! Also great to see a working folder structure with apps / packages folders. Yeah, I think this is an absolutely needed alternative to umbrellas vs huge monolithic Phoenix apps.

Would it be possible to know on what principles you structure your app with hundreds of packages?
Also how do you manage the workload between team members / teams, so that friction and code conflicts are reduced?

I don't often see a large, well-structured Elixir app in the wild, hence my questions

pnezis · May 23, 2024, 9:00am

Yeah, I think this is an absolutely needed alternative to umbrellas vs huge monolithic Phoenix apps.

Our main app is still a huge monolithic Phoenix app, the only difference is that instead of having all code under the app itself, we have moved independent pieces of code (e.g. third party api wrappers, utilities, broadway producers, etc.) to reusable packages that we add to the app as path dependencies. The main reason for this was the CI execution time and developer ergonomics. When for example you work on a broadway pipeline there is no need for the CI to run all API integration tests on your PRs since they are independent.

Using a poly-repo solution does not work for big enterprise systems. Managing dependencies across repos is a nightmare and there is no guarantee that a change on a package will not break something on apps/packages that depend on it.

We have managed to drastically reduce CI execution time by moving to a workspace. Also, given you have a well tested codebase, every change on every package will trigger all needed CI steps on all packages depending on it, ensuring that no breaking change is introduced. Of course some discipline is needed, e.g. when you bump a dependency you need to bump it on all projects using it, but this can be enforced with workspace checks.
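
To make the "bump it on all projects" rule concrete, here is a minimal sketch of what such a consistency check boils down to. It is not Workspace's own check implementation: the module name, the function name, and the input shape (a map of project name to dependency requirements) are all hypothetical and only illustrate the idea.

```elixir
defmodule DepsVersionSketch do
  # Hypothetical helper, not part of Workspace.
  # `projects` maps a project name to a map of dependency => version requirement.
  # Returns only the dependencies that are declared with more than one requirement.
  def mismatches(projects) do
    projects
    |> Enum.flat_map(fn {project, deps} ->
      Enum.map(deps, fn {dep, requirement} -> {dep, {project, requirement}} end)
    end)
    |> Enum.group_by(fn {dep, _usage} -> dep end, fn {_dep, usage} -> usage end)
    |> Enum.filter(fn {_dep, usages} ->
      distinct = usages |> Enum.map(fn {_project, req} -> req end) |> Enum.uniq()
      length(distinct) > 1
    end)
    |> Map.new()
  end
end

projects = %{
  app: %{jason: "~> 1.4", plug: "~> 1.15"},
  package_a: %{jason: "~> 1.2"},
  package_b: %{plug: "~> 1.15"}
}

# Only :jason is reported, because :plug uses the same requirement everywhere.
IO.inspect(DepsVersionSketch.mismatches(projects))
```

A CI step could fail the build whenever the returned map is non-empty.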

My advice would be to start with a monolithic Phoenix app or an umbrella and consider moving to a mono-repo/workspace only if the CI starts becoming a bottleneck.

Would it be possible to know on what principles you structure your app with hundreds of packages?

The structure is very similar to the demo app. The only difference is that under packages we group packages based on the domain / scope:

packages
├── domain_1
│   ├── package_a
│   └── package_b
├── domain_2
│   └── package_c
└── shared
    ├── shared_package_1
    └── shared_package_2

You can use any folder structure that works for you, though. Since everything is a mix project, changing the structure to accommodate your needs is trivial: you only need to mv packages to the new folder and update the path dependencies (which can be automated).
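
For reference, a path dependency between two packages in the tree above could look roughly like this. The file location, module name, and version numbers are assumptions made for the sake of the example; only the `path:` option of a mix dependency is the standard part.

```elixir
# Hypothetical file: packages/domain_1/package_a/mix.exs
defmodule PackageA.MixProject do
  use Mix.Project

  def project do
    [
      app: :package_a,
      version: "0.1.0",
      elixir: "~> 1.14",
      deps: deps()
    ]
  end

  defp deps do
    [
      # The path is relative to this mix.exs, so it is the only thing that
      # has to change when either package is moved to another folder.
      {:shared_package_1, path: "../../shared/shared_package_1"}
    ]
  end
end
```

Moving package_a one level deeper would only require adding one more `..` segment to that path.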

Also how do you manage the workload between team members / teams, so that friction and code conflicts are reduced?

Code conflicts/friction is exactly the same as it would be if it was an umbrella app, a huge monolith or a polyrepo. If two developers need to modify the same file they would modify it no matter where it is located. This should happen rarely though if the codebase is decoupled.

To help with daily operations, we require every package to have at least one valid maintainer who is a team member (this is enforced by workspace checks). This works like CODEOWNERS on a package level, so we know who should review every change. It also helps with handovers.
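
A rough sketch of the idea behind such a maintainer rule, under the assumption that each package declares a list of maintainer names somewhere. This is not the actual Workspace check; the module, function, and data shapes below are hypothetical.

```elixir
defmodule MaintainersSketch do
  # Hypothetical helper, not part of Workspace.
  # `projects` maps a project name to its declared maintainers.
  # Returns the projects that have no maintainer among the current team members.
  def missing_maintainer(projects, team_members) do
    team = MapSet.new(team_members)

    for {project, maintainers} <- projects,
        not Enum.any?(maintainers, &MapSet.member?(team, &1)),
        do: project
  end
end

projects = %{package_a: ["alice"], package_b: [], package_c: ["zoe"]}

# package_b has no maintainer and package_c's maintainer is not on the team.
IO.inspect(MaintainersSketch.missing_maintainer(projects, ["alice", "bob"]))
```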

I don't often see a large, well-structured Elixir app in the wild, hence my questions

Your questions were perceptive and straight to the point

mindreframer · May 23, 2024, 10:09am

Wow, that was quick! Thanks for the elaborate response!

That is exactly why I’m asking and also am very excited about this package! In umbrellas or normal Phoenix apps the CI execution (compilation + tests) time grows in linear fashion and becomes unbearable rather quickly. The usual solution is test parallelisation, and it works OK, though it’s wasteful to run all the tests all the time, even when the change was maybe just a readme adjustment. It’s just crazy.

By having a proper DAG between internal packages + nested hierarchy (as alternative to the flat folder structure in umbrellas) + tags and scopes, one can properly decide which tests should be executed and save CI execution time drastically

Also having domain-related packages grouped in a single folder is a great way to communicate intent and reduce the cognitive load of understanding how all the things are related.

I quite like this 2 level nesting. A flat packages folder is still prone to unbounded growth, keeping them in domain folders makes it so much nicer.

That was exactly my suspicion! Otherwise it would be too chaotic. Nice to have it confirmed.

Thanks a lot, I feel you have invested a ton of time in making a flexible, yet very structured and not over-engineered solution available for the Elixir community! Looking forward to some opensource projects adopting it.

Have a great day,
Roman

pnezis · May 23, 2024, 10:48am

You will still need parallelisation for big projects, this time on the package level. workspace.run supports partitioned runs similarly to mix test. This has also the benefit that you can partition all time consuming CI steps, not only tests. If needed you can also have partitioned tests for big packages like now.
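
To illustrate what partitioning on the package level means, here is a small sketch that splits a list of projects across CI workers. It is only one possible scheme (round-robin over the sorted project names) and is not taken from the workspace.run implementation; the module and function names are hypothetical.

```elixir
defmodule PartitionSketch do
  # Hypothetical helper, not part of Workspace.
  # Returns the projects that partition `index` (1-based) out of `total`
  # partitions should handle, using round-robin over the sorted names.
  def slice(projects, index, total)
      when is_integer(index) and is_integer(total) and index >= 1 and index <= total do
    projects
    |> Enum.sort()
    |> Enum.with_index()
    |> Enum.filter(fn {_project, i} -> rem(i, total) == index - 1 end)
    |> Enum.map(fn {project, _i} -> project end)
  end
end

projects = [:package_d, :package_a, :package_c, :package_b, :package_e]

IO.inspect(PartitionSketch.slice(projects, 1, 2))
IO.inspect(PartitionSketch.slice(projects, 2, 2))
```

Sorting first keeps the assignment stable across CI workers, so every project lands in exactly one partition.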

one can properly decide which tests should be executed

Totally agree, want to re-iterate that it is not only for tests and that the applicable packages are automatically picked by the workspace.run flags. For example:

# given that you use a common dependencies folder, you need to fetch and
# cache the external dependencies in your CI only from the root projects
mix workspace.run -t deps.get --only-roots

# format checks need to run only on modified projects
mix workspace.run -t format --modified -- --check-formatted

# tests need to run on all projects affected by the changes
mix workspace.run -t test --affected
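
The difference between modified and affected can be sketched as a walk over the dependency graph: a project is affected if it was modified itself or depends, directly or transitively, on a modified project. The code below is only an illustration of that idea with hypothetical names and a hand-written graph; it is not how workspace.run is implemented.

```elixir
defmodule AffectedSketch do
  # Hypothetical helper, not part of Workspace.
  # `deps` maps each project to the workspace projects it depends on.
  # Returns the modified projects plus everything that depends on them.
  def affected(deps, modified) do
    expand(deps, MapSet.new(modified))
  end

  # Keep adding dependents until the set stops growing.
  defp expand(deps, affected) do
    next =
      for {project, project_deps} <- deps,
          Enum.any?(project_deps, &MapSet.member?(affected, &1)),
          into: affected,
          do: project

    if MapSet.equal?(next, affected), do: affected, else: expand(deps, next)
  end
end

deps = %{
  app: [:package_a, :package_c],
  package_a: [:shared_package_1],
  package_b: [:shared_package_1],
  package_c: [],
  shared_package_1: []
}

# A change in the shared package affects everything built on top of it.
deps |> AffectedSketch.affected([:shared_package_1]) |> Enum.sort() |> IO.inspect()
```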

Thanks a lot for the feedback!

heathen · June 28, 2024, 8:43am

I’ve created an issue for vscode-elixir-ls and sent a PR to address this problem, exactly because I ran into it when I decided to try Workspace. Hopefully, it will be approved, and then you just need to set the elixirLS.useCurrentRootFolderAsProjectDir flag and voila - it won’t try to go up to find the outermost mix.exs.

heathen · June 28, 2024, 8:46am

@pnezis thanks a lot for the project, it looks really promising and interesting!

May I ask, how do you deal with configs for each of these parts and for the bigger chunks up to the root apps as well? To be able to test independently, for example?

Do you have a central config for everything (like in umbrella) or keep them separately?

pnezis · June 28, 2024, 9:10am

May I ask, how do you deal with configs for each of these parts and for the bigger chunks up to the root apps as well? To be able to test independently, for example?

Hi @heathen, in our case we have a couple of independent apps, each of which has its own config. The other packages are treated as libraries, with no associated config. If your apps have common config you could always have some shared config at the root folder (or anywhere else you want) and import it in the apps' configs.

Regarding tests, I prefer to test everything in isolation (so each package has its own unit tests with mocks where needed) and also have an extensive integration/e2e test suite that tests everything together. Given the dependency graph, it is guaranteed that any change on any package will trigger the tests on all affected parent packages.

Let me know how it goes and feel free to open an issue if you face any problem.

heathen · June 28, 2024, 9:49am

Thank you for the answer. I just wonder, sometimes application contexts require their own settings, so I’m thinking about how best to deal with that: have a central config that holds all settings for all apps and their parts, or keep context-specific configs in the respective package directories and collect (import) them from a root app config or from the central config.

pnezis · June 28, 2024, 10:05am

This depends on your use case and what would work better for you. Personally I prefer some duplication over magically importing configs. If this does not scale I may consider some shared configs.

Another option (which we use extensively) is to have your configs as normal Elixir modules in your packages, which you can then use in the apps' runtime configs. This has the benefit that you make your configs reusable and you can also unit test them.
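
As a minimal sketch of the configs-as-modules idea: the package exposes a plain function that builds its settings from an environment map, and the app calls it from its runtime config. The module name, the environment variable names, and the defaults are all made up for this example.

```elixir
defmodule PackageA.Config do
  # Hypothetical config module living inside a package.
  # Takes the environment as a plain map so it can be unit tested
  # without touching the real system environment.
  def client_config(env) when is_map(env) do
    [
      base_url: Map.get(env, "PACKAGE_A_BASE_URL", "http://localhost:4000"),
      timeout: env |> Map.get("PACKAGE_A_TIMEOUT_MS", "5000") |> String.to_integer()
    ]
  end
end

# In the app's config/runtime.exs this could be used roughly as follows
# (PackageA.Client is a hypothetical config key):
#
#   import Config
#   config :package_a, PackageA.Client, PackageA.Config.client_config(System.get_env())

IO.inspect(PackageA.Config.client_config(%{"PACKAGE_A_TIMEOUT_MS" => "250"}))
```

Because the function is pure, a regular ExUnit test in the package can cover both the defaults and the overrides.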

byu · July 9, 2024, 2:42pm

Found interesting behavior regarding apps directory, which I’m not reporting as a bug because I don’t think it’s workspace’s issue; but noting here for people who are figuring out their own monorepo directory structure.

I was playing around with the directory structure where I created a blank workspace, then a “domains” path for domain bounded context applications.

mix workspace.new my_workspace
cd my_workspace
mix deps.get && mix compile

mkdir domains
cd domains
mix new --sup mytestdomain1
mix phx.new mytestdomain2

cd ..
mkdir apps
cd domains

mix phx.new mytestdomain3
mix new --sup mytestdomain4

The workspace before mkdir apps

my_workspace % ls
README.md	_build		deps		domains		mix.exs		workspace.lock

The workspace after apps and mix creating the new phoenix project

my_workspace % ls
README.md	apps		deps		mix.exs		workspace.lock
_build		config		domains		mix.lock

Highlighting:

That the mytestdomain2 project config is within the mytestdomain2 path; but after just making the apps directory, mytestdomain3 has config references (in its mix.exs) pointing to a top level config path.

Expansion of details:

mytestdomain2 % ls
README.md	assets		deps		mix.exs		priv
_build		config		lib		mix.lock	test
mytestdomain2 % cat mix.exs 
defmodule Mytestdomain2.MixProject do
  use Mix.Project

  def project do
    [
      app: :mytestdomain2,
      version: "0.1.0",
      elixir: "~> 1.14",
      elixirc_paths: elixirc_paths(Mix.env()),
      start_permanent: Mix.env() == :prod,
      aliases: aliases(),
      deps: deps()
    ]
  end

... OMITTED ...

and

mytestdomain3 % ls
README.md	assets		lib		mix.exs		priv		test
mytestdomain3 % cat mix.exs 
defmodule Mytestdomain3.MixProject do
  use Mix.Project

  def project do
    [
      app: :mytestdomain3,
      version: "0.1.0",
      build_path: "../../_build",
      config_path: "../../config/config.exs",
      deps_path: "../../deps",
      lockfile: "../../mix.lock",
      elixir: "~> 1.14",
      elixirc_paths: elixirc_paths(Mix.env()),
      start_permanent: Mix.env() == :prod,
      aliases: aliases(),
      deps: deps()
    ]
  end

... OMITTED ...

Concluding:

The phoenix project generator switches logic to generate an umbrella based phoenix project if it detects the existence of a top-level (in monorepo) apps directory.

This stands to reason, because the apps path is significant to Elixir umbrella projects.

Regular mix new projects don’t have configs so they look as expected.

pnezis · July 9, 2024, 3:07pm

You are right, I have noticed it as well but forgot to mention it in the docs. What I do as a workaround is to run mix phx.new outside of apps and then just mv the project into it.

The phx.new task only checks whether there is a parent apps folder, and if one exists the project is treated as part of an umbrella. Maybe a --no-umbrella option would help.
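
The kind of check that would produce the behaviour reported above can be sketched like this. It is an assumption based on what was observed in this thread (a project created two levels below a folder that contains both a mix.exs and an apps directory gets umbrella-style paths), not a copy of the generator's code, and the module and function names are hypothetical.

```elixir
defmodule UmbrellaDetectionSketch do
  # Hypothetical helper, not the phx.new implementation.
  # Treats `project_path` as an umbrella child when the directory two levels
  # up contains both a mix.exs file and an apps folder.
  def looks_like_umbrella_child?(project_path) do
    root = Path.expand(Path.join([project_path, "..", ".."]))

    File.exists?(Path.join(root, "mix.exs")) and File.dir?(Path.join(root, "apps"))
  end
end

# Reproduce the layout from the report in a temporary directory.
root = Path.join(System.tmp_dir!(), "umbrella_sketch_#{System.unique_integer([:positive])}")
File.mkdir_p!(Path.join(root, "domains"))
File.write!(Path.join(root, "mix.exs"), "")
project = Path.join([root, "domains", "mytestdomain3"])

IO.inspect(UmbrellaDetectionSketch.looks_like_umbrella_child?(project))
File.mkdir_p!(Path.join(root, "apps"))
IO.inspect(UmbrellaDetectionSketch.looks_like_umbrella_child?(project))
File.rm_rf!(root)
```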

gaggle · September 27, 2024, 8:09am

This looks like a fantastic addition, thanks for making it available!

@pnezis is there anything to add on monorepo vs. “Poncho projects”? In my understanding Poncho projects have always been an Elixir monorepo setup, do you have insights/observations on this terminology?

pnezis · September 28, 2024, 11:29am

is there anything to add on monorepo vs. “Poncho projects”? In my understanding Poncho projects have always been an Elixir monorepo setup, do you have insights/observations on this terminology?

poncho projects are monorepos. workspace just offers a set of tools for working with (big) monorepos efficiently.