Complete environment setup for Livebook on Gentoo and Gentoo-based distributions

Is anyone here using Livebook on Gentoo or a Gentoo-based distribution? I would like to install all of its dependencies, including those for smart cells.

Hardware (AMD-based)
  1. Motherboard: Gigabyte AMD Socket AM5, X670E AORUS MASTER
  2. CPU: AMD CPU Desktop Ryzen 9 16C/32T 7950X3D
  3. GPU: Sapphire PULSE AMD Radeon RX7900XTX 24GB, GDDR6, 384BIT
  4. Memory: Kingston, DIMM, DDR5, 64GB, 6000MHz, CL32, 1.35V, Fury Renegade, RGB, Kit of 2
  5. Active GPU: discrete (iGPU is turned off in UEFI)
Operating System (Linux)
  1. Distribution: Gentoo Linux
  2. Profile: [9] default/linux/amd64/17.1/desktop/plasma (stable)
  3. Kernel: 6.6.13-gentoo-dist (64-bit)
  4. Shell: GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
  5. Display server: X11
  6. Version manager: asdf v0.14.0-ccdd47d
  7. Graphics driver: AMDGPU (open source)
Desktop Environment (KDE Plasma 5)
  1. Plasma: 5.27.10
  2. KDE Frameworks: 5.113.0
  3. Qt: 5.15.11
  4. GCC: Gentoo 13.2.1_p20240113-r1 p12
  5. Make: GNU Make 4.4.1
Installed Plugins (asdf)
$ asdf list
bazel
 *6.1.2
  7.0.1
elixir
 *ref:v1.16.0
erlang
 *26.2.1
java
 *temurin-21.0.2+13.0.LTS
nodejs
 *21.6.1
php
 *8.3.2
postgres
 *16.1
ruby
 *3.3.0
rust
 *1.75.0
sqlite
 *3.45.0
Environment
  1. AMDGPU_TARGETS: -gfx906 -gfx908 -gfx90a -gfx1030 gfx1100
  2. Shell Variables
EXLA_TARGET="rocm"
XLA_BUILD="true"
XLA_TARGET="rocm"
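These variables only affect the build if the process that compiles xla inherits them, so they have to be exported in the shell that starts Livebook. A minimal sketch, using only the variable names listed above; the helper names export_xla_env and show_xla_env are made up for illustration, and whether each variable is still read by the installed versions of xla and EXLA is an assumption:

```shell
#!/bin/sh
# Export the build-related variables listed above so that child processes
# (the Livebook server and the mix/make build it spawns) inherit them.
export_xla_env() {
    export XLA_BUILD="true"
    export XLA_TARGET="rocm"
    export EXLA_TARGET="rocm"
}

# Print the variables as NAME=value, one per line, for a quick check.
show_xla_env() {
    for name in XLA_BUILD XLA_TARGET EXLA_TARGET; do
        eval "value=\${$name-}"
        printf '%s=%s\n' "$name" "$value"
    done
}

export_xla_env
show_xla_env
```

Running this in the same shell session before starting the Livebook server should make the values visible to the dependency build.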
ROCm
  1. dev-util/rocm-smi: 5.4.2
  2. dev-util/rocminfo: 5.7.1
$ rocminfo
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version:         1.1
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE
System Endianness:       LITTLE
Mwaitx:                  DISABLED
DMAbuf Support:          YES

==========
HSA Agents
==========
*******
Agent 1
*******
  Name:                    AMD Ryzen 9 7950X3D 16-Core Processor
  Uuid:                    CPU-XX
  Marketing Name:          AMD Ryzen 9 7950X3D 16-Core Processor
  Vendor Name:             CPU
  Feature:                 None specified
  Profile:                 FULL_PROFILE
  Float Round Mode:        NEAR
  Max Queue Number:        0(0x0)
  Queue Min Size:          0(0x0)
  Queue Max Size:          0(0x0)
  Queue Type:              MULTI
  Node:                    0
  Device Type:             CPU
  Cache Info:
    L1:                      32768(0x8000) KB
  Chip ID:                 0(0x0)
  ASIC Revision:           0(0x0)
  Cacheline Size:          64(0x40)
  Max Clock Freq. (MHz):   5759
  BDFID:                   0
  Internal Node ID:        0
  Compute Unit:            32
  SIMDs per CU:            0
  Shader Engines:          0
  Shader Arrs. per Eng.:   0
  WatchPts on Addr. Ranges:1
  Features:                None
  Pool Info:
    Pool 1
      Segment:                 GLOBAL; FLAGS: FINE GRAINED
      Size:                    65571832(0x3e88bf8) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
    Pool 2
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    65571832(0x3e88bf8) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
    Pool 3
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED
      Size:                    65571832(0x3e88bf8) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
  ISA Info:
*******
Agent 2
*******
  Name:                    gfx1100
  Uuid:                    GPU-aa0f7bf064530bb9
  Marketing Name:          AMD Radeon RX 7900 XTX
  Vendor Name:             AMD
  Feature:                 KERNEL_DISPATCH
  Profile:                 BASE_PROFILE
  Float Round Mode:        NEAR
  Max Queue Number:        128(0x80)
  Queue Min Size:          64(0x40)
  Queue Max Size:          131072(0x20000)
  Queue Type:              MULTI
  Node:                    1
  Device Type:             GPU
  Cache Info:
    L1:                      32(0x20) KB
    L2:                      6144(0x1800) KB
    L3:                      98304(0x18000) KB
  Chip ID:                 29772(0x744c)
  ASIC Revision:           0(0x0)
  Cacheline Size:          64(0x40)
  Max Clock Freq. (MHz):   2371
  BDFID:                   768
  Internal Node ID:        1
  Compute Unit:            96
  SIMDs per CU:            2
  Shader Engines:          6
  Shader Arrs. per Eng.:   2
  WatchPts on Addr. Ranges:4
  Features:                KERNEL_DISPATCH
  Fast F16 Operation:      TRUE
  Wavefront Size:          32(0x20)
  Workgroup Max Size:      1024(0x400)
  Workgroup Max Size per Dimension:
    x                        1024(0x400)
    y                        1024(0x400)
    z                        1024(0x400)
  Max Waves Per CU:        32(0x20)
  Max Work-item Per CU:    1024(0x400)
  Grid Max Size:           4294967295(0xffffffff)
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)
    y                        4294967295(0xffffffff)
    z                        4294967295(0xffffffff)
  Max fbarriers/Workgrp:   32
  Packet Processor uCode:: 528
  SDMA engine uCode::      19
  IOMMU Support::          None
  Pool Info:
    Pool 1
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED
      Size:                    25149440(0x17fc000) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Alignment:         4KB
      Accessible by all:       FALSE
    Pool 2
      Segment:                 GLOBAL; FLAGS:
      Size:                    25149440(0x17fc000) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Alignment:         4KB
      Accessible by all:       FALSE
    Pool 3
      Segment:                 GROUP
      Size:                    64(0x40) KB
      Allocatable:             FALSE
      Alloc Granule:           0KB
      Alloc Alignment:         0KB
      Accessible by all:       FALSE
  ISA Info:
    ISA 1
      Name:                    amdgcn-amd-amdhsa--gfx1100
      Machine Models:          HSA_MACHINE_MODEL_LARGE
      Profiles:                HSA_PROFILE_BASE
      Default Rounding Mode:   NEAR
      Default Rounding Mode:   NEAR
      Fast f16:                TRUE
      Workgroup Max Size:      1024(0x400)
      Workgroup Max Size per Dimension:
        x                        1024(0x400)
        y                        1024(0x400)
        z                        1024(0x400)
      Grid Max Size:           4294967295(0xffffffff)
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)
        y                        4294967295(0xffffffff)
        z                        4294967295(0xffffffff)
      FBarrier Max Size:       32
*** Done ***
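
The agent name gfx1100 reported above is the value that the enabled AMDGPU_TARGETS entry has to match. A small sketch that pulls the gfx names out of rocminfo output; the function name gfx_agents is hypothetical, and the filter assumes the "Name:" layout shown in the output above:

```shell
#!/bin/sh
# Print the distinct gfx agent names found in rocminfo output read from stdin.
# Lines such as "Marketing Name:" or ISA names starting with "amdgcn" are skipped.
gfx_agents() {
    awk '$1 == "Name:" && $2 ~ /^gfx/ { print $2 }' | sort -u
}

# Demo on three lines copied from the output above.
gfx_agents <<'EOF'
  Name:                    AMD Ryzen 9 7950X3D 16-Core Processor
  Name:                    gfx1100
      Name:                    amdgcn-amd-amdhsa--gfx1100
EOF
# prints: gfx1100
```

On the machine itself the function can be fed directly with: rocminfo | gfx_agents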

This is my very first time working with machine learning. I have no idea how ROCm works or how to install support for it. The xla project does not provide a precompiled ROCm build in its releases, so I need to compile it myself, but I do not know which packages I have to emerge.

So far, the livebook server escript command prints nothing except:

(…) Application running at (…)

and after clicking on the Neural Network task smart cell, the Livebook dependency setup fails with this error:

ERROR: Config value 'rocm' is not defined in any .rc file
make: *** [Makefile:26: $HOME/.cache/xla/0.5.1/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz] Error 2
could not compile dependency :xla, "mix compile" failed. Errors may have been logged above. You can recompile this dependency with "mix deps.compile xla --force", update it with "mix deps.update xla" or clean it with "mix deps.clean xla"
** (Mix.Error) Could not compile with "make" (exit status: 2).
You need to have gcc and make installed. If you are using
Ubuntu or any other Debian-based system, install the packages
"build-essential". Also install "erlang-dev" package if not
included in your Erlang/OTP version. If you're on Fedora, run
"dnf group install 'Development Tools'".

    (mix 1.16.0) lib/mix.ex:580: Mix.raise/2
    (elixir_make 0.7.8) lib/elixir_make/compiler.ex:53: ElixirMake.Compiler.compile/1
    (mix 1.16.0) lib/mix/task.ex:478: anonymous fn/3 in Mix.Task.run_task/5
    (mix 1.16.0) lib/mix/tasks/compile.all.ex:124: Mix.Tasks.Compile.All.run_compiler/2
    (mix 1.16.0) lib/mix/tasks/compile.all.ex:104: Mix.Tasks.Compile.All.compile/4
    (mix 1.16.0) lib/mix/tasks/compile.all.ex:93: Mix.Tasks.Compile.All.with_logger_app/2
    (mix 1.16.0) lib/mix/tasks/compile.all.ex:56: Mix.Tasks.Compile.All.run/1

Since Gentoo ships gcc and make by default, this error message tells me literally nothing. I also get 0 results when searching for ERROR: Config value 'rocm' is not defined in any .rc file.
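
The gcc/make hint comes from the generic failure message, so the useful part of such a log is usually the line that starts with ERROR:. A sketch of a filter that keeps only those lines and drops repeats; the name unique_errors is hypothetical, and the duplicated line in the demo input is added only to show the filtering:

```shell
#!/bin/sh
# Print each distinct line starting with "ERROR:" from stdin, in first-seen order.
unique_errors() {
    grep '^ERROR:' | awk '!seen[$0]++'
}

# Demo input: the ERROR line from the log above, repeated once on purpose.
unique_errors <<'EOF'
ERROR: Config value 'rocm' is not defined in any .rc file
could not compile dependency :xla, "mix compile" failed.
ERROR: Config value 'rocm' is not defined in any .rc file
EOF
# prints the ERROR line once
```

If the build output is saved to a file first, the same function can read it from standard input.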

I have tried the latest code from elixir-nx/nx using:

Mix.install([
  {:nx, github: "elixir-nx/nx", sparse: "nx", override: true},
  {:exla, github: "elixir-nx/nx", sparse: "exla", override: true}
])

and I received some more information:

ERROR: An error occurred during the fetch of repository 'local_config_rocm':
Traceback (most recent call last):
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 810, column 38, in _rocm_autoconf_impl
_create_local_rocm_repository(repository_ctx)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 546, column 35, in _create_local_rocm_repository
rocm_config = _get_rocm_config(repository_ctx, bash_bin)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 393, column 30, in _get_rocm_config
config = find_rocm_config(repository_ctx)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 371, column 26, in find_rocm_config
exec_result = execute(repository_ctx, [python_bin, repository_ctx.attr._find_rocm_config])
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/remote_config/common.bzl", line 230, column 13, in execute
fail(
Error in fail: Repository command failed
ERROR: Specified ROCM_PATH "/opt/rocm" does not exist
ERROR: $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/WORKSPACE:19:15: fetching rocm_configure rule//external:local_config_rocm: Traceback (most recent call last):
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 810, column 38, in _rocm_autoconf_impl
_create_local_rocm_repository(repository_ctx)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 546, column 35, in _create_local_rocm_repository
rocm_config = _get_rocm_config(repository_ctx, bash_bin)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 393, column 30, in _get_rocm_config
config = find_rocm_config(repository_ctx)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 371, column 26, in find_rocm_config
exec_result = execute(repository_ctx, [python_bin, repository_ctx.attr._find_rocm_config])
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/remote_config/common.bzl", line 230, column 13, in execute
fail(
Error in fail: Repository command failed
ERROR: Specified ROCM_PATH "/opt/rocm" does not exist
ERROR: Skipping '//xla/extension:xla_extension': no such package '@local_config_rocm//rocm': Repository command failed
ERROR: Specified ROCM_PATH "/opt/rocm" does not exist
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_rocm//rocm': Repository command failed
ERROR: Specified ROCM_PATH "/opt/rocm" does not exist
INFO: Elapsed time: 0.088s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
make: *** [Makefile:26: $HOME/.cache/xla/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz] Error 1
could not compile dependency :xla, "mix compile" failed. Errors may have been logged above. You can recompile this dependency with "mix deps.compile xla --force", update it with "mix deps.update xla" or clean it with "mix deps.clean xla"
** (Mix.Error) Could not compile with "make" (exit status: 2).
You need to have gcc and make installed. If you are using
Ubuntu or any other Debian-based system, install the packages
"build-essential". Also install "erlang-dev" package if not
included in your Erlang/OTP version. If you're on Fedora, run
"dnf group install 'Development Tools'".

(mix 1.16.0) lib/mix.ex:580: Mix.raise/2
(elixir_make 0.7.8) lib/elixir_make/compiler.ex:53: ElixirMake.Compiler.compile/1
(mix 1.16.0) lib/mix/task.ex:478: anonymous fn/3 in Mix.Task.run_task/5
(mix 1.16.0) lib/mix/tasks/compile.all.ex:124: Mix.Tasks.Compile.All.run_compiler/2
(mix 1.16.0) lib/mix/tasks/compile.all.ex:104: Mix.Tasks.Compile.All.compile/4
(mix 1.16.0) lib/mix/tasks/compile.all.ex:93: Mix.Tasks.Compile.All.with_logger_app/2
(mix 1.16.0) lib/mix/tasks/compile.all.ex:56: Mix.Tasks.Compile.All.run/1

I tried exporting ROCM_PATH with the value /usr, but then the build could not locate the rocm_version.h file:

ERROR: An error occurred during the fetch of repository 'local_config_rocm':
Traceback (most recent call last):
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 810, column 38, in _rocm_autoconf_impl
_create_local_rocm_repository(repository_ctx)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 546, column 35, in _create_local_rocm_repository
rocm_config = _get_rocm_config(repository_ctx, bash_bin)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 393, column 30, in _get_rocm_config
config = find_rocm_config(repository_ctx)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 371, column 26, in find_rocm_config
exec_result = execute(repository_ctx, [python_bin, repository_ctx.attr._find_rocm_config])
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/remote_config/common.bzl", line 230, column 13, in execute
fail(
Error in fail: Repository command failed
ERROR: ROCm version file not found in ['include/rocm-core/rocm_version.h', 'include/rocm_version.h']
ERROR: $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/WORKSPACE:19:15: fetching rocm_configure rule//external:local_config_rocm: Traceback (most recent call last):
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 810, column 38, in _rocm_autoconf_impl
_create_local_rocm_repository(repository_ctx)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 546, column 35, in _create_local_rocm_repository
rocm_config = _get_rocm_config(repository_ctx, bash_bin)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 393, column 30, in _get_rocm_config
config = find_rocm_config(repository_ctx)
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 371, column 26, in find_rocm_config
exec_result = execute(repository_ctx, [python_bin, repository_ctx.attr._find_rocm_config])
File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/remote_config/common.bzl", line 230, column 13, in execute
fail(
Error in fail: Repository command failed
ERROR: ROCm version file not found in ['include/rocm-core/rocm_version.h', 'include/rocm_version.h']
ERROR: Skipping '//xla/extension:xla_extension': no such package '@local_config_rocm//rocm': Repository command failed
ERROR: ROCm version file not found in ['include/rocm-core/rocm_version.h', 'include/rocm_version.h']
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_rocm//rocm': Repository command failed
ERROR: ROCm version file not found in ['include/rocm-core/rocm_version.h', 'include/rocm_version.h']
INFO: Elapsed time: 0.066s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
make: *** [Makefile:26: $HOME/.cache/xla/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz] Error 1
could not compile dependency :xla, "mix compile" failed. Errors may have been logged above. You can recompile this dependency with "mix deps.compile xla --force", update it with "mix deps.update xla" or clean it with "mix deps.clean xla"
** (Mix.Error) Could not compile with "make" (exit status: 2).
You need to have gcc and make installed. If you are using
Ubuntu or any other Debian-based system, install the packages
"build-essential". Also install "erlang-dev" package if not
included in your Erlang/OTP version. If you're on Fedora, run
"dnf group install 'Development Tools'".

(mix 1.16.0) lib/mix.ex:580: Mix.raise/2
(elixir_make 0.7.8) lib/elixir_make/compiler.ex:53: ElixirMake.Compiler.compile/1
(mix 1.16.0) lib/mix/task.ex:478: anonymous fn/3 in Mix.Task.run_task/5
(mix 1.16.0) lib/mix/tasks/compile.all.ex:124: Mix.Tasks.Compile.All.run_compiler/2
(mix 1.16.0) lib/mix/tasks/compile.all.ex:104: Mix.Tasks.Compile.All.compile/4
(mix 1.16.0) lib/mix/tasks/compile.all.ex:93: Mix.Tasks.Compile.All.with_logger_app/2
(mix 1.16.0) lib/mix/tasks/compile.all.ex:56: Mix.Tasks.Compile.All.run/1

I have not found any package that provides this file, and I have no idea how to proceed from here.
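
The two failures above can be reproduced without starting a build by checking a candidate prefix by hand. This is only a sketch of the two conditions visible in the error messages (the directory exists, and one of the two listed header paths is present under it); the real rocm_configure logic may check more than that, and the name check_rocm_prefix is hypothetical:

```shell
#!/bin/sh
# Relative paths taken from the error message quoted above.
ROCM_VERSION_HEADERS="include/rocm-core/rocm_version.h include/rocm_version.h"

# Report whether PREFIX passes the two checks that failed in the logs above.
# Prints one status line and returns 0 (ok), 1 (no directory) or 2 (no header).
check_rocm_prefix() {
    prefix=$1
    if [ ! -d "$prefix" ]; then
        echo "missing-dir $prefix"
        return 1
    fi
    for rel in $ROCM_VERSION_HEADERS; do
        if [ -f "$prefix/$rel" ]; then
            echo "ok $prefix/$rel"
            return 0
        fi
    done
    echo "missing-header $prefix"
    return 2
}

# The two prefixes tried so far in this post.
for candidate in /opt/rocm /usr; do
    check_rocm_prefix "$candidate" || true
done
```

A prefix that reports ok would at least get past the version-file lookup; whether the rest of the build then succeeds is a separate question.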

OK, so far this is what I have found:

  1. My hardware is too new, so I need to wait for Gentoo to support it, i.e. until 6.x versions of ROCm are available in the Gentoo ebuild repository. In theory I could use the -9999 ebuilds, but those track the master branch and could therefore be very unstable.

  2. I have not found an ebuild for rocm-core, which is pretty weird. After updating to 6.x (see above) I would have to install rocm-core manually, which does not look very hard, but it is still weird.

  3. Optionally, it would be helpful to write rocm-core and rocm-meta ebuilds in order to easily install and update ROCm and its dependencies.
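
Point 1 comes down to a version comparison: the installed ROCm packages are 5.x while 6.x is wanted. A minimal sketch of that check, with the package versions copied by hand from the ROCm section of this post; the threshold 6 reflects the conclusion above rather than a documented requirement, and the name is_older_major is hypothetical:

```shell
#!/bin/sh
# Succeed when the major component of VERSION ($1) is lower than REQUIRED_MAJOR ($2).
is_older_major() {
    major=${1%%.*}
    [ "$major" -lt "$2" ]
}

# Package names and versions as listed in the ROCm section of this post.
while read -r pkg ver; do
    if is_older_major "$ver" 6; then
        echo "$pkg $ver: older than 6.x"
    else
        echo "$pkg $ver: 6.x or newer"
    fi
done <<'EOF'
dev-util/rocm-smi 5.4.2
dev-util/rocminfo 5.7.1
EOF
```

Both packages listed here report as older than 6.x, which matches the observation that an update is needed first.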