I’m building a custom Nerves system for a CM4-based board and running into a timing issue where the main Elixir application starts before all kernel drivers and hardware are fully initialized.
My firmware uses NervesTime with an RTC (PCF8563) on an I2C bus that’s created by a device tree overlay. During boot, NervesTime tries to initialize the RTC, but the I2C bus doesn’t exist yet:
I am considering adding an erlinit pre-run script to explicitly wait for the hardware to be fully initialized, but am curious whether there is a recommended Nerves pattern for waiting on hardware initialization. Ideally this would be baked into the Nerves system.
You’re running into a common issue where libraries assume that hardware exists on initialization when it really shouldn’t. I make this mistake too and I’m on the maintainers list for that library, so I’m not blaming those involved. The usual answer is that the library needs to either retry or hook into the proper notification for when its requirements are met. I prefer retrying for these things since it’s simpler. Retries can also recover from I2C issues like the bus being temporarily hung.
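To make the retry idea concrete, here is a minimal sketch of a GenServer that keeps retrying an "open" function until the hardware shows up. It is not how NervesTime or its PCF8563 module is implemented. The module name `RetryingDevice`, the `:open` and `:retry_ms` options, and the `{:ok, handle} | {:error, reason}` return shape are all assumptions made for illustration.

```elixir
defmodule RetryingDevice do
  @moduledoc """
  Hypothetical sketch: retry a hardware "open" function until it succeeds,
  instead of assuming the device exists when the process starts.
  """
  use GenServer

  @default_retry_ms 1_000

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  # Returns {:ok, handle} once the device is open, {:error, :not_ready} before that.
  def handle(server), do: GenServer.call(server, :handle)

  @impl GenServer
  def init(opts) do
    state = %{
      # :open is an assumed zero-arity function returning {:ok, handle} | {:error, reason}
      open: Keyword.fetch!(opts, :open),
      retry_ms: Keyword.get(opts, :retry_ms, @default_retry_ms),
      handle: nil
    }

    # Do not touch hardware in init/1, so a missing bus cannot crash the supervisor.
    {:ok, state, {:continue, :try_open}}
  end

  @impl GenServer
  def handle_continue(:try_open, state), do: {:noreply, try_open(state)}

  @impl GenServer
  def handle_info(:try_open, state), do: {:noreply, try_open(state)}

  @impl GenServer
  def handle_call(:handle, _from, %{handle: nil} = state) do
    {:reply, {:error, :not_ready}, state}
  end

  def handle_call(:handle, _from, state), do: {:reply, {:ok, state.handle}, state}

  defp try_open(state) do
    case state.open.() do
      {:ok, handle} ->
        %{state | handle: handle}

      {:error, _reason} ->
        # Bus not there yet (or temporarily hung): schedule another attempt.
        Process.send_after(self(), :try_open, state.retry_ms)
        state
    end
  end
end
```

The same loop would also cover the "bus temporarily hung" case if the process dropped its handle and went back to `:try_open` whenever a later read or write failed.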
If you’re looking for a pragmatic answer that doesn’t involve changing that library, then I think you have it. For other things, I’d suggest adding init code to an Application.start callback that gets run before the code that needs it. However, an RTC is special and affects the clock, so you’ll benefit by getting it initialized as early as possible.
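For the "init code in an Application.start callback" approach, a rough sketch could look like the following. It polls for a device node with a timeout before starting the supervision tree. The names `HardwareWait` and `MyApp.Application`, the path `/dev/i2c-1`, and the 5 second timeout are assumptions for illustration, not values from this thread.

```elixir
defmodule HardwareWait do
  @moduledoc "Hypothetical helper: poll until a path exists or a timeout expires."

  def wait_for_path(path, timeout_ms, interval_ms \\ 100) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    poll(path, deadline, interval_ms)
  end

  defp poll(path, deadline, interval_ms) do
    cond do
      File.exists?(path) ->
        :ok

      System.monotonic_time(:millisecond) >= deadline ->
        {:error, :timeout}

      true ->
        Process.sleep(interval_ms)
        poll(path, deadline, interval_ms)
    end
  end
end

defmodule MyApp.Application do
  use Application
  require Logger

  @impl Application
  def start(_type, _args) do
    # Assumed device node. Log and carry on rather than crash if it never appears.
    case HardwareWait.wait_for_path("/dev/i2c-1", 5_000) do
      :ok -> :ok
      {:error, :timeout} -> Logger.warning("I2C bus did not appear, continuing anyway")
    end

    # Children that need the bus would be listed here.
    Supervisor.start_link([], strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```

Note that this only delays code started by this application. A dependency that runs as its own OTP application will already have started by the time this callback runs, which is part of why it doesn't help much for the RTC.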
If you already have a custom Nerves system, then the way I’d solve the problem is to use the Linux device driver for the PCF8563. If it’s a built-in device driver and you’re loading the overlay via the config.txt, then I’d expect system time to get set by the RTC before any Elixir code runs. I know others who prefer handling the RTC in Elixir, so if you’re interested in updating NervesTime.RTC.NXP.PCF8563 to retry and are open to sending PRs, then I’d work with the other maintainers to get that merged.
I actually did explore having Linux manage the RTC and ntpd directly, but that seemed contrary to the “Nerves philosophy” of handling as much as possible in Elixir rather than relying on traditional Linux tools.
I noticed that nerves_time is included as a dependency in nerves_hub_link, and there was a PR (now closed) to check for time synchronization before attempting to connect. That made me think the intended direction was to have Elixir manage the RTC.
I’m curious about the broader architectural direction here: should libraries like nerves_time become more resilient with retries/timeouts to handle these timing issues? Or is it actually fine (or even preferable) to delegate responsibilities like RTC management to Linux and keep the Nerves system more self-contained?
When using the mostly-for-development SharedSecret method with NervesHub, the time can really screw with the signing of the secret, so that’s why that was considered. I’m not sure we ever merged that. It was also about this time that I noticed just how long it can be between the OS time getting NTP sync and Erlang actually warping time to match. Fun stuff. Not painful.. at .. all.
I wouldn’t necessarily consider nerves_hub_link to be a reference for the Nerves philosophy or best practice. It is well proven, but it has been worked on for a long time by many hands and contains more than one philosophy.
It very much tries and retries the way that Frank is talking about, because internet access is not a guarantee. Access to the hardware crypto chip is hopefully stable, but if it blips, best to try again later.
I think improving the RTC library sounds good, as that benefits us all.
But for something like an RTC, using the Linux kernel driver is reasonably pragmatic as well. There is no purity, it’s always a balancing act. You will see a mix in most production Nerves projects.