nerves_runtime's Introduction

nerves_runtime

nerves_runtime is a core component of Nerves. It contains applications and libraries that are expected to be useful on all Nerves devices.

Here are its features:

  • Generic system and filesystem initialization (suitable for use with shoehorn)
  • Introspection of Nerves system, firmware, and deployment metadata
  • Device reboot and shutdown
  • A small Linux kernel uevent application for capturing hardware change events and more. See nerves_uevent.
  • Device serial numbers
  • Linux log integration with Elixir. See nerves_logging.

The following sections describe the features in more detail. For more information, see the hex docs.

System Initialization

nerves_runtime provides an OTP application (nerves_runtime) that can initialize the system when it is started. For this to be useful, nerves_runtime must be started before other OTP applications, since most will assume that the system is already initialized before they start. To set up nerves_runtime to work with shoehorn, you will need to do the following:

  1. Add shoehorn to the dependency list in your mix.exs
  2. Add a :shoehorn configuration to config.exs and put :nerves_runtime at the beginning of the init: list:
config :shoehorn,
  init: [:nerves_runtime, :other_app1, :other_app2]

Filesystem Initialization

Nerves systems generally ship with one or more application filesystem partitions. These are used for persisting data that is expected to live between firmware updates. The root filesystem cannot be used since it is mounted as read-only by default.

nerves_runtime takes an unforgiving approach to managing the application partition: if it can't be mounted as read-write, it gets re-formatted. While filesystem corruption should be a rare event, even with unexpected loss of power, Nerves devices may not always be accessible for manual recovery. This default behavior provides a basic recoverability guarantee.
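
The mount-or-reformat policy described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the module name AppPartitionSketch and the injected mount and format functions are hypothetical, chosen so the decision logic can be exercised without touching a real block device.

```elixir
defmodule AppPartitionSketch do
  @moduledoc """
  Illustrative only: try to mount the application partition and reformat it
  once if the mount fails. `mount` and `format` are injected zero-arity
  functions (hypothetical) that return :ok or {:error, reason}.
  """

  @type result :: :ok | {:error, term()}

  @spec ensure_mounted((-> result), (-> result)) ::
          {:ok, :mounted | :formatted_and_mounted} | {:error, term()}
  def ensure_mounted(mount, format) do
    case mount.() do
      :ok ->
        {:ok, :mounted}

      {:error, _reason} ->
        # The mount failed. Treat the partition as corrupt (or never
        # initialized, as on a first boot), format it, and try once more.
        # Any failure here is returned to the caller unchanged.
        with :ok <- format.(),
             :ok <- mount.() do
          {:ok, :formatted_and_mounted}
        end
    end
  end
end
```

On a device, the injected functions would wrap whatever mount and mkfs commands the system provides. Keeping them as parameters makes the recovery path easy to unit test on a host.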

To verify that this recovery works, Nerves systems usually leave the application filesystems uninitialized so that the format operation happens on the first boot. This means that the first boot takes slightly longer than subsequent boots.

A common implementation of "reset to factory defaults" is to purposely erase (corrupt) the application partition and reboot. See Nerves.Runtime.FwupOps.factory_reset/1.

nerves_runtime uses firmware metadata to determine how to mount and initialize the application partition. The following variables are important:

  • [partition].nerves_fw_application_part0_devpath - the path to the application partition (e.g. /dev/mmcblk0p3)
  • [partition].nerves_fw_application_part0_fstype - the type of filesystem (e.g. ext4)
  • [partition].nerves_fw_application_part0_target - where the partition should be mounted (e.g. /root or /mnt/appdata)

Nerves System and Firmware Metadata

All official Nerves systems maintain a list of key-value pairs for tracking various information about the system. This information is not intended to be written frequently. To get this information, you can call one of the following:

  • Nerves.Runtime.KV.get_all_active/0 - return all key-value pairs associated with the active firmware.
  • Nerves.Runtime.KV.get_all/0 - return all key-value pairs, including those from the inactive firmware, if any.
  • Nerves.Runtime.KV.get_active/1 - look up the value of a key associated with the active firmware.
  • Nerves.Runtime.KV.get/1 - look up the value of a key, including those from the inactive firmware, if any.

Global Nerves metadata includes the following:

| Key | Build Environment Variable | Example Value | Description |
|---|---|---|---|
| nerves_fw_active | N/A | "a" | This key holds the prefix that identifies the active firmware metadata. In this example, all keys starting with "a." hold information about the running firmware. |
| nerves_fw_devpath | NERVES_FW_DEVPATH | "/dev/mmcblk0" | This is the primary storage device for the firmware. |
| nerves_serial_number | N/A | "12345abc" | This is a text serial number. See Serial numbers for details. |
| nerves_fw_validated | N/A | 0 | Set to "1" to indicate that the currently running firmware is valid. (Only supported on some platforms) |
| nerves_fw_autovalidate | N/A | 1 | Set to "1" to indicate that firmware updates are valid without any additional checks. (Only supported on some platforms) |
| upgrade_available | N/A | 0 | If using the U-Boot bootloader AND U-Boot's bootcount feature, then the upgrade_available variable is used instead of nerves_fw_validated (it has the opposite meaning) |
| bootcount | N/A | 1 | If using the U-Boot bootloader AND U-Boot's bootcount feature, then this is the number of times an unvalidated firmware has been booted. |
| bootlimit | N/A | 1 | If using the U-Boot bootloader AND U-Boot's bootcount feature, then this is the max number of tries for unvalidated firmware. |

Firmware-specific Nerves metadata includes the following:

| Key | Example Value | Description |
|---|---|---|
| nerves_fw_application_part0_devpath | "/dev/mmcblk0p3" | The block device that contains the application partition |
| nerves_fw_application_part0_fstype | "ext4" | The application partition's filesystem type |
| nerves_fw_application_part0_target | "/root" | Where to mount the application partition |
| nerves_fw_architecture | "arm" | The processor architecture (Not currently used) |
| nerves_fw_author | "John Doe" | The person or company that created this firmware |
| nerves_fw_description | "Stuff" | A description of the project |
| nerves_fw_platform | "rpi3" | A name to identify the board that this runs on. It can be checked in the fwup.conf before performing an upgrade. |
| nerves_fw_product | "My Product" | A product name that may show up in a firmware selection list, for example |
| nerves_fw_version | "1.0.0" | The project's version |
| nerves_fw_vcs_identifier | "bdeead38..." | A git SHA or other identifier (optional) |
| nerves_fw_misc | "anything..." | Any application info that doesn't fit in another field (optional) |

Note that the keys are stored in the environment block prefixed by the firmware slot to which they pertain. For example, a.nerves_fw_description is the description for the firmware in the "A" slot.
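
To make the slot prefix concrete, here is a small sketch that takes a map shaped like the full key-value store and extracts the pairs for the active slot with the prefix removed. The module name KVPrefixSketch is hypothetical, the "b." entry is made-up sample data, and this is not necessarily how Nerves.Runtime.KV implements the lookup.

```elixir
defmodule KVPrefixSketch do
  @doc """
  Returns the key-value pairs that belong to the active firmware slot,
  with the slot prefix (for example "a.") stripped from each key.
  Returns an empty map when "nerves_fw_active" is missing.
  """
  @spec active_pairs(%{String.t() => String.t()}) :: %{String.t() => String.t()}
  def active_pairs(all) do
    case Map.fetch(all, "nerves_fw_active") do
      {:ok, slot} ->
        prefix = slot <> "."

        for {key, value} <- all, String.starts_with?(key, prefix), into: %{} do
          {String.replace_prefix(key, prefix, ""), value}
        end

      :error ->
        %{}
    end
  end
end

all = %{
  "nerves_fw_active" => "a",
  "nerves_fw_devpath" => "/dev/mmcblk0",
  "a.nerves_fw_version" => "1.0.0",
  "a.nerves_fw_platform" => "rpi3",
  # Made-up entry for the inactive slot
  "b.nerves_fw_version" => "0.9.0"
}

KVPrefixSketch.active_pairs(all)
# => %{"nerves_fw_platform" => "rpi3", "nerves_fw_version" => "1.0.0"}
```

Global keys such as nerves_fw_devpath carry no slot prefix, so they are not part of the result.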

Several of the keys can be set in the mix.exs file of your main Nerves project. This is the preferred way to set them because it requires the least amount of effort.

Assuming that your fwup.conf respects the fwup variable names listed in the table, the keys can also be overridden by setting environment variables at build time. Depending on your project, you may prefer to set them using a customized fwup.conf configuration file instead.

The "Key in fwup -m" column shows the key that you'll see if you run fwup -m -i project.fw to extract the firmware metadata from the .fw file.

| Key in Nerves.Runtime | Key in mix.exs | Build Environment Variable | Key in fwup -m |
|---|---|---|---|
| nerves_fw_application_part0_devpath | N/A | NERVES_FW_APPLICATION_PART0_DEVPATH | N/A |
| nerves_fw_application_part0_fstype | N/A | NERVES_FW_APPLICATION_PART0_FSTYPE | N/A |
| nerves_fw_application_part0_target | N/A | NERVES_FW_APPLICATION_PART0_TARGET | N/A |
| nerves_fw_architecture | N/A | NERVES_FW_ARCHITECTURE | meta-architecture |
| nerves_fw_author | :author | NERVES_FW_AUTHOR | meta-author |
| nerves_fw_description | :description | NERVES_FW_DESCRIPTION | meta-description |
| nerves_fw_platform | N/A | NERVES_FW_PLATFORM | meta-platform |
| nerves_fw_product | :name | NERVES_FW_PRODUCT | meta-product |
| nerves_fw_version | :version | NERVES_FW_VERSION | meta-version |
| nerves_fw_vcs_identifier | N/A | NERVES_FW_VCS_IDENTIFIER | meta-vcs-identifier |
| nerves_fw_misc | N/A | NERVES_FW_MISC | meta-misc |

Device Reboot and Shutdown

Rebooting, powering off, and halting a device work by signaling to erlinit an intention to shut down and then exiting the Erlang VM by calling :init.stop/0. Nerves.Runtime.reboot/0 and the related utilities are helper functions for this. Once they return, the Erlang VM will likely only be available momentarily before shutdown. If the OTP applications cannot be stopped within the timeout specified in erlinit.config, erlinit will ungracefully terminate the Erlang VM.

Reverting firmware

If you'd like to go back to the previous version of firmware running on a device, you can do that if the Nerves system supports it. At the IEx prompt, run:

iex> Nerves.Runtime.revert

Running this command manually is useful in development. Production use requires more work to protect against faulty upgrades.

Newer Nerves systems support preventing a revert. This is useful when you've loaded a version of firmware that is not meant to be used after it has been upgraded. This could be a factory test or an initial firmware that bootstraps encrypted firmware storage. See Nerves.Runtime.FwupOps.prevent_revert/0.

Assisted firmware validation and automatic revert

Nerves firmware updates protect against update corruption and power loss midway through the update procedure. However, what happens if the firmware update contains bad code that hangs the device or breaks something important like networking? Some Nerves systems support running new firmware tentatively and reverting to the previous firmware if something goes wrong.

At a high level, this involves some additional code from the developer that knows what constitutes "working". This could be "is it possible to connect to the firmware update server within 5 minutes of boot?"

Here's the process:

  1. New firmware is installed in the normal manner. The Nerves.Runtime.KV variable nerves_fw_validated is set to 0. (The system's fwup.conf does this)
  2. The system reboots like normal.
  3. The device starts a five-minute reboot timer (your code needs to do this if you want to catch hangs or super-slow boots)
  4. The application attempts to make a connection to the firmware update server.
  5. On a good connection, the application sets nerves_fw_validated to 1 by calling Nerves.Runtime.validate_firmware/0 and cancels the reboot timer.
  6. On error, the reboot timer failing, or a hardware watchdog timeout, the system reboots. The bootloader reverts to the previous firmware.
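
Steps 3 through 6 could be sketched as a small GenServer like the one below. It is a minimal sketch under stated assumptions: the module name ValidationTimerSketch and its options are hypothetical, and the validate and reboot actions are injected as functions so the logic can run on a host. On a device you would presumably pass &Nerves.Runtime.validate_firmware/0 and &Nerves.Runtime.reboot/0.

```elixir
defmodule ValidationTimerSketch do
  @moduledoc """
  Illustrative only. Starts a deadline timer on init. If `mark_healthy/1` is
  called before the deadline, the firmware is validated and the timer is
  cancelled. Otherwise the reboot function is called.

  Options (all names are hypothetical):
    * :timeout_ms - deadline in milliseconds (default: five minutes)
    * :validate   - zero-arity function that marks the firmware as valid
    * :reboot     - zero-arity function that reboots the device
  """
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @doc "Call this once your own health check (e.g. server connection) passes."
  def mark_healthy(server), do: GenServer.call(server, :mark_healthy)

  @impl GenServer
  def init(opts) do
    timeout_ms = Keyword.get(opts, :timeout_ms, 5 * 60 * 1000)
    timer = Process.send_after(self(), :deadline, timeout_ms)

    state = %{
      timer: timer,
      validate: Keyword.fetch!(opts, :validate),
      reboot: Keyword.fetch!(opts, :reboot)
    }

    {:ok, state}
  end

  @impl GenServer
  # Already validated (or already rebooting): nothing left to do.
  def handle_call(:mark_healthy, _from, %{timer: nil} = state) do
    {:reply, :ok, state}
  end

  def handle_call(:mark_healthy, _from, state) do
    Process.cancel_timer(state.timer)
    state.validate.()
    {:reply, :ok, %{state | timer: nil}}
  end

  @impl GenServer
  # A deadline message that raced with a successful validation is ignored.
  def handle_info(:deadline, %{timer: nil} = state), do: {:noreply, state}

  def handle_info(:deadline, state) do
    state.reboot.()
    {:noreply, %{state | timer: nil}}
  end
end
```

A timer inside the VM cannot catch a hung VM or kernel, which is why the process above also mentions a hardware watchdog.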

Some Nerves systems support a KV variable called nerves_fw_autovalidate. The intention of this variable was to let one system support both scenarios that require validation and ones that don't. If the system supports this variable, then you should make sure that it is set to 0 (either via a custom fwup.conf or via the provisioning hooks for writing serial numbers to MicroSD cards). Support for the nerves_fw_autovalidate variable will likely go away in the future as steps are made to make automatic revert on bad firmware a default feature of Nerves rather than an add-on.

U-Boot assisted automatic revert

U-Boot provides a bootcount feature that can be used to try out new firmware and revert it if it fails. At a high level, it works similarly to the logic just described, except that it can attempt a new firmware more than once if desired. This can help if validating a firmware image depends on factors out of your control and you want a few tries to happen before giving up.

To use this, you need to enable the following U-Boot configuration items:

CONFIG_BOOTCOUNT_LIMIT=y
CONFIG_BOOTCOUNT_ENV=y

See the U-Boot documentation for more information. The gist is to have your bootcmd handle normal booting and then add an altbootcmd to revert the firmware. The firmware update should set the upgrade_available U-Boot environment variable to "1" to indicate that boot counting should start. Nerves.Runtime.validate_firmware/0 knows about upgrade_available, so when you call it to indicate that the firmware is ok, it will set upgrade_available back to "0" and reset "bootcount".
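
The boot counting flow can be modeled with a few lines of Elixir. This is a simplified model of the behavior described above, not U-Boot or nerves_runtime code: the module name BootcountSketch is hypothetical, and the sample values (a bootlimit of "2", a starting bootcount of "0") are made up for illustration.

```elixir
defmodule BootcountSketch do
  @moduledoc """
  Simplified model. `env` is a map of U-Boot environment variables, all
  strings. Counting only happens while upgrade_available is "1".
  """

  @doc """
  Models one boot. Returns {command, updated_env}, where command is
  :bootcmd (normal boot) or :altbootcmd (revert path).
  """
  def boot(%{"upgrade_available" => "1"} = env) do
    count = String.to_integer(Map.get(env, "bootcount", "0")) + 1
    env = Map.put(env, "bootcount", Integer.to_string(count))

    case Map.fetch(env, "bootlimit") do
      {:ok, limit} ->
        if count > String.to_integer(limit), do: {:altbootcmd, env}, else: {:bootcmd, env}

      :error ->
        # No limit configured: always boot normally.
        {:bootcmd, env}
    end
  end

  def boot(env), do: {:bootcmd, env}

  @doc """
  Models the effect of validating the firmware: counting stops. The text
  says bootcount is also reset. This model leaves it alone because the
  counter is ignored while upgrade_available is "0".
  """
  def validate(env), do: Map.put(env, "upgrade_available", "0")
end

env = %{"upgrade_available" => "1", "bootcount" => "0", "bootlimit" => "2"}
{:bootcmd, env} = BootcountSketch.boot(env)
{:bootcmd, env} = BootcountSketch.boot(env)
# Third unvalidated boot exceeds the limit, so the revert path runs.
{:altbootcmd, _env} = BootcountSketch.boot(env)
```

Had the application validated the firmware after the first boot, every later boot in this model would take the :bootcmd path.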

Best effort automatic revert

Unfortunately, the bootloader for platforms like the Raspberry Pi makes it difficult to implement the above mechanism. The following strategy cannot protect against kernel and early boot issues, but it can still provide value:

  1. Upgrade firmware the normal way. Record that the next boot will be the first one in the application data partition.
  2. On the reboot, if this is the first one, record that the boot happened and revert the firmware with reboot: false. If this is not the first boot, carry on.
  3. When you're happy with the new firmware, revert the firmware again with reboot: false. I.e., revert the revert. It is critical that revert is only called once.
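
One way to keep the "revert exactly once per phase" rule straight is to model the strategy as a tiny state machine whose state is persisted in the application data partition. The sketch below is an assumption-laden illustration: the module name, state names, and action atom are all hypothetical, and the caller is expected to persist the returned state and carry out each :revert_without_reboot action, presumably by calling Nerves.Runtime.revert with reboot: false.

```elixir
defmodule BestEffortRevertSketch do
  @moduledoc """
  Illustrative only. States (hypothetical names):
    * :first_boot_pending  - written by the upgrade code before rebooting
    * :awaiting_validation - new firmware has booted once; the old firmware
                             is currently selected for the next boot
    * :validated           - new firmware was accepted and reselected
  Each function returns {actions, new_state}.
  """

  # Step 2: first boot of the new firmware. Record it and arm the revert.
  def on_boot(:first_boot_pending), do: {[:revert_without_reboot], :awaiting_validation}
  # Any other boot: carry on.
  def on_boot(state), do: {[], state}

  # Step 3: the application is happy, so revert the revert.
  def on_validated(:awaiting_validation), do: {[:revert_without_reboot], :validated}
  # Already validated, or never armed: do nothing so that revert is not
  # called a second time.
  def on_validated(state), do: {[], state}
end
```

Persist the new state before performing the action if a power loss between the two would otherwise cause a second revert.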

To make this handle hangs, you'll want to enable a hardware watchdog.

Serial numbers

Finding the serial number of a device is both hardware specific and influenced by you and your organization's choices for assigning them (or not). Programs should call Nerves.Runtime.serial_number/0 to get the serial number.

Nerves systems all come with some default way of getting a serial number for a device. This strategy will likely work for a while, but may not meet your needs when it comes to production. Nerves uses boardid to read serial numbers and it can be customized via its /etc/boardid.config file. See boardid for the mechanisms available. If none of boardid's mechanisms work for you, please consider filing an issue or making a PR, since our history has been that organizations tend to use similar mechanisms and it's likely someone else will use it too.

As a word of caution, many Nerves users write serial numbers in the U-Boot environment block under the key nerves_serial_number. This is supported and documentation exists for it in many places. While it's very convenient, it has drawbacks. For example, the serial number is easily modified. It's definitely not the only mechanism. The boardid.config file supports trying multiple ways of getting a serial number to handle hardware changing over the course of development.

See embedded-elixir for how to assign serial numbers to devices using the U-Boot environment block.

Using nerves_runtime in tests

Applications that depend on nerves_runtime for accessing provisioning information from the Nerves.Runtime.KV can mock the contents with the included Nerves.Runtime.KVBackend.InMemory module through the Application config:

config :nerves_runtime,
  kv_backend: {Nerves.Runtime.KVBackend.InMemory, contents: %{"key" => "value"}}

You can also create your own module based on the Nerves.Runtime.KVBackend behavior and set it to be used in the Application config. In most situations, the provided Nerves.Runtime.KVBackend.InMemory should be sufficient, though this would be helpful in cases where you might need to generate the initial state at runtime instead:

defmodule MyApp.KVBackend.Mock do
  @behaviour Nerves.Runtime.KVBackend

  @impl Nerves.Runtime.KVBackend
  def load(_opts) do
    # initial state
    %{
      "howdy" => "partner",
      "dynamic" => some_runtime_calc_function()
    }
  end

  @impl Nerves.Runtime.KVBackend
  def save(_map, _opts), do: :ok
end

# Then in config.exs
config :nerves_runtime, :kv_backend, MyApp.KVBackend.Mock

nerves_runtime's People

Contributors

amclain, connorrigby, dependabot-preview[bot], dependabot[bot], electricshaman, ericr3r, fazibear, fhunleth, gregmefford, jjcarstens, kyleaa, mnishiguchi, mobileoverlord, nicoeg, oestrich, okothkongo, pancho-villa, udoschneider

nerves_runtime's Issues

An application boot failure around KmsgTailer after update from nerves_runtime v0.8.0 to v0.11.0

#62

Environment

Build host: Linux ubuntu 4.15.0-91-generic #92-Ubuntu SMP Fri Feb 28 11:09:48 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Target: Linux smartslot 4.19.72 #4 SMP PREEMPT Thu Mar 12 07:38:25 PDT 2020 armv7l GNU/Linux

nerves: 1.6.0
nerves_bootstrap: ~> 1.8

  • Elixir version (elixir -v):
    Erlang/OTP 22 [erts-10.6.4] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]
    Elixir 1.10.0 (compiled with Erlang/OTP 22)

  • Nerves environment: (mix nerves.env --info)
    nerves_bootstrap| Environment Package List

    Pkg: nerves_system_poky
    Vsn: 2.7.1-5
    Type: system
    BuildRunner: {Nerves.Artifact.BuildRunners.Local, []}

    Pkg: nerves_system_yocto
    Vsn: 1.0.0
    Type: system_platform
    BuildRunner: {nil, []}

|nerves_bootstrap| Loadpaths Start

Nerves environment
MIX_TARGET: lces2
MIX_ENV: dev

NERVES_TOOLCHAIN is unset
|nerves_bootstrap| Environment Variable List
target: lces2
toolchain: /home/motyl/.local/share/nerves/artifacts/nerves_system_poky-portable-2.7.1-5/toolchain/sysroots/x86_64-pokysdk-linux
system: /home/motyl/.local/share/nerves/artifacts/nerves_system_poky-portable-2.7.1-5/toolchain/sysroots/armv7vet2hf-vfpv4d16-poky-linux-gnueabi/../../../
app: /mnt/src/agilis_fw/firmware.cr4

|nerves_bootstrap| Loadpaths End

  • Additional information about your host, target hardware or environment that
    may help

Current behavior

Boot Failed::nerves_runtime exited from {{:shutdown, {:failed_to_start_child, Nerves.Runtime.Log.KmsgTailer, {:enoent, [{:erlang, :open_port, [{:spawn_executable, '/srv/erlang/lib/nerves_runtime-0.11.0/priv/nerves_runtime'}, [{:arg0, "kmsg_tailer"}, {:line, 1024}, :use_stdio, :binary, :exit_status]], []}, {Nerves.Runtime.Log.KmsgTailer, :init, 1, [file: 'lib/nerves_runtime/log/kmsg_tailer.ex', line: 22]}, {:gen_server, :init_it, 2, [file: 'gen_server.erl', line: 374]}, {:gen_server, :init_it, 6, [file: 'gen_server.erl', line: 342]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}}}, {Nerves.Runtime.Application, :start, [:normal, []]}}

Expected behavior

Move nerves_system_shell to a separate repo?

I've heard mixed reviews of nerves_system_shell. My takeaway is that it is not ready for 1.0. The question is whether to fix it or move it to a separate repo. I believe that #3 is the primary complaint, but I believe there are other issues that have not been logged.

@electricshaman Since you were the original contributor, do you still use it or have any plans to improve it?

Support for ping

It would be really helpful to have a simple Elixir-based ping command to verify network connectivity as part of the helpers. I'd be fine with another solution that doesn't involve ICMP, as long as it works against an access point and doesn't require the Internet to be connected (unlike wget'ing some web page).

Option to disable `kmsg` and `syslog` tailer

Current behavior

When starting nerves_runtime on a host machine (for example, when testing NervesHub), the application fails to start:

kmsg_tailer: open /proc/kmsg: Permission denied

13:42:47.743 [info]  Application nerves_runtime exited: Nerves.Runtime.Application.start(:normal, []) returned an error: shutdown: failed to start child: Nerves.Runtime.Log.SyslogTailer
    ** (EXIT) an exception was raised:
        ** (MatchError) no match of right hand side value: {:error, :eaddrinuse}
            (nerves_runtime) lib/nerves_runtime/log/syslog_tailer.ex:26: Nerves.Runtime.Log.SyslogTailer.init/1
            (stdlib) gen_server.erl:374: :gen_server.init_it/2
            (stdlib) gen_server.erl:342: :gen_server.init_it/6
            (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
** (Mix) Could not start application nerves_runtime: Nerves.Runtime.Application.start(:normal, []) returned an error: shutdown: failed to start child: Nerves.Runtime.Log.SyslogTailer
    ** (EXIT) an exception was raised:
        ** (MatchError) no match of right hand side value: {:error, :eaddrinuse}
            (nerves_runtime) lib/nerves_runtime/log/syslog_tailer.ex:26: Nerves.Runtime.Log.SyslogTailer.init/1
            (stdlib) gen_server.erl:374: :gen_server.init_it/2
            (stdlib) gen_server.erl:342: :gen_server.init_it/6
            (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3

Expected behavior

To be able to start nerves_runtime on a host machine

Proposed solution:

Currently one can disable some things with configs such as:

config :nerves_runtime, :kernel,
  autoload_modules: false

I think something like

config :nerves_runtime, Nerves.Runtime.Log,
   kmsg_tailer: false,
   syslog_tailer: false

Would work fine?

After installing and compiling dependencies, got an error starting nerves_runtime

Generated fw_rpi app

20:57:18.689 [info]  mDNSResponder (Engineering Build) (Mar 23 2018 20:56:58) starting

20:57:18.689 [error] bind(listenfd, (struct sockaddr *) &laddr, sizeof(laddr)); failed: 13 (Permission denied)

20:57:18.689 [error] udsserver_init: 13 (Permission denied)

20:57:18.689 [info]  mDNS_AddDNSServer: Lock not held! mDNS_busy (0) mDNS_reentrancy (0)

20:57:18.689 [info]  mDNS_AddDNSServer: Lock not held! mDNS_busy (0) mDNS_reentrancy (0)

20:57:18.689 [info]  mDNSResponder (Engineering Build) (Mar 23 2018 20:56:58) stopping

20:57:18.689 [warn]  mDNS server stopped with exit code 255

20:57:18.696 [error] GenServer Nerves.Dnssd.Daemon terminating
** (stop) {:mdnsd_exited, 255}
Last message: {#Port<0.92184>, {:exit_status, 255}}
State: #Port<0.92184>

** (Mix) Could not start application nerves_runtime: Nerves.Runtime.Application.start(:normal, []) returned an error: shutdown: failed to start child: Nerves.Runtime.Kernel
    ** (EXIT) shutdown: failed to start child: Nerves.Runtime.Kernel.UEvent
        ** (EXIT) an exception was raised:
            ** (ErlangError) Erlang error: :enoent
                :erlang.open_port({:spawn_executable, '/Users/mauricionr/workspace/elixir/brewery/_build/dev/lib/nerves_runtime/priv/uevent'}, [{:args, []}, {:packet, 2}, :use_stdio, :binary, :exit_status])
                (nerves_runtime) lib/nerves_runtime/kernel/uevent.ex:14: Nerves.Runtime.Kernel.UEvent.init/1
                (stdlib) gen_server.erl:365: :gen_server.init_it/2
                (stdlib) gen_server.erl:333: :gen_server.init_it/6
                (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
|nerves_bootstrap| Environment Package List

  No packages found
|nerves_bootstrap| Loadpaths Start

|nerves_bootstrap| Env Start

|nerves_bootstrap| Env End

NERVES_SYSTEM is unset
NERVES_TOOLCHAIN is unset
|nerves_bootstrap| Environment Variable List
  target:     unset
  toolchain:  unset
  system:     unset
  app:        /Users/mauricionr/workspace/elixir/brewery

|nerves_bootstrap| Loadpaths End

Nerves:           0.11.0
Nerves Bootstrap: 0.7.0
Elixir:           1.5.0
|nerves_bootstrap| Info End

Any idea?

Thanks.

Move iex helpers to separate project

In prep for 1.0, I'm very interested in moving features out of our core libraries that either aren't well used or don't quite fit. The iex helpers seem to fit this bill since they're debugging convenience functions. I know that a bunch of us really like a few of these functions. It seems like we could easily move them to a separate repo and pull them in with nerves_init_gadget or another "starter" project. Other advantages would be that we could add more helpers without worrying about bloating everyone's projects since nerves_runtime is so hard to avoid as a dependency.

I'll leave this here for a couple weeks for comments if anyone feels strongly.

Cannot run iex session

I started a fresh project with mix nerves.new, and when I try to run the application within an iex session, nerves_runtime fails. When I remove nerves_runtime from the mixfile, all is good. The link to the docs site seems to be down, so I could not find a way to run the app locally with iex.

Interactive host shell returns to prompt while OS process is still running

Environment

  • Erlang version: 19
  • Elixir version: 1.4.2
  • Operating system: Linux 4.4.43-v7
  • Nerves Environment Info:
Env
  MIX_TARGET:   rpi3
  MIX_ENV:      dev

|nerves_bootstrap| Environment Package List

  Pkg:      nerves_system_br
  Vsn:      0.9.4
  Type:     system_platform
  Provider: [{Nerves.Package.Providers.HTTP, []}, {Nerves.Package.Providers.Local, []}]

  Pkg:      nerves_system_rpi3
  Vsn:      0.11.0
  Type:     system
  Provider: [{Nerves.Package.Providers.HTTP, []}, {Nerves.Package.Providers.Local, []}]

  Pkg:      nerves_toolchain_arm_unknown_linux_gnueabihf
  Vsn:      0.10.0
  Type:     toolchain
  Provider: [{Nerves.Package.Providers.HTTP, []}, {Nerves.Package.Providers.HTTP, []}]

  Pkg:      nerves_toolchain_ctng
  Vsn:      0.9.0
  Type:     toolchain_platform
  Provider: [{Nerves.Package.Providers.HTTP, []}, {Nerves.Package.Providers.Local, []}]

|nerves_bootstrap| Loadpaths Start

|nerves_bootstrap| Env Start

|nerves_bootstrap| Env End

|nerves_bootstrap| Environment Variable List
  target:     rpi3
  toolchain:  /home/jeff/.nerves/artifacts/nerves_toolchain_arm_unknown_linux_gnueabihf-0.10.0.linux-x86_64
  system:     /home/jeff/.nerves/artifacts/nerves_system_rpi3-0.11.0.arm_unknown_linux_gnueabihf
  app:        /home/jeff/Code/seedlings

|nerves_bootstrap| Loadpaths End

Current behavior

The shell prompt returns to host[x+1] even if the executing OS process is still running. Standard output continues to be streamed (which is good, at least you can wait for it to stop). If the OS process requires a signal to exit, you're basically stuck. I don't see this as being a huge problem for anyone right now since you can bail out with Erlang job control but I felt it was worth putting on record in case you ever wanted to bump the shell out of experimental status.

Steps to Reproduce

On a Debian or Ubuntu system, execute sudo apt-get update in the shell.

On any target, execute cat without any arguments. That one is fun.

This is happening because the BEAM port process that gets spawned is executing the target's shell at /bin/sh rather than user commands individually.

Expected behavior

Ideally the shell should block while dumping standard out until the executing OS process has completed. It would also be nice if we could pass signals to the OS process without using the tty that Erlang has already hijacked. Maybe using a prefix key like tmux if that's even possible? I don't know how far down that rabbit hole you want to go. Ultimately Erlang should have priority over the tty since we are piggybacking off of its job control capabilities.

Inconsistent return value for `Nerves.Runtime.KV.get/1` on RPi4 after reboot

Environment

  • Elixir version (elixir -v):
Elixir 1.10.4 (compiled with Erlang/OTP 23)
  • Nerves environment: (mix nerves.env --info)

  Pkg:         nerves_toolchain_ctng
  Vsn:         1.7.2
  Type:        toolchain_platform
  BuildRunner: {nil, []}

  Pkg:         nerves_system_br
  Vsn:         1.12.0
  Type:        system_platform
  BuildRunner: {nil, []}

  Pkg:         nerves_system_rpi4
  Vsn:         1.12.1
  Type:        system
  BuildRunner: {Nerves.Artifact.BuildRunners.Docker, []}

  Pkg:         nerves_toolchain_arm_unknown_linux_gnueabihf
  Vsn:         1.3.2
  Type:        toolchain
  BuildRunner: {Nerves.Artifact.BuildRunners.Local, []}

  • Additional information about your host, target hardware or environment that
    may help

Current behavior

When I set a custom firmware Key/Value pair as a list, it is returned as a list until a reboot.

Nerves.Runtime.KV.put "test", [17,22,27]

After the reboot, what is returned is a BitString

iex(3)> Nerves.Runtime.KV.get "test"                   
<<17, 22, 27>>

Expected behavior

Either prevent me from setting the value of the key as a list or keep returning it as a list.
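
One possible explanation, offered as an assumption and not confirmed in this report, is that the value is written out as raw bytes and read back as a binary, since a list of bytes and the equivalent binary are the same iodata. Until the behavior is settled, a caller could encode non-string values explicitly. The helper below is a hypothetical sketch (the module name and the comma-separated encoding are made up), not part of nerves_runtime.

```elixir
# A list of bytes and a binary hold the same data once flattened:
<<17, 22, 27>> = IO.iodata_to_binary([17, 22, 27])

defmodule KVListCodecSketch do
  @moduledoc "Hypothetical helper: store a list of integers as a string."

  @spec encode([integer()]) :: String.t()
  def encode(list) when is_list(list), do: Enum.map_join(list, ",", &Integer.to_string/1)

  @spec decode(String.t()) :: [integer()]
  def decode(""), do: []
  def decode(string), do: string |> String.split(",") |> Enum.map(&String.to_integer/1)
end

KVListCodecSketch.encode([17, 22, 27])
# => "17,22,27"
```

With this approach, the string returned by encode/1 is what would be passed to Nerves.Runtime.KV.put, and decode/1 would be applied to the result of Nerves.Runtime.KV.get after a reboot.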

crng is slow to initialize on devices where rngd finds no entropy sources

Environment

  • Elixir version (elixir -v): 1.8.1
  • Nerves environment: (mix nerves.env --info)
|nerves_bootstrap| Environment Package List

  Pkg:         kit_x86_64
  Vsn:         1.9.0
  Type:        system
  BuildRunner: {Nerves.Artifact.BuildRunners.Docker, []}

  Pkg:         nerves_toolchain_ctng
  Vsn:         1.5.0
  Type:        toolchain_platform
  BuildRunner: {nil, []}

  Pkg:         nerves_toolchain_x86_64_unknown_linux_gnu
  Vsn:         1.1.0
  Type:        toolchain
  BuildRunner: {Nerves.Artifact.BuildRunners.Local, []}

  Pkg:         nerves_system_br
  Vsn:         1.7.1
  Type:        system_platform
  BuildRunner: {nil, []}

|nerves_bootstrap| Loadpaths Start

Nerves environment
  MIX_TARGET:   intel_STK1A32SC
  MIX_ENV:      dev

|nerves_bootstrap| Environment Variable List
  target:     intel_STK1A32SC
  toolchain:  /Users/troels/.nerves/artifacts/nerves_toolchain_x86_64_unknown_linux_gnu-darwin_x86_64-1.1.0
  system:     /Users/troels/.nerves/artifacts/kit_x86_64-portable-1.9.0
  • Additional information about your host, target hardware or environment that
    may help

Current behavior

I'm working on porting Nerves to the NanoPi Neo2. rngd fails to start on this board (more info in the issue on the system repo) because it finds no sources of entropy. As a result, crng is very slow to initialize, usually taking 3 - 4 minutes.

From what I can find, the usual remedy to these kinds of situations is to use Haveged instead, which can also generate entropy. There's some uncertainty about how secure that is, but I can tell from experiments that it certainly speeds things up (crng is now initialized in less than 20 seconds).

Setting BR2_PACKAGE_HAVEGED=y adds a binary at /usr/sbin/haveged which works much like rngd - it forks and runs as a daemon in the background.

Expected behavior

If rngd fails to start (or has not been included in the system), nerves_runtime should try starting haveged as a fallback. This will enable systems to opt-in to this potentially-weak entropy source by setting BR2_PACKAGE_HAVEGED=y in nerves_defconfig.

Can't run unit tests on OSX

The uevent binary prevents running unit tests on OSX and transitively prevents people from running unit tests on any project that depends on it. Here's the exception:

=INFO REPORT==== 11-Sep-2017::14:26:45 ===
    application: logger
    exited: stopped
    type: temporary
** (Mix) Could not start application nerves_runtime: Nerves.Runtime.Application.start(:normal, []) returned an error: shutdown: failed to start child: Nerves.Runtime.Kernel
    ** (EXIT) shutdown: failed to start child: Nerves.Runtime.Kernel.UEvent
        ** (EXIT) an exception was raised:
            ** (ErlangError) erlang error: :enoent
                :erlang.open_port({:spawn_executable, '/Users/fhunleth/nerves/nerves_firmware_ssh/_build/test/lib/nerves_runtime/priv/uevent'}, [{:args, []}, {:packet, 2}, :use_stdio, :binary, :exit_status])
                (nerves_runtime) lib/nerves_runtime/kernel/uevent.ex:14: Nerves.Runtime.Kernel.UEvent.init/1
                (stdlib) gen_server.erl:365: :gen_server.init_it/2
                (stdlib) gen_server.erl:333: :gen_server.init_it/6
                (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3

Updates to UBoot environment not seen by Nerves.Runtime.KV

Environment

  • nerves_runtime version: 0.9.5

Current behavior

In our Nerves systems we use a key in the UBoot environment to indicate whether the current firmware is known to be valid - when the system boots, it performs a couple of health checks, and then updates the UBoot environment with the pseudo-public function Nerves.Runtime.KV.UBootEnv.put/2 (public but explicitly not documented).

I asked on Slack whether this function is generally unsafe to use (since it's not documented), but @fhunleth replied that it should be OK, as long as its use is limited (conversation). He also noted that the function should be documented.

Functionality to update the UBoot environment should be documented. We could add documentation to Nerves.Runtime.KV.UBootEnv.put/2, but there is an issue with using this when also using Nerves.Runtime.KV.

Changes to the UBoot environment made with Nerves.Runtime.KV.UBootEnv.put/2 are not picked up by Nerves.Runtime.KV. Nerves.Runtime.KV reads the contents of the UBoot environment at boot time and stores this in its state. This state is never updated as long as Nerves.Runtime.KV is alive, and all requests for UBoot metadata are based on this state.

Expected behavior

Nerves.Runtime.KV should have a public (and documented) function for updating the UBoot environment. This function would be a call to the Nerves.Runtime.KV server which calls Nerves.Runtime.KV.UBootEnv.put/2 and updates its state with the new environment.

It might even make sense to make a function that would accept a map (or keyword list) of keys and values to write. Since Nerves.Runtime.KV.UBootEnv.put/2 reads the entire UBoot environment into a map, puts the new key and value and writes it back, it would be easy to handle multiple changes in one write - merge a map of changes into map with current UBoot environment and then perform the write.
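The proposal above could be sketched roughly as follows. This is a minimal sketch, not the actual Nerves.Runtime.KV code: the module name `KVSketch` and the injected `reader` and `writer` functions are hypothetical stand-ins for the real server and for the U-Boot environment read/write calls, so that the example runs without a device.

```elixir
defmodule KVSketch do
  # Hypothetical stand-in for Nerves.Runtime.KV; not the library's actual code.
  use GenServer

  # `reader` returns the current environment as a map; `writer` persists a map.
  # Both are injected so the sketch runs without a real U-Boot environment.
  def start_link(reader, writer) do
    GenServer.start_link(__MODULE__, {reader, writer})
  end

  def get(server, key), do: GenServer.call(server, {:get, key})

  # Accepts a map of changes so several keys are written in one pass.
  def put(server, changes) when is_map(changes) do
    GenServer.call(server, {:put, changes})
  end

  @impl true
  def init({reader, writer}) do
    # The environment is read once at startup and cached in the state.
    {:ok, %{env: reader.(), writer: writer}}
  end

  @impl true
  def handle_call({:get, key}, _from, state) do
    {:reply, Map.get(state.env, key), state}
  end

  def handle_call({:put, changes}, _from, state) do
    new_env = Map.merge(state.env, changes)

    # Only update the cached state if the write succeeded.
    case state.writer.(new_env) do
      :ok -> {:reply, :ok, %{state | env: new_env}}
      {:error, _reason} = error -> {:reply, error, state}
    end
  end
end
```

Because the write and the state update happen inside the same server call, later reads through the server see the new values without re-reading the environment.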

As an alternative to exposing a public function for updating the UBoot environment (to avoid people using it too enthusiastically), a function can be added to Nerves.Runtime.KV to re-read the UBoot environment.

I would be happy to make a PR for this if the proposed solution is OK.

Device properties or child device trees might be overwritten.

nerves_runtime uses a nested map for storing uevent data. However, if part of a child's devpath has the same name as a property key in the parent, one of the two might get overwritten. There are two cases (depending on whether the child or parent event is received first):

IF a (direct) child device's devpath part is the same as an existing property in the parent
AND the child is added after the parent
THEN the parent key property will be overwritten.

IF a (direct) child device's devpath part is the same as an existing property in the parent
AND the parent is added after the child
THEN the child(ren) are replaced by the parent property.

Please note that both cases can only happen if the devpath part has the same name as the property key. This however is outside of nerves_runtime's scope as the event data is defined by Linux. By checking the Linux source very briefly I didn't find any guarantee that this cannot happen. So it's very unlikely but not impossible ...
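Both cases can be reproduced with a small model of a nested map. This is only a sketch of the collision, assuming properties and children share one map per device as described above; the module `UEventCollision`, the devpath parts and the property names are hypothetical and this is not nerves_runtime's actual code.

```elixir
defmodule UEventCollision do
  # Hypothetical model of a nested uevent map; not nerves_runtime's actual code.
  # Properties and child devices share one map per device.

  # Merge a device's properties into the map found at its devpath,
  # creating intermediate maps as needed.
  def add(tree, devpath, props) do
    update_at(tree, devpath, fn existing -> Map.merge(existing, props) end)
  end

  defp update_at(tree, [], fun), do: fun.(tree)

  defp update_at(tree, [part | rest], fun) do
    child =
      case Map.get(tree, part) do
        %{} = map -> map
        # A non-map value here is a parent property that gets clobbered.
        _ -> %{}
      end

    Map.put(tree, part, update_at(child, rest, fun))
  end
end

# Case 1: parent first, then a child whose devpath part equals a parent property key.
case1 =
  %{}
  |> UEventCollision.add(["devices", "foo"], %{"power" => "on"})
  |> UEventCollision.add(["devices", "foo", "power"], %{"subsystem" => "x"})

# The parent's "power" property has been replaced by the child's map.
IO.inspect(case1["devices"]["foo"]["power"])

# Case 2: child first, then the parent; the parent's property replaces the child.
case2 =
  %{}
  |> UEventCollision.add(["devices", "foo", "power"], %{"subsystem" => "x"})
  |> UEventCollision.add(["devices", "foo"], %{"power" => "on"})

IO.inspect(case2["devices"]["foo"]["power"])
```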

Test cases are in my branch under https://github.com/krodelin/nerves_runtime/blob/0198186cdfe3bf5f8d2f1b262468cf9b708d46ba/test/uevent_test.exs#L58-L76

Firmware validation

Post-boot firmware validation plays a vital role in a robust firmware upgrade
strategy. The nerves_runtime project offers suggestions for how users can
perform firmware validation, but leaves implementation to its users. Some
Nerves systems provide out-of-the-box support for firmware validation, but
others do not, and this discrepancy increases the amount of work that must be
done to properly implement firmware validation. This is a proposal to add
standardized procedures around firmware validation to the Nerves framework, in
order to make this powerful feature more accessible to Nerves users.

Firmware validation is input to a decision that must be made at boot time -
which target (kernel and rootfs) to boot. This is a decision that must be made
as early as possible in the boot process, in order to provide protection
against as many firmware faults as possible. The best place to make
the decision is in the bootloader, but bootloaders vary between systems, and
not all are equally capable. This means the method of marking firmware valid
must be provided by the system itself. The most obvious way of doing this is
through fwup scripts.

All the official Nerves systems provide a revert.fw file, and
Nerves.Runtime exposes a revert function to invoke fwup with that
configuration file. In a similar vein, a mark_valid.fw file could be added
to all systems, and a mark_valid function could be added to
Nerves.Runtime. When users have checked that their firmware is valid,
they will invoke the Nerves.Runtime.mark_valid function. If they reboot
without calling this function first, the system will automatically revert
to the previous version of the firmware.

Automatic reverts of bad firmware can be confusing to new users. Thus,
firmware should probably validate itself automatically by default. Users then
need a way to disable firmware autovalidation. I suggest letting users
control this by setting an environment variable (e.g.
NERVES_FW_AUTOVALIDATE) at firmware creation time.

In order to make the decision about which target to boot, a few facts must be
known. These facts must be kept across reboots as persistent variables:

  • What is the intended target? nerves_fw_active
  • Has the intended target been booted? nerves_fw_booted
  • Has the intended target been validated? nerves_fw_validated

The variables will be modified at different points during the firmware lifecycle:

  1. When firmware is burned onto the device for the first time, it will be assumed valid:
    • nerves_fw_active = a
    • nerves_fw_booted = 0
    • nerves_fw_validated = 1
  2. When firmware is upgraded, the active target changes and the validation state is reset:
    • nerves_fw_active = b
    • nerves_fw_booted = 0
    • nerves_fw_validated = 0
  3. When the system is rebooted for the first time after upgrading, the boot attempt is recorded:
    • nerves_fw_booted = 1
  4. Once booted, the users' Nerves application will perform validation checks:
    1. If the firmware validation checks succeed, the firmware is marked as valid:
      • nerves_fw_validated = 1
    2. If the firmware validation checks fail, the system will reboot without marking the firmware valid, and the boot process will revert to the previous target
      • nerves_fw_active = a
      • nerves_fw_validated = 1

How exactly this is done varies from system to system, but the methodology is
the same for all systems.
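The lifecycle above can be modeled as a few pure functions over the three variables. This is a sketch under stated assumptions: the module `FwValidationSketch` and its function names are hypothetical, the variables are held in a plain map rather than in bootloader storage, and the revert rule (booted once but never validated) is my reading of steps 3 and 4.

```elixir
defmodule FwValidationSketch do
  # Hypothetical model of the proposed variables. Real systems keep these in
  # bootloader-specific storage and decide in the bootloader or an early script.

  # Step 2: an upgrade is written to the other target and validation is reset.
  def upgrade(%{active: active} = vars) do
    %{vars | active: other(active), booted: 0, validated: 0}
  end

  # Decision made on every boot.
  def boot(%{booted: 1, validated: 0, active: active} = vars) do
    # Step 4.2: already booted once and never validated, so revert.
    %{vars | active: other(active), validated: 1}
  end

  # Step 3: record the boot attempt.
  def boot(vars), do: %{vars | booted: 1}

  # Step 4.1: the application's checks passed.
  def mark_valid(vars), do: %{vars | validated: 1}

  defp other("a"), do: "b"
  defp other("b"), do: "a"
end
```

Starting from `%{active: "a", booted: 0, validated: 1}`, calling `upgrade/1`, then `boot/1` twice without `mark_valid/1` ends up back on target "a".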

If users have not disabled automatic firmware validation, the firmware will
validate itself. This can be done by setting nerves_fw_validated = 1 in the
firmware upgrade task or on the first boot.

Also, note that firmware upgrade scripts should require current firmware to be
marked valid before performing an upgrade. This is to ensure a safe fallback
in case of a bad upgrade.

Bootloader logic for checking validation is already implemented in
nerves_system_bbb.
Here, firmware is marked valid by setting a U-Boot variable. However, this
approach to marking firmware valid is not usable for all Nerves systems (see
Grub2 below). Also, it is not possible for users to disable autovalidation at
firmware creation time. Thus I suggest adding a fwup file to mark firmware
valid, and adding an environment variable to disable autovalidation at
firmware creation time.

Very similar logic for validation checks can be performed in systems which use
the Grub2 bootloader (e.g.
nerves_system_x86_64).
However, Grub2 uses the Grub environment block (a file with a certain
structure) rather than the U-Boot environment. This means implementation will
be different. In fact, it requires jumping through a few hoops since fwup
can't modify the Grub environment block. But it can be done, and the interface
can be the same: a fwup file that marks firmware valid, and an environment
variable to disable autovalidation at firmware creation time.

The Raspberry Pi bootloader is, unfortunately, unable to modify its boot
environment. Because of this, the decision about which firmware to boot can
only be made after booting. This can be done with a shell script evaluated by
erlinit prior to booting the Nerves application. There is also another reason
why this is necessary: On RPi systems we need to rewrite the MBR in order to
revert the firmware. This means the Raspberry Pi will not be as capable in
terms of reverting bad firmware - if the system crashes before erlinit can
invoke the shell script, the system will be caught in an endless reboot loop.
Unfortunately, I don't see a way out of this due to the design of its
bootloader. Apart from this deviation, the interface to firmware validation can
be the same: a fwup file that marks firmware valid, and an environment
variable to disable autovalidation at firmware creation time.

I would like to hear your opinions on this proposal. To check that the idea is
viable, I have partially complete implementations of this for Grub2 and
Raspberry Pi based systems. If you want to see the changes, I can submit draft
PRs for you to check out. If there's a consensus that this is a good idea I'd be
happy to move forward with the implementations.

Stopping and restarting causes zombie kmsg_tailer processes

Environment

Elixir 1.9

Current behavior

iex(13)> Application.stop(:nerves_runtime)
:ok
iex(14)> Application.ensure_all_started(:nerves_runtime)
{:ok, [:nerves_runtime]}
iex(15)> cmd('ps | egrep "kmsg_tailer"')                
  123 root      1668 S    {nerves_runtime} kmsg_tailer
  369 root      1668 S    {nerves_runtime} kmsg_tailer
  461 root      1668 S    {nerves_runtime} kmsg_tailer
  517 root      2004 S    /bin/sh -c ps | egrep "kmsg_tailer"
  520 root      2004 S    egrep kmsg_tailer
0

Rewrite log tailer in pure Elixir

This is now possible in OTP 21 since Erlang's File I/O can read special files. It would be nice to simplify the code in this area even though it's currently working.
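If the tailer were moved into Elixir, each record read from the kernel log would still need to be parsed. Here is a minimal sketch of that step, assuming records in the /dev/kmsg layout `priority,sequence,timestamp_us,flags;message`; the module name `KmsgLineSketch` is hypothetical and continuation lines are not handled.

```elixir
defmodule KmsgLineSketch do
  import Bitwise

  # Parses one record of the form "prio,seq,timestamp_us,flags;message".
  # The priority value combines facility (upper bits) and severity (low 3 bits).
  def parse(line) do
    with [prefix, message] <- String.split(line, ";", parts: 2),
         [prio, seq, ts | _flags] <- String.split(prefix, ","),
         {prio, ""} <- Integer.parse(prio),
         {seq, ""} <- Integer.parse(seq),
         {ts, ""} <- Integer.parse(ts) do
      {:ok,
       %{
         facility: prio >>> 3,
         severity: prio &&& 7,
         sequence: seq,
         timestamp_us: ts,
         message: String.trim_trailing(message, "\n")
       }}
    else
      _ -> {:error, :unrecognized}
    end
  end
end
```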

Support kernel panic, watchdog initiated reboots and Elixir logs from before a reboot

Linux's pstore driver lets you designate an area of DRAM for the final logs, panic messages, etc. that happened before a reboot. Since it's stored in DRAM, it doesn't survive a power failure, but it survives reboots easily and watchdog timeouts don't seem to be an issue. To use this feature, the Nerves systems need to have the pstore driver enabled. This can be detected at run-time. If it is enabled, then it would be super-convenient to have helper methods to get access to this info. I'm thinking that this is convenient enough with a trivial footprint that we should include it here in nerves_runtime.
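A helper along these lines might look like the following sketch. It assumes pstore is mounted at `/sys/fs/pstore` (the usual Linux mount point; whether a given Nerves system mounts it there is an assumption) and the module name `PstoreSketch` is hypothetical. A missing directory is treated as "pstore not available", which doubles as the run-time detection.

```elixir
defmodule PstoreSketch do
  # Usual mount point for the pstore filesystem on Linux; whether a given
  # Nerves system mounts it there is an assumption.
  @default_dir "/sys/fs/pstore"

  # Returns {:ok, %{filename => contents}} or {:error, :unavailable}.
  def read_all(dir \\ @default_dir) do
    case File.ls(dir) do
      {:ok, names} ->
        # Skip entries that cannot be read rather than failing the whole call.
        logs =
          for name <- names,
              {:ok, contents} <- [File.read(Path.join(dir, name))],
              into: %{},
              do: {name, contents}

        {:ok, logs}

      {:error, _reason} ->
        {:error, :unavailable}
    end
  end
end
```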

Nerves.Runtime.KV looks for fw_printenv in host mode

I suspect a lot of folks are going to develop and test their web applications primarily in host mode with MIX_TARGET=host.

H:MM:SS.SSS [warn] Nerves.Runtime.KV could not find executable fw_printenv

That message and also Cannot parse config file: No such file or directory are logged every time I start the firmware application in host mode.

It would be nice if we could hide these fw_printenv related messages while running in host mode.

init.ex documentation question

Module doc in init.ex states:

Since one would expect this to be rare, Nerves systems create firmware images with the do not initialize the application partition so that this code is regularly exercised.

Does it mean "...create firmware images with un-initialized application partitions..."?

UEvent property map contains non-binary attributes

Coming from a udev background one might assume that all properties are strings. This is not true for parent devices. They contain non-property keys named after the child dev path element whose values are the child properties.
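To illustrate the gotcha, here is a small sketch that separates string-valued properties from nested child maps. The map shape follows the description above; the module name `UEventPropsSketch` and the example keys are hypothetical.

```elixir
defmodule UEventPropsSketch do
  # Splits a device map into string-valued properties and nested child maps.
  def split(device) when is_map(device) do
    {children, props} =
      Enum.split_with(device, fn {_key, value} -> is_map(value) end)

    %{properties: Map.new(props), children: Map.new(children)}
  end
end

# Illustrative parent device with one child keyed by its devpath element.
device = %{
  "subsystem" => "usb",
  "1-1" => %{"subsystem" => "usb", "devtype" => "usb_device"}
}

IO.inspect(UEventPropsSketch.split(device))
```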

IMHO this is not a "real" issue - just a "gotcha" if it's not documented.

I added test cases for this behaviour in my branch: https://github.com/krodelin/nerves_runtime/blob/0198186cdfe3bf5f8d2f1b262468cf9b708d46ba/test/uevent_test.exs#L38-L56

Add utility functions for getting information from nerves_heart

nerves_heart provides useful information about the hardware watchdog timer and why a reboot happened. It's exposed through the :heart API, but due to limitations in the API, it's a little inconvenient and hard to discover.

Some ideas for features to expose:

  • Is nerves_heart running? The answer should always be yes on Nerves, but sometimes I forget to start it when porting to a new platform. This may only be useful to me.
  • Why did the system reboot?
  • Hardware watchdog timeout
  • Hardware watchdog time left
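A possible shape for such helpers is sketched below. It assumes that nerves_heart reports its status as newline-separated `key=value` text through `:heart.get_cmd/0`; the module name `HeartInfoSketch` and any key names used with it (for example `wdt_timeout`) are assumptions, not a documented interface.

```elixir
defmodule HeartInfoSketch do
  # Parses newline-separated "key=value" text into a map of strings.
  def parse(text) when is_binary(text) do
    text
    |> String.split("\n", trim: true)
    |> Enum.flat_map(fn line ->
      case String.split(line, "=", parts: 2) do
        [key, value] -> [{key, value}]
        # Ignore lines that are not key=value pairs.
        _ -> []
      end
    end)
    |> Map.new()
  end

  # Asks the heart process for its status text, if heart is running at all.
  def status do
    if Process.whereis(:heart) do
      {:ok, cmd} = :heart.get_cmd()
      {:ok, parse(to_string(cmd))}
    else
      {:error, :heart_not_running}
    end
  end
end
```

The `Process.whereis/1` check is there because calling into `:heart` when it was never started fails, which also gives a cheap answer to the "is it running?" question.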

Add common API for getting the serial number

Serial numbers are stored in many ways on Nerves devices and there's currently no way of getting the device's serial number generically.

To fix this, add a Nerves.Runtime.serial_number/0 function and update all documentation to refer to it as the official way of getting the device's serial number. (The actual function name could be changed if desired.)

The easiest way to implement this would be to call out to boardid since all official Nerves systems and many non-official ones use it already.
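Calling out to boardid could look roughly like this. It is a sketch only: the module name `SerialNumberSketch` is hypothetical, the `run` argument exists just so the function can be exercised without boardid installed, and it assumes boardid prints the serial number on standard output when run without arguments.

```elixir
defmodule SerialNumberSketch do
  # `run` is injectable so the sketch can be exercised without `boardid` installed.
  def serial_number(run \\ &System.cmd/2) do
    case run.("boardid", []) do
      {output, 0} -> {:ok, String.trim(output)}
      {_output, status} -> {:error, {:exit_status, status}}
    end
  rescue
    # System.cmd/2 raises an ErlangError if the executable cannot be found.
    error in ErlangError -> {:error, error.original}
  end
end
```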

As part of this effort, the documentation referring to storing serial numbers in the U-Boot environment block needs to be updated. While it's an option, it would be better if Nerves users were pointed to write-once options.

Break apart nerves_runtime into separate libraries

nerves_runtime has always been a catch-all place for code that needs to run on all devices. This, however, makes it hard to experiment with and advance some functionality. Here's a list of what it currently does:

  1. Route kernel and syslog messages through the Elixir logger
  2. Implement a small key-value store in U-Boot
  3. Process UEvent messages to automatically load Linux kernel modules when needed
  4. Populate system registry with device insertions
  5. Contain a handful of helper functions for getting information about a device
  6. Format and mount the data partition the first time and if it gets too corrupted

Some of these are difficult to unit test. Others automatically start and need to be disabled when testing any code that depends on :nerves_runtime. Much of the code dates from before Elixir 1.0 and you can see its age. I have felt that the lack of unit tests combined with the central importance of this code have kept me from making large modifications.

I'd like to propose that we opportunistically split :nerves_runtime up into smaller libraries. Breaking out logging looks especially easy. Since we're getting close to having a SystemRegistry replacement, that seems like the next easiest split. Moving UEvent processing out would be great since it would enable people to use udevd for more complicated Nerves setups without the wart of running two UEvent processors on the device and having them race each other.

Thoughts?

Add utility function for resetting the application data partition

To implement factory reset, the Nerves documentation recommends storing user settings and data on the application partition and making it so that an empty data partition is the factory defaults. That means that resetting to factory defaults is a matter of corrupting the application data partition so that it gets reformatted on the next boot.

This can be done using dd like dd if=/dev/zero of=/dev/mmcblk0p3 bs=128K count=1 to write zeros over the first 128KB. (mmcblk0p3 is for the RPi4).

It would certainly be nicer if there were a helper function to do this since it could be done better than the dd line, the partition could be figured out automatically, and a reboot could be forced.
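The zeroing step of such a helper could be sketched as below. The module name `FactoryResetSketch` is hypothetical, and finding the application partition and rebooting afterwards are left out; the function just takes the device path as an argument.

```elixir
defmodule FactoryResetSketch do
  @wipe_bytes 128 * 1024

  # Overwrites the first 128 KiB of `device_path` with zeros, like the dd line.
  # Opening with [:read, :write] avoids truncating a regular file.
  def wipe_header(device_path) do
    zeros = :binary.copy(<<0>>, @wipe_bytes)

    with {:ok, io} <- File.open(device_path, [:read, :write, :binary]) do
      # Write at offset 0 and report :ok or {:error, reason}.
      result = :file.pwrite(io, 0, zeros)
      File.close(io)
      result
    end
  end
end
```

On a device this would be followed by a reboot so that the partition is reformatted on the next boot.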
