Erlang socket module for SocketCAN on Nerves device

Hi all,

I have a Raspberry Pi Nerves device to which I have attached an MCP2515 SPI-to-CAN transceiver board. I can get this board to work fine with regular Raspbian, where I just send commands using the candump and cansend utilities from the can-utils tool suite. I’ve got to the point where I have a slightly customized Buildroot image of the standard rpi3 firmware that enables the MCP2515 board and includes the SocketCAN kernel modules (heavily based on the work done by @brien and his 2018 talk titled “Customize Your Car: An Adventure in Using Elixir and Nerves to Hack Your Vehicle’s Electronics Network”. Huge thank you :pray:)

So far, so good. I was testing using the ng_can library, and was indeed able to read the data. However, I have found the library to be a little less flexible than I need for my application. In particular, it sets a fixed CAN bus bitrate at initialization. Additionally, the library hasn’t been updated in several years, and I’d like to avoid pulling in the C NIF dependencies. Plus, I’ve been looking for an excuse to finally create and publish my own library, and this is one that would help me better cement my understanding of several OTP concepts.

So I’m thinking about trying to create a pure-elixir library for SocketCAN. I was browsing this thread on community automotive projects, which hints that this should be possible with just the :socket module from Erlang.

Here’s where my trouble starts: I’ll be the first to admit that the Erlang documentation is daunting to me, having spent all my time in Elixir thus far. I can’t quite understand why I can’t get the socket connection to the system SocketCAN to work, or whether I am even approaching it in the right way. Maybe the problem is that it isn’t even supported on my hardware for some reason, though theoretically it should be, since I have the kernel modules properly installed (ng_can does work).

Some reference: the SocketCAN docs say that the socket connection can be opened in C with s = socket(PF_CAN, SOCK_RAW, CAN_RAW);. This is pretty much the same thing for the python-can library implementation, which is sock = socket.socket(constants.PF_CAN, socket.SOCK_RAW, constants.CAN_RAW)

I think my main confusion is how to emulate these parameters in :socket.open/3 (though maybe I need :socket.open/4?).

Below are a couple of things I’ve tried, plus some extra info.

Also, I tried to join the Elixir slack to be part of the Nerves community there, but I can’t find any valid join link. If someone could shoot one my way I would be grateful :slight_smile:

Thank you!
Gus

iex(nerves@nerves.local)1> :socket.open(:local, :raw, :pf_can)
{:error, {:invalid, {:protocol, :pf_can}}}
iex(nerves@nerves.local)2> cat "/etc/protocols"               
# Internet (IP) protocols
#
# Updated from http://www.iana.org/assignments/protocol-numbers and other
# sources.

ip      0       IP              # internet protocol, pseudo protocol number
hopopt  0       HOPOPT          # IPv6 Hop-by-Hop Option [RFC1883]
icmp    1       ICMP            # internet control message protocol
igmp    2       IGMP            # Internet Group Management
ggp     3       GGP             # gateway-gateway protocol
ipencap 4       IP-ENCAP        # IP encapsulated in IP (officially ``IP'')
st      5       ST              # ST datagram mode
tcp     6       TCP             # transmission control protocol
egp     8       EGP             # exterior gateway protocol
igp     9       IGP             # any private interior gateway (Cisco)
pup     12      PUP             # PARC universal packet protocol
udp     17      UDP             # user datagram protocol
hmp     20      HMP             # host monitoring protocol
xns-idp 22      XNS-IDP         # Xerox NS IDP
rdp     27      RDP             # "reliable datagram" protocol
iso-tp4 29      ISO-TP4         # ISO Transport Protocol class 4 [RFC905]
dccp    33      DCCP            # Datagram Congestion Control Prot. [RFC4340]
xtp     36      XTP             # Xpress Transfer Protocol
ddp     37      DDP             # Datagram Delivery Protocol
idpr-cmtp 38    IDPR-CMTP       # IDPR Control Message Transport
ipv6    41      IPv6            # Internet Protocol, version 6
ipv6-route 43   IPv6-Route      # Routing Header for IPv6
ipv6-frag 44    IPv6-Frag       # Fragment Header for IPv6
idrp    45      IDRP            # Inter-Domain Routing Protocol
rsvp    46      RSVP            # Reservation Protocol
gre     47      GRE             # General Routing Encapsulation
esp     50      IPSEC-ESP       # Encap Security Payload [RFC2406]
ah      51      IPSEC-AH        # Authentication Header [RFC2402]
skip    57      SKIP            # SKIP
ipv6-icmp 58    IPv6-ICMP       # ICMP for IPv6
ipv6-nonxt 59   IPv6-NoNxt      # No Next Header for IPv6
ipv6-opts 60    IPv6-Opts       # Destination Options for IPv6
rspf    73      RSPF CPHB       # Radio Shortest Path First (officially CPHB)
vmtp    81      VMTP            # Versatile Message Transport
eigrp   88      EIGRP           # Enhanced Interior Routing Protocol (Cisco)
ospf    89      OSPFIGP         # Open Shortest Path First IGP
ax.25   93      AX.25           # AX.25 frames
ipip    94      IPIP            # IP-within-IP Encapsulation Protocol
etherip 97      ETHERIP         # Ethernet-within-IP Encapsulation [RFC3378]
encap   98      ENCAP           # Yet Another IP encapsulation [RFC1241]
#       99                      # any private encryption scheme
pim     103     PIM             # Protocol Independent Multicast
ipcomp  108     IPCOMP          # IP Payload Compression Protocol
vrrp    112     VRRP            # Virtual Router Redundancy Protocol [RFC5798]
l2tp    115     L2TP            # Layer Two Tunneling Protocol [RFC2661]
isis    124     ISIS            # IS-IS over IPv4
sctp    132     SCTP            # Stream Control Transmission Protocol
fc      133     FC              # Fibre Channel
mobility-header 135 Mobility-Header # Mobility Support for IPv6 [RFC3775]
udplite 136     UDPLite         # UDP-Lite [RFC3828]
mpls-in-ip 137  MPLS-in-IP      # MPLS-in-IP [RFC4023]
manet   138                     # MANET Protocols [RFC5498]
hip     139     HIP             # Host Identity Protocol
shim6   140     Shim6           # Shim6 Protocol [RFC5533]
wesp    141     WESP            # Wrapped Encapsulating Security Payload
rohc    142     ROHC            # Robust Header Compression
iex(nerves@nerves.local)3> :socket.supports()
[
  ioctl_requests: [
    siftxqlen: true,
    sifmtu: true,
    sifdstaddr: true,
    sifbrdaddr: true,
    sifaddr: true,
    giftxqlen: true,
    gifnetmask: true,
    gifname: true,
    gifmtu: true,
    gifmap: true,
    gifindex: true,
    gifhwaddr: true,
    gifflags: true,
    gifdstaddr: true,
    gifconf: true,
    gifbrdaddr: true,
    gifaddr: true,
    sifflags: true
  ],
  ioctl_flags: [
    staticarp: false,
    slave: true,
    simplex: false,
    renaming: false,
    promisc: true,
    ppromisc: false,
    portsel: true,
    pointopoint: true,
    oactive: false,
    notrailers: true,
    noarp: true,
    master: true,
    lower_up: false,
    link2: false,
    link1: false,
    link0: false,
    knowsepoch: false,
    echo: false,
    dynamic: true,
    dying: false,
    dormant: false,
    cantconfig: false,
    automedia: true,
    allmulti: true,
    nogroup: false,
    multicast: true,
    up: true,
    loopback: true,
    broadcast: true,
    debug: true,
    running: true,
    monitor: false
  ],
  options: [
    {{:ipv6, :authhdr}, true},
    {{:ipv6, :use_min_mtu}, false},
    {{:socket, :busy_poll}, false},
    {{:IPv6, :pktoptions}, false},
    {{:IP, :options}, false},
    {{:IP, :transparent}, true},
    {{:ipv6, :recvpktinfo}, true},
    {{:socket, :acceptconn}, true},
    {{:IP, :dontfrag}, false},
    {{:IP, :recvif}, false},
    {{:IPv6, :router_alert}, true},
    {{:ip, :multicast_if}, true},
    {{:IPv6, :recvtclass}, true},
    {{:ip, :freebind}, true},
    {{:socket, :domain}, true},
    {{:socket, :reuseaddr}, true},
    {{:ipv6, :portrange}, false},
    {{:tcp, :maxseg}, true},
    {{:IPv6, :drop_membership}, true},
    {{:ipv6, :unicast_hops}, true},
    {{:IP, :recvdstaddr}, false},
    {{:tcp, :info}, false},
    {{:IP, :multicast_if}, true},
    {{:IP, :hdrincl}, true},
    {{:ipv6, :faith}, false},
    {{:IPv6, :tclass}, true},
    {{:ipv6, :rthdr}, true},
    {{:TCP, :maxseg}, true},
    {{:ip, :sendsrcaddr}, false},
    {{:ipv6, :addrform}, true},
    {{:socket, :rxq_ovfl}, false},
    {{:ipv6, :flowinfo}, false},
    {{:ip, :dontfrag}, false},
    {{:socket, :dontroute}, true},
    {{:IPv6, :recvpktinfo}, true},
    {{:IP, :tos}, true},
    {{:ipv6, :recvhoplimit}, true},
    {{:ipv6, :add_membership}, true},
    {{:IPv6, :checksum}, false},
    {{:IPv6, :rthdr}, true},
    {{:IPv6, :dstopts}, true},
    {{:ipv6, :multicast_if}, true},
    {{:socket, :priority}, true},
    {{:IPv6, :v6only}, true},
    {{:ipv6, ...}, true},
    {{...}, ...},
    {...},
    ...
  ],
  msg_flags: [
    peek: true,
    oob: true,
    nosignal: true,
    errqueue: true,
    eor: true,
    ctrunc: true,
    confirm: true,
    cmsg_cloexec: true,
    dontroute: true,
    trunc: true,
    more: true
  ],
  protocols: [
    DCCP: true,
    hopopt: true,
    egp: true,
    icmp: true,
    EGP: true,
    wesp: true,
    udp: true,
    RSPF: true,
    HOPOPT: true,
    skip: true,
    CPHB: true,
    vmtp: true,
    "IPv6-Opts": true,
    ROHC: true,
    "mpls-in-ip": true,
    RDP: true,
    fc: true,
    IPv6: true,
    PUP: true,
    PIM: true,
    ipcomp: true,
    IDRP: true,
    igp: true,
    GGP: true,
    L2TP: true,
    "IPSEC-ESP": true,
    ah: true,
    EIGRP: true,
    sctp: true,
    WESP: true,
    "ipv6-icmp": true,
    ISIS: true,
    UDPLite: true,
    "IPv6-NoNxt": true,
    idrp: true,
    "IPv6-ICMP": true,
    ICMP: true,
    ST: true,
    TCP: true,
    vrrp: true,
    rsvp: true,
    RSVP: true,
    ggp: true,
    "iso-tp4": true,
    DDP: true,
    ...
  ],
  sctp: false,
  ipv6: true,
  local: true,
  netns: true,
  sendfile: true
]
iex(nerves@nerves.local)4> :socket.open(29, :raw)             
{:error, :eprotonosupport}
iex(nerves@nerves.local)5> "29 is the value defined for PF_CAN in socket.h"
"29 is the value defined for PF_CAN in socket.h"
iex(nerves@nerves.local)6> :socket.open(29, :raw, :local)
{:error, {:invalid, {:protocol, :local}}}
iex(nerves@nerves.local)7> :socket.open(29, :raw, :ip)   
{:error, :eprotonosupport}
iex(nerves@nerves.local)8> :socket.open(29, :raw, :ipv6)
{:error, :einval}
iex(nerves@nerves.local)9> :socket.open(29, :raw, :tcp) 
{:error, :eprotonosupport}
iex(nerves@nerves.local)10> :socket.open(29, :raw, :udp)
{:error, :einval}
iex(nerves@nerves.local)11>


So, a few things:

  • Erlang does not expose PF_CAN for you. You need to check your OS and extract the respective integer value on your own. I know this can be daunting, but with Python you should be able to do it. I also looked it up in the SocketCAN project and found that it is equal to 29.
  • It does not expose the CAN_RAW protocol value either, so you need to use the integer value there as well. Some digging in the Linux headers shows that its value is 1.

Now we need to apply it to our code:

defmodule GenCAN do
  @pf_can 29

  @can_raw 1

  # You probably can come up with more flexible implementation than that
  def open, do: :socket.open(@pf_can, :raw, @can_raw)
end

:socket.open(29, :raw, 1) does the trick!

I tried so many options for the protocol as the third argument, but didn’t realize that I should be looking at the integer values to replace CAN_RAW.

Thanks for the suggestion!

Any chance you would also be able to help me understand the Erlang errors when trying to bind the socket? In the SocketCAN docs, it shows the following C example:

int s;
struct sockaddr_can addr;
struct ifreq ifr;

s = socket(PF_CAN, SOCK_RAW, CAN_RAW);

strcpy(ifr.ifr_name, "can0" );
ioctl(s, SIOCGIFINDEX, &ifr);

addr.can_family = AF_CAN;
addr.can_ifindex = ifr.ifr_ifindex;

bind(s, (struct sockaddr *)&addr, sizeof(addr));

// ...

I can successfully open the socket as per above, but I’ve been struggling as well to get the socket to bind. The Erlang docs show the following for bind:

bind(Socket, Addr) -> ok | {error, Reason}

spec for Addr:

Addr = sockaddr() | any | broadcast | loopback

sockaddr() = 
    sockaddr_in() |
    sockaddr_in6() |
    sockaddr_un() |
    sockaddr_ll() |
    sockaddr_dl() |
    sockaddr_unspec() |
    sockaddr_native()


sockaddr_native() = #{family := integer(), addr := binary()}

I’m pretty certain I need to use sockaddr_native, and the integer value should be the same as AF_CAN as per the C example (value is 29). So I’ve tried the following:

{:ok, {:"$socket", sock}} = :socket.open(29, :raw, 1)
:socket.bind(sock, {:sockaddr_native, 29, "can0"}) # error
:socket.bind(sock, {:sockaddr_native, 29, 'can0'}) # error
:socket.bind(sock, {:sockaddr_native, 29, <<"can0">>}) # error
:socket.bind(pid, {29, <<"can0">>}) # error

The error in question for each:

** (ArgumentError) argument error
    (kernel 9.0.2) socket.erl:1547: :socket.bind(#Reference<0.2860049625.78249987.217043>, {:sockaddr_native, 29, "can0"})
    iex:7: (file)

From my understanding, the sockaddr_native is an Erlang record, which in Elixir is a tagged tuple. The code below seems to confirm that I’ve got that part right.

Record.is_record({:sockaddr_native, 29, "can0"}) # true

Some help to enlighten me on what I am missing with this Erlang function call (and more generally, how to better understand how to use Erlang from Elixir) would be helpful, thanks! :slight_smile:

Gus

Some more digging:

socket:bind/2 seems to call prim_socket:bind/2, which is defined below:

bind(SockRef, Addr) ->
    try
        enc_sockaddr(Addr)
    of
        EAddr ->
            case nif_bind(SockRef, EAddr) of
                {invalid, Reason} ->
                    case Reason of
                        sockaddr ->
                            {error, {invalid, {Reason, Addr}}}
                    end;
                Result -> Result
            end
    catch
        throw : Reason ->
            {error, Reason}
    end.

Presumably the enc_sockaddr/1 call is throwing the error…

enc_sockaddr(#{family := inet} = SockAddr) ->
    merge_sockaddr(?ESOCK_SOCKADDR_IN_DEFAULTS, SockAddr);
enc_sockaddr(#{family := inet6} = SockAddr) ->
    merge_sockaddr(?ESOCK_SOCKADDR_IN6_DEFAULTS, SockAddr);
enc_sockaddr(#{family := local, path := Path} = SockAddr) ->
  if
      is_list(Path), 0 =< length(Path), length(Path) =< 255 ->
          BinPath = enc_path(Path),
          enc_sockaddr(SockAddr#{path => BinPath});
      is_binary(Path), 0 =< byte_size(Path), byte_size(Path) =< 255 ->
          merge_sockaddr(?ESOCK_SOCKADDR_LOCAL_DEFAULTS, SockAddr);
      true ->
          %% Neater than an if clause
          throw({invalid, {sockaddr, path, SockAddr}})
  end;
enc_sockaddr(#{family := local} = SockAddr) ->
    %% Neater than a function clause
    throw({invalid, {sockaddr, path, SockAddr}});
enc_sockaddr(#{family := unspec} = SockAddr) ->
    merge_sockaddr(?ESOCK_SOCKADDR_UNSPEC_DEFAULTS, SockAddr);
enc_sockaddr(#{family := Native} = SockAddr) when is_integer(Native) ->
    merge_sockaddr(?ESOCK_SOCKADDR_NATIVE_DEFAULTS, SockAddr);
enc_sockaddr(#{family := _} = SockAddr) ->
    SockAddr;
enc_sockaddr(#{} = SockAddr) ->
    throw({invalid, {sockaddr, family, SockAddr}});
enc_sockaddr(SockAddr) ->
    %% Neater than a function clause
    erlang:error({invalid, {sockaddr, SockAddr}}).

merge_sockaddr(Default, SockAddr) ->
    case
        maps:fold(
          fun (Key, _, Acc) ->
                  if
                      is_map_key(Key, Default) ->
                          Acc;
                      true ->
                          [Key | Acc]
                  end
          end, [], SockAddr)
    of
        [] ->
            maps:merge(Default, SockAddr);
        InvalidKeys ->
            throw({invalid, {sockaddr, {keys,InvalidKeys}, SockAddr}})
    end.

#{ ... snip ... } is a map, not a record. Use %{ :family => 29, :addr => address } instead.

Note that address has to be a valid struct sockaddr_can in binary form, just as it would appear in memory for the bind(2) call.

__kernel_sa_family_t seems to be a short, so 16 bits. You then have an int which is 32 bits, but alignment restrictions most likely mean that it’s preceded by 16 bits of padding. There’s also tail padding to consider, but I’m not sure how much that will be in this case.

<<family::size(16)-little,
  0::size(16)-little, # Padding
  ifIndex::size(32)-little,
  rxId::size(32)-little,
  txId::size(32)-little,
  0::size(40)>> # Experiment until it works?

Wow, this got a lot deeper into the weeds than I was expecting/realized! Big thanks for the reply and pointing me in the right direction.

For my future reference, can you clarify what is the difference between records and maps in the Erlang spec syntax? I was getting really tripped up on that part. If I were to guess, it would be that records have the name after the # but before the braces? And also the => operator.

Map: #{Field1=>Value1, ..., FieldN=>ValueN}
Record: #Name{Field1=Value1, ..., FieldN=ValueN}

I also guess part of my original confusion is that I’m now seeing that map pattern matching syntax is the := operator, which is different from Elixir. Maybe that’s why I started to look at records instead.

#{family := integer(), addr := binary()}
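For reference, a minimal sketch of how the two Erlang notations look from Elixir. An Erlang map `#{...}` is an Elixir `%{...}` map, while an Erlang record `#name{...}` compiles to a tagged tuple. In Erlang type specs, `:=` marks a mandatory key and `=>` an optional one. The record name `sockaddr_native` below is only used for illustration; as far as this thread shows, `:socket` expects the map form.

```elixir
require Record

# Erlang map:  #{family => 29, addr => <<0, 0>>}
map_addr = %{family: 29, addr: <<0, 0>>}

# Erlang record (hypothetical here):  #sockaddr_native{family = 29, addr = <<0, 0>>}
# A record is just a tuple whose first element is the record name.
record_like = {:sockaddr_native, 29, <<0, 0>>}

IO.inspect(is_map(map_addr))               # true
IO.inspect(Record.is_record(record_like))  # true
IO.inspect(is_map(record_like))            # false

# The keyword-style and arrow-style map literals are the same term.
IO.inspect(map_addr == %{:family => 29, :addr => <<0, 0>>})  # true
```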

Anyways, I also realized that I was having other syntax issues in the code posted above: I was pattern matching and extracting the individual fields of the socket returned by :socket.open/3, like so:

Incorrect: {:ok, {:"$socket", sock}} = :socket.open(29, :raw, 1)
Correct: {:ok, sock} = :socket.open(29, :raw, 1)

This is where my argument error was coming from. You also enlightened me that the addr for the sockaddr_native type is a binary, and that I shouldn’t pass in the "can0" binary but rather the whole struct sockaddr_can data!

I played with it a little bit, and was able to get it to bind correctly (the difference was the padding, which I guess is not going to be the same on every device depending on struct packing? For debugging, I actually compiled an example C program and printed the sockaddr_can struct to see the data to ensure I got it right). The code is below:

{:ok, sock} = :socket.open(29, :raw, 1) 
{:ok, ifindex} = :socket.ioctl(sock, :gifindex, 'can0') # this returns {:ok, 2} on my device
addr = <<29::size(16)-little, 0::size(16)-little, ifindex::size(32)-little, 0::size(32), 0::size(32), 0::size(64)>>
:ok = :socket.bind(sock, %{:family => 29, :addr => addr})

Remaining issue

If I actually run the code exactly as above, it does not work: I get {:error, :enodev} from the :socket.bind/2 call. I played around with it and discovered that I can only bind correctly and get the return value of :ok if the ifindex is set to 0.

This is the part I’m confused about - why does :socket.ioctl(sock, :gifindex, 'can0') return the wrong index? How can I debug this?

Again, huge thank you!

Hmm I am realizing that the C example uses SIOCGIFINDEX, which might be different than the Erlang :gifindex option:

ioctl(s, SIOCGIFINDEX, &ifr);

This might be the reason

Pretty much. :slight_smile:

It should be the same on every device with the same ABI, so it may differ between x86 and ARM, but should be the same for the same architecture (broadly speaking).

What does it return if you execute the C example? Edit: if it’s the same value and works fine, how does the struct sockaddr_can look in memory, byte-by-byte?

That’s the one it should be using. Can you check if config.h contains #define ESOCK_USE_IFINDEX?

(edit: you should be able to find it at erts/$TARGET/config.h in the source tree)

The output is 0x04. One thing to note is that I’ve been using Nerves to test everything Elixir, but Raspbian OS Lite to test this C program, python, plus other things. So I’m not sure if the return will be the same, since the configuration might be different.

Happy to check this, but not sure where to look? In my Nerves rpi3 custom image directory? I’m not compiling Erlang myself (that I’m aware of, maybe that’s happening under the Nerves hood) :slight_smile:

Thanks!
Gus

I’d be surprised if the ABI is different, so try checking how struct sockaddr_can looks in memory if you set all the fields to recognizable values, maybe we’ve missed something when constructing the binary address.

struct sockaddr_can addr = {0};
addr.can_family = 0x1234;
addr.can_ifindex = 0x56789ABC;
addr.can_addr.tp.tx_id = 0x11223344;
addr.can_addr.tp.rx_id = 0x55667788;
{
    unsigned char *bytes = (unsigned char *)&addr; /* explicit cast to view the struct as raw bytes */
    int i;
    for(i = 0; i < sizeof(addr); i++)
        printf("%x, ", bytes[i]); 
    printf("\n"); 
}

Ah, that will make things more difficult. :confused:

What happens if you ask for the gifname of the returned interface index?

I can confirm that the following works (i.e., asking for the name of the interface index returned by ioctl gives back can0):

{:ok, sock} = :socket.open(29, :raw, 1)

# confirm ifindex and ifname
{:ok, ifindex} = :socket.ioctl(sock, :gifindex, 'can0') # returns {:ok, 2}
{:ok, ~c"can0"} = :socket.ioctl(sock, :gifname, ifindex)

# using index from ioctl doesn't work
addr = <<29::size(16)-little, 0::size(16)-little, ifindex::size(32)-little, 0::size(32), 0::size(32), 0::size(64)>>
{:error, :enodev} = :socket.bind(sock, %{:family => 29, :addr => addr})

# hard code to index 0 works
ifindex = 0
addr = <<29::size(16)-little, 0::size(16)-little, ifindex::size(32)-little, 0::size(32), 0::size(32), 0::size(64)>>
:ok = :socket.bind(sock, %{:family => 29, :addr => addr})

I can give this a go! But I think this part is working at the moment. I was referring more to the fact that the interface indices are probably not the same, since the network interfaces might be configured in a different order on Nerves vs Raspbian. But let me try your suggestion :slight_smile:

An interesting discovery in the SocketCAN docs: an interface index of 0 corresponds to binding on “any” interface.

When the CAN interface is bound to ‘any’ existing CAN interface (addr.can_ifindex = 0) it is recommended to use recvfrom(2) if the information about the originating CAN interface is needed

Also, the resulting bytes of the sockaddr_can struct test is printed below (in hex):

34, 12, 0, 0, bc, 9a, 78, 56, 88, 77, 66, 55, 44, 33, 22, 11, 0, 0, 0, 0, 0, 0, 0, 0

The equivalent bytes when printing this from the packet formed with the Elixir code is:

34 12 00 00 BC 9A 78 56 88 77 66 55 44 33 22 11 00 00 00 00 00 00 00 00
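For anyone wanting to reproduce the comparison, here is a minimal sketch of the Elixir side. It assumes the layout discussed above (16-bit family, 16 bits of padding, 32-bit ifindex, 32-bit rx_id, 32-bit tx_id, 8 bytes of tail, all little-endian) and uses the same recognizable test values as the C program. The variable names are just for illustration.

```elixir
# Same test values as the C snippet: family 0x1234, ifindex 0x56789ABC,
# rx_id 0x55667788, tx_id 0x11223344.
addr =
  <<0x1234::size(16)-little, 0::size(16), 0x56789ABC::size(32)-little,
    0x55667788::size(32)-little, 0x11223344::size(32)-little, 0::size(64)>>

# Render every byte as two uppercase hex digits, separated by spaces.
hex =
  addr
  |> :binary.bin_to_list()
  |> Enum.map_join(" ", fn byte ->
    byte |> Integer.to_string(16) |> String.pad_leading(2, "0")
  end)

IO.puts(hex)
# prints: 34 12 00 00 BC 9A 78 56 88 77 66 55 44 33 22 11 00 00 00 00 00 00 00 00
```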

So I ran into this today and figured out the issue with the address data.

The :socket.bind/2 function takes a map with :family and :addr keys. Their values get copied into the sa_family and sa_data fields of a sockaddr struct in the Erlang source.

struct sockaddr addr;
((struct sockaddr_can*) &addr)->can_ifindex = ifindex;
return addr.sa_data; // <- this is what we want to pass to :socket.bind

This means your approach works once you cut off the first field of the binary:

addr = <<0::size(16)-little, ifindex::size(32)-little, 0::size(32), 0::size(32), 0::size(64)>>

The initial 2 bytes of padding remain, I think because sockaddr is aligned to 2 bytes while sockaddr_can is aligned to 4 bytes (can_ifindex being an int).
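Putting the pieces from this thread together, a rough sketch could look like the module below. The module and function names (CanAddr, sa_data/1, sockaddr/1) are hypothetical, the constants 29 and 1 are the PF_CAN/AF_CAN and CAN_RAW values found above, and the 22-byte layout is only what worked on the Raspberry Pi target discussed here, so treat it as an assumption on other ABIs.

```elixir
defmodule CanAddr do
  # AF_CAN / PF_CAN as found in the Linux headers (see earlier in the thread).
  @af_can 29

  # Bytes that follow sa_family in struct sockaddr_can on this target:
  # 2 bytes of padding, a 4-byte little-endian can_ifindex, and
  # 16 zeroed bytes covering the rest of the struct (22 bytes in total).
  def sa_data(ifindex) when is_integer(ifindex) and ifindex >= 0 do
    <<0::size(16), ifindex::size(32)-little, 0::size(128)>>
  end

  # The map that :socket.bind/2 accepts as a native socket address.
  def sockaddr(ifindex), do: %{family: @af_can, addr: sa_data(ifindex)}
end

# Usage on a device with a can0 interface (not run here):
#   {:ok, sock} = :socket.open(29, :raw, 1)
#   {:ok, ifindex} = :socket.ioctl(sock, :gifindex, ~c"can0")
#   :ok = :socket.bind(sock, CanAddr.sockaddr(ifindex))
```

Keeping the binary construction in a pure function makes it easy to compare against a byte dump from a C program without needing the hardware attached.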
