Problems processing rapidly arriving elements in a stream

Hello, I’m currently struggling with the following problem:

"(1609089402.655258) vcan0 136#000200000000002A\n(1609089402.655456) vcan0 13A#0000000000000028\n(1609089402.655651) vcan0 13F#000000050000002E\n(1609089402.655838) vcan0 164#0000C01AA8000004\n(1609089402.656030) vcan0 17C#0000000010000021\n(1609089402.656223) vcan0 18E#00006B\n(1609089402.656410) vcan0 1CF#80050000003C\n(1609089402.656763) vcan0 1DC#02000039\n(1609089402.656997) vcan0 183#0000000900001020\n(1609089402.657268) vcan0 143#6B6B00E0\n"
** (ArgumentError) non-alphabet digit found: "\n" (byte 10)
    (elixir 1.11.2) lib/base.ex:878: Base.dec16_upper/1
    (elixir 1.11.2) lib/base.ex:892: Base."-do_decode16/2-lbc$^0/2-2-"/2
    (elixir 1.11.2) lib/base.ex:890: Base.do_decode16/2
    (cannes 0.0.1) lib/dumper.ex:56: Cannes.Dumper.format_candump_string/1
    (elixir 1.11.2) lib/stream.ex:441: anonymous fn/4 in Stream.each/2
    (elixir 1.11.2) lib/stream.ex:1540: Stream.do_unfold/4
    (elixir 1.11.2) lib/stream.ex:1609: Enumerable.Stream.do_each/4
    (elixir 1.11.2) lib/enum.ex:3461: Enum.into/4

Maybe some background first. I want to process the output of candump in Elixir. I’m using the Porcelain library to spawn a process that runs the candump command. So far everything works fine, but I get the above error when the timestamps are very close to each other. If they differ in the fourth-to-last digit, everything works. But when they differ only in the third-to-last digit, the problem occurs.

My Cannes.Dumper.format_candump_string/1 function expects a single line of console output, e.g.:
(1609089347.177315) vcan0 164#0000C01AA8000022
Now, if the messages follow each other too closely, the function receives more than one line and therefore fails. Somehow Stream.each gets more than a single console line. For example, it forwards:
"(1609090018.715053) vcan0 18E#00007A\n(1609090018.715409) vcan0 294#040B0002CF5A000E\n(1609090018.715701) vcan0 21E#03E83745220601\n"
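To show the failure in isolation, here is a minimal sketch. It assumes the parser treats everything after the first # as the hex payload; the actual body of Cannes.Dumper.format_candump_string/1 is not shown in this post, so this is only an illustration of why Base complains about "\n".

```elixir
# Payload of a single candump line: valid uppercase hex.
single_line_payload = "00007A"

# What follows the first "#" when several lines arrive as one element
# (assumption: the parser does not split on "\n" beforehand).
multi_line_payload = "00007A\n(1609090018.715409) vcan0 294#040B0002CF5A000E\n"

# Decoding one line's payload succeeds.
{:ok, <<0, 0, 0x7A>>} = Base.decode16(single_line_payload)

# The newline and the following line are not hex digits, so decoding fails.
# Base.decode16!/1 raises an ArgumentError in the same situation.
:error = Base.decode16(multi_line_payload)
```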

A pretty dumb and hacky way that came to mind is to simply reject all elements in the stream that contain more than one \n. I already implemented it and it works, but this way I lose all of those messages.

I’m not very familiar with streams in Elixir, so maybe there is a simple solution that I’m just not aware of. I would also like to know why streams behave like this.
Can anyone help me with this?


Hi, without knowing much about your use case, it sounds to me like you could use flat_map to split multi-line elements into several messages and reinject each one into the stream.

If you have a stream whose elements sometimes contain more than one \n-delimited value, you can transform it into a stream that yields them one at a time:

a = ["1\n", "2\n3\n", "4\n"]

iex(4)> Stream.flat_map(a, &String.split(&1, "\n", trim: true)) |> Enum.to_list()
["1", "2", "3", "4"]
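Applied to candump output, the same idea could look roughly like the sketch below. The module CandumpSketch and its parse_line/1 are hypothetical stand-ins, since the real Cannes.Dumper.format_candump_string/1 is not shown in this thread; the sample chunks are taken from the question.

```elixir
defmodule CandumpSketch do
  # Hypothetical parser for exactly one candump line, e.g.
  # "(1609089347.177315) vcan0 164#0000C01AA8000022".
  def parse_line(line) do
    [timestamp, interface, frame] = String.split(line, " ", trim: true)
    [id, payload] = String.split(frame, "#", parts: 2)

    %{
      # Strip the surrounding parentheses from the timestamp.
      timestamp: timestamp |> String.trim_leading("(") |> String.trim_trailing(")"),
      interface: interface,
      id: id,
      # Payload is uppercase hex; raises ArgumentError on anything else.
      data: Base.decode16!(payload)
    }
  end

  # Turns a stream of chunks (each holding one or more lines)
  # into a stream of parsed frames, one per line.
  def parse_chunks(chunks) do
    chunks
    |> Stream.flat_map(&String.split(&1, "\n", trim: true))
    |> Stream.map(&parse_line/1)
  end
end

chunks = [
  "(1609090018.715053) vcan0 18E#00007A\n(1609090018.715409) vcan0 294#040B0002CF5A000E\n",
  "(1609090018.715701) vcan0 21E#03E83745220601\n"
]

frames = chunks |> CandumpSketch.parse_chunks() |> Enum.to_list()
```

This assumes every chunk ends on a line boundary, as in the samples from the question. If a chunk could end in the middle of a line, the leftover would have to be buffered across elements, for example with Stream.transform/3.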

Perfect. So simple :heart_eyes:
Thanks for your post.