stefanluptak

stefanluptak

JS/Elixir string indexing interoperability

Hi all!

I would like to ask for an ideas how to solve the different way JavaScript and Elixir are approaching Strings and indexes of characters.

I want to store the same (user entered) text on the Elixir and also the JavaScript side. I also want to exchange messages describing operations like {:delete, from, to}, {:insert, position, text}, etc. to ensure the same text is on both sides.

The thing is, that this becomes intersting as soon as the text contains some emojis, characters with puncutation and so on.

Working with strings (binaries) in Elixir is pretty straightforward. Unfortunately (but not surprisingly :slightly_smiling_face:), the JavaScript behavior is (in my opinion) a bit unintuitive.

// JavaScript
> String.fromCharCode(97, 769).slice(0, 1)
'a'
> String.fromCharCode(225).slice(0, 1)
'á'
# Elixir
iex> [97, 769] |> to_string() |> String.slice(0, 1)
"á"
iex> [225] |> to_string() |> String.slice(0, 1)    
"á"

What strategy should I use? Should I work with the text as a charlist on the Elixir side or should I normalize all strings everywhere? Or is there a better strategy I should take a look at?

Thank you all for you advices.

Marked As Solved

hauleth

hauleth

Operate on bytes it is the safest way, so you do not use String module in short. This will provide you independence form encoding of your data. So it would look like this

// JavaScript message to Elixir, somebody wrote "áb"
["insert", 0, Uint8Array.of(97, 204, 129, 98)]
// JavaScript message to Elixir, somebody deleted "á" (but not b)
["delete", 0, 3]

And then in Elixir:

text_from_javascript = <<97, 204, 129, 98>>
deleted_text = binary_part(text_from_javascript, 0, 3)

Alternatively use LSEQ mentioned earlier which is representation independent (as it generates it’s own indices instead of using string positions).

Also Liked

hauleth

hauleth

Unicode is hard.
Unicode is very hard.
Unicode is enormously hard.


If you want to have consistent behaviour then operate on bytes. So you will use Blob or Uint8Array on JS side and binaries on Elixir side. I think it would be the simplest way. I hope that what you are describing (as I assume you want to created distributed concurrent text editor) is CRDT like LSEQ.

NobbZ

NobbZ

Do you want to be a and ^ be considered as separate or as â?

If the latter, work on graphemes.

Read docs of string functions carefully to know if they work on graphemes or codepoints.

The same is true for the JS functions and methods you use.

Though I have to disappoint you. It will be complicated, no matter what. Most developers of string handling libraries either do not care or even understand the differences.

Anyway, try to avoid random access of strings by codepoint or grapheme, it’s O(n) operation!

benwilson512

benwilson512

Author of Craft GraphQL APIs in Elixir with Absinthe

Do what, binary_part? binary_part is about as fast as it gets on the BEAM for that specific operation I think.

axelson

axelson

Scenic Core Team

Charlists are lists which are implemented as linked lists. And to iterate through the linked list you need to follow a bunch of pointers which is O(n).

stefanluptak

stefanluptak

Ah OK. Thanks a lot. So if I understand this correctly, charlist is a standard Elixir List container implemented as linked list in this case containing integers that represent code points and String is piece of memory and to “iterate” over it’s content we just need to increment the memory address. Am I right?

Where Next?

Popular in Questions Top

Harrisonl
We have an ECS cluster with 4 services, where each task joins a single cluster, via discovery ECS discovery service. Currently when I de...
New
mcarvalho
What is the difference between System.get_env and Application.get_env? For example, what are best practices to use one versus another.
New
Patoshizzle
After calling mix ecto.create I get this error: 17:00:32.162 [error] GenServer #PID&lt;0.412.0&gt; terminating ** (Postgrex.Error) FATAL...
New
vac
Hi, I’m quite new in Elixir and I’m trying to format a string to a PEM format. I have the certificate value like MIIDBTCCAe2...... and I...
New
fireproofsocks
Forgive me if this is obvious, but how does one delete a database record WITHOUT selecting it first? Ecto.Repo — Ecto v3.14.0 has exampl...
New
hariharasudhan94
lets say i have a sample like a = 20; b = 10; if (a &gt; b) do {:ok, "a"} end if (a &lt; b) do {:ok, b} end if (a == b) do {:ok, "equa...
New
New
vonH
When I run the Plug and I recompile I wind up having to use Ctrl C to quit iex and start again. Witht the help of rlwrap I can use the cu...
New
srinivasu
How to handle excepions in elixir? Suppose i have A, B, C ,D, E modules. and each module has get() function. A.get() method will call t...
New
Brian
What is the proper way to load a module from a file in to IEX? In the python world, doing something like this pretty standard: from ....
New

Other popular topics Top

skosch
To my knowledge, put_in, Map.update etc. all have the one limitation of not automatically creating intermediate keys when needed (for exa...
New
gshaw
What is the idiomatic way of matching for not nil in Elixir? E.g., First way: defp halt_if_not_signed_in(conn, signed_in_account) when...
New
dokuzbir
I want to highlight html closing tags when i click a html tag. That works in .html files but doesnt work for html.eex templates. How can...
New
New
pmjoe
I have a relationship of love and hate with Elixir. Lots of things are just absolutely right, but there are some things that are kind of ...
New
vonH
When I run the Plug and I recompile I wind up having to use Ctrl C to quit iex and start again. Witht the help of rlwrap I can use the cu...
New
aalberti333
As the title describes, I’m trying to run Enum.map() over a list of key/value pairs, where the value is a map. My data looks like this: ...
New
freewebwithme
Using vs code and installed ElixirLS: support and debugger. And I got an error popped up on start up says Failed to run ‘elixir’ comma...
New
klo
Got a question about when to concat vs. prepending items to list then reversing to achieve appending. So i know lists boil down to [1 | ...
New
openscript
Hello! Sorry for this astonishing simple question, but I’m really stuck. I try to set up the intellij-elixir plugin, but I don’t know ho...
New

We're in Beta

About us Mission Statement