Cyrillic validation in Erlang

erlang

#1

I have a problem.

I use Erlang and send a message from an HTML text input to an Erlang server.
On the server I want to validate the message.

Example 1:
I send "test" from my HTML page,
receive "test" on the Erlang server,
pass "test" to re:run(A, "^[0-9A-Za-z_]{2,100}$"),
get true (or false), and everything is fine.

Example 2:
I send "тест" from my HTML page,
but on the Erlang server I receive [209,130,208,181,209,129,209,130], not "тест".
I cannot pass [209,130,208,181,209,129,209,130] to re:run(A, "^[0-9A-Za-zА-Яа-я_]{2,100}$")
because I get a runtime error!

How can I validate Cyrillic in Erlang?
Thanks


#3

What version of Erlang are you running? What encoding does your Erlang expect? How do you convert the incoming binary to a charlist? Why do you do it at all?

Currently the binary -> list conversion copies the bytes as they are and does not decode the UTF-8. The correct way to convert Unicode io-data (which includes a UTF-8 binary) to a charlist is unicode:characters_to_list/1. That module has been in stdlib since well before OTP 19 (R13, IIRC), so it is available on any current release.
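To make the difference concrete, here is a minimal sketch of the decoding step. The module and function names (utf8_decode_demo, decode/1) are made up for illustration, and it assumes the server hands you the message as a list of UTF-8 bytes, as in the first post.

```erlang
-module(utf8_decode_demo).
-export([decode/1]).

%% Bytes is assumed to be a list of UTF-8 bytes,
%% e.g. [209,130,208,181,209,129,209,130] for "тест".
decode(Bytes) when is_list(Bytes) ->
    %% list_to_binary/1 only packs the bytes; nothing is decoded yet.
    Bin = list_to_binary(Bytes),
    %% characters_to_list/2 decodes the UTF-8 binary into code points.
    %% On malformed input it returns an {error, ...} or {incomplete, ...}
    %% tuple instead of a list, so real code should match on the result.
    unicode:characters_to_list(Bin, utf8).
```

Calling utf8_decode_demo:decode([209,130,208,181,209,129,209,130]) should give [1090,1077,1089,1090], the four code points of "тест".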


#4

Solved.

First I must call

Note2 = unicode:characters_to_binary(Note, utf8, latin1)

and second

re:run(Note2, <<"^[0-9A-Za-zА-Яа-я_]{2,100}$"/utf8>>, [unicode])

I use OTP 20 and n2o.
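Put together, the check could look roughly like the sketch below. The name_check module and valid/1 are hypothetical names, not part of n2o or OTP, and the sketch assumes the input arrives either as a UTF-8 binary or as a list of UTF-8 bytes. It packs the bytes with iolist_to_binary/1 rather than going through latin1, and treats input that is not valid UTF-8 as a failed validation.

```erlang
-module(name_check).
-export([valid/1]).

%% Hypothetical helper: true when Input is 2..100 characters long and
%% consists only of ASCII letters, digits, "_" and Cyrillic А-Я / а-я.
valid(Input) ->
    %% \x{410}-\x{44F} is the range А..я written with Erlang escapes, so
    %% the module does not depend on the source file encoding.
    Pattern = <<"^[0-9A-Za-z\x{410}-\x{44F}_]{2,100}$"/utf8>>,
    try
        %% Accepts a binary or a list of bytes; integers above 255 raise badarg.
        Bin = iolist_to_binary(Input),
        %% With the unicode option, {2,100} counts characters, not bytes.
        re:run(Bin, Pattern, [unicode, {capture, none}]) =:= match
    catch
        %% Not byte data, or not valid UTF-8: treat as invalid input.
        error:badarg -> false
    end.
```

For example, name_check:valid([209,130,208,181,209,129,209,130]) and name_check:valid("test") should both return true, while a single Cyrillic letter such as [209,130] is rejected because it is only one character long.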


#5

This converts UTF-8 encoded characters to Latin-1. Latin-1 does not contain Cyrillic letters.
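A likely reason the call still appears to work here: when the input is a list of integers, each integer is treated as a code point, and every value from 0 to 255 fits into Latin-1 as a single byte. The conversion therefore just packs the UTF-8 bytes into a binary without decoding anything. The sketch below (module and function names are made up) checks that equivalence for a list of bytes.

```erlang
-module(latin1_repack_demo).
-export([same_bytes/1]).

%% Bytes is assumed to be a list of integers in the range 0..255.
%% Returns true when the utf8 -> latin1 "conversion" yields exactly the
%% same binary as plain list_to_binary/1, i.e. it only re-packs bytes.
same_bytes(Bytes) when is_list(Bytes) ->
    unicode:characters_to_binary(Bytes, utf8, latin1) =:= list_to_binary(Bytes).
```

If the list already held decoded code points such as 1090, the same call would return an error tuple rather than a binary, because those characters have no Latin-1 representation.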


#6

It works.
I am not a theorist, I am practical.

I use n2o and send data ("тест1") from the page to the server via WebSockets with BERT.
In Erlang I receive that value as [209,130,208,181,209,129,209,130,49]
and not as [1090,1077,1089,1090,49].


#7

So the problem is already in the receiver, you probably configured your gen_tcp wrong.


#8

No problem, everything works, thanks.