egor, January 31, 2019, 12:48pm, post 1
Hi all, I need to port this regex from JavaScript to Elixir:
/[\0-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/
When I try:
Regex.compile!("[\0-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]", "u")
I get:
** (ArgumentError) invalid or reserved Unicode codepoint 55296
    (elixir) src/elixir_interpolation.erl:200: :elixir_interpolation.append_codepoint/5
    (elixir) src/elixir_interpolation.erl:81: :elixir_interpolation."-unescape_tokens/2-lc$^0/1-0-"/2
    (elixir) src/elixir_interpolation.erl:81: :elixir_interpolation.unescape_tokens/2
    (elixir) src/elixir_tokenizer.erl:673: :elixir_tokenizer.handle_strings/6
    (elixir) lib/code.ex:669: Code.string_to_quoted/2
And when I try:
~r/[\0-\x{D7FF}\x{E000}-\x{FFFF}]|[\x{D800}-\x{DBFF}][\x{DC00}-\x{DFFF}]|[\x{D800}-\x{DBFF}](?![\x{DC00}-\x{DFFF}])|(?:[^\x{D800}-\x{DBFF}]|^)[\x{DC00}-\x{DFFF}]/u
I get:
** (Regex.CompileError) disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at position 39
    (elixir) lib/regex.ex:172: Regex.compile!/2
    (elixir) expanding macro: Kernel.sigil_r/2
    iex:43: (file)
Looks like the problem is with \uD800. Does anybody have an idea how to solve it?
             
            
              
            
           
          
            
              
Nicd, January 31, 2019, 1:15pm, post 2
Looks like you are trying to find surrogate pairs? Elixir strings are UTF-8, and AFAIK valid UTF-8 is not allowed to contain surrogate code points (U+D800 to U+DFFF), so binaries containing them would not be valid strings. And since Regex operates on Elixir strings, it does not make sense to ask it to find them.
What is the actual thing you are trying to accomplish here?
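A minimal sketch of the point above, for anyone who wants to check it in iex. The variable names are only illustrative, and the three bytes are what a naive encoder would produce for U+D800:

```elixir
# U+D800 written out as the three bytes a naive UTF-8 encoder would emit.
lone_surrogate = <<0xED, 0xA0, 0x80>>

# It is a perfectly good binary, but not a valid UTF-8 string.
IO.inspect(String.valid?(lone_surrogate))
# prints: false

# A character outside the BMP is a single code point in Elixir,
# stored as four UTF-8 bytes rather than as a UTF-16 surrogate pair.
emoji = "\u{1F600}"
IO.inspect(String.to_charlist(emoji))
# prints: [128512]
IO.inspect(byte_size(emoji))
# prints: 4
```

So the surrogate ranges in the JavaScript pattern describe UTF-16 code units, which never show up as code points in a valid Elixir string.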
             
            
            
            
           
          
            
              
egor, January 31, 2019, 1:22pm, post 3
              Thanks!
What is the actual thing you are trying to accomplish here?
 
I need to turn plain-text links into HTML links. I’m trying to port linkify-it (https://github.com/markdown-it/linkify-it).
             
            
              
            
           
          
            
              
Nicd, January 31, 2019, 1:24pm, post 4
              Where is that regex there?
I think you can avoid running it if it only tries to detect and skip surrogate pairs (since they can’t be in Elixir strings), but it would be nice if someone more knowledgeable than me also replied.
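If that guess is right, one possible stand-in for the JavaScript pattern is a plain `.` in Unicode mode. This is only a sketch under that assumption, not something taken from linkify-it, and `any_char` is a made-up name:

```elixir
# Hypothetical replacement for the JavaScript "any character" pattern.
# With the `u` modifier `.` consumes one whole code point, and `s` lets it
# match newlines as well, like the original character class does.
any_char = ~r/./us

# An astral character is matched as a single unit, no surrogate handling needed.
IO.inspect(Regex.scan(any_char, "a\u{1F600}\n"))
# prints: [["a"], ["😀"], ["\n"]]
```

The lone-surrogate branches of the original have no counterpart here, because such input would not be a valid string in the first place; checking `String.valid?/1` on untrusted input before matching would cover that case.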
             
            
              
            
           
          
            
              
egor, January 31, 2019, 1:33pm, post 5
Where is that regex there?
 
https://github.com/markdown-it/linkify-it/blob/master/lib/re.js#L8 
https://github.com/markdown-it/uc.micro/blob/master/properties/Any/regex.js 
I think you can avoid running it if it only tries to detect and skip surrogate pairs (since they can’t be in Elixir strings), but it would be nice if someone more knowledgeable than me also replied.
 
Thanks, I hope you’re right.
             
            
              
            
           
          
            
            
              Perhaps this could be useful?
             
            
              
            
           
          
            
              
egor, February 1, 2019, 4:57am, post 7
Thanks! Looks good, but too simple for my case (doesn’t support URLs with http://, internationalized domain names, etc.).
             
            
            
            
           
          
            
            
              
 egor:
 
Thanks! Looks good, but too simple for my case (doesn’t support URLs with http://, internationalized domain names, etc.).
 
 
Perhaps a set of PRs to buff it up could be useful? It could be a great generic library for handling such things.
             
            
            
            
           
          
            
              
egor, February 5, 2019, 2:05pm, post 9
            
            
            
           
          
            
            
              
Ooo very cool, hope it gets accepted soon!
             
            