Escaping in sigil_w

There appears to be no way to escape whitespace in word lists, so sigil_w (and sigil_W) can only be used for lists of single words. Of course that is its main purpose, but to me it would be much more useful if it allowed something like:

~w[foo\ bar baz] => ["foo bar", "baz"]

Is there any reason it doesn’t work this way?

~w is for a list of words. "foo bar" is not a word, so it doesn’t fit semantically.


I like to think of these sigils as a way to communicate intent to the reader (the list should only be made up of single string tokens) rather than a shorthand for the writer. I never use the a modifier (which produces atoms) for this reason, as well as others.


I just never use them. It doesn’t take much effort to write out a list, and it’s completely unambiguous when read, which happens many more times than writing.


My point is that we are not always in control of the contents of these lists. Things like path names or spreadsheet headers can easily contain spaces, and whenever this occurs the sigil needs to be replaced by a traditional list. I would love to avoid that by being able to escape whitespace.

That’s my point, though: if it’s only incidental that the entries are single words and could be more, I don’t bother using the sigil version. I understand the pain, though.

sigil_w and its uppercase counterpart are macros. Their values need to be known at compile time.

This is not the tool to use for user input or anything whose contents you don’t know in advance.


No, the contents are static, but out of our control.

You’re still free to use a more appropriate tool to parse content that you cannot guarantee is a list of words. These sigils really are conveniences for when you’re dealing with a list of words and you don’t want to type that much. It seems you’re not dealing with a list of words, so you want to use something else.


I am dealing with lists. E.g. in the case of paths, the System.cmd() function accepts a list of arguments, often containing path names. I could give other examples, but trust me, I have encountered it often enough to warrant this post. Even using a variable doesn’t work:

foo_bar = "foo bar"
~w[#{foo_bar} baz]
=> ["foo", "bar", "baz"]
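
For what it's worth, this happens because the lowercase sigil interpolates first and only then splits the resulting string on whitespace. A minimal sketch of that behaviour, plus one possible workaround that keeps the multi-word entry outside the sigil (the variable names are only for illustration):

```elixir
foo_bar = "foo bar"

# Lowercase ~w interpolates first, then splits the resulting string on whitespace.
interpolated = ~w[#{foo_bar} baz]

# Splitting the already-built string by hand gives the same list.
by_hand = String.split("#{foo_bar} baz")

# Possible workaround: prepend the multi-word entry to a sigil holding the rest.
mixed = [foo_bar | ~w[baz]]

IO.inspect(interpolated) # ["foo", "bar", "baz"]
IO.inspect(by_hand)      # ["foo", "bar", "baz"]
IO.inspect(mixed)        # ["foo bar", "baz"]
```

The workaround only helps when the entries with spaces sit at the front of the list; otherwise a plain list literal is probably still clearer.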

A list of words is a subset of all lists. You’re clearly not dealing with just that subset.

If you’re dealing with CLI arguments, OptionParser might be the more appropriate tool to use.

Yes, I am dealing with a list of strings, not just words. All I am saying is that not allowing any mechanism to escape whitespace limits the sigil’s usability. Or maybe this requires a separate sigil. Final illustration:

args = ~w[import --debug true --path /path/with\ space/file.txt]  # can't do this

This would allow me to basically just write out the command.

args = ["import", "--debug", "true", "--path", "/path/with space/file.txt"]

This is the (clumsy) way I need to do this now. I have no need for all the OptionParser overhead.

But again, this is just an example, I have come across this in many other contexts as well.

So you’re after a sigil that can split a string into CLI arguments, which is fair. As far as I know there are no universal rules on how to escape characters in a terminal, so even if Elixir settles on one convention, people will only be able to copy-paste CLI commands that work on a system matching the chosen escape rules.

It also kind of did so already with OptionParser.split/1, which as far as I know parses CLI args without trying to be universal. If the people producing your input can follow its format, it should work for you:

iex> arg_str = "import --debug true --path \"/path/with space/file.txt\""
iex> OptionParser.split(arg_str)
["import", "--debug", "true", "--path", "/path/with space/file.txt"]

Edit: Using quotes actually seems to be the most portable way to deal with spaces in args based on some quick research.
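
As far as I can tell, OptionParser.split/1 also accepts a backslash before a space outside of quotes, which is close to the syntax asked for above. A small sketch comparing the two forms (shortened command, ~S used only to avoid escaping the quotes and the backslash in the Elixir source):

```elixir
# Double quotes around the argument, as in the example above.
quoted = OptionParser.split(~S(import --path "/path/with space/file.txt"))

# A backslash before the space, outside of any quotes.
escaped = OptionParser.split(~S(import --path /path/with\ space/file.txt))

IO.inspect(quoted)
IO.inspect(escaped)
```

Both calls should give the same three-element list, with the path kept as a single argument.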

Strongly agree with this

Thanks, but no, I want a ~w-like sigil that allows a way to escape whitespace. You are focusing on the example but that is not the point.
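
Nothing stops you from defining such a sigil yourself. Below is a minimal sketch, not a vetted implementation: the module name EscapedWords and the ~Q sigil are made up for illustration and are not part of Elixir. It uses an uppercase sigil because those receive the raw text, so the backslash is still there when the function runs, and it assumes that a backslash only ever needs to escape whitespace.

```elixir
defmodule EscapedWords do
  # Hypothetical ~Q sigil (not part of Elixir): like ~W, but a backslash
  # placed before a whitespace character keeps that character in the word.
  def sigil_Q(string, []) do
    # Split on runs of whitespace that are not preceded by a backslash.
    ~r/(?<!\\)\s+/
    |> Regex.split(string, trim: true)
    # Drop the backslash from each escaped whitespace character.
    |> Enum.map(&String.replace(&1, ~r/\\(\s)/, "\\1"))
  end
end

import EscapedWords

words = ~Q[foo\ bar baz]
args = ~Q[import --debug true --path /path/with\ space/file.txt]

IO.inspect(words) # ["foo bar", "baz"]
IO.inspect(args)  # ["import", "--debug", "true", "--path", "/path/with space/file.txt"]
```

Because it is an uppercase sigil there is no interpolation, and this sketch ignores modifiers such as a or c entirely.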

iex(1)> ~w/hello\u00a0world/
["hello world"]

edit: I do not claim this to be the best idea ever, and I truly believe that using a sigil is not really useful in the supplied examples.
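
One caveat with this trick: it works because the splitting does not happen on non-breaking whitespace, but the resulting string then contains U+00A0 rather than a regular space, so it will not match a real path or header that uses an ordinary space. A short sketch of the difference, with an optional normalisation step added purely for illustration:

```elixir
words = ~w/hello\u00a0world/
[word] = words

# The single entry contains a no-break space, not a regular one.
same_as_plain_space = word == "hello world"
same_as_nbsp = word == "hello\u00a0world"

# Replacing the no-break space afterwards yields the plain-space string.
normalized = Enum.map(words, &String.replace(&1, "\u00a0", " "))

IO.inspect(same_as_plain_space) # false
IO.inspect(same_as_nbsp)        # true
IO.inspect(normalized)          # ["hello world"]
```

At that point the extra replacement step probably costs more than simply writing out the list.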
