I’m trying all sorts of things with no luck, so I thought I’d ask for some help.
I want to split a string anytime there’s a “b” in “abc”, but not in " b ". I.e., when a “b” is surrounded by whitespace, I don’t want it to be a splitting character, but an element in the array returned by the split.
The above pattern will match a b for splitting only if it’s entirely surrounded by non-whitespace, i.e. it’s not preceded by whitespace ((?<!\s)) and it’s not followed by whitespace ((?!\s)). It’ll also split on standalone whitespace ((...|\s)).
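For anyone who wants to try the idea out, here is a minimal Python sketch. The pattern is an assumed reconstruction from the description above, not necessarily the exact one posted, and `split_on_b` is a made-up helper name. Note that the negative lookarounds also let a "b" at the very start or end of the string act as a separator, since there is no whitespace next to it there.

```python
import re

# Assumed reconstruction: a "b" that is neither preceded nor followed by
# whitespace, or any single whitespace character.
PATTERN = re.compile(r"(?<!\s)b(?!\s)|\s")

def split_on_b(text):
    """Split on embedded "b" and on whitespace; keep a standalone "b"."""
    # Drop the empty strings re.split produces between adjacent separators.
    return [piece for piece in PATTERN.split(text) if piece]

print(split_on_b("abc"))        # "b" sits between non-whitespace: separator
print(split_on_b("a b c"))      # "b" is surrounded by whitespace: kept
print(split_on_b("abc b xbz"))  # both cases in one string
```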
Depending on your exact needs you may want to adjust the pattern, of course.
Well, I’m doing an exercism.io exercise that involves recreating a subset of Forth.
One of the test requirements is that non-word characters are separators between tokens, such that:
"1\x002\x013\n4\r5 6\t7" == "1 2 3 4 5 6 7"
But the exercise involves building a toy language that can do subtraction, so I need a way to distinguish between hyphens used to separate word characters and hyphens used as subtraction symbols:
"1 2 + 4 -" == "1 2 + 4 -"
So actually, what I need to represent in a regex is “all non-word characters EXCEPT a hyphen on its own.” This question was just to unstick me on that last part, finding the hyphen on its own.
No, that won’t work. If I split on ALL non-word characters, my mathematical operators vanish. I need to be able to keep the operators (characters in the class [+\-*/] that are surrounded by whitespace).
I may end up pre-processing the string and converting the operators to words for the operations they represent. Then if I just split on non-word characters, everything is easy. But I also have to re-convert them back to the original characters at the end. So this seemed like a potentially better way.
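A rough Python sketch of that pre-processing idea, in case it helps to see it written out. The placeholder words (`__plus__` and so on) and the function name are invented for illustration, and the sketch assumes the input never contains those placeholder words itself.

```python
import re

# Hypothetical placeholder words; underscores count as word characters,
# so they survive a split on \W+.
OP_TO_WORD = {"+": "__plus__", "-": "__minus__", "*": "__times__", "/": "__divide__"}
WORD_TO_OP = {word: op for op, word in OP_TO_WORD.items()}

# An operator counts as standalone when it has no non-whitespace neighbour.
STANDALONE_OP = re.compile(r"(?<!\S)[-+*/](?!\S)")

def tokenize_via_placeholders(text):
    # 1. Swap standalone operators for placeholder words.
    marked = STANDALONE_OP.sub(lambda m: OP_TO_WORD[m.group()], text)
    # 2. Split on runs of non-word characters, dropping empty pieces.
    words = [w for w in re.split(r"\W+", marked) if w]
    # 3. Convert the placeholders back to the original operators.
    return [WORD_TO_OP.get(w, w) for w in words]

print(tokenize_via_placeholders("1 2 + 4 -"))
```

The round trip through placeholders is the cost of this approach: the extra mapping has to be kept in sync with the operator set.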
Oops. There’s supposed to be a hyphen between the 5 and the 6 of the input. I don’t know how that got left out.
The trick is catching that hyphen when it’s between two word characters and splitting on it, but NOT splitting on a hyphen that is adrift in white space.
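One way to sidestep the split entirely is to match the tokens instead of the separators. The Python sketch below is only an illustration under the assumptions stated in this thread: tokens are runs of word characters, plus any of the four operators when whitespace (or the string boundary) is on both sides. The name `tokenize` is made up for the example.

```python
import re

# A token is either a run of word characters, or a single operator with
# no non-whitespace character directly before or after it.
TOKEN = re.compile(r"\w+|(?<!\S)[-+*/](?!\S)")

def tokenize(text):
    """Return word tokens and standalone operators, skipping everything else."""
    return TOKEN.findall(text)

# The hyphen between 5 and 6 touches word characters, so it is skipped
# as a separator.
print(tokenize("1\x002\x013\n4\r5-6\t7"))
# The trailing hyphen is adrift in whitespace, so it is kept as a token.
print(tokenize("1 2 + 4 -"))
```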
I read this blog series (sorry, it’s in Python) last year that I think will help. It’s about building a simple interpreter; it starts off with simple maths and then moves on to supporting order of operations. It also requires no regex.