BUG: MatchRegex() "\b" doesn't work

In a Formula field, a regex with \b never matches anything.

The above Formula should match these entities, but doesn’t match any:

image

It seems you’re right, but I don’t know why :confused:
In the meantime, for your specific case, could you just use
[\W^]PM[\W$]
instead?

Maybe a Unicode-related bug somewhere? Or escaping/quoting? SQL?

I think it would need to be something more like
(^|\W)PM($|\W)
because ^ and $ don't work as anchors inside a character class - there they are just literals (and a leading ^ negates the class).
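To illustrate the difference between the character-class version and the alternation version, here is a minimal sketch in Python. It uses Python's re module as a stand-in, on the assumption (not confirmed in this thread) that Fibery's regex engine treats character classes the same way. The helper has_match and the sample strings are made up for illustration.

```python
import re

# Character-class version: ^ and $ inside [...] are literals, not anchors,
# so a character is always required on both sides of PM.
CHAR_CLASS = re.compile(r"[\W^]PM[\W$]")
# Alternation version: ^ and $ stay anchors, so PM may sit at either end.
ALTERNATION = re.compile(r"(^|\W)PM($|\W)")


def has_match(pattern, text):
    """Return True if the pattern matches anywhere in text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for sample in ["PM", "5 PM", "at 5 PM today", "RPMs"]:
        print(sample, has_match(CHAR_CLASS, sample), has_match(ALTERNATION, sample))
```

Note that the alternation version consumes the neighbouring non-word character as part of the match, which matters if you use it for replacing rather than just matching.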

You’re probably right - regex is not my strong point to be honest!

Hi! I just came across this issue, and it still does not work, i.e. word boundaries (\b) in Fibery's regex formula functions apparently do not get passed correctly to the regex engine.

Since the issue persists, I figured I would share another workaround, which uses negative lookbehind and lookahead assertions to replace the \b functionality:

Example: ReplaceRegex(Text, "(?<!\w)YOUR_WORD_REGEX(?!\w)", "YOUR_REPLACEMENT")

How it works:

  • (?<!X): this ensures that the match (i.e. YOUR_WORD_REGEX) is not preceded by X. In our example above, X is \w, which means letters, numbers and _ in ASCII*.
  • (?!X): this ensures that the match is not followed by X.

* Unicode versus ASCII
I have tested this with Unicode characters, and the \w matches exactly as expected. For example ReplaceRegex("and😀 and €and 𝐀and andé", "(?<!\w)and(?!\w)", "#") results in #😀 # €# 𝐀and andé as expected.
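The same lookaround idea can be tried outside Fibery. The sketch below is Python, not a Fibery formula, and assumes Python's re module behaves like Fibery's engine for these assertions (in Python 3, \w is Unicode-aware by default). The function name replace_whole_word is hypothetical, and it escapes the word so that it is matched literally rather than as a regex.

```python
import re


def replace_whole_word(text, word, replacement):
    """Replace word only where it does not touch another word character."""
    # (?<!\w): not preceded by a word character; (?!\w): not followed by one.
    pattern = r"(?<!\w)" + re.escape(word) + r"(?!\w)"
    return re.sub(pattern, replacement, text)


if __name__ == "__main__":
    # Same sample as in the post, written with escapes:
    # U+1F600 emoji, U+20AC euro sign, U+1D400 bold A, U+00E9 e-acute.
    sample = "and\U0001F600 and \u20acand \U0001D400and and\u00e9"
    print(replace_whole_word(sample, "and", "#"))
```

The emoji and the euro sign are not word characters, so the "and" next to them is replaced, while the bold A and the é are letters and block the match.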

If you want to widen what counts as part of a word, you can turn the \w into a character class and add characters to it. For example, (?<![\w·-]) would also stop treating the dash and the middle dot as boundaries (the dash goes last in the class so that it is read as a literal rather than as a range).
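Here is a rough sketch of that extended character class, again in Python as a stand-in for Fibery's engine (an assumption). The function replace_word and its extra_word_chars parameter are hypothetical names. The extra characters are escaped before being placed in the class, so the dash cannot be misread as a range.

```python
import re


def replace_word(text, word, replacement, extra_word_chars=""):
    """Replace word unless it touches a word character or an extra character."""
    # Build a class such as [\w\-·]; re.escape keeps "-" from forming a range.
    word_class = r"[\w" + re.escape(extra_word_chars) + "]"
    pattern = "(?<!" + word_class + ")" + re.escape(word) + "(?!" + word_class + ")"
    return re.sub(pattern, replacement, text)


if __name__ == "__main__":
    sample = "rock-and-roll and more"
    print(replace_word(sample, "and", "#"))             # plain \w boundaries
    print(replace_word(sample, "and", "#", "-\u00b7"))  # dash and middle dot join words
```

With plain \w, the "and" inside "rock-and-roll" is replaced. With the dash and middle dot added to the class, only the standalone "and" is.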
