Thanks for reporting, and especially for including code to repro the issue.
We’ve made two changes in this area to address this crash.
The first problem was that isValidString incorrectly reported the string in your example as valid. The provided string is not valid UTF-8, and isValidString will now return false for it.
The second problem was the crash itself: ideally we wouldn’t crash, even when attempting to tokenise an invalid UTF-8 string. CharPointer_UTF8 will now return a substitute codepoint whenever reading a codepoint from the byte stream fails.
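To illustrate the two behaviours described above, here is a minimal standalone sketch. It is not the actual JUCE implementation: decodeOne, isValidUtf8, decodeLossy and replacementChar are hypothetical names, and using U+FFFD as the substitute codepoint is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption for this sketch: U+FFFD stands in for any unreadable sequence.
constexpr char32_t replacementChar = 0xFFFD;

struct Decoded
{
    char32_t codepoint;   // decoded value, or replacementChar on failure
    std::size_t length;   // bytes consumed, always at least 1
    bool ok;              // false if the bytes at this position were malformed
};

// Decodes a single codepoint starting at s[pos]. Never reads past the end.
Decoded decodeOne (const std::string& s, std::size_t pos)
{
    assert (pos < s.size());

    const auto byteAt = [&] (std::size_t i) { return static_cast<std::uint8_t> (s[i]); };
    const std::uint8_t lead = byteAt (pos);

    if (lead < 0x80)
        return { lead, 1, true };   // plain ASCII

    std::size_t extra = 0;
    char32_t cp = 0, minValue = 0;

    if      ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; minValue = 0x80; }
    else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; minValue = 0x800; }
    else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; minValue = 0x10000; }
    else return { replacementChar, 1, false };   // stray continuation or invalid lead byte

    for (std::size_t i = 1; i <= extra; ++i)
    {
        // Truncated input or a byte that is not of the form 10xxxxxx.
        if (pos + i >= s.size() || (byteAt (pos + i) & 0xC0) != 0x80)
            return { replacementChar, 1, false };

        cp = (cp << 6) | (byteAt (pos + i) & 0x3F);
    }

    const bool overlong  = cp < minValue;
    const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
    const bool tooLarge  = cp > 0x10FFFF;

    if (overlong || surrogate || tooLarge)
        return { replacementChar, 1, false };

    return { cp, extra + 1, true };
}

// Validation: true only if every sequence in the string decodes cleanly.
bool isValidUtf8 (const std::string& s)
{
    for (std::size_t pos = 0; pos < s.size();)
    {
        const auto d = decodeOne (s, pos);

        if (! d.ok)
            return false;

        pos += d.length;
    }

    return true;
}

// Lossy decoding: malformed bytes become replacementChar instead of causing a failure.
std::u32string decodeLossy (const std::string& s)
{
    std::u32string out;

    for (std::size_t pos = 0; pos < s.size();)
    {
        const auto d = decodeOne (s, pos);
        out.push_back (d.codepoint);
        pos += d.length;
    }

    return out;
}
```

With this sketch, a string containing a lone 0xFF byte fails isValidUtf8, while decodeLossy still walks the whole string and emits the substitute codepoint in place of the bad byte. Always advancing by at least one byte on failure is what guarantees the loops terminate on arbitrary input.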