Introjucer UTF-8 String Literal Converter bug


#1

I'm running a git modules checkout from today and just tried the converter.
It seems to be buggy:
input: Übertoll
output: CharPointer_UTF8 ("\xc3\x9c\x62\x65rtoll")
input: Ü
output: CharPointer_UTF8 ("\xc3\x9c")
input: bertoll
output: "bertoll"

As the example above shows, the converter escapes some characters unnecessarily (here: "be").


#2

No, it’s working correctly.

If the character that follows an escaped character sequence is a valid hex char, like the ‘b’ and ‘e’ in this case, then it also has to be escaped, to make it clear that it’s not just part of the preceding hex number.
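
To illustrate the rule, here is a minimal C++ sketch of such an escaper. It is not the Introjucer's actual code: `escapeUtf8Literal` is a hypothetical name, and the sketch assumes the input is already UTF-8 and contains no quotes, backslashes, or control characters.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper (not the Introjucer implementation).
// Escapes every non-ASCII byte as \xNN. A hex escape in C++ consumes all
// hex digits that follow it, so any hex-digit character that comes directly
// after an escape is escaped too.
std::string escapeUtf8Literal (const std::string& utf8)
{
    std::string out;
    bool lastWasEscape = false;

    for (unsigned char c : utf8)
    {
        const bool needsEscape = c >= 0x80
                                  || (lastWasEscape && std::isxdigit (c) != 0);

        if (needsEscape)
        {
            char buf[5]; // backslash, 'x', two hex digits, terminator
            std::snprintf (buf, sizeof (buf), "\\x%02x", static_cast<unsigned> (c));
            out += buf;
        }
        else
        {
            out += static_cast<char> (c);
        }

        lastWasEscape = needsEscape;
    }

    return out;
}
```

Given the UTF-8 bytes of "Übertoll", this returns `\xc3\x9c\x62\x65rtoll`, the same text the converter printed: 'b' and 'e' are escaped because each directly follows an escape, and 'r' ends the run because it is not a hex digit.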


#3

[quote=“jules”]No, it’s working correctly.

If the character that follows an escaped character sequence is a valid hex char, like the ‘b’ and ‘e’ in this case, then it also has to be escaped, to make it clear that it’s not just part of the preceding hex number.[/quote]
I'm being a bit picky, but you don't have to escape the following characters at all if you split the literal. The sequence could read:
"\xc3\x9c" "bertoll"
Note that adjacent string literals are concatenated by the compiler, and the escape sequences in each literal are interpreted separately, so the closing quote ends the hex escape. This gives shorter source and, IMHO, more readable code.
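
A small sketch that checks the two spellings produce the same bytes (the names `fullyEscaped`, `splitLiteral` and `sameBytes` are made up for this example):

```cpp
#include <cstring>

// Every hex-digit character after the U-umlaut is escaped, up to the 'r'.
const char fullyEscaped[] = "\xc3\x9c\x62\x65rtoll";

// The closing quote terminates the \x9c escape, so "bertoll" stays plain.
const char splitLiteral[] = "\xc3\x9c" "bertoll";

// Both arrays hold 9 bytes of UTF-8 plus the terminating null.
static_assert (sizeof (fullyEscaped) == sizeof (splitLiteral),
               "both spellings should have the same length");

// Returns true if the two literals are byte-for-byte identical.
bool sameBytes()
{
    return std::memcmp (fullyEscaped, splitLiteral, sizeof (fullyEscaped)) == 0;
}
```

Either form could be passed to CharPointer_UTF8; the split version just keeps the ASCII part readable.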


#4

Ah, I never thought of using extra quotes… Yes, that’d work too!