juce_CharPointer_UTF16.h!


#1

These new classes look great! But if juce_wchar is a 16-bit value (under Windows only I believe) then functions like

juce_wchar CharPointer_UTF16::operator*() const throw();

will incorrectly decode any code point above 0xffff, since UTF-16 stores those as surrogate pairs and the result won’t fit in 16 bits. Unless this is just a transitional step on the way to a final API, or you set juce_wchar to uint32 somewhere in a header I didn’t see, Windows is going to get the “short” end of the stick.
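To illustrate the concern, here is a minimal standalone sketch of surrogate-pair decoding. The helper `decodeUtf16` is hypothetical and not part of JUCE; it only shows why the return type needs more than 16 bits.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a JUCE function): decodes one code point from a
// UTF-16 sequence and advances `index` past the code units it consumed.
inline std::uint32_t decodeUtf16 (const std::uint16_t* units, std::size_t length, std::size_t& index)
{
    const std::uint32_t first = units[index++];

    // A high surrogate followed by a low surrogate encodes a single code
    // point in the range 0x10000..0x10FFFF.
    if (first >= 0xD800 && first <= 0xDBFF && index < length)
    {
        const std::uint32_t second = units[index];

        if (second >= 0xDC00 && second <= 0xDFFF)
        {
            ++index;
            return 0x10000 + ((first - 0xD800) << 10) + (second - 0xDC00);
        }
    }

    // A BMP character, or an unpaired surrogate passed through unchanged.
    return first;
}
```

For the pair 0xD83D 0xDE00 this yields 0x1F600. Squeezing that into a 16-bit return type would leave 0xF600, which is a different character entirely.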


#2

Please ignore these classes for the moment, I’m still working on them.


#3

This no longer compiles:

String s((unsigned long)0);

#4

Yes, that constructor will have to disappear.

Because Android has a wchar_t of 1 byte, and Windows has a wchar_t of 2 bytes, I’m having to create a custom juce_wchar which on those platforms is a typedef of an unsigned int. So any overloads that might cause confusion between a real unsigned int and a juce_wchar will have to be removed.

It shouldn’t be a big deal - just cast your value to an int or int64 instead.
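A rough sketch of why the overload has to go, assuming juce_wchar really is a plain typedef of unsigned int. `MiniString` is a hypothetical stand-in for String, not the real class:

```cpp
#include <string>
#include <type_traits>

// Assumption for illustration: juce_wchar is a plain typedef of unsigned int.
using juce_wchar = unsigned int;

// A typedef does not create a new type, so overloads cannot tell them apart.
static_assert (std::is_same<juce_wchar, unsigned int>::value,
               "juce_wchar and unsigned int are the same type here");

// Hypothetical stand-in for String, only to show the overload set.
struct MiniString
{
    // Builds a one-character string (ASCII only in this sketch).
    explicit MiniString (juce_wchar character) : text (1, static_cast<char> (character)) {}

    // explicit MiniString (unsigned int number);  // would not compile: identical
    //                                             // signature to the constructor above

    explicit MiniString (int number)       : text (std::to_string (number)) {}
    explicit MiniString (long long number) : text (std::to_string (number)) {}

    std::string text;
};
```

With this overload set, `MiniString s ((unsigned long) 0);` is ambiguous, because the argument converts equally well to unsigned int, int and long long. Casting to int or long long picks one constructor explicitly.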


#5

HOORAY!!!

I agree, it’s not a big deal. And since you don’t seem to mind extra typing in exchange for clarity, let me recommend:

template <class IntegralType>
static String fromInteger (IntegralType value) { /*...*/ } // declared inside class String

to avoid seeing a bunch of casts that might not be so easy to search for.
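Something along these lines, sketched as a free function that returns std::string so it compiles on its own. The real thing would presumably be a static member of String; that part is my assumption.

```cpp
#include <string>
#include <type_traits>

// Sketch only: a free-function version of the suggested String::fromInteger,
// using std::string as a stand-in for String.
template <class IntegralType>
std::string fromInteger (IntegralType value)
{
    static_assert (std::is_integral<IntegralType>::value,
                   "fromInteger requires an integral type");

    // std::to_string has overloads for the standard integer types; narrower
    // types such as short are promoted to int before the call.
    return std::to_string (value);
}
```

A call site then reads `fromInteger (someUnsignedLong)`, with no cast, and searching for `fromInteger` finds every conversion.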


#6

I’m excited that Android was mentioned… :)


#7

Yeah seriously…

DROID!!!


#8

Jules:

  1. Can I start banging on String with the new changes?

  2. What’s the recommended way to store strings as utf-8?


#9

  1. …Yeah, but please don’t hassle me with bugs or comments; I’m still stabilising it.
  2. You could use a CharPointer_UTF8 to wrap some storage of your own. I’ll probably create a wrapper class to make that easier, but haven’t done so yet.

#10

Okay…so what will String be? Always juce_wchar, and therefore always utf-32? Or will it be as before: utf-16 on Windows, utf-32 everywhere else?


#11

Probably the same on all platforms, and will most likely default to utf32, but as I said, I’m still tinkering with this and thinking about it. If the O(N) random-access time isn’t a problem, I might make it all utf-8 instead.
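For anyone wondering where the O(N) comes from: utf-8 characters take between one and four bytes, so finding the Nth character means walking from the start. A minimal sketch, assuming valid utf-8 input; `utf8OffsetOfCodePoint` is a hypothetical helper, not a JUCE function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a JUCE function): returns the byte offset of the
// code point at position `index`, or std::string::npos if the string is
// shorter than that. Assumes `utf8` holds valid UTF-8.
inline std::size_t utf8OffsetOfCodePoint (const std::string& utf8, std::size_t index)
{
    std::size_t seen = 0;

    for (std::size_t i = 0; i < utf8.size(); ++i)
    {
        const unsigned char byte = static_cast<unsigned char> (utf8[i]);

        // Continuation bytes look like 10xxxxxx; any other byte starts a
        // new code point.
        if ((byte & 0xC0) != 0x80)
        {
            if (seen == index)
                return i;

            ++seen;
        }
    }

    return std::string::npos;
}
```

With utf-32 the same lookup is a single array index, which is the trade-off being weighed here.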


#12

I have anywhere from hundreds of thousands to millions of strings in my in-memory database, so I will be storing them as utf-8 no matter what. I’m fine with however Juce represents strings for Components and elsewhere (file paths, plugin API, etc). As long as Juce provides conversions between encodings (which it seems you already do), I don’t mind performing a conversion at display time or at the point of call.

TextEditor looks to be the most affected by the removal of array-based indexing into String.


#13

I’ll probably provide a really simple templated string holder object for stuff like that, e.g. “StringBuffer<CharPointer_UTF8>” that can be read and written with Strings.