Std::min in JUCE causing warnings


#1

Unfortunately, there are hundreds of them!

ld: warning: unsigned long const& std::min(unsigned long const&, unsigned long const&)has different visibility (hidden) in /Users/tom/Documents/development/juce/bin/libjucedebug.a(juce_DrawableText.o) and (default) in /Users/tom/Documents/development/rec/unit-tests/build/unit-tests.build/Debug/unit-tests.build/Objects-normal/i386/BufferedAudioSource.o

Clear enough - somewhere JUCE has included some STL-like include file within an anonymous namespace… but, where?

Not a biggie if no immediate solution comes up, I might even just redefine my own min as I only use it in two places.


#2

use the existing jmin instead.


#3

That isn’t going to work - I use STL and that uses std::min (as I’ve discovered since the original post), so by removing the one spot in my code where I use it, I simply get warnings about other spots in STL.

It’s not a huge annoyance, but it’s a potential trap. I really do like to have my code compile with no warnings, and with a mass of warnings each compilation run, I’m eventually going to miss some useful one. There doesn’t seem to be an obvious way to disable this one warning (and that’s another trap, though a much less critical one…)

I do wish (pretty well every day! :-D) that JUCE used STL instead of reinventing the wheel… it’s available for all the platforms that JUCE supports, and is small and has well-defined behaviour. I don’t see the advantage that e.g. String has over std::string - there are extra facilities, yes, but I’d argue they’d be much better off as stand-alone functions that operated on std::string rather than methods on some new class - see Scott Meyer’s seminal article on this topic here.


#4

Cons:

  • STL is hard to learn because any mistake you do just throw a large compiler error that a beginner can’t hardly understand. So using the STL means you’ll just prevent the C++ newbies from taking the bandwagon.
    Worst, some developers just don’t fix the error correctly and hide the “template” magic by specifying the template specialization, something like :
int pos = find<std::vector<int, std::basic_allocator<int, 4, true> > >(vec, iter); // which will break as soon as you change platform, where the allocator doesn't follow the same syntax.
  • STL is not cross platform. There are subtle difference between platforms and this leads to ugly #ifdef PLATFORM workaround, when it doesn’t lead to run-time bugs by themselves (locale, hash map and so on).
  • STL string is a joke. It’s slower than any other string class, including Juce String, (see http://bstring.sourceforge.net/features.html). Any serious code have to write its own error prone “process your string” lot of functions like trim, split, join, search, fromFirst, etc…
  • STL streams are also a joke. They are slow, they don’t split the format from the data in the << operators (so as soon as you have to internationalize your code, you’ve to rewritte your own hacky and buggy printf like solution), and they are stateful (which means that you can’t assume that a stream you get in a function will work the way you expected, unless you write 50 lines on top of your function to save the previous state and restore on leaving).
  • Any text processing with stl::string is a pain to write, a pain to understand when you haven’t written it, and a pain to maintain ( ex: http://labs.trolltech.com/blogs/2009/08/24/with-or-without-qt/ ).
  • STL is too verbose for basic things, so you don’t concentrate on what you want to do, you distract on the STL library itself.

Pros:

  • STL iterators are really useful to write less code, even if it’s very “algorithm-o-matic”, so it’s less of a use for a GUI framework.
  • STL containers are interesting to use. In GUI work, however, you rarely need more than a vector / hash_map.
  • templating the algorithm is very very important.

For me, if Juce had used stl, I’d have looked at Qt instead. I’ve had to maintain a big STL based spaghetti code, and really, I won’t do that anymore.
The defect of STL have almost all been addressed in boost, but still, it’s a hard step to perform for a C++ newbie.

Also, because of the template argument must match exactly, such code doesn’t compile:

int a = 3; uint32 b = 5;
int c = std::min(a, b); // Error a and b are not the same types

while a “int min(int a, int b)” would have compiled due to integer conversion done by the compiler.
So yes, it’s better in terms of code safety, but such useless code-safety leads to ugly code when done by any non-C++ guru:

int c = std::min(a, static_cast<int>(b)); // Which does still fail when b is > 0x80000000, so it's not safer, it's simply uglier.

#5

Reading the second URL, there is an error in the STL code posted, and unless you’re quite efficient, I doubt you’ll see it.


#6

I agree completely with your comments about streams, which are IMHO a failure (i18n and statefulness being indeed the big issues). Ditto with the errors - but those are also a problem with C++ in general.

I really don’t know about the interoperability issues - I haven’t run into any…

I’m also unfond of the verbosity of STL in particular - but again, that’s an issue with C++ in general. STL is intended to be a high-performance, general library and it’s not clear that there’s a better way to do it which still lets you write fast, tight code.

However, I’ve had excellent success using std::string and that chart, while interesting, also shows a lot about the person making the chart.

For example, ref substrings is for me a negative feature, not a positive one - because it makes your strings either not-thread-safe or requiring a mutex, and strings that require a mutex kill the performance of your processor (because each time it synchronizes, the processor potentially has to throw away all its caches). If I recall correctly, std::string went to ref-counted strings for GCC 3 - and then reverted to copied strings, precisely for this reason. (The chart also says that std::string isn’t interoperable with C, which baffles me as I’ve done it numerous times).

std::string can be quite different depending on the vendor. In Scott Meyer’s Effective STL and in other books of his he goes through three different implementations of std::string, at least one of which is optimized for small strings. Generally, you shouldn’t be using stl::string for large strings that you’re editing - there’s stl::rope for that.

At the risk of “argument by authority”, I’d point out that Google uses std::string for everything except long, editable strings, precisely because of performance.

I’m not sure I understand why you’d ever write that first code fragment…? It’s not just that you’d already have typedef’ed the vector, but automatic argument resolution should be enough to figure out which instantiation of find is needed? Sigh, I don’t really love the C++ templating system - though it’s ultra-powerful - but it should work fine in this case.

As for your case when you’re comparing signed and unsigned values… using signed and unsigned numbers together has resulted in some of the most famous bugs in the world (like this one). I would in fact say that it’s almost always a bad idea to use unsigned integers!

I have a love/hate relationship with C++, I confess - given a choice I’d write in Python any day. However, when I need to get performance or close to the hardware, there isn’t any other choice and at that point I might as well geek out entirely and use STL, which is obscure, yes, but a very rigourous framework that lets me IMHO write the ultra-tight performance code.

Anyway, avoiding work I am - back to it!


#7

I agree. The chart was made when GCC was 3.2 or something and by that time, all of STL was COW (with catastrophic performance).

Yes it’s true. However, most C++ dev will blindly use std::__basic__string implementation, which is not made for text processing, because of the poor features string API, and it’s no more COW, and not ptr + size based either. So clearly all strlen are still O(N) and any string based code still use this a lot.

Yes. They don’t use std::string for their web indexer, but for the code they share on opensource code.google.com. Internally they use lockless code (I’ve talked with the guy who made Protocol buffers).
Don’t let me wrong, std::string is a lot better than plain “const char *” code.
When you have something more text oriented, std::string is limited (while juce::String or QString is better).

Yes it should work fine. Any senior C++ developper would have spotted the missing conversion to match the function’s expected template parameter, but a 2 year old C++ developer will likely hack the code around until that specific compiler is happy, just because he doesn’t understand the 32 lines based error message. While maintaining huge STL code, I had numerous place like that to change because of inexperienced C++ developers.
In STL/C++ you can only consider proficient after 5 year of painful work. I think it’s why C++ is declining compared to interpreted language like C# or Java or Ruby.

Please notice that in the code I’ve posted, the “fixed” (understand: warning-less) version for STL still hide the bug, it doesn’t correct it.
In my day to day work I almost never use signed integers. But of course it depends on what you’re doing, and mainly on your dataset.
Unsigned is generally faster as you’ll halve the test required when you enter a function :

A* getUnchecked(int a)
{
   if (a < 0 || a > size) return 0; // Notice range check 
   return array[a];
}

with unsigned:

A * getUnchecked(unsigned a)
{
   if (a > (unsigned)size) return 0; // Even if passing -1 to the function, it get converted to a value (0xFFFFFFFF) > INT_MAX which will be > size if size is signed
   return array[a];
}

#8

A few more tiny comments!

Perhaps 5 years is an exaggeration, though it took me a lot longer than that :smiley: but now you have a ton of good, authoritative references - I keep mentioning Scott Meyer, but his stuff is really readable and fun.

But I essentially agree. You could be writing really nice code in six months with Python, for example (if you could already program). Java, well, that takes quite a lot more time to really master… (I assume Ruby is more similar to Python).

Well, I don’t know if you’re right about that… but I can’t really comment! :wink:

Regarding platform-dependence - this is an issue for me since I’m trying to develop for the Mac and PC. I couldn’t find any web references for this…

I went through the SGI STL source (which I have floating around) and saw nothing platform-dependent in the strings or really, much of anywhere except for threads.

There is no “standard” STL implementation as far as I know. If there were any, it’d be the SGI, but I just checked my XCode path and it appears to be a GCC implementation (which was too complex for me to really see if it were platform-dependent or not).

A* getUnchecked(unsigned a) { if (a > (unsigned)size) return 0; // Even if passing -1 to the function, it get converted to a value (0xFFFFFFFF) > INT_MAX which will be > size if size is signed return array[a]; }

First, that code isn’t actually right :smiley: - it’ll fail if the size of your array is greater than INT_MAX. (You’re claiming that your size is signed, but why? If you’re using unsigned, surely “sizes” are one good place for them?) You use In the case of 32 bit integers, this is unlikely, but people do sometimes make use of massive byte arrays for example. However, if you believe this pattern is correct, you’d be tempted to use it for shorts… or characters…

Using unsigned like this is a trick. It saves a tiny amount of processor time, at the expense of readability.

But I have to say that I have no idea why you’d want to write this method at all. Returning NULL is just the worst possible thing you can do because it causes a problem some indeterminate time later. Putting all this range checking in each array access is not only overhead but prevents the compiler from optimizing in some cases.

In my mind there are two possible classes of errors - data errors and programming errors. You as the programmer should assume your data is always bad and clean up indetectably or at least never get into a bad state. But trying to access an array outside of its bounds is a programmer error and at that point the state of your program could be anything at all…

So I’d do pretty well what Jules does in JUCE - which is to have a debug-only time check which throws an exception if there’s an array out of bounds access, and perhaps does nothing at all in optimized mode. Fail fast, fail early. If your code is exception-safe, it can catch that exception at a much higher level and either retry the operation, or report this bug to you.

The other things that’s key is whether pointers are nullable or not. The idea that a NULL pointer could mean either a deliberate “pointer left empty” or a mistaken “oops, I goofed with my array arithmetic somewhere” is a generally bad idea and encourages the writing a lot of spurious error handling code that is simply never executed.

Frankly, I rarely if check input parameters to functions and methods - though I check each and every return code of course. Instead, I have strict conditions in the documentation for my functions, lots of unit tests, and in the actual code I just charge ahead as if everything is correct. There are programmer errors, of course, and you get an unexpected state at some point in the program - but that’s always true, and I’ve spent the time I would have spent checking variable values (BORING) in writing unit tests or proving (tiny) code segments correct.


#9

As long as you use basic stuff it’s cross platform. But as soon as you specify allocator, or locale (std::face) code and so on, then it breaks.

Well, I guess I disagree with you unsigned usage. Some things are unsigned by nature (like the size of an array), and they are better written using unsigned.
The issue appears when you interface code which a different semantic (that is a lot of POSIX code, or even third party code, still use signed size value like memcpy, and so on).
You then have to convert to signed or from signed.
Most compiler reduce range checking tests by “hiding” signed to unsigned conversion so they only check against one value (look at asm output of your compiler you’d be amazed how your variable are treated).
However, your code is more verbose, slower to write and with probaly more bugs. The human bug rate is quite stable and is usually one bug in five line of code.

Concerning bounds checking I really hope you check your parameter on function entrance, especially when it comes from hostile user code.
Because, if you don’t, then most code will appear to work (for example, in release mode, you can write to the array_size + 1 with no issue most of time as the allocation return page-aligned memory block) and array_size + 1 is a frequent error.
Also, when a code is full of assert, it doesn’t prevent the code from checking input parameter. For example:

int ChecksumTable(int a)
{
    static int array[36] = { .... };
    assert(a < sizeof(array) / sizeof(array[0]); // If it's not the case, it crash

    return array[a];
}
// This code will trigger an assertion break in debugger, but in release mode, it'll crash, or worst, it will return random stack value that'll go unnoticed.
// So finally you still have to write such code:
int ChecksumTable(int a)
{
    static int array[36] = { .... };
    if (a < sizeof(array) / sizeof(array[0])
    {
        assert(false); // In that case assert is quite useless anyway, if you check function return code 
        return 0; // Or throw, whatever, be coherent with your error handling policy
    }
    return array[a];
}

You can’t test all the possible input value of your code (especially when the variable comes from user) as soon as it’s quite important, and if you don’t add parameter checking, your code go break instantly.