SOUL arrays vs vectors

Are there circumstances where holding primitive types in an array has advantages over using a vector?

It seems that everything that can be done with an array can also be done with a vector, correct?
Is the ‘only’ difference between the two that a vector supports arithmetic operations?
I’m not sure why we are given the choice between a float[] array and a float<> vector.

We spent a lot of time debating whether we should merge the two types, and decided they had different enough semantics that it works better if we keep them separate.

There are a few differences - vectors can’t be dynamically indexed, and can’t be as big as arrays. These restrictions mean we can use LLVM’s vector intrinsics, which generate better code (but that’s not in itself a reason for doing it this way). And we’ll provide shuffle intrinsics for vectors that probably aren’t useful for arrays.

But mainly it just works better when you’re writing code to distinguish the concept of vectors from arrays - especially when you start creating arrays of vectors for audio data, it’s less confusing than an array-of-array would be.
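To make the array-of-vectors point a bit more concrete, here is a rough C++ analogy (not SOUL code). StereoFrame, Block, NestedBlock and mix are hypothetical names invented for this sketch; StereoFrame stands in for a small by-value vector such as float<2>, and none of this reflects how SOUL is actually implemented.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a small by-value vector such as SOUL's float<2>.
struct StereoFrame
{
    float left;
    float right;
};

// Element-wise addition belongs to the frame type, not to the container.
inline StereoFrame operator+ (StereoFrame a, StereoFrame b)
{
    return { a.left + b.left, a.right + b.right };
}

// A block of audio as an array of frames: the index always means "which frame".
template <std::size_t N>
using Block = std::array<StereoFrame, N>;

// The nested-array alternative: both levels are plain arrays, so nothing in
// the type says which index is the channel and which is the frame.
template <std::size_t N>
using NestedBlock = std::array<std::array<float, 2>, N>;

// Mixes two blocks frame by frame using the frame type's own operator+.
template <std::size_t N>
std::array<StereoFrame, N> mix (const std::array<StereoFrame, N>& a,
                                const std::array<StereoFrame, N>& b)
{
    std::array<StereoFrame, N> out {};

    for (std::size_t i = 0; i < N; ++i)
        out[i] = a[i] + b[i];

    return out;
}
```

With Block, the loop body reads as "add two frames"; with NestedBlock the same loop would need a second index or a helper for the inner array, which is roughly the confusion described above.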


I know I’m coming late to this thread, but I’m curious why the names “array” and “vector” were chosen, as they are kind of the opposite of the C++ convention, where “array” is the built-in language data type that is more primitive / less flexible and “vector” is the standard library class that is more powerful. In SOUL, array and vector are reversed. According to the first paragraph of https://github.com/soul-lang/SOUL/blob/master/docs/SOUL_Language.md , SOUL tries to be similar to existing languages where feasible. I suggest reversing the terms. The good thing is that this will not (as far as I can tell) require any changes to language syntax or affect any existing SOUL programs; it is just a change to terminology in the language guide. I think that making this change in terminology now will help decrease the effort for first-time programmers coming to SOUL from C++, and will result in less ongoing mental stress to remember which is which.

That’s an interesting point! I guess our thinking is based on:

  • The name “array” is definitely the right one for those types - it’s the absolute standard name for “things that have a square bracket size” in pretty much every language, including C++
  • So then we needed a name for the other type: small by-value SIMD primitives which we expect the compiler to vectorise, which will probably be used for things like mathematical vectors, for which the LLVM IR type is “vector”, and which are very similar to GLSL vectors… so we kind of settled on “vector” without giving it a second thought!

If you can think of a better name for vector then great! But I think swapping the two terms would be confusing both for non-C++ programmers who’ve never heard of a vector, and for C++ programmers who think of vectors as dynamic containers, compared to our statically sized arrays.

We did have a very long discussion about whether vectors should exist at all - i.e. whether we should just have arrays, and then automatically optimise small ones to do what vectors do, but decided their use-cases were sufficiently different to warrant the two types.


If vectors are kept, then I don’t have a different name to propose, other than the swap already mentioned.

However, given the decision to keep the array name, I suggest you revisit that long discussion about removing vectors and then do remove them. I re-read SOUL_Language.md and, from what I can tell, deleting vectors would be a fairly straightforward effort, as follows:

  • temporarily allow <> as a synonym for [], so that existing code keeps working until it transitions to the [] syntax
  • at definition of stream, change second sentence to “The type must be a scalar”.
  • clean up definition of scalar to be primitive or array[compile-time constant] of primitive; it is currently inconsistently:
    • primitiveType(T) returns the scalar type of a primitive or vector
    • isScalar(T) returns true if the type is a scalar type (i.e. a vector or primitive of integer floating points)
      That parenthetical is meaningless.
  • change Vector Intrinsics to Array Intrinsics, defined for array[compile-time constant] of numeric
    Note that sum and product are currently incorrectly permitted for bool, according to that section.
  • finally, remove vector

I think you’ll find that you’ll be glad to be rid of vectors. They are currently a worthless appendage, much like C’s “register” keyword. Now is the time to remove vectors; you’ll regret them later … especially that <> syntax.

When we discussed it, I was also arguing that we should get rid of vectors, but Ces made a good argument to the contrary that I’ve now forgotten! Maybe he can remember it…

I think we’ve both swung back and forth about getting rid of vectors. There used to be significantly more distinction between vectors and arrays - you couldn’t perform non-constant element access to vectors, for example, and we’ve also relaxed rules around .at() vs [ ] access to elements of both arrays and vectors.

Since this is not an OO language, it’s impossible for users to add data types and make them use natural operators. If a user wanted, say, a complex number, then there is no simple way for them to make these appear naturally in SOUL.

Vector is an example of this - it supports concepts like scalar multiplication, and without it you’d have to write a helper function and your code wouldn’t look as simple. Within DSP I think it’s a common enough concept, especially for representing frames.
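As a rough illustration of the difference in how the code reads, here is a C++ analogy (not SOUL code). Vec and scaled are hypothetical names made up for this sketch: Vec plays the role of a built-in vector type that understands scalar multiplication, while the plain array has to go through a named helper. In C++ the user can write the operator themselves, which is exactly what is being described as unavailable to SOUL users.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a built-in vector type such as SOUL's float<N>.
template <std::size_t N>
struct Vec
{
    std::array<float, N> v;
};

// Scalar multiplication as an operator: callers can write `frame * gain`.
template <std::size_t N>
Vec<N> operator* (Vec<N> a, float gain)
{
    for (auto& x : a.v)
        x *= gain;

    return a;
}

// The same operation on a plain array needs a named helper function,
// so callers have to write `scaled (frame, gain)` instead.
template <std::size_t N>
std::array<float, N> scaled (std::array<float, N> a, float gain)
{
    for (auto& x : a)
        x *= gain;

    return a;
}
```

Both functions do the same work; the only difference is at the call site, where `frame * 0.5f` reads like the maths and `scaled (frame, 0.5f)` does not.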

And talking of vectors, I’ve been exploring adding multi-dimensional support for matrix operations. That’ll be helpful for ML-derived algorithms…