Custom ListBox selection method

I need support for a ListBox that can have hundreds of thousands of items in it. One of the bottlenecks for these huge lists is managing a selection - a SparseSet won’t cut it. I have my own implementation for dealing with a huge selection that most operations run in constant time (with some running in O(N) where N ~= selection size), in exchange for some per-item storage.

Would it be possible to make the necessary adjustments to ListBox (and TableListBox) so that the selection related calls are virtual and overridable?

Or should I just make my own copy of ListBox and TableListBox and diverge? What is planned in terms of improvements to these classes?

Reason I am asking now is because stuff like this just isn’t gonna work for me:

class TableListBoxModel
{
//...
    virtual const String getDragSourceDescription (const SparseSet<int>& currentlySelectedRows);

If I have to fork my own ListBox then I won’t need the drag/drop changes to ListBox

I’ve no plans to add anything like that… But perhaps a better approach would be to make SparseSet smarter?

The current SparseSet obviously works best when you have a small number of contiguous ranges, but perhaps it should be able to change its internal structure when its content becomes highly fragmented? If it could decide when to change from using ranges to using a BigInteger, that’d use only one bit per item, so would be at least as efficient as any other mechanism you could use.

Interesting approach I will have to think about that. What sticks out is that iterating over the selected elements might be a little bit painful.

As far as efficiency goes I’m not so much concerned about the storage, but more about the performance.

Looking at ListBox, I’m a little bit confused about how sorts interact with the selection. Is it the case that the SparseSet representing the selection is based on visible row indexes? It seems that way. This means that any sort would by necessity have to re-map the selection, or else functions like selectRangeOfRows() would not work properly. After a sort, lastRowSelected is not updated either, and there seems to be no way to change it from the model…

How are people dealing with multiple selection in a sorted TableListBox?

Jules what do I do with the sort? I’m kind of stuck…unsure how to proceed

Maybe before sorting, set a ‘selected’ flag in each of your source items, based on the state of the SparseSet. Then sort them. Then reverse the process and create a new SparseSet from the new order.

Can’t really think of a better way than that…

I would really love it if the ListBox and TableListBox were refactored so that everything that deals with the selection passed through a tight set of overridable virtual members and I could provide my own implementation…

Remapping the SparseSet on a sort is really not an option, Sometimes a stream of fresh data is added to the list once per second and if there’s both a sort, and a selection, this will get ugly really fast.

Using a BigInteger and 1 bit per row to indicate the selected status is even worse for this…

Instead of calling the ListBoxModel with a SparseSet, it would be nice if the model had to call the ListBox to retrieve the SparseSet (i.e. listBox.getSelectedRowSet()) this way in my model I would not call that routine I would call my own functions that implement the constant-time selection algorithms.

Yeah, I hear you. If you want to suggest a sensible interface for doing that, am happy to discuss it.

Consider the benefits of this alternative implementation of selections:

  • The model provides a small amount of storage associated with each logical row
  • Each logical row is initialized with an integer ‘selection timestamp’ = 0
  • The list has an integer ‘selection timestamp’ = 1
  • List has a boolean ‘invert’ indicating if the selection is inverted

A row is considered selected if

  • row timestamp equals list timestamp, and invert == false, OR
  • row timestamp does not equal list timestamp and invert == true

To add rows to the selection:

  • set row timestamp = list timestamp if invert == false OR
  • set row timestamp = 0 if invert == true

To remove a row from the selection

  • set row timestamp = 0 if invert == false OR
  • set row timestamp = list timestamp if invert == true

To deselect all rows:

  • increment list timestamp, set invert == false

To select all rows

  • increment list timestamp, set invert == true

To invert the selection:

  • set invert = ! invert

How to iterate over rows? Maintain a doubly linked list using the per-row storage. When invert == true there is a little extra trickery needed but its straightforward (the list becomes the list of rows to skip, and must be sorted first).

Advantage of this algorithm:

* All operations execute in constant time regardless of list size (HOLY SCHITTE!!!)
* The selection is preserved after a sort

Disadvantage:

  • Requires refactoring of ListBox and TableListBox
  • Passing a copy of the selected rows does not execute in O(1) time (have to build the SparseSet at that point)

This technique is not perfectly suitable for general use so it wouldn’t be ideal to throw out the SparseSet implementation, but it would be incredibly convenient to have them both available, that is why I suggest a refactoring.

I have a feeling you will like this technique :slight_smile:

Okay well then I will fork my own copy of ListBox and TableListBox and move forward and see what I can do

I’m traveling for the week so no progress on this but two things come to mind:

  1. Seems like you’ve already neatly grouped all of the selection operations together in a small set of member functions on ListBox. I think just making those routines virtual would be 90% of the work

  2. I think that any change is going to have to combine sorting and selection into one interface, for the simple reason that when the sort changes, without knowledge of the sort, any selection method is going to have to perform some kind of expensive remapping operation.