I have a SQLite database that contains a huge set of log messages.

I want to display this in a list view (using wxWidgets).

The user can reorder the list (by pressing the column header), apply a filter on the result set and navigate through it as a usual list, using the scroll bar. The user can also select one or multiple entries in the list and delete them.

I have a virtual list model: the list view asks the model for the content of a particular row. The model issues a select-query with the current filter-conditions and order and returns the corresponding row from the result.

To make it faster, I keep page caches of results: when a row is requested, I fetch a whole page (~100 rows) using LIMIT and OFFSET and return the particular row from that page. I store a number of pages, and the next time a row is requested I first check whether it is available in one of the cached pages. This technique has proven to be fast and responsive even with lots of entries (50k+).
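
For readers who want to see the idea in code, here is a minimal Python sketch of such a page cache. It is only an illustration of the technique described above, not my actual implementation: the table `log`, its columns, the `PageCache` class and the `MAX_PAGES` limit are all made-up names.

```python
import sqlite3
from collections import OrderedDict

PAGE_SIZE = 100  # rows per page, as in the description above
MAX_PAGES = 8    # hypothetical cache capacity


class PageCache:
    """Serves single rows to a virtual list by fetching whole pages."""

    def __init__(self, conn, where="1", order="id"):
        # `where` and `order` are SQL fragments built by the model itself,
        # never raw user input.
        self.conn = conn
        self.where = where
        self.order = order
        self.pages = OrderedDict()  # page number -> list of rows (LRU order)

    def get_row(self, row):
        page_no, index = divmod(row, PAGE_SIZE)
        page = self.pages.get(page_no)
        if page is None:
            sql = (f"SELECT id, message FROM log WHERE {self.where} "
                   f"ORDER BY {self.order} LIMIT ? OFFSET ?")
            page = self.conn.execute(
                sql, (PAGE_SIZE, page_no * PAGE_SIZE)).fetchall()
            self.pages[page_no] = page
            if len(self.pages) > MAX_PAGES:
                self.pages.popitem(last=False)  # evict least recently used
        else:
            self.pages.move_to_end(page_no)     # mark as recently used
        return page[index] if index < len(page) else None


# Small in-memory demo with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, message TEXT)")
conn.executemany("INSERT INTO log (message) VALUES (?)",
                 [(f"message {i}",) for i in range(1000)])
cache = PageCache(conn)
print(cache.get_row(250))
```

A request for a row beyond the end of the result set returns `None` here; a real model would rather report the correct row count up front.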

The problem

My problem is how to handle updates/inserts/deletes. I have one trigger for each, so the model is notified whenever an insert/update/delete happens. The trigger also tells the model the ID (primary key) of the affected entry.
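
To make the setup concrete, here is one way such triggers can reach application code, sketched in Python. This is an assumption about the mechanism, not a copy of my code: it registers a user-defined SQL function (here called `notify`, a made-up name) and calls it from one trigger per operation. The table `log` is hypothetical as well.

```python
import sqlite3

events = []  # stands in for the model's notification handler


def notify(kind, row_id):
    # Called by SQLite from inside the triggers below.
    events.append((kind, row_id))


conn = sqlite3.connect(":memory:")
conn.create_function("notify", 2, notify)
conn.executescript("""
    CREATE TABLE log (id INTEGER PRIMARY KEY, message TEXT);

    -- TEMP triggers exist only on this connection, which is also the
    -- only connection where notify() is registered.
    CREATE TEMP TRIGGER log_ins AFTER INSERT ON log
        BEGIN SELECT notify('insert', NEW.id); END;
    CREATE TEMP TRIGGER log_upd AFTER UPDATE ON log
        BEGIN SELECT notify('update', NEW.id); END;
    CREATE TEMP TRIGGER log_del AFTER DELETE ON log
        BEGIN SELECT notify('delete', OLD.id); END;
""")

conn.execute("INSERT INTO log (message) VALUES ('started')")
conn.execute("UPDATE log SET message = 'running' WHERE id = 1")
conn.execute("DELETE FROM log WHERE id = 1")
print(events)
```

Note that the callback only receives the primary key, which is exactly why the two questions below come up: the key alone says nothing about the row's position in the current view.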

My first version simply made a complete reset of the model after each trigger. This was not very fast, but fast enough. The problem was that if the user had made a selection of one or a couple of rows, the selection was lost.

The base class of the model (wxDataViewVirtualListModel) contains methods that should be called when a change happens:

  • RowInserted (row)
  • RowDeleted (row)
  • RowChanged (row)

If I used them the selection problem would be solved, however there are problems:

  • How do I know if the changed row is within the currently filtered set?
  • How do I know which row in the list view was affected?

The first problem could be solved by creating a method that checks whether the entry belongs to the set. It must behave exactly like the SQL conditions, but it is doable.

For the second problem, I simply have no idea how to solve it.

I've used a bogus row number (0 or the last row) to force the view to be updated, but if the row was inserted or deleted before the selection, the selection points to the wrong rows afterwards, and so on.

How would you do it? Keep an advanced data structure with all entries in memory?

This question is related to another question: http://stackoverflow.com/questions/821506/display-large-result-set

+1  A: 

I would design this around two different SELECT operations: one getting only the primary key and a timestamp (of the INSERT / UPDATE) for all rows, the other getting all data for a single page of rows. On a modern machine, keeping the complete list of primary keys and timestamps in memory should not be a problem even for several hundred thousand rows.
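
A rough sketch of the two queries in Python, assuming a table `log` with an integer primary key `id` and a `modified` timestamp column (both names are placeholders, as are the helper functions):

```python
import sqlite3


def fetch_keys(conn, where="1", order="id"):
    """First SELECT: only (primary key, timestamp) for every matching row."""
    # `where` and `order` are SQL fragments built by the model itself.
    sql = f"SELECT id, modified FROM log WHERE {where} ORDER BY {order}"
    return conn.execute(sql).fetchall()


def fetch_page(conn, ids):
    """Second SELECT: full data for one page of primary keys."""
    if not ids:
        return []
    marks = ",".join("?" * len(ids))
    rows = conn.execute(
        f"SELECT id, modified, message FROM log WHERE id IN ({marks})",
        ids).fetchall()
    by_id = {row[0]: row for row in rows}
    # IN (...) does not preserve order, so restore the display order.
    return [by_id[i] for i in ids if i in by_id]


# Small in-memory demo with the hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, "
             "modified INTEGER, message TEXT)")
conn.executemany("INSERT INTO log (modified, message) VALUES (?, ?)",
                 [(i, f"message {i}") for i in range(500)])

keys = fetch_keys(conn, order="id DESC")      # full key list, kept in memory
page_ids = [pk for pk, _ts in keys[100:200]]  # rows 100..199 of the view
page = fetch_page(conn, page_ids)
print(page[0])
```

Because the page is looked up by key rather than by OFFSET, the position of each row in the view comes from the in-memory key list and not from the database.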

Whenever the filter criteria change or a trigger fires, I would retrieve the list of primary keys and timestamps again. The model maintains a list of primary keys and matching timestamps, and a comparison between the model's list and the new list shows which rows need to be inserted, invalidated or removed. Cached rows whose timestamp has changed get deleted from the cache; cached rows with the same timestamp need not be retrieved again. The oldest entries of the cache would be removed when it gets too large.
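
The comparison can be sketched like this in Python. The function names and the shape of the row cache (a dict keyed by primary key) are assumptions made for the example:

```python
def diff_keys(old, new):
    """Compare two lists of (primary key, timestamp) tuples.

    Returns (inserted, changed, removed) as lists of primary keys.
    """
    old_ts = dict(old)
    new_ts = dict(new)
    inserted = [pk for pk in new_ts if pk not in old_ts]
    removed = [pk for pk in old_ts if pk not in new_ts]
    changed = [pk for pk in new_ts
               if pk in old_ts and old_ts[pk] != new_ts[pk]]
    return inserted, changed, removed


def invalidate(row_cache, changed, removed):
    """Drop stale rows; rows with an unchanged timestamp stay cached."""
    for pk in changed + removed:
        row_cache.pop(pk, None)


old = [(1, 10), (2, 10), (3, 10)]
new = [(1, 10), (3, 12), (4, 12)]
inserted, changed, removed = diff_keys(old, new)

row_cache = {1: "row 1", 2: "row 2", 3: "row 3"}
invalidate(row_cache, changed, removed)
print(inserted, changed, removed, row_cache)
```

This version only compares membership and timestamps. Translating the result into row positions for the view additionally needs the index of each key in the old and new lists.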

The selection of the list can be identified via its primary key value, so unless the row has been deleted it is always possible to reselect it after changes to the model, even when it is now in a completely different position. I find this to be much more intuitive than keeping the same row position when the ordering changes, which selects a completely different row.
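
As a small illustration, restoring a selection after the key list has been refreshed could look like this (the `reselect` helper is a made-up name, and `keys` is the list of (primary key, timestamp) tuples in display order):

```python
def reselect(selected_ids, keys):
    """Map selected primary keys to their row numbers in the new list.

    Keys that no longer appear in the list (deleted or filtered out)
    are silently dropped from the selection.
    """
    position = {pk: row for row, (pk, _ts) in enumerate(keys)}
    return sorted(position[pk] for pk in selected_ids if pk in position)


keys = [(7, 1), (3, 1), (9, 1)]   # new display order
selected_ids = {3, 9, 42}         # 42 has been deleted in the meantime
rows = reselect(selected_ids, keys)
print(rows)
```

The resulting row numbers can then be handed back to the list control as the new selection.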

Edit:

This also works for concurrent data changes from other database clients; I have implemented it this way for applications using the Firebird database server. If there is no way the data can be changed from outside, it may not be necessary to always retrieve the full list of primary keys and timestamps.

mghie
Thank you. I will do something like this.
Jonatan