views:

836

answers:

4

Hi,

Good morning !

What is the best way to remove duplicate records from grid control? I use Delphi 2009 and devEx quantumGrid component.

I tried looping through all the records and when a duplicate record is found then add it to list and apply filter on grid. I found this as time consuming logic. There are also two downsides of this approach.

[1] When the duplicate records are considerably more say 10K records then applying filter takes lot of time, because of lot of entries to filter out.

[2] Looping through all the records is itself time consuming for big result set like 1M rows.

SQL query returns me distinct rows, but when the user hides any column in grid, then it resembles as if there are duplicate records(internally they are distinct).

Is there any other way of doing this?

Any ideas on this are greatly helpful !

Thanks & Regards, Pavan.

A: 

Checking for duplicates is always a bit tricky, for the reasons you just mentioned. The best way to do it in this particular case is probably to filter before the data reaches the grid.

If this grid is getting its records from a database, try tweaking your SQL query to not return any duplicate records. (The "distinct" keyword can be useful here.) The database server can usually do a much better job of it than you can.

If not, then you're probably loading your result set from some sort of object list. Try filtering the list and culling duplicate objects before you load it into the grid. Then it's over with and you don't have to filter the grid itself. This is a lot less time-consuming.

Mason Wheeler
A: 

I have worked with DevExpress's Quantum Grid for some time and their support form http://www.devexpress.com/Support/Center/ is excellent. When you post questions the DevExpress staff will answer you directly. With that said, I did a quick search for you and found some relevant articles.

how to hide duplicate row values: http://www.devexpress.com/Support/Center/p/Q142379.aspx?searchtext=Duplicate+Rows&p=T1|P0|83

highlight duplicate records in a grid: http://www.devexpress.com/Support/Center/p/Q98776.aspx

Unfortunately, it looks like you will have to iterate through the table in order to hide duplicate values. I would suggest that you try to clean the data up before it makes it to the grid. Ideally you would update the code/sql that produces the data. If that is not possible, you could write a TcxCustomDataSource that will scrub the data when it is first loaded. This should have better performance because you will not be using the grid's api to access the data.

Edit

ExpressQuantumGrid will not automatically hide rows that look like duplicates because the user hid a column. See: http://www.devexpress.com/Support/Center/p/Q205956.aspx?searchtext=Duplicate+Rows&p=T1|P0|83.

Poster

For example, I have a dataset which contains two fields ID and TXT. ID is a unique field and TXT field may contain duplicate values. So when the dataset is connected to the grid with all columns visible, the records are unique. See the image1.bmp. But if I hide the ID column then the grid shows duplicate rows. See the image2.bmp.

DevExpress Team

I'm sorry, but our ExpressQuantumGrid Suite doesn't support such a functionality, because this task is very specific. However, you can implement it manually.

Lawrence Barsanti
Hi, how do i write TcxCustomDataController to scrub the data?
Pavan
My mistake, you need could write a TcxCustomDataSource. Basically, you extend TcxCustomDataSource and override the methods GetRecordCount, GetValue, SetValue, InsertRecord, AppendRecord and DeleteRecord. You don't need to implement them all if your source is read only. The documentation explains how to do this in greater detail. Good luck.
Lawrence Barsanti
Hi larry, i am not sure that how it solves the actual problem of removing duplicate rows. It is obvious that data comes for SQL query. The main thing is how to avoid displaying duplicate rows when a column is hidden in grid. I am afraid if there is no way of doing it on client side.
Pavan
Hi, I am glad to hear that it can be implemented manually. But is there any other way apart from looping all grid records and applying filter? I mean, is there any possibility of avoiding adding duplicate rows to grid (if we know the hidden column, prior to populating grid)?
Pavan
+1  A: 

Can you alter your dataset to not return duplicate records in the first place? I would normally only return the records I want displayed instead of returning unwanted records from the database and then using a database grid to try to suppress unwanted records.

C Harmon
A: 

With thousands of rows I would add an additional field to the DB called say Sum or Hash or if you can't change the DB add a calculated field if it is a ClientDataSet but this carries overhead at display time

Calculate the contents of the hash field with something fast and simple like a sum of all the chars in your text field. All dupes are now easily identified. Add this field to your Unique or Distinct Query parameters or filter out on that.

Just an Idea.

Despatcher
Hi, it sounds good, but i did not understand what is meant by "something fast and simple" . Sum of all chars will not give me unique values.. because string "ba" is differnt from "ab", but sum will give me the same value, if i am not wrong. It would be great and helpful if you have any idea for calculating the value of calculated column.
Pavan
Hi sorry I've been away - but It looks like you must have worked out a way to create a unique vale to represent your strings :) I would probably have used the index of the characters in the string in addition to the character value. Or similar :)
Despatcher