duplicate-data

How to detect duplicate data?

I have got a simple contacts database but I'm having problems with users entering in duplicate data. I have implemented a simple data comparison but unfortunately the duplicated data that is being entered is not exactly the same. For example, names are incorrectly spelled or one person will put in 'Bill Smith' and another will put in 'Wi...

How do i find duplicate values in a table in Oracle?

What's the simplest SQL statement that will return the duplicate values for a given column and the count of their occurrences in an Oracle database table? For example: I have a JOBS table with the column JOB_NUMBER - how can I find out if I have any duplicate JOB_NUMBERs, and how many times they're duplicated? ...

Database Duplicate Value Issue ( Filtering Based on Previous Value)

Earlier this week I ask a question about filtering out duplicate values in sequence at run time. Had some good answers but the amount of data I was going over was to slow and not feasible. Currently in our database, event values are not filtered. Resulting in duplicate data values (with varying timestamps). We need to process that data ...

Merging contacts in SQL table without creating duplicate entries

I have a table that holds only two columns - a ListID and PersonID. When a person is merged with another in the system, I was to update all references from the "source" person to be references to the "destination" person. Ideally, I would like to call something simple like UPDATE MailingListSubscription SET PersonID = @DestPerson WHER...

SQL Query: Remove Duplicates with Caveats

There are multiple delete rows questions here, but this one is a little unique (not difficult, just unique). I have already ready the other thread found here: http://stackoverflow.com/questions/18932/sql-how-can-i-remove-duplicate-rows Question: I have a table with rowID,longitude,latitude,businessName,url, caption. This might look lik...

SQL Server Query Question: If I stop a long running query, does it rollback?

A query that is used to loop through 17 millions records to remove duplicates has been running now for about 16 hours and I wanted to know if the query is stopped right now if it will finalize the delete statements or if it has been deleting while running this query? Indeed, if I do stop it, does it finalize the deletes or rolls back? ...

SQL Duplicate Delete Query over Millions of Rows for Performance

This has been an adventure. I started with the looping duplicate query located in my previous question, but each loop would go over all 17 million records...meaning it would take weeks (just running *select count * from MyTable* takes my server 4:30 minutes => mssql 2005). I gleamed information from this site and at this post: http:/...

Extracting unique items from a list of mappings

Hello, He're an interesting problem that looks for the most Pythonic solution. Suppose I have a list of mappings {'id': id, 'url': url}. Some ids in the list are duplicate, and I want to create a new list, with all the duplicates removed. I came up with the following function: def unique_mapping(map): d = {} for res in map: ...

How to detect duplicate text with some fuzzyness

Some thing ago, I write small script using Text::DeDupe to remove duplicates of blog posts before I have to lay my eyes on them. After reading Syntactic Clustering of the Web paper on which implementation is based, I would love to have ability to find overlapping documents (e.g. snippets of blogs as opposed to full text, maybe also quot...

How to remove duplicate values from an array in PHP

( can't believe this hasn't been asked in SO yet... ) How can I remove duplicate values from an array in PHP? ...

How to remove duplicate values from a multi-dimensional array in PHP

How can I remove duplicate values from a multi-dimensional array in PHP? Example array: Array ( [0] => Array ( [0] => abc [1] => def ) [1] => Array ( [0] => ghi [1] => jkl ) [2] => Array ( [0] => mno [1] => pql ) [3] => Array ( [0] => abc [1] => def ) [4] => Array ...

Suppress the duplicate values in group, SSRS Reports

Hi, I have an SSRS report where the date should be grouped by project category the project code in the category is repeating in side the group how do I suppress the value Please help me to get an idea. Thanks,brijit ...

Duplicating model instances and their related objects in Django / Algorithm for recusrively duplicating an object

I've models for Books, Chapters and Pages. They are all written by a User: from django.db import models class Book(models.Model) author = models.ForeignKey('auth.User') class Chapter(models.Model) author = models.ForeignKey('auth.User') book = models.ForeignKey(Book) class Page(models.Model) author = models.ForeignKey...

How do I find duplicate data in xml document using XQuery?

I have a bunch of documents in a MarkLogic xml database. One document has: <colors> <color>red</color> <color>red</color> </colors> Having multiple colors is not a problem. Having multiple colors that are both red is a problem. How do I find the documents that have duplicate data? ...

How to prevent repeated postbacks from confusing my business layer

I have a web application (ASP.Net 3.5) with a conventional 3 layer design. If the user clicks a button a postback happens, some middle and data layer code runs, and the screen is refreshed. If the user clicks the button multiple times before the first postback is completed my logic gets confused and the app can end up in an invalid state...

SQL - Only select row that is not duplicated

Hi, I need to transfer data from one table to another. The second table got a primary key constraint (and the first one have no constraint). They have the same structure. What I want is to select all rows from table A and insert it in table B without the duplicate row (if a row is0 duplicate, I only want to take the first one I found) ...

Xcode duplicate/delete line

Coming from Eclipse and having been used to duplicate lines all the time, it's pretty strange finding out that Xcode has none such function. Or does it? I know it's possible to change the system wide keybindings but that's not what I'm after. TIA ...

How can I compare two tables and delete the duplicate rows in SQL?

I have two tables and I need to remove rows from the first table if an exact copy of a row exists in the second table. Does anyone have an example of how I would go about doing this in MSSQL server? ...

SQL Unique Record not column?

Is there a way to insert into an SQL database where the whole record is unique? I know you can make primary keys and unique columns, but that is not what I want. What is the best way of doing this without overloading the database? I have seen a sort of subquery where you use "WHERE NOT EXISTS ()" I just want to know the most efficient ...

Compare many text files that contain duplicate "stubs" from the previous and next file and remove duplicate text automatically

I have a large number of text files (1000+) each containing an article from an academic journal. Unfortunately each article's file also contains a "stub" from the end of the previous article (at the beginning) and from the beginning of the next article (at the end). I need to remove these stubs in preparation for running a frequency an...