I am using a text file to store my data records. The data is stored in the following format:

Antony|9876543210
Azar|9753186420
Branda|1234567890
David|1357924680
John|6767676767

Thousands of records are stored in that file. I want to delete a particular record, say "David|1357924680". I am using C; how can I delete a particular record efficiently? Currently I copy the records to a temporary file, omitting the record I want to delete. After that, I truncate the original file and copy the contents of the temp file back into it. I don't think this is efficient. Help me. Thanks in advance.

+6  A: 

Add a column to your data indicating whether each row is valid (1) or deleted (0):

Antony|9876543210|1
Azar|9753186420|1
Branda|1234567890|1
David|1357924680|1
John|6767676767|1

When you want to delete a record, overwrite the single byte:

Antony|9876543210|1
Azar|9753186420|1
Branda|1234567890|0
David|1357924680|1
John|6767676767|1

Branda is now deleted.
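
A minimal sketch of that single-byte overwrite in C, assuming the flag is the last field as in the listing above and that every line fits in 255 bytes. The function name `mark_deleted` and its return codes are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: flips the trailing validity flag of the first live
 * row that starts with `key` (e.g. "Branda|1234567890") from '1' to '0'.
 * Returns 1 if a row was marked, 0 if no live row matched, -1 on I/O error.
 * Assumption: no line is longer than 255 bytes. */
int mark_deleted(const char *path, const char *key)
{
    assert(path != NULL && key != NULL);

    FILE *fp = fopen(path, "r+b");   /* binary mode keeps byte offsets exact */
    if (fp == NULL)
        return -1;

    char line[256];
    size_t key_len = strlen(key);
    int result = 0;

    for (;;) {
        long start = ftell(fp);      /* offset where the next line begins */
        if (start < 0 || fgets(line, sizeof line, fp) == NULL)
            break;

        /* A live row looks like "<key>|1" followed by the line ending. */
        if (strncmp(line, key, key_len) == 0 &&
            line[key_len] == '|' && line[key_len + 1] == '1' &&
            (line[key_len + 2] == '\n' || line[key_len + 2] == '\r' ||
             line[key_len + 2] == '\0')) {
            /* Seek back to the flag byte and overwrite just that byte. */
            if (fseek(fp, start + (long)key_len + 1, SEEK_SET) != 0 ||
                fputc('0', fp) == EOF)
                result = -1;
            else
                result = 1;
            break;
        }
    }

    if (result == 0 && ferror(fp))
        result = -1;
    if (fclose(fp) == EOF)
        result = -1;
    return result;
}
```

With the file above, `mark_deleted("records.txt", "Branda|1234567890")` would change only the one flag byte on Branda's line; the file name is just an example.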

Then add a data file compaction function that rewrites the file, excluding deleted rows. It can run during times of low or no usage so it doesn't interfere with regular operations.
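
Such a cleanup pass could look roughly like this, assuming the flag is the last field on each line and lines fit in 255 bytes. `compact_file` and the temp-file parameter are illustrative names, not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrites `path` without the rows whose last field
 * is "0", going through a temporary file at `tmp_path`.
 * Returns the number of rows dropped, or -1 on error.
 * Assumption: no line is longer than 255 bytes. */
int compact_file(const char *path, const char *tmp_path)
{
    assert(path != NULL && tmp_path != NULL);

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(tmp_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char line[256];
    int dropped = 0, ok = 1;
    while (fgets(line, sizeof line, in) != NULL) {
        size_t len = strcspn(line, "\r\n");   /* length without line ending */
        if (len >= 2 && line[len - 2] == '|' && line[len - 1] == '0') {
            dropped++;                        /* deleted row: do not copy */
            continue;
        }
        if (fputs(line, out) == EOF) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) == EOF)
        ok = 0;
    if (!ok) {
        remove(tmp_path);                     /* leave the original untouched */
        return -1;
    }

    /* Removing first keeps this portable; on POSIX, rename() alone would
     * replace the existing file atomically. */
    remove(path);
    if (rename(tmp_path, path) != 0)
        return -1;
    return dropped;
}
```

The temp file should live in the same directory as the data file so that the rename does not have to cross file systems.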

Edit

The validity column should probably be the first column so you can skip deleted rows more easily.

Robert S. Barnes
+1: If the file is sorted and the records are fixed size, then searching the file can be even more efficient, as you could do a binary search (binary chop). But nice trick.
Martin York
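
To illustrate the comment above: a rough sketch of a binary search over fixed-size records, assuming every record is padded to exactly `rec_size` bytes (newline included) and the file is sorted by name in `strcmp` order. `find_record` and its return codes are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: binary search in a file of fixed-size records.
 * Assumptions: each record occupies exactly rec_size bytes (including the
 * newline), starts with "name|", and records are sorted by name.
 * Returns the record index, -1 if not found, -2 on error. */
long find_record(FILE *fp, size_t rec_size, const char *name)
{
    char buf[128];

    assert(fp != NULL && name != NULL);
    if (rec_size == 0 || rec_size >= sizeof buf)
        return -2;

    if (fseek(fp, 0, SEEK_END) != 0)
        return -2;
    long size = ftell(fp);
    if (size < 0)
        return -2;

    long lo = 0, hi = size / (long)rec_size;   /* search window [lo, hi) */
    while (lo < hi) {
        long mid = lo + (hi - lo) / 2;

        /* Jump straight to record number `mid` and read it. */
        if (fseek(fp, mid * (long)rec_size, SEEK_SET) != 0 ||
            fread(buf, 1, rec_size, fp) != rec_size)
            return -2;
        buf[rec_size] = '\0';
        buf[strcspn(buf, "|")] = '\0';         /* keep only the name field */

        int cmp = strcmp(name, buf);
        if (cmp == 0)
            return mid;
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return -1;
}
```

Once the index is known, the record's flag byte sits at a fixed offset from `index * rec_size`, so marking it deleted takes one seek and one write.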
+5  A: 

I think your approach is a little bit wrong. If you really want to do it efficiently, use a database, for example SQLite. It is a simple-to-use database stored in a single file, yet it offers much of the power of SQL and is very efficient. So adding new entries and deleting won't be a problem (searching will be easy, too). Check it out: http://www.sqlite.org/ . Here is a 3-minute tutorial which explains by example how to do everything you are trying to accomplish here: http://www.sqlite.org/quickstart.html .

inf.ig.sh
+1  A: 

Three suggestions:
1. Do it the way you describe, but instead of copying the temporary file back to the original, just delete the original and rename the temporary file. This should run twice as fast.
2. Overwrite the record with 'XXXXXXX' or whatever. This is very fast, but it may not be suitable for your project.
3. Use a balanced binary tree. This is the 'professional' solution. If possible, avoid programming it from scratch!
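
Suggestion 1 might be sketched like this, assuming records are whole lines that fit in 255 bytes. `delete_record` and the `tmp_path` parameter are names I made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies every line except those equal to `record`
 * into a temp file, then swaps the temp file in by renaming it.
 * Returns the number of lines removed, or -1 on error.
 * Assumption: no line is longer than 255 bytes. */
int delete_record(const char *path, const char *tmp_path, const char *record)
{
    assert(path != NULL && tmp_path != NULL && record != NULL);

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(tmp_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char line[256];
    size_t rec_len = strlen(record);
    int removed = 0, ok = 1;
    while (fgets(line, sizeof line, in) != NULL) {
        size_t len = strcspn(line, "\r\n");   /* length without line ending */
        if (len == rec_len && strncmp(line, record, len) == 0) {
            removed++;                        /* skip the record to delete */
            continue;
        }
        if (fputs(line, out) == EOF) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) == EOF)
        ok = 0;

    if (!ok || removed == 0) {                /* nothing to swap in */
        remove(tmp_path);
        return ok ? 0 : -1;
    }

    /* Delete the original, then rename the temp file. No second copy. */
    remove(path);
    if (rename(tmp_path, path) != 0)
        return -1;
    return removed;
}
```

The data is written only once here, which is where the speedup over copying the temp file back comes from.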

TonyK
+2  A: 

Some simple ideas to improve efficiency a little bit:

  • Instead of copying the temp file back into the original, delete the original and rename the temp file to the original's name (this assumes both are in the same directory)
  • Use an in-memory data structure instead of a temp file to hold the records while copying (you may need to limit its size and use it only as a buffer)
  • Mark records as deleted without removing them from the file, then physically remove the marked records after a certain number of delete operations (your other file operations must then be rewritten to ignore marked records)
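
A rough sketch of the in-memory variant from the second bullet, assuming the whole file fits in memory (a few thousand short records should). `delete_in_memory` is an illustrative name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads the whole file into memory, then rewrites it
 * without the lines equal to `record`.
 * Returns the number of lines removed, or -1 on error.
 * Caveat: a crash between truncating and rewriting loses data. */
int delete_in_memory(const char *path, const char *record)
{
    assert(path != NULL && record != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* Find the file size, then read everything in one go. */
    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }
    char *data = malloc((size_t)size + 1);    /* +1 avoids malloc(0) */
    if (data == NULL) {
        fclose(fp);
        return -1;
    }
    size_t got = fread(data, 1, (size_t)size, fp);
    fclose(fp);
    if (got != (size_t)size) {
        free(data);
        return -1;
    }

    fp = fopen(path, "wb");                   /* truncates the original */
    if (fp == NULL) {
        free(data);
        return -1;
    }

    size_t rec_len = strlen(record);
    int removed = 0, ok = 1;
    const char *p = data, *end = data + size;
    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        const char *next = nl ? nl + 1 : end;          /* start of next line */
        size_t len = (size_t)((nl ? nl : end) - p);    /* without the '\n' */

        if (len == rec_len && memcmp(p, record, len) == 0) {
            removed++;                                 /* skip this line */
        } else if (fwrite(p, 1, (size_t)(next - p), fp) != (size_t)(next - p)) {
            ok = 0;
            break;
        }
        p = next;
    }

    free(data);
    if (fclose(fp) == EOF)
        ok = 0;
    return ok ? removed : -1;
}
```

This saves the temp file but is less safe than the rename approach, since the original is truncated before the new contents are fully written.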
rano
+2  A: 

I would suggest a solution similar to the one "Robert S. Barnes" gave.

I would overwrite David|1357924680 with |--------------- (the same number of bytes).

  • No need for extra bytes (not much of a benefit)

  • The data is really deleted. This is useful when security requirements call for it.

Some time later (daily, weekly, ...), clean up the file the same way (or a similar way) as you do now.
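
The in-place overwrite could look something like this, assuming records are whole lines of at most 255 bytes. `wipe_record` is a made-up name, and readers of the file would have to skip lines that start with `|`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: overwrites the first line equal to `record` with
 * '|' followed by dashes, keeping the byte count (and file size) the same.
 * Returns 1 if a record was wiped, 0 if not found, -1 on I/O error.
 * Assumption: no line is longer than 255 bytes. */
int wipe_record(const char *path, const char *record)
{
    assert(path != NULL && record != NULL);

    FILE *fp = fopen(path, "r+b");   /* binary mode keeps byte offsets exact */
    if (fp == NULL)
        return -1;

    char line[256];
    size_t rec_len = strlen(record);
    int result = 0;

    for (;;) {
        long start = ftell(fp);      /* offset where the next line begins */
        if (start < 0 || fgets(line, sizeof line, fp) == NULL)
            break;

        size_t len = strcspn(line, "\r\n");
        if (rec_len == 0 || len != rec_len || strncmp(line, record, len) != 0)
            continue;

        /* Found it: seek back and overwrite exactly rec_len bytes. */
        result = 1;
        if (fseek(fp, start, SEEK_SET) != 0 || fputc('|', fp) == EOF)
            result = -1;
        for (size_t i = 1; result == 1 && i < rec_len; i++)
            if (fputc('-', fp) == EOF)
                result = -1;
        break;
    }

    if (result == 0 && ferror(fp))
        result = -1;
    if (fclose(fp) == EOF)
        result = -1;
    return result;
}
```

Because the line ending is left alone, every other record keeps its offset and nothing else in the file has to move.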

Notinlist
A: 

Since you can't remove bytes from the middle of a file in place, you have to resort to a method similar to what you are doing now.

As a few others have mentioned, maintaining a proper data structure and only writing it back at intervals would improve efficiency.

Gunner