I am new to database programming and would like some tips on performance and best practices. I am parsing some websites to scrape television episode info and store it in an MS SQL 2008 R2 relational database.
Let's say I have a table filled with records of type Episode. Each time I run a new parse, I generate a new list of Episodes, and I want the database to end up matching that new list exactly. Currently I'm doing a mass delete-all followed by an insert-all. The problem is I'm not sure this is the best approach, especially since I'm concerned about data persistence (the episode_id primary key values should stay the same for long periods of time).
Is there an easy way to insert any new episodes into the table, update any that have changed, and delete any that no longer exist, so that the end result exactly matches the new list of episodes? An episode would be matched on its series ID, season number, and episode number.
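For what it's worth, SQL Server 2008 has a MERGE statement that can do the insert, update, and delete in a single pass, and because matched rows are updated in place rather than deleted and re-inserted, the existing episode_id keys survive. Below is a minimal sketch, assuming a hypothetical dbo.Episodes table keyed on (SeriesId, SeasonNumber, EpisodeNumber) and a #NewEpisodes temp table that has already been loaded with the freshly parsed rows on the same connection; all table and column names here are illustrative, not the real schema:

using System.Data;
using System.Data.SqlClient;

public static class EpisodeTableSync
{
    public static void SyncEpisodes(SqlConnection conn, int seriesId)
    {
        // MERGE compares the freshly parsed rows (#NewEpisodes) against the
        // existing table, keyed on series/season/episode.
        const string sql = @"
            MERGE dbo.Episodes AS target
            USING #NewEpisodes AS source
                ON  target.SeriesId      = source.SeriesId
                AND target.SeasonNumber  = source.SeasonNumber
                AND target.EpisodeNumber = source.EpisodeNumber
            WHEN MATCHED AND target.Title <> source.Title THEN
                UPDATE SET target.Title = source.Title
            WHEN NOT MATCHED BY TARGET THEN
                INSERT (SeriesId, SeasonNumber, EpisodeNumber, Title)
                VALUES (source.SeriesId, source.SeasonNumber, source.EpisodeNumber, source.Title)
            WHEN NOT MATCHED BY SOURCE AND target.SeriesId = @seriesId THEN
                DELETE;";

        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.Add("@seriesId", SqlDbType.Int).Value = seriesId;
            cmd.ExecuteNonQuery();
        }
    }
}

The AND target.SeriesId = @seriesId filter on the delete branch keeps the statement from removing episodes that belong to other series.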
Edit:
A Series type contains several lists of different episode types, for instance (a rough sketch of the type follows the list below):
List<TVDBEpisode>
List<TVRageEpisode>
List<TVcomEpisode>
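A stripped-down sketch of what that type might look like (property names are illustrative; the real class has more members):

using System.Collections.Generic;

public class Series
{
    public int SeriesId { get; set; }
    public List<TVDBEpisode> TVDBEpisodes { get; set; }
    public List<TVRageEpisode> TVRageEpisodes { get; set; }
    public List<TVcomEpisode> TVcomEpisodes { get; set; }
}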
I would parse a single site at a time, for instance:
public void ParseTVDB(Series ser)
{
    var eps = new List<TVDBEpisode>();
    //... Parse tvdb and add each episode to this list
    //... Make the Series' existing TVDBEpisodes match the new TVDBEpisodes
}

public void ParseTVRage(Series ser)
{
    var eps = new List<TVRageEpisode>();
    //... Parse tvrage and add each episode to this list
    //... Make the Series' existing TVRageEpisodes match the new TVRageEpisodes
}

public void ParseTVcom(Series ser)
{
    var eps = new List<TVcomEpisode>();
    //... Parse tvcom and add each episode to this list
    //... Make the Series' existing TVcomEpisodes match the new TVcomEpisodes
}
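The "make the existing episodes match the new ones" step in each of those methods could be factored into one generic helper. A rough sketch, assuming each episode type exposes its season and episode numbers through a shared interface (names here are illustrative, and updating changed fields on matched episodes is omitted for brevity):

using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative interface -- assumes every episode type can expose its natural key.
public interface IEpisodeKey
{
    int SeasonNumber { get; }
    int EpisodeNumber { get; }
}

public static class EpisodeListSync
{
    // Makes 'existing' match 'parsed': episodes only in 'parsed' are added,
    // episodes only in 'existing' are removed, and episodes present in both
    // are kept as-is so their identities (and database IDs) survive.
    public static void MatchTo<T>(List<T> existing, List<T> parsed) where T : IEpisodeKey
    {
        var parsedKeys = new HashSet<Tuple<int, int>>(
            parsed.Select(e => Tuple.Create(e.SeasonNumber, e.EpisodeNumber)));
        var existingKeys = new HashSet<Tuple<int, int>>(
            existing.Select(e => Tuple.Create(e.SeasonNumber, e.EpisodeNumber)));

        // Drop episodes that no longer appear in the freshly parsed list.
        existing.RemoveAll(e => !parsedKeys.Contains(
            Tuple.Create(e.SeasonNumber, e.EpisodeNumber)));

        // Add episodes that were parsed but are not in the existing list yet.
        existing.AddRange(parsed.Where(e => !existingKeys.Contains(
            Tuple.Create(e.SeasonNumber, e.EpisodeNumber))));
    }
}

Each Parse* method would then just call something like EpisodeListSync.MatchTo(ser.TVDBEpisodes, eps), assuming the episode classes implement the interface, and let the data layer persist the result.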