I want to know how to import an .mdb (MS Office 2007) file into SQL Server 2008 using SSIS. I need this ETL package to check for duplicates so that existing records are not re-inserted and only new records are added. A tutorial link or a step-by-step explanation here would be very helpful.
This is a bit extensive to discuss in a forum like this, but...
Generally I do this by importing the data into a work table (not the one you want the data to end up in). The work table should have its own identity column (useful for separating out duplicates within the incoming data) and a column for your database table's record id. Hopefully the data has some type of id field for each record. If it does, then you should have a mapping table that links the ids from your database to the record ids from the Access database. Then it becomes a simple matter of looking for the Access ids that don't exist in the mapping table and inserting the records associated with them into the production table. As a final step, I insert the new id pairs into the mapping table so the same records are skipped on the next run.
If the data you are receiving has no id field, this is much harder and may be impossible, depending on the nature of the natural key or whether you even have one (Access databases being notorious for not following database design principles). If the closest thing you have to a unique identifier is the name/address combination, how do you know whether John Smith at 10 State Street in Chicago, IL is the same person as John Smith at 25 Main Street, Chicago, IL? He could have moved, or it might be a different John Smith.
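A rough sketch of what that looks like in practice, assuming the only usable key is name plus address. The function names and the three labels ("duplicate", "review", "new") are invented for this example. The point is that a same-name, different-address record cannot be decided by code alone, so the sketch only flags it for a person to look at.

```python
def normalize(text):
    # Lower-case, drop commas and periods, collapse runs of whitespace.
    cleaned = text.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())

def classify(incoming, existing):
    """Return 'duplicate', 'review' or 'new' for one incoming record.

    incoming is a dict with 'name' and 'address'; existing is a list of
    such dicts already in the target table.
    """
    name = normalize(incoming["name"])
    address = normalize(incoming["address"])
    same_name = [r for r in existing if normalize(r["name"]) == name]

    if any(normalize(r["address"]) == address for r in same_name):
        return "duplicate"   # same name and same address: skip it
    if same_name:
        return "review"      # moved, or a different person? cannot tell
    return "new"
```

Anything labelled "review" would go to a holding table or a report rather than straight into production.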
Expanding on HLGEM's answer...
In SSIS, create a new Data Flow task and open the Data Flow tab (the second tab at the top).
Create an OLE DB Source object (there may be a specific one for Access, but the basic OLE DB Source should work), pick Access as your source, and browse to your .mdb file. Without getting really descriptive: it should find the columns, and you can adjust it a little to skip header rows, etc.
Next, create an OLE DB Destination component by dragging it onto the design surface, and connect the green arrow from the source to it. Open the destination component and choose to create a new table. It should automatically generate column names and assign their types based on your .mdb database. In this step you can specify which columns you'd like to be keyed, which will enforce the unique rows you wanted. If you don't have any fields to key on, you will have to explore other options (see HLGEM's post). If you do have a field you can key on, do so.
Now that you have this created, you can specify how you want error output to be handled on the OLE DB Destination. If you choose to redirect the failing rows and push them to a file, all the rows with duplicate keys will end up there.
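This is not SSIS itself, but the redirect behaviour can be sketched in a few lines of Python to show the idea: try to insert each row into a keyed table, and write any row that violates the key to a reject file instead of failing the whole load. SQLite stands in for SQL Server here, and the customers table and its columns are hypothetical.

```python
import csv
import sqlite3

def create_target(conn):
    # The primary key is what makes a duplicate row fail on insert.
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")

def load_with_redirect(conn, rows, reject_file):
    """Insert (customer_id, name) rows; send key violations to reject_file."""
    writer = csv.writer(reject_file)
    inserted = 0
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO customers (customer_id, name) VALUES (?, ?)", row)
            inserted += 1
        except sqlite3.IntegrityError:
            # Duplicate key: redirect the row rather than abort the load.
            writer.writerow(row)
    conn.commit()
    return inserted
```

The reject file then doubles as a list of records that were already present, which is handy for checking that the package skipped what you expected it to skip.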
Hope this brief summary helps!