I have a Biztalk project that imports an incoming CSV file and dumps it to a database table. The import works fine, but I only need to keep about 200-300 records from a file with upwards of a million rows. My orchestration discards these rows, but the problem is that the flat file I'm importing is still 250MB, and when converted to XML using a regular flat file pipeline, it takes hours to process and sometimes causes the server to run out memory.
Is there something I can do to have the Custom Pipeline itself discard rows I don't care about? The very first item in each CSV row is one of a few strings, and I only want to keep rows that start with a certain string.
Thanks for any help you're able to provide.