views: 45

answers: 3

Hi everyone,

So I'm working on a database that I'll be adding to future projects as a sort of supporting DB, but I'm having a bit of an issue with it, especially with the logs.

The database basically needs to be updated once a month: the main table has to be purged and then refilled from a CSV file. The problem is that SQL Server generates a log for it which is MEGA big. I filled it up successfully once, but wanted to test the whole process by purging it and then refilling it.

That's when I get an error saying the log file is full. The log jumps from 88MB (after shrinking via a maintenance plan) to 248MB, and then the process stops altogether and never completes.

I've capped its growth at 256MB, incrementing by 16MB, which is why it failed, but in reality I don't need it to log anything at all. Is there a way to just completely bypass logging on any query run against the database?

Thanks for any responses in advance!

EDIT: Per the suggestions of @mattmc3, I've implemented SqlBulkCopy for the whole procedure. It works AMAZING, except my loop is somehow crashing on the very last chunk that needs to be inserted. I'm not sure where I'm going wrong; heck, I don't even know if this is a proper loop, so I'd appreciate some help with it.

I do know that it's an issue with the very last GetDataTable or SetSqlBulkCopy call. I'm trying to insert 788,189 rows; 788,000 get in and the remaining 189 are crashing...

string[] Rows;

using (StreamReader Reader = new StreamReader("C:/?.csv")) {
    Rows = Reader.ReadToEnd().TrimEnd().Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
}

int RowsInserted = 0;

using (SqlConnection Connection = new SqlConnection("")) {
    Connection.Open();

    DataTable Table = null;

    // Bulk copy full chunks of 1,000 rows at a time.
    while ((Rows.Length - RowsInserted) >= 1000) {
        Table = GetDataTable(Rows.Skip(RowsInserted).Take(1000).ToArray());

        SetSqlBulkCopy(Table, Connection);

        RowsInserted += 1000;
    }

    // Bulk copy whatever is left over (fewer than 1,000 rows).
    Table = GetDataTable(Rows.Skip(RowsInserted).ToArray());

    SetSqlBulkCopy(Table, Connection);

    Connection.Close();
}

static DataTable GetDataTable(string[] Rows) {
    using (DataTable Table = new DataTable()) {
        Table.Columns.Add(new DataColumn("A"));
        Table.Columns.Add(new DataColumn("B"));
        Table.Columns.Add(new DataColumn("C"));
        Table.Columns.Add(new DataColumn("D"));

        for (short a = 0, b = (short)Rows.Length; a < b; a++) {
            string[] Columns = Rows[a].Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);

            DataRow Row = Table.NewRow();

            Row["A"] = Columns[0];
            Row["B"] = Columns[1];
            Row["C"] = Columns[2];
            Row["D"] = Columns[3];

            Table.Rows.Add(Row);
        }

        return Table;
    }
}

static void SetSqlBulkCopy(DataTable Table, SqlConnection Connection) {
    using (SqlBulkCopy SqlBulkCopy = new SqlBulkCopy(Connection)) {
        SqlBulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("A", "A"));
        SqlBulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("B", "B"));
        SqlBulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("C", "C"));
        SqlBulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("D", "D"));

        SqlBulkCopy.BatchSize = Table.Rows.Count;
        SqlBulkCopy.DestinationTableName = "E";
        SqlBulkCopy.WriteToServer(Table);
    }
}
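
One suspicion, though I haven't confirmed it: since each line is split with StringSplitOptions.RemoveEmptyEntries, a row with an empty field produces fewer than four columns, so Columns[3] throws an IndexOutOfRangeException. A minimal repro of that behavior (the sample line is hypothetical):

string[] Columns = "1,,3,4".Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);

// RemoveEmptyEntries drops the empty field, so Columns is { "1", "3", "4" }:
// every later value shifts left and Columns[3] throws IndexOutOfRangeException.
Console.WriteLine(Columns.Length); // prints 3

If any of those last 189 rows has a blank field, that would explain the crash.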

EDIT/FINAL CODE: So the app is now finished and works AMAZING, and it's quite speedy! @mattmc3, thanks for all the help! Here is the final code for anyone who may find it useful:

List<string> Rows = new List<string>();

using (StreamReader Reader = new StreamReader(@"?.csv")) {
    string Line = string.Empty;

    while (!string.IsNullOrWhiteSpace(Line = Reader.ReadLine())) {
        Rows.Add(Line);
    }
}

if (Rows.Count > 0) {
    int RowsInserted = 0;

    DataTable Table = new DataTable();

    Table.Columns.Add(new DataColumn("Id"));
    Table.Columns.Add(new DataColumn("A"));

    // Bulk copy full chunks of 1,000 rows, reusing the same DataTable between chunks.
    while ((Rows.Count - RowsInserted) >= 1000) {
        Table = GetDataTable(Rows.Skip(RowsInserted).Take(1000).ToList(), Table);

        PerformSqlBulkCopy(Table);

        RowsInserted += 1000;

        Table.Clear();
    }

    // Bulk copy the remaining rows (fewer than 1,000).
    Table = GetDataTable(Rows.Skip(RowsInserted).ToList(), Table);

    PerformSqlBulkCopy(Table);
}

static DataTable GetDataTable(List<string> Rows, DataTable Table) {
    for (short a = 0, b = (short)Rows.Count; a < b; a++) {
        string[] Columns = Rows[a].Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);

        DataRow Row = Table.NewRow();

        Row["A"] = "";

        Table.Rows.Add(Row);
    }

    return Table;
}

static void PerformSqlBulkCopy(DataTable Table) {
    using (SqlBulkCopy SqlBulkCopy = new SqlBulkCopy(@"", SqlBulkCopyOptions.TableLock)) {
        SqlBulkCopy.BatchSize = Table.Rows.Count;
        SqlBulkCopy.DestinationTableName = "";
        SqlBulkCopy.WriteToServer(Table);
    }
}
A: 

There is no way to bypass using the transaction log in SQL Server.

Raj More
You do have to use the log - that is true. But you can minimize the impact of the logging by picking alternate recovery models and being strategic about how you insert your data.
mattmc3
+1  A: 

You can set the recovery model for each database separately. Maybe the Simple recovery model will work for you. The Simple model:

Automatically reclaims log space to keep space requirements small, essentially eliminating the need to manage the transaction log space.

Read up on it here.
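
The switch itself is a single command. From .NET it might look like this (just a sketch; the connection string and database name are placeholders):

using (SqlConnection Connection = new SqlConnection("<connection string>")) {
    Connection.Open();

    // "MyDb" is a placeholder database name; run this once, before the monthly load.
    using (SqlCommand Command = new SqlCommand("ALTER DATABASE [MyDb] SET RECOVERY SIMPLE;", Connection)) {
        Command.ExecuteNonQuery();
    }
}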

Aheho
+3  A: 

If you are doing a bulk insert into the table in SQL Server, which is how you should be doing this (BCP, `Bulk Insert`, `Insert Into...Select`, or, in .NET, the SqlBulkCopy class), you can use the "Bulk-Logged" recovery model. I highly recommend reading the MSDN articles on recovery models: http://msdn.microsoft.com/en-us/library/ms189275.aspx

mattmc3
@mattmc3, I thought SqlBulkCopy (which is what I would be using, since the operation is performed by a console app) was supposed to be used from table to table?
Alex
Oh no... it goes from a DataTable to a SQL Server table. You load the data into a System.Data.DataTable object that matches your destination table. You can get the data into the DataTable from a file, from a query, from your business objects... however you want. I recommend getting a chunk of 1000 or so records in there, doing the bulk copy via the SqlBulkCopy object, and then clearing the DataTable out and doing another chunk. Behind the scenes, the SqlBulkCopy object is just using the same facilities as a `Bulk Insert` statement. I have done ETL this way for years and it's fast and simple. Roughly, the loop looks like the sketch below.
mattmc3
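
A compact sketch of the pattern mattmc3 describes (the connection string, table, and column names are placeholders; assumes a destination table with matching columns, plus the System, System.Collections.Generic, System.Data, and System.Data.SqlClient namespaces):

static void BulkCopyInChunks(IEnumerable<string[]> ParsedRows, string ConnectionString) {
    DataTable Table = new DataTable();

    Table.Columns.Add(new DataColumn("A"));
    Table.Columns.Add(new DataColumn("B"));

    foreach (string[] Columns in ParsedRows) {
        DataRow Row = Table.NewRow();

        Row["A"] = Columns[0];
        Row["B"] = Columns[1];

        Table.Rows.Add(Row);

        // Flush every 1,000 rows, then reuse the same DataTable.
        if (Table.Rows.Count == 1000) {
            WriteChunk(Table, ConnectionString);
            Table.Clear();
        }
    }

    // Flush whatever is left (fewer than 1,000 rows).
    if (Table.Rows.Count > 0) {
        WriteChunk(Table, ConnectionString);
    }
}

static void WriteChunk(DataTable Table, string ConnectionString) {
    using (SqlBulkCopy SqlBulkCopy = new SqlBulkCopy(ConnectionString, SqlBulkCopyOptions.TableLock)) {
        SqlBulkCopy.DestinationTableName = "MyTable"; // placeholder table name

        SqlBulkCopy.WriteToServer(Table);
    }
}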
Ok. I'm trying to write the code now. I haven't ever used DataTables before, and I have to remember the good ol' SQL methods because I haven't used them since LINQ came out... :) Anyway, thanks for the help!
Alex
OMG! I just truly realized how much I need to use the bulk stuff. I left my app running (after a change which I thought would speed it up), played some games with friends for like 3 hours, and when I looked at it, it was still running, not even 1/5th of the way done... Man, this is apparently the worst code I've ever written...
Alex