I am investigating an issue relating to a large transaction log expansion during an ETL process, even though the database is set to bulk-logged recovery (and it is genuinely running bulk logged, not pseudo simple).

Using the ::fn_dblog(null,null) function to examine the transaction log operations and the context of each operation, the log expansion is almost entirely down to the logging of the LOP_FORMAT_PAGE operation in an LCX_Heap context (97% of the expansion is that one operation, which appears in the log over 600k times for a single data load).
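
For reference, a breakdown along these lines can be produced with a query like the one below - a minimal sketch only; Operation, Context and [Log Record Length] are standard fn_dblog output columns, but verify the names on your build:

    -- Sketch: aggregate the active log by operation and context to see which
    -- log records account for the growth.
    SELECT
        Operation,
        Context,
        COUNT(*)                 AS LogRecordCount,
        SUM([Log Record Length]) AS TotalBytes
    FROM ::fn_dblog(NULL, NULL)
    GROUP BY Operation, Context
    ORDER BY TotalBytes DESC;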

The question is: what is LOP_FORMAT_PAGE doing / recording that SQL Server has done?

Given that, I should be able to reverse the logic, understand the cause / effect chain that produces this, and alter the ETL if appropriate.

I'm not expecting that many people have come across this one; the level of available detail on these operations and contexts is minimal to none.

+2  A: 

You're correct that this is very thinly (AKA not!) documented. I've done a little poking around inside logs and have done a lot of log-reduction work (mostly by ensuring bulk inserts were actually being done in bulk!). So I know this can be challenging to track down.

My best guess, having seen LOP_FORMAT_PAGE used in context, is that it's clearing out a new page-- for example when splitting an index page once that page is full and another entry needs to be created. So, if this assumption is correct, you may want to track down what may be causing a whole bunch of new pages to get allocated.
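
(One rough way to track that down - a sketch only, assuming the standard fn_dblog columns - is to break the LOP_FORMAT_PAGE records down by allocation unit and see which object is receiving the newly formatted pages:)

    -- Sketch: which allocation unit (object) is getting the formatted pages?
    SELECT
        AllocUnitName,
        COUNT(*)                 AS FormatPageCount,
        SUM([Log Record Length]) AS TotalBytes
    FROM ::fn_dblog(NULL, NULL)
    WHERE Operation = 'LOP_FORMAT_PAGE'
    GROUP BY AllocUnitName
    ORDER BY FormatPageCount DESC;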

Do you know which operations are going on in the ETL while you're seeing the log expansion? It would be helpful to understand this context-- please add that info to your question if possible.

Also, are you able to run and vary your ETL code in a test environment? Instead of figuring out this inscrutable log record definition, it may be easier to isolate the problem by running your ETL while commenting out some steps (or limiting the number of rows affected) and then seeing which change makes the problem go away.

Justin Grant
The context is LCX_Heap, so I am guessing it happens during the ETL load itself and not the index build, where the context would be LCX_Clustered. A heap page shouldn't ever split, only forward rows, so I am thinking it is not a page split / forwarding operation - but I can see your thinking. The code is on a client site behind a firewall, so I cannot get first-hand access to the logs or control the environment - I can only submit scripts. My gut feeling is that bulk-logged mode is not being fully exploited by the SSIS package, but I cannot yet see where.
Andrew
Hmmm, I skipped over the LCX_Heap part initially, thanks for correcting me. Is there any code which is dropping the clustered index as part of the ETL? Or creating tables as heaps initially and then creating clustered indexes later? In other words, are there any "transient heaps" in the ETL even if the final schema is all clustered indexes? Or using the SORT_IN_TEMPDB option when building indexes? Or SELECT....INTO tempdb tables? And so on-- I'm not saying that any of these would cause the problem, just throwing out ideas.
Justin Grant
Also, I've had lousy experiences with SSIS and log growth. In one case it was due to a transaction which got started at the beginning of the ETL and included every operation; I fixed it by setting Transaction=NotSupported on the package steps. In the other case, I had not configured the insertion tasks correctly, so they were not minimally logged. But neither of these seems to match your behaviour, where you're seeing page formats without row insertions in the log. You might also try inserting logging or timestamps into the ETL code, so you can correlate the steps to the transaction log records.
Justin Grant
There is no clustered index drop; the import is a new week of data into a partitioned table. A blank staging table is created in a staging filegroup, SSIS dumps the data into it, and a clustered index is then created to both index the data and move it to the correct filegroup, ready to be switched into the partitioned table. Because the data is moving filegroups, sorting in tempdb is switched off. There are no additional non-clustered indexes created in any part of the process either. As a result I do not believe we have any transient heaps in the process, just the one deliberate heap for loading into.
Andrew
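
(For readers following along: the load pattern Andrew describes is roughly the classic staging-heap load, clustered index build onto the target filegroup, and partition switch. A sketch only - all object, index, filegroup and partition names below are placeholders, not the actual schema:)

    -- 1. SSIS bulk loads into dbo.Staging_Week, a heap on the staging filegroup.

    -- 2. Build the clustered index on the target filegroup, which also moves the data.
    CREATE CLUSTERED INDEX CIX_Staging_Week
        ON dbo.Staging_Week (LoadDate, RowId)
        WITH (SORT_IN_TEMPDB = OFF)
        ON [WeeklyDataFG];

    -- 3. Switch the staging table into the matching, empty partition of the target
    --    (assumes matching schema and a CHECK constraint on the partitioning column).
    ALTER TABLE dbo.Staging_Week
        SWITCH TO dbo.WeeklyData PARTITION 53;
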
I've had the client dump the entire log out after an operation, so I have the non-aggregated data and we can look at the other fields in the log records, but it is a zipped-up 650 MB download, so it is not going to be a quick process to decode. I don't think there will be a definitive answer to this, but I will award you the points / answer for trying to help, and because all of your suggestions are valid for other people hitting this kind of issue in similar but different scenarios.
Andrew
Final answer to this: an exceptionally long backup process meant that the system was still running a backup during the load, forcing the log to treat the data load as a fully logged operation.
Andrew
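
(For anyone hitting the same thing: a data backup running concurrently with a bulk load stops the load from being minimally logged, which is exactly what Andrew found. A quick check for an in-flight backup - a sketch using a standard DMV, SQL Server 2005 onwards:)

    -- Sketch: is a backup running right now?
    SELECT session_id, command, start_time, percent_complete
    FROM sys.dm_exec_requests
    WHERE command LIKE 'BACKUP%';
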
ahhh, makes sense. thanks for following up to resolve this mystery.
Justin Grant
A: 

I think you and Justin are onto the answer, but it is not all that complicated.

The ETL process (Extract, transform, load) is loading data into the db. Naturally, as pages fill up, new ones need to be allocated on the heap.
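
One way to sanity-check that - a sketch only, with a placeholder table name - is to capture the heap's page count before and after the load and compare the delta with the number of LOP_FORMAT_PAGE records:

    -- Sketch: heap page and row counts for the staging table (placeholder name);
    -- run before and after the load and compare the difference.
    SELECT
        OBJECT_NAME(object_id)  AS TableName,
        index_id,               -- 0 = heap
        in_row_used_page_count,
        row_count
    FROM sys.dm_db_partition_stats
    WHERE object_id = OBJECT_ID('dbo.Staging_Week');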

UncleO
The situation is certainly not simple - the log growth is exceeding the data growth by quite a factor, e.g. 2 GB of data expansion has generated 5 GB of log. The rows-per-page count is quite high, and the majority of this growth is this one operation, for which there is no documentation.
Andrew