views:

306

answers:

6

OK, here is goes. I’m not a database guru or admin. In fact, besides some occasional index / query tuning, I don’t poke around too often in databases. One of the things which often eludes me is the SQL Server transaction log. I know what it’s for, what it contains and how it works (at least conceptually), but I think I don’t get why SQL Server seems to be so attached to transaction logs.

Here is the first question. Correct me if I’m wrong, but it seems to me by default transaction logs will simply contains the entire history of all changes in the database. There are two indications that this is may be indeed the case. When I create a new database, the maximum size of its log is set to "unrestricted growth". The second reason is that I have often dealt with tiny databases with huge transaction logs which could not be shrunk no matter what I did. It seems to be so odd I cannot believe it’s true. Why would I ever want the entire history by default? All I care about is the latest version of the data in a consistent state. Well, I suspect that there may be valid reasons for it is some cases, but I would consider this as an extra option.

My second question is why is so complicated to get rid of a transition log? Is it just me, or is there really no direct way to do it? Just recently, I was trying to get rid of a 100MB+ log of a 5MB database, and the simplest way I found was to detach the database, delete the log and re-attach it again (and even that SQL Serve complained a bit). I tried the shrink command with all possible options I could find, but I was able to shrink in only to about 50%. The database was not in use (no active connections), and I honestly did not care about any past transitions at all. I noticed that there are possibly some other "ways" how to do it; some involving backups and restores.

I diligently tried to read the MSDN documentation and learn something more about transitions, but after about 15 minutes which felt like walking in mud in circles, I gave up. I know that to database admins and gurus my questions will sound silly. I appreciate any feedback.

Edit: After the first answers, I realized that I may not been clear enough. I’m aware how a transaction log works during transactions, and why it’s important, and that it can be used for backup purposes. I think I wanted to ask more from the developer point of view. Most of the time I deal with staging / test temporary databases which don’t need any backup, and which nobody is using except me, and I often find myself needing to transfer it and having a huge transitional log is an unnecessary inconvenience at that situation.

A: 

Most relational database servers maintain a transaction log that records all the changes made to the database. The idea is that if the database server goes down, you can "replay" all the changes made to the database from the last database backup, which should be a known state.

It has been a while since I worked with SQL Server, so I can't answer why it's so hard to get rid of a transaction log. In the SQL Server 7, SQL server 2000 days, taking a full backup of the database would truncate the transaction log by default.

dthrasher
+7  A: 

The database log is not some after the fact history record, like an IIS log. In databases with Write-Ahead Logging the log is the primary restore, redo and undo source, and for all purposes the authoritative source of data. Deleting or replacing the log is one of the worst decisions somebody administering a database can do. Your database may be corrupted at this moment.

The proper way to truncate the log is to do a full database backup, followed by a database log backup, followed by periodical log backups. This will free space in the log and allow for its reuse. It will not shrink the physical LDF file.

Another avenue is to change the recovery model to SIMPLE, which will allow the server to automatically recycle the log file as it sees fit. Changing the recovery model to SIMPLE has implications on the recoverability of the database as you won't be able to apply certain disaster recovery scenarios like point in time recovery nor will you be able to recover any data changes since last backup.

Understanding Logging and Recovery in SQL Server
Misconceptions around the log and log backups: how to convince yourself
Inside the Storage Engine: More on the circular nature of the log

Remus Rusanu
Thank you for a very detailed answer. Just to confirm: So is it true that, if I don’t do anything with a transaction log and leave all options on defaults, it will grow forever and eventually contain a complete history of all changes in the database (in some form)?
Jan Zich
No, is not true. A new database will have the recovery mode SIMPLE and will recycle the log. A database that has *changed* the recovery model but never take a backup will still be in the 'pseudo-simple' mode and still not grow the log. Only after the recovery model is changed and a backup is taken will the log start to grow.
Remus Rusanu
Can you please elaborate on the "pseudo-simple" mode or point me to some links? I thought I saw something somewhere about it, but I cannot find it anymore.
Jan Zich
How is it possible that a 100MB+ log could not be shrunk by more than 50% under the simple mode? The database in question was used for a long time by anybody else then me and I was using it mostly or read only operations.
Jan Zich
Pseudo-simple: http://www.sqlskills.com/BLOGS/PAUL/post/New-script-is-that-database-REALLY-in-the-FULL-recovery-mode.aspx
Remus Rusanu
If you believe your log is not being re-used, check the `log_reuse_wait_desc` column in sys.databases http://technet.microsoft.com/en-us/library/ms178534.aspx
Remus Rusanu
+2  A: 

Transaction logs exist for two main reasons

1 - To allow the DB to "rollback" a current transaction 2 - To allow a crashed DB to recover it's state up to the last committed transaction before the crash

1 is transient, if you have no active transactions then it will be consuming little space. 2 however, as you surmise continues to grow over time, maybe without bound.

The idea is that you backup the transaction log every so often to a dump file. A dump file is a complete state snapshot of a DB that you can recover the ENTIRE DB from at a point in time. When you do this the transaction log is truncated and starts again from empty.

If the DB subsequently crashes you load the DB snapshot from the dump file and then apply the transaction log, which contains the incremental changes since the last snapshot, on top to recover the state. For this reason transaction logs need to be kept on high resiliance storage otherwise you can't recover the DB.

So by taking regular db dumps your transaction log is repeatedly cleared down.

There is another option, which I wouldn't recommend, which is to disable transaction logs on your DB. This basically eliminates 2 above so the transaction log doesn't grow over time. Of course if your DB crashes you can only recover from the last full backup/dump so this option carries considerable risk if you are running anything even semi-important.

HTH.

Mike Q
+2  A: 

You have a couple of incorrect thoughts on what is or is not within the logs, depending on your recovery mode the log may only contain the current uncommitted transactions, so they may be rolled back (simple mode) or might contain only minimally logged operations, but the resulting transaction log backups will be larger, since they will contain the extents altered between the log backups (bulk logged mode) or could contain all the log information for each committed transacation since the last log backup.

The unrestricted growth is just a default and it not best practise to let a log just arbitarily grow, since this causes log fragmentation by increasing the number of vlf, virtual log files within the actual log file. After you get too many of those, performance of certain operations, such as replication suffers.

In the case where the log file is larger than the database and after a backup is unable to be shrunk, then there is potentially an open transaction on the system holding the VLF activem which is preventing it from re-using the space.

Dettaching the database and reattaching it after killing the log is a pretty bad thing to do to your database, it could not be in a transactionally consistent state and you really want to run a dbcc checkdb after doing that. Yes, you had no active connections but it is a risky thing to do and pretty much a last resort when you have had log corruption, not a first thing on the list.

It is considered that bad, that within 2005 there is even a flag you can not alter which indicates you have done it. The MS support can check if you were after their help with a potential bug in SQL.

In terms of getting rid of the log, it is the log that enables a DBA to provide a restore strategy that can restore a database to the near exact point in time that a hardware failure / corruption destroyed the database. Without the transaction log you would only be able to recover to the last full backup. With the logs you can replay the transactions like a fast forwarding video recorder.

The topic is very large and as you have seen complex, but it is worth learning about.

Andrew
+1  A: 

(This is MS-SQL, right?)

Yeah, I bet the server did complain when you detached and deleted the log manually.

If you don't care to keep a T-log, change the catalog's recovery mode to "Simple" (in catalog's properties) this will truncate the log as transactions are committed successfully.

If you DO want the log (most people do, they are quite handy to have!) the log will be truncated when it is backed up.

(Transaction logs, FTW!)

st3in
A: 

I run my backup every night. So, all my LDF files are about 1-2MB in size. No wonder.

Yasutaka