I am developing a blog website and I was wondering how best to store the large blog post data: in an XML file, as an HTML file, or directly in a database. Any suggestions?
A database would be much better. To save an XML file (or any other file), you need to overwrite the entire thing. A database allows you to add or update one record at a time.
Not to mention a DB is easier to search if you're looking for all blog posts with a certain word or phrase...
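To make the "one record at a time" and search points concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `posts` table, its columns, and the helper names are assumptions made up for illustration; a real blog would likely use a different schema and possibly a different database.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL)"
)


def add_post(title, body):
    """Insert a single post and return its id; other rows are untouched."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO posts (title, body) VALUES (?, ?)", (title, body)
        )
    return cur.lastrowid


def update_body(post_id, body):
    """Rewrite the body of one post without touching any other record."""
    with conn:
        conn.execute("UPDATE posts SET body = ? WHERE id = ?", (body, post_id))


def search(phrase):
    """Return ids of posts whose body contains the phrase.

    Uses a plain LIKE match; '%' and '_' inside the phrase are not escaped,
    so they act as wildcards in this sketch.
    """
    rows = conn.execute(
        "SELECT id FROM posts WHERE body LIKE ? ORDER BY id", (f"%{phrase}%",)
    )
    return [row[0] for row in rows]
```

For example, after `add_post("First", "hello world")`, calling `search("hello")` returns a list containing that post's id. For a large archive, a full-text index would probably be a better fit than `LIKE`.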
Pick one.
I wouldn't suggest HTML, since you may choose to render it some other way at some point, but XML and a DB both have their upsides and downsides. XML files, assuming you mean one post per file, are highly portable, easily editable, etc. A DB store is easier to search and retrieve from, and a little less likely to be deleted accidentally.
XML is not a good choice when it comes to saving, loading, serializing, or deserializing large data. I would recommend using a database.
A blog post is not large. Images might be.
Some questions:
- What database are you using? If you're using MySQL (ick), you'll probably want to use TEXT (for posts under 64 KB) or MEDIUMTEXT (for posts between 64 KB and 16 MB).
- What do you mean by "XML"? XHTML is XML. HTML5 has an XML serialization.
- Do you mean one-file-per-post? I'm assuming you do.
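As a rough illustration of the TEXT versus MEDIUMTEXT choice, the sketch below picks a MySQL column type from a post's encoded size. The function name is hypothetical; the limits are MySQL's documented maximums in bytes (not characters), so a post with many multi-byte characters hits them sooner.

```python
# MySQL column limits, in bytes.
TEXT_MAX_BYTES = 65_535            # 2**16 - 1
MEDIUMTEXT_MAX_BYTES = 16_777_215  # 2**24 - 1


def mysql_text_type(post, encoding="utf-8"):
    """Return the smallest MySQL text type that can hold the encoded post."""
    size = len(post.encode(encoding))
    if size <= TEXT_MAX_BYTES:
        return "TEXT"
    if size <= MEDIUMTEXT_MAX_BYTES:
        return "MEDIUMTEXT"
    return "LONGTEXT"
```

In practice you would choose the column type once, for the largest post you expect, rather than per post.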
Issues you might consider:
What are the issues you're considering?
- Read performance: Is it faster to fetch a filename from the database and then read the file, or just fetch the data from the database? If you keep it all in the database, you skip some system calls. You also avoid the "lots of small files" (around or under 4K) problem, which most filesystems handle poorly.
- Write performance: It might be faster to write a file than to write to the database, simply because the database provides a lot more guarantees (transactional integrity). On the other hand, you'll have to write to the database anyway, so adding more files might mean more seeks.
- Database overheads: Storing more data in the database makes maintenance (such as PostgreSQL's VACUUM ANALYZE) take longer.
- Transactions: If a DB write fails, the transaction fails. If the disk gets full, a normal file write may only partially complete. Does your code correctly handle that, or does it simply save the start of the post?
- Deleting (related to transactions): You'll need to remember to delete the file too. What if deleting the file fails? What if deleting the row fails?
- Migration: You'll need to copy the database. Do you want to copy lots of small files too?
- Ease of access: Do you want to modify posts in a text editor?
- Orphaned/missing files: What if there are posts without files, or files without posts?
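If you do go with one file per post, two of the issues above (partial writes and orphaned or missing files) can be reduced with a little code. The following is a sketch under assumptions, not a complete solution: the function names are hypothetical, and it relies on writing to a temporary file in the same directory and then renaming it over the target, which replaces the file in one step on POSIX filesystems.

```python
import os
import tempfile
from pathlib import Path


def write_post_atomically(path, text):
    """Write text to path so readers see either the old or the new file.

    The data goes to a temporary file in the same directory first. If the
    write fails (for example, the disk is full), the temporary file is
    removed and the existing post is left untouched.
    """
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name, suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        os.replace(tmp_name, path)
    except BaseException:
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def find_mismatches(db_filenames, post_dir):
    """Compare filenames recorded in the database with files on disk.

    Returns (missing, orphaned): names the database references that are
    not on disk, and files on disk that no database row references.
    """
    on_disk = {p.name for p in Path(post_dir).iterdir() if p.is_file()}
    in_db = set(db_filenames)
    return sorted(in_db - on_disk), sorted(on_disk - in_db)
```

Here `db_filenames` stands for whatever list of filenames your posts table holds. Note that this still does not make the file write and the database write a single transaction; a periodic `find_mismatches` check is one way to catch the cases where they drift apart.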