What's a reasonable maximum size that a cross-platform application should allow a text file to reach? I understand that this is an oversimplified question, so allow me to explain.

My team is implementing a bulk load interface for clients to load data into our database. It will write out a CSV file and then load that file into the appropriate database (at this point either Oracle or SQL Server). We could be dealing with a relatively high number of records.

Is there any limit I should put on the size of these text files before I start breaking them up into multiple files? Currently, we're deploying to Linux and Windows, but we also have developers using OS X. Plus, some of our clients have somewhat dated versions of these operating systems. I'd imagine that this depends on the OS, the file system, and the RDBMS that we're connecting to. Rather than trying to set a limit for each individual platform, I'd like to have one overall limit for simplicity's sake (as long as that limit isn't overly restrictive). Is this even necessary, or is there a cap I can set across the board?

+7  A: 

Most modern systems have no problem handling multi-gigabyte files, but if you want to be cautious, then setting a limit of 2GB can be useful:

  • Even slightly out-of-date file systems have no problem storing 2GB files (for example FAT16)
  • A size just under 2GB (2^31 - 1 bytes) can be addressed by a signed 32-bit integer, which is used for file sizes and offsets more often than one might think

For the filesystem part, this comparison of file systems might be useful (though it lists a lot of not-really-widely-used systems as well).
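
If you do decide to cap the file size, the splitting can happen while the CSV is being written. The following is only a minimal sketch in Python, not anything from the question's actual code base: the function name `write_csv_chunks`, the `bulk_0000.csv` naming scheme, and the choice of UTF-8 are all assumptions made for illustration. It serializes each row first so that its encoded size is known, and starts a new file whenever the next row would push the current file past the limit.

```python
import csv
import io
import os

# Largest value a signed 32-bit integer can hold: one byte short of 2 GiB.
MAX_BYTES = 2**31 - 1


def write_csv_chunks(rows, directory, prefix="bulk", max_bytes=MAX_BYTES):
    """Write rows to numbered CSV files, none larger than max_bytes.

    Hypothetical helper for illustration. Returns the list of file paths
    in the order they were written.
    """
    paths = []
    out = None
    written = 0
    try:
        for row in rows:
            # Serialize the row in memory to learn its encoded size.
            buf = io.StringIO()
            csv.writer(buf).writerow(row)
            data = buf.getvalue().encode("utf-8")
            if len(data) > max_bytes:
                raise ValueError("a single row exceeds the size limit")
            # Roll over to a new file if this row would not fit.
            if out is None or written + len(data) > max_bytes:
                if out is not None:
                    out.close()
                path = os.path.join(directory, f"{prefix}_{len(paths):04d}.csv")
                out = open(path, "wb")
                paths.append(path)
                written = 0
            out.write(data)
            written += len(data)
    finally:
        if out is not None:
            out.close()
    return paths
```

Rows are never split across files, so each chunk can be handed to the database loader on its own. Whether the loader needs a header line repeated in every chunk depends on the tool and is not handled here.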

Joachim Sauer
+1, but I would add that if you have a file larger than 2GB, something must be wrong with the way you're doing things. =)
Clement Herreman
@Clement: I wouldn't say so. It is sometimes reasonable to store images and other documents in the database, and then a database dump can *easily* become bigger than 2GB. And even for DBs without LOBs, 2GB is not an unreasonable size for a dump.
Joachim Sauer
Heh... so FAT16 is just "slightly" out of date? :-)
Jason Baker
@Jason: heh ... yes, it's ancient, but it's still used surprisingly often.
Joachim Sauer