While writing a Perl script, I was asked to write the user names, separated by commas, on a single line of a file.

So I would like to know whether there is any restriction on the maximum length of a line in a .txt file.

+5  A: 

Text files are just like any other files, and the newline character is just like any other character, so only the usual file-size restrictions apply (a 4 GB size limit on older file systems, the file must fit on the disk, etc.).

You won't have any problem reading or writing it unless you read it line by line; then you can run out of memory or hit a buffer overflow of some sort. This can happen in any text editor or text-processing program (such as sed or awk) because, unlike in the OS kernel, line separation matters to them.
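If the single line is too long to hold in memory comfortably, one option is to read fixed-size records instead of lines. The following is only a minimal sketch, assuming the names are plain comma-separated text with no quoting or escaping; the sub name `each_field` and the 64 KB default record size are hypothetical choices, not anything from the question.

```perl
use strict;
use warnings;

# Call $callback once per comma-separated field, reading the file in
# fixed-size records so a very long line is never held in memory at once.
# each_field and the default record size are illustrative assumptions.
sub each_field {
    my ($path, $callback, $record_size) = @_;
    $record_size ||= 64 * 1024;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/ = \$record_size;    # a reference to an integer means fixed-size reads

    my $pending = '';            # partial field carried over between records
    while (my $chunk = <$fh>) {
        $pending .= $chunk;
        my @fields = split /,/, $pending, -1;   # -1 keeps trailing empty fields
        $pending = pop @fields;                 # the last piece may be incomplete
        $callback->($_) for @fields;
    }
    close $fh;

    $pending =~ s/\r?\n\z//;                    # drop the final newline, if any
    $callback->($pending) if length $pending;
}
```

It could be called as `each_field('users.txt', sub { print "User: $_[0]\n" });`, where `users.txt` is a placeholder file name.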

I would suggest keeping one user per line, as it's more natural to read and less error-prone when you process the file with an external program.
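For illustration, here is a minimal sketch of both layouts in Perl. The sub names are hypothetical, and the code assumes the user names contain neither commas nor newlines.

```perl
use strict;
use warnings;

# Write all names on a single comma-separated line.
sub write_single_line {
    my ($path, @names) = @_;
    open my $fh, '>', $path or die "Cannot open $path: $!";
    print {$fh} join(',', @names), "\n";
    close $fh or die "Cannot close $path: $!";
}

# Write one name per line, which line-oriented tools handle naturally.
sub write_one_per_line {
    my ($path, @names) = @_;
    open my $fh, '>', $path or die "Cannot open $path: $!";
    print {$fh} "$_\n" for @names;
    close $fh or die "Cannot close $path: $!";
}
```

A call such as `write_one_per_line('users.txt', qw(alice bob carol));` (with a placeholder file name and user list) produces a file that sed, awk or grep can process one user at a time.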

Pavel Shved
At least kwrite and vi are not affected by the line size (tested on a 4 MB single-line XML file).
Cem Kalyoncu
There's certainly a limit. It has to fit on the disk, and if you are reading it line by line, it has to fit in memory. In addition to that, you might need large file support to deal with files over 4 GB.
brian d foy
@brian d foy: since you're more experienced here, over SO, I'm following your advice and playing Captain Obvious, so now my post says that a file on a disk should not exceed the size of that disk. Sigh.
Pavel Shved
Nothing is obvious. You might be able to create a string in Perl that you can't save to your full disk but that you can fit in program memory, and you might not have enough program memory to read an entire file in one go. They are real problems you have to handle when you play with very large strings and files, but most people never think about them.
brian d foy
@brian d foy: okay, you say many right things everyone should remember about. But didn't you notice that the topic has changed from "newlines and OS" to "handling large files"? Are you sure it's the right way to go?
Pavel Shved
Am I sure what is the right way? If you're talking about lines, see my answer.
brian d foy
+3  A: 

There is no size limit except your filesystem's maximum file size, which is most probably around 2 TB, depending on the filesystem.

Cem Kalyoncu
+2  A: 

No, there is no such limit; you will only hit the file-size limits, if any.

Joe Casadonte
+1  A: 

On some old Unix systems, some text utilities (e.g. join, sort and even some old versions of awk) have a limit on the maximum line size. I think this is a limit of the utilities, not of the OS. As far as I know, GNU utilities do not have such a limit, so Linux does not have this problem.

I used to have this problem on old versions of IRIX and AIX. Then I installed GNU textutils (now merged into coreutils) in my home directory, which solved the problem.
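To see whether a file might run into such a limit, you can measure its longest line before handing it to an older utility. This is a minimal Perl sketch; `longest_line_length` is a hypothetical helper, and the actual limit of any given tool has to come from that tool's documentation. Note that it reads each line fully into memory, so it is itself subject to the memory constraint discussed in this thread.

```perl
use strict;
use warnings;

# Return the length of the longest line in a file, not counting the newline.
# With no encoding layer set, the length is counted in bytes.
sub longest_line_length {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $max = 0;
    while (my $line = <$fh>) {
        chomp $line;
        my $len = length $line;
        $max = $len if $len > $max;
    }
    close $fh;
    return $max;
}
```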
A: 

File size depends on your OS's file system. Tools have no such limit (or at least I have never seen one so far).

Kartik Mistry
Some tools have limits because they use a four-byte int address space, which is why there is large file support in some tools.
brian d foy
+3  A: 

The only thing you need to worry about is the size of the file that you can create and the size of the file that you can read.

Computers don't know anything about lines, which are just an interpretation of the bytes in a file. We decide that there is some sequence of characters that demarcates the end of a line, and then tell our programs to grab stuff out of the file until they hit that sequence. To us, that's a line.

For instance, you can define a line in your text file to end with a comma:

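 # Use a comma, not a newline, as the input record separator.
 $/ = ',';

 while( <DATA> )
    {
    chomp;    # removes the trailing $/ (here a comma), if present
    print "Line is: $_\n";
    }

 __DATA__
 a,b,c,d,e,f,g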

Even though it looks like I have a single line under __DATA__, it's only because we're used to books. Computers don't read books. Instead, this program thinks everything between commas is a line:

Line is: a
Line is: b
Line is: c
Line is: d
Line is: e
Line is: f
Line is: g
brian d foy