I want to display a large text file (100 MB log files specifically) to a user via a web interface without requiring the user to download the entire file. Obviously returning the entire file to someone's web browser would not be sensible, so my theory was to use Ajax to fetch portions of the file as the user scrolls through it, similar to the way Google Maps provides a "window" onto the map.

My application server is PHP, and I'm fairly sure I can perform the appropriate seeks and reads through the file and return the results via XHR to the application, but my Ajax framework is Dojo and I can't think of any standard dijit that would work, so I am trying to figure out how best to implement something.

Should I derive my own widget? Is there already something out there that I am not aware of? If I build my own custom widget, what sort of structure should it take and are there any good resources for developing custom widgets for dojo/dijit? Any other thoughts?

A: 

If the log file is a text file with a consistent line ending, maybe you can fetch it by line number.

I have an idea for an algorithm like this:

  1. When the page loads, fetch the first 100 lines from the file. Put them in some container, maybe a div, a textarea, or a <ul><li> list.
  2. Add an event handler that detects when the user has scrolled to the last part of the container.
  3. Send an AJAX request to get the next 100 lines from the file. Pass the line offset as a parameter (GET or URI parameter) so the PHP script can return the right part of the file.
  4. Append the AJAX response to the end of the container and update the offset for the next AJAX request.
  5. If there are no more lines left in the file, return an empty response. The AJAX handler should treat this as end of file and remove the event handler from step 2.
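
The steps above can be sketched roughly as follows. This is only a minimal sketch and is not tied to Dojo or jQuery: `createLineLoader` and `fakeServer` are hypothetical names made up for illustration, and `fetchLines` is a synchronous stand-in for the real AJAX call to the PHP script.

```javascript
// Stand-in for the PHP endpoint: returns up to `count` lines starting at
// `offset`, or an empty array once the offset is past the end of the file.
function fakeServer(allLines) {
  return function fetchLines(offset, count) {
    return allLines.slice(offset, offset + count);
  };
}

// Tracks the next offset to request and stops asking once the server
// returns an empty response (step 5).
function createLineLoader(fetchLines, pageSize) {
  let nextOffset = 0;
  let finished = false;
  const loaded = [];

  return {
    loadNext() {
      if (finished) return [];
      const batch = fetchLines(nextOffset, pageSize);
      if (batch.length === 0) {
        finished = true; // this is where the scroll handler would be removed
        return batch;
      }
      loaded.push(...batch);      // step 4: append to the container
      nextOffset += batch.length; // step 4: advance the offset
      return batch;
    },
    get lines() { return loaded; },
    get done() { return finished; },
  };
}

// Usage: a 250-line "file" read 100 lines at a time.
const file = Array.from({ length: 250 }, (_, i) => `line ${i + 1}`);
const loader = createLineLoader(fakeServer(file), 100);
loader.loadNext(); // lines 1-100
loader.loadNext(); // lines 101-200
loader.loadNext(); // lines 201-250
loader.loadNext(); // empty response, so loader.done becomes true
```

In a real page, `loadNext` would be called from the scroll handler in step 2, and the response would arrive asynchronously.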

I don't know much about Dojo. I use jQuery Tools' scrollable in my application. It's easy to attach an event handler that fires when the scroller reaches the last page and then fetches the next items.

Donny Kurnia
Yes, this is similar to what I was thinking, but if I keep loading more and more objects into the browser's memory, I will eventually explode it. I think I would need to worry about unloading the objects. Dojo has a similar object called `dojox.layout.ScrollPane`. I might be able to derive something from that.
Kitson
A: 

This seems to be a tut on what you might need. I would suggest that you use an li for each line, because you will end up wanting to perform some actions on individual lines; most likely each line will be relevant.

Scrolling is nice, but you can also just blit the interface with pagination, meaning they click next page, previous page, and you fetch it, then update the view. That's the easiest method. With scrolling, you'll need to get more above and below the current visible lines for seamless scrolling.

For instance, if you want to show 25 lines, you'll need to fetch 25 + bottom pad on the first go, and define the lines showing in bottom pad as the threshold for signalling a new event to download an extra 25+ bottom pad items.
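
One way to express that threshold, as a rough sketch: the function name and the pixel-based parameters are assumptions for illustration. In a browser they would come from the container's `scrollTop` and `clientHeight` and a fixed line height.

```javascript
// Returns true when the last visible line has moved into the bottom pad,
// i.e. fewer than `bottomPad` loaded lines remain below the viewport.
function needsMoreBelow(scrollTop, viewportHeight, lineHeight, loadedLines, bottomPad) {
  // Number of lines from the top of the content down to the bottom edge
  // of the viewport, counting a partly visible line as visible.
  const lastVisible = Math.ceil((scrollTop + viewportHeight) / lineHeight);
  const remainingBelow = loadedLines - lastVisible;
  return remainingBelow < bottomPad;
}

// 25 visible lines of 16px each, 35 lines loaded (25 + a bottom pad of 10).
const lineHeight = 16;
const viewport = 25 * lineHeight;
const atTop = needsMoreBelow(0, viewport, lineHeight, 35, 10);
const oneLineDown = needsMoreBelow(lineHeight, viewport, lineHeight, 35, 10);
console.log(atTop, oneLineDown);
```

With these numbers nothing is fetched while the view sits at the top, and the first scroll step into the pad triggers the next request.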

With a 100 MB file, that's going to get sluggish soon, so you'll have to clear out the previous entries and define a top pad that signals a request to fetch in the reverse direction. That is to say, first request: fetch 25 + bottom pad; second request: fetch 25 + bottom pad and remove the previous 25 minus the top pad.
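
A bounded buffer along these lines might look like the sketch below. `LineWindow` and its members are hypothetical names, and it only tracks the line data; in the page, each evicted line would also have its `li` removed from the DOM.

```javascript
// Keeps at most `maxLines` lines in memory and remembers which line of the
// file the first retained line corresponds to.
class LineWindow {
  constructor(maxLines) {
    this.maxLines = maxLines;
    this.firstLine = 0; // 0-based file line number of this.lines[0]
    this.lines = [];
  }

  // Scrolling down: add lines at the bottom, evict from the top.
  append(batch) {
    this.lines.push(...batch);
    const excess = this.lines.length - this.maxLines;
    if (excess > 0) {
      this.lines.splice(0, excess);
      this.firstLine += excess;
    }
  }

  // Scrolling up: add lines at the top, evict from the bottom.
  prepend(batch) {
    this.lines.unshift(...batch);
    this.firstLine -= batch.length;
    const excess = this.lines.length - this.maxLines;
    if (excess > 0) {
      this.lines.splice(this.lines.length - excess, excess);
    }
  }

  // Offset to request when more lines are needed below.
  get nextOffsetBelow() {
    return this.firstLine + this.lines.length;
  }

  // Offset and count to request when up to `count` lines are needed above.
  rangeAbove(count) {
    const offset = Math.max(0, this.firstLine - count);
    return { offset, count: this.firstLine - offset };
  }
}

// Helper that fabricates lines numbered [from, to) for the demo.
const numbered = (from, to) =>
  Array.from({ length: to - from }, (_, i) => `line ${from + i}`);

const win = new LineWindow(60);
win.append(numbered(0, 35));  // first request: 25 visible + bottom pad
win.append(numbered(35, 70)); // second request evicts the oldest 10 lines
```

After the second `append`, the window holds file lines 10 through 69, so scrolling back up would request the range given by `win.rangeAbove(25)`.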

One thing to note: when you do this, in Firefox at least, it can get wonky and stop firing events after a few loads, so you may want to unbind and rebind your event listeners. I only say this because I have a friend who is currently working on something with similar functionality, and these are some of the issues he came across.

No one is going to complain that they have to click next page/previous page, it'll be fast and clean, but mess up your scrolling and no one will want to use your widget.

Here are some other resources on the topic: an old Ajax scrollable table; a Twitter-like "load more" tut; a good scrolling example (read the source); and a googlecode project worth checking out.

Interesting thought about the paging. I honestly hadn't thought about it (and I don't know why). Thanks for the tut, and a rough outline of what would need to be done and a potentially feasible alternative.
Kitson
Glad to help, and don't worry about snazzy features, functionality first is my motto. Good luck on your project!
A: 

I recommend caching.

It should be noted that any solution to this problem should take into account that reading a sufficiently large file (100 MB+) from disk is going to be disk-bound and is likely to outrun whatever timeout your web server has set for script execution. To avoid making the user wait an inordinate amount of time to load any portion of the file, I would steer clear of hacks like changing your server's timeout limits.

Here's one possible solution that comes to mind:

1) Cache the file by chopping it up into separate files. You can easily do this in a cron job or even trigger it when the file is written. Use readfile_chunked (http://cn2.php.net/manual/en/function.readfile.php#48683) or similar.

2) Write a service handler script that when invoked from the browser (say './readfile?chunk=##') returns the requested chunk.

3) Use a pagination widget or a scroller as suggested by the other contributor to make the call to the service handler via AJAX.
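
Steps 1 and 2 could look something like the sketch below. Note the swap: the server in this thread is PHP, but the sketch uses Node.js to stay in the same language as the client code, with PHP's `fopen`/`fread` and `file_put_contents` playing the same roles. `splitIntoChunks`, `readChunk`, and the `chunk-N` file naming are assumptions made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Step 1: split `srcPath` into files of at most `chunkSize` bytes each.
// Returns the number of chunk files written. Chunks are cut by byte count,
// so a line (or a multi-byte character) can straddle two chunks; cutting
// on line boundaries would be a sensible refinement.
function splitIntoChunks(srcPath, outDir, chunkSize) {
  const fd = fs.openSync(srcPath, 'r');
  const buffer = Buffer.alloc(chunkSize);
  let index = 0;
  try {
    for (;;) {
      // A null position reads from the current file position and advances it.
      const bytesRead = fs.readSync(fd, buffer, 0, chunkSize, null);
      if (bytesRead === 0) break;
      fs.writeFileSync(path.join(outDir, `chunk-${index}`), buffer.subarray(0, bytesRead));
      index += 1;
    }
  } finally {
    fs.closeSync(fd);
  }
  return index;
}

// Step 2: what the service handler would do for `?chunk=N`. Returns the
// chunk's text, or null when N is not a valid chunk number.
function readChunk(outDir, chunkNumber) {
  if (!Number.isInteger(chunkNumber) || chunkNumber < 0) return null;
  const chunkPath = path.join(outDir, `chunk-${chunkNumber}`);
  if (!fs.existsSync(chunkPath)) return null;
  return fs.readFileSync(chunkPath, 'utf8');
}

// Usage with a small temporary "log file" and a tiny chunk size.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'logchunks-'));
const logPath = path.join(dir, 'app.log');
fs.writeFileSync(logPath, 'alpha\nbravo\ncharlie\n');
const chunkCount = splitIntoChunks(logPath, dir, 8);
```

Validating the chunk number as an integer before building the path also keeps the `chunk` parameter from being used to read arbitrary files.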

Cons: This will inevitably increase the amount of disk space used.

Pros: Happy users, as disk access will be optimized and so will script execution time. Also, it scales well (on the order of O(n)).

Interesting thought, especially the "readfile_chunked"; I will look into it. The problem is that there will be in the range of 160,000 such log files, of which only maybe 10-20 will ever get looked at, by maybe 5-10 users. I already have a process that indexes them in a meaningful way, but I want to provide a way to view the logs without needing to download the archive locally and find the 20 or 30 lines the user is looking for.
Kitson
A: 

Have you considered using Dojo Grid for viewing logs? It has built-in support for dynamic loading of 'pages' i.e. rows of data.

Yaroslav
I considered it... It is a bit too complex for what I need, and some of the UI elements wouldn't lend themselves to working the way I need. It is funny that you should mention it, though, because I decided to move forward with writing a custom Dojo widget and datastore that does what I need, and I discovered that `dojox.grid._Scroller`, which is part of the grid, is very similar to what I need. I am taking a lot of cues from it.
Kitson