I am working on a software product with an integrated log file viewer. The problem is that it's slow and unstable for really large files because it reads the whole file into memory when you view a log file. I want to write a new log file viewer that addresses this problem.

What are the best practices for writing viewers for large text files? How do editors like Notepad++ and Vim accomplish this? I was thinking of using a buffered, bi-directional text stream reader together with Java's TableModel. Am I thinking along the right lines, and are such stream implementations available for Java?

Edit: Will it be worthwhile to run through the file once to index the positions of the start of each line of text, so that one knows where to seek to? I will probably need the number of lines anyway, so I will probably have to scan through the file at least once.

Edit 2: I've added my implementation to an answer below. Please comment on it or edit it to help me/us arrive at a best-practice implementation, or otherwise provide your own.

+2  A: 

A typical approach is to use a seekable file reader, make one pass through the log recording an index of line offsets and then present only a window onto a portion of the file as requested.

This reduces the amount of data you need to keep readily available and avoids loading up a widget where 99% of the contents aren't currently visible.
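For illustration, a minimal sketch of that approach (not part of the original answer; the class and method names are made up): index the line-start offsets in one pass with a RandomAccessFile, then read only the requested line's bytes on demand.

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

class IndexedLogReader {
    private final RandomAccessFile raf;
    private final List<Long> lineOffsets = new ArrayList<Long>();

    IndexedLogReader(String path) throws IOException {
        raf = new RandomAccessFile(path, "r");
        lineOffsets.add(0L); // the first line always starts at offset 0
        byte[] buf = new byte[8192];
        long fileOffset = 0;
        int n;
        while ((n = raf.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buf[i] == '\n')
                    lineOffsets.add(fileOffset + i + 1); // next line starts after the newline
            }
            fileOffset += n;
        }
    }

    int lineCount() {
        return lineOffsets.size();
    }

    // Reads a single line; only this small window of the file is ever in memory.
    String line(int i) throws IOException {
        long start = lineOffsets.get(i);
        long end = (i + 1 < lineOffsets.size()) ? lineOffsets.get(i + 1) : raf.length();
        byte[] bytes = new byte[(int) (end - start)];
        raf.seek(start);
        raf.readFully(bytes);
        return new String(bytes); // platform default charset, as in the question
    }
}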

msw
+4  A: 

I'm not sure whether Notepad++ actually implements random access, but I think that's the way to go, especially for a log file viewer, which implies that it will be read-only.

Since your log viewer will be read-only, you can use a read-only, random-access, memory-mapped file "stream". In Java, this is the FileChannel.

Then just jump around in the file as needed and render only a scrolling window of the data to the screen.

One of the advantages of the FileChannel is that concurrent threads can have the file open, and reading doesn't affect the current file pointer. So, if you're appending to the log file in another thread, it won't be affected.

Another advantage is that you can call the FileChannel's size method to get the file size at any moment.

The problem with mapping memory directly to a random access file, which some text editors allow (such as HxD and UltraEdit), is that any changes directly affect the file. Changes are therefore immediate (except for write caching), which users typically don't want; they expect changes to be applied only when they click Save. However, since this is just a viewer, you don't have that concern.
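A minimal sketch of a positional read with FileChannel (my own illustration, not from the answer; the file name is made up). The read-by-position call doesn't move the channel's file pointer, and size() always reflects the current length, so a log being appended to by another thread isn't a problem:

import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class TailWindow {
    public static void main(String[] args) throws IOException {
        FileChannel chan = new FileInputStream("app.log").getChannel(); // hypothetical log file
        long size = chan.size(); // current length, even while another thread appends
        ByteBuffer window = ByteBuffer.allocate(4096);
        long offset = Math.max(0, size - window.capacity()); // e.g. render the tail of the log
        int n = chan.read(window, offset); // positional read: does not change chan.position()
        if (n > 0)
            System.out.print(new String(window.array(), 0, n));
        chan.close();
    }
}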

Marcus Adams
Thanks, I also came across RandomAccessFile in addition to FileChannel, which may prove useful.
Hannes de Jager
A: 

I'm posting my test implementation (following the advice of Marcus Adams and msw) here for your convenience and for further comments and criticism. It's quite fast.

I haven't bothered with Unicode encoding safety yet. I guess that will be my next question. Any hints on that are very welcome.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.LinkedList;
import java.util.List;
import javax.swing.table.TableModel;

class LogFileTableModel implements TableModel {

    private final File f;
    private final int lineCount;
    private final String errMsg;
    private final Long[] index;
    private final ByteBuffer linebuf = ByteBuffer.allocate(1024);
    private FileChannel chan;

    public LogFileTableModel(String filename) {
        f = new File(filename);
        String m;
        int l = 1;
        Long[] idx = new Long[] {};
        try {
            FileInputStream in = new FileInputStream(f);
            chan = in.getChannel();
            m = null;
            idx = buildLineIndex();
            l = idx.length;
        } catch (IOException e) {
            m = e.getMessage();
        }
        errMsg = m;
        lineCount = l;
        index = idx;
    }

    /** Scans the file once, recording the byte offset of the start of each line. */
    private Long[] buildLineIndex() throws IOException {
        List<Long> idx = new LinkedList<Long>();
        idx.add(0L);

        ByteBuffer buf = ByteBuffer.allocate(8 * 1024);
        long offset = 0;
        while (chan.read(buf) != -1) {
            int len = buf.position();
            buf.rewind();            
            int pos = 0;
            byte[] bufA = buf.array();
            while (pos < len) {
                byte c = bufA[pos++];
                if (c == '\n')
                    idx.add(offset + pos);
            }
            offset = chan.position();
        }
        System.out.println("Done Building index");
        return idx.toArray(new Long[] {});
    }

    @Override
    public int getColumnCount() {
        return 2;
    }

    @Override
    public int getRowCount() {
        return lineCount;
    }

    @Override
    public String getColumnName(int columnIndex) {
        switch (columnIndex) {
        case 0:
            return "#";
        case 1:
            return "Name";
        }
        return "";
    }

    @Override
    public Object getValueAt(int rowIndex, int columnIndex) {
        switch (columnIndex) {
            case 0:                
                return String.format("%3d", rowIndex);
            case 1:
                if (errMsg != null)
                    return errMsg;
                try {
                    long pos = index[rowIndex];
                    chan.position(pos);
                    linebuf.clear(); // reset the shared buffer before re-using it
                    int n = chan.read(linebuf); // bytes actually read for this line's window
                    if (rowIndex == lineCount - 1)
                        // last line: only the bytes read this time belong to it
                        return new String(linebuf.array(), 0, Math.max(n, 0));
                    else
                        return new String(linebuf.array(), 0, (int) (index[rowIndex + 1] - pos));
                } catch (Exception e) {
                    return "Error: " + e.getMessage();
                }
        }            
        return "a";
    }

    @Override
    public Class<?> getColumnClass(int columnIndex) {
        return String.class;
    }

    // ... other methods to make interface complete


}
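For completeness, a hypothetical usage sketch (assuming the remaining TableModel methods are filled in; the file name is made up) that puts the model into a JTable inside a JScrollPane, so only the visible rows are ever rendered:

import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.JTable;

public class LogViewerDemo {
    public static void main(String[] args) {
        JTable table = new JTable(new LogFileTableModel("server.log"));
        JFrame frame = new JFrame("Log viewer");
        frame.add(new JScrollPane(table));
        frame.setSize(800, 600);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setVisible(true);
    }
}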
Hannes de Jager
Hmmm, ok, it seems my implementation is UTF-8 safe because of UTF-8's self-synchronizing property. Checking for '\n', which is 0x0A (binary 00001010), is unambiguous in UTF-8: every byte that is part of a multi-byte sequence has its high bit set.
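A small snippet to illustrate the point (my own addition, not part of the comment): '\n' is 0x0A, and every byte of a UTF-8 multi-byte sequence has its high bit set, so a 0x0A byte in the stream is always a real newline.

import java.io.UnsupportedEncodingException;

public class Utf8NewlineCheck {
    public static void main(String[] args) throws UnsupportedEncodingException {
        byte[] utf8 = "héllo\nwörld".getBytes("UTF-8");
        for (byte b : utf8) {
            boolean highBitSet = (b & 0x80) != 0; // true for every byte of a multi-byte character
            if (b == '\n' && highBitSet)
                throw new AssertionError("cannot happen: 0x0A never appears inside a multi-byte sequence");
        }
        System.out.println("no false newline matches");
    }
}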
Hannes de Jager