How do I do on-the-fly search & replace in a Java Stream (input or output)?

I don't want to load the stream into memory or to a file.

I just see the bytes passing by and I need to do some replacements. The sequences being replaced are short (up to 20 bytes).

+1  A: 

You could implement a deterministic finite automaton (DFA) that looks at each byte exactly once (i.e. no lookbehind is required). You would stream the input through a buffer that holds at most as many bytes as your pattern is long, outputting the replacement on a match, or flushing the overflowing (non-matched) bytes as you advance through the pattern. Once the pattern has been preprocessed, the runtime is linear in the length of the input.
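The idea above can be sketched roughly as follows. This is only a minimal illustration, not a library solution: it handles a single pattern and uses a Knuth-Morris-Pratt failure table as a compact stand-in for a full DFA transition table, so each byte costs amortized constant time rather than exactly one table lookup. The class `StreamReplacer` and its method names are made up for this example. Because the bytes being held back always equal a prefix of the pattern, no separate buffer is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not a library class): streaming replace of one byte pattern. */
final class StreamReplacer {

    /** fail[i] = length of the longest proper prefix of pattern[0..i] that is also its suffix. */
    static int[] failureTable(byte[] pattern) {
        int[] fail = new int[pattern.length];
        for (int i = 1, k = 0; i < pattern.length; i++) {
            while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
            if (pattern[i] == pattern[k]) k++;
            fail[i] = k;
        }
        return fail;
    }

    /** Copies in to out, replacing every non-overlapping occurrence of pattern. */
    static void replace(InputStream in, OutputStream out, byte[] pattern, byte[] replacement)
            throws IOException {
        if (pattern.length == 0) throw new IllegalArgumentException("pattern must not be empty");
        int[] fail = failureTable(pattern);
        int matched = 0; // the held-back bytes are always pattern[0..matched)
        int b;
        // Reads byte by byte; wrap 'in' in a BufferedInputStream for real use.
        while ((b = in.read()) != -1) {
            byte cur = (byte) b;
            // On a mismatch, fall back to the longest shorter partial match and
            // release the held-back bytes that can no longer start a match.
            while (matched > 0 && cur != pattern[matched]) {
                int shorter = fail[matched - 1];
                out.write(pattern, 0, matched - shorter);
                matched = shorter;
            }
            if (cur == pattern[matched]) {
                matched++;
                if (matched == pattern.length) { // full match: emit the replacement
                    out.write(replacement);
                    matched = 0;
                }
            } else {
                out.write(b); // byte cannot start a match, pass it through
            }
        }
        out.write(pattern, 0, matched); // end of stream: flush the partial match
    }

    /** Convenience wrapper for trying the method out on in-memory strings. */
    static String replaceAll(String input, String pattern, String replacement) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            replace(new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8)), out,
                    pattern.getBytes(StandardCharsets.UTF_8),
                    replacement.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

For example, `StreamReplacer.replaceAll("aaab", "aab", "X")` gives `"aX"`: the first `a` is released once it becomes clear that it cannot be part of a match. Supporting several patterns at once would need a multi-pattern automaton (Aho-Corasick, for instance), which this sketch does not attempt.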

Wikipedia has some information on pattern matching and how that works in theory.

Lucero
Thank you, @Lucero. I was looking for a library solution.
flybywire
+3  A: 

You can use the class provided here if static replacement rules are enough for you.

denis.zhdanov
+1 looks promising
flybywire
Just a small "academic" note: I looked over the source, and as far as I can tell the runtime/CPU usage, especially in the worst case, seems to be pretty bad. Assuming for instance 10 patterns of 101 characters, for *each* byte read there could be on the order of 1000 processing steps (compare operations). The DFA solution would require only one operation (a table lookup). As the pattern size, the number of patterns, and the input stream length grow, this could become a problem.
Lucero
(However, the source is well done in terms of structure and documentation, and there is good test coverage, so please take my comment as a suggestion for a better algorithm; I don't mean to criticize the answer.)
Lucero
Thanks, one of the reasons for referencing the class here was to get feedback :) Your comment is right; I will revise the algorithm.
denis.zhdanov
+1  A: 

This is not really an answer to your question.

You should, though, look into the java.nio package.

Take a look at the following examples:

NIO Examples

The first example shows how to do a simple "grep" on a file.

With NIO you do not have to worry about buffer size; just let the regular expression library methods do the heavy lifting.
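A rough sketch of what such an NIO-based search might look like is shown here. It is not the code from the linked examples; the class `NioGrep` and the method `findAll` are made up for illustration, and ISO-8859-1 is an assumed encoding. Note that it works on a file rather than an arbitrary stream, and that decoding the mapped buffer produces an in-memory `CharBuffer`, so it may not fit the constraints stated in the question.

```java
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not from the linked examples): regex search over a memory-mapped file. */
final class NioGrep {

    /** Returns the text of every match of regex in the file, in order of appearance. */
    static List<String> findAll(Path file, String regex) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Map the whole file; the OS pages it in on demand.
            MappedByteBuffer bytes = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // Assumption: the file is ISO-8859-1 text. The decoded CharBuffer lives on the heap.
            CharBuffer chars = StandardCharsets.ISO_8859_1.decode(bytes);
            // CharBuffer implements CharSequence, so it can be matched directly.
            Matcher matcher = Pattern.compile(regex).matcher(chars);
            List<String> hits = new ArrayList<>();
            while (matcher.find()) {
                hits.add(matcher.group());
            }
            return hits;
        }
    }
}
```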

Mr Jacques