I've been trying to find information on this, but due to the immaturity of the Spring Integration framework I haven't had much luck.

Here is my desired workflow:

  1. New files are placed in an 'Incoming' directory

  2. Files are picked up using a file:inbound-channel-adapter

  3. The file content is streamed, N lines at a time, to a 'Stage 1' channel, which parses each line into an intermediate (shared) representation.

  4. Each parsed line is routed to multiple 'Stage 2' channels.

  5. Each 'Stage 2' channel does its own processing on the N available lines to convert them to a final representation. This channel must have a queue which ensures no Stage 2 channel is overwhelmed in the event that one channel processes significantly slower than the others.

  6. The final representation of the N lines is written to a file. There will be as many output files as there were routing destinations in step 4.

*'N' above stands for any reasonable number of lines to read at a time, anywhere from 1 up to whatever fits comfortably in memory, but it is guaranteed to always be less than the number of lines in the full file.
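To make the queueing requirement in step 5 concrete, here is a minimal plain-Java sketch of one bounded queue feeding one Stage 2 consumer. It is not Spring Integration code, and the names `BoundedStage`, `run`, and `POISON` are invented for illustration. Upper-casing each line stands in for the real Stage 2 conversion. The point is only that a producer blocks on `put` once the queue is full, so a slow consumer never accumulates more than `capacity` chunks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedStage {

    // Hypothetical end-of-input marker, recognised by identity rather than equals().
    private static final List<String> POISON = new ArrayList<>();

    /**
     * Feeds the chunks through a bounded queue to a single consumer thread
     * and returns the consumer's output once all chunks are processed.
     */
    public static List<String> run(List<List<String>> chunks, int capacity) {
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(capacity);
        List<String> output = new ArrayList<>();

        Thread consumer = new Thread(() -> {
            try {
                for (List<String> chunk = queue.take(); chunk != POISON; chunk = queue.take()) {
                    for (String line : chunk) {
                        output.add(line.toUpperCase()); // placeholder for the real conversion
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (List<String> chunk : chunks) {
                queue.put(chunk); // blocks while the queue already holds 'capacity' chunks
            }
            queue.put(POISON);
            consumer.join(); // join() makes the consumer's writes to 'output' visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return output;
    }
}
```

With one such queue per Stage 2 destination, the producer is throttled by whichever queue is full, so the slowest consumer sets the pace for reading the file.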

How can I accomplish streaming (steps 3, 4, 5) in Spring Integration? It's fairly easy to do without streaming the files, but my files are large enough that I cannot read an entire file into memory.

As a side note, I have a working implementation of this workflow without Spring Integration, but since we're using Spring Integration in other places in our project, I'd like to try it here to see how it performs and how the resulting code compares for length and clarity.

+1  A: 

This is a very interesting use case that I'm sorry I missed for such a long time. It's definitely worth creating an issue for. At the moment we have support in Spring Integration for picking up the files and sending references to them around. There is also some rudimentary support for converting the files to a byte[] or a String.

For now, the answer is that you would do step 2 in custom Java code, sending the chunks off to a Stage 2 channel. I would recommend against sending references to streams around as message payloads.
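As a rough illustration of what such custom Java code might look like, here is a minimal sketch that reads a source N lines at a time and hands each chunk to a callback. The names `LineChunker` and `streamInChunks` are made up for this example, and the `Consumer` is only a stand-in for whatever actually sends the chunk to a channel. It uses Java 8 syntax and plain JDK classes, with no Spring Integration API involved.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class LineChunker {

    /**
     * Reads the source chunkSize lines at a time and passes each chunk to the sink.
     * Only the chunk currently being filled is held by this method.
     * Returns the number of chunks delivered.
     */
    public static int streamInChunks(Reader source, int chunkSize, Consumer<List<String>> sink) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        int delivered = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            List<String> chunk = new ArrayList<>(chunkSize);
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() == chunkSize) {
                    sink.accept(chunk);
                    delivered++;
                    // Start a fresh list so the sink may safely keep the previous one.
                    chunk = new ArrayList<>(chunkSize);
                }
            }
            if (!chunk.isEmpty()) { // final, shorter chunk
                sink.accept(chunk);
                delivered++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return delivered;
    }
}
```

In a Spring Integration application the sink would presumably wrap each chunk in a message and send it to the next channel; the exact classes for that depend on the framework version, so that part is left out here. The payload is a list of lines rather than a stream reference, in line with the recommendation above.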

iwein