I have an arbitrarily large string of text from the user that needs to be split into chunks of at most 10k characters (the limit may need to be adjustable) and sent off to another system for processing.
- Chunks cannot be longer than 10k (or other arbitrary value)
- Text should be broken with natural language context in mind:
  - split on punctuation when possible
  - split on spaces if no punctuation exists
  - break a word as a last resort
I'm trying not to reinvent the wheel here. Any suggestions before I roll this from scratch?
Using C#.
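
For reference, the rules above could be sketched roughly as follows. This is only a minimal sketch under my own assumptions, not an existing library API: `TextChunker`, `Split`, `FindBreak`, and the `BreakPunctuation` set are hypothetical names, "10k" is taken to mean 10,000 UTF-16 `char`s, and a punctuation mark only counts as a break point when it is followed by whitespace (so "don't" or "3.14" are not cut in the middle).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the .NET framework.
public static class TextChunker
{
    // Assumed set of "natural" break characters; adjust as needed.
    private const string BreakPunctuation = ".!?;:,";

    public static List<string> Split(string text, int maxLength = 10000)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (maxLength < 1) throw new ArgumentOutOfRangeException(nameof(maxLength));

        var chunks = new List<string>();
        int start = 0;
        // Keep cutting while the remainder does not fit in a single chunk.
        while (text.Length - start > maxLength)
        {
            int length = FindBreak(text, start, maxLength);
            chunks.Add(text.Substring(start, length));
            start += length;
        }
        if (start < text.Length) chunks.Add(text.Substring(start));
        return chunks;
    }

    // Returns the length (1..maxLength) of the next chunk beginning at 'start'.
    // Only called when more than maxLength characters remain, so text[i + 1] is always valid.
    private static int FindBreak(string text, int start, int maxLength)
    {
        int spaceBreak = -1;
        // Scan backwards from the last character that still fits.
        for (int i = start + maxLength - 1; i >= start; i--)
        {
            char c = text[i];
            // 1. Prefer punctuation followed by whitespace; the mark stays with its chunk.
            if (BreakPunctuation.IndexOf(c) >= 0 && char.IsWhiteSpace(text[i + 1]))
                return i - start + 1;
            // 2. Remember the last whitespace in the window as a fallback.
            if (spaceBreak < 0 && char.IsWhiteSpace(c))
                spaceBreak = i - start + 1;
        }
        if (spaceBreak > 0) return spaceBreak;

        // 3. Last resort: break mid-word, but try not to split a surrogate pair.
        int length = maxLength;
        if (length > 1 && char.IsHighSurrogate(text[start + length - 1])) length--;
        return length;
    }
}
```

Calling `TextChunker.Split(userText, 10000)` returns the chunks in order, and concatenating them reproduces the original string because whitespace is kept rather than trimmed. One caveat with this sketch: since punctuation always wins over spaces, a lone punctuation mark near the start of a window can produce a very short chunk, so a minimum chunk size might be worth adding.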