views:

219

answers:

1

First, the background: I'm writing a Ruby app that uses SendGrid to send mass emails. SendGrid uses a custom email header (in JSON format) to set recipients, values to substitute, etc. SendGrid's documentation recommends splitting up the header so that the lines are shorter than 1,000 bytes.

My question, then, is this: given a long JSON string, how can I split it into lines < 1,000 so that lines are split at appropriate places (i.e., after a comma) rather than in the middle of a word?

This is probably unnecessary, but here's an example of the sort of string I'd like to split:

X-SMTPAPI: {"sub": {"pet": ["dog", "cat"]}, "to": ["[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]"]}

Thanks in advance for any help you can provide!

+1  A: 

If you are splitting arbitrary JSON, you'll need to parse the JSON to be able to know where you can split. This is because commas and whitespace can appear inside name/values as well as between tokens. So, I would first parse the string, then re-generate the output either by using something like JSON.pretty_generate or by writing my own generation method that adds line breaks as necessary.

If you know special properties of the JSON you are dealing with, you may not have to do that. For example, if you are certain that no whitespace appears in the name/values and that name/values have a certain max length, you can use a regular expression like this

    str = 'X-SMTPAPI: %s' % json.gsub(/(.{1,72})( +|$\n?)|(.{1,72})/,"\\1\\3\n")

(I took the above from here: http://wiki.sendgrid.com/doku.php?id=smtpapiheader.rb)

However, this will force a split of any token longer than 144 characters (not sure why they picked the number 72 if 1,000 is the limit), and possibly splits in the middle of a value if it has whitespace in it.