When you submit a feed of products to Amazon, it returns an error report that contains suggestions for products that were not matched, like so:

18  998 8042 Error "SKU '998'ASIN B0001FSZ6K  'item_name' Merchant value: 'Promax Nutrition Promax Bar - Mocha Blast' Amazon catalog value: 'Promax Bars, Mocha Blast 12 bars'.   ASIN B00024D3HQ  'manufacturer' Merchant value: 'Promax Nutrition' Amazon catalog value: 'PROMAX'  'item_name' Merchant value: 'Promax Nutrition Promax Bar - Mocha Blast' Amazon catalog value: 'Promax Gourmet Bar Mocha Blast 12 bars'."

In this example "Merchant value:" is what was submitted and "Amazon catalog value:" is a close match.

I'm trying to extract these values so we can adjust our feed submission and resend the erroneous items. I've hacked together this regex:

(Merchant value\:) (')(.+?)(')( Amazon catalog value\: )(')(.+?)(ASIN)

This almost works. I can pull groups 3 and 7 and have my values... sometimes. Unfortunately, some of the suggestions look like:

Merchant value: 'Barleans' Amazon catalog value: 'Barlean's Organic Oils'

So the apostrophe will break the match. I assume I can change the end match to "' " and ignore the "'s", but I'm not sure my regex is at all efficient, and I know it's prone to breaking on corner cases.

Edit to clarify:

The data I'm trying to get would be:

SKU (998 in the first example), then for each suggestion: the field (itemname or manufacturer), our value, and Amazon's suggested value.

So it would be like:

998 | itemname | Promax Nutrition Promax Bar - Mocha Blast | Promax Bars, Mocha Blast 12 bars

+2  A: 

The references I've been able to find on the web for this sort of Amazon error message show the format as follows:

Definition:

<@foreach ASIN><ASIN> <@foreach attribute that differs><ATTRIBUTE> Merchant value: “<merch val>” Amazon catalog value: “<Amazon catalog value>”;</foreach>.</foreach>

Example:

SKU “a1b2d4” could correspond to multiple ASINs in the Amazon catalog. Please review the possible matches below.

If your product is the same as the product on one of these detail pages, please modify your product data to reflect the Amazon catalog values and resubmit. If your product is different than what is on these detail pages, please provide more data and resubmit.

ASIN B1234567890 “UPC” Merchant value: “12345678901” Amazon catalog value: “00000000000” “TITLE” Merchant value: “Good Bad Ugly” Amazon catalog value: “The Good, the Bad, and the Ugly”.

ASIN B0987654321 “UPC” Merchant value: “12345678901” Amazon catalog value: “12345678901” “TITLE” Merchant value: “Good Bad Ugly” Amazon catalog value: “For a few dollars more.”

It looks to me like you're doing some processing on it to convert the smart-quotes (“ and ”) to single quotes. If you're going to do that, you'll have to escape apostrophes somehow yourself. It may be easier to convert them to double-quotes instead of single, or just parse it with the smart-quotes intact.

It's a simple regular expression if you leave the smart-quotes alone, but impossible if you convert them to single-quotes and don't do anything to apostrophes inside the strings.
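To illustrate what I mean by "simple", here is a rough Python sketch that parses the message with the smart-quotes left intact. It assumes the report really does follow the documented format above (each value wrapped in “ and ”, with no smart-quotes inside the values themselves). The function name `parse_suggestions` and the tuple layout are just names I made up for the example, and the SKU is passed in separately because it comes from another column of the report.

```python
import re

# One pattern with two alternatives:
#   1. an "ASIN <id>" marker, which starts a new block of suggestions
#   2. a “FIELD” Merchant value: “...” Amazon catalog value: “...” group
# Values are delimited by smart-quotes, so a plain apostrophe inside a
# value (e.g. Barlean's) does not end the match.
TOKEN_RE = re.compile(
    r"\bASIN\s+(?P<asin>\w+)"
    r"|“(?P<field>[^”]*)”\s*Merchant value:\s*“(?P<ours>[^”]*)”"
    r"\s*Amazon catalog value:\s*“(?P<theirs>[^”]*)”"
)


def parse_suggestions(sku, message):
    """Return (sku, asin, field, merchant_value, amazon_value) tuples.

    Hypothetical helper: assumes the smart-quoted format shown above.
    """
    rows = []
    current_asin = None
    for match in TOKEN_RE.finditer(message):
        if match.group("asin") is not None:
            # Remember which ASIN the following field groups belong to.
            current_asin = match.group("asin")
        else:
            rows.append((
                sku,
                current_asin,
                match.group("field"),
                match.group("ours"),
                match.group("theirs"),
            ))
    return rows
```

Each tuple can then be joined with `" | "` to get the pipe-delimited rows asked for in the question. If the real report turns out to use a different quoting style, the character classes in `TOKEN_RE` would need adjusting.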

Chad Birch
+1 for looking it up. BTW: you can edit your original post instead of posting a new answer even if the original one is deleted. Just undelete the original for a second, click "edit" in a new tab, and delete it again. Edit your stuff, and undelete shortly before you click save.
Tomalak
Yeah, I wasn't sure if it's better to undelete and edit an old, wrong answer with now-irrelevant comments or just post a fresh one. I also always forget that 10k+ people can see the deleted ones, so it's probably more annoying for you this way.
Chad Birch
I see. In the good old times you could delete all comments on your own posts at least, but they changed that recently methinks.
Tomalak