Hi,

I am working on integrating affiliate sales into a few existing sites. We are using a few merchants who work via different networks (CJ, ShareASale, LinkShare, AvantLink).

My observation is that all these networks provide data feeds in different formats, but that's not a big problem. My main concern is that merchants use different titles for the same products. I don't want to run into these situations:

a) two listings of the SAME product from N merchants (if titles are just a bit different)

b) one listing of N different products from merchants (if we don't use a strict comparison algorithm)

We want to automate as much as possible and avoid having operators review questionable listings all the time.

How is this problem typically handled?

A: 

How is this problem typically handled?

Such scenarios are typically covered by data warehouse systems such as those provided by Oracle, HP, Microsoft, IBM, Netezza or Teradata.

Robert
We use MySQL. Also, "handled" is not very informative to me. How are they handled? I don't think it can be 100% automated and work without producing faulty results.
SODA
A: 

Look for merchants who provide UPC codes in their feeds. They are universal. Plus in AvantLink you can customize your own feed output so that's nice.
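
If you do get UPCs from some merchants, it is probably worth cleaning them before matching on them. Here's a minimal sketch, assuming the feed value is a 12-digit UPC-A possibly padded with spaces or dashes (`normalize_upc` is a hypothetical helper name, not part of any network's API). It checks the standard UPC-A check digit and returns None for anything that doesn't validate, so bad values don't collapse unrelated products:

```python
import re

def normalize_upc(raw):
    """Return a 12-digit UPC-A string, or None if the value does not validate.

    Assumption: only 12-digit UPC-A is handled here; EAN-13 or values with
    stripped leading zeros would need extra handling.
    """
    digits = re.sub(r"\D", "", raw or "")  # drop spaces, dashes, etc.
    if len(digits) != 12:
        return None
    odd = sum(int(d) for d in digits[0:11:2])   # 1st, 3rd, ..., 11th digit
    even = sum(int(d) for d in digits[1:10:2])  # 2nd, 4th, ..., 10th digit
    check = (10 - (odd * 3 + even) % 10) % 10
    return digits if check == int(digits[11]) else None
```

Two listings whose values normalize to the same non-None string can then be treated as the same product.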

Max
Thanks. But see my comment above.
SODA
A: 

I was actually looking at 2 sample data feeds from AvantLink a minute ago. Here's the list of fields they provide (not filtered, so I assume it's everything):

SKU 
Manufacturer Id
Brand Name  
Product Name    
Long Description    
Short Description   
Category    
SubCategory 
Product Group   
Thumb URL   
Image URL   
Buy Link    
Keywords    
Reviews 
Retail Price    
Sale Price  
Brand Page Link
Brand Logo Image    
Product Page View Tracking  
Product Content Widget

I was thinking that yes, having UPCs would be (almost) ideal, but neither of the stores I was looking at (one of them is REI) provides UPCs.

I also checked a few large merchants on Commission Junction and ShareASale; they don't include UPCs either.

SODA
+2  A: 

We have a similar issue with trying to collapse products from multiple merchant feeds. What we do is collapse products based on their brand (or manufacturer) + sku combo.

Our data is pretty messy, so we have to do some work to normalize both the brand and the sku so the products collapse nicely. We have a list of brands that we care about and do some work to map brands from the merchant feed onto our brands. For example, if we have an "ACME" brand in our system, we might map the following to that brand:

A.C.M.E => ACME
ACME Inc. => ACME
Acme Incorporated => ACME

For skus we usually just strip any non-alphanumeric characters for matching purposes. For example, all of the following would map to the same sku:

abc-123 => abc123
abc.123 => abc123
abc 123 => abc123
ab.c1.23 => abc123

So if we see brand "ACME Inc." and sku "abc-123" in one feed that will collapse with brand "A.C.M.E" and sku "abc 123" from another feed.
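
Roughly, the normalization above could be sketched like this in Python. This is only an illustration, not our actual code: the `BRAND_ALIASES` table and the function names are made up for the example, and lowercasing before comparing is an extra assumption on top of stripping non-alphanumeric characters.

```python
import re

# Hypothetical alias table: squashed merchant spelling -> canonical brand.
BRAND_ALIASES = {
    "acme": "ACME",
    "acmeinc": "ACME",
    "acmeincorporated": "ACME",
}

def _squash(text):
    """Lowercase and drop everything that is not an ASCII letter or digit."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

def normalize_brand(raw_brand):
    """Return the canonical brand, or None if it is not a brand we track."""
    return BRAND_ALIASES.get(_squash(raw_brand))

def normalize_sku(raw_sku):
    """Strip punctuation and whitespace so 'abc-123' and 'abc 123' match."""
    return _squash(raw_sku)

def collapse_key(raw_brand, raw_sku):
    """Return the (brand, sku) key used to collapse listings, or None."""
    brand = normalize_brand(raw_brand)
    sku = normalize_sku(raw_sku)
    if brand is None or not sku:
        return None  # unknown brand or empty sku: leave the listing uncollapsed
    return (brand, sku)
```

With this, `collapse_key("ACME Inc.", "abc-123")` and `collapse_key("A.C.M.E", "abc 123")` both give `("ACME", "abc123")`.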

As part of the collapsing process we end up with multiple names/images/descriptions/categories/etc... for each collapsed part and need to choose the "best" one to show on the website.
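
As a rough illustration of that step, here is a sketch that groups listings by a precomputed `key` and builds one display record per group. The dict field names and the heuristics (most common name, longest description, first non-empty image) are assumptions for the example; what counts as "best" will depend on your site.

```python
from collections import Counter, defaultdict

def collapse(listings):
    """Group listing dicts by their precomputed (brand, sku) 'key'."""
    groups = defaultdict(list)
    for listing in listings:
        groups[listing["key"]].append(listing)
    return groups

def pick_best(group):
    """Build one display record from all listings of the same product.

    Assumed heuristics: most common name, longest description,
    first non-empty image URL.
    """
    names = Counter(item["name"] for item in group if item.get("name"))
    descriptions = [item["description"] for item in group if item.get("description")]
    images = [item["image_url"] for item in group if item.get("image_url")]
    return {
        "key": group[0]["key"],
        "name": names.most_common(1)[0][0] if names else "",
        "description": max(descriptions, key=len) if descriptions else "",
        "image_url": images[0] if images else None,
        "merchants": [item["merchant"] for item in group],
    }
```

Calling `pick_best` on each value of `collapse(listings)` gives one record per collapsed product, with the list of merchants that carry it.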

That's a very high level overview of how we handle it.

tpunder
Thanks a lot! I did not know that SKUs would be the same for different brands; I thought SKUs were specific to merchants (the way they catalog their inventory internally).
SODA
It depends... Sometimes the merchant supplies the manufacturer sku and sometimes they use their own internal sku. Other times they'll use a variation on the manufacturer sku that includes some brand-specific prefix or suffix (e.g. acme-abc-123). If the merchant only gives you their internal sku, it can be really hard to match that product to the same product from other merchants.
tpunder
A: 

I'm dealing with the same issue. I can't find a way to look up the same product across different merchants.

There is no UPC to match products by (on many feeds), and merchants use different part/product numbers, different skus and different product names. It's hard to come up with a reasonable way to match products.

Free Stock Images
I am coming up with a reasonable idea, but it's taking time. I want to offer this as a service with an API.
SODA