I have a list of URLs; each page is a specific category:

http://www.site.com/category-1/page.html
http://www.site.com/category-2/page.html
http://www.site.com/category-3/page.html

Each page has, let's say, 4 items. I want to extract each item on each page and assign it its corresponding category number, i.e.

category-1_ITEM - CAT-1  
category-1_ITEM - CAT-1  
category-1_ITEM - CAT-1  
category-1_ITEM - CAT-1 

category-2_ITEM - CAT-2 
category-2_ITEM - CAT-2  
category-2_ITEM - CAT-2  
category-2_ITEM - CAT-2  

category-3_ITEM - CAT-3  
category-3_ITEM - CAT-3  
category-3_ITEM - CAT-3  
category-3_ITEM - CAT-3   

I figured this would be pretty straightforward, but now I'm dealing with what look like looping issues. Here's the code; I've removed all irrelevant lines for simplicity's sake:

$urls = array(
    "http://www.site.com/category-1/page.html",
    "http://www.site.com/category-2/page.html",
    "http://www.site.com/category-3/page.html"
);

foreach ($urls as $url) {

    //Load Page, find items

    foreach ($items as $item) {

        preg_match('#http\:\/\/www\.site\.com\/(.*?)\/page\.html#is', $url, $result);

        switch ($result[1]) {
            case "category-1": $cat = 'CAT-1'; break;
            case "category-2": $cat = 'CAT-2'; break;
            case "category-3": $cat = 'CAT-3'; break;
        }

        echo $item . ' - ' . $cat . '<br>';
    }
}

This is what it outputs:

category-1_ITEM - CAT-1  
category-1_ITEM - CAT-1  
category-1_ITEM - CAT-1  
category-1_ITEM - CAT-1 

category-1_ITEM - CAT-2  
category-1_ITEM - CAT-2  
category-1_ITEM - CAT-2 
category-1_ITEM - CAT-2 

category-2_ITEM - CAT-2  
category-2_ITEM - CAT-2  
category-2_ITEM - CAT-2 
category-2_ITEM - CAT-2 

category-1_ITEM - CAT-3  
category-1_ITEM - CAT-3  
category-1_ITEM - CAT-3
category-1_ITEM - CAT-3 

category-2_ITEM - CAT-3  
category-2_ITEM - CAT-3  
category-2_ITEM - CAT-3
category-2_ITEM - CAT-3 

category-3_ITEM - CAT-3  
category-3_ITEM - CAT-3  
category-3_ITEM - CAT-3
category-3_ITEM - CAT-3 

Any ideas on what I'm doing wrong? I have a feeling it's a simple mistake; I'm just not seeing it.

A: 
Basiclife
And thank you for your effort as well. :)
Clarissa
+1  A: 

The problem is in this code:

//Load Page, find items

If I may be so bold as to guess, you're probably doing something like:

$items[] = "some content";
$items[] = "some content";

Not with constants, of course, but the key is what you wrote before the equals sign: $items[] appends to the end of the array. The first time through the loop, the array holds the items from the first page. The second time, the second page's items are added to it, so the array holds both. In other words, you are forgetting to reset $items. Add $items = array(); at the beginning of //Load Page, find items and you should be fine.
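
To make that concrete, here is a minimal sketch of the corrected loop. The findItems() function is a hypothetical stand-in for the omitted page-loading code (it just fabricates four item names per URL), and collecting the output in $lines instead of echoing it is only there to keep the sketch easy to check:

```php
<?php
// Hypothetical stand-in for "Load Page, find items": fakes 4 items per page.
function findItems(string $url): array
{
    preg_match('#http://www\.site\.com/(.*?)/page\.html#i', $url, $m);
    return array_fill(0, 4, $m[1] . '_ITEM');
}

$urls = array(
    "http://www.site.com/category-1/page.html",
    "http://www.site.com/category-2/page.html",
    "http://www.site.com/category-3/page.html",
);

$lines = array();

foreach ($urls as $url) {
    $items = array();                 // reset: drop items left over from earlier pages

    foreach (findItems($url) as $found) {
        $items[] = $found;            // append only this page's items
    }

    // The category depends only on the URL, so work it out once per page.
    preg_match('#http://www\.site\.com/(.*?)/page\.html#i', $url, $result);
    $cat = 'CAT-' . substr($result[1], strlen('category-'));

    foreach ($items as $item) {
        $lines[] = $item . ' - ' . $cat;
    }
}
```

Without the $items = array(); line, the second pass would still contain the four category-1 items, which is the duplication shown in the question.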

If you are coming from another language, the problem is perhaps better explained in more technical terms: in PHP, code blocks don't create a new scope. Basically, only functions do.
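
A small sketch of that scoping difference, using made-up names ($seen, $local, collect()) purely for illustration:

```php
<?php
// A loop body is not a new scope: $seen is created on the first pass
// and keeps growing, because nothing resets it between iterations.
$counts = array();
foreach (array('a', 'b', 'c') as $page) {
    $seen[] = $page;
    $counts[] = count($seen);
}
$afterLoop = count($seen);   // $seen is still visible after the loop ends

// A function body is a new scope: $local starts out undefined on every call.
function collect(string $page): int
{
    $local[] = $page;
    return count($local);
}
$fromFunction = array(collect('a'), collect('b'), collect('c'));
```

Here $counts ends up as 1, 2, 3, while $fromFunction is 1, 1, 1: the variable inside the loop accumulates, the one inside the function does not.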

Jasper
Indeed! I love a simple solution, thanks!
Clarissa