I need to find all sub-arrays which share any mutual element and merge them into one sub-array. (Implementing in Python but any algorithmic idea would be helpful)
Multidimensional array structure:
categories = {'car':['automobile','auto'],
'bike':['vehicle','motorcycle','motorbike','automobile'],
'software':['computer','macbook','apple','microsoft','mozilla'],
'firefox':['internet','mozilla','browser']
'bicycle':['vehicle']}
I'd like to have 'car', 'bike' and 'bicycle' merged into one list (keep first list's key new list's key could be any of the relevant keys) and 'software' and 'firefox' merged into one list as well.
Performance is crucial.
Best solution I could come with so far is to maintain a flatten one-dimension array of element => list_key (e.g 'automobile' => 'car') and then run the following recursive function for each list in the multidimensional array (pseudocode):
function merge_similar(list_key):
For each element in categories[list_key]:
If flatten_array.has_key(element):
list_to_merge = flatten_array[element]
merge_similar(list_to_merge) /* merge other lists which share an element with our newly found similar list */
categories[list_key] = merge(categories [list_key], categories[list_to_merge])
delete categories[list_to_merge]
Any idea how to improve it's performance?
Thanks!