Finding common blocks | ansaurus

tags:

algorithm

+2 Q:

Finding common blocks

I have two files (f1 and f2) containing some text (or binary data).
How can I quickly find common blocks?

e.g.
f1: ABC DEF
f2: XXABC XEF

output:

common blocks:
length 4: "ABC " in f1@0 and f2@2
length 2: "EF" in f1@5 and f2@7

+2 A:

Duplo is a great tool for such purposes: http://sourceforge.net/projects/duplo/

torial 2008-09-22 20:19:36

+1 A:

Wikipedia has some pseudocode for finding the longest common substring of two sequences of data. In your case, you simply extract from the table all common substrings that are not prefixes of other common substrings (i.e. the maximal common substrings).
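
As a rough illustration of that idea, here is a minimal Python sketch of the dynamic-programming table, keeping only two rows at a time. The function name common_blocks and the min_length parameter are made up for this example and are not taken from the Wikipedia pseudocode. A run of matching characters is reported only when it cannot be extended further to the right, which is what makes it maximal.

```python
def common_blocks(a, b, min_length=2):
    """Return maximal common blocks of a and b as (length, pos_a, pos_b).

    Works on str or bytes. min_length is a hypothetical knob that
    filters out short, uninteresting matches.
    """
    blocks = []
    # prev[j] = length of the common run ending at a[i-1] and b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(len(a)):
        curr = [0] * (len(b) + 1)
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            run = prev[j] + 1          # extend the run on the diagonal
            curr[j + 1] = run
            # The run is maximal if the next pair of characters differs
            # or one of the inputs ends here.
            extends = (i + 1 < len(a) and j + 1 < len(b)
                       and a[i + 1] == b[j + 1])
            if not extends and run >= min_length:
                blocks.append((run, i - run + 1, j - run + 1))
        prev = curr
    # Longest blocks first, then by position.
    return sorted(blocks, key=lambda blk: (-blk[0], blk[1], blk[2]))


if __name__ == "__main__":
    for length, p1, p2 in common_blocks("ABC DEF", "XXABC XEF"):
        print(f"length {length}: f1@{p1} and f2@{p2}")
```

For the example in the question this reports a block of length 4 at f1@0 and f2@2 and a block of length 2 at f1@5 and f2@7 (offsets counted from 0). The sketch takes time proportional to len(a) * len(b), so for large files a suffix-tree or suffix-array based approach would probably be a better fit.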

Torsten Marek 2008-09-22 20:25:13

+1 A:

The open-source PMD project has a cut-and-paste detector module (CPD), which is mentioned on this page: http://pmd.sourceforge.net/integrations.html.

David Medinets 2008-09-23 00:29:59
