First of all, this is a tough thing to solve. So far I haven't come up with a good example, but I hope someone here can figure this out. I hope there is a known way to solve this kind of problem, or some obscure algorithm.
Scenario:
- In my application I make several requests to the very same web page.
- The web page contains dynamic and random content (the date and time, a quote of the day, etc.; in theory it can be anything).
- The response has 2 cases; let's call them "TRUE" and "FALSE". For example, sometimes the response contains a "True Text" and sometimes a "False Text".
- My application knows 3 samples of the "TRUE" case and 3 samples of the "FALSE" case, but these also include random content such as the time.
Challenge:
- When my application gets a new response, how can it tell whether that response is an example of the "TRUE" or the "FALSE" case?
What I've tried:
- Process the first sample of TRUE case line by line and generate an integer array from the value of characters
- Do the same thing for second TRUE sample
- Do the same thing for third TRUE sample
- Analyse the differences between these stored TRUE cases and create a new array with
- Now I know which lines are dynamic (such as the datetime), so I create a final TRUE case array that stores only the static lines.
- When I get a new case, I create a similar array and compare it with the stored final TRUE case array. If it matches (ignoring the filtered lines), it's a TRUE case; if the other lines have changed massively (there is a tolerance value), it's FALSE.
The limitations and weaknesses of this algorithm are pretty obvious. I've got good results in some cases, but it doesn't work as expected all the time.
My current class works like this:
Dim Analyser As New ContentAnalyzer()
Analyser.AddTrueCase(True1Html)
Analyser.AddTrueCase(True2Html)
Analyser.AddTrueCase(True3Html)
'This will return True if the UnknownHtml is similar to TRUE case, otherwise False
Analyser.IsThisTrue(UnknownHtml)
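To make the steps under "What I've tried" concrete, here is a minimal sketch of roughly what such a class could look like. It is not the actual implementation. It compares line text directly instead of building integer arrays, assumes lines stay aligned by index across responses, and uses the first TRUE sample as the reference. The `Tolerance` property, its default value and the `SplitLines` helper are hypothetical names introduced only for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports Microsoft.VisualBasic

' Hypothetical sketch of the approach described above; not the real class.
Public Class ContentAnalyzer
    Private ReadOnly _trueCases As New List(Of String())()

    ' Fraction of static lines allowed to differ before the answer is False (assumed value).
    Public Property Tolerance As Double = 0.1

    Public Sub AddTrueCase(html As String)
        _trueCases.Add(SplitLines(html))
    End Sub

    Public Function IsThisTrue(unknownHtml As String) As Boolean
        If _trueCases.Count = 0 Then Throw New InvalidOperationException("Add at least one TRUE sample first.")

        Dim reference As String() = _trueCases(0)
        Dim candidate As String() = SplitLines(unknownHtml)
        Dim staticCount As Integer = 0
        Dim mismatches As Integer = 0

        For i As Integer = 0 To reference.Length - 1
            ' Copy the loop variable so the lambda captures a stable value.
            Dim index As Integer = i
            ' A line is static only if every TRUE sample has identical text at this index.
            Dim isStatic As Boolean = _trueCases.All(
                Function(sample) index < sample.Length AndAlso sample(index) = reference(index))
            If Not isStatic Then Continue For

            staticCount += 1
            If index >= candidate.Length OrElse candidate(index) <> reference(index) Then
                mismatches += 1
            End If
        Next

        ' With no static lines there is nothing to compare against.
        If staticCount = 0 Then Return False
        Return mismatches / staticCount <= Tolerance
    End Function

    ' Normalises line endings and splits the text into lines.
    Private Shared Function SplitLines(html As String) As String()
        Return html.Replace(vbCrLf, vbLf).Split(ControlChars.Lf)
    End Function
End Class
```

The index-by-index comparison is also where this sketch is weakest: a single inserted or removed line (for example a quote of the day that wraps onto a second line) shifts every following line, so most "static" lines stop matching.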
Sorry that the title doesn't make much sense; I couldn't find a good way to describe it.