An older text that may be hard to find is Hopcroft and Ullman's "Introduction to Automata Theory, Languages, and Computation". There are a number of editions --- I have heard that '79 is the best, in as much as it pulls the fewest punches in introducing complex stuff. It a legitimate, albeit small textbook, and it introduces the whole field, not just what you're looking for. I suggest this on the likely vain hope that maybe one of those "trickier" proofs other sources leave out may be your key.
As a gentler starting point, there are a few handy "benchmark" languages.
- If your model can recognize the language of all strings where there are the same number of As and Bs in a string, then it is at least more powerful than FSM.
- If it can't, then it may be equivalent to FSM.
- Similarly, if your model can recognize the language of all strings where the are the same number of As, Bs, and Cs in a string, then it is more powerful than a CFG, or PDA.
These "benchmark languages" are really just functions in disguise --- the first is basically asking if 2 unary numbers are equal, the second is asking if 3 unary numbers are equal. They are nice and simple, and are well-known to be above or below the capabilities of particular models. I don't know simple ones for the more complex machines, so you may be on your own.
Note that for the model "LBA", linearly bounded automata, I believe that there is no known natural language that is computable with a TM, but NOT an LBA. This statement is drawn from hazy memories, so don't take it as a formal proof. :)
Note (lastly) that the "benchmark" languages ARE NOT ESTABLISHING EQUALITY. That is to say, if your model cannot compare 2 unary numbers, that does not mean it is necessarily equivalent to a FSM, it could be even weaker. The languages would only establish what it is greater than in power, or less than in power.
On a completely (completely) different track, Sipser's book does do proofs of equivalence between regexes and FSM, as well as between PDAs and CFGs. I'm not sure how helpful that will be, as you are quite vague on what sort of model of computation you're considering, but if you're dead-set on equivalence, those may be good starting points.