Probably that's the most reliable way. There are ways to inspect what is happening inside a process - looking directly at its internal state and memory - but they are platform-specific and very prone to misbehaving because your dealing with a slightly different version of something - that includes a different flash version as well as a different version of the app. Those methods are more often used for "trainers" for exe games, where there's typically only one or two versions of the executable to worry about.
Lots of screen shots, comparing, figuring out reliable indicator pixels seems the way to go - plus keeping track of what you expect to happen, of course. When the app is running, it should work from a screenshot at a time (hopefully ensuring a consistent picture, with no half-updated views) and then test the minimum number of pixels needed using (perhaps) a decision tree.
There are ways to automate construction of efficient decision trees, but it's probably easier to do it manually based on comparing screen shots. In this case, since Tetris normally creates all new pieces at the same position, with a 1:1 relationship between colour and shape, you can probably determine the shape and position of a new piece from a single pixel colour - so "decision tree" is probably the wrong term, really, in this case - though there are other things the bot needs to read from the screen.
What's more interesting is the logic to actually make gameplay decisions, since that bot clearly isn't just slotting every piece into the most immediately obvious position, but deliberately aiming to create opportunities to clear 3 or 4 rows at a time.