When "working on a new language" and trying to get a reference BNF right, you probably don't want to bias your reference grammar towards any particular parser generator. One of the troubles with writing a test grammar for Bison (LALR(1)) or ANTLR (LL(*)) is that you do exactly that. You also don't want to get hung up on "how do I code the BNF rules in such a way as to make it actually parse", presumably because you are interested in working on the grammar, not on the parser generator.
So I'd recommend using a full context-free parser generator. This will let you write the grammar in the most natural form with the least effort. This might mean giving up "text editor", "editor test window", ... but in my experience (check my Stack Overflow bio) the benefit of using a context-free parser generator overwhelms those niceties completely. Edit-save-parse just doesn't take a lot of effort.
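To make "most natural form" concrete, here is a minimal sketch in Python. It is not GLR and not the code of any tool mentioned here; it is a tiny Earley recognizer, another general context-free algorithm, chosen only because it fits in a few lines. The toy grammar `GRAMMAR`, the `Item` tuple and the function `recognizes` are hypothetical names made up for this illustration. The point is that the grammar is left-recursive and ambiguous, exactly as you would first write it down, and it still parses without any massaging.

```python
from collections import namedtuple

# Hypothetical toy grammar written the "natural" way: left-recursive and
# ambiguous, with no precedence encoded. Keys are nonterminals; any symbol
# that is not a key is treated as a terminal token.
GRAMMAR = {
    "E": [("E", "+", "E"), ("E", "*", "E"), ("(", "E", ")"), ("n",)],
}

# An Earley item: rule lhs -> rhs, a dot position, and the input position
# (origin) where recognition of this rule started.
Item = namedtuple("Item", "lhs rhs dot origin")


def recognizes(tokens, grammar=GRAMMAR, start="E"):
    """Return True if tokens can be derived from start.

    Sketch only: assumes the grammar has no empty (epsilon) rules.
    """
    chart = [[] for _ in range(len(tokens) + 1)]

    def add(k, item):
        if item not in chart[k]:
            chart[k].append(item)

    for rhs in grammar[start]:
        add(0, Item(start, rhs, 0, 0))

    for k in range(len(tokens) + 1):
        i = 0
        while i < len(chart[k]):  # chart[k] may grow while we walk it
            item = chart[k][i]
            i += 1
            if item.dot < len(item.rhs):
                sym = item.rhs[item.dot]
                if sym in grammar:
                    # Predict: expand the nonterminal after the dot.
                    for rhs in grammar[sym]:
                        add(k, Item(sym, rhs, 0, k))
                elif k < len(tokens) and tokens[k] == sym:
                    # Scan: the terminal after the dot matches the input.
                    add(k + 1, item._replace(dot=item.dot + 1))
            else:
                # Complete: advance every item that was waiting for item.lhs.
                for parent in list(chart[item.origin]):
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        add(k, parent._replace(dot=parent.dot + 1))

    return any(it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
               for it in chart[len(tokens)])
```

Calling `recognizes(list("n+n*n"))` accepts the input even though the grammar says nothing about precedence or associativity; an LL or LALR(1) tool would make you restructure the rules (or add precedence declarations) before it accepted the grammar at all.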
I understand Bison has a GLR option (the `%glr-parser` declaration), which would provide context-free parsing, and it is open source, so it might do for just testing out the grammar.
The DMS Software Reengineering Toolkit is commercial and also provides a GLR parser, which has been used to implement some 30+ full languages including C, C++, and COBOL in a number of dialects, as well as more modern languages such as Python, Ruby, PHP, ...
The difference between DMS and Bison is that DMS is designed to support all aspects of the construction of a full language analyzer/translator (Unicode lexing, GLR parsing with error reporting and recovery, automatic tree construction, symbol table construction, control and data flow analysis, transformations, prettyprinting, ...). If you want to seriously evaluate your "new language", you'll eventually need to do all this stuff, and Bison is only a tiny step along this road. DMS will carry you the whole way.