It's a mixed bag ;-)
The second approach seems better in terms of its output:
It consistently puts a string in the element member of the BinaryTreeNode, whereby the first approach uses a mix of strings and BinaryTreeNode instances for that purpose.
At a minimum, I would suggest the first snippet to be rewritten as
BinaryTreeNode b1 = new BinaryTreeNode("A");
BinaryTreeNode b3 = new BinaryTreeNode("B");
BinaryTreeNode b5 = new BinaryTreeNode("C");
BinaryTreeNode bAB = new BinaryTreeNode("-", b1, b3);
BinaryTreeNode bRoot = new BinaryTreeNode("/", bAB, b5);
Another advantage of the second method is that it doesn't require extra storage for the intermediate nodes (bAB, for example above). This is not a big issue because these variables are reference types, and therefore require little storage and copy very efficiently. The 2nd method essentially build these nodes in-situ, and this could be considered more efficient. (This said, trying to create all intermediate nodes in this fashion would become quickly unworkable as the depth of the tree grows...)
There comes an interesting pause, a teaching moment ;-), where we reflect on the temptations of premature optimization...
On the other hand... One advantage of the first method (with the semantic change indicated), is that it makes for much a more regular outlay of the code, which may help locate issues in the tree structure more readily.
[Edit:] A possible hybrid solution would be to use the solution #2 to define the bottom two layers of the tree (the leaf-nodes and the layer above them), i.e. defining [up to] three nodes per line. The advantage of this would be to roughly halve the number of lines necessary to define a tree (could become a factor as the tree grows), and also to not have to figure out and reference a variable for the most numerous nodes (the leaf-nodes). Now... such a structure/syntax, would be much less flexible when/if the tree was appended to, with the need of re-balancing it etc. (maybe this is why the instructor suggested to hardcode the tree, to give the students a "feel" for the type of operations needed for re-balancing the tree.)
This said,
neither approach is Supercalifragilisticexpialidocious.
A slightly cooler approach may be to read the tree from a text file. The syntax in that text may mirror rather closely the one-named-node-per-line apparent in the layout of approach #1, so one may argue that in of itself it wouldn't be a big gain... Maybe only saving a few keystrokes. The advantages of such a solution however become more apparent when we consider that this de-couples the "data" from the "code". We do not need to recompile the program (or to have umpteen different versions of this binary) to work on different trees, we just point the program a different text files. Also this de-coupling becomes interesting when/if the code is refactored, say the name of the node class is modified. In the extreme this code vs. data independence would allow using a completely different tree library, the only necessary change would be to produce a bit of logic which maps the node construction and assembly after the logical structure of a node is parsed from the file.