Looking at the data-link level standards, such as PPP general frame format or Ethernet, it's not clear what happens if the checksum is invalid. How does the protocol know where the next frame begins?
Does it just scan for the next occurrence of "flag" (in the case of PPP)? If so, what happens if the packet payload just so happens to contain "flag" itself? My point is that, whether packet-framing or "length" fields are used, it's not clear how to recover from invalid packets where the "length" field could be corrupt or the "framing" bytes could just so happen to be part of the packet payload.
UPDATE: I found what I was looking for (which isn't strictly what I asked about) by looking up "GFP CRC-based framing". According to Communication networks
The GFP receiver synchronizes to the GFP frame boundary through a three-state process. The receiver is initially in the hunt state where it examines four bytes at a time to see if the CRC computed over the first two bytes equals the contents of the next two bytes. If no match is found the GFP moves forward by one byte as GFP assumes octet synchronous transmission given by the physical layer. When the receiver finds a match it moves to the pre-sync state. While in this intermediate state the receiver uses the tentative PLI (payload length indicator) field to determine the location of the next frame boundary. If a target number N of successful frame detection has been achieved, then the receiver moves into the sync state. The sync state is the normal state where the receiver examines each PLI, validates it using cHEC (core header error checking), extracts the payload, and proceeds to the next frame.
In short, each packet begins with "length" and "CRC(length)". There is no need to escape any characters and the packet length is known ahead of time.
There seems to be two major approaches to packet framing:
- encoding schemes (bit/byte stuffing, Manchester encoding, 4b5b, 8b10b, etc)
- unmodified data + checksum (GFP)
The former is safer, the latter is more efficient. Both are prone to errors if the payload just happens to contain a valid packet and line corruption causes the proceeding bytes to contain the "start of frame" byte sequence but that sounds highly improbable. It's difficult to find hard numbers for GFP's robustness, but a lot of modern protocols seem to use it so one can assume that they know what they're doing.