As is often the case, Wikipedia has an answer:
Architectures that did not have eight-bit bytes include the CDC 6000 series scientific mainframes that divided their 60-bit floating-point words into 10 six-bit bytes. These bytes conveniently held character data from punched Hollerith cards, typically the upper-case alphabet and decimal digits. CDC also often referred to 12-bit quantities as bytes, each holding two 6-bit display code characters, due to the 12-bit I/O architecture of the machine. The PDP-10 used assembly instructions LDB and DPB to load and deposit bytes of any width from 1 to 36-bits—these operations survive today in Common Lisp. Bytes of six, seven, or nine bits were used on some computers, for example within the 36-bit word of the PDP-10. The UNIVAC 1100/2200 series computers (now Unisys) addressed in both 6-bit (Fieldata) and nine-bit (ASCII) modes within its 36-bit word. Telex machines used 5 bits to encode a character.
Factors behind the ubiquity of the eight bit byte include the popularity of the IBM System/360 architecture, introduced in the 1960s, and the 8-bit microprocessors, introduced in the 1970s. The term octet unambiguously specifies an eight-bit byte (such as in protocol definitions, for example).
It's a hardware architecture, not OS, thing. All architectures that you're going to run into, day to day, use eight bits per byte. There may be modern exceptions, particularly in the extremes (mainframe, super, embedded), but I'm not aware of any.