(Converting a short remark with lots of follow-on questions from OP into a long answer:)
I entered bootstrap loaders through switches on Data General Novas (16-bit machines) in 1969. For a Packard Bell 250 that I toyed with in 1967, there was a Flexowriter equipped with a paper tape reader to load the program into the delay-line memory.
For the Nova, there was one row of 16 toggle switches, and a half dozen other switches whose functions were "Set Memory Address", "Deposit Word and advance memory address", "Single Step" (we used that a lot for early debugging!), "Run" and "Halt", IIRC.
The Nova initially had 4096 words of 16-bit memory, the world's ugliest RISC instruction set with four 16-bit registers (stack? who needs a stinking stack?), and something like a 1 microsecond memory cycle time (instructions took several cycles). The machine had a "serial port" that connected to an ASR33 Teletype (10 characters per second), and we splurged for a high-speed paper tape reader and punch.
It initially came as bare hardware, with a paper tape containing a two-pass assembler that read source from, well, the paper tape reader. We initially wrote assembly source onto punched paper tape directly with the ASR33 Teletype; this included "rubout" punches to wipe out characters that were wrong. Editing was incredibly painful. A cohort wrote a bad editor as the very first thing; it read a paper tape into a buffer in the computer and let us find lines, delete lines, and insert lines.
My first job was to implement a Kemeny & Kurtz-style BASIC interpreter on this. Two years later we added 4K more words and a 100K-word head-per-track disk, and I built a multiuser timesharing system on it. Both were written with that ugly editor. Ugh. I built OSes on various machines for the next 20 years.
The Linkers/Loaders book talks about "linkers", which combine multiple object files. Our assembler didn't bother; it produced, for your assembly code, a directly loadable binary paper tape. That tape was designed to be loaded by a tiny program, which brings us back to the opening of this discussion: a 30-odd word bootstrap program entered from the front panel. After each program run (usually ending in "crash") we usually had to re-enter that boot loader; we got pretty good at it, since we often did it several times a day.
Regarding coding forms: people did write code on such forms, and then somebody else often "punched" them onto paper tape. The purpose of the form was to make sure the person doing the punching put stuff where you said, and it made you, the coder, rather more careful about penmanship in what you coded (I still print block capitals pretty neatly). It didn't take us long to realize that you could code free-form, and after that I did most of my coding on graph paper with faint lines so it looked sort of neat before typing it into the editor.
You couldn't effectively hack online with the text editor; it didn't hold enough text for you to cut and paste arbitrarily, so you had to edit in one pass and make changes from beginning to end, in order. Consequently, newspaper-style editing on paper as you initially coded was far more effective. After first coding, we'd run program listings and then write on those listings what edits we wanted to make, before we went and made them. Screen editors with giant buffers are much nicer!
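To make the one-pass constraint concrete, here is a minimal sketch in Python. It is not the editor we had; the `(line_number, op, text)` edit format and the name `one_pass_edit` are made up for illustration. The point it shows is that once the read position has moved past a line, you cannot go back to it, which is why the edits had to be worked out on paper in order first.

```python
def one_pass_edit(source_lines, edits):
    """Apply edits in a single forward pass over the source.

    Hypothetical edit format: (line_number, op, text), with 1-based line
    numbers referring to the ORIGINAL source, and op one of
    "insert" (put text before that line), "delete", or "replace".
    """
    output = []
    current = 1  # 1-based number of the next unread source line

    for line_no, op, text in edits:
        if op not in ("insert", "delete", "replace"):
            raise ValueError(f"unknown op {op!r}")
        if line_no < current:
            # The "tape" has already moved past this line; no going back.
            raise ValueError(
                f"edit at line {line_no} is behind read position {current}"
            )
        # Inserting just past the last line is allowed (it appends).
        limit = len(source_lines) + (1 if op == "insert" else 0)
        if line_no > limit:
            raise ValueError(f"line {line_no} is past the end of the source")

        # Copy untouched lines straight through to the output.
        while current < line_no:
            output.append(source_lines[current - 1])
            current += 1

        if op == "insert":
            output.append(text)   # original line stays unread
        elif op == "delete":
            current += 1          # skip the original line
        else:  # replace
            output.append(text)
            current += 1

    # Copy whatever is left after the last edit.
    output.extend(source_lines[current - 1:])
    return output


edited = one_pass_edit(
    ["A", "B", "C", "D"],
    [(2, "replace", "b"), (4, "insert", "x")],
)
print(edited)  # ['A', 'b', 'C', 'x', 'D']
```

Handing it the same edits out of order raises `ValueError`, which is roughly what "start the pass over" felt like.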
The Packard Bell 250 was similar to the Data General machine, except it (remarkably) had a built-in loader program and came with an editor. Because it had a delay-line memory (think of this as something like a disk track), you placed instructions along the delay line such that when the previous instruction was done, the next instruction just so happened to be coming under the "heads". Talk about painful to optimize... It had one great property: when you got frustrated, you could just whack the CPU and the mechanical shock would make the machine forget everything. Very satisfying for a moment. Then you had to start over.
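The placement problem can be sketched with a toy model in Python. Everything numeric here is an assumption for illustration, not PB250 data: a line of `line_length` words circulating one word per "word-time", and an `exec_time` per instruction counted in word-times from the moment its word is under the heads until the machine is ready to read the next one. The function names are made up too, and the sketch ignores the real headache of two instructions wanting the same slot.

```python
def wait_word_times(current_addr, exec_time, next_addr, line_length):
    """Idle word-times before next_addr comes back under the heads."""
    ready_at = (current_addr + exec_time) % line_length
    return (next_addr - ready_at) % line_length


def total_wait(addresses, exec_times, line_length):
    """Total idle word-times for running the instructions in order."""
    total = 0
    for i in range(len(addresses) - 1):
        total += wait_word_times(
            addresses[i], exec_times[i], addresses[i + 1], line_length
        )
    return total


def timed_layout(start_addr, exec_times, line_length):
    """Place each instruction where the line will be when the
    previous instruction finishes (slot collisions are NOT handled)."""
    addresses = [start_addr]
    for t in exec_times[:-1]:
        addresses.append((addresses[-1] + t) % line_length)
    return addresses


LINE_LENGTH = 16            # assumed words per delay line
EXEC_TIMES = [3, 2, 4, 3]   # assumed word-times per instruction

sequential = [0, 1, 2, 3]
print(total_wait(sequential, EXEC_TIMES, LINE_LENGTH))  # 42

timed = timed_layout(0, EXEC_TIMES, LINE_LENGTH)
print(timed)                                            # [0, 3, 5, 9]
print(total_wait(timed, EXEC_TIMES, LINE_LENGTH))       # 0
```

In this model, putting four instructions in consecutive words costs 42 idle word-times, because each one just misses its successor and waits nearly a full trip around the line; spacing them out by their execution times costs none.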
Ah, to be associated with "ye olden days".