views:

23005

answers:

32

We once had an interesting competition where everyone wrote their own implementation of a hello world program. One requirement was that it should be less than 20 bytes in compiled form. The winner would be the one whose version was the smallest...

What would be your solution? :)

Platform: 32bit, x86

OS: DOS, Win, GNU/Linux, *BSD

Language: Asm, C, or anything else that compiles into binary executable (i.e. no bash scripts and stuff ;)

A: 

There's a print string function in the BIOS, I'd call that BIOS address with the address of the string, which would follow my small block of code.

I've only done this on an old 8-bit micro and CP/M, but I assume it would work on a PC as the BIOS is similar to CP/M's functions.

JeeBee
BIOS is worse, requires more parameters (so more bytes wasted).
Brian Knoblauch
+2  A: 

The Hello World Literal is what, 13 bytes as is? So that definitely makes a challenge :)

FlySwat
Trite platitude.
orokusaki
+9  A: 

In shell, 19 bytes:

:
echo Hello World

I know you tagged this question 'asm'. I'm just offering this as an example of higher-level coding.

Bill Karwin
Doesn't count, because the runtime has to be included in the byte size :)
FlySwat
Fair enough! :)
Bill Karwin
Why? The question didn't adequately specify OS, execution environment, etc.
JesperE
I think this should count. Technically the asm version uses dos calls. The code that supports those isn't counted in the byte count, so why should it count here.
baash05
LOL! Thanks, but it's no big deal. If the intent of the question was to get x86 asm solutions, that's up to the original poster.
Bill Karwin
+15  A: 
m db 'hello world$'  ; store literal
mov dx, OFFSET m    ; pointer to m in dx
mov ah, 9           ; prep for interrupt
int 21h             ; DOS interrupt, reads dx and ah;
                    ; ah=9 makes it print the '$'-terminated string at ds:dx

I haven't assembled it yet, but that's about as small as you can make it. This is from memory from doing assembly about a decade ago, so I might have made a syntax error or two.

EDIT: 21 bytes assembled :(

FlySwat
No return to the OS?
Brian Knoblauch
Wasn't in the requirements :D
FlySwat
Fine. Then mine just got a byte shorter! ;)
BoltBait
+8  A: 
title Hello World
dosseg
.model small
.stack 100h
.data
hello_message db 'Hello World',0dh,0ah,'$'
.code
 main  proc
    mov    ax,@data
    mov    ds,ax
    mov    ah,9
    mov    dx,offset hello_message
    int    21h
    mov    ax,4C00h
    int    21h
main  endp
end   main

actually it's from: http://robertsundstrom.wordpress.com/2008/02/17/assembly-x86/

hello world msg edited for smaller executable

And I'm new on this site; I wasn't aware I had to post links to everything. He just asked how to do it, so I found a resource and gave it to him. That's what you get for helping, I guess.

John T
You get no credit for just posting from http://en.wikiversity.org/wiki/Hello,_world!
FlySwat
+1 for public-domain code reuse
Steven A. Lowe
+271  A: 

20 bytes

        .MODEL  TINY
        .CODE
CODE    SEGMENT BYTE PUBLIC 'CODE'
        ASSUME  CS:CODE,DS:CODE
        ORG     0100H
        DB  'HELLO WORLD$', 0
        INC DH
        MOV AH,9
        INT 21H
        RET

Assemble with Microsoft Macro Assembler using:

ML /AT HELLO.ASM

File size: 20 bytes

That's about as short as it's going to get and still be a well behaved program.

Now, if you just want it shorter and don't care about a few extra characters being output on the screen, you could do it this way:

17 bytes and it does output HELLO WORLD before exiting nicely back to DOS.

        .MODEL  TINY
        .CODE
CODE    SEGMENT BYTE PUBLIC 'CODE'
        ASSUME  CS:CODE,DS:CODE
        ORG     0100H
        MOV     AH,9
        INT     21H
        RET
        DB      'HELLO WORLD$'
CODE    ENDS
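
As a rough sanity check on the two sizes quoted above, the listings can be hand-assembled in a few lines of Python. This is only a sketch: the opcode bytes are the standard 8086 encodings for these instructions, and the variable names are made up for illustration.

```python
# Hand-assembled byte images of the two .COM listings.
# Encodings are standard 8086; names are illustrative only.
MOV_AH_9 = bytes([0xB4, 0x09])   # MOV AH,9
INT_21H = bytes([0xCD, 0x21])    # INT 21H
RET = bytes([0xC3])              # RET (near)
INC_DH = bytes([0xFE, 0xC6])     # INC DH

# First listing: the string comes first (so it is executed as code),
# followed by the 0 pad byte and then the real instructions.
first_listing = b"HELLO WORLD$" + b"\x00" + INC_DH + MOV_AH_9 + INT_21H + RET

# Second listing: code first, DX left at whatever value DOS handed over.
second_listing = MOV_AH_9 + INT_21H + RET + b"HELLO WORLD$"

print(len(first_listing))   # 20
print(len(second_listing))  # 17
```

The three bytes saved in the second listing are the pad byte and the two-byte INC DH.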
BoltBait
Funny, even when using your code, NASM still gives me a 20-byte binary.
FlySwat
Sorry, yes, it is 20 bytes when I include the RET command. *blush*
BoltBait
The second snippet doesn't output "HELLO WORLD" under the NTVDM on my system.
Jonas Gulle
Wow! 17 bytes is awesome :) The smallest solution I've seen was 19 bytes. It used an undocumented DOS int, however....
xelurg
@Jonas, it is possible that your PSP had a $ character in it. Usually it won't, but if it does, the smaller program won't work. That's why I said that the program was not 'well behaved'. Try putting an INC DH after the ORG command. It'll work better and still be 19 bytes.
BoltBait
Is that compiling to a COM file? It looks like the first version would execute the code represented by the hello-world string. Admittedly it's been a few years since I did assembler.
paxdiablo
@Pax: Yes, it is a com file. And, yes, the data is being executed. It is harmless as none of the instructions affect the critical registers. In fact, I had to include the 0 at the end of the string to complete an instruction so that INC DH would be recognized properly.
BoltBait
@BoltBait, upvoting just because you're sneaky...
paxdiablo
Neither of these works in WinXP's command prompt. When running the COM executable, the DX register is set to the value in the segment registers, not 0100h.
Skizz
The first version is wrong since the 'HELLO WORLD' opcodes include several modifications to the stack so the final 'RET' jumps to an undefined address.
Skizz
Oh, the joy of CP/M legacy :D
Thorbjørn Ravn Andersen
+2  A: 

Unfortunately, Jonathan's solution won't work.

From the reference of int 21h, function 09:

"Sends a string to standard output. The string must end with '$' (ascii 36/24h). The '$'-char is not displayed."

So you have to end your "Hello World" string with a "$", so it's 21 bytes. Then, you have to exit the program (otherwise the behaviour is undefined) with two more instructions:

mov ah,4Ch
int 21h

See boltbait's solution for a working one in less than 20 bytes.
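
For a feel of what the exit costs in bytes, here is a small Python sketch comparing the sequence quoted above with two other exits commonly used in .COM files. The encodings are the standard 8086 ones; the dictionary name is hypothetical.

```python
# Standard 8086 encodings for three ways a .COM program can end.
# EXIT_SEQUENCES is an illustrative name, not taken from any answer.
EXIT_SEQUENCES = {
    "mov ah,4Ch / int 21h": bytes([0xB4, 0x4C, 0xCD, 0x21]),
    "int 20h": bytes([0xCD, 0x20]),
    "ret": bytes([0xC3]),
}

sizes = {name: len(code) for name, code in EXIT_SEQUENCES.items()}
for name, size in sizes.items():
    print(f"{name}: {size} byte(s)")
```

With a 20-byte budget, the difference between a four-byte and a one-byte exit is a large share of the whole program.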

friol
Well, you don't have to clean up afterwards, my program printed helloworld then crashed :) But good call on the terminator. This was a fun trip down memory lane.
FlySwat
I disagree. I have already demonstrated in my post a 19 byte perfectly working program.
BoltBait
Microsoft Assembler 1, Nasm 0.
FlySwat
You're right. The inc dh trick is good :)
friol
There are a couple of shorter exits as well. int 20h is fine for a .COM image like this. Even ret is acceptable (it returns to the start of the PSP, where there's an "int 20h" instruction!).
Brian Knoblauch
+470  A: 

Very simple:

h

This is from my unreleased "HelloWorld" language. It's an interpreted language, so there's no real compiled form. The language is very simple though - the above is the only valid program, with the well-defined behaviour of printing "Hello World".

Arguably I could have designed it to only accept an empty file as input, but that would have been silly.

Jon Skeet
There is already a language that does this (and more): HQ9+. See http://www.cliff.biffle.org/esoterica/hq9plus.html
Adam Rosenfield
@Adam: HQ9 looks interesting, but a bit too complicated for lil ole me.
Jon Skeet
(Side note... I can't believe this answer has 5 upvotes at the moment. What are you folks on?)
Jon Skeet
Checking the other code-golf questions, I'm amazed that people aren't learning to phrase questions to avoid this type of answer.
JesperE
If this gets accepted, I'm clearly going to have to actually implement the thing :(
Jon Skeet
LOL - would tag as funny but can't bring myself to upvote it
seanb
lol that's great!
Adam
An EE friend of mine implements these answers in hardware so everything is a single instruction. Makes it easy on the programmer as long as you want to do that one thing. :)
brian d foy
doesn't that violate the "compiles into binary executable" requirement
Tristan Havelick
@DrFredEdison: Yes, was edited into the question after I wrote the answer. It did already say "compiled form" which is why I sort of got around it by saying this was as close to a compiled form as you get with the HelloWorld language :)
Jon Skeet
This is funny. I didn't realize you got all your rep from jokes. :)
MusiGenesis
@MusiGenesis: My net rep from this post is 26 - 7 downvotes and 4 upvotes. The other upvotes must have occurred at times when I'd hit the rep limit for the day. If only we could choose when people voted ;)
Jon Skeet
Jon, that gave me the best laugh I have had in quite a while. Thanks.
tyndall
Jon, how can you see the upvotes and downvotes so precisely?
Johannes Schaub - litb
@litb: I look at the Reputation tab, which tells me how much reputation I've *lost* from a question. Divide by 2 to get the downvotes. Add that number to the overall total, and that's the number of upvotes.
Jon Skeet
If the only valid program is h, there is no reason not to compile it to machine code.
Eduardo León
Would this post have gotten so many up-votes if anyone else had posted it? ( I very much doubt it )
Brad Gilbert
@Brad: Probably not. It did get quite a few votes when originally posted though, which was before the "Jon Skeet Facts" question etc.
Jon Skeet
I would downvote this, but my answer was almost as silly. =)
Henk
I am getting low on votes left for the day, but I still had to vote this up. Excellent.
Charlie Flowers
This language is a proper subset of HQ9+
Benson
Wow! This gets 200+ votes vs the serious answer with 120+! ???????!???!?!?!?!?!?!?!?!?!?!?!?!?!
PiPeep
Smart eh? The guy who asked the question said it should have a compiled form that is <= 20 bytes.
@Senthil: "It's an interpreted language, so there's no real compiled form."
Jon Skeet
WTF!! now I know that your reputation is increasing so fast as a result of a stereotype that you always should be right and upvoted.
Mustafa A. Jabbar
Also, if you accepted an empty file then the language would not meet the requirements of a formal language and would be ambiguous.
monksy
Isn't this behavior implemented somewhere in Emacs as well?
McAden
-1 the question clearly states "anything else that compiles into binary executable", sry!
João Portela
it can be even shorter than that
Time Machine
@McAden: Yes.. but it requires an 11-key chord: H, e, l, l, ...
Roger Pate
This is absolutely hilarious!
Bryan Roth
Mine is even shorter, but not interpreted =D http://stackoverflow.com/questions/284797/hello-world-in-less-than-20-bytes/2613509#2613509
Time Machine
There are 8 (at time of posting) implementations of HQ9+ on [Rosetta Code](http://rosettacode.org/wiki/Category:HQ9%2B_Implementations).
Donal Fellows
You must put the period inside the quotes, according to the English grammar. Damn, the post on 'How to determine if someone is a programmer' really tells the truth.
Time Machine
@Koning: In this case, putting the period within the quotes would give the misleading impression that the period would be printed. Clarity trumps rules in my opinion and that of Fowler.
Jon Skeet
Hoping it's ASCII, because if it's Unicode it's twice as long as it needs to be.
Otaku
@Brad Gilbert - "Would this post have gotten so many up-votes if anyone else had posted it?" - [No, others are punished.](http://stackoverflow.com/questions/3499538/code-golf-conways-game-of-life/3522249#3522249)
Andreas_D
+43  A: 

It's highly unlikely that the actual compiled executable would ever be only 20 bytes. You normally have lots of OS-specific loading code, and ELF headers are huge as it is.

The only way to get something running in that short a space would be a "new" compiled format which was merely the string "hello world", that could be injected directly into the parent OS, and somehow the OS just "knew" that that file was to be printed.

But then you're getting into technicalities, because you are, by proxy, calling an entire kernel to do the real work, and that averages between 2 and 8 MB.

If you're going to do something that works without an OS, then you have initialization routines you have to run just to make it work.

And then the question evolves: does "20 bytes" include or exclude the byte-code in the BIOS?

If you really want to do this, maybe you should have a dedicated hardware architecture, with a hard-wired ROM chip with the "20 bytes" of code programmed in electronically.

Outside these weird technical constraints, I argue that I have my own "interpreted language", known as "cat".

"cat" is also my OS, and I use it to create the initial source file.

1) Coding, and compiling my bytecode:

 cat > helloworld.txt 
 Hello World^D^D

2) Executing the "compiled" bytecode

 cat helloworld.txt 
 Hello World

see. marvelous. I'll take the gold medal thanks :)

Only 11 bytes.

Late linux trick

$ cd /tmp
$ echo '#!/bin/echo' >> hello
$ chmod u+x hello
$ ./hello world
> ./hello world
$ wc -c ./hello
> 12 ./hello
Kent Fredric
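
The "late linux trick" works because the kernel runs the interpreter named on the `#!` line and appends the script path and the original arguments. A simplified Python model of that argument list, with a hypothetical function name, might look like this:

```python
def shebang_argv(first_line, script, args):
    """Simplified model of the argv Linux builds for a '#!' script.

    Illustrative only: the real kernel also enforces length limits.
    """
    if not first_line.startswith("#!"):
        raise ValueError("not a shebang line")
    rest = first_line[2:].strip()
    # Everything after the interpreter path is passed as ONE optional argument.
    interpreter, _, optional = rest.partition(" ")
    argv = [interpreter]
    if optional.strip():
        argv.append(optional.strip())
    return argv + [script] + list(args)


argv = shebang_argv("#!/bin/echo", "./hello", ["world"])
print(argv)                   # ['/bin/echo', './hello', 'world']
print(" ".join(argv[1:]))     # ./hello world
print(len(b"#!/bin/echo\n"))  # 12
```

So echo simply prints its own arguments, which start with the script path, and the 12 bytes reported by wc are the 11 characters of the shebang line plus the newline added by the shell's echo.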
Under DOS, the ".COM" format is pure binary image. The 256 byte PSP is constructed during load. There's no relocation information required. You can have a single byte .COM application under DOS (the ret instruction is a single byte opcode)!
Brian Knoblauch
Well, you can actually shrink a perfectly legal ELF binary quite a lot. Not to 20 bytes, but still: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
JesperE
JesperE beat me to the punch. It is possible to shrink ELF down quite a lot, but not to 20 bytes.
Max Lybbert
bonus question: make cat turing complete :)
cobbal
sorry @cobbal, cat only has print implemented, which changes its behaviour depending on environment, but patches welcome.
Kent Fredric
+12  A: 

If you want shortest source code, "Hello World" is a valid program in quite a few languages.

David Thornley
+29  A: 

Are you all using AH=09h/INT21h function?

There is an undocumented INT 29h which echoes the character in AL on the screen. So here is an alternative solution with a loop.

Not "well behaved", but better than BoltBait's, which can randomly end prematurely if there is a '$' in the PSP. This program will output "Hello World" along with some junk characters.

17 bytes

  bits 16
  org 100h
next:
  lodsb           ; AC
  int 29h         ; CD29
  loop next       ; E2FB
  ret             ; C3
msg db "Hello World"

Assuming CX is 0 and SI < 106h on entry.
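
The opcode bytes in the comments above can be checked with a short Python sketch. It only verifies the total size and that the LOOP displacement really points back at `next`; the variable names are illustrative.

```python
# Byte image built from the opcode comments in the listing.
code = bytes([
    0xAC,        # lodsb
    0xCD, 0x29,  # int 29h
    0xE2, 0xFB,  # loop next
    0xC3,        # ret
])
image = code + b"Hello World"

# LOOP uses a signed 8-bit displacement relative to the next instruction.
loop_offset = 3                    # position of the E2 opcode in the image
next_ip = loop_offset + 2          # address of the instruction after LOOP
disp = int.from_bytes(image[4:5], "little", signed=True)
target = next_ip + disp

print(len(image))    # 17
print(disp, target)  # -5 0
```

A target of 0 is the first byte of the image, i.e. the `next:` label at 100h once DOS has loaded the file.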

Jonas Gulle
If memory serves me well, the winner did use int 29h or int 2eh... It was quite a while ago, but the solution was 19 bytes long :)
xelurg
+43  A: 

Here's a 32-byte version using Linux system calls:


.globl _start
_start:
        movb $4, %al
        xor %ebx, %ebx
        inc %ebx
        movl $hello, %ecx
        xor %edx, %edx
        movb $11, %dl
        int $0x80               # sys_write(1, $hello, 11)
        xor %eax, %eax
        inc %eax
        int $0x80               # sys_exit(something)
hello:
        .ascii "Hello world"
When compiled into a minimal ELF file, the full executable is 116 bytes:
00000000  7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 03 00 01 00 00 00  54 80 04 08 34 00 00 00  |........T...4...|
00000020  00 00 00 00 00 00 00 00  34 00 20 00 01 00 00 00  |........4. .....|
00000030  00 00 00 00 01 00 00 00  00 00 00 00 00 80 04 08  |................|
00000040  00 80 04 08 74 00 00 00  74 00 00 00 05 00 00 00  |....t...t.......|
00000050  00 10 00 00 b0 04 31 db  43 b9 69 80 04 08 31 d2  |......1.C.i...1.|
00000060  b2 0b cd 80 31 c0 40 cd  80 48 65 6c 6c 6f 20 77  |....1.@..Hello w|
00000070  6f 72 6c 64                                       |orld|
00000074
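
To see where the 116 bytes go, the dump can be parsed with Python's `struct` module. This is a sketch that assumes the standard ELF32 header and program header layouts (little-endian); the constant and variable names are chosen for illustration.

```python
import struct

# Bytes copied from the hex dump of the hand-built ELF file.
HEX_DUMP = """
7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
02 00 03 00 01 00 00 00 54 80 04 08 34 00 00 00
00 00 00 00 00 00 00 00 34 00 20 00 01 00 00 00
00 00 00 00 01 00 00 00 00 00 00 00 00 80 04 08
00 80 04 08 74 00 00 00 74 00 00 00 05 00 00 00
00 10 00 00 b0 04 31 db 43 b9 69 80 04 08 31 d2
b2 0b cd 80 31 c0 40 cd 80 48 65 6c 6c 6f 20 77
6f 72 6c 64
"""
image = bytes.fromhex("".join(HEX_DUMP.split()))

ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")  # 52 bytes
PROGRAM_HEADER = struct.Struct("<8I")              # 32 bytes

(ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
 e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
 e_shstrndx) = ELF32_HEADER.unpack_from(image, 0)

(p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags,
 p_align) = PROGRAM_HEADER.unpack_from(image, e_phoff)

# File offset of the first instruction, derived from the entry point.
code_start = e_entry - p_vaddr + p_offset
payload = image[code_start:]

print(len(image), e_ehsize, e_phentsize * e_phnum, len(payload))
# 116 52 32 32
```

In other words, 84 of the 116 bytes are headers, and the remaining 32 are the code and string from the listing.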
Adam Rosenfield
for some reason I couldn't get that code to compile under 160b. ( and that was after STRIP and hand ripping code off the end after helloworld with a hex editor )
Kent Fredric
See http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html . The ELF I gave above is the minimal possible ELF without doing any sneaky trickery from that article. I constructed this ELF by hand, not with any normal compiler or linker.
Adam Rosenfield
Wow, hand crafted elf header.. nice... :) That's a trick I'll have to remember...
dicroce
+55  A: 

I win!

1 byte!

ret                  ; C3

But! First you have to install this little interrupt vector so that we are a part of the operating system :) Works perfectly under the NTVDM too.

Binaries and source code here for a working example. First run "hw_hook.com", then "hello.com".

The RET will end up invoking the INT 20h, so I hook the INT 20h with my "Hello World" interrupt vector.

%macro sethook 3
 push es
 push ds
 xor ax, ax
 mov es, ax
 mov ds, ax
 mov di, (%1 * 4)    ; calculate IVT offset (4 bytes per vector)
 mov si, di
 cli
 lodsd               ; get the old handler
 mov [cs:%3], eax    ; save the old handler
 mov ax, %2          ; and the offset to the new handler
 rol eax, 16
 push cs
 pop ax
 rol eax, 16
 stosd               ; set the new handler
 sti
 pop ds
 pop es
%endmacro

 bits 16
 org 100h
_start:
 int 12h             ; get total free memory in ax
 dec ax              ; reserve one kilobyte
 push ds
 push 40h
 pop ds
 mov [ds:13h], ax    ; store the new memory size
 pop ds
 ; now relocate
 push cs
 pop ds
 mov si, hello_hook  ; source offset of the code to copy
 shl ax, 6
 mov es, ax
 xor di, di
 mov cx, hook_size
 rep movsb
 ; setup "hello world" hook :)
 sethook 20h, hello_hook, orig_vector
 mov ax, 4c00h
 int 21h

hello_hook:
 pusha
 pushf
 mov ah, 09h
 mov dx, msg
 int 21h
 popf
 popa
 pushf
 call far [cs:orig_vector]
 iret
 msg db "Hello World$"
 orig_vector dd 0
 hook_size equ $ - hello_hook
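
Two bits of arithmetic in the listing are easy to miss: the `%1 * 4` in the macro and the `shl ax, 6` after `int 12h`. A minimal Python sketch of both, with hypothetical function names and an assumed 640 KB of conventional memory for the example:

```python
def ivt_offset(vector):
    """Byte offset of a real-mode interrupt vector in the table at 0000:0000."""
    return vector * 4  # each entry is a 16-bit offset plus a 16-bit segment


def kb_to_segment(kilobytes):
    """Segment value (paragraph number) where the given kilobyte mark starts."""
    return kilobytes << 6  # 1024 bytes / 16 bytes per paragraph = 64


print(hex(ivt_offset(0x20)))        # 0x80
# Assuming int 12h reported 640 KB, dec ax leaves 639.
print(hex(kb_to_segment(640 - 1)))  # 0x9fc0
```

So the INT 20h vector lives at 0000:0080, and on such a machine the reserved kilobyte would start at segment 9FC0h.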
Jonas Gulle
As funny as the joke can be, thanks for posting this little code, might actually be helpful to some :D
Vincent Robert
Hmmmm... this makes me want to write "Hello World" to my boot sector
GameFreak
+1 for moving the string out of the domain being measured. No bonus just for tying with Jon Skeet.
Thorbjørn Ravn Andersen
lmao that was funny
Shimmy
+10  A: 

0-byte Hello World program

  1. In an empty directory, create a 0-byte file named "Hello World"
  2. Execute it by typing "dir/b" (or "ls" depending on your OS) at the command prompt
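
The two steps above can be reproduced without touching a real directory. This Python sketch uses a temporary directory and `os.listdir` as a stand-in for `dir/b` or `ls`; the variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as empty_dir:
    path = os.path.join(empty_dir, "Hello World")
    open(path, "wb").close()           # step 1: create the 0-byte file
    size = os.path.getsize(path)
    listing = os.listdir(empty_dir)    # step 2: list the directory

print(size)     # 0
print(listing)  # ['Hello World']
```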
Ferruccio
Hrm. But DIR is the program printing Hello World... so really the program is whatever size dir.exe is...
Telos
@Telos: using that logic, you must include the size of the OS as well ;-)
Ferruccio
No, the bigger problem is that the program data is not the file's contents but the filename itself. Therefore, the program is in fact 11 bytes and not 0.
Coding With Style
What if we run dir outside the directory? The solution is path dependent. ANSWER REJECTED
Behrooz
+9  A: 

Language? Processor?

I define a language with the following grammar: {Start} -> nil

All statements in this language are defined to print "Hello World" on execution.

The following string is a program in this language that prints "Hello World".

""

Zero bits. I win.

Alan Oursland
sorry you're not a worshipped C# troll, so nobody will vote you up.
Matt Joiner
Haha Matt, nice one! :)
Ahmet Alp Balkan
+6  A: 

The smallest PE file (Win32) you can create is 97 bytes.

See the Tiny PE page for all the gory details.

Shane Powell
+264  A: 
henk@korhal ~ $ xxd Hello\ World\! 
0000000: 00                                       .
henk@korhal ~ $ ./Hello\ World\! 
bash: ./Hello World!: cannot execute binary file
henk@korhal ~ $

1 byte binary, prints hello world!

Henk
Haha ;) nice one
Daok
i like this one =) +1 to the best abuse of the operating system
Chii
Couldn't an empty file suffice as well?
Joey
Nope, try it and see.
Henk
An empty file works on NT-based Windows just fine.
Coding With Style
Ah, ok. On Linux you need the null to convince the system that it's not a text file, an empty file is treated as an empty shell script.
Henk
+27  A: 

in DOS:

C:\>debug
-e 10 "Hello World!$"
-a
139B:0100 mov ah,09
139B:0102 mov dl,10
139B:0104 int 21
139B:0106 int 20
139B:0108
-g
Hello World!
Program terminated normally
-

from 0100 to 0108, EIGHT BYTES. Of course the "Hello World!" wasn't really part of the program it was just hanging out in the memory.

I win, competition over.

rawr
HAHA - minus the "go to hell" part!!!
Aaron
+3  A: 

GWBasic / QBasic:

?"Hello World"

tsilb
+1  A: 

I quite like the shell script

#!/bin/ls -1

although it needs to be named "Hello, world!" and be the only file in its directory (hey, at 12 bytes the program is shorter than the output). A little better, and shorter too (7 bytes), is

echo $0

This needs to be named Hello world, but works in any directory (that's gotta' be the weirdest "feature" of a program).

Of course, there's always the empty string as a program in my HelloWorldLanguage in which every string is a program which prints the famous greeting. You can use any of the above as an interpreter for it ;-)

By the way, I think Jon Skeet's language is inadequate: it doesn't allow for comments, in-line documentation or in-line test cases. In practice, I think I'd prefer

module Greeter
/**
 * print the given message, defaulting to "Hello world"
 * @argument message The message to be printed
 * @return whether the print was successful
 * @postcondition Either nothing is printed, or (only if message is non-null) the message is printed
 */
boolean printMessage(String message) {
    try { System.console.OutPut(message); }
    except (InputOutPutException theexception) {
        return false;
    }
    return true;
}
/***
 * void testTestFrameworkIO(String arguments[]) {
 *    assert(printMessage("Hello, world"));
 */

Which is also a valid program in HelloWorldLanguage. I think it's been underengineered: there has to be some applicable design pattern I can use... :D

Jonas Kölker
+3  A: 

BASIC

? "Hello World"

Midhat
+1  A: 

not serious... but today is April 1st:
If letting the shell do the job is valid, just set the prompt to "Hello World" and every time you press the Enter key you get a new "Hello World" (or change to a directory with that name...) :-)

Carlos Heuberger
+7  A: 

In MSIL

.method private hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       13 (0xd)
  .maxstack  8
  IL_0001:  ldstr      "Hello World"
  IL_0006:  call       void [mscorlib]System.Console::Write(string)
  IL_000c:  ret
}

Well it says 13, doesn't it?
[Edit]
Changed WriteLine to Write.

Binoj Antony
It is 13 + 13 =26
Behrooz
+4  A: 

In good old BASIC:

? "Hello World"
warren
does it have <20 bytes in compiled form?
Ahmet Alp Balkan
@Ahmet Alp Balkan - I don't know about "compiled": BASIC was an interpreted language
warren
+1  A: 

Using DOS scripting:

echo Hello World

Done :-)

The Elite Gentleman
+1  A: 

I know you said no interpreted languages, but...

In IRB, a Ruby shell, 13 bytes: "Hello world"

fahadsadah
+3  A: 
~$ HELLO_WORLD 
HELLO_WORLD: command not found
peak
+3  A: 

Create a file called void main(){puts("Hello, World!");}, assuming your filesystem will allow it. Its contents should be:

A

Compile it with -istdio.h and -DA=__FILE__

If your filesystem supports newlines in giant filenames, you can write entire programs like this.

Time Machine
`main` returns `int`! And you might as well leave the return value off if you're going for implicit declarations.
GMan
A: 
Hello World  

BrainFuck

Below is another program to print Hello World in BrainFuck

+++++ +++++             initialize counter (cell #0) to 10
[                       use loop to set the next four cells to 70/100/30/10
    > +++++ ++              add  7 to cell #1
    > +++++ +++++           add 10 to cell #2 
    > +++                   add  3 to cell #3
    > +                     add  1 to cell #4
    <<<< -                  decrement counter (cell #0)
]                   
> ++ .                  print 'H'
> + .                   print 'e'
+++++ ++ .              print 'l'
.                       print 'l'
+++ .                   print 'o'
> ++ .                  print ' '
<< +++++ +++++ +++++ .  print 'W'
> .                     print 'o'
+++ .                   print 'r'
----- - .               print 'l'
----- --- .             print 'd'
> + .                   print '!'
> .                     print '\n'

I won, because all of you used more characters.
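
For anyone who wants to check both programs, here is a minimal Brainfuck interpreter sketch in Python. It is an illustration, not a reference implementation: it assumes 8-bit wrapping cells, ignores input (`,`), and the names `run_bf` and `PROGRAM` are made up. `PROGRAM` is the commented listing above with the comments stripped.

```python
def run_bf(source, tape_len=30000):
    """Run a Brainfuck program (no input support) and return its output."""
    # Only the eight command characters count; everything else is a comment.
    code = [c for c in source if c in "+-<>[].,"]
    # Pre-compute the matching position for every bracket.
    jump, stack = {}, []
    for i, c in enumerate(code):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            jump[i], jump[j] = j, i
    tape, ptr, pc, out = [0] * tape_len, 0, 0, []
    while pc < len(code):
        c = code[pc]
        if c == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ">":
            ptr += 1
        elif c == "<":
            ptr -= 1
        elif c == ".":
            out.append(chr(tape[ptr]))
        elif c == "[" and tape[ptr] == 0:
            pc = jump[pc]   # skip the loop body
        elif c == "]" and tape[ptr] != 0:
            pc = jump[pc]   # go back to the start of the loop
        pc += 1
    return "".join(out)


PROGRAM = (
    "++++++++++"
    "[>+++++++>++++++++++>+++>+<<<<-]"
    ">++.>+.+++++++..+++.>++."
    "<<+++++++++++++++.>.+++.------.--------.>+.>."
)

print(repr(run_bf(PROGRAM)))        # 'Hello World!\n'
print(repr(run_bf("Hello World")))  # ''
```

The second call shows what happens with the plain text `Hello World` as a program: it contains none of the eight commands, so nothing is printed.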

Behrooz
Note: BF has many compilers.
Behrooz
Why downvotes? Did I break any rules?
Behrooz
That BFck contains no operations, the output is a string with the size equal to zero.
Frank
@Frank: Nope, the output is exactly equal to the source because I haven't used .,[]<>+-
Behrooz
Nice, I've had a lot of fun with BFck :) I'd forgotten all about a past implementation I wrote :D
KennyCason
+2  A: 

Open a Notepad, write only Hello World

File > Save As > a.cmd

11 Bytes

Rastro
A: 

No bash, no batch, no asm, no any exotic solution, just python:

print"Hello World"
Csaryus
A: 

Many of these answers stretch the boundaries of what is to be accepted. Numerous implementations, albeit clever, rely on the OS's underlying functionality. There is no good reason to include such submissions, yet there are also no concise and unequivocal criteria for excluding code that relies too heavily on compiler/interpreter/assembler/OS dependencies.

So then the only real distinction we may use to discern "valid" code is OS portability. The code must be able to run on some form of popular operating system.

But then a new conundrum appears: how much of the system's built-in resources are we allowed? Does simply writing "echo Hello World" count? However, I guess this is addressed by the language limitation that it must be compilable. In which case, plenty of submissions seem to have missed this restriction. :)

In that case, the only discernibly sane approach is the following:

Rewire the CPU datapath to interpret all instructions as output register 0, then using a powerful magnet and steady hands, burn the "Hello World" representation into register 0 bit by bit.

(Actually, "Hello World" cannot possibly fit into even a 64-bit register; OK, use the entire register block.)

There you go, a real 0 byte implementation. :)

Razor Storm