Writing the shortest valid C quine
For a while, I have had a fascination with code poetry (also things like “perl golf”). Elegance is quite beautiful, really.
What is a quine?
In short, it is a program that prints its own source code without reading its own source file. Below is a quine written in C, by an unknown programmer. Supposedly it was written in the margin of a book by Ray Toal, a professor at Loyola Marymount University, in the 80s.
Ray has a much more in-depth webpage on quines. Check it out, it’s great.
Note: Since writing the article I read that a computer scientist named Szymon Rusinkiewicz won an award in the 1994 IOCCC for the same empty-file quine that this article eventually arrives at: smr.c.
main(){char*s="main(){char*s=%c%s%c;printf(s,34,s,34);}";printf(s,34,s,34);}
If you compile the code, you will probably get a bunch of warnings, but as expected, when executed, it prints its own source. You can verify that it isn’t just reading the source file by deleting the .c file after compiling and running the binary again.
λ ~/ cat q.c
main(){char*s="main(){char*s=%c%s%c;printf(s,34,s,34);}";printf(s,34,s,34);}
λ ~/
λ ~/ gcc q.c -o quine
q.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
1 | main(){char*s="main(){char*s=%c%s%c;printf(s,34,s,34);}";printf(s,34,s,34);}
| ^~~~
q.c: In function ‘main’:
q.c:1:58: warning: implicit declaration of function ‘printf’ [-Wimplicit-function-declaration]
1 | main(){char*s="main(){char*s=%c%s%c;printf(s,34,s,34);}";printf(s,34,s,34);}
| ^~~~~~
q.c:1:1: note: include ‘<stdio.h>’ or provide a declaration of ‘printf’
+++ |+#include <stdio.h>
1 | main(){char*s="main(){char*s=%c%s%c;printf(s,34,s,34);}";printf(s,34,s,34);}
q.c:1:58: warning: incompatible implicit declaration of built-in function ‘printf’ [-Wbuiltin-declaration-mismatch]
1 | main(){char*s="main(){char*s=%c%s%c;printf(s,34,s,34);}";printf(s,34,s,34);}
| ^~~~~~
q.c:1:58: note: include ‘<stdio.h>’ or provide a declaration of ‘printf’
λ ~/ ./quine
main(){char*s="main(){char*s=%c%s%c;printf(s,34,s,34);}";printf(s,34,s,34);}%
λ ~/
My foray into writing the shortest C quine
I have quite a bit of code, some of it works… most of it, sadly, does not.
So why not cut the code?
I wondered if gcc would compile an empty file! Could it be?
So I went ahead and did echo > qui.c, then I ran gcc qui.c -o quine and… the compiler promptly projectile vomited all over my workstation. Looking at what it actually says, I realized the complaint does not come from the C compiler at all. It comes from the linker, ld, trying to resolve the reference to main() made by the startup code (_start in Scrt1.o), while… we have none.
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/Scrt1.o: in function `_start':
(.text+0x1b): undefined reference to `main'
collect2: error: ld returned 1 exit status
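Let’s try this again. The short explanation is that I’m going to tell the compiler to only generate an object file (-c) and not try to link it yet, then link the object manually with ld. Linking by hand leaves out the startup code that expects main(), so the entry point ends up aimed at dead space in memory: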
Note: This was originally written using echo > qg.c. However, morb reminded me that echo spits out a newline by default. Using touch creates a truly empty file, which is shorter.
touch qg.c && gcc qg.c -o gg.o -c && ld gg.o -o quine
So actually, running this program does put a message on standard error, but the program is not the one printing it. Execution starts at an address with no code behind it, the process segfaults, and the operating system (by way of the shell) reports the crash. The program itself does, in fact, print nothing to standard out, as it should. It is interesting to see what makes up a completely bare-bones ELF like this:
λ ~/ objdump -D quine
quine: file format elf64-x86-64
Disassembly of section .note.gnu.property:
0000000000400120 <__bss_start-0xee0>:
400120: 04 00 add $0x0,%al
400122: 00 00 add %al,(%rax)
400124: 10 00 adc %al,(%rax)
400126: 00 00 add %al,(%rax)
400128: 05 00 00 00 47 add $0x47000000,%eax
40012d: 4e 55 rex.WRX push %rbp
40012f: 00 02 add %al,(%rdx)
400131: 00 00 add %al,(%rax)
400133: c0 04 00 00 rolb $0x0,(%rax,%rax,1)
400137: 00 03 add %al,(%rbx)
400139: 00 00 add %al,(%rax)
40013b: 00 00 add %al,(%rax)
40013d: 00 00 add %al,(%rax)
...
Disassembly of section .comment:
0000000000000000 <.comment>:
0: 47 rex.RXB
1: 43 rex.XB
2: 43 3a 20 rex.XB cmp (%r8),%spl
5: 28 55 62 sub %dl,0x62(%rbp)
8: 75 6e jne 78 <__bss_start-0x400f88>
a: 74 75 je 81 <__bss_start-0x400f7f>
c: 20 31 and %dh,(%rcx)
e: 31 2e xor %ebp,(%rsi)
10: 33 2e xor (%rsi),%ebp
12: 30 2d 31 75 62 75 xor %ch,0x75627531(%rip) # 75627549 <__bss_start+0x75226549>
18: 6e outsb %ds:(%rsi),(%dx)
19: 74 75 je 90 <__bss_start-0x400f70>
1b: 31 7e 32 xor %edi,0x32(%rsi)
1e: 32 2e xor (%rsi),%ch
20: 30 34 29 xor %dh,(%rcx,%rbp,1)
23: 20 31 and %dh,(%rcx)
25: 31 2e xor %ebp,(%rsi)
27: 33 2e xor (%rsi),%ebp
29: 30 00 xor %al,(%rax)
Note: The linker will spit out ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000, which I think is the fallback entry point, meaning the address the instruction pointer is set to when the program starts, used here because I did not define _start. I’ll have to keep in mind for future exploit development that the linker defaults to this entry address…
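We can now run our code. You’ll see the segfault message; if you want, you can try directing the stderr file descriptor to /dev/null, though the shell may still print the message since it is the one reporting the crash. Either way, nothing appears on standard out, so it does, indeed, print its own source code.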