Tiny x64 Hello World

A step-by-step adventure to find out how small an x64 binary that prints "hello, world" can be.

  • OS: CentOS 7, Linux 3.10.0-862.el7.x86_64
  • GCC: gcc (Homebrew GCC 5.5.0_7) 5.5.0

Overview

  • make build: build all steps
  • make list: list binary size of all steps
  • final/hello.out is our final result, and it's 170 bytes 🎉
$ make build && make list
-rwxr-xr-x 1 root root 16712 Dec  4 00:08 step0/hello.out
-rwxr-xr-x 1 root root 14512 Dec  4 00:08 step1/hello.out
-rwxr-xr-x 1 root root 14512 Dec  4 00:08 step2/hello.out
-rwxr-xr-x 1 root root 13664 Dec  4 00:08 step3/hello.out
-rwxr-xr-x 1 root root 12912 Dec  4 00:08 step4/hello.out
-rwxr-xr-x 1 root root   584 Dec  4 00:08 step5/hello.out
-rwxr-xr-x 1 root root   440 Dec  4 00:08 step6/hello.out
-rwxr-xr-x 1 root root   170 Dec  4 00:08 step7/hello.out
-rwxr-xr-x 1 root root   170 Dec  4 00:08 final/hello.out

Step0

This is our first try, the good old program to print "hello, world".

#include <stdio.h>

int
main()
{
  printf("hello, world\n");
  return 0;
}

Run cd step0 && make build and we get our first hello.out binary.

Unfortunately, it's too big: 16712 bytes!

Step1: Strip Symbols

Let's start with an easy move: strip all the symbols.

Let gcc -s do the work, and now we get our new binary. It's 14512 bytes.

Still big, but hey, we are certainly making progress, no need to hurry 😉.

Step2: Optimization

Modern compilers can do a lot of "magic" to optimize our program, let's give it a try.

gcc -O3 enables the highest standard optimization level, and we find out that our binary size stays the same 😢, 14512 bytes.

It actually makes sense though. Our program is too simple, there isn't any room left to optimize.

Step3: Remove Startup Files

Our C program always starts with main, but do you ever wonder who calls main?

It turns out that main is called by something called crt, the C runtime startup code that ships with the compiler toolchain.

If we remove it, our binary must be smaller, right?

We need to change our program a little bit.

  • Let's change our entry function name to nomain to make it more obvious that we don't use crt
  • Since we don't use crt, we need to explicitly use a system call to exit
#include <stdio.h>
#include <unistd.h>

int
nomain()
{
  printf("hello, world\n");
  _exit(0);
}

You may wonder why we are using _exit rather than exit. Good old Stack Overflow always helps: see "What is the difference between using _exit() & exit() in a conventional Linux fork-exec?".

Use gcc -e nomain -nostartfiles to compile our program, and now our binary is 13664 bytes.

We are making progress again!

Step4: Remove Standard Library

We can go even wilder. We don't need the crt to do the startup, so why do we need printf to print? We can certainly do it on our own!

To print something to the terminal, we need to use the write system call. Here is the full x64 system call table.

To invoke a system call directly in C, we need to use inline assembly.

char *str = "hello, world\n";

void
myprint()
{
  asm("movq $1, %%rax \n"
      "movq $1, %%rdi \n"
      "movq %0, %%rsi \n"
      "movq $13, %%rdx \n"
      "syscall \n"
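      : // no output
      : "r"(str)
      // syscall itself overwrites rcx and r11, and the kernel reads memory via rsi
      : "rax", "rdi", "rsi", "rdx", "rcx", "r11", "memory");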
}

void
myexit()
{
  asm("movq $60, %rax \n"
      "xor %rdi, %rdi \n"
      "syscall \n");
}

int
nomain()
{
  myprint();
  myexit();
}

Looks kind of messy, but it's actually very simple code.
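If the explicit movq lines feel noisy, here is a slightly tidier sketch of the same idea. It lets GCC place the arguments through register constraints ("a" for rax, "D" for rdi, "S" for rsi, "d" for rdx) and hands back the kernel's return value. The name raw_write is hypothetical and not part of the repository, and the code assumes x86-64 Linux with GCC or Clang.

```c
#include <stddef.h>

/* Hypothetical helper: call write(2) directly on x86-64 Linux.
 * Returns the number of bytes written, or a negative errno value. */
long raw_write(int fd, const void *buf, size_t len)
{
  long ret;
  __asm__ volatile("syscall"
                   : "=a"(ret)      /* rax: return value            */
                   : "a"(1L),       /* rax: syscall number 1, write */
                     "D"((long)fd), /* rdi: file descriptor         */
                     "S"(buf),      /* rsi: buffer address          */
                     "d"(len)       /* rdx: byte count              */
                   : "rcx", "r11", "memory"); /* clobbered by syscall */
  return ret;
}
```

With this helper, myprint would shrink to raw_write(1, str, 13). A bad file descriptor comes back as a negative errno value rather than through the errno variable, since no libc is involved.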

Pass -nostdlib to tell GCC we don't want the standard library, since we are cool enough to do all the things by ourselves.

And we get 12912 bytes.

Our program doesn't depend on anything now, but it's still very big. Why?

Step5: Custom Linker Script

Let's examine the sections of our binary.

$ readelf -S -W step4/hello.out
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        0000000000401000 001000 00006e 00  AX  0   0 16
  [ 2] .rodata           PROGBITS        0000000000402000 002000 00000e 01 AMS  0   0  1
  [ 3] .eh_frame_hdr     PROGBITS        0000000000402010 002010 000024 00   A  0   0  4
  [ 4] .eh_frame         PROGBITS        0000000000402038 002038 000054 00   A  0   0  8
  [ 5] .data             PROGBITS        0000000000404000 003000 000008 00  WA  0   0  8
  [ 6] .comment          PROGBITS        0000000000000000 003008 000022 01  MS  0   0  1
  [ 7] .shstrtab         STRTAB          0000000000000000 00302a 000040 00      0   0  1

The Off column looks kind of strange: some sections start at very large offsets.

Maybe the linker does some alignment? Check ld --verbose and yes, it does!

So our binary is this big because of alignment. If we use xxd to view the binary's content, we can see that there are a lot of zeros.
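The amount of padding can be estimated with a little arithmetic. The sketch below assumes a 0x1000-byte alignment, which matches the offsets in the readelf output above; the function names are hypothetical.

```c
#include <stdint.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
uint64_t align_up(uint64_t offset, uint64_t align)
{
  return (offset + align - 1) & ~(align - 1);
}

/* Zero bytes between the end of one section and the aligned start of the next. */
uint64_t padding_after(uint64_t start, uint64_t size, uint64_t align)
{
  uint64_t end = start + size;
  return align_up(end, align) - end;
}
```

For example, .text starts at file offset 0x1000 and is 0x6e bytes long, so it ends at 0x106e, and padding_after(0x1000, 0x6e, 0x1000) gives 0xf92 bytes of filler before .rodata at 0x2000. That is almost 4 KB of zeros for 110 bytes of code.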

Time to write our own linker script link.lds.

ENTRY(nomain)

SECTIONS
{
  . = 0x8048000 + SIZEOF_HEADERS;

  tiny : { *(.text) *(.data) *(.rodata*) }

  /DISCARD/ : { *(*) }
}

Build with gcc -T link.lds and we get 584 bytes, a huge step 🔥.

Step6: Assembly

Can we do better? There is nothing more we can do inside the C world, so it's time to move to a lower level.

Let's write some assembly code! It sounds terrifying, but just give it a try, and you will find it's actually very interesting.

We are the gods of the computer, we can control everything!

section .data
message: db "hello, world", 0xa

section .text

global nomain
nomain:
  mov rax, 1
  mov rdi, 1
  mov rsi, message
  mov rdx, 13
  syscall
  mov rax, 60
  xor rdi, rdi
  syscall

Use nasm -f elf64 to assemble our code and we get 440 bytes.

Step7: Handmade Binary

Is there anything we can do now? We are at the lowest level; there is no "lower level" for us to go to.

There is nothing left to shrink in our code, but the binary that runs on the OS is not just the code. It is a file in a format called ELF, which contains some extra info.

So maybe we can do something to shrink that extra info?

Or maybe we can write the ELF from scratch? This way, we can control every bit of our binary.

BITS 64
  org 0x400000

ehdr:           ; Elf64_Ehdr
  db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
  times 8 db 0
  dw  2         ; e_type
  dw  0x3e      ; e_machine
  dd  1         ; e_version
  dq  _start    ; e_entry
  dq  phdr - $$ ; e_phoff
  dq  0         ; e_shoff
  dd  0         ; e_flags
  dw  ehdrsize  ; e_ehsize
  dw  phdrsize  ; e_phentsize
  dw  1         ; e_phnum
  dw  0         ; e_shentsize
  dw  0         ; e_shnum
  dw  0         ; e_shstrndx
ehdrsize  equ  $ - ehdr

phdr:           ; Elf64_Phdr
  dd  1         ; p_type
  dd  5         ; p_flags
  dq  0         ; p_offset
  dq  $$        ; p_vaddr
  dq  $$        ; p_paddr
  dq  filesize  ; p_filesz
  dq  filesize  ; p_memsz
  dq  0x1000    ; p_align
phdrsize  equ  $ - phdr

_start:
  mov rax, 1
  mov rdi, 1
  mov rsi, message
  mov rdx, 13
  syscall
  mov rax, 60
  xor rdi, rdi
  syscall

message: db "hello, world", 0xa

filesize  equ  $ - $$

Use nasm -f bin to bake our binary, and our final result is 170 bytes.
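To see where the 170 bytes come from, the same layout can be sketched in C with the structures from the Linux <elf.h> header. This is only an illustration under stated assumptions: a little-endian host with glibc-style <elf.h>, and a hypothetical function name build_tiny_elf that is not part of the repository. The machine code bytes are the ones nasm produces for the listing above.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Machine code for: write(1, message, 13); exit(0). */
static const unsigned char code[] = {
  0xb8, 0x01, 0x00, 0x00, 0x00,                   /* mov    $1, %eax        */
  0xbf, 0x01, 0x00, 0x00, 0x00,                   /* mov    $1, %edi        */
  0x48, 0xbe, 0x9d, 0x00, 0x40, 0x00, 0, 0, 0, 0, /* movabs $0x40009d, %rsi */
  0xba, 0x0d, 0x00, 0x00, 0x00,                   /* mov    $13, %edx       */
  0x0f, 0x05,                                     /* syscall                */
  0xb8, 0x3c, 0x00, 0x00, 0x00,                   /* mov    $60, %eax       */
  0x48, 0x31, 0xff,                               /* xor    %rdi, %rdi      */
  0x0f, 0x05                                      /* syscall                */
};
static const char message[] = "hello, world\n";

enum { BASE = 0x400000 }; /* same load address as "org 0x400000" */

/* Hypothetical helper: write the whole file image into out.
 * Returns the image size, or 0 if cap is too small. */
size_t build_tiny_elf(unsigned char *out, size_t cap)
{
  const size_t msg_len = sizeof message - 1; /* drop the trailing NUL */
  const size_t total =
      sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr) + sizeof code + msg_len;
  if (cap < total)
    return 0;

  Elf64_Ehdr eh;
  memset(&eh, 0, sizeof eh);
  memcpy(eh.e_ident, ELFMAG, SELFMAG);
  eh.e_ident[EI_CLASS] = ELFCLASS64;
  eh.e_ident[EI_DATA] = ELFDATA2LSB;
  eh.e_ident[EI_VERSION] = EV_CURRENT;
  eh.e_type = ET_EXEC;
  eh.e_machine = EM_X86_64;
  eh.e_version = EV_CURRENT;
  eh.e_entry = BASE + sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr);
  eh.e_phoff = sizeof(Elf64_Ehdr);
  eh.e_ehsize = sizeof(Elf64_Ehdr);
  eh.e_phentsize = sizeof(Elf64_Phdr);
  eh.e_phnum = 1;

  Elf64_Phdr ph;
  memset(&ph, 0, sizeof ph);
  ph.p_type = PT_LOAD;
  ph.p_flags = PF_R | PF_X;
  ph.p_vaddr = BASE;
  ph.p_paddr = BASE;
  ph.p_filesz = total;
  ph.p_memsz = total;
  ph.p_align = 0x1000;

  unsigned char *p = out;
  memcpy(p, &eh, sizeof eh);     p += sizeof eh;
  memcpy(p, &ph, sizeof ph);     p += sizeof ph;
  memcpy(p, code, sizeof code);  p += sizeof code;
  memcpy(p, message, msg_len);
  return total;
}
```

The total is 64 bytes of ELF header, 56 bytes of program header, 37 bytes of code and 13 bytes of message, which adds up to 170. The address 0x40009d hard-coded in the movabs instruction is simply BASE plus the 157 bytes that precede the message.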

Final Binary Anatomy

And now we have reached our final limit: 170 bytes. There is no way to reduce that any more.

PS: Actually, there is. Check the post "A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux". I am not going to use the techniques from that post, because they are so hacky.

Now let's see what exactly every byte does in our 170-byte final binary.

# ELF Header
00:   7f 45 4c 46 02 01 01 00 # e_ident
08:   00 00 00 00 00 00 00 00 # reserved
10:   02 00 # e_type
12:   3e 00 # e_machine
14:   01 00 00 00 # e_version
18:   78 00 40 00 00 00 00 00 # e_entry
20:   40 00 00 00 00 00 00 00 # e_phoff
28:   00 00 00 00 00 00 00 00 # e_shoff
30:   00 00 00 00 # e_flags
34:   40 00 # e_ehsize
36:   38 00 # e_phentsize
38:   01 00 # e_phnum
3a:   00 00 # e_shentsize
3c:   00 00 # e_shnum
3e:   00 00 # e_shstrndx

# Program Header
40:   01 00 00 00 # p_type
44:   05 00 00 00 # p_flags
48:   00 00 00 00 00 00 00 00 # p_offset
50:   00 00 40 00 00 00 00 00 # p_vaddr
58:   00 00 40 00 00 00 00 00 # p_paddr
60:   aa 00 00 00 00 00 00 00 # p_filesz
68:   aa 00 00 00 00 00 00 00 # p_memsz
70:   00 10 00 00 00 00 00 00 # p_align

# Code
78:   b8 01 00 00 00          # mov    $0x1,%eax
7d:   bf 01 00 00 00          # mov    $0x1,%edi
82:   48 be 9d 00 40 00 00 00 00 00    # movabs $0x40009d,%rsi
8c:   ba 0d 00 00 00          # mov    $0xd,%edx
91:   0f 05                   # syscall
93:   b8 3c 00 00 00          # mov    $0x3c,%eax
98:   48 31 ff                # xor    %rdi,%rdi
9b:   0f 05                   # syscall
9d:   68 65 6c 6c 6f 2c 20 77 6f 72 6c 64 0a # "hello, world\n"
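One thing to keep in mind when reading the dump: multi-byte fields are stored little-endian, so the lowest byte comes first. A small sketch of decoding a field by hand follows; read_le is a hypothetical helper name, and the byte arrays are copied from the listing above.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an n-byte little-endian field starting at p (n must be at most 8). */
uint64_t read_le(const unsigned char *p, size_t n)
{
  uint64_t v = 0;
  for (size_t i = n; i-- > 0;)
    v = (v << 8) | p[i]; /* the highest-index byte is the most significant */
  return v;
}

/* Fields copied from the dump. */
const unsigned char e_machine_bytes[2] = {0x3e, 0x00};
const unsigned char e_entry_bytes[8] = {0x78, 0x00, 0x40, 0x00, 0, 0, 0, 0};
const unsigned char p_filesz_bytes[8] = {0xaa, 0, 0, 0, 0, 0, 0, 0};
```

So read_le(e_entry_bytes, 8) yields 0x400078, the address of the first instruction, and read_le(p_filesz_bytes, 8) yields 0xaa, which is 170 in decimal.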

tiny-x64-helloworld's Issues

step5 make error

tiny-x64-helloworld/step5$ make
gcc hello.c -s -O3 -e nomain -nostartfiles -nostdlib -T link.lds -o hello.out
`str' referenced in section `.text' of /tmp/cciCzuEw.o: defined in discarded section `.data.rel.local' of /tmp/cciCzuEw.o
`str' referenced in section `.text' of /tmp/cciCzuEw.o: defined in discarded section `.data.rel.local' of /tmp/cciCzuEw.o
collect2: error: ld returned 1 exit status
make: *** [Makefile:2:build] Error 1
