Reverse Engineering #5 – C++ Calls on Mac OS X ARM 64 | Default Calling Convention

In the dynamic and frequently contested world of cybersecurity, proficiency in reverse engineering is a vital ability, essential for comprehending and neutralizing security risks. This article contributes to that understanding by showing how to recognize typical C/C++ function calls by examining the assembly code of an application built for Mac OS X on a 64-bit ARM CPU.

The reference platform is 64-bit ARMv8-A, the processor is an Apple Silicon M2, and the C++ compiler is Apple clang++ version 15.0.0 (clang-1500.1.0.2.5) for Mac OS X on 64-bit ARM. The disassembler is Hopper Disassembler (version 5.13.5) for Mac OS X ARM. The code is compiled with the compiler's default options, so no optimization is enabled.

This post focuses only on function calls on Mac OS X on the 64-bit ARM platform, so the analysis of the disassembled code covers only how parameters are passed and how the result value is retrieved, and nothing else.

C++ Code and Related Assembly Code

The following C++ code shows 10 examples of function calls that take long parameters and return a long:

#include <iostream>

long f1(long a)
{
	std::cout << "f1(" << a << ")" << std::endl;
	return a;
}

long f2(long a, long b)
{
	std::cout << "f2(" << a << ", " << b << ")" << std::endl;
	return a + b;
}

long f3(long a, long b, long c)
{
	std::cout << "f3(" 
		<< a << ", " 
		<< b << ", " 
		<< c << ")" 
		<< std::endl;
	return a + b + c;
}

long f4(long a, long b, long c, long d)
{
	std::cout << "f4(" 
		<< a << ", " 
		<< b << ", " 
		<< c << ", " 
		<< d << ")" 
		<< std::endl;
	return a + b + c + d;
}

long f5(long a, long b, long c, long d, long e)
{
	std::cout << "f5(" 
		<< a << ", " 
		<< b << ", " 
		<< c << ", " 
		<< d << ", " 
		<< e << ")" 
		<< std::endl;
	return a + b + c + d + e;
}

long f6(long a, long b, long c, long d, long e, long f)
{
	std::cout << "f6(" 
		<< a << ", " 
		<< b << ", " 
		<< c << ", " 
		<< d << ", " 
		<< e << ", " 
		<< f << ")" 
		<< std::endl;
	return a + b + c + d + e + f;
}

long f7(long a, long b, long c, long d, long e, long f, long g)
{
	std::cout << "f7(" 
		<< a << ", " 
		<< b << ", " 
		<< c << ", " 
		<< d << ", " 
		<< e << ", " 
		<< f << ", " 
		<< g << ")" 
		<< std::endl;
	return a + b + c + d + e + f + g;
}

long f8(long a, long b, long c, long d, long e, long f, long g, long h)
{
	std::cout << "f8(" 
		<< a << ", " 
		<< b << ", " 
		<< c << ", " 
		<< d << ", " 
		<< e << ", " 
		<< f << ", " 
		<< g << ", " 
		<< h << ")" 
		<< std::endl;
	return a + b + c + d + e + f + g + h;
}

long f9(long a, long b, long c, long d, long e, long f, long g, long h, 
        long i)
{
        std::cout << "f9("
                << a << ", "
                << b << ", "
                << c << ", "
                << d << ", "
                << e << ", "
                << f << ", "
                << g << ", "
                << h << ", "
                << i << ")"
                << std::endl;
        return a + b + c + d + e + f + g + h + i;
}

long f10(long a, long b, long c, long d, long e, long f, long g, long h, 
        long i, long j)
{
        std::cout << "f10("
                << a << ", "
                << b << ", "
                << c << ", "
                << d << ", "
                << e << ", "
                << f << ", "
                << g << ", "
                << h << ", "
                << i << ", "
                << j << ")"
                << std::endl;
        return a + b + c + d + e + f + g + h + i + j;
}

int main()
{
	long z;
	z = f1(0x1000000000000001);
	z = f2(0x1000000000000001, 0x1000000000000002);
	z = f3(0x1000000000000001, 0x1000000000000002, 0x1000000000000003);
	z = f4(0x1000000000000001, 0x1000000000000002, 0x1000000000000003, 0x1000000000000004);
	z = f5(0x1000000000000001, 0x1000000000000002, 0x1000000000000003, 0x1000000000000004,
	       0x1000000000000005);
	z = f6(0x1000000000000001, 0x1000000000000002, 0x1000000000000003, 0x1000000000000004,
	       0x1000000000000005, 0x1000000000000006);
	z = f7(0x1000000000000001, 0x1000000000000002, 0x1000000000000003, 0x1000000000000004,
	       0x1000000000000005, 0x1000000000000006, 0x1000000000000007);
	z = f8(0x1000000000000001, 0x1000000000000002, 0x1000000000000003, 0x1000000000000004,
	       0x1000000000000005, 0x1000000000000006, 0x1000000000000007, 0x8);
	z = f9(0x1000000000000001, 0x1000000000000002, 0x1000000000000003, 0x1000000000000004,
	       0x1000000000000005, 0x1000000000000006, 0x1000000000000007, 0x8, 0x9);
	z = f10(0x1000000000000001, 0x1000000000000002, 0x1000000000000003, 0x1000000000000004,
	        0x1000000000000005, 0x1000000000000006, 0x1000000000000007, 0x8, 0x9, 0xa);
	return 0;
}

The first function f1 takes a long-type parameter as input, the second function f2 takes two long-type parameters, and so on until the last function f10, which takes 10 long-type parameters as input. To see the full code visit my space on GitHub at this address:
https://github.com/ettoremessina/reverse-engineering/tree/main/macosx-silicon/calls/clang
The corresponding assembly code of the main function (that contains the calls to f1…f10 functions) generated by the compiler is as follows:

; ================ B E G I N N I N G O F P R O C E D U R E ================
; Variables:
; saved_fp: 0
; var_4: int32_t, -4
; var_10: int64_t, -16
; var_14: int32_t, -20
; var_20: int64_t, -32
; var_28: int64_t, -40
; var_30: int64_t, -48
; var_38: int64_t, -56
; var_40: int64_t, -64
; var_48: int64_t, -72
; var_50: int64_t, -80
; var_58: int64_t, -88
; var_60: int64_t, -96

_main:

0000000100002fac sub sp, sp, #0x80
0000000100002fb0 stp fp, lr, [sp, #0x70]
0000000100002fb4 add fp, sp, #0x70
0000000100002fb8 mov x0, #0x1
0000000100002fbc movk x0, #0x1000, lsl #48 ; argument #1 for method __Z2f1l
0000000100002fc0 stur x0, [fp, var_20]
0000000100002fc4 mov x8, #0x2
0000000100002fc8 movk x8, #0x1000, lsl #48
0000000100002fcc str x8, [sp, #0x70 + var_58]
0000000100002fd0 mov x8, #0x3
0000000100002fd4 movk x8, #0x1000, lsl #48
0000000100002fd8 str x8, [sp, #0x70 + var_50]
0000000100002fdc mov x8, #0x4
0000000100002fe0 movk x8, #0x1000, lsl #48
0000000100002fe4 str x8, [sp, #0x70 + var_48]
0000000100002fe8 mov x8, #0x5
0000000100002fec movk x8, #0x1000, lsl #48
0000000100002ff0 str x8, [sp, #0x70 + var_40]
0000000100002ff4 mov x8, #0x6
0000000100002ff8 movk x8, #0x1000, lsl #48
0000000100002ffc str x8, [sp, #0x70 + var_38]
0000000100003000 mov x8, #0x7
0000000100003004 movk x8, #0x1000, lsl #48
0000000100003008 stur x8, [fp, var_30]
000000010000300c mov w8, #0x0
0000000100003010 stur w8, [fp, var_14]
0000000100003014 stur wzr, [fp, var_4]
0000000100003018 bl __Z2f1l ; f1(long)
000000010000301c ldr x1, [sp, #0x70 + var_58] ; argument #2 for method __Z2f2ll
0000000100003020 mov x8, x0
0000000100003024 ldur x0, [fp, var_20] ; argument #1 for method __Z2f2ll
0000000100003028 stur x8, [fp, var_10]
000000010000302c bl __Z2f2ll ; f2(long, long)
0000000100003030 ldr x1, [sp, #0x70 + var_58] ; argument #2 for method __Z2f3lll
0000000100003034 ldr x2, [sp, #0x70 + var_50] ; argument #3 for method __Z2f3lll
0000000100003038 mov x8, x0
000000010000303c ldur x0, [fp, var_20] ; argument #1 for method __Z2f3lll
0000000100003040 stur x8, [fp, var_10]
0000000100003044 bl __Z2f3lll ; f3(long, long, long)
0000000100003048 ldr x1, [sp, #0x70 + var_58] ; argument #2 for method __Z2f4llll
000000010000304c ldr x2, [sp, #0x70 + var_50] ; argument #3 for method __Z2f4llll
0000000100003050 ldr x3, [sp, #0x70 + var_48] ; argument #4 for method __Z2f4llll
0000000100003054 mov x8, x0
0000000100003058 ldur x0, [fp, var_20] ; argument #1 for method __Z2f4llll
000000010000305c stur x8, [fp, var_10]
0000000100003060 bl __Z2f4llll ; f4(long, long, long, long)
0000000100003064 ldr x1, [sp, #0x70 + var_58] ; argument #2 for method __Z2f5lllll
0000000100003068 ldr x2, [sp, #0x70 + var_50] ; argument #3 for method __Z2f5lllll
000000010000306c ldr x3, [sp, #0x70 + var_48] ; argument #4 for method __Z2f5lllll
0000000100003070 ldr x4, [sp, #0x70 + var_40] ; argument #5 for method __Z2f5lllll
0000000100003074 mov x8, x0
0000000100003078 ldur x0, [fp, var_20] ; argument #1 for method __Z2f5lllll
000000010000307c stur x8, [fp, var_10]
0000000100003080 bl __Z2f5lllll ; f5(long, long, long, long, long)
0000000100003084 ldr x1, [sp, #0x70 + var_58] ; argument #2 for method __Z2f6llllll
0000000100003088 ldr x2, [sp, #0x70 + var_50] ; argument #3 for method __Z2f6llllll
000000010000308c ldr x3, [sp, #0x70 + var_48] ; argument #4 for method __Z2f6llllll
0000000100003090 ldr x4, [sp, #0x70 + var_40] ; argument #5 for method __Z2f6llllll
0000000100003094 ldr x5, [sp, #0x70 + var_38] ; argument #6 for method __Z2f6llllll
0000000100003098 mov x8, x0
000000010000309c ldur x0, [fp, var_20] ; argument #1 for method __Z2f6llllll
00000001000030a0 stur x8, [fp, var_10]
00000001000030a4 bl __Z2f6llllll ; f6(long, long, long, long, long, long)
00000001000030a8 ldr x1, [sp, #0x70 + var_58] ; argument #2 for method __Z2f7lllllll
00000001000030ac ldr x2, [sp, #0x70 + var_50] ; argument #3 for method __Z2f7lllllll
00000001000030b0 ldr x3, [sp, #0x70 + var_48] ; argument #4 for method __Z2f7lllllll
00000001000030b4 ldr x4, [sp, #0x70 + var_40] ; argument #5 for method __Z2f7lllllll
00000001000030b8 ldr x5, [sp, #0x70 + var_38] ; argument #6 for method __Z2f7lllllll
00000001000030bc ldur x6, [fp, var_30] ; argument #7 for method __Z2f7lllllll
00000001000030c0 mov x8, x0
00000001000030c4 ldur x0, [fp, var_20] ; argument #1 for method __Z2f7lllllll
00000001000030c8 stur x8, [fp, var_10]
00000001000030cc bl __Z2f7lllllll ; f7(long, long, long, long, long, long, long)
00000001000030d0 ldr x1, [sp, #0x70 + var_58] ; argument #2 for method __Z2f8llllllll
00000001000030d4 ldr x2, [sp, #0x70 + var_50] ; argument #3 for method __Z2f8llllllll
00000001000030d8 ldr x3, [sp, #0x70 + var_48] ; argument #4 for method __Z2f8llllllll
00000001000030dc ldr x4, [sp, #0x70 + var_40] ; argument #5 for method __Z2f8llllllll
00000001000030e0 ldr x5, [sp, #0x70 + var_38] ; argument #6 for method __Z2f8llllllll
00000001000030e4 ldur x6, [fp, var_30] ; argument #7 for method __Z2f8llllllll
00000001000030e8 mov x8, x0
00000001000030ec ldur x0, [fp, var_20] ; argument #1 for method __Z2f8llllllll
00000001000030f0 stur x8, [fp, var_10]
00000001000030f4 mov x7, #0x8 ; argument #8 for method __Z2f8llllllll
00000001000030f8 stur x7, [fp, var_28]
00000001000030fc bl __Z2f8llllllll ; f8(long, long, long, long, long, long, long, long)
0000000100003100 ldr x1, [sp, #0x70 + var_58] ; argument #2 for method __Z2f9lllllllll
0000000100003104 ldr x2, [sp, #0x70 + var_50] ; argument #3 for method __Z2f9lllllllll
0000000100003108 ldr x3, [sp, #0x70 + var_48] ; argument #4 for method __Z2f9lllllllll
000000010000310c ldr x4, [sp, #0x70 + var_40] ; argument #5 for method __Z2f9lllllllll
0000000100003110 ldr x5, [sp, #0x70 + var_38] ; argument #6 for method __Z2f9lllllllll
0000000100003114 ldur x6, [fp, var_30] ; argument #7 for method __Z2f9lllllllll
0000000100003118 ldur x7, [fp, var_28] ; argument #8 for method __Z2f9lllllllll
000000010000311c mov x8, x0
0000000100003120 ldur x0, [fp, var_20] ; argument #1 for method __Z2f9lllllllll
0000000100003124 stur x8, [fp, var_10]
0000000100003128 mov x9, sp
000000010000312c mov x8, #0x9
0000000100003130 str x8, [sp, #0x70 + var_60]
0000000100003134 str x8, [x9]
0000000100003138 bl __Z2f9lllllllll ; f9(long, long, long, long, long, long, long, long, long)
000000010000313c ldr x8, [sp, #0x70 + var_60]
0000000100003140 ldr x1, [sp, #0x70 + var_58] ; argument #2 for method __Z3f10llllllllll
0000000100003144 ldr x2, [sp, #0x70 + var_50] ; argument #3 for method __Z3f10llllllllll
0000000100003148 ldr x3, [sp, #0x70 + var_48] ; argument #4 for method __Z3f10llllllllll
000000010000314c ldr x4, [sp, #0x70 + var_40] ; argument #5 for method __Z3f10llllllllll
0000000100003150 ldr x5, [sp, #0x70 + var_38] ; argument #6 for method __Z3f10llllllllll
0000000100003154 ldur x6, [fp, var_30] ; argument #7 for method __Z3f10llllllllll
0000000100003158 ldur x7, [fp, var_28] ; argument #8 for method __Z3f10llllllllll
000000010000315c mov x9, x0
0000000100003160 ldur x0, [fp, var_20] ; argument #1 for method __Z3f10llllllllll
0000000100003164 stur x9, [fp, var_10]
0000000100003168 mov x9, sp
000000010000316c str x8, [x9]
0000000100003170 mov x8, #0xa
0000000100003174 str x8, [x9, #0x8]
0000000100003178 bl __Z3f10llllllllll ; f10(long, long, long, long, long, long, long, long, long, long)
000000010000317c mov x8, x0
0000000100003180 ldur w0, [fp, var_14]
0000000100003184 stur x8, [fp, var_10]
0000000100003188 ldp fp, lr, [sp, #0x70]
000000010000318c add sp, sp, #0x80
0000000100003190 ret
; endp

The instructions of interest for this post are the ones that prepare the arguments, call the functions with bl, and save the return values.

We start by observing how the value of the parameter of f1 is loaded into a register (x0 in this case). The value, 0x1000000000000001, is very large, and every instruction is 32 bits long, so an immediate operand that requires 64 bits cannot be encoded in a single instruction; the compiler therefore builds the value in the register with more than one instruction. Precisely, the compiler uses the following trick: first it loads the lower part with mov x0, #0x1, then it loads the upper part with movk x0, #0x1000, lsl #48 (k stands for “keep”, meaning that the register bits not involved are not altered). The second instruction takes the immediate #0x1000, shifts it left by 48 bits, and writes it into the highest 16 bits of the register, leaving the other bits unchanged. In this way the value 0x1000000000000001 is loaded into the register x0.
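The effect of the two instructions can be checked with a small C++ model. This is only a sketch: the helper names mov_imm, movk and build_f1_argument are invented for this example, and the code imitates the bit manipulation rather than the real encoding of the instructions.

```cpp
#include <cassert>
#include <cstdint>

// Models "mov xd, #imm16": the whole register becomes the zero-extended immediate.
inline std::uint64_t mov_imm(std::uint16_t imm16) {
    return imm16;
}

// Models "movk xd, #imm16, lsl #shift": only the 16-bit field selected by
// `shift` is replaced, every other bit keeps its previous value.
inline std::uint64_t movk(std::uint64_t reg, std::uint16_t imm16, unsigned shift) {
    assert(shift < 64 && shift % 16 == 0);  // valid shifts are 0, 16, 32 and 48
    const std::uint64_t mask = std::uint64_t{0xFFFF} << shift;
    return (reg & ~mask) | (std::uint64_t{imm16} << shift);
}

// Replays the pair of instructions that builds the first argument of f1.
inline std::uint64_t build_f1_argument() {
    std::uint64_t x0 = mov_imm(0x1);  // mov  x0, #0x1
    x0 = movk(x0, 0x1000, 48);        // movk x0, #0x1000, lsl #48
    return x0;
}
```

Calling build_f1_argument() yields 0x1000000000000001, the constant passed to f1 in the C++ source.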

Next, the code saves the value of x0 to the memory location addressed by [fp, var_20] with stur x0, [fp, var_20]. Note that var_20 is an offset label with value -32 and fp is the frame pointer, so [fp, var_20] is the memory address obtained by subtracting 32 bytes from the value of the frame pointer.
The compiler generated code that prepares the values used in the subsequent calls in advance of the calls themselves. The trick described above is applied several more times, using the x8 register (instead of x0) and writing to different memory locations, but the technique for building the value in a register is the same.
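The listing addresses the same stack frame in two ways, [fp, var_XX] and [sp, #0x70 + var_XX]. Since the prologue sets fp to sp + 0x70 inside a frame of 0x80 bytes, both forms can be reduced to a plain offset from sp. The sketch below does that arithmetic; the names kFrameSize, kFpMinusSp and offset_from_sp are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Values read from the prologue of _main in the listing:
//   sub sp, sp, #0x80   -> the frame is 0x80 bytes
//   add fp, sp, #0x70   -> fp = sp + 0x70
constexpr std::int64_t kFrameSize = 0x80;
constexpr std::int64_t kFpMinusSp = 0x70;

// Hopper's var_XX labels are signed offsets from fp (var_20 is -32,
// var_58 is -88, ...). This converts such a label into an offset from sp.
inline std::int64_t offset_from_sp(std::int64_t var_offset) {
    const std::int64_t offset = kFpMinusSp + var_offset;
    assert(offset >= 0 && offset < kFrameSize);  // must fall inside the frame
    return offset;
}
```

With these values, var_20 (-32) corresponds to sp + 0x50 and var_58 (-88) to sp + 0x18, so the locals holding the prepared arguments sit between the outgoing argument area at the bottom of the frame and the saved fp and lr at sp + 0x70.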

Finally, the function f1 is called with the instruction bl __Z2f1l; the parameter is passed through the register x0.
The next instruction loads the second parameter of the f2 function into x1 from the memory location addressed by [sp, #0x70 + var_58], then the value of x0 (which contains the return value of f1) is copied into x8, then the first parameter of f2 is loaded from the memory location [fp, var_20] into x0, and x8 (which now contains the return value of f1) is saved to the memory location addressed by [fp, var_10]. Finally, the function f2 is called with the instruction bl __Z2f2ll.
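The four instructions between bl __Z2f1l and bl __Z2f2ll can be replayed with a toy model to see why the result of f1 has to be moved out of x0 before x0 is reloaded. The struct and function names below are invented for this sketch, and treating var_10 as the storage of the local variable z is an assumption based on how the listing uses it.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the state touched between the two calls.
// Field names follow the listing; the struct is only an illustration.
struct CallSiteState {
    std::uint64_t x0 = 0, x1 = 0, x8 = 0;
    std::uint64_t var_20 = 0x1000000000000001ULL;  // first argument, stored earlier
    std::uint64_t var_58 = 0x1000000000000002ULL;  // second argument, stored earlier
    std::uint64_t var_10 = 0;                      // assumed to be the local z
};

// Replays the instructions that sit between "bl __Z2f1l" and "bl __Z2f2ll".
inline CallSiteState prepare_f2_call(CallSiteState s, std::uint64_t f1_result) {
    s.x0 = f1_result;  // on return from f1, x0 holds the result
    s.x1 = s.var_58;   // ldr  x1, [sp, #0x70 + var_58]
    s.x8 = s.x0;       // mov  x8, x0  (rescue the result before x0 is reused)
    s.x0 = s.var_20;   // ldur x0, [fp, var_20]
    s.var_10 = s.x8;   // stur x8, [fp, var_10]
    return s;
}
```

After prepare_f2_call, x0 and x1 hold the two arguments of f2 and var_10 holds the value returned by f1, which is the state the listing reaches just before bl __Z2f2ll.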

Apart from the interleaving of the instructions that recover the return value of one function with those that prepare the parameters of the next call, it is already clear that the first parameter is passed in the register x0, the second in the register x1, and the return value is passed back in the register x0.
This scheme generalizes up to the call of the function f8, which has 8 parameters: for the call of f8 the register x0 is used to pass the first parameter, x1 the second, x2 the third, and so on up to x7, which is used to pass the eighth parameter; x0 always holds the return value.

From the ninth parameter onwards the stack is used. To call f9, the first 8 parameters follow the same scheme as before, while the ninth is placed on the stack: the value of the stack pointer is copied into x9 (with mov x9, sp), the value of the parameter (0x9) is loaded into the x8 register (with mov x8, #0x9), that value is saved to the local slot addressed by [sp, #0x70 + var_60] (with str x8, [sp, #0x70 + var_60]), from which it is reloaded for the call to f10, and finally it is stored at the top of the stack (with str x8, [x9]), where f9 finds it.
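The rule observed so far can be summarized in a few lines of C++. This is a sketch that only covers the case analyzed in this post, where every parameter is a long; the function name long_argument_location is invented for the example, and the stack offsets are the ones visible in the listing (str x8, [x9] and str x8, [x9, #0x8] with x9 equal to sp).

```cpp
#include <cassert>
#include <string>

// Returns where the caller places the n-th (1-based) argument
// when all parameters are long, as in f1...f10.
inline std::string long_argument_location(unsigned n) {
    assert(n >= 1);
    constexpr unsigned kRegisterArgs = 8;  // x0 ... x7
    if (n <= kRegisterArgs) {
        return "x" + std::to_string(n - 1);
    }
    // Arguments 9, 10, ... are stored from the top of the stack upwards,
    // 8 bytes per long: [sp, #0], [sp, #8], ...
    const unsigned offset = (n - kRegisterArgs - 1) * 8;
    return "[sp, #" + std::to_string(offset) + "]";
}
```

For example, long_argument_location(8) gives x7, while long_argument_location(9) and long_argument_location(10) give the two stack slots used in the calls to f9 and f10.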
