‘tis the Season
July 8, 2024
Jonathan Levin, Co-Founder and CTO
Another year, another set of *OS updates. Apple has released the initial beta versions of iOS/iPadOS/tvOS 18, macOS 10.20 (="15"), watchOS 11, and visionOS 2. Security researchers and reverse engineers, including our team at Dataflow Forensics, look through these betas for any indication of undocumented features, patches and more.
The sources for XNU (now in version 11215) will eventually be released, along with some of the other Darwin components, on Apple's GitHub pages (formerly, https://opensource.apple.com). This might take a while, however, and even then some components (including architecture specific portions of XNU) remain in closed source. This makes reversing especially important during this "dark period", but also later during other phases of the *OS SDLC.
Back when I was still actively into Darwin book writing and research, I used to maintain what has become the "unofficial Darwin ChangeLog". Unlike the official one, mine detailed the low-level, often intentionally undocumented updates to Darwin, with an emphasis on kernel-level changes. Today's post is an attempt at nostalgia, with a twist for Darwin 24.
As with the other posts by my DFFender colleagues, the aim is not to just show findings or results, but to demonstrate the tools and techniques to do so. The tool used here is disarm(j), the unofficial "jtool3", which once again refactors and reimplements the functions of its predecessors, and further extends them with new capabilities and - for the first time - support for other binary formats, like ELF and PE.
Analyzing the Kernelcaches
Apple has long used kernelcaches in *OS variants, and with the move to Apple Silicon they are now used in macOS as well. Using kernelcaches has many performance and security advantages. A kernelcache fuses the kexts and the kernel proper together, which also makes analysis a bit harder.
Starting with Darwin 20, Apple created the MH_FILESET Mach-O format (Type #12) for kernelcaches. A fileset is a "set of related Mach-Os", which are loaded together into the same address space. The Mach-Os which are members of the set are indicated by LC_FILESET_ENTRY load commands, which detail the name and the offset (but curiously, not the size) of each Mach-O.
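To make the fileset layout concrete, here is a minimal Python sketch that walks the load commands of a 64-bit MH_FILESET image and lists its LC_FILESET_ENTRY records. It is not how disarm(j) does it. The constants and field order follow my reading of Apple's public <mach-o/loader.h>, and the function name is hypothetical. It assumes a little-endian, already decompressed kernelcache.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
MH_FILESET = 0xC                 # Mach-O file type 12
LC_FILESET_ENTRY = 0x80000035    # 0x35 | LC_REQ_DYLD
MACH_HEADER_64_SIZE = 32


def fileset_entries(data: bytes):
    """Yield (entry_id, vmaddr, fileoff) for every LC_FILESET_ENTRY.

    Hypothetical helper: assumes 'data' is a little-endian, 64-bit,
    decompressed MH_FILESET image.
    """
    magic, _cpu, _sub, filetype, ncmds, _size, _flags, _rsvd = struct.unpack_from("<8I", data, 0)
    if magic != MH_MAGIC_64 or filetype != MH_FILESET:
        raise ValueError("not a 64-bit MH_FILESET Mach-O")

    off = MACH_HEADER_64_SIZE
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, off)
        if cmdsize < 8:
            raise ValueError("corrupt load command")
        if cmd == LC_FILESET_ENTRY:
            # vmaddr, fileoff, offset of the name string, reserved.
            # Note there is no size field, as observed in the text.
            vmaddr, fileoff, name_off, _reserved = struct.unpack_from("<QQII", data, off + 8)
            raw_name = data[off + name_off: off + cmdsize]
            yield raw_name.split(b"\0", 1)[0].decode(), vmaddr, fileoff
        off += cmdsize
```

Calling `list(fileset_entries(open(path, "rb").read()))` on a decompressed kernelcache would produce one tuple per member Mach-O. The entry sizes have to be inferred, for example from the next entry's offset.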
Using disarm -L will display all the load commands in a Mach-O. We can use that to compare the iOS 17.5.1 and the 18.0m kernelcaches, like so:
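Once the entry names of both kernelcaches are collected, however they were obtained, the comparison itself is a plain set difference. A minimal Python sketch follows. The function name is hypothetical and the input lists are illustrative, not real disarm output.

```python
def new_entries(old_ids, new_ids):
    """Return the identifiers present in new_ids but absent from old_ids."""
    return sorted(set(new_ids) - set(old_ids))


# Illustrative, heavily truncated inputs (not actual tool output).
ios17 = ["com.apple.kernel", "com.apple.security.sandbox"]
ios18 = ["com.apple.kernel", "com.apple.security.sandbox",
         "com.apple.kec.AppleEncryptedArchive"]

for bundle_id in new_entries(ios17, ios18):
    print("new:", bundle_id)
```

Swapping the arguments gives the removed entries, which are just as interesting when tracking a release.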
We see two new kexts - com.apple.kec.AppleEncryptedArchive and com.apple.iokit.IOGameControllerFamily. The former is used for the new DMG format introduced in iOS 18 - .aea:
Why Apple would go back to encrypting *OS disk images eludes this author - especially after the OTA update saga demonstrated how the GID encryption used by Apple up to iOS 9.x was futile.
AEA is a simple container format, which starts with a magic, version, size, and provides URLs for image key recovery and management, as well as JSON metadata.
A deeper analysis of filesystem changes (new daemons, etc) is left for a future post.
KEXTRACTION
The fileset structure makes it easy to extract the kexts with a single command line:
All kexts will be properly fixed up and ready for further analysis, though it should be noted that their stubs (referencing other kexts or XNU proper) are not (at this time) resolved. As an example of analysis, we consider a perennial favorite - the Sandbox.kext.
Sandbox.kext
The Sandbox is undoubtedly one of the two most important kexts in Darwin. Along with its partner in crime (AMFI) it is responsible for enforcing the system security restrictions.
As Darwin enthusiasts know (q.v. M-III/8), the kernel extension uses two important structures. The first is the mac_policy_conf. Unlike AMFI's policy (which is dynamically constructed in code), the Sandbox policy is preinitialized in __DATA_CONST:
And so we note two new MAC policy hooks - hook_proc_check_set_[task/thread]_exception_port.
The second is the array of Sandbox "operation names", which provides hard-coded strings for all the operation names used by the Sandbox Profile Language (SBPL). This can be easily found due to the reference to the first operation, "default", also in __DATA_CONST.__const.
Comparing the list with the one from iOS 17 reveals two new operations: mach-task-exception-port-set and sandbox-check. The former is understandable, as it finally fixes a long-standing vulnerability in exception handling (and, thus, process control). The latter is likely due to the user mode sandbox_check* APIs (from libsandbox.dylib) being used as a backhanded way to enumerate processes on the system.
XNU
XNU's LC_SOURCE_VERSION indicates 11215.0.31.522.1, hinting at many internal builds and rebuilds over the last year. The structure of the kernelcache and kernel proper remains the same as last year, with either PPL (pre-A14) or SPTM (thereafter) segments.
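For readers parsing the load command by hand: LC_SOURCE_VERSION stores the five-part version in a single 64-bit integer, packed as A.B.C.D.E with 24, 10, 10, 10 and 10 bits respectively (per the comment in Apple's public <mach-o/loader.h>). A small Python sketch of the packing follows. The function names are my own.

```python
def decode_source_version(packed: int) -> str:
    """Unpack an LC_SOURCE_VERSION value (a24.b10.c10.d10.e10) into text."""
    a = packed >> 40
    b, c, d, e = ((packed >> shift) & 0x3FF for shift in (30, 20, 10, 0))
    return f"{a}.{b}.{c}.{d}.{e}"


def encode_source_version(text: str) -> int:
    """Inverse of decode_source_version; expects exactly five components."""
    a, b, c, d, e = (int(part) for part in text.split("."))
    if a >= 1 << 24 or any(x >= 1 << 10 for x in (b, c, d, e)):
        raise ValueError("component out of range")
    return (a << 40) | (b << 30) | (c << 20) | (d << 10) | e


packed = encode_source_version("11215.0.31.522.1")
print(hex(packed), "->", decode_source_version(packed))
```

The 10-bit limit on the trailing components caps each at 1023, which the 522 seen here fits comfortably.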
A main interest for vulnerability researchers is to map out any change in the kernel attack surface. For the kernel proper, this means any new system calls, Mach traps or in-kernel MIG routines. For the various kernel extensions, this means any IOUserClient changes, particularly any new methods, or modifications in existing ones. In this context, it should be noted that Apple went to great lengths to finally filter system calls, Mach traps, (some) Mach/XPC messages and even IOUserClients, starting with Darwin 21 and 23. Thus, a change in attack surface does not necessarily imply reachability from the browser or the application context.
SYSTEM CALLS (SYSENT AND NSYSENT)
XNU's system call table (_sysent) and the number of system calls (nsysent) have long been removed from the public symbols, but they are trivial to find in __DATA_CONST. The observation is due to the very distinct structure of the table. There are several distinct markers to use here:
The system call entry of exit(2) calls for no return value, and one argument of a uint32_t - that is, four bytes. This gives a "magic" of 00 00 00 00 01 00 04 00 - in the sense that it appears exactly once in the segment.
The table entries are in the form function/munger/(arg values), 24 bytes each.
The table is full of repeating entries - particularly, enosys (for invalid system calls), or the mungers. These are at pre-determined offsets from one another.
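The first marker translates almost directly into code. The Python sketch below scans a raw segment for the 8-byte magic, insists that it appears exactly once, and steps back to the table start. The step of 40 bytes assumes exit(2) is entry #1 (24 bytes in) and that the magic occupies the last 8 bytes of the 24-byte entry (16 bytes further). The 8-byte alignment filter and the function name are my own assumptions, not disarm(j)'s implementation.

```python
SYSENT_ENTRY_SIZE = 24
EXIT_MAGIC = bytes.fromhex("00 00 00 00 01 00 04 00")
# exit(2) is entry #1; its (return type, narg, arg bytes) trailer is
# assumed to start 16 bytes into the entry: 1 * 24 + 16 = 40.
MAGIC_OFFSET_FROM_TABLE = 1 * SYSENT_ENTRY_SIZE + 16


def find_sysent(segment: bytes) -> int:
    """Return the offset of the suspected _sysent table inside 'segment'."""
    hits = []
    pos = segment.find(EXIT_MAGIC)
    while pos != -1:
        # Assumption: table entries are 8-byte aligned within the segment.
        if pos % 8 == 0 and pos >= MAGIC_OFFSET_FROM_TABLE:
            hits.append(pos)
        pos = segment.find(EXIT_MAGIC, pos + 1)
    if len(hits) != 1:
        raise LookupError(f"expected exactly one match, found {len(hits)}")
    return hits[0] - MAGIC_OFFSET_FROM_TABLE
```

Adding the segment's virtual address to the returned offset gives the address to label as _sysent. The repeating enosys pointers can then serve as a sanity check.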
We can refer to the above logic, particularly the first rule, as rule sc. As you will shortly see, disarm(j) can automatically find the _sysent, and warns on any syscall changes, including suspected new syscalls, as part of its automated analysis of XNU kernels. This will spit out a clear warning to stderr:
MACH TRAPS (MACH_TRAP_TABLE)
XNU is a unique kernel in that, in addition to the BSD-style system calls, it exports another "personality", of Mach traps. Mach traps are a vestige of the original Mach microkernel, and deal mostly with memory management and ports. Unlike system calls, the traps are all in a fixed length mach_trap_table, hard coded to 128 entries, though a fair number of them are unused, and set to kern_invalid (the equivalent of einval in the Mach trap world).
The mach_trap_table can be found similarly, or by first finding kern_invalid. This is easy thanks to its informative message:
If using a "magic" here, it is 04 05 00 00 00 00 00 00, which occurs in the expansion of the MACH_TRAP macro for _kernelrpc_mach_vm_allocate_trap. This is the macro which creates the 24-byte mach_trap_table entries. This appears for trap 10 and 11. A match on trap 10 is thus 240 bytes after the beginning of the mach_trap_table. Let's call this rule mt, as we will revisit it later.
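Rule mt can be sketched the same way. Since the text notes the magic shows up for both trap 10 and trap 11, the Python sketch below accepts only a match that is followed by a second copy exactly one entry (24 bytes) later, then steps back 10 × 24 = 240 bytes. It assumes the magic sits at the very start of an entry, which is what the 240-byte arithmetic implies. The function name and the alignment filter are my own.

```python
MACH_TRAP_ENTRY_SIZE = 24
TRAP_MAGIC = bytes.fromhex("04 05 00 00 00 00 00 00")
FIRST_MATCHING_TRAP = 10   # per the text, traps 10 and 11 share the magic


def find_mach_trap_table(segment: bytes):
    """Return the offset of the suspected mach_trap_table, or None."""
    back = FIRST_MATCHING_TRAP * MACH_TRAP_ENTRY_SIZE   # 240 bytes
    pos = segment.find(TRAP_MAGIC)
    while pos != -1:
        nxt = pos + MACH_TRAP_ENTRY_SIZE
        # Assumptions: 8-byte alignment, and the magic repeats in the
        # immediately following entry (trap 11).
        if pos % 8 == 0 and pos >= back and segment[nxt:nxt + 8] == TRAP_MAGIC:
            return pos - back
        pos = segment.find(TRAP_MAGIC, pos + 1)
    return None
```

Requiring the pair makes the 5-byte value used in the matcher rule later on much less likely to hit unrelated data.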
There are no new Mach traps in Darwin 24 so far, and the latest - _exclaves_ctl_trap - is still unimplemented (outside of the latest M4 iPad).
IN-KERNEL MIG ROUTINES
A third dimension of the kernel attack surface lies in its in-kernel MIG routines. These are created from the .defs files in XNU's /osfmk/mach/*.defs using the mig(1) utility. mig(1) creates the client and server code, as well as an .h file, which is provided to user mode through the SDK's <mach/*.h> #includes. The well known Mach primitives - Host, Task, Thread, Exception, and others - are all implemented as MIG messages to ports, whose RECEIVE rights (and, in practice, obligations) are held by the kernel.
MIG tables are easy to identify, because of their very distinct structure. They, too, reside in the __DATA_CONST.__const. Without getting into the boring details, disarm(j) (like its precursors) can readily identify them. This allows a quick and effective comparison between the past and present kernelcaches, like so:
The above shows three new MIG messages have been added: #2411 (in osfmk/mach/mach_exc.defs), #3466 (in osfmk/mach/task.defs), and #3632 (in osfmk/mach/thread_act.defs).
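The mapping from message number to .defs file follows from the subsystem base declared at the top of each file: a routine's message ID is the base plus its index. The Python sketch below uses only the three bases relevant here (2405, 3400 and 3600, as I recall them from the public XNU sources; treat them as assumptions and verify against the .defs files). The function name is hypothetical, and no upper bound is checked.

```python
from bisect import bisect_right

# (base message ID, subsystem) - assumed from the 'subsystem' line of
# osfmk/mach/{mach_exc,task,thread_act}.defs; deliberately incomplete.
SUBSYSTEM_BASES = [(2405, "mach_exc"), (3400, "task"), (3600, "thread_act")]


def locate_mig_routine(msg_id: int):
    """Map a MIG message ID to (subsystem, routine index within it)."""
    bases = [base for base, _ in SUBSYSTEM_BASES]
    i = bisect_right(bases, msg_id) - 1
    if i < 0:
        raise ValueError(f"message {msg_id} precedes all known subsystems")
    base, name = SUBSYSTEM_BASES[i]
    return name, msg_id - base


for msg in (2411, 3466, 3632):
    print(msg, locate_mig_routine(msg))
```

The routine index is what to look for in the corresponding .defs file once the updated sources are published, counting skip directives as well.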
Conducting Your Own Deeper Analysis
Veteran iOS reversers might remember my old joker(j) tool, which I wrote in order to symbolicate *OS kernels. The tool, like its name, started out as a joke, but gained surprising popularity when Apple stripped its kernelcaches bare. Since then, the logic was incorporated into jtool2(j) --analyze, where it lived on as a statically linked module.
When I decided to refactor jtool2(j) into disarm(j), I realized that all the symbolication logic, when hard compiled, would quickly grow outdated. I thus decided to take its core - pattern matching - and put it into a separate text file. As with the companion file, I emphasized a simple and easily maintainable design: a plain text file.
The beauty of using a separate file for matching logic is not only that it decouples it from the disassembly engine, but also that, in this way, matchers can be used on any file - not just kernelcaches! Any Mach-O, or even ELFs, can be analyzed with matchers. It's the kind of feature one would want in their favorite disassembler/reversing engine - which could probably be made using some Python plugin (or built in to a future version). With disarm(j), it's available here and now.
Analysis can be triggered by specifying JA=1. Default analysis is very basic, however, so oftentimes you'll want to supply a custom matcher file, using JMATCHERS=/path/to/file. Because analysis can be CPU intensive, it will automatically create a companion file - a simple text file of 0xaddress:_symbol records, which will then be loaded automatically by disarm(j) in future runs.
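Because the companion file is plain text, it is easy to consume from other tooling. Below is a minimal Python sketch of a loader. It assumes exactly one 0xaddress:_symbol record per line and silently skips anything else. The real file may carry details not described here, and the function name is my own.

```python
def load_companion(text: str) -> dict:
    """Parse '0xaddress:_symbol' records into an {address: symbol} dict."""
    symbols = {}
    for line in text.splitlines():
        addr, sep, name = line.strip().partition(":")
        if not sep or not name:
            continue                      # not a record; ignore
        try:
            symbols[int(addr, 16)] = name
        except ValueError:
            continue                      # address field was not hex
    return symbols


# Illustrative content only: the addresses below are made up.
sample = "0xfffffff007a00000:_sysent\n0xfffffff007a10000:_mach_trap_table\n"
for address, symbol in sorted(load_companion(sample).items()):
    print(hex(address), symbol)
```

The resulting dictionary can be fed into whatever disassembler or script needs the names, without re-running the analysis.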
ARGUMENT MATCHERS
Although it's a command-line tool, disarm(j) contains a rather sophisticated proprietary disassembler. Among other capabilities, it tracks register values, which enables it to figure out arguments to functions. There are also numerous plug-in options in this tool (i.e. the user can supply dylibs with callbacks on specific function calls, or argument values). The most common use of this feature, however, is creating rules - when argument #x has value y. The syntax couldn't be easier:
REGION MATCHERS
Argument matchers are useful for many functions, but another common pattern is looking through values pointed to in the various data segments of a binary. Since disarm(j) automagically handles chained fixups, pointer rebasing, etc, it is capable of determining pointer targets with 100% certainty. This opens up an ability to create really powerful rules.
For example, remember the system call entry logic above, what we called rule sc? It is easy to express in the following rule:
__DATA_CONST|val=0x0004000100000000|_sysent|_sysent|-40
How about the Mach trap table, the rule mt from above? Just as easy:
__DATA_CONST|val=0x0000000504|mach_trap_table|_mach_trap_table|-240
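It helps to see how the val= literals relate to the byte "magics" from earlier: read as little-endian integers, they are the same bytes. The Python sketch below splits a rule on the pipe character and converts the value. The field names are my own guesses by position (the third field in particular is simply carried along), and the width of the value is assumed to follow the number of hex digits written.

```python
def val_to_le_bytes(literal: str) -> bytes:
    """Convert a '0x...' literal to little-endian bytes, one per digit pair."""
    digits = literal[2:] if literal.lower().startswith("0x") else literal
    width = (len(digits) + 1) // 2
    return int(digits, 16).to_bytes(width, "little")


def parse_region_rule(rule: str) -> dict:
    """Split a region matcher rule into positional fields (names are guesses)."""
    segment, match, label, symbol, offset = rule.split("|")
    kind, _, value = match.partition("=")
    return {"segment": segment, "kind": kind, "value": value,
            "label": label, "symbol": symbol, "offset": int(offset)}


rule = parse_region_rule("__DATA_CONST|val=0x0004000100000000|_sysent|_sysent|-40")
print(rule["symbol"], rule["offset"], val_to_le_bytes(rule["value"]).hex(" "))
print(val_to_le_bytes("0x0000000504").hex(" "))
```

The first value expands to the full 8-byte exit(2) magic, while the second covers only the first five bytes of the Mach trap magic, which is still enough given the trailing bytes are zero.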
As one can discern from the above examples, the format of region matchers is also kept simple:
Matching on strings is highly effective, especially given the large number of sysctl structures in the kernel's __DATA_CONST.__const.
Another option is to match by value. Taking, as an example, a fragment of Apple's gAppleSystemVariableGuid - we know its value (after all, it's globally unique), and that it resides in __TEXT.__const. Thus, the following example can immediately identify it, along with two other friends:
TAKEAWAYS
Using disarm(j)'s analysis and custom matchers, it's possible to create many symbolication rules, which can be used across multiple versions of the same binary - analysis which is future proof for important binaries like XNU. Thanks to disarm(j)'s handling of many file formats, it can also be used on any AArch64 binary - user mode Mach-Os, ELFs, or even PE32+.
One more thing..
What about this Apple Intelligence, that's all the buzz (and that popped AAPL stock by 11% last week?) Well, that's a user mode subsystem. Surprisingly, the daemon appears to have made its debut in iOS 17.4-ish. Xn00ping around we see it's the intelligenceplatformd, with files in ~mobile/Library/IntelligencePlatform/graph.db
And, unsurprisingly, such an important database of intimate knowledge about the user is well protected:
Lots of interesting tidbits here, but that's something we might leave for a future blog post.. ;-)
Cover Image Credit: Image by pvproductions