Dyld Shared Cache


Introduction

LIEF’s dyld shared cache support allows the inspection and extraction of libraries from Apple’s dyld shared cache.

import lief

dyld_cache: lief.dsc.DyldSharedCache = lief.dsc.load("macos-15.0.1/")

Warning

lief.dsc.load() takes as input either a directory, to load the whole shared cache, or a set of files, to load a subset of the cache.

dyld_cache: lief.dsc.DyldSharedCache = ...

for dylib in dyld_cache.libraries:
    print("0x{:016x}: {}".format(dylib.address, dylib.path))
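The listing loop above can be wrapped in a small filter. The following is a minimal sketch: `find_by_path_fragment` and `format_entry` are hypothetical helpers (not part of LIEF) that rely only on the `path` and `address` attributes used above, so they should accept any iterable of such objects, including `dyld_cache.libraries`.

```python
from typing import Iterable, Iterator, Tuple


def find_by_path_fragment(libraries: Iterable, fragment: str) -> Iterator[Tuple[int, str]]:
    """Yield (address, path) for each library whose path contains `fragment`.

    Hypothetical helper: it only assumes that each item exposes the
    `address` and `path` attributes used in the loop above.
    """
    for dylib in libraries:
        if fragment in dylib.path:
            yield dylib.address, dylib.path


def format_entry(address: int, path: str) -> str:
    """Format one entry the same way as the listing loop above."""
    return "0x{:016x}: {}".format(address, path)
```

With a loaded cache, this could be used as `for address, path in find_by_path_fragment(dyld_cache.libraries, "lockdown"): print(format_entry(address, path))`.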

dyld_cache: lief.dsc.DyldSharedCache = ...

liblockdown = dyld_cache.find_lib_from_name("liblockdown.dylib")

macho: lief.MachO.Binary = liblockdown.get()

for segment in macho.segments:
    print(segment.name)

liblockdown = dyld_cache.find_lib_from_name("liblockdown.dylib")

macho: lief.MachO.Binary = liblockdown.get()
macho.write("on-disk-liblockdown.dylib")
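When several libraries need to be written to disk, the find/get/write sequence above can be put in a loop. This is a minimal sketch under stated assumptions: `extract_libraries` is a hypothetical helper (not part of LIEF), and it assumes that `find_lib_from_name()` returns `None` for an unknown name and that `get()` returns `None` when extraction fails.

```python
from pathlib import Path
from typing import Iterable, List


def extract_libraries(dyld_cache, names: Iterable[str], out_dir: str) -> List[Path]:
    """Extract the named libraries into `out_dir`; return the files written.

    Hypothetical helper. Assumptions: `find_lib_from_name()` returns None
    for an unknown name, and `get()` returns None when extraction fails.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for name in names:
        dylib = dyld_cache.find_lib_from_name(name)
        if dylib is None:
            continue  # Not present in this cache: skip it.
        macho = dylib.get()
        if macho is None:
            continue  # Extraction failed: skip it.
        output = target / name
        macho.write(str(output))
        written.append(output)
    return written
```

The helper only touches the three calls shown above, so it can be exercised with stand-in objects before running it against a real cache.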

Warning

By default, LIEF does not remove dyld shared cache optimizations. To remove some of these optimizations, check the extraction options structure described in the API references.

Performance Considerations

Dyld shared cache files are very large, which means that they can’t be processed in the same way as regular binaries.

Note

The regular parsing functions parse all the format structures up front (with decent performance) because:

  1. Most binaries are smaller than a gigabyte.

  2. A complete representation is required for modifying binaries.

From a technical perspective, LIEF uses a LIEF::FileStream to access dyld shared cache structures on demand. Thus, memory consumption is limited to the size of the structures being accessed. The drawback is that file-based access is slower than access through a LIEF::VectorStream.

Additionally, LIEF’s dyld shared cache implementation heavily relies on the iterator pattern to follow the principle: don’t pay overhead for what you don’t access.
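The idea of on-demand, file-based access combined with lazy iteration can be illustrated in pure Python. This is a conceptual sketch only, not LIEF’s implementation: the `LazyRecords` class and its record layout (a 64-bit address followed by a 32-bit size) are made up for the example, and an in-memory `io.BytesIO` stands in for a file on disk.

```python
import io
import struct
from typing import BinaryIO


class LazyRecords:
    """Random-access view over fixed-size records stored in a seekable stream.

    Conceptual sketch only: nothing is decoded until an item is requested,
    so creating the view is free and each lookup reads a single record.
    """

    RECORD = struct.Struct("<QI")  # Illustrative layout: u64 address, u32 size.

    def __init__(self, stream: BinaryIO, count: int) -> None:
        self._stream = stream
        self._count = count
        self.reads = 0  # Number of records actually read from the stream.

    def __len__(self) -> int:
        return self._count

    def __getitem__(self, index: int):
        if index < 0:
            index += self._count
        if not 0 <= index < self._count:
            raise IndexError(index)
        self._stream.seek(index * self.RECORD.size)
        self.reads += 1
        return self.RECORD.unpack(self._stream.read(self.RECORD.size))


# Build an in-memory "file" holding 1000 records, then touch only one of them.
blob = b"".join(LazyRecords.RECORD.pack(0x1000 * i, i) for i in range(1000))
records = LazyRecords(io.BytesIO(blob), count=1000)
address, size = records[42]
```

Here only one record is read even though the stream holds a thousand. The trade-off is the one described above: each access goes through a seek and a read instead of hitting an already decoded in-memory structure.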

When it is possible, LIEF implements the trait of a random access iterator [1] so that we can programmatically do:

dyld_cache: lief.dsc.DyldSharedCache = ...

# No cost
libraries = dyld_cache.libraries

# O(1) cost
first_lib = libraries[0]

# O(len(libraries)) cost
for lib in libraries:
    print(lib.path)

For instance, extracting a library could require iterating over the dyld shared cache’s stub islands several times. To improve overall performance, LIEF provides a cache-based optimization that can be enabled and configured through the API.

When should you turn caching on?

You can skip LIEF’s caching if:

  • You don’t plan to extract libraries from the shared cache.

  • You plan to extract only one library from the shared cache, and only once.

  • You don’t want to have LIEF cache artifacts on your system.

By default, the cache mechanism is not enabled.
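The checklist above can be summarized as a small predicate. This is only a sketch: `should_enable_caching` is a hypothetical helper (not part of LIEF) that encodes the three "skip" rules and nothing more.

```python
def should_enable_caching(libraries_to_extract: int, runs: int = 1,
                          allow_cache_artifacts: bool = True) -> bool:
    """Return True when the checklist above suggests enabling LIEF's cache.

    Hypothetical helper that only encodes the three "skip" rules:
    no extraction, a single one-off extraction, or no artifacts allowed.
    """
    if not allow_cache_artifacts:
        return False  # Cache files on disk are not acceptable.
    if libraries_to_extract <= 0 or runs <= 0:
        return False  # Inspection only: nothing is extracted.
    if libraries_to_extract == 1 and runs == 1:
        return False  # One library, extracted once: caching can be skipped.
    return True
```

For example, a script that extracts the same library on every run, or several libraries in one run, would fall on the "enable" side of this rule.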

References

Python API

C++ API

Rust API: lief::dsc