While this new release adds new functionality and fixes various bugs, it is worth mentioning that it is the first release to officially expose Rust bindings! In addition, an extended version was also released to provide additional functionality not strictly related to executable formats.
As discussed in these blog posts:
LIEF is now available in Rust for the following architectures:
aarch64-unknown-linux-gnu
x86_64-apple-darwin
x86_64-pc-windows-msvc (MT/MD runtimes)
x86_64-unknown-linux-gnu
aarch64-apple-ios
aarch64-apple-darwin
I published the release on crates.io so you should be able to start using LIEF in Rust with:
[package]
name = "lief-demo"
version = "0.0.1"
edition = "2021"

[dependencies]
lief = "0.15.0"
LIEF is now providing additional features thanks to an extended version. Among those features, it provides support for DWARF and PDB debug formats as well as Objective-C metadata.
The Objective-C support is a spin-off of iCDump, which is now completely integrated into LIEF. Compared to the original iCDump project, it fixes the issue with the new chained relocations format (cf. issue #4) and can be used on all the platforms supported by LIEF (including Windows) in C++/Rust/Python:
Rust:
let macho: lief::macho::Binary;

if let Some(metadata) = macho.objc_metadata() {
    println!("Objective-C metadata found");
    for class in metadata.classes() {
        println!("name={}", class.name());
        for method in class.methods() {
            println!(" method.name={}", method.name());
        }
    }
}
Python:
import lief
macho: lief.MachO.Binary = ...
metadata: lief.objc.Metadata = macho.objc_metadata

if metadata is not None:
    print("Objective-C metadata found")

    for clazz in metadata.classes:
        print(f"name={clazz.name}")
        for meth in clazz.methods:
            print(f" method.name={meth.name}")

    # Generate a header like "class-dump"
    print(metadata.to_decl())
Supporting debug formats like DWARF or PDB has been a long-standing discussion (cf. issue #17). The main reasons to avoid supporting these formats from scratch were:
On the other hand, I do understand the need to be able to process debug info (if present) from a LIEF binary object. While looking at the API of the different existing projects, I noticed that they are very good at exposing a low-level API that matches the debug format specifications, but they don’t provide [1] an abstraction over the complexity of these specifications.
Developers and reverse engineers have concepts of compilation units, functions, global variables, stack variables, etc., but before being able to access this information from a DWARF or a PDB file, you need to go through what a PDB DBI stream is or understand that the address of a function in DWARF can be determined by either DW_AT_entry_pc or DW_AT_low_pc.
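To make the DW_AT_entry_pc / DW_AT_low_pc point concrete, here is a minimal sketch of the kind of abstraction meant here. It is not LIEF's or LLVM's implementation: the `function_address` helper and the plain dictionary of attributes are hypothetical, the preference order is an assumption, and both attributes are treated as absolute addresses.

```python
from typing import Mapping, Optional


def function_address(attributes: Mapping[str, int]) -> Optional[int]:
    """Return a single 'address' for a function from its DWARF attributes.

    Hypothetical helper: `attributes` maps attribute names to values that
    are assumed to be absolute addresses already.
    """
    # Assumption: prefer the explicit entry point, then fall back to the
    # lowest address of the function's code range.
    for name in ("DW_AT_entry_pc", "DW_AT_low_pc"):
        if name in attributes:
            return attributes[name]
    # Declarations and fully inlined functions may carry no address at all.
    return None
```

A caller then only asks for "the address of the function" and does not need to know which attribute the producer emitted.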
The idea behind the support of the DWARF and PDB formats in LIEF is to:
This LIEF bridge is based on LLVM, which did the heavy lifting of supporting DWARF & PDB within a single framework.
The DWARF & PDB support in LIEF leverages the LLVM API to abstract concepts as listed above.
For instance, you can iterate over all the public symbols of ntoskrnl.pdb with:
import lief

ntoskrnl: lief.pdb.DebugInfo = lief.pdb.load("./ntoskrnl.pdb")

for sym in ntoskrnl.public_symbols:
    print(f"{sym.demangled_name}: 0x{sym.RVA:06x}")
If the PDB embeds extended information about the compilation units we can do (in Rust):
let pdb = lief::pdb::load("peacecannary.pdb");
for cu in pdb.compilation_units() {
    for func in cu.functions() {
        if func.name().starts_with("peacecannary::CObfuscator") {
            println!("{}: {} (0x{:04x})", cu.module_name(), func.name(), func.rva());
        }
    }
}
The API for the DWARF format is pretty similar:
import lief

elf: lief.ELF.Binary = ...
# If the binary embeds DWARF debug info in the ELF:
dwarf: lief.dwarf.DebugInfo = elf.debug_info
# Otherwise:
dwarf: lief.dwarf.DebugInfo = lief.dwarf.load("my_dwarf.dwarf")

for cu in dwarf.compilation_units:
    print(f"Produced by: {cu.producer} in {cu.compilation_dir}")

    for func in cu.functions:
        print(f"0x{func.address:04x}: {func.name} ({func.size} bytes)")

    for var in cu.variables:
        if var.is_constexpr:
            continue
        # Look for global variables only
        if var.address is not None and var.address > 0:
            print(f"0x{var.address:04x}: {var.linkage_name} ({var.size} bytes)")
For more details about the API, you can take a look at these dedicated sections:
LIEF now supports Apple *.hwx files, which are a kind of Mach-O file for the Apple Neural Engine (ANE).
These *.hwx files start with a new magic identifier, 0xbeefface, and embed custom LC_ commands such as the command 0x40.
[Figure: LC Command 0x40]
I would be interested in adding support for this private command in LIEF, so if anyone has already reversed it or has some info about its layout, feel free to reach out.
To support unknown or non-public LC commands in LIEF, I created an artificial LIEF::MachO::UnknownCommand, which is a placeholder for any Mach-O command that is not recognized by LIEF.
For instance, we can inspect the private 0x40 command as follows:
import lief
macho = lief.MachO.parse("personsemantics-u8-v4.H16.espresso.hwx").at(0)
lc_0x40: lief.MachO.UnknownCommand = macho.commands[18].command

print(lc_0x40.original_command) # Outputs 64 (0x40)
print(bytes(lc_0x40.data)) # Print the raw content of the command
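The placeholder idea can be sketched without LIEF. The snippet below is an illustration only and not LIEF's parser: it assumes a .hwx file uses the regular little-endian 64-bit Mach-O header layout (32 bytes, followed by `ncmds` load commands that each start with `cmd` and `cmdsize`), and the `UnknownCommand` dataclass, `parse_commands` function, and `sample` buffer are hypothetical names built for the example.

```python
import struct
from dataclasses import dataclass

HWX_MAGIC = 0xBEEFFACE
MAGICS = {0xFEEDFACE: "Mach-O 32-bit", 0xFEEDFACF: "Mach-O 64-bit", HWX_MAGIC: "Apple .hwx"}
# A tiny subset of the public load commands, enough for the demo.
KNOWN_COMMANDS = {0x1: "LC_SEGMENT", 0x2: "LC_SYMTAB", 0x19: "LC_SEGMENT_64"}


@dataclass
class UnknownCommand:
    original_command: int  # raw `cmd` value, e.g. 0x40
    data: bytes            # raw bytes of the whole command, header included


def parse_commands(blob: bytes):
    # Assumed header layout: magic, cputype, cpusubtype, filetype,
    # ncmds, sizeofcmds, flags, reserved (8 x uint32).
    magic, _, _, _, ncmds, _, _, _ = struct.unpack_from("<8I", blob, 0)
    if magic not in MAGICS:
        raise ValueError(f"unexpected magic: {magic:#x}")
    commands, offset = [], 32
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", blob, offset)
        if cmdsize < 8:
            raise ValueError("corrupted load command size")
        if cmd in KNOWN_COMMANDS:
            commands.append(KNOWN_COMMANDS[cmd])
        else:
            # Keep the raw bytes instead of dropping what we cannot decode.
            commands.append(UnknownCommand(cmd, blob[offset:offset + cmdsize]))
        offset += cmdsize
    return MAGICS[magic], commands


# Synthetic buffer: one LC_SYMTAB (24 bytes) and one private 0x40 command (16 bytes).
sample = (
    struct.pack("<8I", HWX_MAGIC, 0, 0, 0, 2, 40, 0, 0)
    + struct.pack("<2I", 0x2, 24) + bytes(16)
    + struct.pack("<2I", 0x40, 16) + b"\xaa" * 8
)
kind, commands = parse_commands(sample)
```

Keeping the raw bytes in the placeholder means an unrecognized command can still be inspected, and written back unchanged, even though its layout is not understood.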
These .hwx files have been involved in the Dopamine jailbreak, and you can also find a Black Hat presentation about the Apple Neural Engine: Apple Neural Engine Internal.
LIEF can inspect and verify the PE Authenticode and with this release, we can even do that in Rust!
use lief::pe;
use std::process::ExitCode;

let mut file = std::fs::File::open(path).expect("Can't open the file");
if let Some(lief::Binary::PE(pe)) = lief::Binary::from(&mut file) {
    let result = pe.verify_signature(pe::signature::VerificationChecks::DEFAULT);
    if result.is_ok() {
        println!("Valid signature!");
    } else {
        println!("Signature not valid: {}", result);
    }
    return ExitCode::SUCCESS;
}
This new release also adds support for the Ms-CounterSignature attribute (OID: 1.3.6.1.4.1.311.3.3.1) and some other attributes like Ms-ManifestBinaryID (OID: 1.3.6.1.4.1.311.10.3.28).
No breaking updates for the ELF format.
LIEF is now able to parse and modify binaries compiled with the new DT_RELR and DT_ANDROID_REL_ relocations.
I also added the helper LIEF::ELF::Binary::get_relocated_dynamic_array, which allows us to get a relocated view of the DT_INIT_ARRAY/DT_FINI_ARRAY.
This can be useful when, for instance, the init array values are null because of relocations:
import lief

elf: lief.ELF.Binary = ...

# Return: [0, 0, 0, 0, ...]
elf.get(lief.ELF.DynamicEntry.TAG.INIT_ARRAY).array

# Return relocated values: [0x96db10, 0x9b9c14, 0xe7f660, 0xe7f70c, ...]
elf.get_relocated_dynamic_array(lief.ELF.DynamicEntry.TAG.INIT_ARRAY)
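To illustrate why the raw array can be all zeros, here is a minimal model of a "relocated view". It is a sketch and not LIEF's implementation: it assumes every slot is patched by a relative relocation with an explicit addend, and the `relocated_array` function, the array address `0x1000`, and the relocation list are hypothetical.

```python
from typing import Iterable, List, Tuple


def relocated_array(
    array_va: int,
    raw_values: List[int],
    relative_relocs: Iterable[Tuple[int, int]],
    image_base: int = 0,
    word_size: int = 8,
) -> List[int]:
    """Resolve each array slot as the loader would for relative relocations.

    `relative_relocs` holds (address, addend) pairs; a slot targeted by a
    relocation resolves to image_base + addend, other slots keep their raw
    on-disk value.
    """
    addend_at = dict(relative_relocs)
    view = []
    for index, raw in enumerate(raw_values):
        slot = array_va + index * word_size
        view.append(image_base + addend_at[slot] if slot in addend_at else raw)
    return view


# On disk the init array only contains zeros: the real pointers live in
# the relocation addends (values reused from the listing above).
raw = [0, 0, 0, 0]
relocs = [(0x1000, 0x96DB10), (0x1008, 0x9B9C14), (0x1010, 0xE7F660), (0x1018, 0xE7F70C)]
view = relocated_array(0x1000, raw, relocs)
```

With an image base of 0, the view contains the addends themselves, which matches the shape of the relocated values shown above.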
Since the beginning of LIEF, all the enums used by the different formats were located in a single header file (e.g. LIEF/PE/enums.hpp or lief.PE.{enums, ...} in Python).
Some of them were clashing with system headers that also #define some of these names.
To work around this issue, we had a dirty hack based on LIEF/{ELF,PE,MachO}/undef.h, which undefines these values before the enums are included.
In LIEF 0.15.0 the scope of the enums has been redefined so that we should no longer need the undef.h.
For instance, the standalone enum LIEF::ELF::ELF_SECTION_TYPES (or lief.ELF.SECTION_TYPES) has been re-scoped in the LIEF::ELF::Section class:
// <LIEF/ELF/Section.hpp>
class LIEF_API Section : public LIEF::Section {
  public:
  enum class TYPE : uint64_t {
    SHT_NULL = 0, /**< No associated section (inactive entry). */
    PROGBITS = 1, /**< Program-defined contents. */
    ...
  };
};
This means that instead of using LIEF::ELF::ELF_SECTION_TYPES::SHT_PROGBITS or lief.ELF.SECTION_TYPES.SHT_PROGBITS you should now use:
- LIEF::ELF::ELF_SECTION_TYPES::SHT_PROGBITS
+ LIEF::ELF::Section::TYPE::PROGBITS

- lief.ELF.SECTION_TYPES.SHT_PROGBITS
+ lief.ELF.Section.TYPE.PROGBITS
The enums affected by this change are listed in the changelog.
I received some feedback about performance issues in the 0.14.x releases compared to former releases. This regression affects Mach-O and PE binaries, and I’m happy to say that this v0.15.0 release should be faster on ELF, PE, and Mach-O compared to previous releases.
The PE regression comes from the LIEF::PE::OptionalHeader::computed_checksum introduced in LIEF 0.12.0 and discussed in this issue: #660.
Since LIEF 0.12.0, this computed_checksum was computed during the parsing phase, and on large binaries this computation can have a significant impact on performance.
In LIEF 0.15.0, the OptionalHeader’s checksum can be re-computed from the LIEF::PE::Binary object:
import lief

pe: lief.PE.Binary = ...
computed_checksum = pe.compute_checksum()
This avoids the computation during the parsing phase and moves it to an “on-demand” API.
On the other hand, the Mach-O regression was pretty tricky to identify (cf. issue #1069).
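To see why this is costly on large binaries, here is a sketch of the commonly described PE checksum algorithm. It is not LIEF's implementation: `pe_checksum` is a hypothetical function, and it assumes the CheckSum field sits at offset 64 of the optional header and is 4-byte aligned in the file.

```python
import struct


def pe_checksum(data: bytes) -> int:
    """Compute the PE checksum of a whole file image (illustrative sketch)."""
    # e_lfanew, at offset 0x3c of the DOS header, points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # Signature (4 bytes) + COFF file header (20 bytes) + CheckSum offset (64).
    checksum_field = e_lfanew + 4 + 20 + 64

    # Pad to a multiple of 4 so the file can be read as 32-bit words.
    padded = data + b"\x00" * (-len(data) % 4)
    total = 0
    # Every word of the file is visited: the cost grows with the file size.
    for offset in range(0, len(padded), 4):
        if offset == checksum_field:
            continue  # the stored checksum is excluded from the sum
        (dword,) = struct.unpack_from("<I", padded, offset)
        total += dword
        total = (total & 0xFFFFFFFF) + (total >> 32)  # fold the carry back in

    # Fold the 32-bit sum down to 16 bits, then add the file length.
    total = (total & 0xFFFF) + (total >> 16)
    total = (total + (total >> 16)) & 0xFFFF
    return total + len(data)
```

Since the loop walks the entire file, running it for every parsed binary is wasted work when the caller never looks at the checksum, which is the motivation for computing it only on request.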
The root cause of the regression was these lines:
// https://github.com/lief-project/LIEF/blob/0.14.1/src/MachO/BinaryParser.cpp#L285-L290
for (LARGE_LOOP) {
  if (!is_printable(name)) {
    ...
  }
}
with is_printable implemented as follows:
bool is_printable(const std::string& str) {
  return std::all_of(std::begin(str), std::end(str),
    [] (char c) { return std::isprint<char>(c, std::locale("C")); });
}
Then, while processing large Mach-O binaries with LIEF we can observe:
It turned out that std::locale("C") is cached by the STL on Linux but not on macOS & Windows. This means that we were constructing std::locale("C") for each character of each string, which has a cost.
One solution is to store std::locale("C") in a static variable, as is done under the hood in the Linux STL.
  bool is_printable(const std::string& str) {
    return std::all_of(std::begin(str), std::end(str),
-     [] (char c) { return std::isprint<char>(c, std::locale("C")); });
+     [] (char c) {
+       static std::locale LC("C");
+       return std::isprint<char>(c, LC);
+     });
  }
The actual fix is slightly different though: 7c3f63194.
LIEF Python wheels are now available for Musl-based systems. This support is motivated by the fact that Python Docker images tagged with the -alpine suffix use Alpine Linux, which is based on Musl libc.
Thus, we can now use Docker’s python-alpine images as a base to install LIEF:
FROM python:3.13.0b3-alpine

RUN pip install --no-cache-dir lief==0.15.0
Note that the LIEF Python wheel for Alpine weighs ~2.5MB compressed and ~7MB decompressed.
This new Rust-oriented release is a major milestone for LIEF. While the library is widely used in the Python community, with ~16,000 daily downloads on PyPI, I’m eager to see new use cases or issues brought by the Rust community.
As a reminder, there is a Discord channel where you can drop your questions and remarks (that are not issues 😉).
Thank you also to arttson and lexika979 for their sponsorship.
[1] Which makes sense, since this is not the purpose of these projects.