Mach-O exploration. Tools - nm
— Started in Exploring Mach-O binaries. Tools - pagestuff
Introduction
Today’s plan is to take a look at another command-line tool that is helpful for binary analysis. I’ve already used it in a couple of posts to find a specific method declared in a binary. This tool is nm. As usual, here is the short description from its man page:
nm - display name list (symbol table).
What is a symbol table?
A symbol table maps identifiers from the source code to specific addresses. For C-family languages, the identifiers are the functions and global variables that are defined and referenced in a program. Objective-C adds class and instance methods, but their representation in the binary is basically the same.
Note:
An interesting fact I found while reading the nm(1) man page: since the Xcode 8.0 release there are two versions of nm, nm-classic and llvm-nm. At the moment the default version is llvm-nm, which appears to be the one used in this post.
Trying out
I think the simplest way to understand something is to try it and see it with your own eyes. nm has various options that change the formatting and add extra information. However, to start with, it is more than enough to call it without parameters. As the target I’ll use the binary from the previous post about pagestuff. You might also want to check the source code before going further; here it is.
$ nm sampler
Output will be:
0000000100000d90 t -[SampleClass .cxx_destruct]
0000000100000d30 t -[SampleClass property]
0000000100000d50 t -[SampleClass setProperty:]
U _NSLog
U _OBJC_CLASS_$_NSObject
0000000100001208 S _OBJC_CLASS_$_SampleClass
00000001000011d8 s _OBJC_IVAR_$_SampleClass._property
U _OBJC_METACLASS_$_NSObject
00000001000011e0 S _OBJC_METACLASS_$_SampleClass
U ___CFConstantStringClassReference
0000000100000000 T __mh_execute_header
U __objc_empty_cache
0000000100000dd0 T _main
U _objc_autoreleasePoolPop
U _objc_autoreleasePoolPush
U _objc_getProperty
U _objc_msgSend
U _objc_setProperty_nonatomic_copy
U _objc_storeStrong
U dyld_stub_binder
Basically, we have three columns: address, type, and symbol. The first difference I noticed is that some items have no address, and all of those items have type U. We also see familiar names, such as the SampleClass class name, the property property name, the NSLog function, some names mixed with prefixes, and some _objc_ implementation functions.
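The three-column layout described above can be sketched as a small parser. This is only an illustration: the Symbol type and parse_nm_line function are hypothetical names, and the sample lines are copied from the listing above, with padding added before U on the assumption that the real tool pads the missing address column.

```python
from typing import NamedTuple, Optional


class Symbol(NamedTuple):
    address: Optional[int]  # None when nm prints no address (undefined symbols)
    kind: str               # single-letter type code, e.g. "T", "t", "S", "U"
    name: str               # symbol name; may contain spaces, e.g. "-[SampleClass property]"


def parse_nm_line(line: str) -> Symbol:
    """Split one line of default nm output into address, type and name."""
    first, rest = line.strip().split(" ", 1)
    if len(first) == 1:
        # No address column: the line starts directly with the type letter.
        return Symbol(None, first, rest.strip())
    kind, name = rest.strip().split(" ", 1)
    return Symbol(int(first, 16), kind, name)


# A few lines taken from the listing above.
SAMPLE = """\
0000000100000d30 t -[SampleClass property]
                 U _NSLog
0000000100000dd0 T _main
"""

symbols = [parse_nm_line(line) for line in SAMPLE.splitlines() if line.strip()]
for symbol in symbols:
    print(symbol)
```

Splitting only on the first spaces matters here, because Objective-C method names such as -[SampleClass property] contain a space themselves.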
Let’s examine what we have in the documentation:
U - referenced, but not defined in the file (undefined).
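All items below (marked as U) are functions and constants defined outside the binary. NSLog is provided by the Foundation framework. NSObject is usually associated with Foundation too, although the nm -m output later in this post shows its class symbols coming from libobjc. Most of the other functions are part of the ARC implementation or the Objective-C runtime and were inserted by clang.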
U _NSLog
U _OBJC_CLASS_$_NSObject
U _OBJC_METACLASS_$_NSObject
U ___CFConstantStringClassReference
U __objc_empty_cache
U _objc_autoreleasePoolPop
U _objc_autoreleasePoolPush
U _objc_getProperty
U _objc_msgSend
U _objc_setProperty_nonatomic_copy
U _objc_storeStrong
U dyld_stub_binder
T - Global function (text) object
0000000100000000 T __mh_execute_header
0000000100000dd0 T _main
main is clearly a global function. The story with __mh_execute_header is a bit more complicated. I found the answer in the ldsyms.h file, which basically says it is the address of the Mach header in a Mach-O executable file. In other words, it marks where the header starts.
t - Local function (text) object
0000000100000d90 t -[SampleClass .cxx_destruct]
0000000100000d30 t -[SampleClass property]
0000000100000d50 t -[SampleClass setProperty:]
As we’ll see later, all these methods are locally defined. property and setProperty: are the getter and setter of the property we defined; the compiler generated their implementations. .cxx_destruct is a more interesting guest. Initially it was part of the Objective-C++ implementation, but currently it serves as a deallocation function for both Objective-C and Objective-C++.
Surprisingly, there was no definition for S / s there, so I took one from the nm(1) man page:
S - symbol in a section other than those above
0000000100001208 S _OBJC_CLASS_$_SampleClass
00000001000011d8 s _OBJC_IVAR_$_SampleClass._property
00000001000011e0 S _OBJC_METACLASS_$_SampleClass
If you remember what pagestuff produced in the previous post, all these symbols were on page 1 and belonged to __DATA. Meanwhile, all t symbols were in __TEXT (page 0).
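The page numbers can be cross-checked against the addresses nm printed with a little arithmetic. The sketch below rests on two assumptions that are not stated in the post: pages are 4096 bytes, and a symbol's offset in the file equals its address minus the address of __mh_execute_header. That seems to hold for this small binary, but it is not guaranteed for every Mach-O file. The page_index helper is a hypothetical name.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages
IMAGE_BASE = 0x100000000  # address of __mh_execute_header in the listing above

# Addresses copied from the nm output above.
ADDRESSES = {
    "-[SampleClass .cxx_destruct]": 0x100000D90,
    "-[SampleClass property]": 0x100000D30,
    "-[SampleClass setProperty:]": 0x100000D50,
    "_OBJC_CLASS_$_SampleClass": 0x100001208,
    "_OBJC_IVAR_$_SampleClass._property": 0x1000011D8,
    "_OBJC_METACLASS_$_SampleClass": 0x1000011E0,
}


def page_index(address: int, base: int = IMAGE_BASE, page_size: int = PAGE_SIZE) -> int:
    """Return the zero-based page that holds the given address."""
    return (address - base) // page_size


pages = {name: page_index(address) for name, address in ADDRESSES.items()}
for name, page in pages.items():
    print(f"page {page}: {name}")
```

Under those assumptions the three methods land on page 0 and the three Objective-C data symbols on page 1, which matches what pagestuff reported.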
So all the types are covered, and now we see that types describe how parts of the code are linked together. Internal symbols marked as t / T / s / S are defined in place; they all live inside the binary. U symbols require more attention: they must be linked against an external resource (outside the binary).
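As a rough summary of the type letters covered so far, here is a minimal sketch that turns a letter into a description. It relies on the nm(1) convention that a lowercase letter means a local (non-external) symbol and an uppercase letter an external one. The describe function is a hypothetical helper, and it only knows the letters that appear in this post.

```python
# Letters seen in this post; other nm type letters fall through to "unknown".
SECTION_BY_LETTER = {
    "t": "text section",
    "s": "section other than the standard ones",
}


def describe(kind: str) -> str:
    """Describe a single-letter nm type code such as "T", "t", "S", "s" or "U"."""
    if kind == "U":
        return "undefined, resolved outside the binary"
    scope = "external" if kind.isupper() else "local"
    section = SECTION_BY_LETTER.get(kind.lower(), "unknown")
    return f"defined, {scope}, {section}"


for letter in ("T", "t", "S", "s", "U"):
    print(letter, "->", describe(letter))
```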
The only part left untouched is the address. The address field tells us where the symbol is located within the binary’s code and data. It doesn’t make much sense to consider it in isolation; we need the disassembled code around the symbol’s address. To get that we need another tool that can disassemble code, for example lldb. But that is one more tool, so we’ll check it out in the next post.
More examples
Another formatting option that I found extremely useful:
$ nm ./Temp/sampler -m
Output:
0000000100000d90 (__TEXT,__text) non-external -[SampleClass .cxx_destruct]
0000000100000d30 (__TEXT,__text) non-external -[SampleClass property]
0000000100000d50 (__TEXT,__text) non-external -[SampleClass setProperty:]
(undefined) external _NSLog (from Foundation)
(undefined) external _OBJC_CLASS_$_NSObject (from libobjc)
0000000100001208 (__DATA,__objc_data) external _OBJC_CLASS_$_SampleClass
00000001000011d8 (__DATA,__objc_ivar) non-external (was a private external) _OBJC_IVAR_$_SampleClass._property
(undefined) external _OBJC_METACLASS_$_NSObject (from libobjc)
00000001000011e0 (__DATA,__objc_data) external _OBJC_METACLASS_$_SampleClass
(undefined) external ___CFConstantStringClassReference (from CoreFoundation)
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
(undefined) external __objc_empty_cache (from libobjc)
0000000100000dd0 (__TEXT,__text) external _main
(undefined) external _objc_autoreleasePoolPop (from libobjc)
(undefined) external _objc_autoreleasePoolPush (from libobjc)
(undefined) external _objc_getProperty (from libobjc)
(undefined) external _objc_msgSend (from libobjc)
(undefined) external _objc_setProperty_nonatomic_copy (from libobjc)
(undefined) external _objc_storeStrong (from libobjc)
(undefined) external dyld_stub_binder (from libSystem)
Here you can see that the type letters are replaced by segment and section names plus an external / non-external marker, and an additional column appears that shows where the symbol is defined (the framework or library name). I used this option a lot together with grep to find a specific symbol in the post about the differences between tryLock and lock.
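The same kind of filtering can be done in a script instead of grep. Below is a minimal sketch that groups undefined symbols by the library named in the "(from ...)" column. The regular expression is written against the exact lines shown above and is an assumption about the format, so other variants of the nm -m output may need adjustments. imports_by_library is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches lines such as: (undefined) external _NSLog (from Foundation)
UNDEFINED = re.compile(
    r"\(undefined\)\s+external\s+(?P<name>\S+)\s+\(from (?P<lib>[^)]+)\)"
)

# A few lines copied from the nm -m output above.
SAMPLE = """\
(undefined) external _NSLog (from Foundation)
0000000100000dd0 (__TEXT,__text) external _main
(undefined) external _objc_msgSend (from libobjc)
(undefined) external _objc_storeStrong (from libobjc)
(undefined) external dyld_stub_binder (from libSystem)
"""


def imports_by_library(text: str) -> dict:
    """Group undefined symbol names by the library they are imported from."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        match = UNDEFINED.search(line)
        if match:
            grouped[match.group("lib")].append(match.group("name"))
    return dict(grouped)


result = imports_by_library(SAMPLE)
for library, names in result.items():
    print(library, names)
```

Defined symbols such as _main are skipped because their lines carry a segment and section instead of the (undefined) marker.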
Thank you for reading!
References: