Analyzing internal Go structures in obfuscated binaries
I have been building a small C tool that fingerprints Go binaries (version, build
metadata, function names, the lot) and pointing it at a corpus of malware samples.
A few of them confused an early, naïve version of it: the detector keyed on the
pclntab magic number, and these samples did not have the magic it expected. The
obvious next question is whether they are Go at all, and if so, how much they are
trying to hide. The answer turned out to have two layers, and only one of them
does what its author probably hoped.
The pclntab, briefly
Some background, because everything below turns on it. Every Go binary carries a
pclntab (the “program-counter line table”), written by the linker into its own
section (.gopclntab on ELF, __gopclntab on Mach-O; folded into .rdata on PE).
Despite the name, it is not optional debug data. The runtime needs it to run:
growing a goroutine stack requires each function’s frame size, the garbage collector
needs the pointer map for every frame at every safepoint, and panics, tracebacks, and
runtime.Callers all walk it. That is why it survives -s -w stripping when DWARF
and the symbol table do not, and why a stripped Go binary still hands a reverse
engineer function names, boundaries, and file/line information that a stripped C
binary never would. It is Go’s great gift to the analyst, and the thing an obfuscator
most wants to take back.
It begins with a pcHeader: a 4-byte magic identifying the table format, then two
padding bytes, minLC and ptrSize, the function count, and offsets to the sub-tables. Those are
the funcnametab (a blob of NUL-terminated function names), the functab (an
entry-PC-sorted index of _func records, each pointing at its name and its
pc→{sp,file,line} programs), and the file and pc-value tables. moduledata.pcHeader
points at the whole thing, and (this matters later) the runtime validates that
header at startup. The magic also pins the format era: it has stepped from
0xfffffffa (Go 1.16-17) to 0xfffffff0 (1.18-19) to 0xfffffff1 (1.20+) as the
layout changed, which is exactly why analysis tools key on it to decide whether a
file is Go and how to parse it. Two facts the rest of this post leans on: the runtime
depends on this table, and most Go RE tooling bootstraps from its magic.
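
To make the magic-keyed bootstrap concrete, here is a minimal C sketch of the lookup such a tool performs. It is an illustration, not any particular tool's code: the function names are hypothetical, only the three magics listed above are covered, and it assumes a little-endian target (big-endian targets store the magic in their own byte order).

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value regardless of host byte order. */
uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Map a pcHeader magic to the format era listed in the text.
 * Returns NULL for any value outside that list. */
const char *pclntab_era(uint32_t magic)
{
    switch (magic) {
    case 0xfffffffau: return "go1.16-1.17";
    case 0xfffffff0u: return "go1.18-1.19";
    case 0xfffffff1u: return "go1.20+";
    default:          return NULL;
    }
}

/* Hypothetical helper: era of a table whose first bytes sit at `tab`.
 * Assumes a little-endian target. */
const char *pclntab_era_at(const uint8_t *tab, size_t len)
{
    if (tab == NULL || len < 4)
        return NULL;
    return pclntab_era(rd32le(tab));
}
```

Anything that returns NULL here is, to a tool built this way, simply not Go.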
Layer one: garble
The decisive test for garble is which
names survive. garble rewrites your source before building, replacing identifiers
with short base64 hashes. Modern garble defaults to a broad GOGARBLE=* scope, but
it still does not obfuscate the runtime or its dependency closure, and it must
preserve some package paths the toolchain and linker treat specially. In these
samples that leaves a very particular split. Counting readable symbol strings in
one of the Linux samples:
| Package class | Readable symbols |
|---|---|
| runtime.* | 208 |
| internal/abi.* | 57 |
| main.* | 0 |
| go_<something>*, golang.org/x/net.*, github.com/* | 0 |
Runtime and runtime-adjacent stdlib intact; the application and non-stdlib
dependencies gone. That is the garble signature, and the rest of the picture fits:
go version returns unknown
(garble removes “all build, module, and debug information”: the .go.buildinfo
module path, dependency list, and settings are stripped), and the only source paths
that leak are stdlib ones like internal/abi/type.go; the app’s positions are
hashed away.
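
A count like the one in the table can be produced with a simple prefix classifier. The sketch below is one hypothetical way to do it in C; the enum, the function names, and the choice to lump everything unrecognized (including hashed names) into a single "other" bucket are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

enum pkg_class {
    PKG_RUNTIME,       /* runtime.*                      */
    PKG_INTERNAL_ABI,  /* internal/abi.*                 */
    PKG_MAIN,          /* main.*                         */
    PKG_THIRD_PARTY,   /* golang.org/x/..., github.com/... */
    PKG_OTHER,         /* anything else, hashed names included */
    PKG_CLASS_COUNT
};

/* True if `name` starts with `prefix`. */
int has_prefix(const char *name, const char *prefix)
{
    return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Classify one symbol name by its package prefix. */
enum pkg_class classify_symbol(const char *name)
{
    if (has_prefix(name, "runtime."))      return PKG_RUNTIME;
    if (has_prefix(name, "internal/abi.")) return PKG_INTERNAL_ABI;
    if (has_prefix(name, "main."))         return PKG_MAIN;
    if (has_prefix(name, "golang.org/x/") ||
        has_prefix(name, "github.com/"))   return PKG_THIRD_PARTY;
    return PKG_OTHER;
}

/* Tally `n` names into counts[0..PKG_CLASS_COUNT). */
void count_symbols(const char *const *names, size_t n,
                   size_t counts[PKG_CLASS_COUNT])
{
    memset(counts, 0, PKG_CLASS_COUNT * sizeof counts[0]);
    for (size_t i = 0; i < n; i++)
        counts[classify_symbol(names[i])]++;
}
```

On a garbled sample the runtime and internal/abi buckets fill up while the main and third-party buckets stay at zero, which is the split the table shows.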
There is even a control. A sibling sample in the same corpus is the un-garbled
build of what looks like the same family: valid magic, go1.19.2, six readable
main.* symbols, seventy-seven go_<something> strings, and an intact dependency list
(github.com/denisbrodbeck/machineid, golang.org/x/net, go_<something>). Same
code, one build run through garble and one not. The dependency graph that names the
thing outright survives only in the clean build; in the garbled one it is gone.
Layer two: a renamed magic number (also garble)
The thing that actually tripped my detector is the same tool’s other half: what
garble does to the pclntab itself. That pcHeader magic, 0xfffffff1 for Go
1.20+, is in these samples something else:

    .gopclntab @ 0xe35920:
    f3 33 18 d2 00 00 01 08 e6 5c 00 00 ...
    ^^^^^^^^^^^ magic = 0xd21833f3

0xd21833f3 is not a magic Go has ever used. But the rest of the header is textbook:
pad1=0, pad2=0, minLC=1, ptrSize=8, a sane nfunc. Four bytes changed,
everything else left alone.
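
The "textbook" claim can be checked mechanically. The following C sketch decodes only the bytes visible in the dump and tests the fields while deliberately ignoring the magic. The struct, function names, and the upper bound on nfunc are hypothetical; it assumes a little-endian target and reads just the low 32 bits of the pointer-sized nfunc field, since the dump elides the rest.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sanity cap on the function count; not a Go constant. */
#define PCHDR_MAX_NFUNC (1u << 24)

/* The part of pcHeader that is fully visible in the dump. */
struct pchdr_prefix {
    uint32_t magic;
    uint8_t  pad1, pad2;
    uint8_t  min_lc;    /* instruction quantum, 1 on x86 */
    uint8_t  ptr_size;  /* 4 or 8 */
    uint32_t nfunc_lo;  /* low 32 bits of the pointer-sized nfunc */
};

/* Read a little-endian 32-bit value regardless of host byte order. */
uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the first 12 header bytes. Returns 0 on success, -1 if short. */
int pchdr_decode(const uint8_t *b, size_t len, struct pchdr_prefix *out)
{
    if (b == NULL || out == NULL || len < 12)
        return -1;
    out->magic    = rd32le(b);
    out->pad1     = b[4];
    out->pad2     = b[5];
    out->min_lc   = b[6];
    out->ptr_size = b[7];
    out->nfunc_lo = rd32le(b + 8);
    return 0;
}

/* Structure-only plausibility test: the magic is not consulted at all. */
int pchdr_plausible(const struct pchdr_prefix *h)
{
    int quantum_ok = h->min_lc == 1 || h->min_lc == 2 || h->min_lc == 4;
    int ptr_ok     = h->ptr_size == 4 || h->ptr_size == 8;
    int nfunc_ok   = h->nfunc_lo > 0 && h->nfunc_lo <= PCHDR_MAX_NFUNC;
    return h->pad1 == 0 && h->pad2 == 0 && quantum_ok && ptr_ok && nfunc_ok;
}
```

Fed the twelve bytes above, this decodes magic 0xd21833f3, minLC 1, ptrSize 8, and a low nfunc word of 0x5ce6, and reports the header as plausible.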
That should not run. The relevant Go 1.20/1.21-era runtime validates the header on
startup, in runtime.moduledataverify1 (src/runtime/symtab.go):

    if hdr.magic != 0xfffffff1 || hdr.pad1 != 0 || hdr.pad2 != 0 ||
        hdr.minLC != sys.PCQuantum || hdr.ptrSize != goarch.PtrSize ||
        hdr.textStart != datap.text {
        throw("invalid function symbol table")
    }

Newer Go releases moved textStart out of pcHeader, but the runtime still checks
the magic, padding, instruction quantum, and pointer size before continuing. Under a
stock runtime, a header whose magic is 0xd21833f3 aborts before main. But this is live
malware, so the value its runtime compares against cannot be the constant in the
upstream source. And indeed,
those four bytes occur in exactly two places in the file: the pcHeader, and one
location in .text. Disassembling the second:

    44bd1a: 81 3a f3 33 18 d2    cmpl $0xd21833f3,(%rdx)   ; pcHeader.magic ...
    44bd20: 0f 85 ...            jne ...                   ; ... vs 0xd21833f3

That is moduledataverify1 itself, with the comparison immediate patched to match.
The data and the check were renamed together, so the binary is internally consistent
and runs normally. And the magic differs per platform, all high-entropy:
0xd21833f3 (Linux x64), 0x5575de9a (Linux x86), 0xe9f349cb (macOS x64),
0x3f8b2d21 (macOS arm64).
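
The "exactly two places" observation comes from a plain byte search. Below is a small C sketch of that search, plus a check for whether a hit is the immediate of a `cmpl $imm32,(%rdx)`. Both function names are hypothetical, the magic is assumed to be stored little-endian, and the `81 3a` opcode test matches only the encoding seen in this sample; another build could use a different register.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Record up to `max` offsets at which the little-endian encoding of
 * `magic` appears in buf[0..len). Returns the total number of hits,
 * which may exceed `max`. */
size_t find_magic(const uint8_t *buf, size_t len, uint32_t magic,
                  size_t *offsets, size_t max)
{
    const uint8_t pat[4] = {
        (uint8_t)(magic & 0xff),         (uint8_t)((magic >> 8) & 0xff),
        (uint8_t)((magic >> 16) & 0xff), (uint8_t)((magic >> 24) & 0xff)
    };
    size_t hits = 0;

    for (size_t i = 0; i + sizeof pat <= len; i++) {
        if (memcmp(buf + i, pat, sizeof pat) == 0) {
            if (offsets != NULL && hits < max)
                offsets[hits] = i;
            hits++;
        }
    }
    return hits;
}

/* True if the hit at `off` is preceded by the opcode bytes 81 3a,
 * i.e. it is the immediate of `cmpl $imm32,(%rdx)`. */
int is_cmpl_rdx_imm(const uint8_t *buf, size_t off)
{
    return off >= 2 && buf[off - 2] == 0x81 && buf[off - 1] == 0x3a;
}
```

One hit in the pclntab section and one hit that passes the opcode test is the pattern to look for: a table and a validator renamed in lockstep.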
That is not a guess. It is exactly what garble does, and the source says so. garble
rewrites the runtime’s magic constant Go120PCLnTabMagic in internal/abi/symtab.go
(updateMagicValue) and patches the linker’s generatePCHeader to write the same
value into the header. The value is magicValue(): a hash of the user’s -seed (or,
failing that, the runtime package’s build action ID), so it is deterministic per
build and different on every target, which is precisely the per-platform spread above.
The CHANGELOG lists it plainly: “Randomize the magic number header in pclntab”
(#622). garble does more to the same table: it XOR-encrypts each funcInfo.entryoff
with its nameOff (always on, since #641), decrypting only inside the runtime’s
entry() method. With -tiny it also blanks every unexported function name in
the funcnametab to one shared empty string. The README presents garble mostly as
source and metadata obfuscation; the pclntab surgery lives in its linker patches
and changelog entries rather than in a standalone obfuscation document.
How much does the rename actually hide?
This is where my first read was wrong, and the honest answer is more interesting than
the one I expected. The magic rename does not make the binary disappear from Go
tooling; it depends entirely on how a tool finds the pclntab:
- Tools that scan for the magic (or strictly re-validate it after locating the
table) are defeated. That includes my detector’s original magic-keyed path and
Go’s debug/gosym line-table parser, which is used by Go’s own objdump path for
pclntab-derived file and line data.
- Tools with alternate pclntab discovery or repair paths are not. GoReSym still
recovers 3732 functions and reads TabMeta 1.20; current GoReSym versions also
try to locate the table from runtime_modulesinit and repair a modified magic.
(go tool nm does fail here, but on “no symbol section”, i.e. the stripped
.symtab, not the magic. go version returns unknown because garble stripped the
build info. Neither failure is the magic’s doing.) So the rename is a narrow trick:
it blinds magic-scanners and breaks naïve “is this Go?” bootstrapping, but the moment
a tool keys on structure or section names it walks right in.
That is also the fix. My detector now validates the pcHeader’s structure
(pad/minLC/ptrSize/nfunc ranges) instead of trusting the magic, and when the
magic is non-standard but the structure holds, it assumes the 1.20+ layout and walks
the tables anyway. On the tampered sample:

    Consistency         : WARN -- non-standard pclntab magic (possibly modified/obfuscated)
    Version floor       : >= go1.21
    funcnametab names   : 5013 e.g. internal/abi.Kind.String, ...
    functions (functab) : 23782 e.g. internal/abi.Kind.String @ 0x402440, ...
    source files        : 161 e.g. internal/abi/type.go, ...
    moduledata          : located

The recovered addresses are real: 0x402440 disassembles to a Go prologue inside
.text, and the entries come back sorted. That tells me this sample predates garble’s
entryoff encryption: a current-garble build XOR-encrypts each entryoff, so the
names would still recover (the encryption keys off nameOff, leaving nameOff and
funcoff alone) but the addresses would be scrambled until you reverse the XOR. Here
they aren’t, so I get both.
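
The sortedness test is cheap to automate. This C sketch takes the entry offsets as recovered from each _func record, in functab order, and reports whether they ever decrease. The function name is hypothetical, and the result is a heuristic rather than proof: an ascending sequence suggests the offsets are stored in the clear, while inversions suggest they have been transformed.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if entryoff[0..n) never decreases, 0 at the first inversion.
 * The caller is assumed to have read these values from the _func records
 * in functab order. An empty or single-element list counts as sorted. */
int entryoffs_sorted(const uint32_t *entryoff, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (entryoff[i] < entryoff[i - 1])
            return 0;
    }
    return 1;
}
```

A detector could run this right after walking the tables and, on failure, still report the names while flagging the addresses as untrustworthy.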
But note what comes back even so: runtime and surviving stdlib symbols, their source
paths, the function table, a version floor. The application’s own names are garble
hashes: one-way SHA256(salt ++ seed ++ name) truncated to a few chars, with the
salt kept out of the binary, so they don’t reverse without the original build. And
the dependency list is simply not there. garble took those before the magic was ever
touched. Undoing the rename gets you the scaffolding; it does not undo garble.
What I take from it
garble is a real pain to work against. The name hashing is one-way
and salted out of the binary, the metadata is gone, the dependency graph that would
name the family outright is gone, and on a current build the entryoff encryption
scrambles the addresses too. None of that comes back with a hex editor. An analyst
who only has the garbled binary has genuinely lost the things that make Go pleasant
to reverse, and that loss happened at build time, in the source, long before the
renamed magic was ever written.
But here is the asymmetry that makes a garbled Go binary still a softer target than a stripped C one: a Go program cannot throw everything away, because the runtime needs its own internal structures to function. The pclntab has to be a real pclntab: the stack grower needs frame sizes, the GC needs pointer maps, every panic and traceback walks the function table, so it survives, magic rename and all. That is why undoing the rename hands you back the runtime, the surviving stdlib symbols and their source paths, the function table, the boundaries, and a version floor. garble can rename the magic, but it can’t delete the table, because the binary it ships would no longer run. A stripped non-Go binary owes the runtime nothing comparable; strip it and the information is simply not there to recover.
Don’t gate Go detection on the magic; key on the structure of the pcHeader, which has to be intact for the binary to execute, and you walk straight past the rename into everything the runtime is obliged to keep. What you still won’t get is whatever garble took at the source: the app’s own names, the metadata, the dependency list. For those you are back to an un-garbled sibling build, if you are lucky enough to have one in the same corpus. garble costs the analyst plenty; it just can’t cost them the scaffolding the runtime refuses to live without.