Introduction to Controlflow Graph Analysis
 by Tim Blazytko
 15 March, 2021
This article is a repost. It was originally published at Tim Blazytko’s personal blog: https://synthesis.to/2021/03/15/control_flow_analysis.html.
Following my last blog post, I got a lot of questions about additional material on controlflow analysis. While most compiler books (such as the Dragon Book) cover related topics indepth, I decided to publish my own presentation. The slide deck illustrates the theory of controlflow graph construction, dominance relations and loop analysis. In the second part of this post, I would like to show you how to play around with these concepts using the reverse engineering framework Miasm.

Presentation on controlflow analysis

Code for graph analysis in Miasm
Graph Analysis in Miasm
For playing around with controlflow graphs, dominance relations and loop detection, we take the following graph as an example:
As described in my previous blog post on automated detection of controlflow flattening, this graph has an entry node a
and a loop between c
, e
and g
. To automatically determine all entries, dominance relations and loops, we can use Miasm’s DiGraph
class:
from miasm.core.graph import DiGraph
# define edges
edges = [
("a", "b"),
("a", "c"),
("b", "d"),
("c", "d"),
("c", "e"),
("d", "f"),
("e", "f"),
("e", "g"),
("g", "c"),
]
# init graph
g = DiGraph()
# add edges
for x, y in edges:
g.add_edge(x, y)
# walk over graph entries
for entry in g.heads():
# dominators
dominators = g.compute_dominators(entry)
# dominator tree
dominator_tree = g.compute_dominator_tree(entry)
# natural loops
loops = g.compute_natural_loops(entry)
First, we initialize a DiGraph
instance and add all edges to the graph. Then, we walk over all graph entry nodes and compute the dominance relations, the dominator tree and all loops in the graph. With some additional code for pretty printing (omitted here due to readability), we get the following output:
Graph entry: a
Dominators:
a: a
b: a, b
c: a, c
d: a, d
e: a, c, e
f: a, f
g: a, c, e, g
Dominator Tree:
a > b
a > c
a > d
a > f
c > e
e > g
Natural Loops
g > c: {a, c, e, g}
To apply these techniques to realworld binaries, we just wrap around some disassembling logic:
# init symbol table
loc_db = LocationDB()
# open the binary for analysis
container = Container.from_stream(open(file_path, 'rb'), loc_db)
# cpu abstraction
machine = Machine(container.arch)
# init disassemble engine
mdis = machine.dis_engine(container.bin_stream, loc_db=loc_db)
# disassemble the function at address and get CFG
asm_cfg = mdis.dis_multiblock(start_address)
The code takes a binary as input (provided via file_path
), initializes the disassembly engine and disassembles a function at start_address
. Then, the resulting asm_cfg
is a DiGraph
instance holding the function’s controlflow graph. The entire script that reimplements the heuristic to detect controlflow flattening in Miasm can be found here.