Cpg
ScubaTrace can read Joern cpg.bin files and expose them as an in-memory
code property graph. The graph keeps Joern node labels and properties while
adding Python helpers for common navigation tasks, such as finding methods,
walking edges, resolving call targets, and locating the method that contains a
source position.
Loading a Cpg
Use scubatrace.Cpg.load() when you already have a Joern FlatGraph file:
import scubatrace
cpg = scubatrace.Cpg.load("path/to/cpg.bin")
print(cpg.node_count)
print(cpg.edge_count)
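Paths to Joern output are easy to mistype, so it can help to check the file before loading it. The sketch below wraps the scubatrace.Cpg.load() call from the example above in a hypothetical helper, load_cpg, which is not part of ScubaTrace. It assumes load() accepts a string path, as shown above.

```python
from pathlib import Path


def load_cpg(path):
    """Hypothetical helper (not part of ScubaTrace): check the path, then load."""
    cpg_path = Path(path)
    if not cpg_path.is_file():
        raise FileNotFoundError(f"No Joern cpg.bin file at {cpg_path}")
    # Imported lazily so the path check still runs where ScubaTrace is missing.
    import scubatrace

    # Assumption: Cpg.load takes a string path, as in the example above.
    return scubatrace.Cpg.load(str(cpg_path))
```

Calling load_cpg("path/to/cpg.bin") then behaves like the direct call, but fails early with a readable message when the file is absent.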
When a project is created with scubatrace.JoernConfig, ScubaTrace also
stores the loaded graph on the project:
import scubatrace
project = scubatrace.Project.create(
    "path/to/code",
    language=scubatrace.language.C,
    joern_config=scubatrace.JoernConfig(),
)
cpg = project.cpg
Querying Nodes
Nodes are keyed by (label, sequence) tuples. For ad-hoc exploration, use
label-based helpers or Joern-style step names:
methods = cpg.nodes_by_label("METHOD")
calls = cpg.nodes_by_label("CALL")
# Dynamic CPG node steps are also available.
methods = cpg.method
calls = cpg.call
main = cpg.find_method("main")
matching = cpg.find_methods(".*Controller.*", regex=True)
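To make the (label, sequence) key shape concrete, here is a small stand-alone sketch that groups such a mapping by label. It uses plain dictionaries in place of CpgNode objects, and group_by_label is a hypothetical name; this illustrates the key layout only and is not a description of how nodes_by_label() is implemented.

```python
from collections import defaultdict

# Stand-in data: plain dicts take the place of CpgNode objects.
nodes = {
    ("METHOD", 0): {"NAME": "main"},
    ("METHOD", 1): {"NAME": "helper"},
    ("CALL", 0): {"NAME": "helper"},
}


def group_by_label(node_map):
    """Index a (label, sequence)-keyed mapping by label, ordered by sequence."""
    grouped = defaultdict(list)
    for label, seq in sorted(node_map):
        grouped[label].append(node_map[(label, seq)])
    return dict(grouped)


by_label = group_by_label(nodes)
print([node["NAME"] for node in by_label["METHOD"]])  # ['main', 'helper']
```

Because the label is the first element of each key, sorting the keys keeps nodes of the same label together and in sequence order.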
CPG properties can be read from the raw properties mapping, with
scubatrace.CpgNode.get(), or through lower-case Python attributes for
properties defined by the CPG schema:
method = cpg.find_method("main")
if method is not None:
    print(method["NAME"])
    print(method.get("FULL_NAME"))
    print(method.full_name)
Call Relationships
For METHOD nodes, scubatrace.CpgNode.callers and
scubatrace.CpgNode.callees return scubatrace.cpg.MethodCall
objects. Each relationship contains the caller method, callee method, and the
callsite node. Unresolved calls keep callee as None.
method = cpg.find_method("main")
if method is not None:
    for relation in method.callees:
        callee_name = relation.callee.full_name if relation.callee else "<unresolved>"
        location = relation.callsite_location
        print(callee_name, location.filename, location.line_number)
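The callees relationship can also be followed transitively. The sketch below is a breadth-first walk that collects every method reachable from a starting method, skipping unresolved calls. reachable_methods is a hypothetical helper, and the stand-in objects only mimic the callees, callee, and full_name attributes described above; it assumes full_name identifies a method uniquely.

```python
from collections import deque
from types import SimpleNamespace


def reachable_methods(root):
    """Return full names of methods reachable from root via resolved calls.

    The root itself is excluded, even if it is recursive.
    """
    seen = {root.full_name}
    queue = deque([root])
    while queue:
        method = queue.popleft()
        for relation in method.callees:
            callee = relation.callee
            if callee is None:  # unresolved call: nothing to follow
                continue
            if callee.full_name not in seen:
                seen.add(callee.full_name)
                queue.append(callee)
    seen.discard(root.full_name)
    return seen


# Stand-ins that mimic METHOD nodes and MethodCall objects.
def fake_method(full_name):
    return SimpleNamespace(full_name=full_name, callees=[])


def fake_call(caller, callee):
    caller.callees.append(SimpleNamespace(caller=caller, callee=callee))


main, parse, log = fake_method("main"), fake_method("parse"), fake_method("log")
fake_call(main, parse)
fake_call(parse, log)
fake_call(parse, main)  # cycle back to the root
fake_call(log, None)    # unresolved call
print(sorted(reachable_methods(main)))  # ['log', 'parse']
```

The seen set is what keeps the walk from looping on recursive or mutually recursive methods.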
Source Locations
Use scubatrace.Cpg.methods_at() or scubatrace.Cpg.method_at() to map
a source location back to the smallest matching CPG method:
method = cpg.method_at("src/main.c", 42)
exact = cpg.method_at("src/main.c", 42, column_number=8)
The returned nodes expose scubatrace.CpgNode.location, which normalizes
file, line, column, and byte-offset fields into a
scubatrace.SourceLocation object.
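One plausible reading of "smallest matching" is the method whose line range encloses the position and spans the fewest lines, which matters when methods are nested (lambdas, local functions). The sketch below shows that selection rule on stand-in data. Span and smallest_enclosing are hypothetical names, only the line_number and line_number_end fields mirror SourceLocation, and ScubaTrace's actual tie-breaking may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Stand-in carrying a name plus two SourceLocation-style line fields."""
    name: str
    line_number: int
    line_number_end: int


def smallest_enclosing(spans, line) -> Optional[Span]:
    """Pick the span that contains `line` and covers the fewest lines."""
    matches = [s for s in spans if s.line_number <= line <= s.line_number_end]
    return min(
        matches,
        key=lambda s: s.line_number_end - s.line_number,
        default=None,
    )


spans = [
    Span("outer", 10, 60),
    Span("inner_lambda", 40, 45),
    Span("other", 70, 90),
]
print(smallest_enclosing(spans, 42).name)  # inner_lambda
print(smallest_enclosing(spans, 65))       # None
```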
NetworkX Export
Use scubatrace.Cpg.to_networkx() to convert the Cpg into a
networkx.MultiDiGraph. Node attributes include the Cpg label and all node
properties. Edge attributes include the edge label and the optional edge property.
graph = cpg.to_networkx()
- class scubatrace.Cpg(nodes: Mapping[tuple[str, int], CpgNode], edges: Iterable[CpgEdge], manifest: Mapping[str, Any] | None = None)
  Bases: object
- class scubatrace.CpgNode(id: NodeId, label: str, seq: int, properties: dict[str, Any] = <factory>)
  Bases: object
- class scubatrace.CpgEdge(src: NodeId, dst: NodeId, label: str, property: Any = None)
  Bases: object
- class scubatrace.cpg.MethodCall(caller: CpgNode | None, callee: CpgNode | None, callsite: CpgNode)
  Bases: object
- class scubatrace.SourceLocation(filename: str | None, line_number: int | None, column_number: int | None, line_number_end: int | None = None, column_number_end: int | None = None, offset: int | None = None, offset_end: int | None = None)
  Bases: object
- class scubatrace.cpg.FlatGraphReader(path: str | Path)
  Bases: object