Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

How Phase 3 Works

This chapter explains the technical implementation of automatic aspect weaving in aspect-rs Phase 3, diving deep into the MIR extraction pipeline and compiler integration.

Overview

Phase 3 transforms aspect-rs from annotation-based to fully automatic aspect weaving:

User writes clean code → aspect-rustc-driver analyzes → Aspects applied automatically

No #[aspect] annotations required. The compiler does everything automatically based on pointcut expressions.

The MIR Extraction Pipeline

What is MIR?

MIR (Mid-level Intermediate Representation) is Rust’s intermediate compilation stage:

Source Code → AST → HIR → MIR → LLVM IR → Machine Code
                            ↑
                     We analyze here

Why MIR instead of AST?

  • Type information fully resolved
  • Macros expanded
  • Generic parameters known
  • Trait bounds resolved
  • More reliable than AST parsing

Accessing MIR via TyCtxt

The compiler provides TyCtxt (Type Context) for MIR access:

#![allow(unused)]
fn main() {
fn analyze_crate_with_aspects(tcx: TyCtxt<'_>, (): ()) {
    // tcx gives access to ALL compiler information
    
    // Access HIR (High-level IR)
    let hir = tcx.hir();
    
    // Iterate all items in crate
    for item_id in hir.items() {
        let item = hir.item(item_id);
        
        // Access function signatures
        if let ItemKind::Fn(sig, generics, body_id) = &item.kind {
            // Extract metadata...
        }
    }
}
}

TyCtxt provides:

  • Complete type information
  • Function signatures
  • Module structure
  • Source locations
  • MIR bodies (when available)
  • Trait implementations

The Challenge: Function Pointers Required

rustc query providers MUST be static functions (not closures):

#![allow(unused)]
fn main() {
// ❌ DOESN'T WORK: Closures not allowed
config.override_queries = Some(|_sess, providers| {
    let config = self.config.clone();  // Capture!
    providers.analysis = move |tcx, ()| {
        // Can't capture 'config'
    };
});

// ✅ WORKS: Function pointer with global state
config.override_queries = Some(|_sess, providers| {
    providers.analysis = analyze_crate_with_aspects;  // Function pointer
});

fn analyze_crate_with_aspects(tcx: TyCtxt<'_>, (): ()) {
    // Access config from global state
    let config = CONFIG.lock().unwrap().clone().unwrap();
}
}

The solution: Use global state with Mutex for thread safety.

Step-by-Step Execution Flow

Step 1: Driver Initialization

// aspect-rustc-driver/src/main.rs
fn main() {
    let args: Vec<String> = std::env::args().collect();
    
    // Parse aspect-specific arguments
    let (aspect_args, rustc_args) = parse_args(&args);
    
    // Extract pointcut expressions from --aspect-pointcut flags
    let pointcuts = extract_pointcuts(&aspect_args);
    
    // Store in global config
    *CONFIG.lock().unwrap() = Some(AspectConfig {
        pointcuts,
        verbose: aspect_args.contains(&"--aspect-verbose".to_string()),
        output_file: find_output_file(&aspect_args),
    });
    
    // Create callbacks
    let mut callbacks = AspectCallbacks::new();
    
    // Run compiler with our callbacks
    let exit_code = RunCompiler::new(&rustc_args, &mut callbacks).run();
    
    std::process::exit(exit_code.unwrap_or(1));
}

Step 2: Compiler Configuration

#![allow(unused)]
fn main() {
impl Callbacks for AspectCallbacks {
    fn config(&mut self, config: &mut interface::Config) {
        // Override the analysis query
        config.override_queries = Some(override_queries);
    }
}

fn override_queries(_session: &Session, providers: &mut Providers) {
    // Replace standard analysis with our custom version
    providers.analysis = analyze_crate_with_aspects;
}
}

This intercepts compilation at the analysis phase, after type checking completes.

Step 3: MIR Extraction

#![allow(unused)]
fn main() {
fn analyze_crate_with_aspects(tcx: TyCtxt<'_>, (): ()) {
    // 1. Retrieve config from global state
    let config = CONFIG.lock().unwrap().clone().unwrap();
    
    // 2. Create MIR analyzer
    let analyzer = MirAnalyzer::new(tcx, config.verbose);
    
    // 3. Extract all functions
    let functions = analyzer.extract_all_functions();
    
    // 4. Match pointcuts
    let matcher = PointcutMatcher::new(config.pointcuts);
    let matches = matcher.match_all(&functions);
    
    // 5. Store results
    *RESULTS.lock().unwrap() = Some(AnalysisResults {
        functions,
        matches,
    });
}
}

Step 4: Function Metadata Extraction

The MirAnalyzer extracts comprehensive function metadata:

#![allow(unused)]
fn main() {
pub struct MirAnalyzer<'tcx> {
    tcx: TyCtxt<'tcx>,
    verbose: bool,
}

impl<'tcx> MirAnalyzer<'tcx> {
    pub fn extract_all_functions(&self) -> Vec<FunctionMetadata> {
        let mut functions = Vec::new();
        
        // Iterate all items in HIR
        for item_id in self.tcx.hir().items() {
            let item = self.tcx.hir().item(item_id);
            
            match &item.kind {
                ItemKind::Fn(sig, generics, body_id) => {
                    // Extract function metadata
                    let metadata = self.extract_function_metadata(item, sig);
                    functions.push(metadata);
                }
                ItemKind::Mod(_) => {
                    // Recurse into modules
                    // (handled by HIR iteration)
                }
                _ => {
                    // Skip non-function items
                }
            }
        }
        
        functions
    }
    
    fn extract_function_metadata(
        &self,
        item: &Item<'tcx>,
        sig: &FnSig<'tcx>
    ) -> FunctionMetadata {
        // Get function name
        let simple_name = item.ident.to_string();
        
        // Get fully qualified name
        let def_id = item.owner_id.to_def_id();
        let qualified_name = self.tcx.def_path_str(def_id);
        
        // Get module path
        let module_path = qualified_name
            .rsplitn(2, "::")
            .nth(1)
            .unwrap_or("crate")
            .to_string();
        
        // Check visibility
        let visibility = match self.tcx.visibility(def_id) {
            Visibility::Public => VisibilityKind::Public,
            _ => VisibilityKind::Private,
        };
        
        // Check async status
        let is_async = sig.header.asyncness == IsAsync::Async;
        
        // Get source location
        let span = item.span;
        let source_map = self.tcx.sess.source_map();
        let location = if let Ok(loc) = source_map.lookup_line(span.lo()) {
            Some(SourceLocation {
                file: loc.file.name.prefer_remapped().to_string(),
                line: loc.line + 1,
            })
        } else {
            None
        };
        
        FunctionMetadata {
            simple_name,
            qualified_name,
            module_path,
            visibility,
            is_async,
            location,
        }
    }
}
}

Extracted data for each function:

  • Simple name: fetch_data
  • Qualified name: crate::api::fetch_data
  • Module path: crate::api
  • Visibility: Public/Private
  • Async status: true/false
  • Source location: src/api.rs:42

Step 5: Pointcut Matching

The PointcutMatcher evaluates pointcut expressions against functions:

#![allow(unused)]
fn main() {
pub struct PointcutMatcher {
    pointcuts: Vec<Pointcut>,
}

impl PointcutMatcher {
    pub fn match_all(&self, functions: &[FunctionMetadata]) -> Vec<MatchResult> {
        let mut results = Vec::new();
        
        for func in functions {
            let mut matched_pointcuts = Vec::new();
            
            for pointcut in &self.pointcuts {
                if self.evaluate_pointcut(pointcut, func) {
                    matched_pointcuts.push(pointcut.clone());
                }
            }
            
            if !matched_pointcuts.is_empty() {
                results.push(MatchResult {
                    function: func.clone(),
                    pointcuts: matched_pointcuts,
                });
            }
        }
        
        results
    }
    
    fn evaluate_pointcut(
        &self,
        pointcut: &Pointcut,
        func: &FunctionMetadata
    ) -> bool {
        match pointcut {
            Pointcut::Execution(pattern) => {
                self.matches_execution(pattern, func)
            }
            Pointcut::Within(module) => {
                self.matches_within(module, func)
            }
            Pointcut::Call(name) => {
                self.matches_call(name, func)
            }
        }
    }
    
    fn matches_execution(
        &self,
        pattern: &str,
        func: &FunctionMetadata
    ) -> bool {
        // Parse pattern: "pub fn *(..)"
        let parts: Vec<&str> = pattern.split_whitespace().collect();
        
        // Check visibility
        if parts.contains(&"pub") && func.visibility != VisibilityKind::Public {
            return false;
        }
        
        // Wildcard matches any name
        if parts.contains(&"*") {
            return true;
        }
        
        // Name matching
        if let Some(name_part) = parts.iter().find(|p| !["pub", "fn", "(..)"].contains(p)) {
            return func.simple_name == *name_part;
        }
        
        false
    }
    
    fn matches_within(
        &self,
        module: &str,
        func: &FunctionMetadata
    ) -> bool {
        // Check if function is within specified module
        func.module_path.contains(module) ||
        func.qualified_name.starts_with(&format!("crate::{}", module))
    }
}
}

Pointcut evaluation examples:

#![allow(unused)]
fn main() {
// "execution(pub fn *(..))" matches:
✓ pub fn fetch_user(id: u64) -> User
✓ pub async fn save_user(user: User) -> Result<()>
✗ fn internal_helper() -> ()  // Not public

// "within(api)" matches:
✓ api::fetch_data()
✓ api::process_data()
✗ internal::helper()  // Different module
}

Step 6: Results Storage and Reporting

#![allow(unused)]
fn main() {
fn analyze_crate_with_aspects(tcx: TyCtxt<'_>, (): ()) {
    // ... extraction and matching ...
    
    // Store results in global state
    *RESULTS.lock().unwrap() = Some(AnalysisResults {
        total_functions: functions.len(),
        matched_functions: matches.len(),
        functions: functions.clone(),
        matches,
    });
    
    // Print summary if verbose
    if config.verbose {
        print_analysis_summary(&functions, &matches);
    }
    
    // Write report to file
    if let Some(output_file) = &config.output_file {
        write_analysis_report(output_file, &functions, &matches);
    }
}

fn print_analysis_summary(
    functions: &[FunctionMetadata],
    matches: &[MatchResult]
) {
    println!("=== Analysis Statistics ===");
    println!("Total functions: {}", functions.len());
    
    let public_count = functions.iter()
        .filter(|f| f.visibility == VisibilityKind::Public)
        .count();
    println!("  Public: {}", public_count);
    
    let private_count = functions.len() - public_count;
    println!("  Private: {}", private_count);
    
    println!("\n=== Pointcut Matching ===");
    for match_result in matches {
        println!("  ✓ Matched: {}", match_result.function.qualified_name);
        for pointcut in &match_result.pointcuts {
            println!("    Pointcut: {:?}", pointcut);
        }
    }
}
}

Complete Example Walkthrough

Let’s trace a complete execution with example code:

Input Code (test_input.rs)

#![allow(unused)]
fn main() {
pub fn public_function(x: i32) -> i32 {
    x + 1
}

fn private_function() {
    println!("private");
}

pub mod api {
    pub fn fetch_data() -> String {
        "data".to_string()
    }
    
    fn internal_helper() {
        println!("internal");
    }
}
}

Execution

$ aspect-rustc-driver \
    --aspect-verbose \
    --aspect-pointcut "execution(pub fn *(..))" \
    --aspect-pointcut "within(api)" \
    test_input.rs --crate-type lib

Step-by-Step Processing

1. Parse arguments:

#![allow(unused)]
fn main() {
pointcuts = [
    "execution(pub fn *(..))",
    "within(api)",
]
verbose = true
output_file = None
}

2. Configure compiler:

#![allow(unused)]
fn main() {
config.override_queries = Some(override_queries);
// providers.analysis = analyze_crate_with_aspects
}

3. Extract functions from MIR:

#![allow(unused)]
fn main() {
functions = [
    FunctionMetadata {
        simple_name: "public_function",
        qualified_name: "crate::public_function",
        module_path: "crate",
        visibility: Public,
        is_async: false,
        location: Some("test_input.rs:1"),
    },
    FunctionMetadata {
        simple_name: "private_function",
        qualified_name: "crate::private_function",
        module_path: "crate",
        visibility: Private,
        is_async: false,
        location: Some("test_input.rs:5"),
    },
    FunctionMetadata {
        simple_name: "fetch_data",
        qualified_name: "crate::api::fetch_data",
        module_path: "crate::api",
        visibility: Public,
        is_async: false,
        location: Some("test_input.rs:10"),
    },
    FunctionMetadata {
        simple_name: "internal_helper",
        qualified_name: "crate::api::internal_helper",
        module_path: "crate::api",
        visibility: Private,
        is_async: false,
        location: Some("test_input.rs:14"),
    },
]
}

4. Match pointcuts:

For "execution(pub fn *(..)):":

  • public_function - public, matches
  • private_function - not public
  • fetch_data - public, matches
  • internal_helper - not public

For "within(api)":

  • public_function - not in api module
  • private_function - not in api module
  • fetch_data - in api module
  • internal_helper - in api module

5. Results:

#![allow(unused)]
fn main() {
matches = [
    MatchResult {
        function: public_function,
        pointcuts: ["execution(pub fn *(..))"],
    },
    MatchResult {
        function: fetch_data,
        pointcuts: [
            "execution(pub fn *(..))",
            "within(api)",
        ],
    },
    MatchResult {
        function: internal_helper,
        pointcuts: ["within(api)"],
    },
]
}

6. Output:

=== Aspect Weaving Analysis ===

Total functions: 4
  Public: 2
  Private: 2

=== Pointcut Matching ===

Pointcut: "execution(pub fn *(..))"
  ✓ Matched: public_function
  ✓ Matched: api::fetch_data
  Total matches: 2

Pointcut: "within(api)"
  ✓ Matched: api::fetch_data
  ✓ Matched: api::internal_helper
  Total matches: 2

=== Matching Summary ===
Total functions matched: 3

Advanced Features

Generic Function Handling

#![allow(unused)]
fn main() {
fn extract_function_metadata(
    &self,
    item: &Item<'tcx>,
    sig: &FnSig<'tcx>
) -> FunctionMetadata {
    // ... basic extraction ...
    
    // Detect generic parameters
    let has_generics = !item.owner_id
        .to_def_id()
        .generics_of(self.tcx)
        .params
        .is_empty();
    
    FunctionMetadata {
        // ...
        has_generics,
    }
}
}

Example:

#![allow(unused)]
fn main() {
pub fn generic_function<T: Clone>(item: T) -> T {
    item.clone()
}

// Extracted as:
FunctionMetadata {
    simple_name: "generic_function",
    has_generics: true,  // ✓ Detected
    // ...
}
}

Async Function Detection

#![allow(unused)]
fn main() {
let is_async = sig.header.asyncness == IsAsync::Async;
}

Example:

#![allow(unused)]
fn main() {
pub async fn fetch_async() -> Result<Data> {
    tokio::time::sleep(Duration::from_secs(1)).await;
    Ok(Data::new())
}

// Extracted as:
FunctionMetadata {
    simple_name: "fetch_async",
    is_async: true,  // ✓ Detected
    // ...
}
}

Source Location Mapping

#![allow(unused)]
fn main() {
let span = item.span;
let source_map = self.tcx.sess.source_map();

if let Ok(loc) = source_map.lookup_line(span.lo()) {
    Some(SourceLocation {
        file: loc.file.name.prefer_remapped().to_string(),
        line: loc.line + 1,
    })
}
}

Example output:

• api::fetch_data (Public)
  Module: crate::api
  Location: src/api.rs:42

Global State Management

The Pattern

#![allow(unused)]
fn main() {
// Global storage
static CONFIG: Mutex<Option<AspectConfig>> = Mutex::new(None);
static RESULTS: Mutex<Option<AnalysisResults>> = Mutex::new(None);

// Write: In callbacks (single-threaded)
impl AspectCallbacks {
    fn new() -> Self {
        // Store config
        *CONFIG.lock().unwrap() = Some(config);
        Self
    }
}

// Read: In query provider (potentially parallel)
fn analyze_crate_with_aspects(tcx: TyCtxt<'_>, (): ()) {
    // Retrieve config
    let config = CONFIG.lock().unwrap().clone().unwrap();
    
    // Use config...
    
    // Store results
    *RESULTS.lock().unwrap() = Some(results);
}
}

Thread Safety

  • Mutex ensures exclusive access
  • clone() avoids holding lock during analysis
  • Safe for parallel query execution

Alternatives Considered

❌ Closures (not allowed):

#![allow(unused)]
fn main() {
// Doesn't compile - closure captures not allowed
config.override_queries = Some(|_sess, providers| {
    let cfg = self.config.clone();  // Capture!
    providers.analysis = move |tcx, ()| { /* use cfg */ };
});
}

❌ thread_local (too complex):

#![allow(unused)]
fn main() {
// Works but unnecessarily complex
thread_local! {
    static CONFIG: RefCell<Option<AspectConfig>> = RefCell::new(None);
}
}

✅ Global Mutex (simple and correct):

#![allow(unused)]
fn main() {
static CONFIG: Mutex<Option<AspectConfig>> = Mutex::new(None);
}

Performance Characteristics

Analysis Speed

For a typical crate with 100 functions:

  • MIR extraction: ~10ms
  • Pointcut matching: ~1ms
  • Report generation: ~1ms
  • Total overhead: ~12ms

Memory Usage

  • Per-function metadata: ~200 bytes
  • 100 functions: ~20KB
  • Negligible compared to compilation

Compilation Impact

Standard rustc:     2.5s
aspect-rustc-driver: 2.52s
Overhead:           +2%

The impact is minimal because we only analyze, not modify, the code.

Debugging and Diagnostics

Verbose Output

$ aspect-rustc-driver --aspect-verbose ...

=== aspect-rustc-driver: Configuring compiler ===
Pointcuts registered: 2

🎉 TyCtxt Access Successful!
=== aspect-rustc-driver: MIR Analysis ===

Extracting function metadata from compiled code...
  Found function: public_function
  Found function: private_function
  Found function: api::fetch_data
Total functions found: 3

Error Handling

#![allow(unused)]
fn main() {
fn analyze_crate_with_aspects(tcx: TyCtxt<'_>, (): ()) {
    let config = match CONFIG.lock().unwrap().clone() {
        Some(cfg) => cfg,
        None => {
            eprintln!("ERROR: Configuration not initialized");
            return;
        }
    };
    
    // Continue analysis...
}
}

Analysis Report

$ aspect-rustc-driver \
    --aspect-output analysis.txt \
    ...

# Generates analysis.txt:
=== Aspect Weaving Analysis Results ===

Total functions: 7

All Functions:
  • public_function (Public)
    Module: crate
    Location: test_input.rs:5

Matched Functions:
  • public_function
    Pointcut: execution(pub fn *(..))

Integration with Build Systems

Cargo Integration

Replace rustc with aspect-rustc-driver:

# Manual compilation
RUSTC=aspect-rustc-driver cargo build \
    -- --aspect-pointcut "execution(pub fn *(..))"

# Or via config
export RUSTC="aspect-rustc-driver --aspect-verbose"
cargo build

Build Scripts

// build.rs
use std::process::Command;

fn main() {
    Command::new("aspect-rustc-driver")
        .args(&[
            "--aspect-pointcut", "execution(pub fn *(..))",
            "--aspect-output", "target/aspect-analysis.txt",
            "src/lib.rs",
        ])
        .status()
        .expect("Failed to run aspect analysis");
}

Key Takeaways

  1. MIR provides reliable metadata - Type-checked, macro-expanded code
  2. TyCtxt gives full compiler access - All information available
  3. Function pointers + global state - Required by rustc API
  4. Pointcut matching at compile-time - Zero runtime cost
  5. Minimal performance impact - ~2% compilation overhead
  6. Comprehensive extraction - Name, module, visibility, async, location
  7. Production-ready analysis - Handles real Rust code

Next Steps


Related Chapters: