What ACRO-R Supports
This page provides a comprehensive overview of ACRO’s capabilities for researchers, developers, and non-technical stakeholders. ACRO supports a wide range of statistical analysis functions with automated disclosure control.
Supported Data Analysis Functions
Table Creation & Cross-tabulation
For Researchers: Create frequency tables and cross-tabulations with automatic cell suppression for small counts.
What ACRO Supports:
- crosstab() - Cross-tabulation of two or more variables with frequency counting
- pivot_table() - Spreadsheet-style pivot tables with aggregation functions
- table() - Simple frequency tables for categorical data (R interface only)
Technical Details:
- ACRO suppresses the value of an aggregation statistic (mean, median, variance, etc.) for any cell that is deemed to be sensitive, and reports the reason why
- The current version of ACRO supports the most common tests for sensitivity: ensuring the number of contributors is above a frequency threshold, and testing for dominance via N-K rules
- N-K Rule: A dominance test where, if the top N contributors account for more than K% of the total, the cell is considered disclosive
- Frequency Threshold: Cells with fewer than a specified number of contributors are suppressed
- All thresholds are configurable via YAML configuration files
- For detailed methodology, see our research paper
- Automatic flagging of negative or missing values for human review
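The frequency-threshold and N-K tests described above can be sketched in plain Python. This is only an illustration of the rules as stated here, not ACRO's own code: the function names (`fails_threshold`, `fails_nk`, `check_cell`), the reason labels, and the default values (`threshold=10`, `n=2`, `k=0.9`) are assumptions chosen for the example, and real deployments take their thresholds from the YAML configuration.

```python
from typing import List, Sequence


def fails_threshold(contributions: Sequence[float], threshold: int = 10) -> bool:
    """Return True when the cell has fewer contributors than the threshold."""
    return len(contributions) < threshold


def fails_nk(contributions: Sequence[float], n: int = 2, k: float = 0.9) -> bool:
    """Return True when the top n contributors exceed share k of the cell total."""
    total = sum(contributions)
    if total <= 0:
        # Dominance is not meaningful for empty or non-positive totals;
        # this sketch leaves such cells to the negative/missing-value review.
        return False
    top_n = sum(sorted(contributions, reverse=True)[:n])
    return top_n / total > k


def check_cell(contributions: Sequence[float], threshold: int = 10,
               n: int = 2, k: float = 0.9) -> List[str]:
    """Collect the reasons (if any) why a cell would be suppressed."""
    reasons = []
    if fails_threshold(contributions, threshold):
        reasons.append("threshold")
    if fails_nk(contributions, n, k):
        reasons.append("nk")
    return reasons


# Hypothetical cells: each list holds one value per contributor.
balanced = [10.0] * 12                   # 12 contributors, evenly spread
dominated = [900.0, 50.0] + [5.0] * 10   # top 2 hold 95% of the total
small = [3.0, 4.0, 5.0]                  # only 3 contributors

print(check_cell(balanced))   # no reasons: the cell can be released
print(check_cell(dominated))  # flagged by the dominance rule
print(check_cell(small))      # flagged by the frequency threshold
```

Returning a list of reasons rather than a single boolean mirrors the idea that a suppressed cell should also say why it was suppressed.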
Example Use Cases:
- Survey response analysis by demographics
- Clinical trial outcome tables
- Market research cross-tabulations
- Educational assessment reporting
Statistical Modeling
For Researchers: Run regression analyses with automated checks on model outputs and residual degrees of freedom.
What ACRO Supports:
- ols() - Ordinary Least Squares linear regression
- logit() - Logistic regression for binary outcomes
- probit() - Probit regression for binary outcomes
Technical Details:
- For regressions such as linear, probit, and logit, the tests verify that the number of residual degrees of freedom exceeds a threshold
- Within the ACRO Python package, the functionality of the ‘ACRO’ class is split into a number of separate classes for maintainability and extensibility. A ‘Tables’ class contains the code necessary to perform disclosure checks on tabular data, such as crosstab. A separate ‘Regression’ class contains the code for checking regressions such as logit and probit
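The regression check above reduces to simple arithmetic: residual degrees of freedom are the number of observations minus the number of estimated parameters (including the intercept). A minimal sketch, assuming a hypothetical threshold of 10 and a strict "exceeds" comparison as worded above; the function names and status strings are illustrative, not ACRO's API:

```python
def residual_dof(n_obs: int, n_params: int) -> int:
    """Residual degrees of freedom: observations minus estimated parameters."""
    return n_obs - n_params


def check_regression(n_obs: int, n_params: int, dof_threshold: int = 10) -> str:
    """Return "pass" when the residual dof exceeds the threshold, else "fail"."""
    return "pass" if residual_dof(n_obs, n_params) > dof_threshold else "fail"


# A model with an intercept and three predictors has four parameters.
print(check_regression(n_obs=200, n_params=4))  # 196 residual dof
print(check_regression(n_obs=12, n_params=4))   # 8 residual dof
```

How the boundary case is treated (residual degrees of freedom exactly equal to the threshold) is an assumption of this sketch; consult the configured rules for the behavior in a given environment.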
Example Use Cases:
- Economic modeling and policy analysis
- Medical research and clinical studies
- Social science research
- Business analytics and forecasting
Programming Language Support
Python (Primary)
For Developers: The ACRO package is a lightweight Python tool that sits over well-known analysis tools that produce outputs such as tables, plots, and statistical models
Supported Libraries:
- Pandas - For data manipulation and table creation
- Statsmodels - For statistical modeling and regression analysis
- NumPy - For numerical computations
Python Version Support:
- Python 3.10, 3.11, 3.12, 3.13
R Language
For R Users: Additional programming languages are supported through front-end packages, such as this R package, that interface with the core ACRO Python back-end.
Integration Features:
- Native R syntax and workflows
- R Markdown and Shiny application support
- Tidyverse compatibility
- CRAN package availability
Getting Started in R:

    library("acro")
    acro_init(suppress = TRUE)
Disclosure Control Features
Automated Sensitivity Testing
What ACRO Checks:
For Tables:
- Minimum cell counts (frequency thresholds)
- Dominance rules (N-K rules for concentration)
- Presence of negative or missing values
For Statistical Models:
- Residual degrees of freedom thresholds
- Model fit diagnostics
- Parameter significance testing
For Non-Technical Users: ACRO automatically identifies when research outputs might reveal sensitive information about individuals or organizations, applying industry-standard privacy protection rules without requiring manual review of every result.
Output Management
What ACRO Provides:
- Suppression Masks - Clear indication of which results are hidden and why
- Summary Reports - Detailed explanation of all disclosure checks performed
- Audit Trails - Complete record of all analysis steps and decisions
- Exception Handling - Process for requesting release of flagged outputs
Workflow Integration: The finalise function will:
- Check that each output with “fail” or “review” status has an exception; if not, you will be asked to enter one
- Write the outputs to a directory. This directory contains everything that the output checkers need to make a decision
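The two steps of the finalise workflow can be pictured with a small stand-alone sketch. Everything in it is hypothetical and for illustration only: the function name `finalise_sketch`, the record fields (`status`, `exception`), and the `results.json` file name are assumptions, and the real function's signature and output layout may differ.

```python
import json
import tempfile
from pathlib import Path


def finalise_sketch(outputs: dict, out_dir: Path, ask=input) -> Path:
    """Require an exception for each fail/review output, then write everything."""
    for name, record in outputs.items():
        needs_exception = record["status"] in {"fail", "review"}
        if needs_exception and not record.get("exception"):
            # Step 1: prompt the researcher for a justification.
            record["exception"] = ask(f"Reason to release '{name}': ")
    # Step 2: write the full record for the output checkers.
    out_dir.mkdir(parents=True, exist_ok=True)
    results_file = out_dir / "results.json"
    results_file.write_text(json.dumps(outputs, indent=2))
    return results_file


# Hypothetical session with one passing and one failing output.
outputs = {
    "output_0": {"status": "pass", "exception": ""},
    "output_1": {"status": "fail", "exception": ""},
}
with tempfile.TemporaryDirectory() as tmp:
    # A canned answer replaces the interactive prompt in this example.
    path = finalise_sketch(outputs, Path(tmp) / "outputs",
                           ask=lambda prompt: "aggregate only")
    written = json.loads(path.read_text())

print(written["output_1"]["exception"])
```

Passing the prompt function as a parameter keeps the sketch usable in batch runs, where an interactive prompt would otherwise block.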
Supported Environments
Research Environments
Where ACRO Works:
- Trusted Research Environments (TREs)
- Data safe havens
- Secure data centers
- Academic research computing facilities
- Government statistical offices
- Healthcare research environments
Installation: See Installation for complete installation instructions and system requirements.
Integration Capabilities
Analysis Workflows
For Research Teams: ACRO integrates seamlessly into existing data analysis workflows, requiring minimal changes to current practices while adding comprehensive privacy protection.
Supported Workflows:
- Jupyter notebook analysis
- R Markdown documents
- Stata do-files and scripts
- Batch processing and automation
- Interactive analysis sessions
Data Sources:
- CSV and Excel files
- Database connections
- Survey data platforms
- Administrative datasets
- Clinical trial databases
Technical Architecture
For System Administrators:
Core Technology: Lightweight translation scripts intercept your commands and pass them through to a Python ‘engine’. The engine is built on industry-standard packages; it runs your commands and performs statistical disclosure checks on the results.
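The pattern of a thin front end forwarding commands to a checking engine can be sketched as follows. This is a toy model of the idea, not ACRO's implementation: `engine_crosstab`, `with_disclosure_check`, `ENGINE`, `translate`, and the threshold of 2 are all hypothetical names and values invented for the example.

```python
from typing import Callable, Dict, List, Tuple


def engine_crosstab(rows: List[str], cols: List[str]) -> Dict[Tuple[str, str], int]:
    """Stand-in for the engine: count occurrences of each (row, column) pair."""
    counts: Dict[Tuple[str, str], int] = {}
    for pair in zip(rows, cols):
        counts[pair] = counts.get(pair, 0) + 1
    return counts


def with_disclosure_check(func: Callable, threshold: int) -> Callable:
    """Wrap an engine function so its result is checked before being returned."""
    def wrapper(*args, **kwargs):
        table = func(*args, **kwargs)
        too_small = any(count < threshold for count in table.values())
        return {"status": "fail" if too_small else "pass", "output": table}
    return wrapper


# Registry of commands the engine exposes to front ends.
ENGINE = {"crosstab": with_disclosure_check(engine_crosstab, threshold=2)}


def translate(command: str, *args):
    """Front-end shim: look up the command by name and forward the arguments."""
    return ENGINE[command](*args)


result = translate("crosstab", ["a", "a", "b"], ["x", "x", "y"])
print(result["status"], result["output"])
```

Keeping the checks inside the engine means every front end gets the same disclosure rules without reimplementing them.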
System Requirements:
- Python 3.10+ runtime environment
- Standard scientific computing libraries (pandas, numpy, statsmodels)
- Minimal computational overhead
- No external network dependencies during analysis
Security Features:
- Local processing only (no cloud dependencies)
- Audit logging and tracking
- Configurable disclosure thresholds
- Role-based access controls (through integration with TRE systems)
Documentation and Support: Standard Python coding and naming practices have been used throughout. GitHub continuous integration (CI) runners automatically generate and publish API documentation using the Python docstrings written in numpydoc format
What ACRO Does NOT Support
Current Limitations:
- Complex visualizations and plots (coming in future versions)
- Time series analysis (specialized disclosure rules needed)
- Machine learning models (use SACRO-ML for AI/ML workflows)
- Real-time data streams
- Distributed computing frameworks
Alternative Solutions:
- SACRO-ML - For machine learning and AI model disclosure control
- SACRO-Viewer - For interactive output review and approval
- Traditional SDC tools - For specialized use cases not covered by ACRO