Quick Start

This guide will get you up and running with ACRO in just a few minutes.

Installation

pip install acro

Core ACRO Class

class acro.ACRO(config='default', suppress=False)

ACRO: Automatic Checking of Research Outputs.

Attributes:

config (dict): Safe parameters and their values.
results (Records): The current outputs including the results of checks.
suppress (bool): Whether to automatically apply suppression.

Parameters:

config (str)
suppress (bool)

Methods

`add_comments`(output, comment)	Add a comment to an output.
`add_exception`(output, reason)	Add an exception request to an output.
`crosstab`(index, columns[, values, rownames, ...])	Compute a simple cross tabulation of two (or more) factors.
`custom_output`(filename[, comment])	Add an unsupported output to the results dictionary.
`finalise`([path, ext])	Create a results file for checking.
`hist`(data, column[, by_val, grid, ...])	Create a histogram from a single column.
`logit`(endog, exog[, missing, check_rank])	Fits Logit model.
`logitr`(formula, data[, subset, drop_cols])	Fits Logit model from a formula and dataframe.
`ols`(endog[, exog, missing, hasconst])	Fits Ordinary Least Squares Regression.
`olsr`(formula, data[, subset, drop_cols])	Fits Ordinary Least Squares Regression from a formula and dataframe.
`pivot_table`(data[, values, index, columns, ...])	Create a spreadsheet-style pivot table as a DataFrame.
`print_outputs`()	Print the current results dictionary.
`probit`(endog, exog[, missing, check_rank])	Fits Probit model.
`probitr`(formula, data[, subset, drop_cols])	Fits Probit model from a formula and dataframe.
`remove_output`(key)	Remove an output from the results.
`rename_output`(old, new)	Rename an output.
`surv_func`(time, status, output[, entry, ...])	Estimate the survival function.
`survival_plot`(survival_table, survival_func, ...)	Create the survival plot according to the status of suppressing.
`survival_table`(survival_table, safe_table, ...)	Create the survival table according to the status of suppressing.

Examples

>>> from acro import ACRO
>>> acro = ACRO()
>>> # y and x are the endogenous and exogenous data, defined beforehand
>>> results = acro.ols(y, x)
>>> results.summary()
>>> acro.finalise("MYFOLDER", "json")

__init__(config='default', suppress=False)

Construct a new ACRO object and read parameters from config.

Parameters:

config (str): Name of a yaml configuration file with safe parameters.
suppress (bool, default False): Whether to automatically apply suppression.

Return type:

None

finalise(path='outputs', ext='json')

Create a results file for checking.

Parameters:

path (str): Name of a folder to save outputs.
ext (str): Extension of the results file. Valid extensions: {json, xlsx}.

Returns:

Records: Object storing the outputs.

Return type:

Records | None

remove_output(key)

Remove an output from the results.

Parameters:

key (str): Key specifying which output to remove, e.g., ‘output_0’.

Return type:

None

print_outputs()

Print the current results dictionary.

Returns:

str: String representation of all outputs.

Return type:

str

custom_output(filename, comment='')

Add an unsupported output to the results dictionary.

Parameters:

filename (str): The name of the file that will be added to the list of outputs.
comment (str): An optional comment.

Return type:

None

rename_output(old, new)

Rename an output.

Parameters:

old (str): The old name of the output.
new (str): The new name of the output.

Return type:

None

add_comments(output, comment)

Add a comment to an output.

Parameters:

output (str): The name of the output.
comment (str): The comment.

Return type:

None

add_exception(output, reason)

Add an exception request to an output.

Parameters:

output (str): The name of the output.
reason (str): The reason for the exception request.

Return type:

None
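
The output-management methods above are often used together before finalising. The following is a minimal sketch, not part of ACRO itself: the helper name `tidy_output` and the example output names are hypothetical, and `session` is assumed to be an `acro.ACRO` instance (or any object exposing `rename_output`, `add_comments` and `add_exception` with the signatures documented above).

```python
def tidy_output(session, old_name, new_name, comment, reason=None):
    """Rename an output, attach a comment, and optionally request an exception.

    `session` is assumed to expose rename_output(old, new),
    add_comments(output, comment) and add_exception(output, reason).
    """
    session.rename_output(old_name, new_name)
    # After renaming, later calls refer to the output by its new name.
    session.add_comments(new_name, comment)
    if reason is not None:
        session.add_exception(new_name, reason)
    return new_name
```

With a real session this might be called as `tidy_output(acro, "output_0", "counts_by_region", "Counts by region")`, where `output_0` follows the key format shown for `remove_output`.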

Essential Methods

Data Analysis

ACRO.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, margins_name='All', dropna=True, normalize=False, show_suppressed=False)

Compute a simple cross tabulation of two (or more) factors.

By default, computes a frequency table of the factors unless an array of values and an aggregation function are passed.

To provide consistent behaviour with different aggregation functions, ‘empty’ rows or columns, i.e. those that are all NaN or 0 (count, sum), are removed.

Parameters:

index (array-like, Series, or list of arrays/Series): Values to group by in the rows.
columns (array-like, Series, or list of arrays/Series): Values to group by in the columns.
values (array-like, optional): Array of values to aggregate according to the factors. Requires aggfunc be specified.
rownames (sequence, default None): If passed, must match number of row arrays passed.
colnames (sequence, default None): If passed, must match number of column arrays passed.
aggfunc (str, optional): If specified, requires values be specified as well.
margins (bool, default False): Add row/column margins (subtotals).
margins_name (str, default ‘All’): Name of the row/column that will contain the totals when margins is True.
dropna (bool, default True): Do not include columns whose entries are all NaN.
normalize (bool, {‘all’, ‘index’, ‘columns’}, or {0,1}, default False): Normalize by dividing all values by the sum of values.
- If passed ‘all’ or True, will normalize over all values.
- If passed ‘index’, will normalize over each row.
- If passed ‘columns’, will normalize over each column.
- If margins is True, will also normalize margin values.
show_suppressed (bool, default False): Determines how the totals are calculated when suppression is enabled.

Returns:

DataFrame: Cross tabulation of the data.

Return type:

DataFrame
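
To make the normalize options concrete, the sketch below reproduces the arithmetic on a small table of counts in plain Python. It does not call ACRO or pandas; the function name `normalize_counts` and the toy counts are invented for illustration, and empty rows or columns (zero totals) are not handled.

```python
def normalize_counts(table, mode):
    """Normalize a nested-list count table as the `normalize` options describe.

    mode: 'all' divides by the grand total, 'index' by each row total,
    'columns' by each column total.
    """
    if mode == "all":
        total = sum(sum(row) for row in table)
        return [[cell / total for cell in row] for row in table]
    if mode == "index":
        return [[cell / sum(row) for cell in row] for row in table]
    if mode == "columns":
        col_totals = [sum(col) for col in zip(*table)]
        return [[cell / t for cell, t in zip(row, col_totals)] for row in table]
    raise ValueError(f"unknown mode: {mode!r}")


counts = [[1, 3], [3, 9]]  # toy counts: rows sum to 4 and 12, columns to 4 and 12
print(normalize_counts(counts, "all"))      # every cell divided by 16
print(normalize_counts(counts, "index"))    # [[0.25, 0.75], [0.25, 0.75]]
print(normalize_counts(counts, "columns"))  # [[0.25, 0.25], [0.75, 0.75]]
```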

ACRO.olsr(formula, data, subset=None, drop_cols=None, *args, **kwargs)

Fits Ordinary Least Squares Regression from a formula and dataframe.

Parameters:

formula (str or generic Formula object): The formula specifying the model.
data (array_like): The data for the model. See Notes.
subset (array_like): An array-like object of booleans, integers, or index values that indicate the subset of data to use in the model. Assumes data is a pandas.DataFrame.
drop_cols (array_like): Columns to drop from the design matrix. Cannot be used to drop terms involving categoricals.
*args: Additional positional arguments that are passed to the model.
**kwargs: These are passed to the model with one exception. The eval_env keyword is passed to patsy. It can be either a patsy.EvalEnvironment object or an integer indicating the depth of the namespace to use. For example, the default eval_env=0 uses the calling namespace. If you wish to use a “clean” environment, set eval_env=-1.

Returns:

RegressionResultsWrapper: Results.

Return type:

RegressionResultsWrapper

Notes

data must define __getitem__ with the keys in the formula terms, e.g., a numpy structured or rec array, a dictionary, or a pandas DataFrame. args and kwargs are passed on to the model instantiation. Arguments are passed in the same order as in statsmodels.
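
A formula-based fit might look like the sketch below. The column names (`wage`, `age`, `tenure`), the toy numbers and the helper names are all invented for illustration; `session` is assumed to be an `acro.ACRO` instance, and `demo()` assumes that acro and pandas are installed.

```python
def fit_wage_model(session, data):
    """Fit `wage ~ age + tenure` through an ACRO-like session.

    `data` must support data[name] for each variable named in the formula,
    for example a pandas DataFrame with those columns.
    """
    return session.olsr("wage ~ age + tenure", data)


def demo():
    # Imported here so the helper above can be reused without ACRO installed.
    import pandas as pd
    from acro import ACRO

    # Toy data, invented purely for illustration.
    df = pd.DataFrame(
        {
            "wage": [21.0, 25.5, 30.1, 28.4, 35.2, 40.0, 38.7, 45.3],
            "age": [22, 27, 31, 35, 40, 44, 49, 55],
            "tenure": [1, 3, 2, 6, 8, 5, 12, 15],
        }
    )
    acro = ACRO()
    results = fit_wage_model(acro, df)
    print(results.summary())
```

Calling `demo()` runs the fit against a real session. The helper only forwards to `olsr`, so any checks are whatever the session applies.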

Output Management

ACRO.finalise(path='outputs', ext='json')

Create a results file for checking.

Parameters:

path (str): Name of a folder to save outputs.
ext (str): Extension of the results file. Valid extensions: {json, xlsx}.

Returns:

Records: Object storing the outputs.

Return type:

Records | None

ACRO.print_outputs()

Print the current results dictionary.

Returns:

str: String representation of all outputs.

Return type:

str

Quick Workflow

Install: pip install acro
Initialize: Create an ACRO session with suppress=True
Analyze: Use ACRO methods for statistical analysis
Review: Check outputs with print_outputs()
Finalize: Export results with finalise()
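
The steps above can be sketched end to end as follows. This is an illustrative outline rather than a prescribed pattern: the helper names, the `region` and `sector` columns and the toy data are assumptions, and `demo()` needs acro and pandas installed.

```python
def run_workflow(session, index, columns, folder="outputs"):
    """Run the analyze, review and finalize steps on an ACRO-like session."""
    table = session.crosstab(index, columns)  # Analyze
    session.print_outputs()                   # Review
    session.finalise(folder, "json")          # Finalize
    return table


def demo():
    # Imported here so run_workflow can be reused without ACRO installed.
    import pandas as pd
    from acro import ACRO

    # Toy data, invented purely for illustration.
    df = pd.DataFrame(
        {
            "region": ["north", "north", "south", "south", "east", "east"],
            "sector": ["public", "private", "public", "private", "public", "private"],
        }
    )
    acro = ACRO(suppress=True)  # Initialize
    table = run_workflow(acro, df["region"], df["sector"], folder="MYFOLDER")
    print(table)
```

Calling `demo()` runs the steps against a real session. `MYFOLDER` is the folder name used in the class example above.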

Next Steps

Configuration - Learn configuration options
API Reference - Explore the full API reference
Notebook Examples - Interactive Jupyter notebook examples
