Welcome to stambo!

About

Statistical Model Comparison with Bootstrap (STAMBO) focuses on statistically sound comparisons between models and samples by implementing one-tailed bootstrap hypothesis tests.

We have abstracted the bootstrap two-sample test into a single function, stambo.two_sample_test(). To start using the library, one can simply compare two means (the default assumes a paired design):

import stambo
...
seed = 42
res = stambo.two_sample_test(sample_1, sample_2, statistics={"Mean": lambda x: x.mean()})

If you would like to avoid the paired design, you can simply set the non_paired argument to True.
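
To give a feel for what a paired, one-tailed bootstrap test does, here is a minimal NumPy sketch. It is not stambo's implementation: the function name paired_bootstrap_pvalue, its parameters, and the choice of the "shift to the null" method are assumptions made purely for illustration, and the library may compute its p-values differently.

```python
import numpy as np


def paired_bootstrap_pvalue(sample_1, sample_2, statistic=np.mean,
                            n_bootstrap=10000, seed=42):
    """Hypothetical helper: one-tailed paired bootstrap test.

    Tests H1: statistic(sample_2) > statistic(sample_1).
    Returns the observed difference and an approximate p-value.
    """
    rng = np.random.default_rng(seed)
    sample_1 = np.asarray(sample_1, dtype=float)
    sample_2 = np.asarray(sample_2, dtype=float)
    if sample_1.shape != sample_2.shape:
        raise ValueError("A paired design needs samples of equal shape.")
    n = sample_1.shape[0]

    observed = statistic(sample_2) - statistic(sample_1)

    # Paired design: the same resampled indices are applied to both samples,
    # so every pair (sample_1[i], sample_2[i]) stays together.
    idx = rng.integers(0, n, size=(n_bootstrap, n))
    diffs = np.array(
        [statistic(sample_2[i]) - statistic(sample_1[i]) for i in idx]
    )

    # Centre the bootstrap differences at zero to mimic the null hypothesis,
    # then count how often they are at least as large as the observed one.
    null_diffs = diffs - observed
    p_value = (1 + np.sum(null_diffs >= observed)) / (n_bootstrap + 1)
    return observed, p_value
```

In a non-paired variant of this sketch, one would draw separate index sets for the two samples instead of sharing idx between them.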

What makes this library different is that it supports bootstrapping many metrics at the same time, as well as the clustered bootstrap. The latter is particularly useful when several measurements come from the same patient. Here is how we run it for the case when predictions come from a dataset with repeated measurements from the same patient:

import stambo
...
seed = 42
results = stambo.compare_models(y_test, preds_1, preds_2, ("ROCAUC", "AP", "QKappa", "BACC", "MCC"), seed=seed, n_bootstrap=1000)
print(stambo.to_latex(results))

The above will print a LaTeX table, which one can easily copy-paste.
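
The idea behind the clustered bootstrap mentioned earlier can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the library's code: the helper clustered_bootstrap_indices and the patient_ids array are hypothetical names. The point is that whole clusters (for example, patients) are resampled with replacement, so all measurements of a patient enter or leave a resample together.

```python
import numpy as np


def clustered_bootstrap_indices(groups, rng):
    """Hypothetical helper: one bootstrap resample of row indices,
    drawn by sampling whole clusters with replacement."""
    groups = np.asarray(groups)
    unique_groups = np.unique(groups)
    # Sample as many clusters as there are in the data, with replacement.
    sampled = rng.choice(unique_groups, size=unique_groups.shape[0], replace=True)
    # Collect all row indices of every sampled cluster (repeats included).
    return np.concatenate([np.flatnonzero(groups == g) for g in sampled])


# Toy example: nine measurements coming from three patients.
patient_ids = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
rng = np.random.default_rng(42)
idx = clustered_bootstrap_indices(patient_ids, rng)
print(patient_ids[idx])  # each sampled patient appears with all of its rows
```

Because clusters differ in size, the number of rows in a resample can vary from draw to draw. Metrics would then be computed on the rows selected by idx in each bootstrap iteration.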

Documentation:

Examples: