mosaicperm.factor.MosaicFactorTest

class mosaicperm.factor.MosaicFactorTest(outcomes: array, exposures: array, test_stat: callable, test_stat_kwargs: dict | None = None, tiles: Tiling | None = None, clusters: array | None = None, impute_zero: bool = True, **kwargs)

Mosaic test for factor models with known exposures.

Parameters:
outcomes : np.array

(n_obs, n_subjects) array of outcomes, e.g., asset returns. outcomes may contain nans to indicate missing values.

exposures : np.array

(n_obs, n_subjects, n_factors) array of factor exposures OR (n_subjects, n_factors) array of factor exposures if the exposures do not change with time.

test_stat : function

A function mapping a (n_obs, n_subjects)-size array of residuals to either:

  • A single statistic measuring evidence against the null.

  • Alternatively, a 1D array of many statistics, in which case the p-value will adaptively aggregate evidence across all test statistics.

test_stat_kwargs : dict

Optional kwargs to be passed to test_stat.

tiles : mosaicperm.tilings.Tiling

An optional Tiling specifying the tiles to use. If not provided, a default tiling is constructed (see **kwargs).

clusters : np.array

An optional array of length n_subjects where clusters[i] = k signals that subject i is in the kth cluster. If tiles is not provided, this argument allows one to test the null that the residuals are independent between clusters but possibly dependent within clusters. If tiles is provided, this argument is ignored.

impute_zero : bool

If True, missing outcomes are represented as exact zeros in the residual matrix. Else, they are represented as np.nan. Note both methods yield provably valid p-values even if the zero imputation is highly inaccurate; the only difference is convenience.

**kwargs : dict

Optional kwargs to default_factor_tiles(). Ignored if tiles is provided.
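
The input shapes described above can be sketched with plain NumPy. This is an illustration only: the sector labels are made up, and the broadcast step merely shows that a time-invariant (n_subjects, n_factors) exposure matrix corresponds to the same matrix repeated at every time point. It is not a required preprocessing step, since the 2D form can be passed directly.

```python
import numpy as np

n_obs, n_subjects, n_factors = 6, 4, 2
rng = np.random.default_rng(0)

# Time-invariant exposures: a single (n_subjects, n_factors) matrix.
static_exposures = rng.standard_normal((n_subjects, n_factors))

# The equivalent time-varying form repeats that matrix at every time point.
# broadcast_to returns a read-only view, so no data is copied.
tiled_exposures = np.broadcast_to(
    static_exposures, (n_obs, n_subjects, n_factors)
)

# Cluster labels: hypothetical sector names mapped to integer ids, so that
# clusters[i] = k means subject i belongs to cluster k.
sectors = np.array(["tech", "energy", "tech", "health"])  # made-up labels
_, clusters = np.unique(sectors, return_inverse=True)

# Missing outcomes are marked with nan.
outcomes = rng.standard_normal((n_obs, n_subjects))
outcomes[0:2, 0] = np.nan

print(tiled_exposures.shape)
print(clusters)
```

Mapping labels through np.unique assigns ids in sorted label order; any consistent integer labelling would serve the same purpose.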

Examples

We run a mosaic permutation test on synthetic data:

>>> import numpy as np
>>> import mosaicperm as mp
>>> 
>>> # synthetic outcomes and exposures
>>> n_obs, n_subjects, n_factors = 100, 200, 20
>>> outcomes = np.random.randn(n_obs, n_subjects)
>>> exposures = np.random.randn(n_obs, n_subjects, n_factors)
>>> 
>>> # example of missing data
>>> outcomes[0:10, 0:5] = np.nan
>>> exposures[0:10, 0:5] = np.nan
>>> 
>>> # fit mosaic permutation test
>>> mpt = mp.factor.MosaicFactorTest(
...     outcomes=outcomes,
...     exposures=exposures,
...     test_stat=mp.statistics.mean_maxcorr_stat,
... )
>>> mpt.fit().summary()

We can also produce a time series plot of this analysis:

>>> mpt.fit_tseries(nrand=100, n_timepoints=20)
>>> fig, ax = mpt.plot_tseries()
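
A user-defined statistic only needs to follow the test_stat contract described under Parameters: it takes an (n_obs, n_subjects) residual array and returns either a single number or a 1D array of numbers. The sketch below is one possible choice, not a statistic shipped with the package. The function names are hypothetical, and replacing nan with zero inside the function is an assumption made here so that the same code accepts residuals under either impute_zero setting.

```python
import numpy as np


def _abs_offdiag_corr(residuals):
    """Absolute off-diagonal correlations between residual columns."""
    resid = np.asarray(residuals, dtype=float)
    # Assumption of this sketch: treat missing residuals as zeros.
    resid = np.nan_to_num(resid, nan=0.0)
    # Drop constant columns, whose correlations are undefined.
    resid = resid[:, resid.std(axis=0) > 0]
    if resid.shape[1] < 2:
        return np.zeros(1)
    corr = np.corrcoef(resid, rowvar=False)
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    return np.abs(corr[off_diag])


def mean_abs_corr_stat(residuals):
    """Single statistic: mean absolute pairwise correlation."""
    return float(_abs_offdiag_corr(residuals).mean())


def mean_and_max_corr_stats(residuals):
    """1D array of statistics: mean and max absolute correlation."""
    vals = _abs_offdiag_corr(residuals)
    return np.array([vals.mean(), vals.max()])


# Quick check on synthetic residuals with a few missing entries.
rng = np.random.default_rng(0)
demo_resid = rng.standard_normal((50, 5))
demo_resid[0:3, 0] = np.nan
single = mean_abs_corr_stat(demo_resid)
multiple = mean_and_max_corr_stats(demo_resid)
print(single, multiple)
```

Either function could then be supplied through the test_stat argument of the constructor in place of mp.statistics.mean_maxcorr_stat.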

Methods

compute_mosaic_residuals()

Computes mosaic-style residual estimates.

fit([nrand, verbose])

Runs the mosaic permutation test.

fit_tseries([nrand, verbose, n_timepoints, ...])

Runs mosaic permutation tests for various windows of the data, producing a time series of p-values.

permute_residuals([method])

Permutes residuals within tiles.

plot_tseries([time_index, alpha, show_plot])

Plots the results of fit_tseries().

summary([coordinate_index])

Produces an output summarizing the test.

summary_plot([show_plot])

Produces a plot summarizing the results of the test.
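
To give some intuition for what permuting residuals within tiles and running the test involve, here is a minimal NumPy sketch. It does not use the package's internals: representing a tile as a (rows, cols) pair of index arrays is an assumption made for illustration and is not the Tiling class, the helper names are hypothetical, and the p-value uses the standard permutation formula (1 + number of permuted statistics at least as large as the observed one) / (1 + nrand).

```python
import numpy as np


def permute_within_tiles(residuals, tiles, rng):
    """Return a copy in which each tile's rows are shuffled in time."""
    permuted = residuals.copy()
    for rows, cols in tiles:
        shuffled_rows = rng.permutation(rows)
        # Every column in the tile receives the same row shuffle.
        permuted[np.ix_(rows, cols)] = residuals[np.ix_(shuffled_rows, cols)]
    return permuted


def permutation_pvalue(residuals, tiles, test_stat, nrand=200, seed=0):
    """Standard permutation p-value for a 'larger is more extreme' statistic."""
    rng = np.random.default_rng(seed)
    observed = test_stat(residuals)
    null_stats = np.array([
        test_stat(permute_within_tiles(residuals, tiles, rng))
        for _ in range(nrand)
    ])
    return (1 + np.sum(null_stats >= observed)) / (1 + nrand)


def max_abs_corr(resid):
    """Largest absolute off-diagonal correlation between columns."""
    corr = np.corrcoef(resid, rowvar=False)
    np.fill_diagonal(corr, 0.0)
    return float(np.max(np.abs(corr)))


# Hypothetical tiling: two time batches crossed with two subject groups.
rng = np.random.default_rng(0)
residuals = rng.standard_normal((20, 6))
time_batches = [np.arange(0, 10), np.arange(10, 20)]
subject_groups = [np.arange(0, 3), np.arange(3, 6)]
tiles = [(rows, cols) for rows in time_batches for cols in subject_groups]

permuted = permute_within_tiles(residuals, tiles, rng)
pval = permutation_pvalue(residuals, tiles, max_abs_corr, nrand=200)
print(pval)
```

Because each tile is shuffled independently, correlations between subjects in different tiles are broken in the permuted copies, while the values inside each tile are only reordered.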