fitgrid.utils.summary module¶
- fitgrid.utils.summary.plot_AICmin_deltas(summary_df, figsize=None, gridspec_kw=None, **kwargs)[source]¶
plot FitGrid min delta AICs and fitter warnings
Thresholds of AIC_min delta at 2, 4, 7, 10 are from Burnham & Anderson 2004, see Notes.
- Parameters
summary_df (pd.DataFrame) – as returned by fitgrid.utils.summary.summarize
figsize (2-ple) – pyplot.figure figure size parameter
gridspec_kw (dict) – matplotlib.gridspec key : value parameters
kwargs (dict) – keyword args passed to plt.subplots(…)
- Returns
f, axs
- Return type
matplotlib.pyplot.Figure
Notes
[BurAnd2004] p. 270-271. Where \(AIC_{min}\) is the lowest AIC value for “a set of a priori candidate models well-supported by the underlying science \(g_{i}, i = 1, 2, ..., R)\)”,
\[\Delta_{i} = AIC_{i} - AIC_{min}\]“is the information loss experienced if we are using fitted model \(g_{i}\) rather than the best model, \(g_{min}\) for inference.” …
“Some simple rules of thumb are often useful in assessing the relative merits of models in the set: Models having \(\Delta_{i} <= 2\) have substantial support (evidence), those in which \(\Delta_{i} 4 <= 7\) have considerably less support, and models having \(\Delta_{i} > 10\) have essentially no support.”
- fitgrid.utils.summary.plot_betas(summary_df, LHS, alpha=0.05, fdr=None, figsize=None, s=None, df_func=None, **kwargs)[source]¶
Plot model parameter estimates for each data column in LHS
- Parameters
summary_df (pd.DataFrame) – as returned by fitgrid.utils.summary.summarize
LHS (list of str) – column names of the data fitgrid.fitgrid docs
alpha (float) – alpha level for false discovery rate correction
fdr (str {None, ‘BY’, ‘BH’}) – Add markers for FDR adjusted significant \(p\)-values. BY is Benjamini and Yekatuli, BH is Benjamini and Hochberg, None supresses the markers.
df_func ({None, function}) – plot function(degrees of freedom), e.g., np.log10, lambda x: x
s (float) – scatterplot marker size for BH and lmer decorations
kwargs (dict) – keyword args passed to pyplot.subplots()
- Returns
figs
- Return type
list
- fitgrid.utils.summary.summarize(epochs_fg, modeler, LHS, RHS, parallel=True, n_cores=4, **kwargs)[source]¶
Fit the data with one or more model formulas and return summary information.
Convenience wrapper, useful for keeping memory use manageable when gathering betas and fit measures for a stack of models.
- Parameters
epochs_fg (fitgrid.epochs.Epochs) – as returned by fitgrid.epochs_from_dataframe() or fitgrid.from_hdf(), NOT a pandas.DataFrame.
modeler ({‘lm’, ‘lmer’}) – class of model to fit, lm for OLS, lmer for linear mixed-effects. Note: the RHS formula language must match the modeler.
LHS (list of str) – the data columns to model
RHS (model formula or list of model formulas to fit) – see the Python package patsy docs for lm formula langauge and the R library lme4 docs for the lmer formula langauge.
parallel (bool)
n_cores (int) – number of cores to use. See what works, but golden rule if running on a shared machine.
**kwargs (key=value arguments passed to the modeler, optional)
- Returns
summary_df – indexed by timestamp, model_formula, beta, and key, where the keys are ll.l_ci, uu.u_ci, AIC, DF, Estimate, P-val, SE, T-stat, has_warning, logLike.
- Return type
pandas.DataFrame
Examples
>>> lm_formulas = [ '1 + fixed_a + fixed_b + fixed_a:fixed_b', '1 + fixed_a + fixed_b', '1 + fixed_a, '1 + fixed_b, '1', ] >>> lm_summary_df = fitgrid.utils.summarize( epochs_fg, 'lm', LHS=['MiPf', 'MiCe', 'MiPa', 'MiOc'], RHS=lmer_formulas, parallel=True, n_cores=4 )
>>> lmer_formulas = [ '1 + fixed_a + (1 + fixed_a | random_a) + (1 | random_b)', '1 + fixed_a + (1 | random_a) + (1 | random_b)', '1 + fixed_a + (1 | random_a)', ] >>> lmer_summary_df = fitgrid.utils.summarize( epochs_fg, 'lmer', LHS=['MiPf', 'MiCe', 'MiPa', 'MiOc'], RHS=lmer_formulas, parallel=True, n_cores=12, REML=False )