Model summary dataframes

Whereas fitgrid.lm() and fitgrid.lmer() return all the available fit information as an LMFitGrid or LMERFitGrid object, respectively, the fitgrid.utils.summary.summarize() function gathers a useful subset into a tidy, indexed pandas.DataFrame.

The summary dataframe row and column indexing is standardized, so summaries for different models and model sets can be conveniently split and stacked with ordinary pandas index slicing and dataframe concatenation.

summarize() also accepts a list of model formulas as the RHS argument, in which case the individual summaries are stacked and returned in a single dataframe.

import fitgrid

# a small random data set for illustration
epochs_fg = fitgrid.generate(n_samples=8, n_channels=4, seed=32)

Fit the OLS model … get an LMFitGrid object

fitgrid.lm(epochs_fg, LHS=epochs_fg.channels, RHS="1 + continuous", quiet=True)

Out:

8 by 4 LMFitGrid of type <class 'statsmodels.regression.linear_model.RegressionResultsWrapper'>.

Summarize the OLS model … get a pandas.DataFrame

fitgrid.utils.summary.summarize(
    epochs_fg,
    modeler="lm",
    LHS=epochs_fg.channels,
    RHS=["1 + continuous"],
    quiet=True,
)

Out:

/home/runner/work/fitgrid/fitgrid/fitgrid/utils/summary.py:145: FutureWarning: fitgrid summaries are in early days, subject to change
  warnings.warn(

				channel0	channel1	channel2	channel3
time	model	beta	key
0	1 + continuous	Intercept	2.5_ci	0.554591	-52.1068	-57.3034	-28.1206
			97.5_ci	64.6939	7.34688	8.86844	39.8572
			AIC	194.34	191.305	195.588	196.665
			DF	18	18	18	18
			Estimate	32.6242	-22.3799	-24.2175	5.86833
...	...	...	...	...	...	...	...
7	1 + continuous	continuous	T-stat	0.3572	2.33128	-0.623563	1.13765
			has_warning	False	False	False	False
			logLike	-96.4655	-92.751	-91.9247	-94.2584
			sigma2	1006.29	694.077	639.033	807.003
			warnings

208 rows × 4 columns
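Because the row index levels are named time, model, beta, and key, ordinary pandas selection should work on a summary. The sketch below does not call fitgrid: it hand-builds a small stand-in frame (the name toy_summary and all of its values are made up for illustration) with the same index level names, then selects rows with DataFrame.xs and pandas.IndexSlice.

```python
import pandas as pd

# Hypothetical stand-in for a summary dataframe: same index level names as
# the output above, but toy values rather than real fitgrid results.
index = pd.MultiIndex.from_product(
    [[0, 1], ["1 + continuous"], ["Intercept", "continuous"], ["Estimate", "T-stat"]],
    names=["time", "model", "beta", "key"],
)
toy_summary = pd.DataFrame(
    {"channel0": list(range(8)), "channel1": list(range(10, 18))},
    index=index,
    dtype=float,
)

# All Estimate rows; xs() drops the now-constant "key" level
estimates = toy_summary.xs("Estimate", level="key")

# Estimates for a single beta at every time point, keeping all four levels
idx = pd.IndexSlice
slope_estimates = toy_summary.loc[idx[:, :, "continuous", "Estimate"], :]

print(estimates)
print(slope_estimates)
```

A real summary mixes numeric rows with has_warning and warnings rows, so a numeric cast after selecting the numeric keys may be needed before doing arithmetic.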

Summarize a stack of OLS models … get a pandas.DataFrame with a stack of summaries

fitgrid.utils.summary.summarize(
    epochs_fg,
    modeler="lm",
    LHS=epochs_fg.channels,
    RHS=["1 + continuous", "1 + categorical"],
    quiet=True,
)

Out:

/home/runner/work/fitgrid/fitgrid/fitgrid/utils/summary.py:145: FutureWarning: fitgrid summaries are in early days, subject to change
  warnings.warn(

				channel0	channel1	channel2	channel3
time	model	beta	key
0	1 + continuous	Intercept	2.5_ci	0.554591	-52.1068	-57.3034	-28.1206
			97.5_ci	64.6939	7.34688	8.86844	39.8572
			AIC	194.34	191.305	195.588	196.665
			DF	18	18	18	18
			Estimate	32.6242	-22.3799	-24.2175	5.86833
...	...	...	...	...	...	...	...
7	1 + categorical	categorical[T.cat1]	T-stat	-0.582962	0.837649	0.793966	-0.879485
			has_warning	False	False	False	False
			logLike	-96.3491	-95.0071	-91.7942	-94.532
			sigma2	994.648	869.742	630.748	829.388
			warnings

416 rows × 4 columns
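Since the model formula is itself an index level, a stacked summary can be split apart again or reshaped for model comparison. The following is a minimal pandas-only sketch: the helper make_toy_summary and its values are hypothetical stand-ins, not fitgrid output, and the stack is built with pandas.concat to mimic the layout shown above.

```python
import pandas as pd


def make_toy_summary(model, offset):
    """Build a hypothetical one-model summary with toy values (not fitgrid output)."""
    index = pd.MultiIndex.from_product(
        [[0, 1], [model], ["Intercept"], ["AIC", "Estimate"]],
        names=["time", "model", "beta", "key"],
    )
    values = [float(offset + i) for i in range(len(index))]
    return pd.DataFrame({"channel0": values}, index=index)


# Stack two single-model summaries into one frame, mimicking the stacked layout
stacked = pd.concat(
    [make_toy_summary("1 + continuous", 0), make_toy_summary("1 + categorical", 100)]
)

# Split the stack back apart, one dataframe per model formula
by_model = {model: rows for model, rows in stacked.groupby(level="model")}

# Side-by-side AIC for one channel: rows are (time, beta), columns are models
aic = stacked.xs("AIC", level="key")["channel0"].unstack("model")

print(aic)
```

Unstacking on the model level puts the competing models in adjacent columns, which is convenient for differences such as aic["1 + continuous"] - aic["1 + categorical"].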

Same goes for LMER models and model stacks …

fitgrid.utils.summary.summarize(
    epochs_fg,
    modeler="lmer",
    LHS=epochs_fg.channels,
    RHS=[
        "1 + categorical + (continuous | categorical)",
        "1 + continuous + (1 | categorical)",
    ],
    parallel=True,
    n_cores=2,
    quiet=True,
)

Out:

/home/runner/work/fitgrid/fitgrid/fitgrid/utils/summary.py:145: FutureWarning: fitgrid summaries are in early days, subject to change
  warnings.warn(

				channel0	channel1	channel2	channel3
time	model	beta	key
0	1+categorical+(continuous|categorical)	(Intercept)	2.5_ci	-45.4987	-73.282	-67.2839	-67.0309
			97.5_ci	71.93	48.3137	66.708	60.3743
			AIC	175.463	175.236	178.733	179.298
			DF	17.4766	0.68599	15.1721	0.954404
			Estimate	13.2157	-12.4842	-0.287962	-3.32834
...	...	...	...	...	...	...	...
7	1+continuous+(1|categorical)	continuous	T-stat	0.3572	2.33128	-0.623563	1.16732
			has_warning	True	True	True	False
			logLike	-89.4837	-86.1406	-85.397	-87.4905
			sigma2	1006.29	694.077	639.033	798.97
			warnings	boundary (singular) fit: see ?isSingular	boundary (singular) fit: see ?isSingular	boundary (singular) fit: see ?isSingular

416 rows × 4 columns
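The has_warning rows make it possible to locate the time points and channels where a fit raised a warning, such as the singular fits above. The sketch below is pandas-only and uses a made-up frame (toy_summary, with invented values and object dtype because it mixes numbers and booleans); it assumes a real summary can be filtered the same way.

```python
import pandas as pd

# Hypothetical stand-in: numeric Estimate rows and boolean has_warning rows
# share the same columns, so the toy frame uses object dtype.
index = pd.MultiIndex.from_product(
    [[0, 1], ["1+continuous+(1|categorical)"], ["continuous"], ["Estimate", "has_warning"]],
    names=["time", "model", "beta", "key"],
)
toy_summary = pd.DataFrame(
    {
        "channel0": [1.5, True, 2.5, False],
        "channel1": [0.5, False, 3.5, False],
    },
    index=index,
    dtype=object,
)

# Keep only the warning flags and cast them to real booleans
flags = toy_summary.xs("has_warning", level="key").astype(bool)

# Number of flagged fits per channel
n_warnings = flags.sum()

# Time points where at least one channel was flagged
flagged_times = (
    flags[flags.any(axis=1)].index.get_level_values("time").unique().tolist()
)

print(n_warnings)
print(flagged_times)
```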
