Note

Click here to download the full example code

Workflow Outline¶

TL; DR conda install fitgrid and jupyter into a fresh conda virtual environment (shown here). Download this document as Jupyter notebook and run it. Read the notes, explore the output. Prerequisites: a working knowledge of conda and virtual environments, regression modeling with formulas like $\mathsf{\sim ~ 1 + a + b + a:b}$ in Python or R, and a bit of general purpose Python and pandas.

With multichannel digital recordings in hand, the basic fitgrid modeling workflow is four steps:

Prepare a 2-D vertical stack of fixed-length epochs as pandas.DataFrame. Rows = observations, indexed for epoch and time. Columns = data channels (numeric) and model predictor variables (numeric, string, boolean). See Format for details.
Load the data into a fitgrid.Epochs object for modeling.
Specify and fit ordinary least squares or linear mixed-effects model formula to populate the FitGrid[times, channels] object with the fit results. For OLS models use the patsy formula syntax which works like lm formulas in R. For mixed-effects models use the lme4::lmer R formula syntax.
Query and slice the FitGrid[times, channels] to fetch tidy time x channel dataframes of model fit results: coefficient estimates, residual error, model diagnostics, etc. for visualization and analysis.

1. Prepare epochs data¶

fitgrid assumes you are modeling epochs, i.e., fixed-length segments of typically multi-channel digital data, time stamped, and collected in a tidy pandas.DataFrame (see Format for details).

Here is a small complete simulated data set with four epochs and three time points per epoch. As required by the format specification, the epoch_id indices are unique and the time indices are all the same within each epoch. There are two predictor variable columns (categorical, continuous) and four data channel

import fitgrid as fg

# fitgrid's built-in epoch data simulator
epochs_df = fg.generate(
    n_epochs=2,  # number of epochs per category level
    n_samples=3,  # number of time stamps per epoch
    n_categories=2,  # number of category levels
    n_channels=4,  # number of data channels
    seed=1,
    return_type="dataframe",
)

# convert the epoch and time stamp columns to a pandas.MultiIndex
epochs_df = epochs_df.set_index(["epoch_id", "time"])
epochs_df  # display

		categorical	continuous	channel0	channel1	channel2	channel3
epoch_id	time
0	0	cat0	0.417022	9.571173	-33.018575	-20.615181	1.524233
	1	cat0	0.720324	-7.481111	34.341711	-25.356169	-19.109869
	2	cat0	0.000114	43.863238	27.047722	-20.137384	5.727465
1	0	cat1	0.302333	-61.804221	15.074830	-0.379938	63.007654
	1	cat1	0.146756	-9.672516	27.025678	-33.519310	3.604769
	2	cat1	0.092339	-11.521631	-20.511836	7.032471	18.516093
2	0	cat0	0.186260	34.013083	-3.686707	49.794065	9.005110
	1	cat0	0.345561	-32.996738	-28.073083	22.261325	-10.567495
	2	cat0	0.396767	-5.172846	-8.036642	-5.755067	-34.275546
3	0	cat1	0.538817	-26.335753	15.910664	-26.628869	-10.480282
	1	cat1	0.419195	1.266412	-20.749823	-22.414749	-6.266827
	2	cat1	0.685220	17.484456	-11.902606	50.773638	17.598696

A real epochs data set is typically much larger but the format is the same as illustrated here with sample EEG data that fitgrid downloads from Zenodo.

The EEG recording has already been snipped into into 1.5 second segments and time stamped with the stimulus events at time = 0. The experimental predictor variable data columns have been added in alignment with the individual epochs.

import pandas as pd
from fitgrid import DATA_DIR, sample_data

# download the epochs data and read into a pd.DataFrame
sample_data.get_file("sub000p3.ms1500.epochs.feather")
p3_epochs_df = pd.read_feather(DATA_DIR / "sub000p3.ms1500.epochs.feather")

Experimental design. These data are single-trial EEG epochs recorded at 250 digital samples per second from one individual in an auditory “oddball” paradigm. The stimuli are a random sequence of high and low pitched tones (tone: hi, lo) that are frequent or infrequent (stim: standard, target). Stimulus trials are presented in two blocks: the hi tones are the infrequent targets in the first block and frequent standards in the second. The task is to respond to the infrequent tones. In this type of paradigm, the average potentials recorded over central and posterior scalp after about 300 ms post-stimulus are typically more positive going for the rare targets than for the frequent standards, a P300 ERP effect. More information about these and other sample data may be found at mkpy data examples.

Data wrangling. As with any data analysis pipeline, the epochs data must be quality controlled and groomed to final form before modeling. fitgrid ingests tidy pandas.DataFrames directly which allows for convenient data preparation with other pandas-aware toolboxes and easy data interchange with other EEG data processing platforms. A smattering of pandas data transforms are illustrated here.

# select the experimental stimulus trials for modeling
p3_epochs_df = p3_epochs_df.query("stim in ['standard', 'target']")

# rename the time stamp column
p3_epochs_df.rename(columns={"match_time": "time_ms"}, inplace=True)

# data QC flags are set on the stimulus events, look up and select the good epochs
good_epochs = p3_epochs_df.query("time_ms == 0 and log_flags == 0")["epoch_id"]
p3_epochs_df = p3_epochs_df.query("epoch_id in @good_epochs")

# select columns of interest for modeling
indices = ["epoch_id", "time_ms"]
predictors = ["stim", "tone"]  # stim=standard, target; tone=hi, lo
channels = ["MiPf", "MiCe", "MiPa", "MiOc"]  # midline electrodes
p3_epochs_df = p3_epochs_df[indices + predictors + channels]

Note

The epoch_id and time indices must be present in the dataframe index (pandas.MultiIndex) when loading the fitgrid.Epochs in the next step. They are also handy for general purpose epoch and time series data wrangling e.g., with pandas.DataFrame.groupby as shown here.

# set the epoch and time column index for fg.Epochs
p3_epochs_df.set_index(["epoch_id", "time_ms"], inplace=True)

# "baseline", i.e., center each epoch, each channel, on its pre-stimulus interval mean
centered = []
for epoch_id, vals in p3_epochs_df.groupby("epoch_id"):
    centered.append(
        vals[channels] - vals[channels].query("time_ms < 0").mean()
    )
p3_epochs_df[channels] = pd.concat(centered)

# done ...
p3_epochs_df

		stim	tone	MiPf	MiCe	MiPa	MiOc
epoch_id	time_ms
0	-748	target	hi	-6.157753	-25.526524	-44.994408	-23.116379
	-744	target	hi	-8.157753	-32.354649	-48.095970	-20.686691
	-740	target	hi	-7.157753	-31.596836	-40.935814	-16.553879
	-736	target	hi	-12.157753	-30.837070	-35.212181	-14.124190
	-732	target	hi	-8.657753	-24.260899	-26.385033	-13.397628
...	...	...	...	...	...	...	...
391	732	standard	hi	-21.056150	-5.361887	4.670756	1.208691
	736	standard	hi	-15.056150	0.708426	10.639506	7.521191
	740	standard	hi	-21.556150	-4.857981	4.912944	4.364941
	744	standard	hi	-23.056150	-4.350168	4.912944	7.036816
	748	standard	hi	-24.556150	-4.096262	4.194194	8.497753

125250 rows × 6 columns

2. Load into fitgrid.Epochs¶

The fitgrid.Epochs object is a lightweight wrapper around the epochs data that streamlines data validation and model fitting under the hood. The original epochs data are readily available in the fitgrid.Epochs.table if needed.

p3_epochs_fg = fg.epochs_from_dataframe(
    p3_epochs_df,
    epoch_id="epoch_id",
    time="time_ms",
    channels=["MiPf", "MiCe", "MiPa", "MiOc"],
)
p3_epochs_fg

Out:

<fitgrid.epochs.Epochs object at 0x7fd9502e3df0>

3. Fit a model¶

Once the fg.Epochs are in place, fitgrid.lm and fitgrid.lmer methods sweep a model formula across the epoch data at each time and channel and capture the model fits. The model formulas are those already in wide use Python and R. For ordinary least squares (OLS) fitgrid uses patsy formulas which work like lm formulas in R when fit with the statsmodels.formula.api. For linear mixed-effects regression models (LME), fitgrid uses lme4::lmer formulas. Under the hood, the LME formulas are passed from Python to R and the lme4::lmer fits returned back to Python and fitgrid via pymer4 ([Jolly2018]).

Here the patsy formula $\sim \mathsf{1 + stim}$ is used for OLS model fitting with statsmodels.

lmg_1_stim = fg.lm(p3_epochs_fg, RHS="1 + stim")

Out:

  0%|          | 0/375 [00:00<?, ?it/s]
  2%|1         | 6/375 [00:00<00:06, 56.75it/s]
  3%|3         | 13/375 [00:00<00:06, 59.77it/s]
  5%|5         | 20/375 [00:00<00:05, 60.00it/s]
  7%|7         | 27/375 [00:00<00:05, 60.79it/s]
  9%|9         | 34/375 [00:00<00:05, 60.85it/s]
 11%|#         | 41/375 [00:00<00:05, 60.86it/s]
 13%|#2        | 48/375 [00:00<00:05, 61.22it/s]
 15%|#4        | 55/375 [00:00<00:05, 61.35it/s]
 17%|#6        | 62/375 [00:01<00:05, 61.68it/s]
 18%|#8        | 69/375 [00:01<00:04, 61.78it/s]
 20%|##        | 76/375 [00:01<00:04, 62.10it/s]
 22%|##2       | 83/375 [00:01<00:04, 62.00it/s]
 24%|##4       | 90/375 [00:01<00:04, 62.07it/s]
 26%|##5       | 97/375 [00:01<00:04, 62.04it/s]
 28%|##7       | 104/375 [00:01<00:04, 62.17it/s]
 30%|##9       | 111/375 [00:01<00:04, 61.77it/s]
 31%|###1      | 118/375 [00:01<00:04, 61.89it/s]
 33%|###3      | 125/375 [00:02<00:04, 61.96it/s]
 35%|###5      | 132/375 [00:02<00:03, 62.13it/s]
 37%|###7      | 139/375 [00:02<00:03, 61.98it/s]
 39%|###8      | 146/375 [00:02<00:03, 61.99it/s]
 41%|####      | 153/375 [00:02<00:03, 60.71it/s]
 43%|####2     | 160/375 [00:02<00:03, 61.25it/s]
 45%|####4     | 167/375 [00:02<00:03, 61.48it/s]
 46%|####6     | 174/375 [00:02<00:03, 61.70it/s]
 48%|####8     | 181/375 [00:02<00:03, 61.92it/s]
 50%|#####     | 188/375 [00:03<00:03, 61.95it/s]
 52%|#####2    | 195/375 [00:03<00:02, 62.09it/s]
 54%|#####3    | 202/375 [00:03<00:02, 62.04it/s]
 56%|#####5    | 209/375 [00:03<00:02, 62.16it/s]
 58%|#####7    | 216/375 [00:03<00:02, 61.42it/s]
 59%|#####9    | 223/375 [00:03<00:02, 60.96it/s]
 61%|######1   | 230/375 [00:03<00:02, 60.03it/s]
 63%|######3   | 237/375 [00:03<00:02, 60.64it/s]
 65%|######5   | 244/375 [00:03<00:02, 60.88it/s]
 67%|######6   | 251/375 [00:04<00:02, 61.15it/s]
 69%|######8   | 258/375 [00:04<00:01, 61.39it/s]
 71%|#######   | 265/375 [00:04<00:01, 61.69it/s]
 73%|#######2  | 272/375 [00:04<00:01, 61.70it/s]
 74%|#######4  | 279/375 [00:04<00:01, 61.79it/s]
 76%|#######6  | 286/375 [00:04<00:01, 61.75it/s]
 78%|#######8  | 293/375 [00:04<00:01, 61.80it/s]
 80%|########  | 300/375 [00:04<00:01, 61.57it/s]
 82%|########1 | 307/375 [00:04<00:01, 61.82it/s]
 84%|########3 | 314/375 [00:05<00:00, 61.73it/s]
 86%|########5 | 321/375 [00:05<00:00, 61.80it/s]
 87%|########7 | 328/375 [00:05<00:00, 60.82it/s]
 89%|########9 | 335/375 [00:05<00:00, 60.34it/s]
 91%|#########1| 342/375 [00:05<00:00, 60.95it/s]
 93%|#########3| 349/375 [00:05<00:00, 61.12it/s]
 95%|#########4| 356/375 [00:05<00:00, 61.46it/s]
 97%|#########6| 363/375 [00:05<00:00, 61.55it/s]
 99%|#########8| 370/375 [00:06<00:00, 61.78it/s]
100%|##########| 375/375 [00:06<00:00, 61.47it/s]

The times and channels of the fitgrid.Epochs define the fitgrid.FitGrid dimensions. Each cell contains the OLS fit object. In this example, there are 375 $\times$ 4 = 1,500 fits in all.

lmg_1_stim

Out:

375 by 4 LMFitGrid of type <class 'statsmodels.regression.linear_model.RegressionResultsWrapper'>.

Model formulas: good news, bad news. The good news is you can easily do or not do whatever you like to specify models with the patsy and lme4::lmer formula syntax. The formulas are passed through to the fitting algorithms and the results returned are captured in the FitGrid. For instance patsy follows long-standing R formula behavior and includes a constant (intercept) in the model and treatment codes categorical predictors like $\mathsf{stim}$ by default. The alphabetically sorted first condition, standard is the reference level “control”, and target is the “treatment. In this instance, the more compact formula $\sim \mathsf{stim}$ is exactly equivalent $\mathsf{\sim 1 + stim}$ in the sense that they both generate the same design matrix (“right hand side”). fitgrid doesn’t care either way and will return the fits for whatever model formula you pass in. The bad news is that you need to know something about this kind of thing, i.e., how the user-friendly patsy and lme4::lmer model formulas map onto the model design matrices that are actually fit under the hood. This is generally true for using these formula languages in any application so there is no additional learning overhead for fitgrid. For OLS formulas see N. J. Smith’s excellent patsy documentation, Coding categorical data and How formulas work. For lme4::lmer formulas see Section 2. Formula module in [BatesEtAl2015].

4. Using the FitGrid[times, channels]¶

When statsmodels fits an OLS model it returns a Python object loaded with much useful information (see statsmodels Regression Results). The same is true of model fit objects returned by pymer4 after running lme4::lmer in R, albeit with some differences in the terminology and available information. The key point is that whatever statsmodels knows about OLS fits, the FitGrid also knows at each time and channel.

And statsmodels knows a lot:

import pprint as pp

pp.pprint(dir(lmg_1_stim))

Out:

['HC0_se',
 'HC1_se',
 'HC2_se',
 'HC3_se',
 '_HCCM',
 '_abat_diagonal',
 '_cache',
 '_data_attr',
 '_data_in_cache',
 '_get_robustcov_results',
 '_is_nested',
 '_use_t',
 '_wexog_singular_values',
 'aic',
 'bic',
 'bse',
 'centered_tss',
 'compare_f_test',
 'compare_lm_test',
 'compare_lr_test',
 'condition_number',
 'conf_int',
 'conf_int_el',
 'cov_HC0',
 'cov_HC1',
 'cov_HC2',
 'cov_HC3',
 'cov_kwds',
 'cov_params',
 'cov_type',
 'df_model',
 'df_resid',
 'eigenvals',
 'el_test',
 'ess',
 'f_pvalue',
 'f_test',
 'fittedvalues',
 'fvalue',
 'get_influence',
 'get_prediction',
 'get_robustcov_results',
 'influential_epochs',
 'initialize',
 'k_constant',
 'llf',
 'load',
 'model',
 'mse_model',
 'mse_resid',
 'mse_total',
 'nobs',
 'normalized_cov_params',
 'outlier_test',
 'params',
 'plot_adj_rsquared',
 'plot_betas',
 'predict',
 'pvalues',
 'remove_data',
 'resid',
 'resid_pearson',
 'rsquared',
 'rsquared_adj',
 'save',
 'save',
 'scale',
 'ssr',
 'summary',
 'summary2',
 't_test',
 't_test_pairwise',
 'tvalues',
 'uncentered_tss',
 'use_t',
 'wald_test',
 'wald_test_terms',
 'wresid']

Query fit results¶

In Python, the information in a single statsmodels fit object is accessed with the usual Python <object>.<name> syntax. The corresponding information in the FitGrid is accessed the same way. The results are returned in a tidy time x channel dataframe or another FitGrid object that makes it easy to visualize and analyze how the model fit attributes vary over time and among the different data channels.

Warning

Look before you leap when querying the grid. If your data set is large and you query the entire grid for results or model attributes that involve the number of observations, e.g., residuals or the design matrix, you will get back dataframes or FitGrid objects at least as large as your epochs data (time x channel) and for more complex models with categorical variables and interaction effects, perhaps many times larger. If you ask fitgrid to do something that will swamp your computer memory, it will.

Parameter estimates i.e., Smith & Kutas (2015) regression ERPs

lmg_1_stim.params

		MiPf	MiCe	MiPa	MiOc
time_ms
-748	Intercept	0.109282	0.696762	1.515409	1.326050
-748	stim[T.target]	-1.181772	-0.990758	0.150699	-0.606789
-744	Intercept	0.178925	0.826652	1.756546	1.674351
-744	stim[T.target]	-1.251414	-0.910561	-0.147937	-1.152934
-740	Intercept	0.162853	0.631908	1.743079	1.643298
...	...	...	...	...	...
740	stim[T.target]	3.696403	-5.168126	1.485546	6.102576
744	Intercept	-4.185361	0.473903	-1.168932	-1.947963
744	stim[T.target]	3.381390	-4.702868	1.686295	5.703556
748	Intercept	-3.992504	0.575766	-1.010459	-1.750909
748	stim[T.target]	3.762607	-3.834425	2.393401	5.750426

750 rows × 4 columns

Parameter estimate standard errors

lmg_1_stim.bse

		MiPf	MiCe	MiPa	MiOc
time_ms
-748	Intercept	1.742788	0.809416	0.833638	0.713185
-748	stim[T.target]	4.334319	2.013020	2.073261	1.773692
-744	Intercept	1.754761	0.817119	0.850462	0.714704
-744	stim[T.target]	4.364095	2.032177	2.115101	1.777472
-740	Intercept	1.783380	0.831498	0.861130	0.718637
...	...	...	...	...	...
740	stim[T.target]	2.781796	2.121423	2.348166	2.113293
744	Intercept	1.100398	0.851261	0.941740	0.837078
744	stim[T.target]	2.736693	2.117090	2.342111	2.081817
748	Intercept	1.104763	0.853895	0.941035	0.831397
748	stim[T.target]	2.747548	2.123641	2.340357	2.067686

750 rows × 4 columns

Parameter t values

lmg_1_stim.tvalues

		MiPf	MiCe	MiPa	MiOc
time_ms
-748	Intercept	0.062705	0.860821	1.817826	1.859336
-748	stim[T.target]	-0.272655	-0.492175	0.072687	-0.342105
-744	Intercept	0.101965	1.011667	2.065403	2.342718
-744	stim[T.target]	-0.286752	-0.448072	-0.069943	-0.648637
-740	Intercept	0.091317	0.759963	2.024177	2.286688
...	...	...	...	...	...
740	stim[T.target]	1.328783	-2.436160	0.632641	2.887709
744	Intercept	-3.803498	0.556707	-1.241246	-2.327097
744	stim[T.target]	1.235575	-2.221384	0.719990	2.739701
748	Intercept	-3.613902	0.674281	-1.073774	-2.105985
748	stim[T.target]	1.369442	-1.805590	1.022665	2.781092

750 rows × 4 columns

Model log likelihood

lmg_1_stim.llf

	MiPf	MiCe	MiPa	MiOc
time_ms
-748	-1599.464631	-1343.310503	-1353.159034	-1301.035318
-744	-1601.751371	-1346.474007	-1359.832371	-1301.746250
-740	-1607.154826	-1352.300581	-1363.995983	-1303.579119
-736	-1608.634978	-1347.604313	-1356.672990	-1293.004616
-732	-1607.273366	-1341.103740	-1350.276509	-1292.799803
...	...	...	...	...
732	-1450.651299	-1371.821368	-1407.664663	-1372.307257
736	-1451.351070	-1360.297510	-1396.868798	-1363.201989
740	-1451.346381	-1360.829253	-1394.746054	-1359.546736
744	-1445.886664	-1360.146236	-1393.883681	-1354.534586
748	-1447.208928	-1361.178159	-1393.633427	-1352.259827

375 rows × 4 columns

Model AIC

lmg_1_stim.aic

	MiPf	MiCe	MiPa	MiOc
time_ms
-748	3202.929263	2690.621005	2710.318069	2606.070636
-744	3207.502743	2696.948014	2723.664741	2607.492501
-740	3218.309651	2708.601161	2731.991966	2611.158238
-736	3221.269956	2699.208625	2717.345980	2590.009231
-732	3218.546733	2686.207479	2704.553018	2589.599606
...	...	...	...	...
732	2905.302599	2747.642736	2819.329326	2748.614514
736	2906.702141	2724.595019	2797.737596	2730.403978
740	2906.692761	2725.658506	2793.492108	2723.093473
744	2895.773328	2724.292473	2791.767361	2713.069172
748	2898.417856	2726.356318	2791.266854	2708.519654

375 rows × 4 columns

Number of observations per grid cell i.e., the number of epochs being modeled

lmg_1_stim.nobs.astype(int)

	MiPf	MiCe	MiPa	MiOc
time_ms
-748	334	334	334	334
-744	334	334	334	334
-740	334	334	334	334
-736	334	334	334	334
-732	334	334	334	334
...	...	...	...	...
732	334	334	334	334
736	334	334	334	334
740	334	334	334	334
744	334	334	334	334
748	334	334	334	334

375 rows × 4 columns

Note

If you are using an interactive environment like Jupyter Notebook or iPython you can type lmg_1_stim. and press <TAB> to pop up a list of available attributes.

Slice by time and channel¶

The fitted grid can be sliced down to a smaller grid of times and channels using familiar pandas.DataFrame index slicing with labels and the : range operator. As in pandas (but not Python) the range includes the upper bound.

Parameter estimates, all times, two channels

lmg_1_stim[:, ["MiPf", "MiPa"]].params

		MiPf	MiPa
time_ms
-748	Intercept	0.109282	1.515409
-748	stim[T.target]	-1.181772	0.150699
-744	Intercept	0.178925	1.756546
-744	stim[T.target]	-1.251414	-0.147937
-740	Intercept	0.162853	1.743079
...	...	...	...
740	stim[T.target]	3.696403	1.485546
744	Intercept	-4.185361	-1.168932
744	stim[T.target]	3.381390	1.686295
748	Intercept	-3.992504	-1.010459
748	stim[T.target]	3.762607	2.393401

750 rows × 2 columns

Log likelihood from -100 to 300 ms, all channels

lmg_1_stim[-100:300, :].llf

	MiPf	MiCe	MiPa	MiOc
time_ms
-100	-1356.925863	-1289.471325	-1330.455335	-1298.089756
-96	-1349.032696	-1295.167515	-1331.068318	-1312.084420
-92	-1353.260465	-1302.090980	-1336.881148	-1329.071935
-88	-1353.456726	-1310.384752	-1344.357556	-1333.771045
-84	-1346.919034	-1305.551377	-1343.953339	-1343.846322
...	...	...	...	...
284	-1399.203094	-1372.272381	-1380.968771	-1338.619438
288	-1399.592031	-1368.037118	-1374.887853	-1352.015048
292	-1404.400821	-1371.220324	-1374.413941	-1355.388335
296	-1401.927584	-1373.227508	-1370.487670	-1360.517199
300	-1401.217751	-1367.321502	-1362.711704	-1352.752984

101 rows × 4 columns

Query model structure¶

Besides the fit results, fit objects contain model information that can be queried. This example reaches into one cell of the FitGrid at time=0 and channel=MiPa and pulls out the treatment coded design matrix (model right hand side) and column labels for inspection.

Here is the design matrix and its column names

lmg_1_stim[0, "MiPa"].model.exog_names.unstack(-2)

		MiPa
		0	1
time_ms
0	0	Intercept	stim[T.target]

lmg_1_stim[0, "MiPa"].model.exog.unstack(-1)

Out:

/home/runner/work/fitgrid-dev/fitgrid-dev/fitgrid/fitgrid.py:302: FutureWarning: inplace is deprecated and will be removed in a future version.
  temp.index.set_levels(self.epoch_index, level=i, inplace=True)

		MiPa
		0	1
time_ms	epoch_id
0	0	1.0	1.0
	1	1.0	1.0
	2	1.0	1.0
	3	1.0	1.0
	4	1.0	1.0
	...	...	...
	387	1.0	0.0
	388	1.0	0.0
	389	1.0	0.0
	390	1.0	0.0
	391	1.0	0.0

334 rows × 2 columns

Visualize¶

The grid query dataframes are readily visualized with matplotlib and pandas or similiar. Here are simple heatmap and line plotting functions for illustration.

import numpy as np
import matplotlib as mpl
from matplotlib import pyplot as plt


def heatmap_plot(grid_df, title, units=None, aspect=50, **kwargs):
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.set_title(title)
    ax.set(xlabel="Time (ms)", ylabel="Channel")
    xtick_labels = range(-700, 800, 100)

    ax.set_xticks([grid_df.index.get_loc(tick) for tick in xtick_labels])
    ax.set_xticklabels(xtick_labels)

    ax.set_yticks(range(len(grid_df.columns)))
    ax.set_yticklabels(grid_df.columns)

    im = ax.imshow(grid_df.T, interpolation="none", aspect=aspect, **kwargs)
    cb = fig.colorbar(im, ax=ax)
    if units:
        cb.ax.set_title(units)
    ax.axvline(grid_df.index.get_loc(0), color='white')
    fig.tight_layout()
    return fig, ax


def line_plot(grid_df, title, units=None, **kwargs):
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.set_title(title)
    if units:
        ax.set(ylabel=units)
    grid_df.plot(ax=ax, **kwargs)
    fig.tight_layout()
    return fig, ax

Plot parameter estimates. The $\hat{\beta}$ over time are regression ERPs ([SmiKut2015]).

Note that in this model, $\beta_0 \mathbf{1} + \beta_1 X_1 + e$, the categorical stimulus variable is treatment coded, so the estimated intercept rERP $\hat{\beta_0}$ is the average ERP for the frequent standard tones, i.e., the “reference” or “control” condition. The estimated coefficient rERP $\hat{\beta}_{\mathsf{stim[T.target]}}$ quantifies the effect of the stimulus “treatment” relative to the “control”, i.e., the average difference ERP between the standard and target.

The parameter line plot figure shows a positive effect of about $5 - 10 \mu\mathsf{V}$ between 300 and 400 ms over central scalp and sustained over posterior scalp, consistent with the P300 ERP effect typical for this oddball paradigm. The heatmap shows the same data, the positive effect appears as the brighter yellow bands.

params_df = lmg_1_stim.params
bse_df = lmg_1_stim.bse

# name index columns for convenient slicing
for grid_df in [params_df, bse_df]:
    grid_df.index.set_names(["time_ms", "params"], inplace=True)

# look up the default line colors
line_colors = [prop["color"] for prop in mpl.rc_params()["axes.prop_cycle"]]

for param, vals in params_df.groupby("params"):
    vals.reset_index('params', inplace=True)
    times = vals.index.to_numpy()
    _, ax = line_plot(
        vals,
        "Model: 1 + stim\n" + r"$\hat{\beta}$ " + f"{param} (+/- 1 robust SE)",
    )
    # add SE bands
    for cdx, chan in enumerate(channels):
        lco = line_colors[cdx]
        bse = bse_df.query("params==@param")[
            chan
        ]  # look up rse this param, chan
        ax.fill_between(
            times, vals[chan] - bse, vals[chan] + bse, alpha=0.1, color=lco
        )

    ax.axhline(0, color="lightgray", lw=1, zorder=0)
    ax.axvline(0, color="gray", lw=1, zorder=1)
    ax.set(ylim=(-15, 15))
    ax.legend(loc="upper left")

$Model: 1 + stim $\hat{\beta}$ Intercept (+/- 1 robust SE)$
$Model: 1 + stim $\hat{\beta}$ stim[T.target] (+/- 1 robust SE)$

for param, vals in params_df.groupby("params"):
    vals.reset_index('params', drop=True, inplace=True)
    title_str = "Model: 1 + stim\n" + r"$\hat{\beta}$ " + f"{param}"
    fig, ax = heatmap_plot(vals, title_str, vmin=-9, vmax=9)

$Model: 1 + stim $\hat{\beta}$ Intercept$
$Model: 1 + stim $\hat{\beta}$ stim[T.target]$

More examples¶

Query various fit measures and parameter estimates

ssr_lmg_1_stim = lmg_1_stim.ssr  # residual sum of squares
llf_1_stim = lmg_1_stim.llf  # log likelihood
aic_1_stim = lmg_1_stim.aic  # Akiake Information Criterion

params_1_stim = (
    lmg_1_stim.params
)  # estimated parameters (a.k.a. weights, coefficients, betas)
tvals_1_stim = lmg_1_stim.tvalues  # t-values
pvals_1_stim = lmg_1_stim.pvalues  # p-values
bse_lmg_1_stim = lmg_1_stim.bse  # param standard errors
rse_lmg_1_stim = (
    lmg_1_stim.HC0_se
)  # robust param standard errors (one version)

# set parameter pd.MultiIndex names for convenient slicing
for attr_df in [
    params_1_stim,
    tvals_1_stim,
    pvals_1_stim,
    bse_lmg_1_stim,
    rse_lmg_1_stim,
]:
    attr_df.index.set_names(["time_ms", "params"], inplace=True)

Drop the predictor variable and fit the intercept only model: $\sim \mathsf{1}$

For the intercept only model, $\beta_0 \mathbf{1} + e$, the estimated intercept $\hat\beta_{0}$ is average ERP for all trials, collapsed across the stimulus conditions.

lmg_1 = fg.lm(p3_epochs_fg, RHS="1")

params_1 = lmg_1.params
params_1.index.set_names(["time_ms", "params"], inplace=True)
ssr_lmg_1 = lmg_1.ssr
llf_1 = lmg_1.llf
aic_1 = lmg_1.aic

Out:

  0%|          | 0/375 [00:00<?, ?it/s]
  2%|2         | 9/375 [00:00<00:04, 88.55it/s]
  5%|5         | 20/375 [00:00<00:03, 96.61it/s]
  8%|8         | 31/375 [00:00<00:03, 99.78it/s]
 11%|#1        | 42/375 [00:00<00:03, 100.95it/s]
 14%|#4        | 53/375 [00:00<00:03, 101.28it/s]
 17%|#7        | 64/375 [00:00<00:03, 101.76it/s]
 20%|##        | 75/375 [00:00<00:02, 101.74it/s]
 23%|##2       | 86/375 [00:00<00:02, 101.75it/s]
 26%|##5       | 97/375 [00:00<00:02, 102.17it/s]
 29%|##8       | 108/375 [00:01<00:02, 102.07it/s]
 32%|###1      | 119/375 [00:01<00:02, 102.12it/s]
 35%|###4      | 130/375 [00:01<00:02, 102.05it/s]
 38%|###7      | 141/375 [00:02<00:07, 32.41it/s]
 41%|####      | 152/375 [00:02<00:05, 40.82it/s]
 43%|####3     | 163/375 [00:02<00:04, 49.84it/s]
 46%|####6     | 174/375 [00:02<00:03, 58.99it/s]
 49%|####9     | 185/375 [00:02<00:02, 67.54it/s]
 52%|#####2    | 196/375 [00:02<00:02, 75.23it/s]
 55%|#####5    | 207/375 [00:02<00:02, 81.73it/s]
 58%|#####8    | 218/375 [00:02<00:01, 87.07it/s]
 61%|######1   | 229/375 [00:03<00:01, 91.22it/s]
 64%|######4   | 240/375 [00:03<00:01, 94.14it/s]
 67%|######6   | 251/375 [00:03<00:01, 96.38it/s]
 70%|######9   | 262/375 [00:03<00:01, 97.94it/s]
 73%|#######2  | 273/375 [00:03<00:01, 98.79it/s]
 76%|#######5  | 284/375 [00:03<00:00, 99.93it/s]
 79%|#######8  | 295/375 [00:03<00:00, 100.54it/s]
 82%|########1 | 306/375 [00:03<00:00, 100.69it/s]
 85%|########4 | 317/375 [00:03<00:00, 101.30it/s]
 87%|########7 | 328/375 [00:03<00:00, 101.60it/s]
 90%|######### | 339/375 [00:04<00:00, 101.67it/s]
 93%|#########3| 350/375 [00:04<00:00, 101.99it/s]
 96%|#########6| 361/375 [00:04<00:00, 101.99it/s]
 99%|#########9| 372/375 [00:04<00:00, 102.01it/s]
100%|##########| 375/375 [00:04<00:00, 84.30it/s]

plot_chans = ["MiPf", "MiPa", "MiCe", "MiOc"]
colors = [
    prop["color"] for prop in mpl.rc_params()["axes.prop_cycle"]
]  # lookup colors

for param, vals in params_1[plot_chans].groupby("params"):
    vals.reset_index('params', inplace=True)
    times = vals.index.to_numpy()
    _, ax = line_plot(
        vals,
        "Model: ~ 1 \n" + r"$\hat{\beta}$ " + f"{param} (+/- 1 robust SE)",
    )
    # add SE
    for cdx, chan in enumerate(plot_chans):
        color = colors[cdx]

        se = rse_lmg_1_stim.query("params==@param")[chan]
        ax.fill_between(
            times, vals[chan] - se, vals[chan] + se, alpha=0.1, color=color
        )

    ax.axhline(0, color="lightgray", lw=1, zorder=0)
    ax.axvline(0, color="gray", lw=1, zorder=1)
    ax.set(ylim=(-15, 15))
    ax.legend(loc="upper left")

$Model: ~ 1 $\hat{\beta}$ Intercept (+/- 1 robust SE)$

Compare models full $\sim 1 + stim$ vs. reduced $\sim 1$

Comparing log likelihood and AIC is useful for evaluating, diagnosing and interpreting model fit.

The full and reduced model likelihoods heatmaps are barely distinguishable. This is unsurprising since the stimulus variable accounts for only a small fraction of the overall variability in the EEG.

heatmap_plot(
    lmg_1_stim.llf,
    r"Log Likelihood Full Model$\mathsf{\sim 1 + stim}$",
    vmin=-1800,
    vmax=-1200,
)

heatmap_plot(
    lmg_1.llf,
    r"Log Likelihood Reduced Model$\mathsf{\sim 1}$",
    vmin=-1800,
    vmax=-1200,
)

$Log Likelihood Full Model$\mathsf{\sim 1 + stim}$$
$Log Likelihood Reduced Model$\mathsf{\sim 1}$$

Out:

(<Figure size 800x400 with 2 Axes>, <AxesSubplot:title={'center':'Log Likelihood Reduced Model$\\mathsf{\\sim 1}$'}, xlabel='Time (ms)', ylabel='Channel'>)

However, the likelihood ratio (= difference of log likelihoods) shows that the stimulus predictor does account for variability in a way that generally aligns with the time course and channels of the post-stimulus positive stimulus effect.

title_str = r"Likelihood Ratio: Models $\frac{\mathsf{\sim 1 + stim}}{\mathsf{\sim 1}}$"
heatmap_plot(lmg_1_stim.llf - lmg_1.llf, title_str)
_, ax = line_plot(lmg_1_stim.llf - lmg_1.llf, title_str)
ax.set(ylabel="Likelihood Ratio")

$Likelihood Ratio: Models $\frac{\mathsf{\sim 1 + stim}}{\mathsf{\sim 1}}$$
$Likelihood Ratio: Models $\frac{\mathsf{\sim 1 + stim}}{\mathsf{\sim 1}}$$

Out:

[Text(24.0, 0.5, 'Likelihood Ratio')]

For this pair-wise model comparison, AIC tells the same story as the likelihood ratio. AIC measures may also be used to compare and evaulate larger sets of models, not just pairs, e.g., [BurAnd2004].

title_str = (
    r"AIC $\Delta = AIC_{\mathsf{\sim 1}} - AIC_{\mathsf{\sim 1 + stim}}$"
)
heatmap_plot(lmg_1.aic - lmg_1_stim.aic, title_str)

_, ax = line_plot(lmg_1.aic - lmg_1_stim.aic, title_str)
ax.set(ylabel=r"AIC $\Delta$")
yticks = [-2, 0, 2, 4, 7, 10]
ax.set_yticks(yticks)
ax.hlines(
    yticks,
    xmin=times[0],
    xmax=times[-1],
    color="lightgray",
    lw=1.0,
    ls="dotted",
)

$AIC $\Delta = AIC_{\mathsf{\sim 1}} - AIC_{\mathsf{\sim 1 + stim}}$$
$AIC $\Delta = AIC_{\mathsf{\sim 1}} - AIC_{\mathsf{\sim 1 + stim}}$$

Out:

<matplotlib.collections.LineCollection object at 0x7fd94a24f610>

Time domain average ERPs: $\mathsf{\sim 0 + stim}$

Suppressing the model intercept triggers “dummy” (indicator) coded categorical predictor variables. For this model, $\beta_0 X_0 + \beta_1 X_1 + e$, the estimated coefficients $\hat\beta_{i}$ are the average of the trials in each condition, i.e., identical to the sum-and-divide time-domain average ERP.

lmg_0_stim = fg.lm(p3_epochs_fg, RHS="0 + stim")
print(type(lmg_0_stim))
params_0_stim = lmg_0_stim.params
rse_0_stim = lmg_0_stim.HC0_se

for attr_df in [params_0_stim, rse_0_stim]:
    attr_df.index.set_names(["time_ms", "params"], inplace=True)

Out:

  0%|          | 0/375 [00:00<?, ?it/s]
  2%|1         | 6/375 [00:00<00:06, 52.91it/s]
  3%|3         | 13/375 [00:00<00:06, 57.63it/s]
  5%|5         | 19/375 [00:00<00:06, 58.62it/s]
  7%|6         | 26/375 [00:00<00:05, 59.52it/s]
  9%|8         | 33/375 [00:00<00:05, 59.84it/s]
 11%|#         | 40/375 [00:00<00:05, 60.18it/s]
 13%|#2        | 47/375 [00:00<00:05, 60.24it/s]
 14%|#4        | 54/375 [00:00<00:05, 60.39it/s]
 16%|#6        | 61/375 [00:01<00:05, 60.35it/s]
 18%|#8        | 68/375 [00:01<00:05, 60.48it/s]
 20%|##        | 75/375 [00:01<00:04, 60.30it/s]
 22%|##1       | 82/375 [00:01<00:04, 60.33it/s]
 24%|##3       | 89/375 [00:01<00:04, 60.24it/s]
 26%|##5       | 96/375 [00:01<00:04, 60.28it/s]
 27%|##7       | 103/375 [00:01<00:04, 60.28it/s]
 29%|##9       | 110/375 [00:01<00:04, 60.32it/s]
 31%|###1      | 117/375 [00:01<00:04, 60.14it/s]
 33%|###3      | 124/375 [00:02<00:04, 59.86it/s]
 35%|###4      | 130/375 [00:02<00:04, 59.77it/s]
 37%|###6      | 137/375 [00:02<00:03, 59.87it/s]
 38%|###8      | 143/375 [00:02<00:03, 59.87it/s]
 40%|####      | 150/375 [00:02<00:03, 60.06it/s]
 42%|####1     | 157/375 [00:02<00:03, 60.06it/s]
 44%|####3     | 164/375 [00:02<00:03, 60.24it/s]
 46%|####5     | 171/375 [00:02<00:03, 60.24it/s]
 47%|####7     | 178/375 [00:02<00:03, 60.34it/s]
 49%|####9     | 185/375 [00:03<00:03, 60.31it/s]
 51%|#####1    | 192/375 [00:03<00:03, 60.27it/s]
 53%|#####3    | 199/375 [00:03<00:02, 60.20it/s]
 55%|#####4    | 206/375 [00:03<00:02, 60.32it/s]
 57%|#####6    | 213/375 [00:03<00:02, 60.18it/s]
 59%|#####8    | 220/375 [00:03<00:02, 60.27it/s]
 61%|######    | 227/375 [00:03<00:02, 60.27it/s]
 62%|######2   | 234/375 [00:03<00:02, 60.40it/s]
 64%|######4   | 241/375 [00:04<00:02, 60.36it/s]
 66%|######6   | 248/375 [00:04<00:02, 60.36it/s]
 68%|######8   | 255/375 [00:04<00:01, 60.49it/s]
 70%|######9   | 262/375 [00:04<00:01, 60.46it/s]
 72%|#######1  | 269/375 [00:04<00:01, 60.51it/s]
 74%|#######3  | 276/375 [00:04<00:01, 60.46it/s]
 75%|#######5  | 283/375 [00:04<00:01, 60.62it/s]
 77%|#######7  | 290/375 [00:04<00:01, 60.52it/s]
 79%|#######9  | 297/375 [00:04<00:01, 60.59it/s]
 81%|########1 | 304/375 [00:05<00:01, 60.44it/s]
 83%|########2 | 311/375 [00:05<00:01, 60.54it/s]
 85%|########4 | 318/375 [00:05<00:00, 60.42it/s]
 87%|########6 | 325/375 [00:05<00:00, 60.52it/s]
 89%|########8 | 332/375 [00:05<00:00, 60.37it/s]
 90%|######### | 339/375 [00:05<00:00, 60.49it/s]
 92%|#########2| 346/375 [00:05<00:00, 60.49it/s]
 94%|#########4| 353/375 [00:05<00:00, 60.44it/s]
 96%|#########6| 360/375 [00:05<00:00, 60.44it/s]
 98%|#########7| 367/375 [00:06<00:00, 60.50it/s]
100%|#########9| 374/375 [00:06<00:00, 60.46it/s]
100%|##########| 375/375 [00:06<00:00, 60.22it/s]
<class 'fitgrid.fitgrid.LMFitGrid'>

plot_chans = ["MiPf", "MiPa", "MiCe", "MiOc"]
colors = [
    prop["color"] for prop in mpl.rc_params()["axes.prop_cycle"]
]  # lookup colors

# rERPs = model parameter estimates over time
for param, vals in params_0_stim[plot_chans].groupby("params"):
    vals.reset_index('params', inplace=True)
    times = vals.index.to_numpy()
    condition = param.replace("stim[", "").replace("]", "").upper()
    _, ax = line_plot(
        vals,
        "Model: ~ 0 + stim\n"
        + r"OLS $\hat{\beta}$ "
        + f"{param} = average ERP ({condition})",
    )
    # add SE
    for cdx, chan in enumerate(plot_chans):
        color = colors[cdx]
        se = rse_0_stim.query("params==@param")[chan]
        ax.fill_between(
            times, vals[chan] - se, vals[chan] + se, alpha=0.1, color=color
        )

    ax.axhline(0, color="lightgray", lw=1, zorder=0)
    ax.axvline(0, color="gray", lw=1, zorder=1)
    ax.set(ylim=(-15, 15))
    ax.legend(loc="upper left")

$Model: ~ 0 + stim OLS $\hat{\beta}$ stim[standard] = average ERP (STANDARD)$
$Model: ~ 0 + stim OLS $\hat{\beta}$ stim[target] = average ERP (TARGET)$

5. Next steps¶

Linear mixed-effects with lme4::lmer¶

Linear mixed-effects models can be fit to epochs data with fitgrid.lmer and model formulas from the lme4::lmer R library. The FitGrid is populated with the lmer fit results and sliced and accessed the same way as the OLS grids. The mixed-effects fits are much slower in general so parallel processing on multicore hardware is likely a practical necessity.

lmeg = fg.lmer(
    epochs_fg,
    RHS="1 + a + (a | subid) + (a | itemid),
    paralell=True,
    n_cores=12
)

Multicore parallel acceleration¶

A good desktop workstation may have 8 cores, a high performance compute node may have dozens or more. With lmer modeling especially, it may be useful to take advantage of multiple cores and fit models with parallel processes to speed up processing though the performance impact depends on the system and size of the job and parallel jobs may run more slowly.

Run parallel processes by setting parallel to True and n_cores to the desired value (defaults to 4) like so:

lmg = fg.lm(
    epochs_fg,
    RHS="1 + a + b + a:b",
    parallel=True,
    n_cores=6,
)

Note

When working on a shared system it is rude to hog too many cores.

Model sets and comparisons¶

To reduce memory demands when fitting and comparing sets of models, fitgrid provides a convenience function fitgrid.utils.summarize that iteratively fits a series of models and scrapes selected results into lightweight summary data frame with key fit results for model interpretation and comparison and a few helper functions for plotting the summary results. See Model comparisons and summaries.

Model and data diagnostics¶

When the native Python and R fit results include diagnostic information, this can be accessed in the FitGrid like any other attributes. In practice, however, doing so may be intractably slow or memory intensive. There are a few fitgrid convenience utilities for diagnostics (see Model and data diagnostics (WIP)) though it is an area that needs more work.

Total running time of the script: ( 0 minutes 23.748 seconds)

Gallery generated by Sphinx-Gallery