AIC and likelihood ratio model comparison

There are many approaches to comparing models and evaluating relative goodness-of-fit. The FitGrid[times, channels] query mechanism is designed to streamline the computations for whatever approach is deemed appropriate for the research question at hand..

The AIC example demonstrates the fitgrid.utils.summarize function, a convenience wrapper that fits a list of models and returns key summary information for each as an indexed data frame. The summary may be passed to a function for visualizing the AIC \(\mathsf{\Delta_{min}}\) model comparison ([BurAnd2004]) as shown, or processed by custom user routines as needed.

For cases where the summary information is insufficient, the likelihood ratio example illustrates how to compute and visualize a model comparison measure derived from the original fitted grids. Other measures may be computed in a similar manner.

Prepare sample data

import pandas as pd
from matplotlib import pyplot as plt
import fitgrid as fg
from fitgrid import DATA_DIR, sample_data

sample_data.get_file("sub000p3.ms1500.epochs.feather")
p3_epochs_df = pd.read_feather(DATA_DIR / "sub000p3.ms1500.epochs.feather")

# drop calibration pulses for these examples
p3_epochs_df = p3_epochs_df.query("stim != 'cal'")

# look up the data QC flags and select the good epochs
good_epochs = p3_epochs_df.query("match_time == 0 and log_flags == 0")[
    "epoch_id"
]
p3_epochs_df = p3_epochs_df.query("epoch_id in @good_epochs")

# rename the time stamp column
p3_epochs_df.rename(columns={"match_time": "time_ms"}, inplace=True)

# select columns of interest for modeling
indices = ["epoch_id", "time_ms"]
predictors = ["stim", "tone"]  # categorical with 2 levels: standard, target
channels = ["MiPf", "MiCe", "MiPa", "MiOc"]  # midline electrodes
p3_epochs_df = p3_epochs_df[indices + predictors + channels]

# set the epoch and time column index for fg.Epochs
p3_epochs_df.set_index(["epoch_id", "time_ms"], inplace=True)

# "baseline", i.e., center each epoch on the 200 ms pre-stimulus interval
centered = []
for epoch_id, vals in p3_epochs_df.groupby("epoch_id"):
    centered.append(
        vals[channels]
        - vals[channels].query("time_ms >= -200 and time_ms < 0").mean()
    )
p3_epochs_df[channels] = pd.concat(centered)

# slice epochs down to a shorter interval
p3_epochs_df.query("time_ms >= -200 and time_ms <= 600", inplace=True)
p3_epochs_df
stim tone MiPf MiCe MiPa MiOc
epoch_id time_ms
0 -200 target hi -0.380001 20.450624 29.704453 21.073549
-196 target hi 0.119999 19.700624 28.266953 17.432924
-192 target hi -3.880001 11.106874 18.485703 11.112612
-188 target hi -1.380001 11.106874 14.891953 6.261050
-184 target hi 3.119999 9.075624 9.641953 3.343081
... ... ... ... ... ... ... ...
391 584 standard hi -27.759998 -19.758007 -14.340283 -3.182412
588 standard hi -27.759998 -24.312695 -23.644970 -8.283975
592 standard hi -19.259998 -13.181836 -14.102002 2.649619
596 standard hi -19.259998 -13.687696 -15.293408 -0.510537
600 standard hi -22.259998 -12.423047 -12.670362 -1.239053

67134 rows × 6 columns



Ingest as fitgrid.Epochs

p3_epochs_fg = fg.epochs_from_dataframe(
    p3_epochs_df, epoch_id="epoch_id", time="time_ms", channels=channels
)

Model summaries

The fitgrid.utils.summarize function is a convenience wrapper that fits a list of models and collects a few key summary measures into a single dataframe indexed by model, beta estimate, and the measure. It supports OLS and LME model fitting and the summaries are returned are in the same format.

This experimental design is a fully crossed 2 (stim: standard, target) \(\times\) 2 (tone: hi, lo). The predictors are treatment coded (patsy default).

Here is a stack of 5 models to summarize and compare:

rhss_T = [
    "1 + stim + tone + stim:tone",  # long form of "stim * tone"
    "1 + stim + tone",
    "1 + stim",
    "1 + tone",
    "1",
]
from fitgrid.utils.summary import summarize

lm_summary_T = summarize(
    p3_epochs_fg,
    modeler="lm",
    RHS=rhss_T,
    LHS=channels,
    parallel=False,
    quiet=True,
)
lm_summary_T

Out:

/home/runner/work/fitgrid/fitgrid/fitgrid/utils/summary.py:145: FutureWarning: fitgrid summaries are in early days, subject to change
  warnings.warn(
MiPf MiCe MiPa MiOc
time_ms model beta key
-200 1 + stim + tone + stim:tone Intercept 2.5_ci -0.966986 -3.30788 -3.08549 -1.96038
97.5_ci 2.15727 0.814112 1.42294 2.0828
AIC 2453.06 2638.19 2698.06 2625.3
DF 330 330 330 330
Estimate 0.595141 -1.24688 -0.831273 0.0612079
... ... ... ... ... ... ... ...
600 1 Intercept T-stat -3.65018 -1.38863 1.23162 1.59486
has_warning False False False False
logLike -1325.39 -1376.77 -1397.84 -1347.49
sigma2 164.281 223.469 253.519 187.524
warnings

31356 rows × 4 columns



AIC model comparison: \(\Delta_{i} = \mathsf{AIC}_{i} - \mathsf{AIC_{min}}\)

Akiakie’s information criterion (AIC) increases with residual error and the number of model parameters so comparison on AIC favors the better fitting, more parsimonious models with lower AIC values.

This example visualizes the channel-wise time course of \(\Delta_{i} = \mathsf{AIC}_{i} - \mathsf{AIC_{min}}\), a measure of the AIC of model i vs. the lowest AIC of any model in the set. Burnham and Anderson propose heuristics where models with \(\Delta_{i}\) around 4 are less well supported by the data than the alternative(s) and models with \(\Delta_{i}\) > 7 substantially less so.

In the next figure, the line plots (left column) and raster plots (right column) show the same data in different ways. Higher amplitude line plots and corresponding darker shades of blue in the raster plot indicate that the model’s AIC is higher than the best candidate in the set.

Prestimulus. Prior to stimulus onset at time = 0, the more parsimonious models (bottom three rows) have systematically lower AIC values (broad regions of lighest blue) than the more complex models (top two rows). This indicates that during this interval, the predictor variables alone and in combination do not soak up enough variability to offset the penalty for increasing the model complexity. In terms of this AIC measure, none of the models appear systematically better supported by the data than the null model (bottom row) in the prestimulus interval.

Post-stimulus. In the interval between around 300 - 375 ms poststimulus, the full model that includes stim and tone predictors and their interaction has the minimum AIC among the models compared at all channels except the prefrontal MiPf. The sharp transients in the magntiude of the AIC differences (> 7) at these channels in this interval indicates substantially less support for the alternative models.

from fitgrid.utils.summary import plot_AICmin_deltas

fig, axs = plot_AICmin_deltas(lm_summary_T, figsize=(12, 12))
fig.tight_layout()
for ax_row in range(len(axs)):
    axs[ax_row, 0].set(ylim=(0, 50))
1 + stim + tone + stim:tone, 1 + stim + tone, 1 + stim, 1 + tone, 1

Likelihood Ratio

This example fits a full and reduced model, computes and then visualizes the time course of likelihood ratios with a few lines of code.

Fit the full model. The log likelihood dataframe is returned by querying the FitGrid[times, channels].

lmg_full = fg.lm(p3_epochs_fg, RHS="1 + stim + tone + stim:tone", LHS=channels)
lmg_full.llf

Out:

  0%|          | 0/201 [00:00<?, ?it/s]
  2%|1         | 4/201 [00:00<00:05, 34.36it/s]
  4%|3         | 8/201 [00:00<00:05, 35.85it/s]
  6%|5         | 12/201 [00:00<00:05, 36.01it/s]
  8%|7         | 16/201 [00:00<00:05, 36.34it/s]
 10%|9         | 20/201 [00:00<00:04, 36.80it/s]
 12%|#1        | 24/201 [00:00<00:04, 36.64it/s]
 14%|#3        | 28/201 [00:00<00:04, 36.82it/s]
 16%|#5        | 32/201 [00:00<00:04, 36.86it/s]
 18%|#7        | 36/201 [00:00<00:04, 36.72it/s]
 20%|#9        | 40/201 [00:01<00:04, 36.71it/s]
 22%|##1       | 44/201 [00:01<00:04, 36.58it/s]
 24%|##3       | 48/201 [00:01<00:04, 36.86it/s]
 26%|##5       | 52/201 [00:01<00:04, 36.95it/s]
 28%|##7       | 56/201 [00:01<00:03, 37.03it/s]
 30%|##9       | 60/201 [00:01<00:03, 36.99it/s]
 32%|###1      | 64/201 [00:01<00:03, 36.73it/s]
 34%|###3      | 68/201 [00:01<00:03, 36.94it/s]
 36%|###5      | 72/201 [00:01<00:03, 37.08it/s]
 38%|###7      | 76/201 [00:02<00:03, 37.10it/s]
 40%|###9      | 80/201 [00:02<00:03, 37.27it/s]
 42%|####1     | 84/201 [00:02<00:03, 37.41it/s]
 44%|####3     | 88/201 [00:02<00:03, 37.36it/s]
 46%|####5     | 92/201 [00:02<00:02, 37.29it/s]
 48%|####7     | 96/201 [00:02<00:02, 37.33it/s]
 50%|####9     | 100/201 [00:02<00:02, 37.22it/s]
 52%|#####1    | 104/201 [00:02<00:02, 37.35it/s]
 54%|#####3    | 108/201 [00:02<00:02, 37.09it/s]
 56%|#####5    | 112/201 [00:03<00:02, 36.98it/s]
 58%|#####7    | 116/201 [00:03<00:02, 36.99it/s]
 60%|#####9    | 120/201 [00:03<00:02, 37.22it/s]
 62%|######1   | 124/201 [00:03<00:02, 37.23it/s]
 64%|######3   | 128/201 [00:03<00:01, 37.06it/s]
 66%|######5   | 132/201 [00:03<00:01, 36.57it/s]
 68%|######7   | 136/201 [00:03<00:01, 36.64it/s]
 70%|######9   | 140/201 [00:04<00:03, 16.01it/s]
 72%|#######1  | 144/201 [00:04<00:02, 19.32it/s]
 74%|#######3  | 148/201 [00:04<00:02, 22.58it/s]
 76%|#######5  | 152/201 [00:04<00:01, 25.66it/s]
 78%|#######7  | 156/201 [00:04<00:01, 28.28it/s]
 80%|#######9  | 160/201 [00:04<00:01, 30.52it/s]
 82%|########1 | 164/201 [00:04<00:01, 32.45it/s]
 84%|########3 | 168/201 [00:05<00:00, 33.92it/s]
 86%|########5 | 172/201 [00:05<00:00, 34.91it/s]
 88%|########7 | 176/201 [00:05<00:00, 35.77it/s]
 90%|########9 | 180/201 [00:05<00:00, 36.29it/s]
 92%|#########1| 184/201 [00:05<00:00, 36.74it/s]
 94%|#########3| 188/201 [00:05<00:00, 37.11it/s]
 96%|#########5| 192/201 [00:05<00:00, 37.13it/s]
 98%|#########7| 196/201 [00:05<00:00, 37.21it/s]
100%|#########9| 200/201 [00:05<00:00, 37.29it/s]
100%|##########| 201/201 [00:05<00:00, 34.14it/s]
MiPf MiCe MiPa MiOc
time_ms
-200 -1222.532134 -1315.097134 -1345.027696 -1308.649342
-196 -1219.015289 -1309.332110 -1343.317795 -1307.756720
-192 -1217.569942 -1299.470975 -1337.332536 -1313.580720
-188 -1203.189206 -1284.976803 -1330.065186 -1304.068471
-184 -1176.462814 -1277.721413 -1331.545780 -1303.852666
... ... ... ... ...
584 -1311.148454 -1361.863041 -1387.438445 -1340.639807
588 -1307.528658 -1365.721171 -1386.531566 -1332.786442
592 -1313.377251 -1370.874212 -1389.435160 -1333.903770
596 -1318.216984 -1373.834763 -1393.517831 -1343.739941
600 -1323.657999 -1373.309777 -1388.665581 -1337.706494

201 rows × 4 columns



Fit the reduced model likewise.

lmg_reduced = fg.lm(p3_epochs_fg, RHS="1 + stim + tone", LHS=channels)
lmg_reduced.llf

Out:

  0%|          | 0/201 [00:00<?, ?it/s]
  2%|1         | 4/201 [00:00<00:05, 36.63it/s]
  4%|3         | 8/201 [00:00<00:05, 38.27it/s]
  6%|5         | 12/201 [00:00<00:04, 38.94it/s]
  8%|8         | 17/201 [00:00<00:04, 39.48it/s]
 10%|#         | 21/201 [00:00<00:04, 39.50it/s]
 12%|#2        | 25/201 [00:00<00:04, 39.38it/s]
 14%|#4        | 29/201 [00:00<00:04, 39.10it/s]
 16%|#6        | 33/201 [00:00<00:04, 39.05it/s]
 18%|#8        | 37/201 [00:00<00:04, 38.94it/s]
 20%|##        | 41/201 [00:01<00:04, 39.12it/s]
 23%|##2       | 46/201 [00:01<00:03, 39.57it/s]
 25%|##5       | 51/201 [00:01<00:03, 39.78it/s]
 28%|##7       | 56/201 [00:01<00:03, 39.98it/s]
 30%|###       | 61/201 [00:01<00:03, 40.06it/s]
 33%|###2      | 66/201 [00:01<00:03, 40.03it/s]
 35%|###5      | 71/201 [00:01<00:03, 39.81it/s]
 37%|###7      | 75/201 [00:01<00:03, 39.84it/s]
 39%|###9      | 79/201 [00:01<00:03, 39.78it/s]
 41%|####1     | 83/201 [00:02<00:02, 39.83it/s]
 43%|####3     | 87/201 [00:02<00:02, 39.80it/s]
 46%|####5     | 92/201 [00:02<00:02, 40.07it/s]
 48%|####8     | 97/201 [00:02<00:02, 39.97it/s]
 51%|#####     | 102/201 [00:02<00:02, 40.17it/s]
 53%|#####3    | 107/201 [00:02<00:02, 40.13it/s]
 56%|#####5    | 112/201 [00:02<00:02, 39.69it/s]
 58%|#####7    | 116/201 [00:02<00:02, 39.59it/s]
 60%|#####9    | 120/201 [00:03<00:02, 39.42it/s]
 62%|######2   | 125/201 [00:03<00:01, 39.66it/s]
 65%|######4   | 130/201 [00:03<00:01, 39.88it/s]
 67%|######6   | 134/201 [00:03<00:01, 39.65it/s]
 69%|######8   | 138/201 [00:03<00:01, 39.69it/s]
 71%|#######1  | 143/201 [00:03<00:01, 39.82it/s]
 73%|#######3  | 147/201 [00:03<00:01, 39.78it/s]
 75%|#######5  | 151/201 [00:03<00:01, 39.56it/s]
 78%|#######7  | 156/201 [00:03<00:01, 39.66it/s]
 80%|########  | 161/201 [00:04<00:01, 39.88it/s]
 83%|########2 | 166/201 [00:04<00:00, 40.03it/s]
 85%|########5 | 171/201 [00:04<00:00, 40.16it/s]
 88%|########7 | 176/201 [00:04<00:00, 40.06it/s]
 90%|######### | 181/201 [00:04<00:00, 40.11it/s]
 93%|#########2| 186/201 [00:04<00:00, 40.16it/s]
 95%|#########5| 191/201 [00:04<00:00, 39.85it/s]
 97%|#########7| 195/201 [00:04<00:00, 39.85it/s]
100%|#########9| 200/201 [00:05<00:00, 39.87it/s]
100%|##########| 201/201 [00:05<00:00, 39.73it/s]
MiPf MiCe MiPa MiOc
time_ms
-200 -1222.598161 -1315.857038 -1345.655697 -1310.202354
-196 -1219.035644 -1309.862783 -1344.004489 -1309.552813
-192 -1217.708926 -1300.291720 -1338.389544 -1315.676625
-188 -1203.608227 -1286.117888 -1331.449561 -1305.668078
-184 -1176.544845 -1278.648741 -1332.709155 -1304.580230
... ... ... ... ...
584 -1311.224343 -1361.863881 -1387.594031 -1341.033271
588 -1307.935020 -1365.731574 -1386.761340 -1333.356553
592 -1313.604190 -1370.888442 -1389.681551 -1334.510610
596 -1318.389052 -1373.835473 -1393.602722 -1344.138936
600 -1324.154922 -1373.313601 -1388.716013 -1338.103694

201 rows × 4 columns



Calculate. The likelihood ratio is the difference of the log likelihoods.

MiPf MiCe MiPa MiOc
time_ms
-200 0.066027 0.759904 0.628001 1.553012
-196 0.020355 0.530672 0.686694 1.796092
-192 0.138984 0.820745 1.057008 2.095905
-188 0.419021 1.141085 1.384376 1.599607
-184 0.082031 0.927328 1.163375 0.727564
... ... ... ... ...
584 0.075889 0.000840 0.155586 0.393464
588 0.406362 0.010402 0.229774 0.570111
592 0.226939 0.014231 0.246391 0.606841
596 0.172068 0.000710 0.084891 0.398995
600 0.496923 0.003825 0.050432 0.397200

201 rows × 4 columns



Visualize. This comparison shows that stimulus x tone interaction term in the model has little systematic impact on the goodness-of-fit as given by the likelihood except around 300 - 375 ms poststimulus, largest over central scalp (MiCe).

fig, ax = plt.subplots(figsize=(12, 3))

# render
im = ax.imshow(likelihood_ratio.T, interpolation="none", aspect=16)
cb = fig.colorbar(im, ax=ax)

# label
ax.set_title("Likelihood ratio")
ax.set(xlabel="Time (ms)", ylabel="Channel")

xtick_labels = range(-200, 600, 100)
ax.set_xticks([likelihood_ratio.index.get_loc(tick) for tick in xtick_labels])
ax.set_xticklabels(xtick_labels)

ax.set_yticks(range(len(likelihood_ratio.columns)))
ax.set_yticklabels(likelihood_ratio.columns)
fig.tight_layout()
Likelihood ratio
fig, ax = plt.subplots(figsize=(8, 3))
ax.set_title("Likelihood Ratio")
_ = (lmg_full.llf - lmg_reduced.llf).plot(ax=ax)
Likelihood Ratio

Total running time of the script: ( 0 minutes 45.731 seconds)

Gallery generated by Sphinx-Gallery