Time domain averages (wide)

Read and check the epochs

[1]:
import pandas as pd
from spudtr import epf
from spudtr import get_demo_df, DATA_DIR, P3_1500_FEATHER

epochs_df = get_demo_df(P3_1500_FEATHER)
eeg_channels = ['MiPf', 'MiCe', 'MiPa', 'MiOc']

epf.check_epochs(epochs_df, eeg_channels, epoch_id="epoch_id", time="time_ms")
epochs_df
[1]:
epoch_id time_ms sub_id eeg_artifact dblock_path log_evcodes log_ccodes dblock_srate ccode instrument ... RMOc LLTe RLTe LLOc RLOc MiOc A2 HEOG rle rhz
0 0 -748 sub000 0 sub000/dblock_0 0 0 250.0 1 eeg ... -25.093750 -0.753906 1.480469 -13.414062 -18.937500 -17.734375 5.660156 98.875000 -39.500000 38.375000
1 0 -744 sub000 0 sub000/dblock_0 0 0 250.0 1 eeg ... -24.593750 0.502441 -2.466797 -17.640625 -17.468750 -15.304688 1.968750 104.750000 -38.031250 41.281250
2 0 -740 sub000 0 sub000/dblock_0 0 0 250.0 1 eeg ... -16.484375 -1.507812 3.947266 -15.648438 -10.085938 -11.171875 8.367188 102.062500 -33.656250 43.718750
3 0 -736 sub000 0 sub000/dblock_0 0 0 250.0 1 eeg ... -11.804688 -15.070312 9.867188 -14.906250 -7.378906 -8.742188 9.351562 100.562500 -42.906250 37.406250
4 0 -732 sub000 0 sub000/dblock_0 0 0 250.0 1 eeg ... -6.394531 -4.019531 9.125000 -10.679688 -6.886719 -8.015625 8.125000 98.375000 -43.875000 37.906250
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
224995 600 732 sub000 0 sub000/dblock_4 0 0 250.0 0 cal ... -4.671875 -3.517578 -4.441406 -4.718750 -4.671875 -3.400391 -4.429688 -4.406250 -3.900391 -4.371094
224996 600 736 sub000 0 sub000/dblock_4 0 0 250.0 0 cal ... -4.179688 -4.019531 -4.195312 -4.222656 -4.425781 -3.644531 -4.429688 -4.160156 -3.412109 -4.371094
224997 600 740 sub000 0 sub000/dblock_4 0 0 250.0 0 cal ... -4.425781 -3.767578 -4.441406 -3.974609 -4.425781 -3.400391 -4.429688 -4.160156 -3.900391 -4.859375
224998 600 744 sub000 0 sub000/dblock_4 0 0 250.0 0 cal ... -4.425781 -4.269531 -4.195312 -4.222656 -4.425781 -3.886719 -4.429688 -4.406250 -3.900391 -4.371094
224999 600 748 sub000 0 sub000/dblock_4 0 0 250.0 0 cal ... -4.179688 -4.019531 -3.947266 -4.222656 -4.179688 -3.400391 -4.183594 -4.406250 -3.412109 -4.371094

225000 rows × 47 columns

Group by time to compute the time-domain average of all epochs and select columns of interest

[2]:
grand_wide = epochs_df.groupby(['time_ms']).mean()[eeg_channels]
grand_wide.columns.name = 'channel'
grand_wide
[2]:
channel MiPf MiCe MiPa MiOc
time_ms
-748 -0.647500 -0.818429 -0.650280 -1.128954
-744 -0.590833 -0.838763 -0.648749 -1.025912
-740 -0.569167 -0.987738 -0.715166 -1.047582
-736 -0.600000 -1.013976 -0.672859 -0.980162
-732 -0.767500 -1.069512 -0.705796 -0.867579
... ... ... ... ...
732 1.345833 -0.855422 -1.573245 -1.943028
736 1.138333 -0.999023 -1.762801 -2.063682
740 0.985000 -1.031177 -1.794903 -2.081421
744 0.877500 -1.011374 -1.770285 -2.010948
748 0.866667 -0.863825 -1.608073 -1.863831

375 rows × 4 columns

Group by time and other columns to compute the average of subsets of epochs

[3]:
subsets_wide = epochs_df.groupby(["time_ms", "stim"]).mean()[eeg_channels]
subsets_wide.columns.name = "channel"
subsets_wide
[3]:
channel MiPf MiCe MiPa MiOc
time_ms stim
-748 cal -4.317307 -3.857553 -4.073911 -4.143690
standard 1.419520 0.651240 0.729152 0.773548
target 0.950000 1.211514 2.442932 -0.413608
-744 cal -4.329327 -3.851473 -4.114043 -4.118098
standard 1.493151 0.686627 0.894989 1.060941
... ... ... ... ... ...
744 standard -1.669520 0.564905 -2.007408 -2.380829
target 19.059999 0.232344 3.668494 3.395762
748 cal -4.305288 -3.828247 -4.077405 -4.089811
standard -1.566781 0.687771 -1.867695 -2.206024
target 18.730000 0.771514 4.286233 3.765410

1125 rows × 4 columns

Time-domain averages (long)

[4]:
subsets_long = subsets_wide.stack()  # pivot the channel columns into one long column
subsets_long.name = "microvolts"
pd.DataFrame(subsets_long)
[4]:
microvolts
time_ms stim channel
-748 cal MiPf -4.317307
MiCe -3.857553
MiPa -4.073911
MiOc -4.143690
standard MiPf 1.419520
... ... ... ...
748 standard MiOc -2.206024
target MiPf 18.730000
MiCe 0.771514
MiPa 4.286233
MiOc 3.765410

4500 rows × 1 columns

Time interval measurments

Interval measurments use the “slice-groupby-apply” pattern.

  • slice the time interval rows

  • group by epoch_id and other tags

  • apply the measurment function to the data, e.g., pandas built-in or user-defined

Start by doing the steps separately to verify.

When the steps are right, chain them for compact expression.

Example: single trial mean amplitude

  1. Load the epochs

[5]:
eeg_channels = ["MiPf", "MiCe", "MiPa", "MiOc"]

epochs_df = get_demo_df(P3_1500_FEATHER).query('stim in ["target", "standard"]')
epf.check_epochs(epochs_df, eeg_channels, epoch_id="epoch_id", time="time_ms")
epochs_df
[5]:
epoch_id time_ms sub_id eeg_artifact dblock_path log_evcodes log_ccodes dblock_srate ccode instrument ... RMOc LLTe RLTe LLOc RLOc MiOc A2 HEOG rle rhz
0 0 -748 sub000 0 sub000/dblock_0 0 0 250.0 1 eeg ... -25.093750 -0.753906 1.480469 -13.414062 -18.937500 -17.734375 5.660156 98.875000 -39.500000 38.375000
1 0 -744 sub000 0 sub000/dblock_0 0 0 250.0 1 eeg ... -24.593750 0.502441 -2.466797 -17.640625 -17.468750 -15.304688 1.968750 104.750000 -38.031250 41.281250
2 0 -740 sub000 0 sub000/dblock_0 0 0 250.0 1 eeg ... -16.484375 -1.507812 3.947266 -15.648438 -10.085938 -11.171875 8.367188 102.062500 -33.656250 43.718750
3 0 -736 sub000 0 sub000/dblock_0 0 0 250.0 1 eeg ... -11.804688 -15.070312 9.867188 -14.906250 -7.378906 -8.742188 9.351562 100.562500 -42.906250 37.406250
4 0 -732 sub000 0 sub000/dblock_0 0 0 250.0 1 eeg ... -6.394531 -4.019531 9.125000 -10.679688 -6.886719 -8.015625 8.125000 98.375000 -43.875000 37.906250
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
146995 391 732 sub000 0 sub000/dblock_3 0 0 250.0 1 eeg ... 9.593750 10.804688 0.000000 4.472656 1.967773 4.617188 -3.937500 -9.296875 -10.242188 -2.429688
146996 391 736 sub000 0 sub000/dblock_3 0 0 250.0 1 eeg ... 15.250000 15.578125 8.632812 10.429688 8.609375 10.929688 -0.246094 -7.343750 -7.800781 2.914062
146997 391 740 sub000 0 sub000/dblock_3 0 0 250.0 1 eeg ... 11.070312 11.554688 4.195312 7.203125 6.886719 7.773438 -4.429688 -7.832031 -13.648438 -2.429688
146998 391 744 sub000 0 sub000/dblock_3 0 0 250.0 1 eeg ... 13.039062 6.781250 6.414062 9.187500 10.578125 10.445312 0.246094 -8.320312 -8.773438 -1.457031
146999 391 748 sub000 0 sub000/dblock_3 0 0 250.0 1 eeg ... 13.531250 8.539062 10.359375 11.671875 13.531250 11.906250 3.937500 -8.320312 -10.242188 0.485840

147000 rows × 47 columns

  1. (optional) select the data columns of interest or skip this and use them all.

[6]:
coi = ["epoch_id", "time_ms", "stim", "MiPf", "MiCe", "MiPa", "MiOc"]
mid = epochs_df[coi]  # select columns of interest

display(mid.head())
display(mid.tail())
epoch_id time_ms stim MiPf MiCe MiPa MiOc
0 0 -748 target -54.5 2.781250 -8.828125 -17.734375
1 0 -744 target -56.5 -4.046875 -11.929688 -15.304688
2 0 -740 target -55.5 -3.289062 -4.769531 -11.171875
3 0 -736 target -60.5 -2.529297 0.954102 -8.742188
4 0 -732 target -57.0 4.046875 9.781250 -8.015625
epoch_id time_ms stim MiPf MiCe MiPa MiOc
146995 391 732 standard 9.5 3.289062 15.265625 4.617188
146996 391 736 standard 15.5 9.359375 21.234375 10.929688
146997 391 740 standard 9.0 3.792969 15.507812 7.773438
146998 391 744 standard 7.5 4.300781 15.507812 10.445312
146999 391 748 standard 6.0 4.554688 14.789062 11.906250
  1. Slice the time interval data sample (rows) to measure and verify by inspection

[7]:
mid_300_500 = mid.query("time_ms >= 300 and time_ms <= 500")

display(mid_300_500.head())
display(mid_300_500.tail())
display(mid_300_500["time_ms"].min(), mid_300_500["time_ms"].max())
epoch_id time_ms stim MiPf MiCe MiPa MiOc
262 0 300 target -46.0 69.5625 77.5000 29.640625
263 0 304 target -41.5 78.4375 82.7500 33.531250
264 0 308 target -39.0 83.1875 84.4375 33.031250
265 0 312 target -39.5 81.9375 82.3125 29.875000
266 0 316 target -36.5 82.6875 83.2500 31.828125
epoch_id time_ms stim MiPf MiCe MiPa MiOc
146933 391 484 standard 14.5 3.541016 5.726562 5.585938
146934 391 488 standard 13.0 -4.300781 -5.484375 -1.943359
146935 391 492 standard 8.5 -6.578125 -10.015625 -1.214844
146936 391 496 standard 6.5 -11.382812 -14.789062 -0.971680
146937 391 500 standard -1.5 -21.250000 -21.937500 -2.429688
300
500
  1. Group by epoch_id, i.e., single trial, and other column labels to preserve them, and apply the built-in mean() function.

Note the time_ms timestamps is just another column of data and also averaged in the interval.

Note pandas.Dataframe has dozens of built-in stats functions besides mean: max(), min(), std(), var(), …

[8]:
mid_300_500_mna = mid_300_500.groupby(["epoch_id", "stim"]).mean()

display(mid_300_500_mna.head(), mid_300_500_mna.tail())
time_ms MiPf MiCe MiPa MiOc
epoch_id stim
0 target 400.0 -42.176472 42.853859 52.521751 12.860375
1 target 400.0 -14.617647 41.024815 42.625919 6.120811
2 target 400.0 -7.186275 24.073071 31.395679 13.836741
3 target 400.0 -16.911764 20.560892 26.349571 16.461147
4 target 400.0 13.039216 27.078394 22.416552 5.758588
time_ms MiPf MiCe MiPa MiOc
epoch_id stim
387 standard 400.0 13.911765 20.714920 22.612095 4.010857
388 standard 400.0 24.696079 -4.259727 0.098336 -2.500594
389 standard 400.0 27.578432 11.336646 14.499387 6.349677
390 standard 400.0 36.323528 -0.446557 3.442656 -0.209578
391 standard 400.0 11.490196 -12.447074 -6.483092 -2.429051
  1. The epoch interval measurements are new data, re-label them appropriately.

[9]:
# drop the no longer meaningful time_ms column
mid_300_500_mna = mid_300_500_mna.drop("time_ms", axis=1)

# describe the type of measurment and interval
mid_300_500_mna["measure"] = "mna"
mid_300_500_mna["interval"] = "300_500"

display(mid_300_500_mna.head(), mid_300_500_mna.tail())
MiPf MiCe MiPa MiOc measure interval
epoch_id stim
0 target -42.176472 42.853859 52.521751 12.860375 mna 300_500
1 target -14.617647 41.024815 42.625919 6.120811 mna 300_500
2 target -7.186275 24.073071 31.395679 13.836741 mna 300_500
3 target -16.911764 20.560892 26.349571 16.461147 mna 300_500
4 target 13.039216 27.078394 22.416552 5.758588 mna 300_500
MiPf MiCe MiPa MiOc measure interval
epoch_id stim
387 standard 13.911765 20.714920 22.612095 4.010857 mna 300_500
388 standard 24.696079 -4.259727 0.098336 -2.500594 mna 300_500
389 standard 27.578432 11.336646 14.499387 6.349677 mna 300_500
390 standard 36.323528 -0.446557 3.442656 -0.209578 mna 300_500
391 standard 11.490196 -12.447074 -6.483092 -2.429051 mna 300_500
  1. (optional) Export the measurements data

[10]:
mid_300_500_mna.reset_index().to_feather(DATA_DIR / "p3_mid_mna_300_500.feather")
  1. Chaining: All of the above, simplified by chaining. The results are verifiably identical.

[11]:
coi = ["epoch_id", "time_ms", "stim", "MiPf", "MiCe", "MiPa", "MiOc"]

# slice-groupby-apply
mid_300_500_mna_c  = (
    epochs_df[coi]
    .query("time_ms >= 300 and time_ms <= 500")
    .groupby(["epoch_id", "stim"])
    .mean()
    .drop("time_ms", axis=1)
)

# describe the type of measurment and interval
mid_300_500_mna_c["measure"] = "mna"
mid_300_500_mna_c["interval"] = "300_500"

# verify steps and chained agree
assert all(mid_300_500_mna_c == mid_300_500_mna)

# export
mid_300_500_mna_c.reset_index().to_feather(DATA_DIR / "p3_mid_mna_300_500.feather")