Time domain averages (wide)
Read and check the epochs
[1]:
import pandas as pd
from spudtr import epf
from spudtr import get_demo_df, DATA_DIR, P3_1500_FEATHER
epochs_df = get_demo_df(P3_1500_FEATHER)
eeg_channels = ['MiPf', 'MiCe', 'MiPa', 'MiOc']
epf.check_epochs(epochs_df, eeg_channels, epoch_id="epoch_id", time="time_ms")
epochs_df
[1]:
epoch_id | time_ms | sub_id | eeg_artifact | dblock_path | log_evcodes | log_ccodes | dblock_srate | ccode | instrument | ... | RMOc | LLTe | RLTe | LLOc | RLOc | MiOc | A2 | HEOG | rle | rhz | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | -748 | sub000 | 0 | sub000/dblock_0 | 0 | 0 | 250.0 | 1 | eeg | ... | -25.093750 | -0.753906 | 1.480469 | -13.414062 | -18.937500 | -17.734375 | 5.660156 | 98.875000 | -39.500000 | 38.375000 |
1 | 0 | -744 | sub000 | 0 | sub000/dblock_0 | 0 | 0 | 250.0 | 1 | eeg | ... | -24.593750 | 0.502441 | -2.466797 | -17.640625 | -17.468750 | -15.304688 | 1.968750 | 104.750000 | -38.031250 | 41.281250 |
2 | 0 | -740 | sub000 | 0 | sub000/dblock_0 | 0 | 0 | 250.0 | 1 | eeg | ... | -16.484375 | -1.507812 | 3.947266 | -15.648438 | -10.085938 | -11.171875 | 8.367188 | 102.062500 | -33.656250 | 43.718750 |
3 | 0 | -736 | sub000 | 0 | sub000/dblock_0 | 0 | 0 | 250.0 | 1 | eeg | ... | -11.804688 | -15.070312 | 9.867188 | -14.906250 | -7.378906 | -8.742188 | 9.351562 | 100.562500 | -42.906250 | 37.406250 |
4 | 0 | -732 | sub000 | 0 | sub000/dblock_0 | 0 | 0 | 250.0 | 1 | eeg | ... | -6.394531 | -4.019531 | 9.125000 | -10.679688 | -6.886719 | -8.015625 | 8.125000 | 98.375000 | -43.875000 | 37.906250 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
224995 | 600 | 732 | sub000 | 0 | sub000/dblock_4 | 0 | 0 | 250.0 | 0 | cal | ... | -4.671875 | -3.517578 | -4.441406 | -4.718750 | -4.671875 | -3.400391 | -4.429688 | -4.406250 | -3.900391 | -4.371094 |
224996 | 600 | 736 | sub000 | 0 | sub000/dblock_4 | 0 | 0 | 250.0 | 0 | cal | ... | -4.179688 | -4.019531 | -4.195312 | -4.222656 | -4.425781 | -3.644531 | -4.429688 | -4.160156 | -3.412109 | -4.371094 |
224997 | 600 | 740 | sub000 | 0 | sub000/dblock_4 | 0 | 0 | 250.0 | 0 | cal | ... | -4.425781 | -3.767578 | -4.441406 | -3.974609 | -4.425781 | -3.400391 | -4.429688 | -4.160156 | -3.900391 | -4.859375 |
224998 | 600 | 744 | sub000 | 0 | sub000/dblock_4 | 0 | 0 | 250.0 | 0 | cal | ... | -4.425781 | -4.269531 | -4.195312 | -4.222656 | -4.425781 | -3.886719 | -4.429688 | -4.406250 | -3.900391 | -4.371094 |
224999 | 600 | 748 | sub000 | 0 | sub000/dblock_4 | 0 | 0 | 250.0 | 0 | cal | ... | -4.179688 | -4.019531 | -3.947266 | -4.222656 | -4.179688 | -3.400391 | -4.183594 | -4.406250 | -3.412109 | -4.371094 |
225000 rows × 47 columns
Group by time
to compute the time-domain average of all epochs and select columns of interest
[2]:
grand_wide = epochs_df.groupby(['time_ms']).mean()[eeg_channels]
grand_wide.columns.name = 'channel'
grand_wide
[2]:
channel | MiPf | MiCe | MiPa | MiOc |
---|---|---|---|---|
time_ms | ||||
-748 | -0.647500 | -0.818429 | -0.650280 | -1.128954 |
-744 | -0.590833 | -0.838763 | -0.648749 | -1.025912 |
-740 | -0.569167 | -0.987738 | -0.715166 | -1.047582 |
-736 | -0.600000 | -1.013976 | -0.672859 | -0.980162 |
-732 | -0.767500 | -1.069512 | -0.705796 | -0.867579 |
... | ... | ... | ... | ... |
732 | 1.345833 | -0.855422 | -1.573245 | -1.943028 |
736 | 1.138333 | -0.999023 | -1.762801 | -2.063682 |
740 | 0.985000 | -1.031177 | -1.794903 | -2.081421 |
744 | 0.877500 | -1.011374 | -1.770285 | -2.010948 |
748 | 0.866667 | -0.863825 | -1.608073 | -1.863831 |
375 rows × 4 columns
Group by time
and other columns to compute the average of subsets of epochs
[3]:
subsets_wide = epochs_df.groupby(["time_ms", "stim"]).mean()[eeg_channels]
subsets_wide.columns.name = "channel"
subsets_wide
[3]:
channel | MiPf | MiCe | MiPa | MiOc | |
---|---|---|---|---|---|
time_ms | stim | ||||
-748 | cal | -4.317307 | -3.857553 | -4.073911 | -4.143690 |
standard | 1.419520 | 0.651240 | 0.729152 | 0.773548 | |
target | 0.950000 | 1.211514 | 2.442932 | -0.413608 | |
-744 | cal | -4.329327 | -3.851473 | -4.114043 | -4.118098 |
standard | 1.493151 | 0.686627 | 0.894989 | 1.060941 | |
... | ... | ... | ... | ... | ... |
744 | standard | -1.669520 | 0.564905 | -2.007408 | -2.380829 |
target | 19.059999 | 0.232344 | 3.668494 | 3.395762 | |
748 | cal | -4.305288 | -3.828247 | -4.077405 | -4.089811 |
standard | -1.566781 | 0.687771 | -1.867695 | -2.206024 | |
target | 18.730000 | 0.771514 | 4.286233 | 3.765410 |
1125 rows × 4 columns
Time-domain averages (long)
[4]:
subsets_long = subsets_wide.stack() # pivot the channel columns into one long column
subsets_long.name = "microvolts"
pd.DataFrame(subsets_long)
[4]:
microvolts | |||
---|---|---|---|
time_ms | stim | channel | |
-748 | cal | MiPf | -4.317307 |
MiCe | -3.857553 | ||
MiPa | -4.073911 | ||
MiOc | -4.143690 | ||
standard | MiPf | 1.419520 | |
... | ... | ... | ... |
748 | standard | MiOc | -2.206024 |
target | MiPf | 18.730000 | |
MiCe | 0.771514 | ||
MiPa | 4.286233 | ||
MiOc | 3.765410 |
4500 rows × 1 columns
Time interval measurments
Interval measurments use the “slice-groupby-apply” pattern.
slice the time interval rows
group by epoch_id and other tags
apply the measurment function to the data, e.g., pandas built-in or user-defined
Start by doing the steps separately to verify.
When the steps are right, chain them for compact expression.
Example: single trial mean amplitude
Load the epochs
[5]:
eeg_channels = ["MiPf", "MiCe", "MiPa", "MiOc"]
epochs_df = get_demo_df(P3_1500_FEATHER).query('stim in ["target", "standard"]')
epf.check_epochs(epochs_df, eeg_channels, epoch_id="epoch_id", time="time_ms")
epochs_df
[5]:
epoch_id | time_ms | sub_id | eeg_artifact | dblock_path | log_evcodes | log_ccodes | dblock_srate | ccode | instrument | ... | RMOc | LLTe | RLTe | LLOc | RLOc | MiOc | A2 | HEOG | rle | rhz | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | -748 | sub000 | 0 | sub000/dblock_0 | 0 | 0 | 250.0 | 1 | eeg | ... | -25.093750 | -0.753906 | 1.480469 | -13.414062 | -18.937500 | -17.734375 | 5.660156 | 98.875000 | -39.500000 | 38.375000 |
1 | 0 | -744 | sub000 | 0 | sub000/dblock_0 | 0 | 0 | 250.0 | 1 | eeg | ... | -24.593750 | 0.502441 | -2.466797 | -17.640625 | -17.468750 | -15.304688 | 1.968750 | 104.750000 | -38.031250 | 41.281250 |
2 | 0 | -740 | sub000 | 0 | sub000/dblock_0 | 0 | 0 | 250.0 | 1 | eeg | ... | -16.484375 | -1.507812 | 3.947266 | -15.648438 | -10.085938 | -11.171875 | 8.367188 | 102.062500 | -33.656250 | 43.718750 |
3 | 0 | -736 | sub000 | 0 | sub000/dblock_0 | 0 | 0 | 250.0 | 1 | eeg | ... | -11.804688 | -15.070312 | 9.867188 | -14.906250 | -7.378906 | -8.742188 | 9.351562 | 100.562500 | -42.906250 | 37.406250 |
4 | 0 | -732 | sub000 | 0 | sub000/dblock_0 | 0 | 0 | 250.0 | 1 | eeg | ... | -6.394531 | -4.019531 | 9.125000 | -10.679688 | -6.886719 | -8.015625 | 8.125000 | 98.375000 | -43.875000 | 37.906250 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
146995 | 391 | 732 | sub000 | 0 | sub000/dblock_3 | 0 | 0 | 250.0 | 1 | eeg | ... | 9.593750 | 10.804688 | 0.000000 | 4.472656 | 1.967773 | 4.617188 | -3.937500 | -9.296875 | -10.242188 | -2.429688 |
146996 | 391 | 736 | sub000 | 0 | sub000/dblock_3 | 0 | 0 | 250.0 | 1 | eeg | ... | 15.250000 | 15.578125 | 8.632812 | 10.429688 | 8.609375 | 10.929688 | -0.246094 | -7.343750 | -7.800781 | 2.914062 |
146997 | 391 | 740 | sub000 | 0 | sub000/dblock_3 | 0 | 0 | 250.0 | 1 | eeg | ... | 11.070312 | 11.554688 | 4.195312 | 7.203125 | 6.886719 | 7.773438 | -4.429688 | -7.832031 | -13.648438 | -2.429688 |
146998 | 391 | 744 | sub000 | 0 | sub000/dblock_3 | 0 | 0 | 250.0 | 1 | eeg | ... | 13.039062 | 6.781250 | 6.414062 | 9.187500 | 10.578125 | 10.445312 | 0.246094 | -8.320312 | -8.773438 | -1.457031 |
146999 | 391 | 748 | sub000 | 0 | sub000/dblock_3 | 0 | 0 | 250.0 | 1 | eeg | ... | 13.531250 | 8.539062 | 10.359375 | 11.671875 | 13.531250 | 11.906250 | 3.937500 | -8.320312 | -10.242188 | 0.485840 |
147000 rows × 47 columns
(optional) select the data columns of interest or skip this and use them all.
[6]:
coi = ["epoch_id", "time_ms", "stim", "MiPf", "MiCe", "MiPa", "MiOc"]
mid = epochs_df[coi] # select columns of interest
display(mid.head())
display(mid.tail())
epoch_id | time_ms | stim | MiPf | MiCe | MiPa | MiOc | |
---|---|---|---|---|---|---|---|
0 | 0 | -748 | target | -54.5 | 2.781250 | -8.828125 | -17.734375 |
1 | 0 | -744 | target | -56.5 | -4.046875 | -11.929688 | -15.304688 |
2 | 0 | -740 | target | -55.5 | -3.289062 | -4.769531 | -11.171875 |
3 | 0 | -736 | target | -60.5 | -2.529297 | 0.954102 | -8.742188 |
4 | 0 | -732 | target | -57.0 | 4.046875 | 9.781250 | -8.015625 |
epoch_id | time_ms | stim | MiPf | MiCe | MiPa | MiOc | |
---|---|---|---|---|---|---|---|
146995 | 391 | 732 | standard | 9.5 | 3.289062 | 15.265625 | 4.617188 |
146996 | 391 | 736 | standard | 15.5 | 9.359375 | 21.234375 | 10.929688 |
146997 | 391 | 740 | standard | 9.0 | 3.792969 | 15.507812 | 7.773438 |
146998 | 391 | 744 | standard | 7.5 | 4.300781 | 15.507812 | 10.445312 |
146999 | 391 | 748 | standard | 6.0 | 4.554688 | 14.789062 | 11.906250 |
Slice the time interval data sample (rows) to measure and verify by inspection
[7]:
mid_300_500 = mid.query("time_ms >= 300 and time_ms <= 500")
display(mid_300_500.head())
display(mid_300_500.tail())
display(mid_300_500["time_ms"].min(), mid_300_500["time_ms"].max())
epoch_id | time_ms | stim | MiPf | MiCe | MiPa | MiOc | |
---|---|---|---|---|---|---|---|
262 | 0 | 300 | target | -46.0 | 69.5625 | 77.5000 | 29.640625 |
263 | 0 | 304 | target | -41.5 | 78.4375 | 82.7500 | 33.531250 |
264 | 0 | 308 | target | -39.0 | 83.1875 | 84.4375 | 33.031250 |
265 | 0 | 312 | target | -39.5 | 81.9375 | 82.3125 | 29.875000 |
266 | 0 | 316 | target | -36.5 | 82.6875 | 83.2500 | 31.828125 |
epoch_id | time_ms | stim | MiPf | MiCe | MiPa | MiOc | |
---|---|---|---|---|---|---|---|
146933 | 391 | 484 | standard | 14.5 | 3.541016 | 5.726562 | 5.585938 |
146934 | 391 | 488 | standard | 13.0 | -4.300781 | -5.484375 | -1.943359 |
146935 | 391 | 492 | standard | 8.5 | -6.578125 | -10.015625 | -1.214844 |
146936 | 391 | 496 | standard | 6.5 | -11.382812 | -14.789062 | -0.971680 |
146937 | 391 | 500 | standard | -1.5 | -21.250000 | -21.937500 | -2.429688 |
300
500
Group by epoch_id, i.e., single trial, and other column labels to preserve them, and apply the built-in
mean()
function.
Note the time_ms
timestamps is just another column of data and also averaged in the interval.
Note pandas.Dataframe
has dozens of built-in stats functions besides mean: max()
, min()
, std()
, var()
, …
[8]:
mid_300_500_mna = mid_300_500.groupby(["epoch_id", "stim"]).mean()
display(mid_300_500_mna.head(), mid_300_500_mna.tail())
time_ms | MiPf | MiCe | MiPa | MiOc | ||
---|---|---|---|---|---|---|
epoch_id | stim | |||||
0 | target | 400.0 | -42.176472 | 42.853859 | 52.521751 | 12.860375 |
1 | target | 400.0 | -14.617647 | 41.024815 | 42.625919 | 6.120811 |
2 | target | 400.0 | -7.186275 | 24.073071 | 31.395679 | 13.836741 |
3 | target | 400.0 | -16.911764 | 20.560892 | 26.349571 | 16.461147 |
4 | target | 400.0 | 13.039216 | 27.078394 | 22.416552 | 5.758588 |
time_ms | MiPf | MiCe | MiPa | MiOc | ||
---|---|---|---|---|---|---|
epoch_id | stim | |||||
387 | standard | 400.0 | 13.911765 | 20.714920 | 22.612095 | 4.010857 |
388 | standard | 400.0 | 24.696079 | -4.259727 | 0.098336 | -2.500594 |
389 | standard | 400.0 | 27.578432 | 11.336646 | 14.499387 | 6.349677 |
390 | standard | 400.0 | 36.323528 | -0.446557 | 3.442656 | -0.209578 |
391 | standard | 400.0 | 11.490196 | -12.447074 | -6.483092 | -2.429051 |
The epoch interval measurements are new data, re-label them appropriately.
[9]:
# drop the no longer meaningful time_ms column
mid_300_500_mna = mid_300_500_mna.drop("time_ms", axis=1)
# describe the type of measurment and interval
mid_300_500_mna["measure"] = "mna"
mid_300_500_mna["interval"] = "300_500"
display(mid_300_500_mna.head(), mid_300_500_mna.tail())
MiPf | MiCe | MiPa | MiOc | measure | interval | ||
---|---|---|---|---|---|---|---|
epoch_id | stim | ||||||
0 | target | -42.176472 | 42.853859 | 52.521751 | 12.860375 | mna | 300_500 |
1 | target | -14.617647 | 41.024815 | 42.625919 | 6.120811 | mna | 300_500 |
2 | target | -7.186275 | 24.073071 | 31.395679 | 13.836741 | mna | 300_500 |
3 | target | -16.911764 | 20.560892 | 26.349571 | 16.461147 | mna | 300_500 |
4 | target | 13.039216 | 27.078394 | 22.416552 | 5.758588 | mna | 300_500 |
MiPf | MiCe | MiPa | MiOc | measure | interval | ||
---|---|---|---|---|---|---|---|
epoch_id | stim | ||||||
387 | standard | 13.911765 | 20.714920 | 22.612095 | 4.010857 | mna | 300_500 |
388 | standard | 24.696079 | -4.259727 | 0.098336 | -2.500594 | mna | 300_500 |
389 | standard | 27.578432 | 11.336646 | 14.499387 | 6.349677 | mna | 300_500 |
390 | standard | 36.323528 | -0.446557 | 3.442656 | -0.209578 | mna | 300_500 |
391 | standard | 11.490196 | -12.447074 | -6.483092 | -2.429051 | mna | 300_500 |
(optional) Export the measurements data
[10]:
mid_300_500_mna.reset_index().to_feather(DATA_DIR / "p3_mid_mna_300_500.feather")
Chaining: All of the above, simplified by chaining. The results are verifiably identical.
[11]:
coi = ["epoch_id", "time_ms", "stim", "MiPf", "MiCe", "MiPa", "MiOc"]
# slice-groupby-apply
mid_300_500_mna_c = (
epochs_df[coi]
.query("time_ms >= 300 and time_ms <= 500")
.groupby(["epoch_id", "stim"])
.mean()
.drop("time_ms", axis=1)
)
# describe the type of measurment and interval
mid_300_500_mna_c["measure"] = "mna"
mid_300_500_mna_c["interval"] = "300_500"
# verify steps and chained agree
assert all(mid_300_500_mna_c == mid_300_500_mna)
# export
mid_300_500_mna_c.reset_index().to_feather(DATA_DIR / "p3_mid_mna_300_500.feather")