Time-domain averages (wide)

Read and check the epochs

[1]:

import pandas as pd
from spudtr import epf
from spudtr import get_demo_df, DATA_DIR, P3_1500_FEATHER

epochs_df = get_demo_df(P3_1500_FEATHER)
eeg_channels = ['MiPf', 'MiCe', 'MiPa', 'MiOc']

epf.check_epochs(epochs_df, eeg_channels, epoch_id="epoch_id", time="time_ms")
epochs_df

[1]:

	epoch_id	time_ms	sub_id	eeg_artifact	dblock_path	log_evcodes	log_ccodes	dblock_srate	ccode	instrument	...	RMOc	LLTe	RLTe	LLOc	RLOc	MiOc	A2	HEOG	rle	rhz
0	0	-748	sub000	0	sub000/dblock_0	0	0	250.0	1	eeg	...	-25.093750	-0.753906	1.480469	-13.414062	-18.937500	-17.734375	5.660156	98.875000	-39.500000	38.375000
1	0	-744	sub000	0	sub000/dblock_0	0	0	250.0	1	eeg	...	-24.593750	0.502441	-2.466797	-17.640625	-17.468750	-15.304688	1.968750	104.750000	-38.031250	41.281250
2	0	-740	sub000	0	sub000/dblock_0	0	0	250.0	1	eeg	...	-16.484375	-1.507812	3.947266	-15.648438	-10.085938	-11.171875	8.367188	102.062500	-33.656250	43.718750
3	0	-736	sub000	0	sub000/dblock_0	0	0	250.0	1	eeg	...	-11.804688	-15.070312	9.867188	-14.906250	-7.378906	-8.742188	9.351562	100.562500	-42.906250	37.406250
4	0	-732	sub000	0	sub000/dblock_0	0	0	250.0	1	eeg	...	-6.394531	-4.019531	9.125000	-10.679688	-6.886719	-8.015625	8.125000	98.375000	-43.875000	37.906250
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
224995	600	732	sub000	0	sub000/dblock_4	0	0	250.0	0	cal	...	-4.671875	-3.517578	-4.441406	-4.718750	-4.671875	-3.400391	-4.429688	-4.406250	-3.900391	-4.371094
224996	600	736	sub000	0	sub000/dblock_4	0	0	250.0	0	cal	...	-4.179688	-4.019531	-4.195312	-4.222656	-4.425781	-3.644531	-4.429688	-4.160156	-3.412109	-4.371094
224997	600	740	sub000	0	sub000/dblock_4	0	0	250.0	0	cal	...	-4.425781	-3.767578	-4.441406	-3.974609	-4.425781	-3.400391	-4.429688	-4.160156	-3.900391	-4.859375
224998	600	744	sub000	0	sub000/dblock_4	0	0	250.0	0	cal	...	-4.425781	-4.269531	-4.195312	-4.222656	-4.425781	-3.886719	-4.429688	-4.406250	-3.900391	-4.371094
224999	600	748	sub000	0	sub000/dblock_4	0	0	250.0	0	cal	...	-4.179688	-4.019531	-3.947266	-4.222656	-4.179688	-3.400391	-4.183594	-4.406250	-3.412109	-4.371094

225000 rows × 47 columns

Group by time to compute the time-domain average of all epochs and select columns of interest

[2]:

# select the EEG channels before averaging so non-numeric columns, e.g., sub_id, are skipped
grand_wide = epochs_df.groupby("time_ms")[eeg_channels].mean()
grand_wide.columns.name = 'channel'
grand_wide

[2]:

channel	MiPf	MiCe	MiPa	MiOc
time_ms
-748	-0.647500	-0.818429	-0.650280	-1.128954
-744	-0.590833	-0.838763	-0.648749	-1.025912
-740	-0.569167	-0.987738	-0.715166	-1.047582
-736	-0.600000	-1.013976	-0.672859	-0.980162
-732	-0.767500	-1.069512	-0.705796	-0.867579
...	...	...	...	...
732	1.345833	-0.855422	-1.573245	-1.943028
736	1.138333	-0.999023	-1.762801	-2.063682
740	0.985000	-1.031177	-1.794903	-2.081421
744	0.877500	-1.011374	-1.770285	-2.010948
748	0.866667	-0.863825	-1.608073	-1.863831

375 rows × 4 columns
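The grand average can be cross-checked on a small example. This is a minimal sketch on made-up data, not the demo epochs: the names toy_df, ch_a, and ch_b are hypothetical. It assumes every epoch has the same time points and the rows are ordered by epoch and then by time.

```python
import numpy as np
import pandas as pd

# hypothetical toy data: 3 epochs x 4 time points, 2 channels
n_epochs, times = 3, [-4, 0, 4, 8]
rng = np.random.default_rng(0)
toy_df = pd.DataFrame(
    {
        "epoch_id": np.repeat(np.arange(n_epochs), len(times)),
        "time_ms": np.tile(times, n_epochs),
        "stim": np.repeat(["target", "standard", "target"], len(times)),
        "ch_a": rng.normal(size=n_epochs * len(times)),
        "ch_b": rng.normal(size=n_epochs * len(times)),
    }
)
toy_channels = ["ch_a", "ch_b"]

# group by time and average the channel columns across epochs
toy_grand = toy_df.groupby("time_ms")[toy_channels].mean()
toy_grand.columns.name = "channel"

# cross-check: reshape to (epoch, time, channel) and average over the epoch axis
by_hand = (
    toy_df[toy_channels]
    .to_numpy()
    .reshape(n_epochs, len(times), len(toy_channels))
    .mean(axis=0)
)
assert np.allclose(toy_grand.to_numpy(), by_hand)
```

The result has one row per time point and one column per channel, the same layout as grand_wide.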

Group by time and other columns to compute the average of subsets of epochs

[3]:

# select the EEG channels before averaging so non-numeric columns are skipped
subsets_wide = epochs_df.groupby(["time_ms", "stim"])[eeg_channels].mean()
subsets_wide.columns.name = "channel"
subsets_wide

[3]:

	channel	MiPf	MiCe	MiPa	MiOc
time_ms	stim
-748	cal	-4.317307	-3.857553	-4.073911	-4.143690
	standard	1.419520	0.651240	0.729152	0.773548
	target	0.950000	1.211514	2.442932	-0.413608
-744	cal	-4.329327	-3.851473	-4.114043	-4.118098
-744	standard	1.493151	0.686627	0.894989	1.060941
...	...	...	...	...	...
744	standard	-1.669520	0.564905	-2.007408	-2.380829
744	target	19.059999	0.232344	3.668494	3.395762
748	cal	-4.305288	-3.828247	-4.077405	-4.089811
	standard	-1.566781	0.687771	-1.867695	-2.206024
	target	18.730000	0.771514	4.286233	3.765410

1125 rows × 4 columns
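Grouping by two columns produces a two-level row index. One way to pull a single condition back out is a cross-section with the pandas xs() method. The sketch below uses made-up values and hypothetical names (toy_df, ch_a), not the demo data.

```python
import pandas as pd

# hypothetical toy data: 2 time points, 2 conditions, 2 epochs per condition
toy_df = pd.DataFrame(
    {
        "time_ms": [0, 0, 0, 0, 4, 4, 4, 4],
        "stim": ["standard", "standard", "target", "target"] * 2,
        "ch_a": [1.0, 3.0, 10.0, 20.0, 2.0, 4.0, 30.0, 50.0],
    }
)

# average within each time x condition cell
toy_subsets = toy_df.groupby(["time_ms", "stim"])[["ch_a"]].mean()

# cross-section: the target rows only, now indexed by time_ms alone
toy_target = toy_subsets.xs("target", level="stim")
```

The cross-section drops the stim level, so toy_target has the same shape as a single-condition time-domain average.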

Time-domain averages (long)

[4]:

subsets_long = subsets_wide.stack()  # pivot the channel columns into one long column
subsets_long.name = "microvolts"
pd.DataFrame(subsets_long)

[4]:

			microvolts
time_ms	stim	channel
-748	cal	MiPf	-4.317307
		MiCe	-3.857553
		MiPa	-4.073911
		MiOc	-4.143690
	standard	MiPf	1.419520
...	...	...	...
748	standard	MiOc	-2.206024
	target	MiPf	18.730000
		MiCe	0.771514
		MiPa	4.286233
		MiOc	3.765410

4500 rows × 1 columns
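Wide and long hold the same numbers in different layouts, and unstack() reverses stack(). A minimal sketch of the round trip on a made-up wide table follows; toy_wide, ch_a, and ch_b are hypothetical names.

```python
import pandas as pd

# hypothetical wide table: rows are time x condition, columns are channels
toy_wide = pd.DataFrame(
    {"ch_a": [1.0, 2.0, 3.0, 4.0], "ch_b": [5.0, 6.0, 7.0, 8.0]},
    index=pd.MultiIndex.from_product(
        [[0, 4], ["standard", "target"]], names=["time_ms", "stim"]
    ),
)
toy_wide.columns.name = "channel"

# wide to long: the channel labels become the innermost row index level
toy_long = toy_wide.stack()
toy_long.name = "microvolts"

# long to wide: move the channel level back to the columns
toy_round_trip = toy_long.unstack("channel")
```

The long Series has one row per time, condition, and channel, which is why 1125 rows × 4 channels above becomes 4500 rows.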

Time interval measurements

Interval measurements use the “slice-groupby-apply” pattern:

slice the time interval rows
group by epoch_id and other tags
apply the measurement function to the data, e.g., a pandas built-in or a user-defined function

Start by doing the steps separately to verify.

When the steps are right, chain them for compact expression.
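The three steps can be sketched on a small made-up table before working through the real epochs. The names toy_df, ch_a, and peak_to_peak are hypothetical; peak_to_peak stands in for any user-defined measure.

```python
import pandas as pd

# hypothetical toy data: 2 epochs x 4 time points, 1 channel
toy_df = pd.DataFrame(
    {
        "epoch_id": [0] * 4 + [1] * 4,
        "time_ms": [0, 4, 8, 12] * 2,
        "stim": ["target"] * 4 + ["standard"] * 4,
        "ch_a": [0.0, 2.0, 6.0, 100.0, 1.0, 5.0, 3.0, -100.0],
    }
)


def peak_to_peak(x):
    """hypothetical user-defined measure: maximum minus minimum"""
    return x.max() - x.min()


# 1. slice the rows in the 4 to 8 ms interval, endpoints included
toy_slice = toy_df.query("time_ms >= 4 and time_ms <= 8")

# 2. group by single trial, keeping the stim label
toy_groups = toy_slice.groupby(["epoch_id", "stim"])

# 3. apply a built-in and a user-defined function to the channel column
toy_mean = toy_groups[["ch_a"]].mean()
toy_p2p = toy_groups[["ch_a"]].agg(peak_to_peak)
```

The extreme values at 12 ms fall outside the slice, so they do not affect either measure.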

Example: single trial mean amplitude

Load the epochs

[5]:

eeg_channels = ["MiPf", "MiCe", "MiPa", "MiOc"]

epochs_df = get_demo_df(P3_1500_FEATHER).query('stim in ["target", "standard"]')
epf.check_epochs(epochs_df, eeg_channels, epoch_id="epoch_id", time="time_ms")
epochs_df

[5]:

	epoch_id	time_ms	sub_id	eeg_artifact	dblock_path	log_evcodes	log_ccodes	dblock_srate	ccode	instrument	...	RMOc	LLTe	RLTe	LLOc	RLOc	MiOc	A2	HEOG	rle	rhz
0	0	-748	sub000	0	sub000/dblock_0	0	0	250.0	1	eeg	...	-25.093750	-0.753906	1.480469	-13.414062	-18.937500	-17.734375	5.660156	98.875000	-39.500000	38.375000
1	0	-744	sub000	0	sub000/dblock_0	0	0	250.0	1	eeg	...	-24.593750	0.502441	-2.466797	-17.640625	-17.468750	-15.304688	1.968750	104.750000	-38.031250	41.281250
2	0	-740	sub000	0	sub000/dblock_0	0	0	250.0	1	eeg	...	-16.484375	-1.507812	3.947266	-15.648438	-10.085938	-11.171875	8.367188	102.062500	-33.656250	43.718750
3	0	-736	sub000	0	sub000/dblock_0	0	0	250.0	1	eeg	...	-11.804688	-15.070312	9.867188	-14.906250	-7.378906	-8.742188	9.351562	100.562500	-42.906250	37.406250
4	0	-732	sub000	0	sub000/dblock_0	0	0	250.0	1	eeg	...	-6.394531	-4.019531	9.125000	-10.679688	-6.886719	-8.015625	8.125000	98.375000	-43.875000	37.906250
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
146995	391	732	sub000	0	sub000/dblock_3	0	0	250.0	1	eeg	...	9.593750	10.804688	0.000000	4.472656	1.967773	4.617188	-3.937500	-9.296875	-10.242188	-2.429688
146996	391	736	sub000	0	sub000/dblock_3	0	0	250.0	1	eeg	...	15.250000	15.578125	8.632812	10.429688	8.609375	10.929688	-0.246094	-7.343750	-7.800781	2.914062
146997	391	740	sub000	0	sub000/dblock_3	0	0	250.0	1	eeg	...	11.070312	11.554688	4.195312	7.203125	6.886719	7.773438	-4.429688	-7.832031	-13.648438	-2.429688
146998	391	744	sub000	0	sub000/dblock_3	0	0	250.0	1	eeg	...	13.039062	6.781250	6.414062	9.187500	10.578125	10.445312	0.246094	-8.320312	-8.773438	-1.457031
146999	391	748	sub000	0	sub000/dblock_3	0	0	250.0	1	eeg	...	13.531250	8.539062	10.359375	11.671875	13.531250	11.906250	3.937500	-8.320312	-10.242188	0.485840

147000 rows × 47 columns

(optional) Select the data columns of interest, or skip this step and use them all.

[6]:

coi = ["epoch_id", "time_ms", "stim", "MiPf", "MiCe", "MiPa", "MiOc"]
mid = epochs_df[coi]  # select columns of interest

display(mid.head())
display(mid.tail())

	epoch_id	time_ms	stim	MiPf	MiCe	MiPa	MiOc
0	0	-748	target	-54.5	2.781250	-8.828125	-17.734375
1	0	-744	target	-56.5	-4.046875	-11.929688	-15.304688
2	0	-740	target	-55.5	-3.289062	-4.769531	-11.171875
3	0	-736	target	-60.5	-2.529297	0.954102	-8.742188
4	0	-732	target	-57.0	4.046875	9.781250	-8.015625

	epoch_id	time_ms	stim	MiPf	MiCe	MiPa	MiOc
146995	391	732	standard	9.5	3.289062	15.265625	4.617188
146996	391	736	standard	15.5	9.359375	21.234375	10.929688
146997	391	740	standard	9.0	3.792969	15.507812	7.773438
146998	391	744	standard	7.5	4.300781	15.507812	10.445312
146999	391	748	standard	6.0	4.554688	14.789062	11.906250

Slice the rows (data samples) in the time interval to measure and verify the result by inspection

[7]:

mid_300_500 = mid.query("time_ms >= 300 and time_ms <= 500")

display(mid_300_500.head())
display(mid_300_500.tail())
display(mid_300_500["time_ms"].min(), mid_300_500["time_ms"].max())

	epoch_id	time_ms	stim	MiPf	MiCe	MiPa	MiOc
262	0	300	target	-46.0	69.5625	77.5000	29.640625
263	0	304	target	-41.5	78.4375	82.7500	33.531250
264	0	308	target	-39.0	83.1875	84.4375	33.031250
265	0	312	target	-39.5	81.9375	82.3125	29.875000
266	0	316	target	-36.5	82.6875	83.2500	31.828125

	epoch_id	time_ms	stim	MiPf	MiCe	MiPa	MiOc
146933	391	484	standard	14.5	3.541016	5.726562	5.585938
146934	391	488	standard	13.0	-4.300781	-5.484375	-1.943359
146935	391	492	standard	8.5	-6.578125	-10.015625	-1.214844
146936	391	496	standard	6.5	-11.382812	-14.789062	-0.971680
146937	391	500	standard	-1.5	-21.250000	-21.937500	-2.429688

Group by epoch_id, i.e., single trial, and by any other column labels that should be preserved, then apply the built-in mean() function.

Note that the time_ms timestamp is just another column of data, so it is also averaged over the interval.

Note that pandas.DataFrame has dozens of built-in stats functions besides mean(): max(), min(), std(), var(), …

[8]:

mid_300_500_mna = mid_300_500.groupby(["epoch_id", "stim"]).mean()

display(mid_300_500_mna.head(), mid_300_500_mna.tail())

		time_ms	MiPf	MiCe	MiPa	MiOc
epoch_id	stim
0	target	400.0	-42.176472	42.853859	52.521751	12.860375
1	target	400.0	-14.617647	41.024815	42.625919	6.120811
2	target	400.0	-7.186275	24.073071	31.395679	13.836741
3	target	400.0	-16.911764	20.560892	26.349571	16.461147
4	target	400.0	13.039216	27.078394	22.416552	5.758588

		time_ms	MiPf	MiCe	MiPa	MiOc
epoch_id	stim
387	standard	400.0	13.911765	20.714920	22.612095	4.010857
388	standard	400.0	24.696079	-4.259727	0.098336	-2.500594
389	standard	400.0	27.578432	11.336646	14.499387	6.349677
390	standard	400.0	36.323528	-0.446557	3.442656	-0.209578
391	standard	400.0	11.490196	-12.447074	-6.483092	-2.429051
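Several built-in statistics can be computed in one pass with agg(). The sketch below uses made-up interval data with hypothetical names (toy_slice, ch_a). Selecting the channel column before aggregating also keeps time_ms out of the result.

```python
import pandas as pd

# hypothetical interval slice: 2 epochs x 3 samples, 1 channel
toy_slice = pd.DataFrame(
    {
        "epoch_id": [0, 0, 0, 1, 1, 1],
        "stim": ["target"] * 3 + ["standard"] * 3,
        "time_ms": [300, 304, 308] * 2,
        "ch_a": [1.0, 5.0, 3.0, -2.0, 0.0, 2.0],
    }
)

# one row per epoch, one column per (channel, statistic) pair
toy_stats = toy_slice.groupby(["epoch_id", "stim"])[["ch_a"]].agg(
    ["mean", "max", "min", "std"]
)
```

The columns of toy_stats are a two-level index, so a single statistic is selected with a tuple, e.g., toy_stats[("ch_a", "max")].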

The epoch interval measurements are new data, so re-label them appropriately.

[9]:

# drop the no longer meaningful time_ms column
mid_300_500_mna = mid_300_500_mna.drop("time_ms", axis=1)

# describe the type of measurement and interval
mid_300_500_mna["measure"] = "mna"
mid_300_500_mna["interval"] = "300_500"

display(mid_300_500_mna.head(), mid_300_500_mna.tail())

		MiPf	MiCe	MiPa	MiOc	measure	interval
epoch_id	stim
0	target	-42.176472	42.853859	52.521751	12.860375	mna	300_500
1	target	-14.617647	41.024815	42.625919	6.120811	mna	300_500
2	target	-7.186275	24.073071	31.395679	13.836741	mna	300_500
3	target	-16.911764	20.560892	26.349571	16.461147	mna	300_500
4	target	13.039216	27.078394	22.416552	5.758588	mna	300_500

		MiPf	MiCe	MiPa	MiOc	measure	interval
epoch_id	stim
387	standard	13.911765	20.714920	22.612095	4.010857	mna	300_500
388	standard	24.696079	-4.259727	0.098336	-2.500594	mna	300_500
389	standard	27.578432	11.336646	14.499387	6.349677	mna	300_500
390	standard	36.323528	-0.446557	3.442656	-0.209578	mna	300_500
391	standard	11.490196	-12.447074	-6.483092	-2.429051	mna	300_500
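The measure and interval labels become useful when measurements from more than one interval are combined in the same table. A minimal sketch follows, assuming a hypothetical helper mean_amplitude() that is not part of spudtr, applied to made-up data.

```python
import pandas as pd


def mean_amplitude(df, channels, start, stop, groupers=("epoch_id", "stim")):
    """hypothetical helper: labeled mean amplitude from start to stop ms, inclusive"""
    result = (
        df.query("time_ms >= @start and time_ms <= @stop")
        .groupby(list(groupers))[channels]
        .mean()
    )
    result["measure"] = "mna"
    result["interval"] = f"{start}_{stop}"
    return result


# hypothetical toy data: 2 epochs x 4 time points, 1 channel
toy_df = pd.DataFrame(
    {
        "epoch_id": [0] * 4 + [1] * 4,
        "time_ms": [300, 400, 500, 600] * 2,
        "stim": ["target"] * 4 + ["standard"] * 4,
        "ch_a": [1.0, 2.0, 3.0, 10.0, 4.0, 4.0, 4.0, 0.0],
    }
)

# stack the measurements from two intervals; the labels tell the rows apart
toy_measures = pd.concat(
    [
        mean_amplitude(toy_df, ["ch_a"], 300, 500),
        mean_amplitude(toy_df, ["ch_a"], 500, 600),
    ]
)
```

Without the interval column, the two rows for each epoch in toy_measures would be indistinguishable.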

(optional) Export the measurements data

[10]:

mid_300_500_mna.reset_index().to_feather(DATA_DIR / "p3_mid_mna_300_500.feather")

Chaining: All of the above, simplified by chaining. The results are verifiably identical.

[11]:

coi = ["epoch_id", "time_ms", "stim", "MiPf", "MiCe", "MiPa", "MiOc"]

# slice-groupby-apply
mid_300_500_mna_c  = (
    epochs_df[coi]
    .query("time_ms >= 300 and time_ms <= 500")
    .groupby(["epoch_id", "stim"])
    .mean()
    .drop("time_ms", axis=1)
)

# describe the type of measurement and interval
mid_300_500_mna_c["measure"] = "mna"
mid_300_500_mna_c["interval"] = "300_500"

# verify steps and chained agree; all(df1 == df2) would only test the column labels
assert mid_300_500_mna_c.equals(mid_300_500_mna)

# export
mid_300_500_mna_c.reset_index().to_feather(DATA_DIR / "p3_mid_mna_300_500.feather")
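A caveat when checking that two DataFrames agree: the Python built-in all() iterates over a DataFrame's column labels, not its values, so all(df1 == df2) passes whenever the labels are non-empty strings. The sketch below shows the pitfall and two element-wise alternatives on made-up frames; toy_a and toy_b are hypothetical names.

```python
import pandas as pd

# two hypothetical frames that differ in one value
toy_a = pd.DataFrame({"ch_a": [1.0, 2.0], "measure": ["mna", "mna"]})
toy_b = pd.DataFrame({"ch_a": [1.0, 999.0], "measure": ["mna", "mna"]})

# pitfall: iterates the labels "ch_a" and "measure", both truthy, so this is True
labels_only = all(toy_a == toy_b)

# element-wise alternatives: both report the mismatch
frames_equal = toy_a.equals(toy_b)
elementwise = (toy_a == toy_b).all().all()

# pandas.testing raises an AssertionError with a description of the mismatch;
# here the frames are identical, so it passes silently
pd.testing.assert_frame_equal(toy_a, toy_a.copy())
```

DataFrame.equals() also treats NaNs in the same location as equal, which the == comparison does not.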