mkpy.pygarv module
pygarv is the backend for marking artifacts in mkh5 data with tests defined in a YAML file.
Successful runs of tests and their results are stored in PyGarv.tr_docs, a list of tr_doc dicts, one dict per h5 datablock.
- Parameters
tr_doc['tests'] (list) – each item is a dict
Examples
tr_doc['tests']
[
    {'dblock_path_idx': 0,
     'dblock_path': 'calstest/dblock_0',
     'name': 'pygarv',
     'tests': [
         [{'test': 'ppa'},
          {'tag': 'amplitude excursions'},
          {'stream': 'MiCe'},
          {'threshold': 0.0},
          {'interval': 0.0}],
         [{'test': 'ppadif'},
          {'tag': 'amplitude excursions'},
          {'stream': 'MiCe'},
          {'threshold': 0.1},
          {'interval': 0.1},
          {'stream2': 'MiPa'}],
     ]},
    {'dblock_path_idx': 1,
     'dblock_path': 'calstest/dblock_1',
     'name': 'pygarv',
     'tests': None},
]
tr_doc['fails'] : list
len(tr_doc['fails']) == len(tr_doc['tests']), where tr_doc['fails'][idx] is a list of (start, stop) intervals, in dblock_tick indexes, where tr_doc['tests'][idx] failed.
tr_doc['pygarv']
The tests are specified in a YAML .yarf file.
    ---
    dblock_path: some_path
    dblock_path_idx: uint
    name: pygarv
    tests:
      - - test_spec
        - test_spec
        ...
        - test_spec
Each test_spec is a YAML map with mandatory test and tag parameters and other optional parameters as needed for specific tests
    test: str
    tag: str
where
test names a pygarv test function, e.g., maxflat, ppadif
tag is a user-defined descriptive tag, e.g., blocking, heog, fancy test
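The mandatory parameters can be checked mechanically. The following is a minimal sketch, assuming a test spec is held in Python as a list of single-key dicts (the layout shown in the tr_doc example above). The helper name check_test_spec is hypothetical and is not part of mkpy.pygarv.

```python
def check_test_spec(test_spec):
    """Merge a list of single-key dicts and confirm 'test' and 'tag' are present.

    Hypothetical helper for illustration only; not part of mkpy.pygarv.
    """
    merged = {}
    for item in test_spec:
        if len(item) != 1:
            raise ValueError(f"expected a single-key map, got {item!r}")
        merged.update(item)
    missing = [key for key in ("test", "tag") if key not in merged]
    if missing:
        raise ValueError(f"test spec is missing mandatory parameter(s): {missing}")
    return merged


# example spec: the parameter values are made up for illustration
spec = [{"test": "ppadif"}, {"tag": "heog"}, {"stream": "MiCe"}, {"threshold": 0.1}]
merged_spec = check_test_spec(spec)
```

Because dicts preserve insertion order, merged_spec keeps the parameters in the order they appear in the spec.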
- class mkpy.pygarv.PyGarv(mkh5_f, yarf_f=None)
Bases: object
container to hold an inventory of functions for computing sample-wise artifact masks.
When invoked at the command line, pygarv needs an mkh5 file to work with.
There are two cases:
- data has not been previously garved with _update_mkh5()
    no pygarv test info in header
    pygarv data streams all zeros
- data has been previously garved with _update_mkh5()
    pygarv test info appears in header
    test results are unknown, possibly None
    pygarv data stream state is unknown
On init the mkh5 file is scanned for previous runs. If any are found, the pygarv data buffers (volatile) are synced with the info from the h5 file.
For each data block:
self.tr_docs are set to match the header['pygarv'] dict
self.yarf_fails are set according to dblock['pygarv'], self.tr_docs
the value of pygarv = run_test(db_idx) (what-if run) is checked against the dblock data; discrepancies trigger a warning
PyGarv now has persistent and volatile rejection data in alignment, suitable for viewing/editing in mkh5viewer
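The what-if comparison can be pictured as follows. This is a minimal sketch, assuming the stored and recomputed masks are plain sequences of equal length. The function check_alignment is hypothetical and only illustrates the idea of warning on a mismatch; it is not the PyGarv implementation.

```python
import warnings


def check_alignment(stored_mask, whatif_mask, dblock_path):
    """Warn if stored and recomputed masks disagree; return True when aligned.

    Hypothetical helper for illustration only.
    """
    if len(stored_mask) != len(whatif_mask):
        raise ValueError("masks must have the same length")
    n_diff = sum(1 for a, b in zip(stored_mask, whatif_mask) if a != b)
    if n_diff:
        warnings.warn(
            f"{dblock_path}: {n_diff} sample(s) differ between the stored "
            "and what-if pygarv masks"
        )
    return n_diff == 0


# capture the warning so the example runs quietly
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    aligned = check_alignment([0, 0, 1, 1], [0, 0, 1, 0], "calstest/dblock_0")
```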
PyGarvTest
The PyGarvTest decorator handles all the default parameter name and type bookkeeping for specific tests
To add a test to the catalog …
1. implement a function that takes two positional args plus keyword args, (hdr, dblock, **kwargs), and returns a boolean artifact mask with one entry per dblock data sample, where 0 = good, 1 = bad.
The hdr (dict) and dblock (np.ndarray) are, e.g., as returned by hdr, dblock = mkh5.get_dblock(path_to_datablock), but can be any dict and dblock that expose the variables needed to compute the artifact mask.
2. decorate it with @PyGarvTest(test_name, [key=dtype, key=dtype])
where test_name is the test name and the list of key_i=dtype_i optionally gives extra parameters named key_1, … key_n with data types dtype_1, … dtype_n.
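The pattern of writing a mask function and decorating it can be sketched without mkpy. Everything below is an assumption made for illustration: toy_garv_test is a hypothetical stand-in for @PyGarvTest, TOY_CATALOG stands in for the test catalog, the "samplerate" header key is invented, and a plain dict of lists stands in for the dblock (real mkh5 dblocks are numpy structured arrays).

```python
import functools

TOY_CATALOG = {}  # hypothetical stand-in for the PyGarv test catalog


def toy_garv_test(test_name, **param_types):
    """Hypothetical stand-in for @PyGarvTest: register a test, type-check extras."""

    def decorate(func):
        @functools.wraps(func)
        def wrapper(hdr, dblock, **kwargs):
            for key, dtype in param_types.items():
                if key not in kwargs:
                    raise ValueError(f"{test_name}: missing parameter {key!r}")
                if not isinstance(kwargs[key], dtype):
                    raise ValueError(f"{test_name}: {key!r} must be {dtype.__name__}")
            return func(hdr, dblock, **kwargs)

        TOY_CATALOG[test_name] = wrapper
        return wrapper

    return decorate


@toy_garv_test("toy_abs_threshold", stream=str, threshold=float)
def toy_abs_threshold(hdr, dblock, stream=None, threshold=None):
    """Flag samples whose absolute value exceeds threshold (True = bad)."""
    return [abs(sample) > threshold for sample in dblock[stream]]


# made-up header and data, one stream with four samples
hdr = {"samplerate": 250.0}
dblock = {"MiCe": [1.0, -60.0, 12.5, 75.0]}
mask = TOY_CATALOG["toy_abs_threshold"](hdr, dblock, stream="MiCe", threshold=50.0)
```

The mask has one entry per sample, which is the contract step 1 asks for; the type checks in the wrapper mirror the bookkeeping the text attributes to the decorator.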
- cppadif = {'interval': None, 'stream': None, 'stream2': None, 'tag': None, 'test': 'cppadif', 'threshold': None}
- cstdev = {'interval': None, 'stream': None, 'tag': None, 'test': 'cstdev', 'threshold': None}
- get_result(pg_test_result)
convenience wrapper to query a test result, decode the mask, and return with its test in a handy package.
- Parameters
pg_test_result (a (tr_doc, pygarv_mask) tuple) – as returned by run_* functions
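One way to picture decoding a mask value is as a bit field. The sketch below rests on an assumption this page does not state: that bit i of a sample's 64-bit pygarv value corresponds to the ith entry of tr_doc['tests']. The helper decode_sample and the example specs are hypothetical.

```python
def decode_sample(mask_value, tests):
    """Return the test specs whose bit is set in one integer mask value.

    Assumes bit i flags the ith test; hypothetical helper for illustration.
    """
    return [test for idx, test in enumerate(tests) if (mask_value >> idx) & 1]


# made-up test specs in the list-of-single-key-dicts layout
tests = [
    [{"test": "ppa"}, {"tag": "amplitude excursions"}],
    [{"test": "ppadif"}, {"tag": "amplitude excursions"}],
]
failed = decode_sample(0b10, tests)  # only bit 1 is set
```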
- maxflat = {'nsamp': None, 'poststim': None, 'prestim': None, 'stream': None, 'tag': None, 'test': 'maxflat', 'threshold': None}
- param_types = {'interval': <class 'float'>, 'stream': <class 'str'>, 'threshold': <class 'float'>}
- ppa = {'poststim': None, 'prestim': None, 'stream': None, 'tag': None, 'test': 'ppa', 'threshold': None}
- ppadif = {'poststim': None, 'prestim': None, 'stream': None, 'stream2': None, 'tag': None, 'test': 'ppadif', 'threshold': None}
- run_dblock(dbp_idx, tr_doc)
Run tests in the tr_doc for datablock at dbp_idx, returns 64-bit pygarv sample mask.
- Parameters
dbp_idx (uint) – index of the ith dblock in self.dblock_paths
tr_doc (dict) – PyYarf format dict with tr_doc['tests']
- Returns
dict of results like so:
    {name: 'results',
     dblock_path: str (== the yarf_dbp),
     pygarv : np.ndarray(shape=(len(dblock),), dtype=dblock['pygarv'].dtype),
     fails : list of uint 2-tuples (x0, x1)}
- Return type
results
The fails list amounts to a run-length encoding of the boolean vector pygarv > 0
- Raises
ValueError – if tr_doc['dblock_path'] != self.dblock_paths[dbp_idx]
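The relationship between the pygarv mask and the fails list can be sketched as a run finder. This is a minimal sketch, assuming the mask is a sequence of non-negative integers and that x1 is the inclusive index of the last bad sample in a run; the page does not say whether the stop index is inclusive. The helper mask_to_fails is hypothetical.

```python
def mask_to_fails(pygarv):
    """Return (x0, x1) index pairs, one per run of samples where pygarv > 0.

    Assumes x1 is inclusive; hypothetical helper for illustration.
    """
    fails = []
    start = None
    for idx, value in enumerate(pygarv):
        bad = value > 0
        if bad and start is None:
            start = idx  # a run of bad samples begins
        elif not bad and start is not None:
            fails.append((start, idx - 1))  # the run ended on the previous sample
            start = None
    if start is not None:
        fails.append((start, len(pygarv) - 1))  # run extends to the last sample
    return fails


fails = mask_to_fails([0, 4, 4, 0, 0, 1, 0, 2])
```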
- class mkpy.pygarv.PyGarvTest(test, **kwargs)
Bases: OrderedDict
Decorator class for the PyGarv tests.
This enforces an extensible standard form on PyGarv test specs and execution.
The class derives from OrderedDict so it returns .keys(), .values(), and .items() in fixed original parameter order. This is useful for populating test UI elements and for reading and writing YAML sequences without scrambling the key:value pairs the way a dict() might.
- Parameters
param_specs ([(key, type), …]) –
- key (str)
    parameter label
- type (Python type)
    required Python data type for values of the key
('test', str), ('tag', str), ('stream', str),
Default test parameters (in sequence order)
- test (str)
    corresponds to the self._test() function that runs it
- tag (str)
    user specified descriptive tag for the test … anything sensible
- stream (str)
    name or regex pattern for primary dblock data stream(s) to run the test on
Optional test specific parameter:type pairs are defined in the decorator arguments
- Raises
ValueError – If the type of a test parameter differs from that in param_specs
PyGarvTest overrides OrderedDict.__setitem__() with additional type checking on the value of test['key'] = value. The class variable param_specs specifies mandatory PyGarvTest parameters and types. Optional decorator arguments can extend the mandatory parameters and types and will be automatically passed to the decorated test function.
all PyGarvTest instances have _default_params with key, type
optional decorator args extend PyGarvTest instances with additional params
public CRUD API is standardized
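The type-checked storage described above can be sketched with a small OrderedDict subclass. TypedSpec is a hypothetical stand-in written for illustration, not the PyGarvTest source; it only assumes the three default parameters listed on this page and treats None as "not yet set".

```python
from collections import OrderedDict


class TypedSpec(OrderedDict):
    """Hypothetical stand-in for PyGarvTest's type-checked parameter storage."""

    # mandatory parameters and types, in sequence order
    param_specs = [("test", str), ("tag", str), ("stream", str)]

    def __init__(self, test, **extra_types):
        super().__init__()
        self._types = dict(self.param_specs)
        self._types.update(extra_types)  # optional parameters extend the defaults
        for key in self._types:
            super().__setitem__(key, None)  # None marks an unset parameter
        self["test"] = test

    def __setitem__(self, key, value):
        if key not in self._types:
            raise KeyError(f"unknown parameter {key!r}")
        if not isinstance(value, self._types[key]):
            raise ValueError(
                f"{key!r} must be {self._types[key].__name__}, "
                f"got {type(value).__name__}"
            )
        super().__setitem__(key, value)


spec = TypedSpec("ppa", threshold=float, interval=float)
spec["tag"] = "blocking"
spec["threshold"] = 50.0
param_names = list(spec.keys())
```

Keys come back in the order the parameters were declared, mandatory ones first, which is the property the class is said to rely on for display and YAML output.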
To preserve test spec order for display and yamlized round trips, test specs are stored internally as OrderedDicts and the setter/getter API wants and returns lists of dict, i.e.,
    [{'test': 'ppa'}, … {'interval': 1500.0}]
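The round trip between the list-of-dict form and an OrderedDict can be sketched as two small converters. The function names specs_to_ordered and ordered_to_specs are hypothetical, and the parameter values are made up for illustration.

```python
from collections import OrderedDict


def specs_to_ordered(spec_list):
    """Collapse a list of single-key dicts into one OrderedDict, keeping order."""
    return OrderedDict(
        (key, value) for item in spec_list for key, value in item.items()
    )


def ordered_to_specs(ordered):
    """Expand an ordered mapping back into a list of single-key dicts."""
    return [{key: value} for key, value in ordered.items()]


spec_list = [{"test": "ppa"}, {"tag": "blocking"}, {"stream": "MiCe"}, {"interval": 1500.0}]
ordered = specs_to_ordered(spec_list)
round_trip = ordered_to_specs(ordered)
```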
- run(hdr, dblock, **kwargs)
- Parameters
hdr (dict) – metadata consulted in running the tests, e.g., sampling rate
dblock (np.ndarray (named dtypes)) – columns of data, typically accessed by dtype.name
- Returns
results – sample-wise data rejection mask, 1=bad, 0=good
- Return type
np.ndarray, dtype=bool, length = len(dblock)
- property param_types
- property params
names of the parameters of this test, as a list
- property specs
- property specs_as_yaml
returns current specs as yaml string
- property types
data types of the values for the parameters as a list
- class mkpy.pygarv.PyYarf(yarf_f=None)
Bases: object
YAML test file I/O for PyGarv artifact test parameters
- Parameters
yarf_f (str) – file path to well-formed YAML with PyYarf test specification structure
- Variables
yarf_docs (list) – each item is a yarf_doc dict that yamlizes in-out without modification
    {'name': 'pygarv' (str),
     'dblock_path_idx': n (uint),
     'dblock_path': path_to_a_mkh5_dblock (str),
     'tests': [test_spec, … test_spec] (list)}
- IO methods
    read yarf_docs from yaml
    write yarf_docs to yaml
    read yarf_docs from mkh5 headers
- PyYarf YAML format:
exactly one yaml document per mkh5 dblock_path
each doc is a map with 3 keys: name, dblock_path, tests
the value of name must be pygarv (str)
the value of dblock_path in the ith yaml doc must == mkh5.data_blocks[i] (str)
the value of tests must be a list of test specifications (see PyGarvTest docs)
Examples
    # generated by PyYarf
    ---
    dblock_path_idx: 0
    dblock_path: calstest/dblock_0
    name: pygarv
    tests:
    - - test: ppa_event
      - tag: tag1
      - stream: MiPf
      - threshold: 20.0
      - prestim: 500.0
      - poststim: 1500.0
    - - test: ppa_event
      - tag: tag1
      - stream: MiCe
      - threshold: 50.0
      - prestim: 100.0
      - poststim: 1000.0
    - - test: ppa_event
      - tag: tag1
      - stream: MiPa
      - threshold: 10.0
      - prestim: 10.0
      - poststim: 200.0
    ---
    dblock_path_idx: 1
    dblock_path: calstest/dblock_1
    name: pygarv
    tests: []
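The format rules listed above lend themselves to a mechanical check once the YAML documents have been loaded into Python dicts. The following is a minimal sketch, assuming the documents are already parsed and the mkh5 dblock paths are available as a list. The function check_yarf_docs is hypothetical and is not a PyYarf method.

```python
def check_yarf_docs(yarf_docs, dblock_paths):
    """Check parsed yarf docs against the PyYarf format rules; return True if OK.

    Hypothetical helper for illustration only.
    """
    if len(yarf_docs) != len(dblock_paths):
        raise ValueError("need exactly one yarf doc per dblock_path")
    for idx, (doc, dblock_path) in enumerate(zip(yarf_docs, dblock_paths)):
        for key in ("name", "dblock_path", "tests"):
            if key not in doc:
                raise ValueError(f"doc {idx}: missing key {key!r}")
        if doc["name"] != "pygarv":
            raise ValueError(f"doc {idx}: name must be 'pygarv'")
        if doc["dblock_path"] != dblock_path:
            raise ValueError(f"doc {idx}: dblock_path does not match {dblock_path!r}")
        if not isinstance(doc["tests"], list):
            raise ValueError(f"doc {idx}: tests must be a list")
    return True


# abbreviated versions of the two example documents, as parsed dicts
docs = [
    {"dblock_path_idx": 0, "dblock_path": "calstest/dblock_0", "name": "pygarv",
     "tests": [[{"test": "ppa_event"}, {"tag": "tag1"}, {"stream": "MiPf"}]]},
    {"dblock_path_idx": 1, "dblock_path": "calstest/dblock_1", "name": "pygarv",
     "tests": []},
]
docs_ok = check_yarf_docs(docs, ["calstest/dblock_0", "calstest/dblock_1"])
```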
- read_from_mkh5(mkh5_f)
scan mkh5 dblock headers and the dblock['pygarv'] stream for artifact test info
- Returns
yarf_docs – each dict is a PyYarf format dict; see the PyYarf docstring for details
- Return type
list of list of dict