Metric Bundles

class rubin_sim.maf.metric_bundles.MetricBundle(metric, slicer, constraint=None, stacker_list=None, run_name='run name', metadata=None, info_label=None, plot_dict=None, display_dict=None, summary_metrics=None, maps_list=None, file_root=None, plot_funcs=None)[source]#

Bases: object

Define a metric bundle: a combination of metric, slicer, and constraint.

Parameters:

metric (metric) – The Metric class to run per slice_point
slicer (slicer) – The Slicer to apply to the incoming visit data (the observations).
constraint (str or None, opt) – A (sql-style) constraint to apply to the visit data, to apply a broad sub-selection.
stacker_list (list [stacker], opt) – A list of pre-configured stackers to use to generate additional columns per visit. These will be generated automatically if needed, but pre-configured versions will override these.
run_name (str, opt) – The name of the simulation being run. This will be added to output files and plots. Setting it prevents file conflicts when running the same metric on multiple simulations, and provides a way to identify which simulation is being analyzed.
metadata (str, opt) – A deprecated version of info_label (below). Values set by metadata will be used for info_label. If both are set, info_label is used.
info_label (str or None, opt) – Information to add to the output metric data file name and plot labels. If this is not provided, it will be auto-generated from the constraint (if any). Setting this provides an easy way to distinguish different configurations of a metric or a slicer, or just to rewrite your constraint into friendlier terms (e.g. a constraint like ‘scheduler_note not like “%DD%”’ can become “non-DD” in the file name and plot labels by specifying info_label).
plot_dict (dict of plotting parameters, opt) – Specify general plotting parameters, such as x/y/color limits.
display_dict (dict of display parameters, opt) – Specify parameters for show_maf web pages, such as the side bar labels and figure captions. Keys: ‘group’, ‘subgroup’, ‘caption’, and ‘order’ (such as to set metrics in filter order, etc)
summary_metrics (list of metrics) – A list of summary metrics to run to summarize the primary metric, such as MedianMetric, etc.
maps_list (list of maps) – A list of pre-configured maps to use for the metric. This will be auto-generated if specified by the metric class, but pre-configured versions will override these.

Notes

Define the “thing” you are measuring, with a combination of:

* metric (calculated per data_slice)
* slicer (how to create the data_slices)
* constraint (an optional definition of a large subset of data)

Together these define a unique combination of an opsim benchmark, or “metric bundle”. An example would be: a CountMetric, a HealpixSlicer, and a constraint of “filter=’r’”.

After the metric is evaluated at each slice_point created by the slicer, the resulting metric values are saved in the MetricBundle.

The MetricBundle also saves the summary metrics to be used to generate summary statistics over those metric values, as well as the resulting summary statistic values.

Plotting parameters and display parameters (for show_maf) are saved in the MetricBundle, as well as additional information such as the opsim run name and info_label, and the relevant stackers and maps to apply when calculating the metric values.
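As a minimal sketch of how such a combination might be put together, assuming rubin_sim is installed and that CountMetric, MedianMetric, and HealpixSlicer are importable from rubin_sim.maf.metrics and rubin_sim.maf.slicers. The column name observationStartMJD, the nside value, the run name, and the display_dict values are illustrative assumptions, not values taken from this page. The constraint and info_label reuse the “non-DD” example from the parameter description.

```python
# Constraint and display settings; the display values are illustrative assumptions.
NON_DD_CONSTRAINT = 'scheduler_note not like "%DD%"'
DISPLAY_DICT = {
    "group": "Visit counts",
    "subgroup": "non-DD",
    "order": 0,
    "caption": "Number of non-DD visits per HEALPix pixel.",
}


def build_non_dd_count_bundle(run_name="example_run"):
    """Build (but do not run) a MetricBundle counting non-DD visits per pixel."""
    # Imported lazily so this sketch can be loaded without rubin_sim installed.
    from rubin_sim.maf import metric_bundles, metrics, slicers

    metric = metrics.CountMetric(col="observationStartMJD")  # assumed column name
    slicer = slicers.HealpixSlicer(nside=64)  # assumed resolution
    return metric_bundles.MetricBundle(
        metric,
        slicer,
        constraint=NON_DD_CONSTRAINT,
        run_name=run_name,
        info_label="non-DD",
        display_dict=DISPLAY_DICT,
        summary_metrics=[metrics.MedianMetric()],
    )
```

Building the bundle does not compute anything; metric values are only calculated once the bundle is run through a MetricBundleGroup.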

compute_summary_stats(results_db=None)[source]#

Compute summary statistics on metric_values, using the bundle's list of summary_metrics.

Parameters:: results_db (Optional[ResultsDb]) – ResultsDb object to use to store the summary statistic values on disk.
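To illustrate what a summary statistic is in this context: a summary metric collapses the per-slice metric values into a single number. The sketch below is not rubin_sim code. It uses plain Python, with None standing in for slice points where no value was computed (an assumption made for illustration), and the helper name summarize is hypothetical.

```python
from statistics import median


def summarize(metric_values, summary_funcs):
    """Apply each summary function to the valid (non-None) metric values."""
    valid = [v for v in metric_values if v is not None]
    return {name: func(valid) for name, func in summary_funcs.items()}


# Per-slice metric values, e.g. visit counts at five slice points (one has no data).
values = [12, 7, None, 9, 15]
summary = summarize(values, {"Median": median, "Max": max})
print(summary)
```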

classmethod load(filename)[source]#

Create a metric bundle and load its content from disk.

Parameters:: filename (str) – The file from which to read the metric bundle data.

output_json()[source]#

Set up and call the baseSlicer outputJSON method, to output to IO string.

Returns:: IO object containing JSON data representing the metric bundle data.
Return type:: io

plot(plot_handler=None, plot_func=None, outfile_suffix=None, savefig=False)[source]#

Create all plots available from the slicer. The plot_handler holds the output directory information, etc.

Parameters:

plot_handler (plot_handler, opt) – The plot_handler saves the output location and results_db connection for a set of plots.
plot_func (maf.plots.BasePlotter, opt) – Any plotter function. If not specified, the plotters in self.plot_funcs will be used.
outfile_suffix (str, opt) – Optional string to append to the end of the plot output files. Useful when creating sequences of images for movies.
savefig (bool, opt) – Flag indicating whether or not to save the figure to disk. Default is False.

Returns:

made_plots – Dictionary of plot_type:figure key/value pairs, indicating what plots were created and what matplotlib figures were used.

Return type:

dict

read(filename)[source]#

Read metric data from disk. Overwrites any data currently in metricbundle.

Parameters:: filename (str) – The file from which to read the metric bundle data.

reduce_metric(reduce_func, reduce_func_name=None, reduce_plot_dict=None, reduce_display_dict=None)[source]#

Run reduce_func (any function that operates on self.metric_values).

Typically reduce_func will be one of the metric's reduce functions, as they are tailored to expect the metric_values format. reduce_display_dict and reduce_plot_dict are the display and plot dictionaries to be applied to the new metric bundle.

Parameters:

reduce_func (Func) – Any function that will operate on self.metric_values (typically metric.reduce* function).
reduce_plot_dict (dict, opt) – Plot dictionary for the results of the reduce function.
reduce_display_dict (dict, opt) – Display dictionary for the results of the reduce function.

Returns:

newmetric_bundle – New metric bundle, inheriting info_label from this metric bundle, but containing the new metric values calculated with reduce_func.

Return type:

MetricBundle
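As a sketch of the idea: when a metric returns a compound value at each slice point, a reduce function turns each compound value into a single number, and the results form a new set of metric values. The example uses plain Python stand-ins (a dictionary per slice point and a hypothetical reduce_n_visits function). It does not call rubin_sim and does not reproduce its data format.

```python
def reduce_values(metric_values, reduce_func):
    """Apply reduce_func at every slice point, leaving empty (None) points empty."""
    return [None if v is None else reduce_func(v) for v in metric_values]


def reduce_n_visits(value):
    """Hypothetical reduce function: pull one number out of a compound value."""
    return value["n_visits"]


# Compound metric values at three slice points; the middle one has no data.
compound_values = [
    {"n_visits": 3, "median_gap_days": 4.0},
    None,
    {"n_visits": 8, "median_gap_days": 2.5},
]
reduced = reduce_values(compound_values, reduce_n_visits)
print(reduced)
```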

set_display_dict(display_dict=None, results_db=None)[source]#

Set or update any property of display_dict.

Parameters:

display_dict (dict of str) – Dictionary of display parameters for show_maf. Expected keys are: ‘group’, ‘subgroup’, ‘order’, ‘caption’. ‘group’, ‘subgroup’, and ‘order’ control where the metric results are shown on the show_maf page. ‘caption’ provides a caption to use with the metric results. These values are saved in the results database.
results_db (maf.ResultsDb) – A MAF results database, used to save the display parameters.
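The “set or update” behavior can be pictured as a dictionary merge: keys that are passed in replace the existing values, and other keys are left alone. The sketch below uses a plain dict with hypothetical starting values. It does not reproduce the defaults rubin_sim applies, nor the write to results_db.

```python
def update_display_dict(current, updates=None):
    """Return a copy of current with any keys in updates overriding it."""
    merged = dict(current)
    merged.update(updates or {})
    return merged


# Hypothetical starting values for the four expected keys.
current = {"group": "Ungrouped", "subgroup": "General", "order": 0, "caption": ""}
updated = update_display_dict(current, {"group": "Cadence", "order": 2})
print(updated)
```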

set_plot_dict(plot_dict)[source]#

Set or update any property of plot_dict.

Parameters:: plot_dict (dict) – A dictionary of plotting parameters. The usable keywords vary with each rubin_sim.maf.plots Plotter.

set_plot_funcs(plot_funcs)[source]#

Set or reset the plotting functions.

The default is to use all the plotters associated with the slicer, which is what happens in self.plot if set_plot_funcs is not used to override self.plot_funcs.

Parameters:: plot_funcs (List [BasePlotter]) – The plotter or plotters to use to generate visuals for this metric.

set_run_name(run_name, update_file_root=True)[source]#

Set (or reset) the run_name. The file_root will be updated accordingly if desired.

Parameters:

run_name (str) – Run name, which will become part of the file_root.
update_file_root (bool, optional) – Flag to update the file_root with the run_name.

set_summary_metrics(summary_metrics)[source]#

Set (or reset) the summary metrics for the metricbundle.

Parameters:: summary_metrics (List [BaseMetric]) – Instantiated summary metrics to use to calculate summary statistics for this metric.

write(comment='', out_dir='.', outfile_suffix=None, results_db=None)[source]#

Write metric_values (and associated info_label) to disk.

Parameters:

comment (str) – Any additional comments to add to the output file
out_dir (str) – The output directory
outfile_suffix (str) – Additional suffix to add to the output files (typically a numerical suffix for movies)
results_db (maf.ResultsDb) – Results database to store information on the file output

write_db(results_db=None, outfile_suffix=None)[source]#: Write the metric information to the results database

class rubin_sim.maf.metric_bundles.MetricBundleGroup(bundle_dict, db_con, out_dir='.', results_db=None, verbose=False, save_early=True, db_table=None)[source]#

Bases: object

Calculate all values for a group of MetricBundles.

Parameters:

bundle_dict (dict or list [MetricBundles]) – Individual MetricBundles should be placed into a dictionary, and then passed to the MetricBundleGroup. The dictionary keys can then be used to identify MetricBundles if needed – and to identify new MetricBundles which could be created if ‘reduce’ functions are run on a particular MetricBundle. A bundle_dict can be conveniently created from a list of MetricBundles using make_bundles_dict_from_list (done automatically if a list is passed).
db_con (str or database connection object) – A str that is the path to a sqlite3 file or a database object that can be used by pandas.read_sql. Advanced use: It is possible to set this to None, in which case data should be passed directly to the run_current method (and run_all should not be used).
out_dir (str, opt) – Directory to save the metric results. Default is the current directory.
results_db (ResultsDb, opt) – A results database to store summary stat information. If not specified, one will be created in the out_dir. This database saves information about the metrics calculated, including their summary statistics.
verbose (bool, opt) – Flag to turn on/off verbose feedback.
save_early (bool, opt) – If True, metric values will be saved immediately after they are first calculated (to prevent data loss) as well as after summary statistics are calculated. If False, metric values will only be saved after summary statistics are calculated.
db_table (str, opt) – The name of the table in db_con to query for data. For modern opsim outputs, this table is observations (default None).

Notes

The MetricBundleGroup will query data from a single database table (for multiple constraints), use that data to calculate metric values for multiple slicers, and calculate summary statistics and generate plots for all metrics included in the dictionary passed to the MetricBundleGroup.

We calculate the metric values here, rather than in the individual MetricBundles, because it is much more efficient to step through a slicer once (and calculate all the relevant metric values at each point) than it is to repeat this process multiple times.

The MetricBundleGroup also determines how to efficiently group the MetricBundles to reduce the number of sql queries of the database, grabbing larger chunks of data at once.
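One way to picture the query grouping: bundles that share a constraint can be served by a single query. The sketch below is plain Python with hypothetical bundle keys and constraints. It illustrates the grouping idea only and is not the MetricBundleGroup implementation.

```python
from collections import defaultdict


def group_keys_by_constraint(constraints):
    """Map each unique constraint to the bundle keys that share it."""
    groups = defaultdict(list)
    for key, constraint in constraints.items():
        groups[constraint].append(key)
    return dict(groups)


# Hypothetical bundle keys and their constraints.
constraints = {
    "count_r": "filter='r'",
    "coadd_depth_r": "filter='r'",
    "count_all": "",
}
groups = group_keys_by_constraint(constraints)
# Three bundles but only two distinct constraints, so two queries would suffice.
n_queries = len(groups)
print(groups, n_queries)
```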

get_data(constraint)[source]#

Query the data from the database.

The current bundle_dict should generally be set before calling get_data (using set_current).

Parameters:: constraint (str) – The constraint for the currently active set of MetricBundles.

plot_all(save_figs=True, outfile_suffix=None, fig_format='pdf', dpi=600, trim_whitespace=True, thumbnail=True, closefigs=True)[source]#

Generate all the plots for all the MetricBundles in bundle_dict.

Generating all plots for all MetricBundles at this point assumes that clear_memory was False.

Parameters:

savefig (bool, optional) – If True, save figures to disk, to self.out_dir directory.
outfile_suffix (str, optional) – Append outfile_suffix to the end of every plot file generated. Useful for generating sequential series of images for movies.
fig_format (str, optional) – Matplotlib figure format to use to save to disk.
dpi (int, optional) – DPI for matplotlib figure.
trim_whitespace (bool, optional) – If True, trim additional whitespace from final figures.
thumbnail (bool, optional) – If True, save a small thumbnail jpg version of the output file to disk as well. This is useful for show_maf web pages.
closefigs (bool, optional) – Close the matplotlib figures after they are saved to disk. If many figures are generated, closing the figures saves significant memory.

plot_current(savefig=True, outfile_suffix=None, fig_format='pdf', dpi=600, trim_whitespace=True, thumbnail=True, closefigs=True)[source]#

Generate the plots for the currently active set of MetricBundles.

Parameters:

savefig (bool, optional) – If True, save figures to disk, to self.out_dir directory.
outfile_suffix (str, optional) – Append outfile_suffix to the end of every plot file generated. Useful for generating sequential series of images for movies.
fig_format (str, optional) – Matplotlib figure format to use to save to disk.
dpi (int, optional) – DPI for matplotlib figure.
trim_whitespace (bool, optional) – If True, trim additional whitespace from final figures.
thumbnail (bool, optional) – If True, save a small thumbnail jpg version of the output file to disk as well. This is useful for show_maf web pages.
closefigs (bool, optional) – Close the matplotlib figures after they are saved to disk. If many figures are generated, closing the figures saves significant memory.

read_all()[source]#

Attempt to read all MetricBundles from disk.

You must set the metrics/slicer/constraint/run_name for a metricBundle appropriately, so that the file_root is correct.

reduce_all(update_summaries=True)[source]#

Run the reduce methods for all metrics in bundle_dict.

Running this method for all MetricBundles at once assumes that clear_memory was False.

Parameters:: update_summaries (bool, optional) – If True, summary metrics are removed from the top-level (non-reduced) MetricBundle. Usually this should be True, as summary metrics are generally intended to run on the simpler data produced by reduce metrics.

reduce_current(update_summaries=True)[source]#

Run all reduce functions for the metricbundle in the currently active set of MetricBundles.

Parameters:: update_summaries (bool, optional) – If True, summary metrics are removed from the top-level (non-reduced) MetricBundle. Usually this should be True, as summary metrics are generally intended to run on the simpler data produced by reduce metrics.

run_all(clear_memory=False, plot_now=False, plot_kwargs=None)[source]#

Calculates metric values, then runs reduce functions and summary statistics for all MetricBundles, over all constraints.

Parameters:

clear_memory (bool, optional) – If True, deletes metric values from memory after running each constraint group.
plot_now (bool, optional) – If True, plots the metric values immediately after calculation.
plot_kwargs (dict, optional) – Keyword arguments to pass to plot_current.
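A minimal usage sketch, assuming rubin_sim is installed, that bundle_list holds already-configured MetricBundles, and that opsim_db_path points to an opsim sqlite file. The function name run_bundles and the default output directory are hypothetical. Only calls documented on this page are used.

```python
def run_bundles(bundle_list, opsim_db_path, out_dir="maf_output"):
    """Compute metric values, reduce functions, and summary statistics, then plot."""
    # Imported lazily so this sketch can be loaded without rubin_sim installed.
    from rubin_sim.maf import metric_bundles

    # Key the bundles by file_root; duplicates raise an exception.
    bundle_dict = metric_bundles.make_bundles_dict_from_list(bundle_list)
    group = metric_bundles.MetricBundleGroup(
        bundle_dict, opsim_db_path, out_dir=out_dir
    )
    # clear_memory is left at False so metric values stay available for plotting.
    group.run_all(clear_memory=False)
    group.plot_all(closefigs=True)
    return bundle_dict
```

Leaving clear_memory at False matters here because plot_all works on the metric values still held in memory.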

run_current(constraint, sim_data=None, clear_memory=False, plot_now=False, plot_kwargs=None)[source]#

Calculates the metric values, then runs reduce functions and summary statistics for metrics in the current set only (see self.set_current).

Parameters:

constraint (str) – constraint to use to set the currently active metrics
sim_data (np.ndarray, opt) – If sim_data is not None, then this numpy structured array is used instead of querying data from db_con.
clear_memory (bool, opt) – If True, metric values are deleted from memory after they are calculated (and saved to disk).
plot_now (bool, opt) – Plot immediately after calculating metric values (instead of the usual procedure, which is to plot after metric values are calculated for all constraints).
plot_kwargs (kwargs, opt) – Plotting kwargs to pass to plot_current.

Notes

This is useful when running only a specific set of metric bundles, so that the user can provide sim_data directly.

set_current(constraint)[source]#

Utility to set the current_bundle_dict (i.e. a set of MetricBundles with the same SQL constraint).

Parameters:: constraint (str) – The subset of MetricBundles with metric_bundle.constraint == constraint will be included in a subset identified as the current_bundle_dict. These are the active metrics to be calculated and plotted, etc.

Notes

This is useful when running only a specific set of metric bundles, so that the user can provide sim_data directly.

summary_all()[source]#

Run the summary statistics for all metrics in bundle_dict.

Calculating all summary statistics for all MetricBundles at this point assumes that clear_memory was False.

summary_current()[source]#: Run summary statistics on all the metricBundles in the currently active set of MetricBundles.

write_all()[source]#

Save all the MetricBundles to disk.

Saving all MetricBundles to disk at this point assumes that clear_memory was False.

write_current()[source]#: Save all the MetricBundles in the currently active set to disk.

class rubin_sim.maf.metric_bundles.MoMetricBundle(metric, slicer, constraint=None, stacker_list=None, run_name='run name', info_label=None, file_root=None, plot_dict=None, plot_funcs=None, display_dict=None, child_metrics=None, summary_metrics=None)[source]#

Bases: MetricBundle

Define a moving object metric bundle: a combination of moving-object metric, moving-object slicer, and constraint.

Parameters:

metric (metric) – The Metric class to run per slice_point
slicer (slicer) – The Slicer to apply to the incoming visit data (the observations).
constraint (str or None, opt) – A (sql-style) constraint to apply to the visit data, to apply a broad sub-selection.
stacker_list (list [stacker], opt) – A list of pre-configured stackers to use to generate additional columns per visit. These will be generated automatically if needed, but pre-configured versions will override these.
run_name (str, opt) – The name of the simulation being run. This will be added to output files and plots. Setting it prevents file conflicts when running the same metric on multiple simulations, and provides a way to identify which simulation is being analyzed.
info_label (str or None, opt) – Information to add to the output metric data file name and plot labels. If this is not provided, it will be auto-generated from the constraint (if any). Setting this provides an easy way to distinguish different configurations of a metric or a slicer, or just to rewrite your constraint into friendlier terms (e.g. a constraint like ‘scheduler_note not like “%DD%”’ can become “non-DD” in the file name and plot labels by specifying info_label).
plot_dict (dict of plotting parameters, opt) – Specify general plotting parameters, such as x/y/color limits.
display_dict (dict of display parameters, opt) – Specify parameters for show_maf web pages, such as the side bar labels and figure captions. Keys: ‘group’, ‘subgroup’, ‘caption’, and ‘order’ (such as to set metrics in filter order, etc)
child_metrics (list of metrics) – A list of child metrics to run to summarize the primary metric, such as Discovery_At_Time, etc.
summary_metrics (list of metrics) – A list of summary metrics to run to summarize the primary or child metric, such as CompletenessAtH, etc.

Notes

Define the “thing” you are measuring, with a combination of:

* metric (calculated per object)
* slicer (contains information on the moving objects and their observations)
* constraint (an optional definition of a large subset of data)

The MoMetricBundle also saves the child metrics to be used to generate summary statistics over those metric values, as well as the resulting summary statistic values.

Plotting parameters and display parameters (for show_maf) are saved in the MoMetricBundle, as well as additional information such as the opsim run name and info_label, and the relevant stackers and maps to apply when calculating the metric values.

compute_summary_stats(results_db=None)[source]#

Compute summary statistics on metric_values, using summary_metrics, for self and child bundles.

Parameters:: results_db (ResultsDb) – Database which holds the summary statistic information.

reduce_metric(reduce_func, reduce_plot_dict=None, reduce_display_dict=None)[source]#

Run reduce_func (any function that operates on self.metric_values).

Typically reduce_func will be one of the metric's reduce functions, as they are tailored to expect the metric_values format. reduce_display_dict and reduce_plot_dict are the display and plot dictionaries to be applied to the new metric bundle.

Parameters:

reduce_func (Func) – Any function that will operate on self.metric_values (typically metric.reduce* function).
reduce_plot_dict (dict, opt) – Plot dictionary for the results of the reduce function.
reduce_display_dict (dict, opt) – Display dictionary for the results of the reduce function.

Returns:

newmetric_bundle – New metric bundle, inheriting info_label from this metric bundle, but containing the new metric values calculated with reduce_func.

Return type:

MetricBundle

set_child_bundles(child_metrics=None)[source]#

Identify any child metrics to be run on this (parent) bundle, and create the new metric bundles that will hold the child values, linking to this bundle. Remove the summary_metrics from self afterwards.

Parameters:: child_metrics (MoMetric) – Child metrics work like reduce functions for non-moving objects. They pull out subsets of the original metric values, typically do more processing on those values, and then save them in new metric bundles.

class rubin_sim.maf.metric_bundles.MoMetricBundleGroup(bundle_dict, out_dir='.', results_db=None, verbose=True)[source]#

Bases: object

Run groups of MoMetricBundles.

Parameters:

bundle_dict (dict or list [MoMetricBundles]) – Individual MoMetricBundles should be placed into a dictionary, and then passed to the MoMetricBundleGroup. The dictionary keys can then be used to identify MoMetricBundles if needed – and to identify new MetricBundles which could be created if ‘reduce’ functions are run on a particular MoMetricBundle. MoMetricBundles must all have the same Slicer (same set of moving object observations).
out_dir (str, opt) – Directory to save the metric results. Default is the current directory.
results_db (ResultsDb, opt) – A results database to store summary stat information. If not specified, one will be created in the out_dir. This database saves information about the metrics calculated, including their summary statistics.
verbose (bool, opt) – Flag to turn on/off verbose feedback.

plot_all(savefig=True, outfile_suffix=None, fig_format='pdf', dpi=600, thumbnail=True, closefigs=True)[source]#: Make a few generically desired plots. Given the nature of the outputs of many moving object metrics, a good deal of the plotting for the moving object batch is handled in a custom manner, joining together multiple metric bundles.

run_all()[source]#: Run all constraints and metrics for these moMetricBundles.

run_constraint(constraint)[source]#

Calculate the metric values for all the metricBundles which match this constraint in the metricBundleGroup. Also calculates child metrics and summary statistics, and writes all to disk.

Parameters:: constraint (str) – SQL-where or pandas constraint for the metricBundles.

rubin_sim.maf.metric_bundles.create_empty_metric_bundle()[source]#

Create an empty metric bundle.

Returns:: MetricBundle – An empty metric bundle, configured with just the BaseMetric and BaseSlicer.
Return type:: MetricBundle

rubin_sim.maf.metric_bundles.create_empty_mo_metric_bundle()[source]#

Create an empty moving object metric bundle.

Returns:: MoMetricBundle – An empty metric bundle, configured with just the BaseMetric and BaseSlicer.
Return type:: MoMetricBundle

rubin_sim.maf.metric_bundles.make_bundles_dict_from_list(bundle_list)[source]#

Utility to convert a list of MetricBundles into a dictionary, keyed by the file_root names.

Raises an exception if the file_root duplicates another metricBundle. (Note this should alert to potential cases of filename duplication).

Parameters:: bundle_list (list [MetricBundles]) – List of metric bundles to convert into a dict.
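The conversion described here can be sketched in plain Python. This is an illustration of the behavior, not the rubin_sim source: the stand-in objects only carry a file_root attribute, the function name bundles_to_dict is hypothetical, and the choice of ValueError is an assumption, since the text only says that an exception is raised.

```python
from types import SimpleNamespace


def bundles_to_dict(bundle_list):
    """Key each bundle by its file_root, refusing duplicates that would collide on disk."""
    bundle_dict = {}
    for bundle in bundle_list:
        if bundle.file_root in bundle_dict:
            # Exception type is an assumption made for this sketch.
            raise ValueError(f"Duplicate file_root: {bundle.file_root}")
        bundle_dict[bundle.file_root] = bundle
    return bundle_dict


# Hypothetical stand-ins for MetricBundles.
count_r = SimpleNamespace(file_root="example_run_Count_r_HEAL")
count_g = SimpleNamespace(file_root="example_run_Count_g_HEAL")
bundle_dict = bundles_to_dict([count_r, count_g])
print(sorted(bundle_dict))
```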

rubin_sim.maf.metric_bundles.make_completeness_bundle(bundle, completeness_metric, h_mark=None, results_db=None)[source]#

Evaluate a MoMetricBundle with a completeness-style metric, and downsample into a new MoMetricBundle marginalized over the population.

Parameters:

bundle (MoMetricBundle) – The metric bundle with a completeness summary statistic.
completeness_metric (metric) – The summary (completeness) metric to run on the bundle.
h_mark (float, optional) – The Hmark value to add to the plotting dictionary of the new mock bundle. Default None.
results_db (ResultsDb, optional) – The results_db in which to record the summary statistic value at Hmark. Default None.

Returns:

mo_metric_bundle

Return type:

MoMetricBundle

Notes

This utility turns a metric bundle which could evaluate a metric over the population into a secondary or mock metric bundle, using either the MoCompleteness or MoCumulativeCompleteness summary metrics to marginalize over the population of moving objects. This lets us use the plot handler together with plots.MetricVsH to generate plots across the population, using the completeness information. The utility also works with a completeness metric run to calculate the fraction of the population, or with the MoCompletenessAtTime metric.
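To make “marginalizing over the population” concrete, the sketch below computes a simple differential completeness: at each H value, the fraction of objects that meet the discovery criterion. It is plain Python with hypothetical data and a hypothetical function name, and it is not the MoCompleteness implementation. The cumulative variant is not shown.

```python
def differential_completeness(discovered, h_values):
    """Fraction of objects discovered at each H value.

    discovered[i][j] is True if object i would be discovered at h_values[j].
    """
    n_objects = len(discovered)
    return {
        h: sum(row[j] for row in discovered) / n_objects
        for j, h in enumerate(h_values)
    }


# Four hypothetical objects evaluated at three H values.
h_values = [18.0, 20.0, 22.0]
discovered = [
    [True, True, True],
    [True, True, False],
    [True, False, False],
    [True, False, False],
]
completeness = differential_completeness(discovered, h_values)
print(completeness)
```

The result has one value per H rather than one per object, which is the shape a plot of a metric against H needs.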