MetricBundleGroup

class rubin_sim.maf.metric_bundles.MetricBundleGroup(bundle_dict, db_con, out_dir='.', results_db=None, verbose=False, save_early=True, db_table=None)

Bases: object

The MetricBundleGroup exists to calculate the metric values for a group of MetricBundles.

The MetricBundleGroup will query data from a single database table (for multiple constraints), use that data to calculate metric values for multiple slicers, and calculate summary statistics and generate plots for all metrics included in the dictionary passed to the MetricBundleGroup.

We calculate the metric values here, rather than in the individual MetricBundles, because it is much more efficient to step through a slicer once (and calculate all the relevant metric values at each point) than it is to repeat this process multiple times.
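As a rough illustration of the single-pass idea described above (this is not the rubin_sim implementation), the sketch below loops over a toy set of slice points once and evaluates every metric at each point. The slice points, data values, and metric functions are all hypothetical stand-ins.

```python
from statistics import mean

# Hypothetical stand-in for a slicer: each slice point maps to the
# data values that fall within it.
slice_points = {
    "point_a": [21.5, 22.0, 23.1],
    "point_b": [20.9, 24.2],
}

# Hypothetical metrics: name -> function applied to the data at one slice point.
metrics = {"count": len, "mean": mean, "max": max}


def run_metrics_single_pass(slice_points, metrics):
    """Step through the slice points once, evaluating every metric at each point."""
    results = {name: {} for name in metrics}
    for point, values in slice_points.items():  # single pass over the slicer
        for name, func in metrics.items():
            results[name][point] = func(values)
    return results


results = run_metrics_single_pass(slice_points, metrics)
```

The data for each slice point is looked up once and shared by all metrics, rather than being looked up again for every metric.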

The MetricBundleGroup also determines how to group the MetricBundles efficiently, reducing the number of SQL queries made against the database by grabbing larger chunks of data at once.
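One way to picture this grouping is to collect bundles that share a constraint, so that each distinct constraint needs only one query. The sketch below is an assumption about the general approach, not the library's code; ToyBundle and group_by_constraint are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ToyBundle:
    """Hypothetical stand-in for a MetricBundle; only the constraint matters here."""
    name: str
    constraint: str


def group_by_constraint(bundle_dict):
    """Group bundle keys by constraint so each constraint is queried once."""
    groups = defaultdict(dict)
    for key, bundle in bundle_dict.items():
        groups[bundle.constraint][key] = bundle
    return dict(groups)


bundle_dict = {
    "nvisits_r": ToyBundle("nvisits", "filter = 'r'"),
    "depth_r": ToyBundle("depth", "filter = 'r'"),
    "nvisits_all": ToyBundle("nvisits", ""),
}
groups = group_by_constraint(bundle_dict)
```

Here three bundles collapse into two groups, so only two queries would be needed instead of three.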

Parameters:
bundle_dict : dict or list of MetricBundles

Individual MetricBundles should be placed into a dictionary, and then passed to the MetricBundleGroup. The dictionary keys can then be used to identify MetricBundles if needed, and to identify new MetricBundles that could be created if 'reduce' functions are run on a particular MetricBundle. A bundle_dict can be conveniently created from a list of MetricBundles using makeBundlesDictFromList (done automatically if a list is passed in).

db_con : str or database connection object

A str that is the path to a sqlite3 file, or a database object that can be used by pandas.read_sql. Advanced use: It is possible to set this to None, in which case data should be passed directly to the run_current method (and run_all should not be used).

out_dir : str, optional

Directory to save the metric results. Default is the current directory.

results_db : ResultsDb, optional

A results database. If not specified, one will be created in the out_dir. This database saves information about the metrics calculated, including their summary statistics.

verbose : bool, optional

Flag to turn on/off verbose feedback.

save_early : bool, optional

If True, metric values will be saved immediately after they are first calculated (to prevent data loss) as well as after summary statistics are calculated. If False, metric values will only be saved after summary statistics are calculated.

db_table : str, optional

The name of the table in db_con to query for data.
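The three forms accepted for db_con (a path, a connection object, or None) could be handled along the following lines. This is a minimal sketch using only the standard sqlite3 module; resolve_connection is a hypothetical helper and is not part of rubin_sim.

```python
import sqlite3


def resolve_connection(db_con):
    """Hypothetical helper (not part of rubin_sim): normalize a db_con value."""
    if db_con is None:
        # No database: the caller is expected to supply the data directly.
        return None
    if isinstance(db_con, str):
        # A str is treated as the path to a sqlite3 file.
        return sqlite3.connect(db_con)
    # Anything else is assumed to already be a usable connection object.
    return db_con


# ":memory:" is sqlite3's special name for a temporary in-memory database,
# used here so the sketch does not touch the file system.
existing = sqlite3.connect(":memory:")
from_path = resolve_connection(":memory:")
```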

Methods Summary

get_data(constraint)

Query the data from the database.

plot_all([save_figs, outfile_suffix, ...])

Generate all the plots for all the MetricBundles in bundle_dict.

plot_current([savefig, outfile_suffix, ...])

Generate the plots for the currently active set of MetricBundles.

read_all()

Attempt to read all MetricBundles from disk.

reduce_all([update_summaries])

Run the reduce methods for all metrics in bundle_dict.

reduce_current([update_summaries])

Run all reduce functions for the MetricBundles in the currently active set.

run_all([clear_memory, plot_now, plot_kwargs])

Runs all the MetricBundles in the MetricBundleGroup, over all constraints.

run_current(constraint[, sim_data, ...])

Run all the MetricBundles in the MetricBundleGroup that match this constraint.

set_current(constraint)

Utility to set the currentBundleDict (i.e. a set of MetricBundles with the same SQL constraint).

summary_all()

Run the summary statistics for all metrics in bundle_dict.

summary_current()

Run summary statistics on all the MetricBundles in the currently active set.

write_all()

Save all the MetricBundles to disk.

write_current()

Save all the MetricBundles in the currently active set to disk.

Methods Documentation

get_data(constraint)

Query the data from the database.

The currently active set of MetricBundles should generally be set (using set_current) before calling get_data.

Parameters:
constraint : str

The constraint for the currently active set of MetricBundles.
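To show how a constraint string can select rows, the sketch below appends it as a WHERE clause to a query against an in-memory sqlite3 database. The table name, column names, and the query_with_constraint helper are hypothetical, and get_data itself may build its query differently.

```python
import sqlite3


def query_with_constraint(con, table, columns, constraint):
    """Hypothetical helper, not part of rubin_sim: select columns, filtered by constraint."""
    query = f"SELECT {', '.join(columns)} FROM {table}"
    if constraint:
        # An empty constraint means "no filtering": select every row.
        query += f" WHERE {constraint}"
    return con.execute(query).fetchall()


# Build a tiny toy table to query.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE observations (night INTEGER, filter TEXT)")
con.executemany(
    "INSERT INTO observations VALUES (?, ?)",
    [(1, "r"), (1, "g"), (2, "r")],
)

rows = query_with_constraint(con, "observations", ["night", "filter"], "filter = 'r'")
all_rows = query_with_constraint(con, "observations", ["night"], "")
```

Because the constraint is interpolated directly into the SQL text, a helper like this should only be given trusted strings.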

plot_all(save_figs=True, outfile_suffix=None, fig_format='pdf', dpi=600, trim_whitespace=True, thumbnail=True, closefigs=True)

Generate all the plots for all the MetricBundles in bundle_dict.

Generating all plots, for all MetricBundles, at this point assumes that clear_memory was False.

Parameters:
save_figs : bool, optional

If True, save figures to disk, in the self.out_dir directory.

outfile_suffix : str, optional

Append outfile_suffix to the end of every plot file generated. Useful for generating sequential series of images for movies.

fig_format : str, optional

Matplotlib figure format to use to save to disk. Default pdf.

dpi : int, optional

DPI for matplotlib figure. Default 600.

trim_whitespace : bool, optional

If True, trim additional whitespace from final figures. Default True.

thumbnail : bool, optional

If True, save a small thumbnail jpg version of the output file to disk as well. This is useful for showMaf web pages. Default True.

closefigs : bool, optional

Close the matplotlib figures after they are saved to disk. If many figures are generated, closing the figures saves significant memory. Default True.

plot_current(savefig=True, outfile_suffix=None, fig_format='pdf', dpi=600, trim_whitespace=True, thumbnail=True, closefigs=True)

Generate the plots for the currently active set of MetricBundles.

Parameters:
savefig : bool, optional

If True, save figures to disk, in the self.out_dir directory.

outfile_suffix : str, optional

Append outfile_suffix to the end of every plot file generated. Useful for generating sequential series of images for movies.

fig_format : str, optional

Matplotlib figure format to use to save to disk. Default pdf.

dpi : int, optional

DPI for matplotlib figure. Default 600.

trim_whitespace : bool, optional

If True, trim additional whitespace from final figures. Default True.

thumbnail : bool, optional

If True, save a small thumbnail jpg version of the output file to disk as well. This is useful for showMaf web pages. Default True.

closefigs : bool, optional

Close the matplotlib figures after they are saved to disk. If many figures are generated, closing the figures saves significant memory. Default True.

read_all()

Attempt to read all MetricBundles from disk.

You must set the metrics/slicer/constraint/run_name for each MetricBundle appropriately; this method will then search for files in the location self.out_dir/metricBundle.fileRoot. Reads all the files associated with all MetricBundles in self.bundle_dict.

reduce_all(update_summaries=True)

Run the reduce methods for all metrics in bundle_dict.

Running this method, for all MetricBundles at once, assumes that clear_memory was False.

Parameters:
update_summaries : bool, optional

If True, summary metrics are removed from the top-level (non-reduced) MetricBundle. Usually this should be True, as summary metrics are generally intended to run on the simpler data produced by reduce metrics.

reduce_current(update_summaries=True)

Run all reduce functions for the MetricBundles in the currently active set.

Parameters:
update_summaries : bool, optional

If True, summary metrics are removed from the top-level (non-reduced) MetricBundle. Usually this should be True, as summary metrics are generally intended to run on the simpler data produced by reduce metrics.

run_all(clear_memory=False, plot_now=False, plot_kwargs=None)

Runs all the MetricBundles in the MetricBundleGroup, over all constraints.

Calculates metric values, then runs reduce functions and summary statistics for all MetricBundles.

Parameters:
clear_memory : bool, optional

If True, deletes metric values from memory after running each constraint group.

plot_now : bool, optional

If True, plots the metric values immediately after calculation.

plot_kwargs : dict, optional

Keyword arguments to pass to plot_current.

run_current(constraint, sim_data=None, clear_memory=False, plot_now=False, plot_kwargs=None)

Run all the MetricBundles in the MetricBundleGroup that match this constraint.

Calculates the metric values, then runs reduce functions and summary statistics for metrics in the current set only (see self.set_current).

Parameters:
constraint : str

Constraint to use to set the currently active metrics.

sim_data : numpy.ndarray, optional

If sim_data is not None, then this numpy structured array is used instead of querying data from db_con.

clear_memory : bool, optional

If True, metric values are deleted from memory after they are calculated (and saved to disk).

plot_now : bool, optional

Plot immediately after calculating metric values (instead of the usual procedure, which is to plot after metric values are calculated for all constraints).

plot_kwargs : dict, optional

Plotting keyword arguments to pass to plot_current.

set_current(constraint)

Utility to set the currentBundleDict (i.e. a set of MetricBundles with the same SQL constraint).

Parameters:
constraint : str

The subset of MetricBundles with metricBundle.constraint == constraint will be included in a subset identified as the currentBundleDict. These are the active metrics to be calculated and plotted, etc.
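The selection rule described above (keep the bundles whose constraint equals the requested one) can be sketched with a dictionary comprehension. The bundles here are plain dicts and select_current is a hypothetical helper, so this only illustrates the matching rule, not the method's actual code.

```python
def select_current(bundle_dict, constraint):
    """Hypothetical helper: keep only bundles whose constraint matches exactly."""
    return {
        key: bundle
        for key, bundle in bundle_dict.items()
        if bundle["constraint"] == constraint
    }


# Toy bundles, represented as plain dicts for illustration.
bundle_dict = {
    "nvisits_r": {"metric": "count", "constraint": "filter = 'r'"},
    "nvisits_g": {"metric": "count", "constraint": "filter = 'g'"},
    "depth_r": {"metric": "coadd_depth", "constraint": "filter = 'r'"},
}

current = select_current(bundle_dict, "filter = 'r'")
```

In this sketch the comparison is an exact string match, so two constraints that differ only in whitespace would be treated as different.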

summary_all()

Run the summary statistics for all metrics in bundle_dict.

Calculating all summary statistics, for all MetricBundles, at this point assumes that clear_memory was False.

summary_current()

Run summary statistics on all the MetricBundles in the currently active set.

write_all()

Save all the MetricBundles to disk.

Saving all MetricBundles to disk at this point assumes that clear_memory was False.

write_current()

Save all the MetricBundles in the currently active set to disk.