Convert FOB submodule to regular folder

This commit is contained in:
arihanv 2025-05-18 16:36:28 -07:00
parent 94f046ad40
commit 94825011a0
74 changed files with 4563 additions and 0 deletions


@@ -0,0 +1,131 @@
# Evaluation
During training you can monitor your experiments with [Tensorboard](https://www.tensorflow.org/tensorboard).
We also try to provide some useful functionality to quickly evaluate and compare the results of your experiments.
One can use ```evaluate_experiment.py``` to get a quick first impression of a finished experiment run.
## Plotting vs. raw data
You can use the plotting pipeline with your customized setting (as shown in the usage examples).
Alternatively, you can use the script to export your data to a .csv file and process it according to your own needs.
In this scenario, set ```evaluation.output_types: [csv] # no plotting, just the data``` in your experiment yaml.
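The exported csv is plain tabular data, so any standard tooling works on it. A minimal sketch using pandas (the csv content here is a hypothetical stand-in; column names follow the flattened `section.key` convention used in the experiment yamls):

```python
import io
import pandas as pd

# A tiny stand-in for a csv exported by the evaluation pipeline.
raw = io.StringIO(
    "optimizer.name,optimizer.learning_rate,engine.seed,test_acc\n"
    "adamw_baseline,0.001,1,0.97\n"
    "adamw_baseline,0.001,2,0.96\n"
    "adamw_baseline,0.01,1,0.92\n"
    "adamw_baseline,0.01,2,0.93\n"
)
df = pd.read_csv(raw)  # in practice: pd.read_csv("path/to/exported.csv")

# Hyperparameters are ordinary columns, so standard pandas tooling applies,
# e.g. mean and std of the test accuracy per learning rate:
summary = df.groupby("optimizer.learning_rate")["test_acc"].agg(["mean", "std"])
```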
## Usage Examples
In the following you can find 4 example use cases for experiments and how to visualize the results as heatmaps.
1. testing an optimizer on a task
2. comparing two optimizers on the same task
3. comparing multiple optimizers on different tasks
4. comparing the influence of a single hyperparameter
Here we want to focus on the plotting. For instructions on how to run experiments, refer to the main [README](../../README.md). To get started right away, we provide the data for this example. If you want to reproduce it, refer to [this section](#reproducing-the-data).
### Plotting the experiment
By default, calling `run_experiment.py` will plot the experiment after training and testing. To disable this, set `engine.plot=false`.
To plot your experiment afterwards, call `evaluate_experiment.py` with the same experiment yaml. To adjust the plot, change the values under the `evaluation` key of the experiment. Take a look at the [evaluation/default.yaml](default.yaml) to see which settings are available. Some of these keys are explained in the examples below to give you a first impression. Note that some default parameters are set in the respective tasks (e.g. in [tasks/mnist/default.yaml](../tasks/mnist/default.yaml)).
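For instance, such an override in an experiment yaml could look like the following sketch (the values are illustrative only; see [evaluation/default.yaml](default.yaml) for the full list of keys):

```yaml
# illustrative values only; any key from evaluation/default.yaml can be overridden
evaluation:
  checkpoints: [last]        # evaluate only the final model
  output_types: [png, csv]   # plot as png and also export the raw data
  plot:
    std: true                # annotate each cell with the std over seeds
    format: "2.1"            # show accuracy as percent with one decimal
```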
### Example use cases
Here are some example scenarios to give you an understanding of how our plotting works. Run the commands from the root of the repository. Take a look at the yaml files used in the command to see what is going on.
#### Example 1
This example is a good starting point; it shows the performance of a single default optimizer on one of the tasks.
Experiment file: [examples/plotting/1_mnist-adamw.yaml](../../examples/plotting/1_mnist-adamw.yaml)
```python -m pytorch_fob.evaluate_experiment examples/plotting/1_mnist-adamw.yaml```
![your plot is not finished yet](../../examples/plotting/1_mnist-adamw-last-heatmap.png)
This example uses only the final model performance and only creates the plot as png.
Helpful settings:
- ```checkpoints: [last]``` # you could use [last, best] to additionally plot the model with the best validation performance
- ```output_types: [png]``` # you could use [pdf, png] to also create a pdf
#### Example 2
You can compare two different optimizers.
Experiment file: [examples/plotting/2_adamw-vs-sgd.yaml](../../examples/plotting/2_adamw-vs-sgd.yaml)
```python -m pytorch_fob.evaluate_experiment examples/plotting/2_adamw-vs-sgd.yaml```
![your plot is not finished yet](../../examples/plotting/2_adamw-vs-sgd-last-heatmap.png)
Helpful settings:
- ```plot.x_axis: [optimizer.weight_decay, optimizer.kappa_init_param]``` # the parameters given here are used for the x-axis; the order in the list maps left to right onto the plot columns
- `column_split_key: optimizer.name` This creates a column for each different optimizer (default behavior). You can set this to null to disable columns or choose a different key.
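The effect of `column_split_key` can be sketched with a plain pandas groupby; this is only an illustration of the splitting idea, not the pipeline's actual implementation (the toy data is hypothetical):

```python
import pandas as pd

# Toy run data: two trials per optimizer.
runs = pd.DataFrame({
    "optimizer.name": ["adamw_baseline", "adamw_baseline",
                       "sgd_baseline", "sgd_baseline"],
    "test_acc": [0.97, 0.96, 0.95, 0.94],
})

# column_split_key: optimizer.name -> one heatmap column per optimizer
per_optimizer = {name: group for name, group in runs.groupby("optimizer.name")}
```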
#### Example 3
There are multiple tasks in the benchmark; this example shows how to get a quick overview of several at the same time.
Experiment file: [examples/plotting/3_mnist-and-tabular_adamw-vs-sgd.yaml](../../examples/plotting/3_mnist-and-tabular_adamw-vs-sgd.yaml)
```python -m pytorch_fob.evaluate_experiment examples/plotting/3_mnist-and-tabular_adamw-vs-sgd.yaml```
![your plot is not finished yet](../../examples/plotting/3_mnist-and-tabular_adamw-vs-sgd-last-heatmap.png)
Helpful settings:
- ```split_groups: ["task.name"]```
Every non-unique value for each parameter name in `split_groups` will create its own subplot.
Instead of a list, you can set this to `false` to disable splitting, or `true` to split on every parameter that differs between runs (except those already in `column_split_key` or `aggregate_groups`).
This list is useful if there are just a few parameters you want to split on.
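The splitting idea can be sketched as: every distinct value of a parameter in `split_groups` becomes its own subplot row, while the parameters in `aggregate_groups` (`engine.seed` by default) are averaged within each row. A minimal pandas illustration (toy data, not the pipeline's actual code):

```python
import pandas as pd

# Toy run data: two tasks, two seeds per configuration.
runs = pd.DataFrame({
    "task.name": ["mnist", "mnist", "tabular", "tabular"],
    "engine.seed": [1, 2, 1, 2],
    "test_acc": [0.97, 0.96, 0.88, 0.89],
})

# split_groups: ["task.name"] -> one subplot row per task,
# with the seed dimension aggregated away inside each row.
rows = {task: group["test_acc"].mean()
        for task, group in runs.groupby("task.name")}
```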
#### Example 4
Any parameter that is neither on the x-axis nor y-axis will either be aggregated over or split into subplots.
Any individual square of a heatmap shows the *mean* and *std* over multiple runs (as seen in the previous plots). Here we show how to choose the runs to aggregate.
Experiment file: [examples/plotting/4_adamw-vs-sgd_seeds.yaml](../../examples/plotting/4_adamw-vs-sgd_seeds.yaml)
```python -m pytorch_fob.evaluate_experiment examples/plotting/4_adamw-vs-sgd_seeds.yaml```
![your plot is not finished yet](../../examples/plotting/4_adamw-vs-sgd_seeds-last-heatmap.png)
Helpful settings:
- Control the std with
- ```plot.std``` # toggle off with ```False```
- ```plot.aggfunc: std``` # also try ```var```
- Control the rows with
- ```split_groups: ["engine.seed"]```
- ```aggregate_groups: []```
By default, the plot will display the *mean* and *std* calculated over the seeds.
We need to remove the seed from the ```aggregate_groups``` list (by giving an empty list instead). This list is useful if there are additional parameters you want to aggregate over.
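The mean/std aggregation behind each heatmap cell corresponds to a pandas pivot table; a minimal sketch with toy data (two seeds per cell, as in this example experiment; not the pipeline's actual implementation):

```python
import pandas as pd

# Two seeds per (learning_rate, weight_decay) configuration.
runs = pd.DataFrame({
    "optimizer.learning_rate": [1e-3, 1e-3, 1e-2, 1e-2],
    "optimizer.weight_decay": [0.1, 0.1, 0.1, 0.1],
    "engine.seed": [1, 2, 1, 2],
    "test_acc": [0.97, 0.96, 0.92, 0.93],
})

# The mean over seeds fills each heatmap cell ...
mean_table = pd.pivot_table(runs, values="test_acc",
                            index="optimizer.learning_rate",
                            columns="optimizer.weight_decay", aggfunc="mean")
# ... and the +/- annotation comes from a second table whose aggfunc is
# taken from evaluation.plot.aggfunc (std by default).
std_table = pd.pivot_table(runs, values="test_acc",
                           index="optimizer.learning_rate",
                           columns="optimizer.weight_decay", aggfunc="std")
```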
-------------------------------------------------------------------------------
### Reproducing the Data
Let's create some data that we can plot; from the root directory call:
#### Data Download
First, we make sure the data is downloaded:
```python -m pytorch_fob.dataset_setup examples/plotting/3_mnist-and-tabular_adamw-vs-sgd.yaml```
This will download the mnist data (required for examples 1-4) and the tabular data (required for example 3) into the [examples/data](../../examples/data) directory. The path can be changed in the corresponding yaml (e.g. [examples/plotting/1_mnist-adamw.yaml](../../examples/plotting/1_mnist-adamw.yaml)) if you have already set up your benchmark.
Estimated disk usage for the data: ~65 MB
#### Training
Each of the 2 tasks is run on a 2x2 hyperparameter grid with 2 different seeds per optimizer, for a total of 32 runs.
```python -m pytorch_fob.run_experiment examples/plotting/3_mnist-and-tabular_adamw-vs-sgd.yaml```
After training has finished, you should find 32 run directories in [examples/plotting/outputs](../../examples/plotting/outputs).
All parameters that differ from the default value are noted in the directory name.


@@ -0,0 +1,5 @@
from pathlib import Path
def evaluation_path() -> Path:
return Path(__file__).resolve().parent


@@ -0,0 +1,54 @@
evaluation:
data_dirs: null # List of Paths
output_dir: null # output filename is output_dir / experiment_name
experiment_name: null
split_groups: false # {True, False, [param.a, param.b, ...]} create additional plots where the data is grouped by the given parameter; True to detect all params with multiple unique values
aggregate_groups: # groups over which to aggregate values and compute mean/std. Default: [engine.seed]
- engine.seed
depth: 1 # the depth of the trial dirs relative to the given data_dirs
checkpoints: [best, last] # which model checkpoint to use
output_types: [pdf, png, csv] # choose all you want from {csv, pdf, png} and put it in brackets
verbose: False # debug prints
column_split_key: optimizer.name # if set, will split the dataframe and plot it in columns. Default: optimizer.name
column_split_order: null # sets the order in which the columns are plotted.
# keeping the values on null -> automatically figure it out if possible, or let matplotlib decide
plot:
x_axis: # indices on x axis (same order as the subfigures given in data_dirs)
- optimizer.weight_decay
y_axis: # indices on y axis (same order as the subfigures given in data_dirs)
- optimizer.learning_rate
metric: null # is automatically chosen from task name, this will overwrite it
limits: null # sets the limits for the colormap, 2 ints, order does not matter, leave empty for automatic
std: True # show std over aggregated values
aggfunc: std # for example {std, var, sem} which function to use to aggregate over the seeds; will only be used when 'std' is set to true
# format:
# string, how many digits to display, expects two values separated by a dot (e.g. "2.3")
# to make accuracy -> percent use a '2' in front of the dot
# to display 3 digits after the decimal point, write a '3' behind the dot
format: null # for example {"2.0", "2.1", "2.3", "0.2", ...}
single_file: true # if true, save all heatmaps in one file. 'split_groups' are represented as rows.
plotstyle:
tight_layout: True
text:
usetex: True # you can use latex code in the yaml (e.g. $\sqrt{\pi \cdot \sigma}$), but some clusters do not have latex installed
# general font
font:
family: "serif" # matplotlib {serif, sans-serif, cursive, fantasy, monospace}
size: 14
# the font in the tiles of the matrix
matrix_font:
size: 12
scale: 1.0 # scales *figsize* argument by this value, useful for ".png"
color_palette: "rocket"
dpi: 300
# the name of the files storing the hyperparameters of the experiments and the scores
experiment_files:
best_model: results_best_model.json
last_model: results_final_model.json
config: config.yaml


@@ -0,0 +1,30 @@
# pretty names for the plot
names:
# optimizer
adamw_baseline: AdamW
sgd_baseline: SGD
adamcpr: AdamCPR
adamcpr_fast: AdamCPR
sgd_stepwise: SGD (stepwise)
# metric
test_acc: Test Accuracy
test_loss: Test Loss
test_mIoU: Test mean Intersection over Union
test_mAcc: Test mean Accuracy
test_rmse: Test Root Mean Square Error (RMSE)
test_rocauc: Test ROC-AUC
# parameter
learning_rate: Learning Rate
weight_decay: Weight Decay
kappa_init_param: Kappa Init Param
# tasks
classification: classification
classification_small: classification_small
detection: detection
graph: graph
graph_tiny: graph_tiny
mnist: mnist
segmentation: segmentation
tabular: tabular
template: template
translation: translation


@@ -0,0 +1,564 @@
import json
from pathlib import Path
from os import PathLike
from typing import List, Literal
from itertools import repeat
import matplotlib.pyplot as plt
from matplotlib.figure import Figure
import seaborn as sns
import pandas as pd
from pytorch_fob.engine.parser import YAMLParser
from pytorch_fob.engine.utils import AttributeDict, convert_type_inside_dict, log_warn, log_info, log_debug
from pytorch_fob.evaluation import evaluation_path
def get_available_trials(dirname: Path, config: AttributeDict, depth: int = 1):
"""finds the path for all trials in the *dirname* directory"""
# RECURSIVELY FIND ALL DIRS IN DIRNAME (up to depth)
assert isinstance(dirname, Path)
subdirs: list[Path] = [dirname]
all_results_must_be_same_depth = True
for _ in range(depth):
if all_results_must_be_same_depth:
new_subdirs: list[Path] = []
for subdir in subdirs:
new_subdirs += [x for x in subdir.iterdir() if x.is_dir()]
subdirs = new_subdirs
else:
for subdir in subdirs:
subdirs += [x for x in subdir.iterdir() if x.is_dir()]
format_str = "\n " # f-string expression part cannot include a backslash
log_debug(f"found the following directories:{format_str}{format_str.join(str(i) for i in subdirs)}.")
def is_trial(path: Path):
# here we could do additional checks to filter the subdirectories
# currently we only check if there is a config file
for x in path.iterdir():
found_a_config_file = x.name == config.experiment_files.config
if found_a_config_file:
return True
return False
subdirs = list(filter(is_trial, subdirs[::-1]))
log_debug(f"We assume the following to be trials:{format_str}{format_str.join(str(i) for i in subdirs)}.")
return subdirs
def dataframe_from_trials(trial_dir_paths: List[Path], config: AttributeDict) -> pd.DataFrame:
"""takes result from get_available_trials and packs them in a dataframe,
does not filter duplicate hyperparameter settings."""
dfs: List[pd.DataFrame] = []
for path in trial_dir_paths:
config_file = path / config.experiment_files.config
if config.last_instead_of_best:
result_file = path / config.experiment_files.last_model
else:
result_file = path / config.experiment_files.best_model
all_files_exist = all([
config_file.is_file(),
result_file.is_file()
])
if not all_files_exist:
log_warn(f"WARNING: one or more files are missing in {path}. Skipping this hyperparameter setting." +
f" <{config_file}>: {config_file.is_file()} and\n <{result_file}>: {result_file.is_file()})")
continue
yaml_parser = YAMLParser()
yaml_content = yaml_parser.parse_yaml(config_file)
# convert the sub dicts first, then the dict itself
yaml_content = convert_type_inside_dict(yaml_content, src=dict, tgt=AttributeDict)
yaml_content = AttributeDict(yaml_content)
# use user given value
metric_of_value_to_plot = config.plot.metric
# compute it if user has not given a value
if not metric_of_value_to_plot:
raise ValueError("evaluation.plot.metric is not set")
data = pd.json_normalize(yaml_content)
with open(result_file, "r", encoding="utf8") as f:
content = json.load(f)
if metric_of_value_to_plot in content[0]:
data.at[0, metric_of_value_to_plot] = content[0][metric_of_value_to_plot]
else:
log_warn(f"could not find value for {metric_of_value_to_plot} in json")
dfs.append(data)
if len(dfs) == 0:
raise ValueError("no dataframes found, check your config")
df = pd.concat(dfs, sort=False)
return df
def create_matrix_plot(dataframe: pd.DataFrame, config: AttributeDict, cols: str, idx: str, ax=None,
cbar: bool = True, vmin: None | int = None, vmax: None | int = None):
"""
Creates one heatmap and puts it into the grid of subplots.
Uses pd.pivot_table() and sns.heatmap().
"""
df_entry = dataframe.iloc[0]
metric_name = df_entry["evaluation.plot.metric"]
# CLEANING LAZY USER INPUT
# cols are x-axis, idx are y-axis
if cols not in dataframe.columns:
log_warn("x-axis value not present in the dataframe; did you forget to add an 'optimizer.' prefix?\n" +
f" using '{'optimizer.' + cols}' as 'x-axis' instead.")
cols = "optimizer." + cols
if idx not in dataframe.columns:
log_warn("y-axis value not present in the dataframe; did you forget to add an 'optimizer.' prefix?\n" +
f" using '{'optimizer.' + idx}' as 'y-axis' instead.")
idx = "optimizer." + idx
# create pivot table and format the score result
pivot_table = pd.pivot_table(dataframe,
columns=cols, index=idx, values=metric_name,
aggfunc='mean')
fmt = None
format_string = dataframe["evaluation.plot.format"].iloc[0]
# scaling the values given by the user to fit the requested format (and adapting the limits)
value_exp_factor, decimal_points = format_string.split(".")
value_exp_factor = int(value_exp_factor)
decimal_points = int(decimal_points)
if vmin:
vmin *= (10 ** value_exp_factor)
if vmax:
vmax *= (10 ** value_exp_factor)
pivot_table = (pivot_table * (10 ** value_exp_factor)).round(decimal_points)
fmt=f".{decimal_points}f"
# up to here limits was the min and max over all dataframes,
# usually we want to use user values
if "evaluation.plot.limits" in dataframe.columns:
limits = dataframe["evaluation.plot.limits"].iloc[0]
if limits:
vmin = min(limits)
vmax = max(limits)
log_debug(f"setting cbar limits to {vmin}, {vmax} ")
colormap_name = config.plotstyle.color_palette
low_is_better = dataframe["evaluation.plot.test_metric_mode"].iloc[0] == "min"
if low_is_better:
colormap_name += "_r" # this will invert / flip the colorbar
colormap = sns.color_palette(colormap_name, as_cmap=True)
metric_legend = pretty_name(metric_name)
# FINETUNE POSITION
# left bottom width height
# cbar_ax = fig.add_axes([0.92, 0.235, 0.02, 0.6])
cbar_ax = None
if not config.plot.std:
return sns.heatmap(pivot_table, ax=ax, cbar_ax=cbar_ax,
annot=True, fmt=fmt,
annot_kws={'fontsize': config.plotstyle.matrix_font.size},
cbar=cbar, vmin=vmin, vmax=vmax, cmap=colormap, cbar_kws={'label': f"{metric_legend}"})
else:
# BUILD STD TABLE
pivot_table_std = pd.pivot_table(dataframe,
columns=cols, index=idx, values=metric_name,
aggfunc=config.plot.aggfunc, fill_value=float("inf"), dropna=False
)
if float("inf") in pivot_table_std.values.flatten():
log_warn("WARNING: Not enough data to calculate the std, skipping std in plot")
pivot_table_std = (pivot_table_std * (10 ** value_exp_factor)).round(decimal_points)
annot_matrix = pivot_table.copy().astype("string")
for i in pivot_table.index:
for j in pivot_table.columns:
mean = pivot_table.loc[i, j]
std = pivot_table_std.loc[i, j]
std_string = f"\n±({round(std, decimal_points)})" if std != float("inf") else "" # type: ignore
annot_matrix.loc[i, j] = f"{round(mean, decimal_points)}{std_string}" # type: ignore
fmt = "" # cannot format like before, as we do not only have a number
return sns.heatmap(pivot_table, ax=ax, cbar_ax=cbar_ax,
annot=annot_matrix, fmt=fmt,
annot_kws={'fontsize': config.plotstyle.matrix_font.size},
cbar=cbar, vmin=vmin, vmax=vmax, cmap=colormap, cbar_kws={'label': f"{metric_legend}"})
def get_all_num_rows_and_their_names(dataframe_list: list[pd.DataFrame], config):
n_rows: list[int] = []
row_names: list[list[str]] = []
for i, df in enumerate(dataframe_list):
x_axis = config.plot.x_axis[i]
y_axis = config.plot.y_axis[0]
metrics = df["evaluation.plot.metric"].unique()
ignored_cols = [x_axis, y_axis]
ignored_cols += list(metrics)
ignored_cols += config.get("ignore_keys", [])
ignored_cols += config.get("aggregate_groups", [])
current_n_rows, current_names = get_num_rows(df, ignored_cols, config)
n_rows.append(current_n_rows)
if not current_names: # will be empty if we have only one row
current_names.append("default")
row_names.append(current_names)
return n_rows, row_names
def get_num_rows(dataframe: pd.DataFrame, ignored_cols: list[str], config: AttributeDict
) -> tuple[int, list[str]]:
"""each matrix has 2 params (one for x and one for y), one value, and we aggregate over seeds;
if there are more than these 4 parameters with different values,
we want to put them in separate rows instead of aggregating over them.
returns: the number of rows (at least 1) and the names of the cols"""
necesarry_rows = 0
# the user might specify a value for the groups that we should split on in <split_groups>
whitelisted_cols: list[str] | Literal["all"] = "all" # everything is whitelisted if this value stays 'all'
if isinstance(config.split_groups, list):
whitelisted_cols = config.split_groups[:]
elif config.split_groups is False:
whitelisted_cols = []
columns_with_non_unique_values = []
for col in dataframe.columns:
is_eval_key = col.startswith("evaluation.")
is_ignored = col in ignored_cols
is_whitelisted = whitelisted_cols == "all" or col in whitelisted_cols
if any([is_ignored, is_eval_key, not is_whitelisted]):
if is_whitelisted:
log_warn(f"{col} is in the whitelist, but will be ignored. Probably {col} is in both 'split_groups' and 'aggregate_groups'.")
log_debug(f"ignoring {col}")
continue
nunique = dataframe[col].nunique(dropna=False)
if nunique > 1:
log_debug(f"adding {col} since there are {nunique} unique values")
for unique_hp in dataframe[col].unique():
columns_with_non_unique_values.append(f"{col}={unique_hp}")
necesarry_rows += nunique # each unique parameter value should be an individual plot
rows_number = max(necesarry_rows, 1)
col_names = columns_with_non_unique_values
log_debug(f"{rows_number=}")
log_debug(f"{col_names=}")
return rows_number, col_names
def find_global_vmin_vmax(dataframe_list, config):
vmin: int | float | None = None
vmax: int | float | None = None
num_cols = len(dataframe_list)
if num_cols > 1:
# all subplots should have same colors -> we need to find the limits
vmin = float('inf')
vmax = float('-inf')
for i in range(num_cols):
dataframe = dataframe_list[i]
cols = config.plot.x_axis[i]
idx = config.plot.y_axis[0]
key = config.plot.metric
pivot_table = pd.pivot_table(dataframe,
columns=cols, index=idx,
values=key,
aggfunc='mean')
min_value_present_in_current_df = pivot_table.min().min()
max_value_present_in_current_df = pivot_table.max().max()
log_debug("colorbar_limits:\n" +
f" subfigure number {i+1}, checking for metric {key}: \n" +
f" min value is {min_value_present_in_current_df},\n" +
f" max value is {max_value_present_in_current_df}")
vmin = min(vmin, min_value_present_in_current_df)
vmax = max(vmax, max_value_present_in_current_df)
return vmin, vmax
def create_figure(dataframe_list: list[pd.DataFrame], config: AttributeDict):
"""
Takes a list of dataframes. Each dataframe is processed into a column of heatmaps.
"""
num_cols: int = len(dataframe_list)
# calculate the number of rows for each dataframe
n_rows, row_names = get_all_num_rows_and_their_names(dataframe_list, config)
# Handling of the number of rows in the plot
# we could either create a full rectangular grid, or allow each subplot to nest subplots
# for nesting we would need to create subfigures instead of subplots i think
if config.split_groups is False:
n_rows_max = 1
row_names = [["default"] for _ in range(num_cols)]
else:
n_rows_max = max(n_rows)
log_debug(f"{n_rows=} and {num_cols=}")
# TODO, figsize was just hardcoded for (1, 2) grid and left to default for (1, 1) grid
# probably not worth the hassle to create something dynamic (at least not now)
# EDIT: it was slightly adapted to allow multiple rows without being completely unreadable
# margin = (num_subfigures - 1) * 0.3
# figsize=(5*n_cols + margin, 2.5)
scale = config.plotstyle.scale
if num_cols == 1 and n_rows_max > 1:
figsize = (2**3 * scale, 2 * 3 * n_rows_max * scale)
elif num_cols == 2:
# TODO: after removing cbar from left subfigure, it is squished
# there is an argument to share the legend, we should use that
figsize = (12 * scale, 5.4 * n_rows_max * scale)
elif num_cols > 2:
figsize = (12 * (num_cols / 2) * scale, 5.4 * n_rows_max * scale)
else:
figsize = None
# TODO: use seaborn FacetGrid
fig, axs = plt.subplots(n_rows_max, num_cols, figsize=figsize)
if n_rows_max == 1:
axs = [axs]
if num_cols == 1:
axs = [[ax] for ax in axs] # adapt for special case so we have unified types
# Adjust left and right margins as needed
# fig.subplots_adjust(left=0.1, right=0.9, top=0.97, hspace=0.38, bottom=0.05,wspace=0.3)
# None -> plt will choose vmin and vmax
vmin, vmax = find_global_vmin_vmax(dataframe_list, config)
for i in range(num_cols):
num_nested_subfigures: int = n_rows[i]
if not config.split_groups:
create_one_grid_element(dataframe_list, config, axs, i,
j=0,
max_i=num_cols,
max_j=0,
vmin=vmin,
vmax=vmax,
n_rows=n_rows,
row_names=row_names)
else:
for j in range(num_nested_subfigures):
create_one_grid_element(dataframe_list, config, axs, i,
j,
max_i=num_cols,
max_j=num_nested_subfigures,
vmin=vmin,
vmax=vmax,
n_rows=n_rows,
row_names=row_names)
if config.plotstyle.tight_layout:
fig.tight_layout()
# SUPTITLE (the super title on top of the whole figure in the middle)
# # TODO super title might be squished when used together with tight layout (removing for now)
# if n_rows_max > 1 or num_cols > 1:
# # set experiment name as title when multiple matrices in image
# if config.experiment_name:
# fig.suptitle(config.experiment_name)
return fig, axs
def create_one_grid_element(dataframe_list: list[pd.DataFrame], config: AttributeDict, axs,
i: int, j: int, max_i: int, max_j: int, vmin, vmax, n_rows, row_names):
"""does one 'axs' element as it is called in plt"""
num_nested_subfigures: int = n_rows[i]
name_for_additional_subplots: list[str] = row_names[i]
num_subfigures = max_i # from left to right
num_nested_subfigures = max_j # from top to bottom
dataframe = dataframe_list[i]
cols = config.plot.x_axis[i]
idx = config.plot.y_axis[0]
# only include colorbar once
include_cbar: bool = i == num_subfigures - 1
model_param = name_for_additional_subplots[j]
if model_param == "default":
current_dataframe = dataframe # we do not need to do further grouping
else:
param_name, param_value = model_param.split("=", maxsplit=1)
if pd.api.types.is_numeric_dtype(dataframe[param_name]):
param_value = float(param_value)
try:
current_dataframe = dataframe.groupby([param_name]).get_group((param_value,))
except KeyError:
log_warn(f"WARNING: was not able to groupby '{param_name}'," +
"maybe the data was created with different versions of fob; skipping this row")
log_debug(f"{param_name=}{param_value=}{dataframe.columns=}{dataframe[param_name]=}")
return False
current_plot = create_matrix_plot(current_dataframe, config,
cols, idx,
ax=axs[j][i],
cbar=include_cbar, vmin=vmin, vmax=vmax)
# LABELS
# Pretty name for label "learning_rate" => "Learning Rate"
# remove x_label of all but last row, remove y_label for all but first column
if i > 0:
current_plot.set_ylabel('', labelpad=8)
else:
current_plot.set_ylabel(pretty_name(current_plot.get_ylabel()))
if j < num_nested_subfigures - 1:
current_plot.set_xlabel('', labelpad=8)
else:
current_plot.set_xlabel(pretty_name(current_plot.get_xlabel()))
# reading optimizer and task name after grouping
df_entry = current_dataframe.iloc[0] # just get an arbitrary trial
opti_name = df_entry['optimizer.name']
task_name = df_entry['task.name']
# TITLE
# title (heading) of the heatmap: <optimname> on <taskname> (+ additional info)
title = f"{pretty_name(opti_name)} on {pretty_name(task_name)}"
if max_i > 1 or max_j > 1:
title += "" if model_param == "default" else f"\n{model_param}"
current_plot.set_title(title)
def extract_dataframes(workload_paths: List[Path], config: AttributeDict, depth: int = 1
) -> list[pd.DataFrame]:
df_list: list[pd.DataFrame] = []
num_dataframes: int = len(workload_paths)
for i in range(num_dataframes):
available_trials = get_available_trials(workload_paths[i], config, depth)
dataframe = dataframe_from_trials(available_trials, config)
df_list.append(dataframe)
return df_list
def get_output_file_path(dataframe_list: list[pd.DataFrame], config: AttributeDict, suffix: str = "") -> Path:
task_names = [df.iloc[0]["task.name"] for df in dataframe_list]
optim_names = [df.iloc[0]["optimizer.name"] for df in dataframe_list]
task_name = "_".join(sorted(set(task_names)))
optim_name = "_".join(sorted(set(optim_names)))
here = Path(__file__).parent.resolve()
output_dir = Path(config.output_dir) if config.output_dir else here
experiment_name = Path(config.experiment_name) if config.experiment_name else f"{optim_name}-{task_name}"
output_file_path = output_dir / experiment_name
return Path(f"{output_file_path}-{suffix}" if suffix else output_file_path)
def set_plotstyle(config: AttributeDict):
plt.rcParams["text.usetex"] = config.plotstyle.text.usetex
plt.rcParams["font.family"] = config.plotstyle.font.family
plt.rcParams["font.size"] = config.plotstyle.font.size
def pretty_name(name: str, pretty_names: dict | str = {}) -> str: # type: ignore pylint: disable=dangerous-default-value
"""
Tries to use a mapping for the name, else will do some general replacement.
mapping can be a dictionary or the filename of a yaml file with a 'names' key
"""
# reading from yaml and caching the dictionary
label_file: Path = evaluation_path() / "labels.yaml"
if isinstance(pretty_names, str):
label_file = Path(pretty_names)
if pretty_names == {} or isinstance(pretty_names, str):
yaml_parser = YAMLParser()
yaml_content = yaml_parser.parse_yaml(label_file)
pretty_names: dict[str, str] = yaml_content["names"]
# applying pretty names
name_without_yaml_prefix = name.split(".")[-1]
if name in pretty_names.keys():
name = pretty_names[name]
elif name_without_yaml_prefix in pretty_names.keys():
name = pretty_names[name_without_yaml_prefix]
else:
name = name.replace('_', ' ').title()
return name
def save_csv(dfs: list[pd.DataFrame], output_filename: Path):
for i, df in enumerate(dfs):
csv_output_filename = f"{output_filename.resolve()}-{i}.csv"
log_info(f"saving raw data as {csv_output_filename}")
df.to_csv(path_or_buf=csv_output_filename, index=False)
def save_plot(fig: Figure, output_file_path: Path, file_type: str, dpi: int):
plot_output_filename = f"{output_file_path.resolve()}.{file_type}"
log_info(f"saving figure as <{plot_output_filename}>")
fig.savefig(plot_output_filename, dpi=dpi)
def save_files(fig, dfs: list[pd.DataFrame], output_file_path: Path, config: AttributeDict):
output_file_path.parent.mkdir(parents=True, exist_ok=True)
for file_type in config.output_types:
if file_type == "csv":
save_csv(dfs, output_file_path)
elif file_type == "png" or file_type == "pdf":
save_plot(fig, output_file_path, file_type, config.plotstyle.dpi)
def clean_config(config: AttributeDict) -> AttributeDict:
"""some processing that allows the user to be lazy: shortcuts the 'evaluation' namespace, fixes list-valued keys, and keeps the full config available under config.all_values"""
if "evaluation" in config.keys():
evaluation_config: AttributeDict = config.evaluation
evaluation_config["all_values"] = config
config = evaluation_config
else:
log_warn("there is no 'evaluation' in the yaml provided!")
if "data_dirs" in config.keys():
value_is_none = not config.data_dirs
value_has_wrong_type = not isinstance(config.data_dirs, (PathLike, str, list))
if value_is_none or value_has_wrong_type:
raise ValueError(f"Error: 'evaluation.data_dirs' was not provided correctly! check for typos in the yaml provided! value given: {config.data_dirs}")
# allow the user to write a single string instead of a list of strings
if not isinstance(config.output_types, list):
config["output_types"] = [config.output_types]
log_info("fixing value for key <config.output_types> to be a list[str]")
if not isinstance(config.data_dirs, list):
config["data_dirs"] = [Path(config.data_dirs)]
log_info("fixing value for key <config.data_dirs> to be a list[Path]")
# x_axis
if not isinstance(config.plot.x_axis, list):
config["plot"]["x_axis"] = [config.plot.x_axis]
log_info("fixing value for key <config.plot.x_axis> to be a list[str]")
if len(config.plot.x_axis) < len(config.data_dirs):
# use same x axis for all if only one given
missing_elements = len(config.data_dirs) - len(config.plot.x_axis)
config["plot"]["x_axis"] += repeat(config.plot.x_axis[0], missing_elements)
# y_axis
if not isinstance(config.plot.y_axis, list):
config["plot"]["y_axis"] = [config.plot.y_axis]
log_info("fixing value for key <config.plot.y_axis> to be a list[str]")
if len(config.plot.y_axis) < len(config.data_dirs):
# use same y axis for all if only one given
missing_elements = len(config.data_dirs) - len(config.plot.y_axis)
config["plot"]["y_axis"] += repeat(config.plot.y_axis[0], missing_elements)
return config
def main(config: AttributeDict):
config = clean_config(config) # sets config to config.evaluation, cleans some data
workloads: List[Path] = [Path(name) for name in config.data_dirs]
log_debug(f"{workloads=}")
set_plotstyle(config)
dfs = extract_dataframes(workloads, depth=config.depth, config=config)
fig, _ = create_figure(dfs, config)
output_file_path = get_output_file_path(dfs, config)
save_files(fig, dfs, output_file_path, config)