Convert FOB submodule to regular folder

parent 94f046ad40
commit 94825011a0

74 changed files with 4563 additions and 0 deletions
131  environments/optimizer/FOB/pytorch_fob/evaluation/README.md  Normal file
@@ -0,0 +1,131 @@
# Evaluation

During training you can monitor your experiments with [Tensorboard](https://www.tensorflow.org/tensorboard).
We also try to provide some useful functionality to quickly evaluate and compare the results of your experiments.

You can use ```evaluate_experiment.py``` to get a quick first impression of a finished experiment run.

## Plotting vs. raw data

You can use the plotting pipeline with your customized settings (as shown in the usage examples).
Alternatively, you can use the script to export your data to a .csv and process the data to your own needs.

In this scenario, set ```evaluation.output_types: [csv] # no plotting, just the data``` in your experiment yaml.
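In the experiment yaml itself, the dotted key path corresponds to a nested block. A minimal sketch of just this setting (all surrounding experiment keys omitted):

```yaml
evaluation:
  output_types: [csv]  # no plotting, just the data
```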
## Usage Examples

In the following you can find 4 example use cases for experiments and how to visualize the results as heatmaps:

1. testing an optimizer on a task
2. comparing two optimizers on the same task
3. comparing multiple optimizers on different tasks
4. comparing the influence of a single hyperparameter

Here we want to focus on the plotting. For instructions on how to run experiments, refer to the main [README](../../README.md). To get started right away, we provide the data for this example. If you want to reproduce it, refer to [this section](#reproducing-the-data).

### Plotting the experiment

By default, calling `run_experiment.py` will plot the experiment after training and testing. To disable this, set `engine.plot=false`.
To plot your experiment afterwards, call `evaluate_experiment.py` with the same experiment yaml. To adjust the plots, change the values under the `evaluation` key of the experiment. Take a look at [evaluation/default.yaml](default.yaml) to see which settings are available. Some of these keys are explained in the examples below to give the reader a first impression. Note that some default parameters are set in the respective tasks (e.g. in [tasks/mnist/default.yaml](../tasks/mnist/default.yaml)).
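For instance, an experiment yaml might override a few of the defaults with a block like the following sketch (the keys are taken from [evaluation/default.yaml](default.yaml); the experiment name is an arbitrary placeholder):

```yaml
evaluation:
  experiment_name: my_eval   # placeholder name
  checkpoints: [best, last]
  output_types: [pdf, png]
```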
### Example use cases

Here are some example scenarios to give you an understanding of how our plotting works. Run the commands from the root of the repository. Take a look at the yaml files used in the commands to see what is going on.

#### Example 1

This example is a good starting point; it shows the performance of a single default optimizer on one of the tasks.
Experiment file: [examples/plotting/1_mnist-adamw.yaml](../../examples/plotting/1_mnist-adamw.yaml)

```python -m pytorch_fob.evaluate_experiment examples/plotting/1_mnist-adamw.yaml```



This example uses only the final model performance and only creates the plot as a png.

Helpful settings (combined into a single snippet below):

- ```checkpoints: [last]``` # you could use [last, best] to additionally plot the model with the best validation score
- ```output_types: [png]``` # you could use [pdf, png] to also create a pdf
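Together, these settings might look like this in the experiment yaml (a minimal sketch of the `evaluation` block only):

```yaml
evaluation:
  checkpoints: [last]  # [last, best] would additionally plot the best-validation model
  output_types: [png]  # [pdf, png] would also create a pdf
```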
#### Example 2

You can compare two different optimizers.
Experiment file: [examples/plotting/2_adamw-vs-sgd.yaml](../../examples/plotting/2_adamw-vs-sgd.yaml)

```python -m pytorch_fob.evaluate_experiment examples/plotting/2_adamw-vs-sgd.yaml```



Helpful settings (see the sketch after this list):

- ```plot.x_axis: [optimizer.weight_decay, optimizer.kappa_init_param]``` # the values given here are used for the x-axis; the order in the list is used from left to right for the plot columns
- `column_split_key: optimizer.name` # this creates a column for each different optimizer (default behavior). You can set this to null to disable columns or choose a different key.
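As nested yaml, these settings might be sketched as:

```yaml
evaluation:
  column_split_key: optimizer.name  # one plot column per optimizer (the default)
  plot:
    x_axis: [optimizer.weight_decay, optimizer.kappa_init_param]
```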
#### Example 3

There are multiple tasks in the benchmark; this example shows how to get a quick overview of several of them at the same time.
Experiment file: [examples/plotting/3_mnist-and-tabular_adamw-vs-sgd.yaml](../../examples/plotting/3_mnist-and-tabular_adamw-vs-sgd.yaml)

```python -m pytorch_fob.evaluate_experiment examples/plotting/3_mnist-and-tabular_adamw-vs-sgd.yaml```



Helpful settings:

- ```split_groups: ["task.name"]```

Every non-unique value for each parameter name in `split_groups` creates its own subplot.
Instead of a list, you can set this to `false` to disable splitting, or `true` to split on every parameter that differs between runs (except those already in `column_split_key` or `aggregate_groups`).
An explicit list is useful if there are just a few parameters you want to split on; the three variants are sketched below.
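A sketch of the three variants:

```yaml
evaluation:
  split_groups: ["task.name"]  # one subplot per task
  # split_groups: true         # split on every parameter that differs between runs
  # split_groups: false        # no splitting
```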
#### Example 4

Any parameter that is neither on the x-axis nor the y-axis will either be aggregated over or split into subplots.
Any individual square of a heatmap shows the *mean* and *std* over multiple runs (as seen in the previous plots). Here we show how to choose the runs to aggregate.
Experiment file: [examples/plotting/4_adamw-vs-sgd_seeds.yaml](../../examples/plotting/4_adamw-vs-sgd_seeds.yaml)

```python -m pytorch_fob.evaluate_experiment examples/plotting/4_adamw-vs-sgd_seeds.yaml```



Helpful settings (combined in the sketch after this list):

- Control the std with
  - ```plot.std``` # toggle off with ```False```
  - ```plot.aggfunc: std``` # also try ```var```
- Control the rows with
  - ```split_groups: ["engine.seed"]```
  - ```aggregate_groups: []```

By default the plot will display the *mean* and *std* calculated over the seeds.
To get one row per seed instead, we need to remove the seed from the ```aggregate_groups``` list (by giving an empty list instead). This list is useful if there are additional parameters you want to aggregate over.
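Combined, the example's settings might be sketched as:

```yaml
evaluation:
  split_groups: ["engine.seed"]  # one subplot row per seed
  aggregate_groups: []           # the seed is no longer aggregated over
  plot:
    std: true     # set to false to hide the ± value
    aggfunc: std  # also try var
```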
-------------------------------------------------------------------------------

### Reproducing the Data

Let's create some data that we can plot; from the root directory call:

#### Data Download

First we make sure the data is downloaded beforehand:

```python -m pytorch_fob.dataset_setup examples/plotting/3_mnist-and-tabular_adamw-vs-sgd.yaml```

This will download the mnist data (required for examples 1-4) and the tabular data (required for example 3) into the [examples/data](../../examples/data) directory. The path can be changed in the corresponding yaml you want to use (e.g. [examples/plotting/1_mnist-adamw.yaml](../../examples/plotting/1_mnist-adamw.yaml) if you have already set up your benchmark).

Estimated disk usage for the data: ~65 MB

#### Training

The 2 tasks are each run with both optimizers on a 2x2 hyperparameter grid and 2 different seeds, for a total of 2 tasks x 2 optimizers x 4 hyperparameter combinations x 2 seeds = 32 runs.

```python -m pytorch_fob.run_experiment examples/plotting/3_mnist-and-tabular_adamw-vs-sgd.yaml```

After training has finished, you should find 32 run directories in [examples/plotting/outputs](../../examples/plotting/outputs).

All parameters that differ from their default value are noted in the directory name.
5  environments/optimizer/FOB/pytorch_fob/evaluation/__init__.py  Normal file

@@ -0,0 +1,5 @@
from pathlib import Path


def evaluation_path() -> Path:
    # the directory containing this file; used to locate e.g. labels.yaml
    return Path(__file__).resolve().parent
54  environments/optimizer/FOB/pytorch_fob/evaluation/default.yaml  Normal file

@@ -0,0 +1,54 @@
evaluation:
  data_dirs: null # list of paths
  output_dir: null # output filename is output_dir / experiment_name
  experiment_name: null
  split_groups: false # {True, False, [param.a, param.b, ...]} create additional plots where the data is grouped by the given parameters; True to detect all params with multiple unique values
  aggregate_groups: # groups over which to aggregate values and compute mean/std. Default: [engine.seed]
    - engine.seed
  depth: 1 # the depth of the trial dirs relative to the given data_dirs
  checkpoints: [best, last] # which model checkpoints to use
  output_types: [pdf, png, csv] # choose all you want from {csv, pdf, png} and put them in brackets
  verbose: False # debug prints
  column_split_key: optimizer.name # if set, will split the dataframe and plot it in columns. Default: optimizer.name
  column_split_order: null # sets the order in which the columns are plotted

  # keeping the values at null -> automatically figure it out if possible, or let matplotlib decide
  plot:
    x_axis: # indices on the x-axis (same order as the order of subfigures given in data_dirs)
      - optimizer.weight_decay
    y_axis: # indices on the y-axis (same order as the order of subfigures given in data_dirs)
      - optimizer.learning_rate
    metric: null # automatically chosen from the task name; setting this will overwrite it
    limits: null # sets the limits for the colormap, 2 ints, order does not matter, leave empty for automatic
    std: True # show std over aggregated values
    aggfunc: std # which function to use to aggregate over the seeds, for example {std, var, sem}; only used when 'std' is set to true
    # format:
    #   string, how many digits to display, expects two values separated by a dot (e.g. "2.3")
    #   to turn accuracy into percent, use a '2' in front of the dot
    #   to display 3 digits after the decimal point, write a '3' behind the dot
    format: null # for example {"2.0", "2.1", "2.3", "0.2", ...}
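    # e.g. "2.1" turns a raw accuracy of 0.8312 into 83.1 (scaled by 10**2, rounded to 1 decimal)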
  single_file: true # if true, save all heatmaps in one file; 'split_groups' are represented as rows

  plotstyle:
    tight_layout: True
    text:
      usetex: True # you can use latex code in the yaml (e.g. $\sqrt{\pi \cdot \sigma}$), but some clusters don't have latex installed

    # general font
    font:
      family: "serif" # matplotlib {serif, sans-serif, cursive, fantasy, monospace}
      size: 14

    # the font in the tiles of the matrix
    matrix_font:
      size: 12

    scale: 1.0 # scales the *figsize* argument by this value, useful for ".png"
    color_palette: "rocket"
    dpi: 300

  # the names of the files storing the hyperparameters of the experiments and the scores
  experiment_files:
    best_model: results_best_model.json
    last_model: results_final_model.json
    config: config.yaml
30  environments/optimizer/FOB/pytorch_fob/evaluation/labels.yaml  Normal file

@@ -0,0 +1,30 @@
# pretty names for the plot
names:
  # optimizer
  adamw_baseline: AdamW
  sgd_baseline: SGD
  adamcpr: AdamCPR
  adamcpr_fast: AdamCPR
  sgd_stepwise: SGD (stepwise)
  # metric
  test_acc: Test Accuracy
  test_loss: Test Loss
  test_mIoU: Test mean Intersection over Union
  test_mAcc: Test mean Accuracy
  test_rmse: Test Root Mean Square Error (RMSE)
  test_rocauc: Test ROC-AUC
  # parameter
  learning_rate: Learning Rate
  weight_decay: Weight Decay
  kappa_init_param: Kappa Init Param
  # tasks
  classification: classification
  classification_small: classification_small
  detection: detection
  graph: graph
  graph_tiny: graph_tiny
  mnist: mnist
  segmentation: segmentation
  tabular: tabular
  template: template
  translation: translation
564  environments/optimizer/FOB/pytorch_fob/evaluation/plot.py  Normal file
@@ -0,0 +1,564 @@
import json
from pathlib import Path
from os import PathLike
from typing import List, Literal
from itertools import repeat
import matplotlib.pyplot as plt
from matplotlib.figure import Figure
import seaborn as sns
import pandas as pd
from pytorch_fob.engine.parser import YAMLParser
from pytorch_fob.engine.utils import AttributeDict, convert_type_inside_dict, log_warn, log_info, log_debug
from pytorch_fob.evaluation import evaluation_path


def get_available_trials(dirname: Path, config: AttributeDict, depth: int = 1):
    """finds the paths of all trials in the *dirname* directory"""
    # RECURSIVELY FIND ALL DIRS IN DIRNAME (up to depth)
    assert isinstance(dirname, Path)
    subdirs: list[Path] = [dirname]
    all_results_must_be_same_depth = True
    for _ in range(depth):
        if all_results_must_be_same_depth:
            new_subdirs: list[Path] = []
            for subdir in subdirs:
                new_subdirs += [x for x in subdir.iterdir() if x.is_dir()]
            subdirs = new_subdirs
        else:
            # note: this branch is currently unreachable (the flag above is always True)
            for subdir in subdirs:
                subdirs += [x for x in subdir.iterdir() if x.is_dir()]
    format_str = "\n "  # f-string expression part cannot include a backslash
    log_debug(f"found the following directories:{format_str}{format_str.join(str(i) for i in subdirs)}.")

    def is_trial(path: Path):
        # here we could do additional checks to filter the subdirectories
        # currently we only check if there is a config file
        for x in path.iterdir():
            found_a_config_file = x.name == config.experiment_files.config
            if found_a_config_file:
                return True
        return False

    subdirs = list(filter(is_trial, subdirs[::-1]))
    log_debug(f"We assume the following to be trials:{format_str}{format_str.join(str(i) for i in subdirs)}.")
    return subdirs


def dataframe_from_trials(trial_dir_paths: List[Path], config: AttributeDict) -> pd.DataFrame:
    """takes the result of get_available_trials and packs it into a dataframe,
    does not filter duplicate hyperparameter settings."""
    dfs: List[pd.DataFrame] = []

    for path in trial_dir_paths:

        config_file = path / config.experiment_files.config
        if config.last_instead_of_best:
            result_file = path / config.experiment_files.last_model
        else:
            result_file = path / config.experiment_files.best_model
        all_files_exist = all([
            config_file.is_file(),
            result_file.is_file()
        ])
        if not all_files_exist:
            log_warn(f"WARNING: one or more files are missing in {path}. Skipping this hyperparameter setting." +
                     f" <{config_file}>: {config_file.is_file()} and\n <{result_file}>: {result_file.is_file()})")
            continue

        yaml_parser = YAMLParser()
        yaml_content = yaml_parser.parse_yaml(config_file)
        # convert the sub dicts first, then the dict itself
        yaml_content = convert_type_inside_dict(yaml_content, src=dict, tgt=AttributeDict)
        yaml_content = AttributeDict(yaml_content)

        # use the user given value
        metric_of_value_to_plot = config.plot.metric

        # fail early if the user has not given a value (it should have been derived from the task by now)
        if not metric_of_value_to_plot:
            raise ValueError("evaluation.plot.metric is not set")

        data = pd.json_normalize(yaml_content)

        with open(result_file, "r", encoding="utf8") as f:
            content = json.load(f)
        if metric_of_value_to_plot in content[0]:
            data.at[0, metric_of_value_to_plot] = content[0][metric_of_value_to_plot]
        else:
            log_warn(f"could not find a value for {metric_of_value_to_plot} in the json")

        dfs.append(data)

    if len(dfs) == 0:
        raise ValueError("no dataframes found, check your config")
    df = pd.concat(dfs, sort=False)

    return df


def create_matrix_plot(dataframe: pd.DataFrame, config: AttributeDict, cols: str, idx: str, ax=None,
                       cbar: bool = True, vmin: None | int = None, vmax: None | int = None):
    """
    Creates one heatmap and puts it into the grid of subplots.
    Uses pd.pivot_table() and sns.heatmap().
    """
    df_entry = dataframe.iloc[0]
    metric_name = df_entry["evaluation.plot.metric"]

    # CLEANING LAZY USER INPUT
    # cols is the x-axis, idx is the y-axis
    if cols not in dataframe.columns:
        log_warn("x-axis value not present in the dataframe; did you forget to add an 'optimizer.' prefix?\n" +
                 f" using '{'optimizer.' + cols}' as 'x-axis' instead.")
        cols = "optimizer." + cols
    if idx not in dataframe.columns:
        log_warn("y-axis value not present in the dataframe; did you forget to add an 'optimizer.' prefix?\n" +
                 f" using '{'optimizer.' + idx}' as 'y-axis' instead.")
        idx = "optimizer." + idx
    # create pivot table and format the score result
    pivot_table = pd.pivot_table(dataframe,
                                 columns=cols, index=idx, values=metric_name,
                                 aggfunc='mean')

    fmt = None
    format_string = dataframe["evaluation.plot.format"].iloc[0]

    # scaling the values given by the user to fit the format needs (-> and adapting the limits)
    value_exp_factor, decimal_points = format_string.split(".")
    value_exp_factor = int(value_exp_factor)
    decimal_points = int(decimal_points)
    if vmin:
        vmin *= (10 ** value_exp_factor)
    if vmax:
        vmax *= (10 ** value_exp_factor)
    pivot_table = (pivot_table * (10 ** value_exp_factor)).round(decimal_points)
    fmt = f".{decimal_points}f"

    # up to here limits was the min and max over all dataframes,
    # usually we want to use user values
    if "evaluation.plot.limits" in dataframe.columns:
        limits = dataframe["evaluation.plot.limits"].iloc[0]
        if limits:
            vmin = min(limits)
            vmax = max(limits)
            log_debug(f"setting cbar limits to {vmin}, {vmax}")

    colormap_name = config.plotstyle.color_palette
    low_is_better = dataframe["evaluation.plot.test_metric_mode"].iloc[0] == "min"
    if low_is_better:
        colormap_name += "_r"  # this will invert / flip the colorbar
    colormap = sns.color_palette(colormap_name, as_cmap=True)
    metric_legend = pretty_name(metric_name)

    # FINETUNE POSITION
    #                        left  bottom width height
    # cbar_ax = fig.add_axes([0.92, 0.235, 0.02, 0.6])
    cbar_ax = None

    if not config.plot.std:
        return sns.heatmap(pivot_table, ax=ax, cbar_ax=cbar_ax,
                           annot=True, fmt=fmt,
                           annot_kws={'fontsize': config.plotstyle.matrix_font.size},
                           cbar=cbar, vmin=vmin, vmax=vmax, cmap=colormap, cbar_kws={'label': f"{metric_legend}"})
    else:
        # BUILD STD TABLE
        pivot_table_std = pd.pivot_table(dataframe,
                                         columns=cols, index=idx, values=metric_name,
                                         aggfunc=config.plot.aggfunc, fill_value=float("inf"), dropna=False
                                         )
        if float("inf") in pivot_table_std.values.flatten():
            log_warn("WARNING: Not enough data to calculate the std, skipping std in plot")

        pivot_table_std = (pivot_table_std * (10 ** value_exp_factor)).round(decimal_points)

        # build an annotation matrix of "mean\n±(std)" strings
        annot_matrix = pivot_table.copy().astype("string")
        for i in pivot_table.index:
            for j in pivot_table.columns:
                mean = pivot_table.loc[i, j]
                std = pivot_table_std.loc[i, j]
                std_string = f"\n±({round(std, decimal_points)})" if std != float("inf") else ""  # type: ignore
                annot_matrix.loc[i, j] = f"{round(mean, decimal_points)}{std_string}"  # type: ignore

        fmt = ""  # cannot format like before, as we no longer have just a number

        return sns.heatmap(pivot_table, ax=ax, cbar_ax=cbar_ax,
                           annot=annot_matrix, fmt=fmt,
                           annot_kws={'fontsize': config.plotstyle.matrix_font.size},
                           cbar=cbar, vmin=vmin, vmax=vmax, cmap=colormap, cbar_kws={'label': f"{metric_legend}"})


def get_all_num_rows_and_their_names(dataframe_list: list[pd.DataFrame], config):
    n_rows: list[int] = []
    row_names: list[list[str]] = []
    for i, df in enumerate(dataframe_list):
        x_axis = config.plot.x_axis[i]
        y_axis = config.plot.y_axis[0]
        metrics = df["evaluation.plot.metric"].unique()
        ignored_cols = [x_axis, y_axis]
        ignored_cols += list(metrics)
        ignored_cols += config.get("ignore_keys", [])
        ignored_cols += config.get("aggregate_groups", [])
        current_n_rows, current_names = get_num_rows(df, ignored_cols, config)
        n_rows.append(current_n_rows)
        if not current_names:  # will be empty if we have only one row
            current_names.append("default")
        row_names.append(current_names)

    return n_rows, row_names


def get_num_rows(dataframe: pd.DataFrame, ignored_cols: list[str], config: AttributeDict
                 ) -> tuple[int, list[str]]:
    """each matrix has 2 params (one each for x and y), one value, and we aggregate over seeds;
    if there are more than these 4 parameters with different values,
    we want to put that in separate rows instead of aggregating over them.
    returns: the number of rows (at least 1) and the names of the cols"""
    necesarry_rows = 0

    # the user might specify a value for the groups that we should split on in <split_groups>
    whitelisted_cols: list[str] | Literal["all"] = "all"  # everything is whitelisted if this value stays 'all'
    if isinstance(config.split_groups, list):
        whitelisted_cols = config.split_groups[:]
    elif config.split_groups is False:
        whitelisted_cols = []

    columns_with_non_unique_values = []
    for col in dataframe.columns:
        is_eval_key = col.startswith("evaluation.")
        is_ignored = col in ignored_cols
        is_whitelisted = whitelisted_cols == "all" or col in whitelisted_cols
        if any([is_ignored, is_eval_key, not is_whitelisted]):
            if is_whitelisted:
                log_warn(f"{col} is in the whitelist, but will be ignored. Probably {col} is in both 'split_groups' and 'aggregate_groups'.")
            log_debug(f"ignoring {col}")
            continue
        nunique = dataframe[col].nunique(dropna=False)
        if nunique > 1:
            log_debug(f"adding {col} since there are {nunique} unique values")
            for unique_hp in dataframe[col].unique():
                columns_with_non_unique_values.append(f"{col}={unique_hp}")
            necesarry_rows += nunique  # each unique parameter value should be an individual plot

    rows_number = max(necesarry_rows, 1)
    col_names = columns_with_non_unique_values
    log_debug(f"{rows_number=}")
    log_debug(f"{col_names=}")

    return rows_number, col_names


def find_global_vmin_vmax(dataframe_list, config):
    vmin: int | float | None = None
    vmax: int | float | None = None
    num_cols = len(dataframe_list)

    if num_cols > 1:
        # all subplots should have the same colors -> we need to find the limits
        vmin = float('inf')
        vmax = float('-inf')

        for i in range(num_cols):
            dataframe = dataframe_list[i]
            cols = config.plot.x_axis[i]
            idx = config.plot.y_axis[0]
            key = config.plot.metric

            pivot_table = pd.pivot_table(dataframe,
                                         columns=cols, index=idx,
                                         values=key,
                                         aggfunc='mean')

            min_value_present_in_current_df = pivot_table.min().min()
            max_value_present_in_current_df = pivot_table.max().max()

            log_debug("colorbar_limits:\n" +
                      f" subfigure number {i+1}, checking for metric {key}: \n" +
                      f" min value is {min_value_present_in_current_df},\n" +
                      f" max value is {max_value_present_in_current_df}")
            vmin = min(vmin, min_value_present_in_current_df)
            vmax = max(vmax, max_value_present_in_current_df)

    return vmin, vmax


def create_figure(dataframe_list: list[pd.DataFrame], config: AttributeDict):
    """
    Takes a list of dataframes. Each dataframe is processed into a column of heatmaps.
    """
    num_cols: int = len(dataframe_list)

    # calculate the number of rows for each dataframe
    n_rows, row_names = get_all_num_rows_and_their_names(dataframe_list, config)

    # Handling of the number of rows in the plot:
    # we could either create a full rectangular grid, or allow each subplot to nest subplots
    # (for nesting we would need to create subfigures instead of subplots, I think)
    if config.split_groups is False:
        n_rows_max = 1
        row_names = [["default"] for _ in range(num_cols)]
    else:
        n_rows_max = max(n_rows)

    log_debug(f"{n_rows=} and {num_cols=}")

    # TODO: figsize was just hardcoded for the (1, 2) grid and left at the default for the (1, 1) grid;
    # probably not worth the hassle to create something dynamic (at least not now)
    # EDIT: it was slightly adapted to allow multiple rows without being completely unreadable
    # margin = (num_subfigures - 1) * 0.3
    # figsize = (5*n_cols + margin, 2.5)
    scale = config.plotstyle.scale
    if num_cols == 1 and n_rows_max > 1:
        figsize = (2**3 * scale, 2 * 3 * n_rows_max * scale)
    elif num_cols == 2:
        # TODO: after removing the cbar from the left subfigure, it is squished;
        # there is an argument to share the legend, we should use that
        figsize = (12 * scale, 5.4 * n_rows_max * scale)
    elif num_cols > 2:
        figsize = (12 * (num_cols / 2) * scale, 5.4 * n_rows_max * scale)
    else:
        figsize = None

    # TODO: use seaborn FacetGrid
    fig, axs = plt.subplots(n_rows_max, num_cols, figsize=figsize)
    if n_rows_max == 1:
        axs = [axs]
    if num_cols == 1:
        axs = [[ax] for ax in axs]  # adapt for the special case so we have unified types

    # Adjust left and right margins as needed
    # fig.subplots_adjust(left=0.1, right=0.9, top=0.97, hspace=0.38, bottom=0.05, wspace=0.3)

    # None -> plt will choose vmin and vmax
    vmin, vmax = find_global_vmin_vmax(dataframe_list, config)

    for i in range(num_cols):
        num_nested_subfigures: int = n_rows[i]

        if not config.split_groups:
            create_one_grid_element(dataframe_list, config, axs, i,
                                    j=0,
                                    max_i=num_cols,
                                    max_j=0,
                                    vmin=vmin,
                                    vmax=vmax,
                                    n_rows=n_rows,
                                    row_names=row_names)
        else:
            for j in range(num_nested_subfigures):
                create_one_grid_element(dataframe_list, config, axs, i,
                                        j,
                                        max_i=num_cols,
                                        max_j=num_nested_subfigures,
                                        vmin=vmin,
                                        vmax=vmax,
                                        n_rows=n_rows,
                                        row_names=row_names)

    if config.plotstyle.tight_layout:
        fig.tight_layout()
    # SUPTITLE (the super title on top of the whole figure in the middle)
    # # TODO: the super title might be squished when used together with tight layout (removing for now)
    # if n_rows_max > 1 or num_cols > 1:
    #     # set experiment name as title when there are multiple matrices in the image
    #     if config.experiment_name:
    #         fig.suptitle(config.experiment_name)
    return fig, axs


def create_one_grid_element(dataframe_list: list[pd.DataFrame], config: AttributeDict, axs,
                            i: int, j: int, max_i: int, max_j: int, vmin, vmax, n_rows, row_names):
    """does one 'axs' element as it is called in plt"""
    name_for_additional_subplots: list[str] = row_names[i]
    num_subfigures = max_i  # from left to right
    num_nested_subfigures = max_j  # from top to bottom
    dataframe = dataframe_list[i]

    cols = config.plot.x_axis[i]
    idx = config.plot.y_axis[0]
    # only include the colorbar once
    include_cbar: bool = i == num_subfigures - 1

    model_param = name_for_additional_subplots[j]
    if model_param == "default":
        current_dataframe = dataframe  # we do not need to do further grouping
    else:
        param_name, param_value = model_param.split("=", maxsplit=1)
        if pd.api.types.is_numeric_dtype(dataframe[param_name]):
            param_value = float(param_value)
        try:
            current_dataframe = dataframe.groupby([param_name]).get_group((param_value,))
        except KeyError:
            log_warn(f"WARNING: was not able to groupby '{param_name}', " +
                     "maybe the data was created with different versions of fob; skipping this row")
            log_debug(f"{param_name=}{param_value=}{dataframe.columns=}{dataframe[param_name]=}")
            return False
    current_plot = create_matrix_plot(current_dataframe, config,
                                      cols, idx,
                                      ax=axs[j][i],
                                      cbar=include_cbar, vmin=vmin, vmax=vmax)

    # LABELS
    # pretty name for the label: "learning_rate" => "Learning Rate"
    # remove the x_label of all but the last row, remove the y_label for all but the first column
    if i > 0:
        current_plot.set_ylabel('', labelpad=8)
    else:
        current_plot.set_ylabel(pretty_name(current_plot.get_ylabel()))
    if j < num_nested_subfigures - 1:
        current_plot.set_xlabel('', labelpad=8)
    else:
        current_plot.set_xlabel(pretty_name(current_plot.get_xlabel()))

    # reading optimizer and task name after grouping
    df_entry = current_dataframe.iloc[0]  # just get an arbitrary trial
    opti_name = df_entry['optimizer.name']
    task_name = df_entry['task.name']

    # TITLE
    # title (heading) of the heatmap: <optimname> on <taskname> (+ additional info)
    title = f"{pretty_name(opti_name)} on {pretty_name(task_name)}"
    if max_i > 1 or max_j > 1:
        title += "" if model_param == "default" else f"\n{model_param}"
    current_plot.set_title(title)


def extract_dataframes(workload_paths: List[Path], config: AttributeDict, depth: int = 1
                       ) -> list[pd.DataFrame]:
    df_list: list[pd.DataFrame] = []
    num_dataframes: int = len(workload_paths)

    for i in range(num_dataframes):
        available_trials = get_available_trials(workload_paths[i], config, depth)
        dataframe = dataframe_from_trials(available_trials, config)
        df_list.append(dataframe)

    return df_list


def get_output_file_path(dataframe_list: list[pd.DataFrame], config: AttributeDict, suffix: str = "") -> Path:
    task_names = [df.iloc[0]["task.name"] for df in dataframe_list]
    optim_names = [df.iloc[0]["optimizer.name"] for df in dataframe_list]
    task_name = "_".join(sorted(set(task_names)))
    optim_name = "_".join(sorted(set(optim_names)))

    here = Path(__file__).parent.resolve()

    output_dir = Path(config.output_dir) if config.output_dir else here
    experiment_name = Path(config.experiment_name) if config.experiment_name else f"{optim_name}-{task_name}"
    output_file_path = output_dir / experiment_name

    return Path(f"{output_file_path}-{suffix}" if suffix else output_file_path)


def set_plotstyle(config: AttributeDict):
    plt.rcParams["text.usetex"] = config.plotstyle.text.usetex
    plt.rcParams["font.family"] = config.plotstyle.font.family
    plt.rcParams["font.size"] = config.plotstyle.font.size


def pretty_name(name: str, pretty_names: dict | str = {}) -> str:  # type: ignore pylint: disable=dangerous-default-value
    """
    Tries to use a mapping for the name, else will do some general replacement.
    The mapping can be a dictionary or the filename of a yaml file with a 'names' key.
    """

    # reading the dictionary from the yaml file
    label_file: Path = evaluation_path() / "labels.yaml"
    if isinstance(pretty_names, str):
        label_file = Path(pretty_names)

    if pretty_names == {} or isinstance(pretty_names, str):
        yaml_parser = YAMLParser()
        yaml_content = yaml_parser.parse_yaml(label_file)
        pretty_names: dict[str, str] = yaml_content["names"]

    # applying pretty names
    name_without_yaml_prefix = name.split(".")[-1]
    if name in pretty_names.keys():
        name = pretty_names[name]
    elif name_without_yaml_prefix in pretty_names.keys():
        name = pretty_names[name_without_yaml_prefix]
    else:
        name = name.replace('_', ' ').title()
    return name


def save_csv(dfs: list[pd.DataFrame], output_filename: Path):
    for i, df in enumerate(dfs):
        csv_output_filename = f"{output_filename.resolve()}-{i}.csv"
        log_info(f"saving raw data as {csv_output_filename}")
        df.to_csv(path_or_buf=csv_output_filename, index=False)


def save_plot(fig: Figure, output_file_path: Path, file_type: str, dpi: int):
    plot_output_filename = f"{output_file_path.resolve()}.{file_type}"
    log_info(f"saving figure as <{plot_output_filename}>")
    fig.savefig(plot_output_filename, dpi=dpi)


def save_files(fig, dfs: list[pd.DataFrame], output_file_path: Path, config: AttributeDict):
    output_file_path.parent.mkdir(parents=True, exist_ok=True)

    for file_type in config.output_types:
        if file_type == "csv":
            save_csv(dfs, output_file_path)
        elif file_type in ("png", "pdf"):
            save_plot(fig, output_file_path, file_type, config.plotstyle.dpi)


def clean_config(config: AttributeDict) -> AttributeDict:
    """some processing that allows the user to be lazy: shortcut for the namespace, missing values are caught, and the full config is kept in config.all_values"""
    if "evaluation" in config.keys():
        evaluation_config: AttributeDict = config.evaluation
        evaluation_config["all_values"] = config
        config = evaluation_config
    else:
        log_warn("there is no 'evaluation' key in the yaml provided!")
    if "data_dirs" in config.keys():
        value_is_none = not config.data_dirs
        value_has_wrong_type = not isinstance(config.data_dirs, (PathLike, str, list))
        if value_is_none or value_has_wrong_type:
            raise ValueError(f"Error: 'evaluation.data_dirs' was not provided correctly! Check for typos in the yaml provided! Value given: {config.data_dirs}")

    # allow the user to write a single string instead of a list of strings
    if not isinstance(config.output_types, list):
        config["output_types"] = [config.output_types]
        log_info("fixing value for key <config.output_types> to be a list[str]")

    if not isinstance(config.data_dirs, list):
        config["data_dirs"] = [Path(config.data_dirs)]
        log_info("fixing value for key <config.data_dirs> to be a list[Path]")

    # x_axis
    if not isinstance(config.plot.x_axis, list):
        config["plot"]["x_axis"] = [config.plot.x_axis]
        log_info("fixing value for key <config.plot.x_axis> to be a list[str]")
    if len(config.plot.x_axis) < len(config.data_dirs):
        # use the same x-axis for all if only one is given
        missing_elements = len(config.data_dirs) - len(config.plot.x_axis)
        config["plot"]["x_axis"] += repeat(config.plot.x_axis[0], missing_elements)

    # y_axis
    if not isinstance(config.plot.y_axis, list):
        config["plot"]["y_axis"] = [config.plot.y_axis]
        log_info("fixing value for key <config.plot.y_axis> to be a list[str]")
    if len(config.plot.y_axis) < len(config.data_dirs):
        # use the same y-axis for all if only one is given
        missing_elements = len(config.data_dirs) - len(config.plot.y_axis)
        config["plot"]["y_axis"] += repeat(config.plot.y_axis[0], missing_elements)

    return config


def main(config: AttributeDict):
    config = clean_config(config)  # sets config to config.evaluation, cleans some data
    workloads: List[Path] = [Path(name) for name in config.data_dirs]
    log_debug(f"{workloads=}")

    set_plotstyle(config)

    dfs = extract_dataframes(workloads, depth=config.depth, config=config)
    fig, _ = create_figure(dfs, config)

    output_file_path = get_output_file_path(dfs, config)

    save_files(fig, dfs, output_file_path, config)