atropos/environments/hack0/conversational_style_dpo
2025-05-18 17:50:15 -07:00
File                                    Last commit                Updated
__init__.py                             dev - push for submission  2025-05-18 17:50:15 -07:00
conversational_style_dpo_env.py         dev - push for submission  2025-05-18 17:50:15 -07:00
gsm8k_dpo_rollouts_1.html               dev - push for submission  2025-05-18 17:50:15 -07:00
gsm8k_dpo_rollouts_2.html               dev - push for submission  2025-05-18 17:50:15 -07:00
gsm8k_dpo_rollouts_4.html               dev - push for submission  2025-05-18 17:50:15 -07:00
gsm8k_dpo_rollouts_6.html               dev - push for submission  2025-05-18 17:50:15 -07:00
gsm8k_dpo_rollouts_8.html               dev - push for submission  2025-05-18 17:50:15 -07:00
gsm8k_dpo_rollouts_9.html               dev - push for submission  2025-05-18 17:50:15 -07:00
gsm8k_dpo_rollouts_15.html              dev - push for submission  2025-05-18 17:50:15 -07:00
gsmk8k_conversational_style_dpo_env.py  dev - push for submission  2025-05-18 17:50:15 -07:00
requirements.txt                        dev - push for submission  2025-05-18 17:50:15 -07:00
train_dpo_conversational.py             dev - push for submission  2025-05-18 17:50:15 -07:00