Commit graph

15 commits

Author  SHA1        Message                  Date
Oliver  d4e19056ea  cfg                      2025-04-28 21:27:00 +01:00
Oliver  55c9810113  training updates         2025-04-28 20:56:23 +01:00
Oliver  e58f3c1a35  cfg                      2025-04-24 21:11:16 +01:00
Oliver  37b88d194b  add loss_agg_mode        2025-04-24 20:46:37 +01:00
Oliver  1ee3b0bbb8  add use kl param         2025-04-24 20:42:57 +01:00
Oliver  e39b6b5f27  cfg change               2025-04-24 20:16:52 +01:00
Oliver  68ef3fa249  rm original cfg          2025-04-24 19:38:24 +01:00
Oliver  830ac3e10a  impl conditional reward  2025-04-24 19:36:30 +01:00
Oliver  450d3dcfa4  cfg tweak                2025-04-24 17:29:51 +01:00
Oliver  897e618bfa  cfg                      2025-04-22 21:15:08 +01:00
Oliver  1343bcf63e  cfg                      2025-04-22 20:34:37 +01:00
Oliver  1ccd62bc1a  cfg                      2025-04-22 20:33:04 +01:00
Oliver  e372224ee1  cfgs                     2025-04-22 20:32:17 +01:00
Oliver  4aeffa8182  cfgs                     2025-04-22 20:30:13 +01:00
Oliver  4c2f83da5f  add math config          2025-04-22 20:20:29 +01:00