Commit graph

27 commits

Author  SHA1        Message                  Date
Oliver  f211b8544f  cfg                      2025-04-29 19:15:33 +01:00
Oliver  374760577e  cfg                      2025-04-29 19:03:28 +01:00
Oliver  be28d6f5f0  cfg                      2025-04-29 09:03:19 +01:00
Oliver  fe479916bf  cfg                      2025-04-29 09:00:09 +01:00
Oliver  40e3293299  cfg                      2025-04-28 23:41:37 +01:00
Oliver  197a976dfd  cfg                      2025-04-28 23:33:21 +01:00
Oliver  fd12db4352  cfg                      2025-04-28 23:21:22 +01:00
Oliver  f404b2e603  cfg                      2025-04-28 23:17:41 +01:00
Oliver  82ac4b27ba  fix                      2025-04-28 23:12:22 +01:00
Oliver  2dff704f0b  cfg                      2025-04-28 22:48:45 +01:00
Oliver  fa52788e31  cfg                      2025-04-28 22:44:08 +01:00
Oliver  8dd7c86368  cfg                      2025-04-28 22:32:10 +01:00
Oliver  d4e19056ea  cfg                      2025-04-28 21:27:00 +01:00
Oliver  55c9810113  training updates         2025-04-28 20:56:23 +01:00
Oliver  e58f3c1a35  cfg                      2025-04-24 21:11:16 +01:00
Oliver  37b88d194b  add loss_agg_mode        2025-04-24 20:46:37 +01:00
Oliver  1ee3b0bbb8  add use kl param         2025-04-24 20:42:57 +01:00
Oliver  e39b6b5f27  cfg change               2025-04-24 20:16:52 +01:00
Oliver  68ef3fa249  rm original cfg          2025-04-24 19:38:24 +01:00
Oliver  830ac3e10a  impl conditional reward  2025-04-24 19:36:30 +01:00
Oliver  450d3dcfa4  cfg tweak                2025-04-24 17:29:51 +01:00
Oliver  897e618bfa  cfg                      2025-04-22 21:15:08 +01:00
Oliver  1343bcf63e  cfg                      2025-04-22 20:34:37 +01:00
Oliver  1ccd62bc1a  cfg                      2025-04-22 20:33:04 +01:00
Oliver  e372224ee1  cfgs                     2025-04-22 20:32:17 +01:00
Oliver  4aeffa8182  cfgs                     2025-04-22 20:30:13 +01:00
Oliver  4c2f83da5f  add math config          2025-04-22 20:20:29 +01:00