12 commits

Author         SHA1        Message                               Date
Shannon Sands  220b92be47  Linting and cleanup                   2025-05-10 21:15:00 +10:00
Shannon Sands  6617d402b3  Doing exact V* calc                   2025-05-10 20:24:31 +10:00
Shannon Sands  a049dde6b1  Adding thinking reward                2025-05-10 19:50:30 +10:00
Shannon Sands  840ff20921  Fixed typo, revising reward function  2025-05-10 19:45:06 +10:00
Shannon Sands  7fe1a40368  readd multistep masking               2025-05-10 09:24:55 +10:00
Shannon Sands  9efd8c1529  linting                               2025-05-10 08:44:35 +10:00
Shannon Sands  06c4a9e65c  linting                               2025-05-10 08:43:03 +10:00
Shannon Sands  0248cc1227  Removed old code, added comments      2025-05-10 08:39:52 +10:00
Shannon Sands  ba604d44f9  update local server                   2025-05-10 08:18:41 +10:00
Shannon Sands  c506bb147e  simplified config and reward          2025-05-10 08:04:39 +10:00
Shannon Sands  7e95c0b67d  moving test sever                     2025-05-10 07:47:44 +10:00
Shannon Sands  a7dfd377da  moving env to clean branch            2025-05-10 07:44:29 +10:00