Commit graph

115 commits

Author SHA1 Message Date
Shannon Sands
d6f9d58606 new env runs locally 2025-05-14 11:57:45 -07:00
Shannon Sands
54ae40840d no-thinking env added 2025-05-14 11:28:39 -07:00
Shannon Sands
21cc528b85 move best-of-n selection to util 2025-05-14 10:35:12 -07:00
Shannon Sands
4c00e2b209 move message history out to utils 2025-05-14 10:13:56 -07:00
Shannon Sands
8cd9e4d776 made private collect_trajectory re changes 2025-05-13 07:58:48 +10:00
Shannon Sands
36f6822d71 Merge branch 'main' into blackjack2-env 2025-05-13 07:54:04 +10:00
Shannon Sands
d980acfaf9 linting 2025-05-13 07:52:03 +10:00
Shannon Sands
e480c30b8b removed new fn 2025-05-13 07:49:28 +10:00
Teknium
1d78951d63
Update README.md
More updates for clarity
2025-05-12 11:31:42 -07:00
Teknium
2e8f0f2636
Update README.md
Add diagram
2025-05-12 11:14:44 -07:00
dmahan93
004dbc8565
Merge pull request #40 from NousResearch/remove-torch-dependency-in-lib
Remove dependency on torch for default installation
2025-05-12 10:48:18 -05:00
dmahan93
727c7ba640 Remove dependency on torch for default installation 2025-05-12 10:17:41 -05:00
dmahan93
706097db21
Merge pull request #36 from NousResearch/add-gym-frozen-lake-example
add gym taxi env
2025-05-12 08:49:11 -05:00
Shannon Sands
101cbdd803 Merge branch 'main' into blackjack2-env 2025-05-12 07:22:24 +10:00
hjc-puro
22673018fb
Merge pull request #13 from NousResearch/2025-05-03-http-error-logging
Improve error logging for HTTP requests
2025-05-11 02:28:43 +08:00
Shannon Sands
f69b16357b removed unused fn 2025-05-10 21:29:08 +10:00
Shannon Sands
220b92be47 Linting and cleanup 2025-05-10 21:15:00 +10:00
Shannon Sands
6617d402b3 Doing exact V* calc 2025-05-10 20:24:31 +10:00
Shannon Sands
a049dde6b1 Adding thinking reward 2025-05-10 19:50:30 +10:00
Shannon Sands
840ff20921 Fixed typo, revising reward function 2025-05-10 19:45:06 +10:00
hjc-puro
e68df555ba use parse_http_rseponse 2025-05-10 05:12:08 -04:00
hjc-puro
a659217afe Merge branch 'main' into 2025-05-03-http-error-logging 2025-05-10 17:09:22 +08:00
dmahan93
1fe7deae47 Merge commit 'b386960d78' into add-gym-frozen-lake-example 2025-05-09 19:23:16 -05:00
dmahan93
b386960d78
Merge pull request #37 from NousResearch/fix-util-pre-commit
fix pre-commit
2025-05-09 19:20:10 -05:00
dmahan93
37f040a883 fix pre-commit 2025-05-09 19:14:45 -05:00
dmahan93
92428fec8f add gym taxi env 2025-05-09 19:05:01 -05:00
Shannon Sands
7fe1a40368 readd multistep masking 2025-05-10 09:24:55 +10:00
Shannon Sands
4d0f919fd1 linting 2025-05-10 09:10:31 +10:00
Shannon Sands
6c6a1c5d06 update handle_send_to_api 2025-05-10 09:07:54 +10:00
Shannon Sands
9efd8c1529 linting 2025-05-10 08:44:35 +10:00
Shannon Sands
06c4a9e65c linting 2025-05-10 08:43:03 +10:00
Shannon Sands
0248cc1227 Removed old code, added comments 2025-05-10 08:39:52 +10:00
Shannon Sands
ba604d44f9 update local server 2025-05-10 08:18:41 +10:00
Shannon Sands
c506bb147e simplified config and reward 2025-05-10 08:04:39 +10:00
Shannon Sands
7e95c0b67d moving test sever 2025-05-10 07:47:44 +10:00
Shannon Sands
a7dfd377da moving env to clean branch 2025-05-10 07:44:29 +10:00
Shannon Sands
4f6a0014bc precommit 2025-05-10 07:30:57 +10:00
dmahan93
c1ba77ec26
Merge pull request #7 from misrasaurabh1/codeflash/optimize-grab_exact_from_heterogeneous_queue-ma3pegzo
Speed up function `grab_exact_from_heterogeneous_queue` by 1,680%
2025-05-09 12:18:56 -05:00
dmahan93
f5ea281245
Merge pull request #33 from NousResearch/fix-quick-start-guide
add pre-commit workflow and readme.md changes to point to debugging tools
2025-05-09 10:17:31 -05:00
dmahan93
baf2732f87 add pre-commit workflow and readme.md changes to point to debugging tools first 2025-05-09 10:15:59 -05:00
dmahan93
d54376a617
Merge pull request #32 from NousResearch/pre-commit-run
run pre-commit on all files
2025-05-09 09:54:49 -05:00
dmahan93
40b12dae60 run pre-commit on all files 2025-05-09 09:54:20 -05:00
dmahan93
b959c30ebf
Merge pull request #31 from NousResearch/fix-math-evals-due-to-updated-dataset
fix olympiadbench due to upstream changes
2025-05-09 09:42:06 -05:00
dmahan93
e09ae8d3d3 fix olympiadbench due to upstream changes 2025-05-09 09:41:10 -05:00
hjc-puro
f303853e36 Update README.md 2025-05-09 02:41:17 -04:00
hjc-puro
629d8c1731 Merge pull request #14 from NousResearch/2025-05-02-server-cli 2025-05-09 13:37:54 +08:00
artem
693b28b961
Merge pull request #22 from NousResearch/vision_env_fixes
fix multimodal envs. add view_run_multimodal
2025-05-08 20:28:57 -07:00
dmahan93
8ff48065a3 Update server_manager.py to not continue to API config stuff if serverbaseline is set 2025-05-08 20:18:15 -05:00
Teknium
d17356a1af Update README.md 2025-05-08 17:22:56 -07:00
dmahan93
f9b39c28f9
Merge pull request #27 from NousResearch/24-keyerror-on-self_state-in-base-register-env-fail
24 keyerror on self state in base register env fail
2025-05-08 17:46:41 -05:00