Commit graph

  • 76e8bf7564 chore: merge ensemble for integration RUFFY-369 2026-03-30 17:15:46 +05:30
  • 3d08cdfe83 style: resolve merge conflicts and back-port stabilization fixes for integration RUFFY-369 2026-03-30 17:15:31 +05:30
  • dedea3018b style: fix lints and back-port stabilization fixes for trainer optimization RUFFY-369 2026-03-30 17:14:58 +05:30
  • 9e7d19f7d3 style: fix lints and numerical stability checks RUFFY-369 2026-03-30 17:14:30 +05:30
  • 8a3a582beb style: fix lints and pin dependencies for reward normalization RUFFY-369 2026-03-30 17:14:19 +05:30
  • 95c3e4984c fix: pin antlr4-python3-runtime for compatibility RUFFY-369 2026-03-30 17:14:16 +05:30
  • 38731e0434 style: fix linting and imports in curriculum scheduler RUFFY-369 2026-03-30 17:13:14 +05:30
  • 475424fb3f Merge e43aefcb2d into c20c85256e Rekttobi 2026-03-30 07:28:51 +00:00
  • e43aefcb2d docs: add Pidgin English Q&A environment to README unknown 2026-03-30 08:25:12 +01:00
  • 8cfa6ff4c7 Merge branch 'main' into docs/add-troubleshooting-guide kokoron 2026-03-30 14:23:41 +07:00
  • b30c9ececc docs: add troubleshooting guide for common setup issues kokoron 2026-03-30 06:55:33 +00:00
  • 12ee1240d4 Merge 792ba5d3ae into c20c85256e SignaBuilder 2026-03-29 21:51:58 +00:00
  • 792ba5d3ae Merge branch 'NousResearch:main' into community/moe-routing SignaBuilder 2026-03-29 16:51:54 -05:00
  • c1e01d581e Merge b13f808e1e into c20c85256e Forostovec 2026-03-29 13:22:14 -07:00
  • 6f15df6008 Merge 34a0bab77a into c20c85256e Philip Lippmann 2026-03-29 13:02:51 -07:00
  • 9dfb8242ee Merge 121deb5349 into c20c85256e Thomasyoung113 2026-03-29 20:00:52 +00:00
  • 4706346e91 Fix typos in README.md fix: correct typos in environments/community/README.md Robbian Saputra Gumay 2026-03-30 01:55:44 +07:00
  • 3993fdd52c Merge eaaeb928dc into c20c85256e Srishti Gureja 2026-03-29 21:55:14 +05:30
  • f0d2fc2826 Merge 0ab46d65b0 into c20c85256e Siddharth Balyan 2026-03-28 07:52:20 +00:00
  • 0ab46d65b0 Add streaming eval sample logging to BaseEnv sid/traj-saving-eval-mode alt-glitch 2026-03-28 00:50:52 -07:00
  • 0e975e3695 Merge dd2b3663a1 into c20c85256e Philip Lippmann 2026-03-28 04:25:16 +00:00
  • fd41c5c0a0 Merge 90d5044066 into c20c85256e radik878 2026-03-28 04:15:31 +00:00
  • 9f017e8231 Merge aebba980bf into c20c85256e alireza78a 2026-03-28 03:48:10 +00:00
  • 3004a9fe3c Merge 9fcd2d34cc into c20c85256e Enzam 2026-03-28 03:41:08 +00:00
  • 9dbcd0bc95 Merge 74bce3f103 into c20c85256e J-SUPHA 2026-03-28 03:23:34 +00:00
  • 7e6f292540 Merge bac04c6343 into c20c85256e GarmashAlex 2026-03-28 03:15:10 +00:00
  • ebec428b94 Merge 4cfe09432c into c20c85256e MozirDmitriy 2026-03-28 02:36:33 +00:00
  • 5a0d62a9db Merge 4b77dc4935 into c20c85256e nevasini1 2026-03-28 01:29:48 +00:00
  • 49566d5417 Merge 052083aa8b into c20c85256e Ocheretovich 2026-03-28 01:03:11 +00:00
  • 07f1fc79e6 Merge e98e04bc17 into c20c85256e shannonsands 2026-03-28 00:46:44 +00:00
  • 3596d22c36 Merge 9df39bb475 into c20c85256e Not Lain 2026-03-28 00:20:57 +00:00
  • 7f4a7f4b93 Merge b343814990 into c20c85256e Srishti Gureja 2026-03-28 00:09:30 +00:00
  • 9d10b00cae Merge ec576701b1 into c20c85256e 0xbyt4 2026-03-28 00:08:18 +00:00
  • 83a343d3a9 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2026-03-27 23:08:02 +00:00
  • 63d717e4b4 Merge branch 'main' into sid/traj-saving-eval-mode Siddharth Balyan 2026-03-28 04:36:31 +05:30
  • 7a4edb569c add trajectory saving to eval mode. alt-glitch 2026-03-27 16:04:04 -07:00
  • 8f0a5449e1 Merge 655faa775c into c20c85256e Rohan 2026-03-27 22:55:12 +00:00
  • c20c85256e Merge pull request #416 from NousResearch/teacherenv main dmahan93 2026-03-27 17:36:21 -05:00
  • a5e02a7087 feat: add API performance tracking for trainer-inference communication RUFFY-369 2026-03-28 03:59:25 +05:30
  • 36a4a909ba feat: add numerical verification utilities for RL correctness checking RUFFY-369 2026-03-28 03:48:45 +05:30
  • 01da524b6b feat: add curriculum learning scheduler for sample-efficient RL training RUFFY-369 2026-03-28 03:39:01 +05:30
  • 0674e31a53 feat: add online reward normalization for multi-env RL training stability RUFFY-369 2026-03-28 03:31:28 +05:30
  • feef039cfd feat: add EnsembleReward with robust aggregation and inter-rater reliability RUFFY-369 2026-03-28 03:22:57 +05:30
  • feddad9d57 style: apply black and ruff formatting for production standards RUFFY-369 2026-03-28 01:07:46 +05:30
  • 8cd30c3703 refactor:final production-ready audit; remove debug artifacts and non-ASCII characters RUFFY-369 2026-03-28 00:21:49 +05:30
  • 287b7e7250 chore:add debugging for int server error RUFFY-369 2026-03-27 12:19:54 +05:30
  • 74da1e5171 get over the first hump trainer4teacher Jai Suphavadeeprasit 2026-03-24 11:05:26 -07:00
  • 0b245c5ab5 socratics socratic Jai Suphavadeeprasit 2026-03-24 10:45:30 -07:00
  • d84e4af213 style: remove debug prints from code_debug_env scoring RUFFY-369 2026-03-24 16:15:13 +05:30
  • 9e727ce5ca fix: handle identical scores in process mode with noise instead of None RUFFY-369 2026-03-24 15:55:33 +05:30
  • 5c2afa8ea7 fix: correct dataset name to bigcode/humanevalpack RUFFY-369 2026-03-24 15:32:25 +05:30
  • 590e8a1ef2 feat: add code_debug community environment RUFFY-369 2026-03-24 13:05:15 +05:30
  • 0fec002516 logging time taken for inference Jai Suphavadeeprasit 2026-03-23 13:31:37 -07:00
  • 40516ca195 logging time taken for inference Jai Suphavadeeprasit 2026-03-23 12:55:43 -07:00
  • 0f787151f2 logging the teacher step Jai Suphavadeeprasit 2026-03-23 12:21:34 -07:00
  • 7ed3cc6d1b logging the teacher step Jai Suphavadeeprasit 2026-03-23 11:52:51 -07:00
  • ee0cc6eeac set up logging Jai Suphavadeeprasit 2026-03-20 14:52:41 -04:00
  • ff7c4aa0c0 set up logging Jai Suphavadeeprasit 2026-03-20 14:49:12 -04:00
  • 9232ae6abd silly issues Jai Suphavadeeprasit 2026-03-20 13:14:32 -04:00
  • ae84b6a021 silly issues Jai Suphavadeeprasit 2026-03-20 11:06:30 -04:00
  • 6f450679e9 testing teacher fully end to end with forward kl Jai Suphavadeeprasit 2026-03-20 00:42:20 -04:00
  • e1542ee731 clean example trainer teacherenv Jai Suphavadeeprasit 2026-03-23 11:30:04 -07:00
  • fae87dcaaa clean vllm tonight Jai Suphavadeeprasit 2026-03-23 11:28:40 -07:00
  • 75a032bf3e revert openai server Jai Suphavadeeprasit 2026-03-23 11:26:05 -07:00
  • 295bb9c446 revert openai server Jai Suphavadeeprasit 2026-03-23 11:25:28 -07:00
  • 8745f0533e revert teacher logprobs Jai Suphavadeeprasit 2026-03-23 11:23:47 -07:00
  • 79ff1642f8 revert gsm8k Jai Suphavadeeprasit 2026-03-23 11:18:14 -07:00
  • ed36c6342c docs: replace stale gate wording in tiered experts README Thomas Perry 2026-03-22 10:14:12 -05:00
  • ed5e65a3d4 fix: align Graph of Tiered Experts with Hermes-series defaults Thomas Perry 2026-03-22 10:13:22 -05:00
  • 87c08faaff refine naming to Graph of Tiered Experts Architecture Thomas Perry 2026-03-22 00:50:53 -05:00
  • 4b77dc4935 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2026-03-21 22:01:22 +00:00
  • ce58c3aca2 Apply isort import order (ruff) to arithmetic_chain_server nevasini1 2026-03-21 17:59:57 -04:00
  • e6bc008545 Add arithmetic_chain community environment nevasini1 2026-03-21 17:42:01 -04:00
  • 6753db0354 docs: add Python 3.10+ requirement note for Mac users Nawf23 2026-03-20 12:30:53 +00:00
  • 45f569f3af clean Jai Suphavadeeprasit 2026-03-18 09:20:08 -04:00
  • 41947e98d6 clean Jai Suphavadeeprasit 2026-03-17 12:25:38 -04:00
  • 79baac1ea7 clean Jai Suphavadeeprasit 2026-03-17 12:23:35 -04:00
  • 01e25707b0 student student Jai Suphavadeeprasit 2026-03-17 12:02:48 -04:00
  • 7aba0d3fc8 fresh eyes check Jai Suphavadeeprasit 2026-03-14 11:20:15 -04:00
  • d14dfb9f55 docs: add python3 fallback for venv setup hasbunallah 2026-03-14 06:08:11 +00:00
  • 805a0c0eac revert to similar structure Jai Suphavadeeprasit 2026-03-13 20:52:40 -04:00
  • f053c77a62 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2026-03-14 00:43:19 +00:00
  • 9bd299b3ef better logging for devex Jai Suphavadeeprasit 2026-03-13 20:41:41 -04:00
  • 3a85ede8ba [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2026-03-13 22:51:54 +00:00
  • a171358f2e structural changes Jai Suphavadeeprasit 2026-03-13 18:49:01 -04:00
  • 12ba3cc3bd [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2026-03-13 21:25:23 +00:00
  • 1b8ff075c4 adding tests Jai Suphavadeeprasit 2026-03-13 17:23:40 -04:00
  • 6c564799bc [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2026-03-13 21:02:04 +00:00
  • 697c594c72 changes Jai Suphavadeeprasit 2026-03-13 16:57:46 -04:00
  • 82964b6e48 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2026-03-13 20:13:30 +00:00
  • a8cdb53a4d address problems Jai Suphavadeeprasit 2026-03-13 16:12:05 -04:00
  • 322e7e6623 remove comments Jai Suphavadeeprasit 2026-03-13 13:29:47 -04:00
  • 994e9c287d [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2026-03-13 17:20:56 +00:00
  • a1b545c734 remove cross tokenization and fix location of configs Jai Suphavadeeprasit 2026-03-13 13:19:28 -04:00
  • 148a4fd5eb remove training code Jai Suphavadeeprasit 2026-03-13 12:52:52 -04:00
  • 862cd3667d clean logging Jai Suphavadeeprasit 2026-03-13 12:38:52 -04:00
  • 600c54f5f8 clean log Jai Suphavadeeprasit 2026-03-13 12:09:08 -04:00
  • d1b0dee8f7 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2026-03-13 15:14:05 +00:00
  • d8857eb69f investigating weird training issue Jai Suphavadeeprasit 2026-03-13 09:07:11 -04:00
  • 3df0e45659 investigating weird training issue Jai Suphavadeeprasit 2026-03-12 20:02:15 -04:00