Commit graph

763 commits

Author SHA1 Message Date
google-labs-jules[bot]
276a845dd7 feat: Implement SWE-RL Environment with Full Refinements
I've implemented the SWERLEnv in environments/swe_rl_env.py, based on the
SWE-RL paper (arXiv:2502.18449). This version incorporates extensive
refinements based on your feedback.

Key features implemented in environments/swe_rl_env.py:
- Core environment structure (setup, trajectory collection, scoring, evaluation).
- "Thinking" step: LLM is prompted for reasoning within <think> </think> tags
  before generating a patch. Includes strict parsing for these tags.
- Dynamic prompt construction using `tokenizer.apply_chat_template` with
  NousResearch/DeepHermes-3-Llama-3-8B-Preview as the default model.
- Hugging Face dataset integration: Loads data from HF Hub with configurable
  dataset name, splits, and column mappings.
- Reward mechanism: Based on thinking tag correctness, patch format
  (SEARCH/REPLACE), and similarity to the oracle patch.
- Comprehensive WandB logging for training/evaluation metrics.

NOTE: I made multiple attempts to update 'environments/README.md'
with documentation for this new environment. While I
reported success in some turns, this was not consistently verifiable
and may not have been correctly applied. The README.md file may
require manual verification and updating for the SWERLEnv.
2025-05-22 01:28:00 +00:00
Andrew
5d34ea821d removed html from data folder 2025-05-21 18:10:10 -07:00
Andrew
6316cf31a5 chore: remove .zip and .html files per review feedback 2025-05-21 18:08:22 -07:00
based-tachikoma
227e594ebf add debug_target.pdb test file 2025-05-21 16:50:15 -07:00
Eric Liu
7eae51cc5c Move to subfolder 2025-05-21 16:19:00 -07:00
Eric Liu
a88e3afddf DeepSacrifice 2025-05-21 16:18:46 -07:00
Allan Niemerg
7a653044a4 add GAIA download to README 2025-05-21 16:18:08 -05:00
Allan Niemerg
7710e151cc This adds the SmolaGents integration to Atropos, enabling the creation of high-quality agent trajectories for training data. 2025-05-21 15:47:57 -05:00
Andrew
c3a4461008 feat: update and refactored meteorology environment with latest changes 2025-05-20 20:23:00 -07:00
based-tachikoma
1ee67de035 refactor, full run 2025-05-20 20:12:59 -07:00
based-tachikoma
de9dfff221 rfdiffusion fix 2025-05-20 20:12:59 -07:00
hallerite
4d9bec44c6
[env]: add initial ProteinBinderEnv
Co-authored-by: based-tachikoma <based.tachikoma@gmail.com>
2025-05-18 20:03:21 -07:00
Earl Potters
db0cf9e6c0 Remove outdated DynastAI documentation and test scripts
- Deleted the ATROPOS_INTEGRATION.md and INSTALL_AND_RUN.md files, which contained installation and usage instructions for DynastAI.
- Removed test script test_dynastai_env.py and installation verification script verify_install.py, as they are no longer needed.
2025-05-18 19:06:20 -07:00
ParsaIdp
71f6d48e87
Create optimizer_benchmark_environmenr.py 2025-05-18 18:14:10 -07:00
ParsaIdp
f9a444b6f2
Update optimizer_benchmark_env.py 2025-05-18 18:13:25 -07:00
Kirill Igumenshchev (aider)
f59aaba24a feat: ask to generate 3 example jokes in dataset question prompt 2025-05-18 18:12:48 -07:00
Dylan Anderson
7e91a94a3e Add wandb 2025-05-18 18:00:21 -07:00
arihanv
291dcd8351 add: env 2025-05-18 17:58:56 -07:00
Karthik-Ragunath
34e9784311 pushing jsonl files 2025-05-18 17:56:27 -07:00
Kirill Igumenshchev
41cf093415 feat: add HTML rendering for humor datasets 2025-05-18 17:55:59 -07:00
Alex
444bd5b1d7
doctor.jsonl 2025-05-18 17:55:30 -07:00
Josh
c17cdb4486 Update README 2025-05-18 17:53:59 -07:00
Joshua Jerin
ab9a6f6d97
Update README.md 2025-05-18 20:53:13 -04:00
Dylan Anderson
1525e9404a Add youtube 2025-05-18 17:53:07 -07:00
Joshua Jerin
baa6a1feef
Update README.md 2025-05-18 20:50:53 -04:00
Steven Li
4eae1c44ca add examples to cat system prompt 2025-05-18 17:50:42 -07:00
Tvpower
320614e294 added videp 2025-05-18 17:50:33 -07:00
Karthik-Ragunath
9725761f5b dev - push for submission 2025-05-18 17:50:15 -07:00
ParsaIdp
856b437b3a
Update wrapper.py 2025-05-18 17:49:31 -07:00
Joshua Jerin
c4e02454e0 refactor 2 2025-05-18 17:48:25 -07:00
Kirill Igumenshchev (aider)
96043a968f refactor: update score method to use LLM with detailed rubric for joke evaluation 2025-05-18 17:48:14 -07:00
Joshua Jerin
d8e16c7991 refactor 2025-05-18 17:47:29 -07:00
Josh
7065d936d7 Update README 2025-05-18 17:47:16 -07:00
FIRST_NAME LAST_NAME
f401a746f1 fix 2025-05-18 17:47:08 -07:00
justin5764
55f4face3d Create LeanRLREADME.md 2025-05-18 17:46:24 -07:00
Drew Sny
30549fc812 added compressed jsonl wandb 2025-05-18 17:43:52 -07:00
Jonah Philion
4e83714b44 make the evaluator more discerning 2025-05-18 17:43:36 -07:00
iyaja
1764a80094 submit: pokemon showdown env 2025-05-18 17:43:17 -07:00
Pranceraz
c7ce1be94c working 2025-05-18 17:42:58 -07:00
Pranceraz
8163481fdc work in progress 2025-05-18 17:41:58 -07:00
justin5764
a4d253bc5c Commit 2025-05-18 17:41:35 -07:00
Kirill Igumenshchev (aider)
db1e68d2ab fix: implement abstract evaluate method in HumorEnv to fix instantiation error 2025-05-18 17:41:21 -07:00
Kirill Igumenshchev (aider)
b99757ec03 fix: default to 'serve' command if no subcommand is provided in CLI 2025-05-18 17:40:31 -07:00
jeannemtl
dd179b8fa6 Add latest quantum training artifacts 2025-05-19 00:39:27 +00:00
Josh
fedcf7d376 Add README 2025-05-18 17:39:12 -07:00
Kirill Igumenshchev (aider)
24a350bc71 feat: add HumorEnv environment for humor dataset in hack0 directory 2025-05-18 17:39:01 -07:00
Joshua Jerin
eb10d3f4df requirements.txt 2025-05-18 17:38:39 -07:00
Josh
904360a02e Cleanup. End-to-end functionality in place 2025-05-18 17:38:29 -07:00
Dylan Anderson
2acd8aef3e add more patients 2025-05-18 17:38:21 -07:00
Earl Potters
92048e423f Merge branch 'main' of https://github.com/Slyracoon23/atropos 2025-05-18 17:37:06 -07:00