mirror of
https://github.com/open-thought/reasoning-gym.git
synced 2026-04-24 17:05:03 +00:00
3.4 KiB
Contributing to reasoning-gym
Developer Setup
- Clone the project
git clone https://github.com/open-thought/reasoning-gym.git
- Create a virtual environment (here we use conda)
conda create --name reasoning_gym python=3.11 -y
conda activate reasoning_gym
- Link project and install dependencies
pip install -e .
- Install development dependencies
pip install -r requirements-dev.txt
Procedural Datasets
- We are primarily interested in problems/riddles for which guessing the answer has very little chance of success (good example: multiplying numbers). The problem with tasks that have small sets of possible answers (like true/false or multiple-choice) is that RL has to deal with very noisy rewards, which makes it hard to learn faithful chains of thought.
- Each dataset should come with a configuration class, a dataset class derived from ProceduralDataset (see dataset.py), and unit tests.
- All datasets return dict items with the keys "question", "answer" and "metadata". When no single good answer can be given, set "answer" to None.
- For non-trivial datasets, override the score_answer() method, which returns a numeric value in the range [0, 1] indicating how close the result is to the actual result.
- Take a look at a simple dataset implementation like chain_sum.py and test_chain_sum.py.
- Provide clear instructions in the question prompt that would allow an average human to produce an answer in the correct format.
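The guidelines above can be sketched roughly as follows. This is a minimal, self-contained illustration and not code from the repository: `ToySumConfig` and `ToySumDataset` are hypothetical names, the class deliberately does not inherit from the real ProceduralDataset (whose exact interface is defined in dataset.py), and the partial-credit rule in `score_answer()` is only one possible choice.

```python
import random
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToySumConfig:
    """Hypothetical configuration class (not part of reasoning-gym)."""

    min_value: int = 100
    max_value: int = 9999
    seed: int = 42
    size: int = 500

    def validate(self) -> None:
        assert self.min_value <= self.max_value, "min_value must not exceed max_value"
        assert self.size > 0, "size must be positive"


class ToySumDataset:
    """Stand-in dataset; a real one would derive from ProceduralDataset."""

    def __init__(self, config: ToySumConfig):
        config.validate()
        self.config = config

    def __len__(self) -> int:
        return self.config.size

    def __getitem__(self, idx: int) -> dict[str, Any]:
        # Seed per item so the same index always yields the same problem.
        rng = random.Random(self.config.seed + idx)
        a = rng.randint(self.config.min_value, self.config.max_value)
        b = rng.randint(self.config.min_value, self.config.max_value)
        return {
            # The prompt states the expected answer format explicitly.
            "question": f"What is {a} + {b}? Reply with the integer only.",
            "answer": str(a + b),
            "metadata": {"operands": [a, b]},
        }

    def score_answer(self, answer: Optional[str], entry: dict[str, Any]) -> float:
        # Assumed scoring rule: 1.0 for an exact match, 0.5 if the expected
        # value appears inside a longer reply, 0.0 otherwise.
        if answer is None:
            return 0.0
        if answer.strip() == entry["answer"]:
            return 1.0
        if entry["answer"] in answer:
            return 0.5
        return 0.0


ds = ToySumDataset(ToySumConfig(seed=7, size=3))
item = ds[0]
print(item["question"])
print(ds.score_answer(item["answer"], item))  # 1.0
```

Seeding the generator from the config seed plus the item index keeps items reproducible, which makes unit tests straightforward: the same index can be requested twice and compared for equality.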
Submitting Work - Pull Requests
We're all working on different parts of reasoning-gym together. To make contributions smoothly we recommend the following:
- Fork this project repository and clone it to your local machine. (Read more About Forks)
- On a new branch in your fork (aka a "feature branch" and not main), work on a small, focused change that only touches a few files.
- Run pre-commit and make sure all files have formatting fixed. This simplifies life for reviewers.
- Package up a small bit of work that solves part of the problem into a Pull Request and send it out for review.
- If you're lucky, we can merge your change into main without any problems.
- Merge in your change and move on to a new issue or the second step of your current issue.
Tips
- To keep your PR clean, don't include changes to GALLERY.md; the overview file is regularly updated automatically.
- Install the pre-commit hook via pre-commit install
- When using AI coding assistants (Cursor, aider, ...), please run pre-commit run -a to format all files before committing.