Contributing to reasoning-gym

Developer Setup

  1. Clone the project
git clone https://github.com/open-thought/reasoning-gym.git
  2. Create a virtual environment (here we use conda)
conda create --name reasoning_gym python=3.11 -y
conda activate reasoning_gym
  3. Link the project and install dependencies
pip install -e .
  4. Install development dependencies
pip install -r requirements-dev.txt

Procedural Datasets

  • We are primarily interested in problems/riddles for which guessing the answer has very little chance of success (good example: multiplying numbers). The problem with tasks that have small sets of possible answers (like true/false or multiple-choice) is that RL has to deal with very noisy rewards, which makes it hard to learn faithful chains of thought.
  • Each dataset should come with a configuration class, a dataset class derived from ProceduralDataset (see dataset.py), and unit tests.
  • All datasets return dict items with the keys "question", "answer" and "metadata". When no single good answer can be given, set "answer" to None.
  • For non-trivial datasets, override the score_answer() method, which returns a numeric value in the range [0, 1] indicating how close the given answer is to the expected answer.
  • Take a look at a simple dataset implementation like chain_sum.py and test_chain_sum.py.
  • Provide clear instructions in the question prompt that would allow an average human to produce an answer in the correct format.
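
As a rough illustration of the points above, here is a minimal, self-contained sketch of a dataset that returns items in the expected shape and scores answers in [0, 1]. It deliberately does not import reasoning-gym: ToyProductConfig and ToyProductDataset are hypothetical names, the class is a stand-in rather than a real ProceduralDataset subclass, and the score_answer() signature and the 0.1 partial-credit value are assumptions chosen for the example. Check dataset.py and chain_sum.py for the actual base class and conventions.

```python
import random
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToyProductConfig:
    """Hypothetical configuration class for the toy dataset below."""

    min_value: int = 10
    max_value: int = 99
    seed: int = 42
    size: int = 100

    def validate(self) -> None:
        assert self.min_value <= self.max_value, "min_value must not exceed max_value"
        assert self.size > 0, "size must be positive"


class ToyProductDataset:
    """Stand-in for a ProceduralDataset subclass (assumption: the real base class differs)."""

    def __init__(self, config: ToyProductConfig):
        config.validate()
        self.config = config

    def __len__(self) -> int:
        return self.config.size

    def __getitem__(self, idx: int) -> dict[str, Any]:
        # Seed per item so the same index always produces the same question.
        rng = random.Random(self.config.seed + idx)
        a = rng.randint(self.config.min_value, self.config.max_value)
        b = rng.randint(self.config.min_value, self.config.max_value)
        return {
            "question": f"What is {a} * {b}? Reply with the integer only.",
            "answer": str(a * b),
            "metadata": {"a": a, "b": b},
        }

    def score_answer(self, answer: Optional[str], entry: dict[str, Any]) -> float:
        """Return a score in [0, 1]: 1.0 for an exact match, small credit for a well-formed integer."""
        if answer is None:
            return 0.0
        cleaned = answer.strip()
        if cleaned == entry["answer"]:
            return 1.0
        try:
            int(cleaned)
        except ValueError:
            return 0.0  # not in the requested format at all
        return 0.1  # illustrative partial credit: right format, wrong value


dataset = ToyProductDataset(ToyProductConfig(seed=7, size=5))
item = dataset[0]
print(item["question"])
print(dataset.score_answer(item["answer"], item))
```

A unit test for such a dataset would typically check that items are reproducible for a fixed seed, that every item has the three required keys, and that score_answer() gives 1.0 for the stored answer.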

Submitting Work - Pull Requests

We're all working on different parts of reasoning-gym together. To keep contributions running smoothly, we recommend the following:

  1. Fork this project repository and clone it to your local machine. (Read more About Forks)
  2. On a new branch in your fork (a "feature branch", not main), work on a small, focused change that touches only a few files.
  3. Run pre-commit and make sure all files have formatting fixed. This simplifies life for reviewers.
  4. Package up a small bit of work that solves part of the problem into a Pull Request and send it out for review.
  5. If you're lucky, we can merge your change into main without any problems.
  6. Merge in your change and move on to a new issue or the second step of your current issue.

Tips

  • To keep your PR clean, don't include changes to GALLERY.md. This overview file is regenerated automatically at regular intervals.
  • Install the pre-commit hook via pre-commit install.
  • When using AI coding assistants (Cursor, aider, ...), please run pre-commit run -a to format all files before committing.