diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index f129f596..f2bf0ecd 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,62 +1,80 @@
-# Contributing to reasoning-gym
+# Contributing to Reasoning Gym
 
-### Delevloper Setup
+Thank you for your interest in contributing to Reasoning Gym! This document provides guidelines and instructions for contributing to the project.
 
-1. Clone the project
+## Development Setup
 
-```
-git clone https://github.com/open-thought/reasoning-gym.git
-```
+1. Clone the repository:
+   ```bash
+   git clone https://github.com/open-thought/reasoning-gym.git
+   ```
 
-2. Create a virtual environment (here we use conda)
+2. Create a virtual environment (using conda):
+   ```bash
+   conda create --name reasoning_gym python=3.11 -y
+   conda activate reasoning_gym
+   ```
 
-```
-conda create --name reasoning_gym python=3.11 -y
-conda activate reasoning_gym
-```
+3. Install the package in editable mode:
+   ```bash
+   pip install -e .
+   ```
 
-3. Link project and install dependencies
+4. Install development dependencies:
+   ```bash
+   pip install -r requirements-dev.txt
+   ```
 
-```
-pip install -e .
-```
+## Creating Procedural Datasets
 
-4. Install development dependencies
+When creating new datasets, please follow these guidelines:
 
-```
-pip install -r requirements-dev.txt
-```
-
-
-## Procedural Datasets
-
-- We are primarily interested in problems/riddles for which guessing the answer has very little chance of success (good example: multiplying numbers). The problem of tasks with small sets of possible answers (like true/false, multiple-choice) is that RL has to deal with very noisy rewards, which makes it for learning faithful Chain-of-Thoughts.
-- Each dataset should come with a configuration class, the dataset class derived from `ProceduralDataset` (see [dataset.py](https://github.com/open-thought/reasoning-gym/blob/main/reasoning_gym/dataset.py)) and unit tests.
-- All datasets return dict items with the keys `"question"`, `"answer"` and `"metadata"`. When no single good answer can be given set "answer" to `None`.
-- For non-trivial datasets override the `score_answer()` method which returns a numeric value in the range [0, 1] to indicate how close the result is to the actual result.
-- take a look at a simple dataset implementation like [chain_sum.py](reasoning_gym/arithmetic/chain_sum.py) and [test_chain_sum.py](https://github.com/open-thought/reasoning-gym/blob/main/tests/test_chain_sum.py).
-- provide clear instructions in the question prompt that would allow an average human to produce an asswer in the correct format.
-
-
-## Submitting Work - Pull-Requets
-
-We're all working on different parts of reasoning-gym together. To make contributions smoothly we recommend the following:
-
-1. [Fork this project repository](https://docs.github.com/en/get-started/quickstart/fork-a-repo) and clone it to your local machine. (Read more [About Forks](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/about-forks))
-1. On a [new branch](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-and-deleting-branches-within-your-repository) in your fork (aka a "feature branch" and not `main`) work on a small focused change that only touches on a few files.
-1. Run `pre-commit` and make sure all files have formatting fixed. This simplifies life for reviewers.
-1. Package up a small bit of work that solves part of the problem
-   [into a Pull Request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork)
-   and
-   [send it out for review](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/requesting-a-pull-request-review).
-1. If you're lucky, we can merge your change into `main` without any problems.
-1. Merge in your change and move on to a new issue or the second step of your current issue.
-
-
-### Tips
-
-- To keep your PR clean don't include changes of `GALLERY.md` - the overview file is automatically updated regulary automatically
-- install the pre-commit hook via `pre-commit install`
-- when using AI coding assistants (cursor, aider, ..) please run `pre-commit run -a` to format all files before committing.
+1. **Focus on Complex Problems**:
+   - Prioritize problems where guessing has a low probability of success (e.g., number multiplication)
+   - Avoid tasks with small answer sets (true/false, multiple-choice) as they create noisy rewards for RL
+
+2. **Implementation Requirements**:
+   - Create a configuration class
+   - Derive your dataset class from `ProceduralDataset` (see [dataset.py](https://github.com/open-thought/reasoning-gym/blob/main/reasoning_gym/dataset.py))
+   - Include comprehensive unit tests
+   - Return dictionary items with keys: `"question"`, `"answer"`, and `"metadata"`
+   - Use `None` for `"answer"` when multiple valid answers exist
+   - For complex datasets, implement the `score_answer()` method (return value range: [0, 1])
+
+3. **Getting Started**:
+   - Review example implementations:
+     - [chain_sum.py](reasoning_gym/arithmetic/chain_sum.py)
+     - [test_chain_sum.py](https://github.com/open-thought/reasoning-gym/blob/main/tests/test_chain_sum.py)
+   - Write clear question prompts that an average human can understand and answer correctly
+
+## Pull Request Process
+
+1. **Fork and Clone**:
+   - [Fork the repository](https://docs.github.com/en/get-started/quickstart/fork-a-repo)
+   - Clone your fork locally
+   - Read more about [forks](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/about-forks)
+
+2. **Create a Feature Branch**:
+   - Work on a [new branch](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-and-deleting-branches-within-your-repository)
+   - Keep changes focused and minimal
+
+3. **Code Quality**:
+   - Install pre-commit hooks: `pre-commit install`
+   - Run `pre-commit run -a` before committing
+   - When using AI coding assistants (cursor, aider, etc.), ensure proper formatting
+
+4. **Submit Your PR**:
+   - [Create a Pull Request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork)
+   - [Request review](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/requesting-a-pull-request-review)
+   - Do not include changes to `GALLERY.md` (it's updated automatically)
+
+5. **Review Process**:
+   - Address reviewer feedback promptly
+   - Keep discussions constructive
+   - Once approved, your changes will be merged into `main`
+
+## Need Help?
+
+Join our community discussion in the `#reasoning-gym` channel on the [GPU-Mode Discord server](https://discord.gg/gpumode).
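
Note on the "Implementation Requirements" list added in this diff: the shape it describes (a configuration class, a dataset class, items with `"question"`, `"answer"` and `"metadata"` keys, and a `score_answer()` returning a value in [0, 1]) can be sketched roughly as below. This is only an illustration under stated assumptions. The diff does not show the interface of `ProceduralDataset`, so the sketch uses a plain stand-in class rather than the real base class, and `ProductConfig`, `ProductDataset`, the config fields, and the `score_answer(answer, entry)` signature are all hypothetical names chosen for the example.

```python
import random
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ProductConfig:
    """Hypothetical configuration for a toy multiplication dataset."""

    min_value: int = 10
    max_value: int = 99
    seed: int = 42
    size: int = 100

    def validate(self) -> None:
        assert self.min_value <= self.max_value, "min_value must not exceed max_value"
        assert self.size > 0, "size must be positive"


class ProductDataset:
    """Stand-in for a class that would derive from ProceduralDataset.

    The real base class is not shown in this diff, so it is omitted here.
    """

    def __init__(self, config: ProductConfig) -> None:
        config.validate()
        self.config = config

    def __len__(self) -> int:
        return self.config.size

    def __getitem__(self, idx: int) -> dict[str, Any]:
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Derive a per-item generator from seed and index so that the same
        # index always produces the same problem.
        rng = random.Random(self.config.seed + idx)
        a = rng.randint(self.config.min_value, self.config.max_value)
        b = rng.randint(self.config.min_value, self.config.max_value)
        return {
            "question": f"What is {a} * {b}? Reply with the integer only.",
            "answer": str(a * b),
            "metadata": {"a": a, "b": b},
        }

    def score_answer(self, answer: Optional[str], entry: dict[str, Any]) -> float:
        """Return 1.0 for an exact match and 0.0 otherwise (assumed signature)."""
        if answer is None:
            return 0.0
        return 1.0 if answer.strip() == entry["answer"] else 0.0
```

Multiplication is used because the guidelines name it as a task where guessing rarely succeeds. The prompt states the expected answer format explicitly, in line with the "clear question prompts" guideline. A unit test for such a class would typically check that items are reproducible for a fixed seed and that `score_answer()` stays within [0, 1].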