mirror of https://github.com/NousResearch/atropos.git synced 2026-04-22 16:48:57 +00:00

nevasini1 ce58c3aca2 Apply isort import order (ruff) to arithmetic_chain_server Made-with: Cursor		2026-03-21 17:59:57 -04:00
arithmetic_chain_server.py	Apply isort import order (ruff) to arithmetic_chain_server	2026-03-21 17:59:57 -04:00
README.md	Add arithmetic_chain community environment	2026-03-21 17:42:34 -04:00

README.md

Arithmetic Chain

A self-contained RL environment that procedurally generates multi-step integer problems: a chain of add, subtract, and multiply operations applied to a starting value. The model must give its final answer as \boxed{integer}; rewards are computed through the same math_verify path as GSM8K.
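To make the answer format concrete, here is a minimal sketch of binary scoring on a \boxed{integer} answer. It is not the environment's code: the environment scores through math_verify, while this stand-in uses a plain regular expression. The names `extract_boxed_int` and `score` are hypothetical, and taking the last boxed value in the completion is an assumption.

```python
import re

# Matches \boxed{...} containing an optionally signed integer.
BOXED_INT = re.compile(r"\\boxed\{\s*(-?\d+)\s*\}")


def extract_boxed_int(completion: str):
    """Return the last boxed integer in the completion, or None if absent."""
    matches = BOXED_INT.findall(completion)
    return int(matches[-1]) if matches else None


def score(completion: str, gold: int) -> float:
    """Binary reward: 1.0 when the boxed integer equals the gold answer."""
    return 1.0 if extract_boxed_int(completion) == gold else 0.0
```

A completion that states the right number without boxing it scores 0.0 under this sketch, which is why the prompt has to ask for the boxed form explicitly.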

No Hugging Face dataset is needed: training items are sampled on the fly.
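On-the-fly sampling can be pictured with the sketch below. It is an illustration only, not the generator in arithmetic_chain_server.py: the names `ChainItem` and `sample_chain`, the operand ranges, the default step count, and the prompt wording are all assumptions.

```python
import random
from dataclasses import dataclass

# The three operations named in the description above.
OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
}


@dataclass
class ChainItem:
    prompt: str
    answer: int


def sample_chain(rng: random.Random, n_steps: int = 3) -> ChainItem:
    """Sample one problem: a starting value followed by n_steps operations."""
    value = rng.randint(1, 20)  # assumed range for the starting value
    lines = [f"Start with {value}."]
    for _ in range(n_steps):
        op = rng.choice(sorted(OPS))
        operand = rng.randint(2, 9)  # assumed operand range
        value = OPS[op](value, operand)
        verb = "Multiply by" if op == "multiply" else op.capitalize()
        lines.append(f"{verb} {operand}.")
    lines.append("Give the final value as \\boxed{integer}.")
    return ChainItem(prompt=" ".join(lines), answer=value)


if __name__ == "__main__":
    item = sample_chain(random.Random(0))
    print(item.prompt)
    print(item.answer)
```

Passing a seeded `random.Random` keeps sampling reproducible, which helps when comparing rollouts across runs.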

Run (serve)

From the repo root, with the Atropos API and an OpenAI-compatible inference server configured in config_init or via CLI overrides:

python environments/community/arithmetic_chain/arithmetic_chain_server.py serve --slurm false

Process (debug rollouts)

python environments/community/arithmetic_chain/arithmetic_chain_server.py process \
  --env.data_path_to_save_groups rollouts.jsonl \
  --slurm false

The environment uses ManagedServer for token/logprob tracking, which keeps it compatible with trainers that expect Atropos’ standard scored groups.
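After a process run, a quick way to look at the saved file is the sketch below. It assumes rollouts.jsonl is in JSON Lines form (one JSON value per line), as the extension suggests, and it deliberately assumes nothing about field names: it only counts records and reports which top-level keys appear. `summarize_jsonl` is a hypothetical helper, not part of Atropos.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path):
    """Count records in a JSON Lines file and tally their top-level keys."""
    n_records = 0
    key_counts = Counter()
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            n_records += 1
            if isinstance(record, dict):
                key_counts.update(record.keys())
    return n_records, key_counts


if __name__ == "__main__":
    count, keys = summarize_jsonl("rollouts.jsonl")
    print(f"{count} records")
    for key, seen in keys.most_common():
        print(f"{key}: present in {seen} records")
```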
