[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
2026-04-19 12:57:58 +00:00 · 2026-02-20 04:58:43 +00:00 · 2026-02-20 04:58:43 +00:00 · 60fb6cae11
commit 60fb6cae11
parent ccdd5a1ca6
11 changed files with 221 additions and 136 deletions
--- a/README.md
+++ b/README.md
@@ -294,10 +294,10 @@ Distillation is configured in `BaseEnvConfig` and available via CLI under `--env

 Both setups are supported:

- **Self-distillation (same model family for teacher and student)**  
+- **Self-distillation (same model family for teacher and student)**
  Point `teacher_base_url` to a server running the same model (or equivalent checkpoint family) as the student. This is the most stable setup for token-level alignment.

- **Cross-model distillation (different teacher and student models)**  
+- **Cross-model distillation (different teacher and student models)**
  Also supported, but tokenization compatibility becomes more important. If token vocabularies/template behavior differ significantly, alignment quality may degrade.

 In practice, self-distillation is usually easiest to bring up first, then cross-model can be layered in once your pipeline is stable.
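
The tokenization-compatibility caveat for cross-model distillation can be checked before wiring up a different teacher. The sketch below is not part of this repository: `vocab_overlap` is a hypothetical helper that compares two token-to-id mappings and reports how much of the vocabulary is shared and how often shared tokens carry the same id. It assumes you can obtain each model's vocabulary as a plain `dict[str, int]`.

```python
def vocab_overlap(teacher_vocab: dict[str, int], student_vocab: dict[str, int]) -> dict[str, float]:
    """Hypothetical helper: rough compatibility metrics for two token->id mappings."""
    teacher_tokens = set(teacher_vocab)
    student_tokens = set(student_vocab)
    shared = teacher_tokens & student_tokens
    union = teacher_tokens | student_tokens

    # Count shared tokens that map to the same id in both vocabularies.
    same_id = sum(1 for tok in shared if teacher_vocab[tok] == student_vocab[tok])

    return {
        # Fraction of all tokens (either side) that both vocabularies contain.
        "jaccard": len(shared) / len(union) if union else 1.0,
        # Fraction of the student's tokens that the teacher also knows.
        "student_coverage": len(shared) / len(student_tokens) if student_tokens else 1.0,
        # Among shared tokens, fraction with identical ids.
        "id_agreement": same_id / len(shared) if shared else 0.0,
    }


if __name__ == "__main__":
    # Toy vocabularies for illustration only.
    teacher = {"a": 0, "b": 1, "c": 2}
    student = {"a": 0, "b": 2, "d": 3}
    print(vocab_overlap(teacher, student))
```

For self-distillation with the same checkpoint family, all three values should come out at 1.0. Lower values in a cross-model setup suggest that token-level alignment may degrade. With Hugging Face tokenizers, `tokenizer.get_vocab()` returns a mapping of this shape. Chat-template differences are not covered by this check.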