Merge pull request #192 from rnkrtt/main

Fix typo in author name Gurning -> Gurung in community README
2026-04-28 17:29:30 +00:00 · 2025-06-23 10:15:58 -05:00
commit 5b2b5e9947
parent f30453514b af1c98d7a8
1 changed file with 1 addition and 1 deletion
--- a/environments/community/README.md
+++ b/environments/community/README.md
@@ -293,7 +293,7 @@ A unique environment for training LLMs to express needs and desires through auth
 **Author**: [JakeBoggs](https://github.com/JakeBoggs)
 **Purpose**: Train LLMs to generate humorous punchlines using Verifiable Rewards via Completion Likelihood Improvement (VR-CLI)

-A specialized environment for training LLMs to understand humor by generating joke punchlines through a novel RL technique from the paper "Learning to Reason for Long-Form Story Generation" (Gurning & Lapata, 2025). The environment teaches models to first generate reasoning that leads to good punchlines, with rewards based on how much the reasoning improves the likelihood of the actual punchline.
+A specialized environment for training LLMs to understand humor by generating joke punchlines through a novel RL technique from the paper "Learning to Reason for Long-Form Story Generation" (Gurung & Lapata, 2025). The environment teaches models to first generate reasoning that leads to good punchlines, with rewards based on how much the reasoning improves the likelihood of the actual punchline.

 **Features**:
 - **VR-CLI Methodology**: Uses Verifiable Rewards via Completion Likelihood Improvement for reduced overfitting