Update README.md

2026-04-19 12:57:58 +00:00 · 2025-05-18 20:53:13 -04:00 · ab9a6f6d97
commit ab9a6f6d97
parent baa6a1feef
1 changed file with 18 additions and 19 deletions
--- a/environments/hack0/README.md
+++ b/environments/hack0/README.md
@@ -19,27 +19,26 @@ What makes this environment particularly compelling is that it's measurable, dom
 ## Quickstart (100 words)

 ```bash
-# Run a single episode
-python environments/rubiks_cube_demo.py --curriculum_level 2
+pip install -r requirements.txt

-# Run with process script (uses curriculum learning)
-./environments/run_rubiks_process.sh
+cd atropos/environments/hack0

-# Train a model
-python train_rubiks_model.py
-```
-
-### Configuration
-
-Core parameters:
-```yaml
-# configs/rubiks_training.yaml
-curriculum_learning: true
-starting_level: 1
-max_level: 5
-auto_progress: true
-token_level_rewards: true
-visualization_dir: "./rubiks_visualizations/"
+(OPENAI_API_KEY="OPENAI_KEY" \
+      python rubiks_cube_environment.py process \
+      --slurm false \
+      --openai.model_name gpt-4.1-nano \
+      --env.tokenizer_name "NousResearch/DeepHermes-3-Llama-3-3B-Preview" \
+      --env.use_wandb true \
+      --env.group_size 4 \
+      --env.max_steps 15 \
+      --env.scramble_moves 5 \
+      --env.data_path_to_save_groups "rubiks_process_results.jsonl" \
+      --env.wandb_name "rubiks_cube_hackathon" \
+      --env.debug_mode true \
+      --env.use_curriculum true \
+      --env.generate_visualizations true \
+      --env.visualizations_dir "./rubiks_visualizations" \
+      --env.provide_solving_strategies true)
 ```

 ## Performance Metrics & Training (150 words)
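
The new quickstart command writes its output to `rubiks_process_results.jsonl` via `--env.data_path_to_save_groups`. One way to sanity-check a run might be a short script like the one below. It is a minimal sketch that assumes only that the file is JSON Lines (one JSON value per line), as the extension suggests. The record schema is not shown in this commit, so the helper `summarize_jsonl` (a hypothetical name, not part of the repository) just counts records and tallies whichever top-level keys it finds.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path):
    """Return (record_count, Counter of top-level keys) for a JSON Lines file.

    Hypothetical helper: it assumes nothing about the record schema beyond
    each non-blank line being a valid JSON value.
    """
    count = 0
    keys = Counter()
    with Path(path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_no}: invalid JSON") from exc
            count += 1
            if isinstance(record, dict):
                keys.update(record.keys())
    return count, keys


if __name__ == "__main__":
    # File name taken from the --env.data_path_to_save_groups flag in the diff.
    results = Path("rubiks_process_results.jsonl")
    if results.exists():
        n, seen_keys = summarize_jsonl(results)
        print(f"{n} records")
        for key, seen in seen_keys.most_common():
            print(f"  {key}: present in {seen} records")
    else:
        print(f"{results} not found; run the process command first")
```

Run from the same directory as the quickstart command (`atropos/environments/hack0` in the diff), this reports how many groups were saved and which fields they carry, which is usually enough to confirm that a run produced data before looking at the wandb dashboard.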