30 commits

Author SHA1 Message Date
Oliver Stanley
3286a68361 First version of CodeI/O reasoning data (#264)
* notebook for prepping first set of raw code files
* updated codeio processing notebook for repo-level processing
* fix for edge case in codeio scoring
* Add reformat notebook
* filtering pass
* add non-determinism filtering
* Tweak CodeIODataset & include first real data
* add basic codeio test, metadata
2025-03-05 22:34:11 +01:00
Andreas Köpf
ba9d625ef4 Merge pull request #186 from zafstojano/feat/codeio
feat(env): CodeIO
2025-02-27 12:18:13 +01:00
Zafir Stojanovski
2c566f76ea final tweaks 2025-02-27 08:38:34 +01:00
Zafir Stojanovski
4a59d13100 update timeout 2025-02-26 20:27:43 +01:00
Zafir Stojanovski
20c8392417 e2b testing 2025-02-26 20:19:52 +01:00
Zafir Stojanovski
8a0423f185 filtering 2025-02-25 22:21:26 +01:00
Zafir Stojanovski
f19498edb8 async 2025-02-24 22:07:35 +01:00
Zafir Stojanovski
9a27b80fd1 generate inputs synchronously 2025-02-24 15:58:06 +01:00
Zafir Stojanovski
0d07746a4e sampling code 2025-02-23 00:40:11 +01:00
Andreas Köpf
5c73043a1e Merge pull request #176 from olliestanley/codeio-experiments
Experiments with CodeI/O techniques for synthesising reasoning data
2025-02-22 16:24:17 +01:00
Zafir Stojanovski
e84cec26ed greedy coreset sampling 2025-02-22 16:15:14 +01:00
Zafir Stojanovski
e9ff3a1ee2 exploratory notebook 2025-02-22 00:46:33 +01:00
Oliver
94cd3c4d43 Add steps to synthesize CoTs with DeepSeekV3 2025-02-21 23:36:19 +00:00
Oliver
3297fc1bc0 Improve prompt for better LLM adherence 2025-02-21 23:00:48 +00:00
Andreas Koepf
74f590e24f more native type hints 2025-02-21 21:23:14 +01:00
Oliver
fc2c43b7d3 Prompt tweak 2025-02-21 18:34:13 +00:00
Oliver
c2b42d2717 Merge branch 'main' into codeio-experiments 2025-02-21 17:25:08 +00:00
Andreas Koepf
ff5b210106 use native types List->list, Dict->dict, Set->set, Tuple->tuple 2025-02-21 15:15:38 +01:00
Oliver
4f0812464f Prompt tweak for code preprocessing 2025-02-20 20:07:32 +00:00
Oliver
6fccd7ccb9 Add initial CodeI/O experiment notebook 2025-02-20 20:03:36 +00:00
Andreas Köpf
b23d25c92a Merge pull request #65 from zafstojano/env/group-anagrams
Group Anagrams together
2025-02-06 13:03:27 +01:00
Zafir Stojanovski
df8a7893dc add source for words_alpha.txt 2025-02-06 10:12:38 +01:00
Andreas Koepf
1e7864eb2a update gsmcross-check status 2025-02-05 21:14:19 +01:00
Andreas Koepf
3ca9a709e8 gsm_symbolic generator changes 2025-02-05 20:58:01 +01:00
Zafir Stojanovski
74471ac85c generate all english anagrams 2025-02-05 16:25:23 +01:00
Andreas Koepf
156b09951e black formatting 2025-02-03 22:57:24 +01:00
abdulhakeem
285a072105 Add EOL to test_generator_files 2025-02-01 20:41:31 -06:00
abdulhakeem
031f0b7728 Remove .DS_Store 2025-02-01 20:39:37 -06:00
Andreas Koepf
d1a35397a0 add eval demo for generated script 2025-01-29 18:28:17 +01:00
Andreas Koepf
402ea58f62 first steps for automatic generation of gsm generator functions 2025-01-29 17:55:37 +01:00