Oliver Stanley
3286a68361
First version of CodeI/O reasoning data (#264)
* notebook for prepping first set of raw code files
* updated codeio processing notebook for repo-level processing
* fix for edge case in codeio scoring
* Add reformat notebook
* filtering pass
* add non-determinism filtering
* Tweak CodeIODataset & include first real data
* add basic codeio test, metadata
2025-03-05 22:34:11 +01:00
Andreas Köpf
ba9d625ef4
Merge pull request #186 from zafstojano/feat/codeio
feat(env): CodeIO
2025-02-27 12:18:13 +01:00
Zafir Stojanovski
2c566f76ea
final tweaks
2025-02-27 08:38:34 +01:00
Zafir Stojanovski
4a59d13100
update timeout
2025-02-26 20:27:43 +01:00
Zafir Stojanovski
20c8392417
e2b testing
2025-02-26 20:19:52 +01:00
Zafir Stojanovski
8a0423f185
filtering
2025-02-25 22:21:26 +01:00
Zafir Stojanovski
f19498edb8
async
2025-02-24 22:07:35 +01:00
Zafir Stojanovski
9a27b80fd1
generate inputs synchronously
2025-02-24 15:58:06 +01:00
Zafir Stojanovski
0d07746a4e
sampling code
2025-02-23 00:40:11 +01:00
Andreas Köpf
5c73043a1e
Merge pull request #176 from olliestanley/codeio-experiments
Experiments with CodeI/O techniques for synthesising reasoning data
2025-02-22 16:24:17 +01:00
Zafir Stojanovski
e84cec26ed
greedy coreset sampling
2025-02-22 16:15:14 +01:00
Zafir Stojanovski
e9ff3a1ee2
exploratory notebook
2025-02-22 00:46:33 +01:00
Oliver
94cd3c4d43
Add steps to synthesize CoTs with DeepSeekV3
2025-02-21 23:36:19 +00:00
Oliver
3297fc1bc0
Improve prompt for better LLM adherence
2025-02-21 23:00:48 +00:00
Andreas Koepf
74f590e24f
more native type hints
2025-02-21 21:23:14 +01:00
Oliver
fc2c43b7d3
Prompt tweak
2025-02-21 18:34:13 +00:00
Oliver
c2b42d2717
Merge branch 'main' into codeio-experiments
2025-02-21 17:25:08 +00:00
Andreas Koepf
ff5b210106
use native types List->list, Dict->dict, Set->set, Tuple->tuple
2025-02-21 15:15:38 +01:00
Oliver
4f0812464f
Prompt tweak for code preprocessing
2025-02-20 20:07:32 +00:00
Oliver
6fccd7ccb9
Add initial CodeI/O experiment notebook
2025-02-20 20:03:36 +00:00
Andreas Köpf
b23d25c92a
Merge pull request #65 from zafstojano/env/group-anagrams
Group Anagrams together
2025-02-06 13:03:27 +01:00
Zafir Stojanovski
df8a7893dc
add source for words_alpha.txt
2025-02-06 10:12:38 +01:00
Andreas Koepf
1e7864eb2a
update gsmcross-check status
2025-02-05 21:14:19 +01:00
Andreas Koepf
3ca9a709e8
gsm_symbolic generator changes
2025-02-05 20:58:01 +01:00
Zafir Stojanovski
74471ac85c
generate all english anagrams
2025-02-05 16:25:23 +01:00
Andreas Koepf
156b09951e
black formatting
2025-02-03 22:57:24 +01:00
abdulhakeem
285a072105
Add EOL to test_generator_files
2025-02-01 20:41:31 -06:00
abdulhakeem
031f0b7728
Remove .DS_Store
2025-02-01 20:39:37 -06:00
Andreas Koepf
d1a35397a0
add eval demo for generated script
2025-01-29 18:28:17 +01:00
Andreas Koepf
402ea58f62
first steps for automatic generation of gsm generator functions
2025-01-29 17:55:37 +01:00