Commit graph

724 commits

Author SHA1 Message Date
hjc-puro
08b01bcb46 update readme to describe process subcommand 2025-05-02 03:52:59 -04:00
hjc-puro
78cfef9daf add process subcommand 2025-05-02 03:42:10 -04:00
hjc-puro
0f966ec3fb add import 2025-05-02 03:24:14 -04:00
Aseem Saxena
e26c491bf5 Merge branch 'main' into codeflash/optimize-grab_exact_from_heterogeneous_queue-ma3pegzo 2025-05-01 14:36:15 -07:00
hjc-puro
f28dbb19ab Merge pull request #5 from NousResearch/2025-04-30-rejection-sampling-docs
Add rejection sampling description to offline SFT docs. Also add `atropos-dpo-gen` to the pyproject.toml.
2025-05-01 01:11:51 +08:00
hjc-puro
5b6cbed6ac mention rejection sampling 2025-04-30 13:08:33 -04:00
hjc-puro
1e8029c8da Merge pull request #4 from NousResearch/2025-04-30-sft-gen-docs
[README] Add offline SFT data gen docs
2025-05-01 00:48:11 +08:00
hjc-puro
5a35ae39db Update README.md 2025-04-30 12:47:51 -04:00
hjc-puro
2821737fd8 add offline data gen docs 2025-04-30 11:08:19 -04:00
codeflash-ai[bot]
837ef6295d ⚡️ Speed up function grab_exact_from_heterogeneous_queue by 1,680%
Here’s a highly optimized version of your code for both **runtime** and **memory**, based on the profile hot spots.

- **Avoid repeated summing** for checking lengths in a growing list — we keep a running sum.
- **Avoid repeatedly copying lists/dicts** by tracking indices, marking items for removal in a single pass, and using set operations for fast membership checks.
- **Avoid creating lots of small dicts** and list extensions inside loops.
- **Combine related generator expressions** so costly operations are only done once.
- **Group similar linear scans** into one to minimize number of loops over `queue`.
- Use **pre-allocated lists and sets** where it saves time.

Here's the rewritten function (all comments preserved except where the code logic was changed).



**Key optimizations:**
- Only a *single pass* over queue for setup.
- No repeated `.append(dict)`; pass only indices around until the end.
- Use `.clear()` for lists inside dict to avoid reallocations.
- Use lists of lengths for O(1) access everywhere.
- Maintain a running sum for batch size check, not repeated `sum`.

This should **dramatically cut runtime**, especially at the hot spots from your line profiler output. If you need even more speed and the queue is huge or long-lived, consider reworking the queue's data structure itself (`deque`, heap, etc.), but at the code level this is near optimal for this algorithm.
2025-04-30 08:58:23 +00:00
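The rewritten function itself is not included in the message above. As a rough, hypothetical sketch of the techniques the bullets describe (a running sum instead of repeated `sum()` calls, index tracking instead of per-step copies, set membership for removal, a single compaction pass), under the assumption that the queue is a list of groups of items; the real function's name, signature, and queue layout may differ:

```python
# Hypothetical illustration of the running-sum / single-pass approach
# described in the commit message; NOT the actual rewritten function.
from typing import List, Optional


def grab_exact(queue: List[List[int]], batch_size: int) -> Optional[List[List[int]]]:
    """Greedily (first-fit) collect groups whose sizes sum to exactly `batch_size`.

    One linear scan; a running total replaces repeated `sum()` calls, and
    selected groups are removed from `queue` in a single compaction pass.
    """
    running = 0                      # running sum instead of sum(...) each step
    picked: List[int] = []           # collect indices only; no per-step copies
    for i, group in enumerate(queue):
        size = len(group)
        if running + size <= batch_size:
            picked.append(i)
            running += size
            if running == batch_size:
                break
    if running != batch_size:
        return None                  # no exact batch available yet; queue untouched
    picked_set = set(picked)         # O(1) membership for the removal pass
    batch = [queue[i] for i in picked]
    queue[:] = [g for i, g in enumerate(queue) if i not in picked_set]
    return batch
```

Note this greedy first-fit sketch can miss valid combinations that a subset-sum search would find; it only illustrates the profiling-driven optimizations listed, not the full selection logic.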
Teknium
ddcf3fc490 Update README.md
add API docs to getting started
2025-04-30 00:33:02 -07:00
Teknium
4b632a6c6b Update README.md
better getting started
2025-04-29 22:21:51 -07:00
Teknium
b87121120e Update README.md
fix getting started
2025-04-29 22:20:29 -07:00
Teknium
e071a002eb Update README.md
final image fix
2025-04-29 19:41:08 -07:00
Teknium
b7b2cc8a50
add banner image 2025-04-29 19:40:35 -07:00
Teknium
40f80fb056 Update README.md 2025-04-29 19:38:17 -07:00
Teknium
8edf424c7a Update README.md
fix banner image?
2025-04-29 19:37:56 -07:00
dmahan93
4029acdfb1 Update README.md 2025-04-29 17:57:22 -05:00
Teknium
87c3e918d2 Update README.md
change to mit
2025-04-29 15:09:57 -07:00
Teknium
c3f369ac9b Update README.md 2025-04-29 15:07:18 -07:00
Teknium
fe6ed20986 Merge pull request #2 from sukrucildirr/main
Update README.md
2025-04-29 13:42:17 -07:00
sukrucildirr
33d09530b0 Update README.md 2025-04-29 23:28:58 +03:00
Teknium
76dfcffd52 Update README.md 2025-04-29 13:10:48 -07:00
Dakota Nous
621d00dd80 first commit 2025-04-29 12:10:10 -07:00