update to tech report version (#10)

* feat(run_eval): add checkpoint resume functionality and update example documentation;
- update new bootcamp benchmark dataset

* refactor(data_pipeline): optimize data generation pipeline; add multiple preset configurations for data generation

* docs: update bootcamp list and add new scripts

- Update Fulllist_InternBootcamp.md with new bootcamps and categories
- Add new scripts to .gitignore:
  - examples/pipelines/filter_autogen_configs.py
  - examples/pipelines/quickgen_data_configs_from_eval_meta.py
- Update dependencies in setup.py:
  - Add scipy and scikit-learn

* refactor(internbootcamp): update bootcamp modules and improve error handling

- Update import statements in __init__.py files
- Add timestamp to target directory name in verl_data_preprocess.py
- Improve error handling and scoring logic in bootcamp_judger.py
- Remove unnecessary comments and update puzzle descriptions in multiple files
This commit is contained in:
Yongkang Chen 2025-08-28 12:39:47 +08:00 committed by GitHub
parent 125a7818e0
commit a8249acc18
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2952 changed files with 105460 additions and 17649 deletions

View file

@ -1,6 +1,4 @@
"""#
### 谜题描述
"""### 谜题描述
Polycarpus got an internship in one well-known social network. His test task is to count the number of unique users who have visited a social network during the day. Polycarpus was provided with information on all user requests for this time period. For each query, we know its time... and nothing else, because Polycarpus has already accidentally removed the user IDs corresponding to the requests from the database. Thus, it is now impossible to determine whether any two requests are made by the same person or by different people.
But wait, something is still known, because that day a record was achieved M simultaneous users online! In addition, Polycarpus believes that if a user made a request at second s, then he was online for T seconds after that, that is, at seconds s, s + 1, s + 2, ..., s + T - 1. So, the user's time online can be calculated as the union of time intervals of the form [s, s + T - 1] over all times s of requests from him.