reasoning-gym/notebooks/codeio/PreprocessCode.ipynb
2025-02-21 23:00:48 +00:00

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## CodeI/O\n",
"\n",
"Original paper (DeepSeek): https://arxiv.org/pdf/2502.07316\n",
"\n",
"The approach begins by obtaining high quality raw code data and preprocessing it by prompting an LLM. The output of this preprocessing, for each raw code file used, should be:\n",
"\n",
"- cleaned reference code, with a main entrypoint function\n",
"- a query, converting the reference code into a question (along the lines of \"given [function parameters...] how can we obtain [desired outputs...]\")\n",
"- a natural language description of all inputs (function parameters) and outputs (function return values)\n",
"- an input generator, which can generate a dictionary of valid inputs for the function\n",
"\n",
"This notebook seeks to experiment with prompting an LLM to this end, as a starting point. The raw code data is from this GitHub repository that the DeepSeek paper mentions as one of their raw code sources: https://github.com/TheAlgorithms/Python"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"from pathlib import Path\n",
"from dotenv import load_dotenv\n",
"load_dotenv()\n",
"raw_files = list(Path(\"raw_files/\").iterdir())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the below prompt is built for DeepSeekV3. It may not work with other LLMs."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"prompt_template = \"\"\"\n",
"You are tasked with preprocessing a raw file of Python code into a standard format. The format is made up of several components. Here is a very simple example of a raw code file:\n",
"\n",
"def kg_to_pounds(weights):\n",
" return [w * 2.20462 for w in weights]\n",
"\n",
"def filter_weekly(original_measurements, days):\n",
" return [m for i, m in enumerate(original_measurements) if i % 7 == 0]\n",
"\n",
"def main(kgs, days):\n",
" lbs = kg_to_pounds(kgs)\n",
"\n",
" for measurement in filter_weekly(lbs, days):\n",
" print(measurement)\n",
"\n",
"1. Cleaned reference code, with a main entrypoint function that takes all required arguments as parameters and returns all outputs.\n",
"\n",
"The name of the main entrypoint function should be `main`. The parameters should be clearly named but do not require type hints. The function should return a dict mapping output names to values. The function should contain all the necessary code to perform the functionality, without splitting into several functions. The function should not print or otherwise output anything; results should be returned as part of the result dict.\n",
"\n",
"Example function signature: `def main(weights_kg, days):`\n",
"\n",
"2. A query, defined as natural language description of the question the function answers.\n",
"\n",
"Example query: \"You are given two lists of integers, `weights_kg` and `days`. The unit of `weights_kg` is kilograms. `days` refers to the number of days passed, starting from zero. Your task is to convert the integers to pounds and filter to only one weight measurement every 7 days. Return the list of integers in pounds.\"\n",
"\n",
"The query should be as detailed as the code requires to be fully explained. It should be clear what the function does, what the inputs are, and what the outputs are.\n",
"\n",
"3. A natural language description of all inputs (function parameters) and outputs (return values) of the function.\n",
"\n",
"Example description:\n",
"\n",
"Input:\n",
" weights_kg (list of int): List of weight values in kilograms.\n",
" days (list of int): List of integers representing the number of days passed, starting from zero.\n",
"\n",
"Output:\n",
" return (dict): A dictionary with one key:\n",
" - weights_lb (list of int): List of filtered weight values in pounds.\n",
"\n",
"4. Python 3.11 code for an input generator, which randomly generates valid sets of inputs for the functions.\n",
"\n",
"The input generator should return a dict mapping parameter names to values. The values should be randomly generated, but should be valid inputs for the function.\n",
"\n",
"Example input generator:\n",
"\n",
"def input_generator():\n",
" weights = [np.random.uniform(0, 100) for _ in range(40)]\n",
" days = list(range(40))\n",
" return {{\"weights_kg\": weights, \"days\": days}}\n",
"\n",
"Using the guidelines and example above, preprocess the following raw code file into the standard format:\n",
"\n",
"{0}\n",
"\n",
"Output the components (reference code, query, description, input generator) in order. Separate each component with a line of dashes (---). Avoid code blocks and do not output any Markdown formatting. Respond only with the four components, no prefix or additional text.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import time\n",
"from openai import OpenAI\n",
"from openai.types.chat import ChatCompletion, ChatCompletionMessageParam\n",
"from typing import Any, Iterable\n",
"\n",
"def llm_generate(\n",
" client: OpenAI,\n",
" messages: Iterable[ChatCompletionMessageParam],\n",
" sampling_params: dict[str, Any],\n",
") -> ChatCompletion:\n",
" max_retry = 3\n",
" for trial in range(max_retry):\n",
" try:\n",
" return client.chat.completions.create(\n",
" messages=messages,\n",
" **sampling_params,\n",
" )\n",
" except Exception as e:\n",
" print(\"failure response:\", e)\n",
" time.sleep(trial * trial) # quadratic backoff\n",
" if trial == max_retry - 1:\n",
" raise\n",
"\n",
"open_router_client = OpenAI(\n",
" base_url=\"https://openrouter.ai/api/v1\",\n",
" api_key=os.getenv(\"OPENROUTER_API_KEY\"),\n",
" timeout=90.0,\n",
")\n",
"\n",
"sampling_params = {\n",
" \"model\": \"deepseek/deepseek-chat:free\",\n",
" \"max_tokens\": 8192,\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"raw_files/bitmask.py\n",
"def main(task_performed, total_tasks):\n",
" dp = [[-1 for _ in range(total_tasks + 1)] for _ in range(2 ** len(task_performed))]\n",
" task = defaultdict(list)\n",
" final_mask = (1 << len(task_performed)) - 1\n",
"\n",
" def count_ways_until(mask, task_no):\n",
" if mask == final_mask:\n",
" return 1\n",
" if task_no > total_tasks:\n",
" return 0\n",
" if dp[mask][task_no] != -1:\n",
" return dp[mask][task_no]\n",
"\n",
" total_ways_util = count_ways_until(mask, task_no + 1)\n",
" for p in task[task_no]:\n",
" if mask & (1 << p):\n",
" continue\n",
" total_ways_util += count_ways_until(mask | (1 << p), task_no + 1)\n",
" \n",
" dp[mask][task_no] = total_ways_util\n",
" return dp[mask][task_no]\n",
"\n",
" for i in range(len(task_performed)):\n",
" for j in task_performed[i]:\n",
" task[j].append(i)\n",
"\n",
" total_ways = count_ways_until(0, 1)\n",
" return {\"total_ways\": total_ways}\n",
"\n",
"---\n",
"\n",
"You are given a list `task_performed` and an integer `total_tasks`. `task_performed` represents the tasks that can be performed by each person, where each sublist corresponds to the tasks a person can do. `total_tasks` is the total number of tasks (N). Your task is to calculate the total number of ways to distribute the tasks among the persons. Each person can do only one task, and a task can be done by only one person. Return the total number of ways.\n",
"\n",
"---\n",
"\n",
"Input:\n",
" task_performed (list of list of int): List of tasks that each person can perform. Each sublist contains the tasks a person can do.\n",
" total_tasks (int): The total number of tasks.\n",
"\n",
"Output:\n",
" return (dict): A dictionary with one key:\n",
" - total_ways (int): The total number of ways to distribute the tasks.\n",
"\n",
"---\n",
"\n",
"def input_generator():\n",
" import random\n",
" M = random.randint(2, 5)\n",
" N = random.randint(2, 5)\n",
" task_performed = [random.sample(range(1, N + 1), random.randint(1, N)) for _ in range(M)]\n",
" return {\"task_performed\": task_performed, \"total_tasks\": N}\n"
]
}
],
"source": [
"raw_file = random.choice(raw_files)\n",
"\n",
"print(raw_file)\n",
"\n",
"raw_code = raw_file.read_text()\n",
"\n",
"prompt = prompt_template.format(raw_code)\n",
"\n",
"messages = [\n",
" {\"role\": \"user\", \"content\": prompt},\n",
"]\n",
"\n",
"response = llm_generate(open_router_client, messages, sampling_params)\n",
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"code, query, parameters, generator = response.choices[0].message.content.split(\"\\n---\\n\")"
]
},
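{
"cell_type": "markdown",
"metadata": {},
"source": [
"The split above raises an unpacking error if the model returns more or fewer than four components, and it does not check that the two code components are valid Python. The cell below is a minimal sketch of a stricter parser, not part of the original pipeline. `parse_components` and `COMPONENT_NAMES` are hypothetical names introduced here for illustration, and the sketch assumes the model separates components with a line containing only dashes, as the prompt requests. `ast.parse` only checks syntax; it does not execute the generated code, so a missing import in the model output would still pass."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import ast\n",
"import re\n",
"\n",
"# Hypothetical names, introduced only for this sketch.\n",
"COMPONENT_NAMES = (\"code\", \"query\", \"parameters\", \"generator\")\n",
"\n",
"\n",
"def parse_components(text):\n",
"    # Split on lines that contain only dashes, ignoring blank lines around them.\n",
"    parts = [part.strip() for part in re.split(r\"\\n\\s*-{3,}\\s*\\n\", text.strip())]\n",
"    if len(parts) != len(COMPONENT_NAMES):\n",
"        raise ValueError(f\"expected {len(COMPONENT_NAMES)} components, got {len(parts)}\")\n",
"    components = dict(zip(COMPONENT_NAMES, parts))\n",
"    # Syntax check only: the generated code is parsed, never executed.\n",
"    for name in (\"code\", \"generator\"):\n",
"        ast.parse(components[name])\n",
"    return components\n",
"\n",
"\n",
"# Assumes `response` from the earlier cell is still in scope.\n",
"components = parse_components(response.choices[0].message.content)\n",
"print(list(components))"
]
},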
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}