reasoning-gym/tests
Rich Jones 11c9790a25 [Env] Game of Life Halting Prediction (#272)
This is a variant of the Game of Life task. Rather than testing algorithmic simulation, it tests the model's ability to reason explanatorily about the board. The idea is that a model with good explanatory reasoning will be able to see that a game will not halt without simulating it into the future.

The task presents a GoL board, and the model is asked to predict whether the board will halt (die, all cells zero) after n steps. Sometimes the board is made up of 'oscillators', isolated structures which never die. Other times it is filled with non-oscillators, structures which always die after a few steps. The model should deduce which of the two cases the presented board is.
2025-03-07 10:05:12 +01:00
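The halting question described in the commit message can be illustrated with a brute-force check. This is only a minimal sketch, not the dataset's implementation: the names `step` and `halts_within` are hypothetical, and it assumes cells outside the grid are dead (no wrap-around), which may differ from the actual environment.

```python
def step(board):
    """Advance a Game of Life board by one generation.

    Assumption: cells outside the grid count as dead (no wrap-around).
    """
    rows, cols = len(board), len(board[0])
    nxt = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count live neighbours in the surrounding 3x3 window, excluding the cell itself.
            n = sum(
                board[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            )
            # Standard rules: birth on 3 neighbours, survival on 2 or 3.
            nxt[r][c] = 1 if n == 3 or (board[r][c] and n == 2) else 0
    return nxt


def halts_within(board, max_steps):
    """Return True if every cell is dead after at most max_steps generations."""
    for _ in range(max_steps):
        if not any(any(row) for row in board):
            return True
        board = step(board)
    return not any(any(row) for row in board)


# A blinker is an oscillator, so it never dies.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
# Two adjacent cells have one neighbour each and die in a single step.
pair = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
```

Here `halts_within(blinker, 20)` is false and `halts_within(pair, 1)` is true. The task asks the model to reach the same verdict by recognising the structures rather than by running this loop.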
..
__init__.py build: Initialize reasoning_gym package structure with packaging and development setup 2025-01-23 10:50:54 +01:00
test_ab.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_advanced_geometry.py lint 2025-02-01 17:01:11 +01:00
test_aiw.py post merge lint 2025-02-02 10:04:18 +01:00
test_arc_1d.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_arc_1d_tasks.py move arc_1d from cognition into arc folder 2025-02-08 19:37:26 +01:00
test_arc_agi.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_base_conversion.py lint 2025-01-31 12:16:08 +01:00
test_basic_arithmetic.py extend format tests to allow questions that ends with question marks 2025-02-21 15:50:03 +02:00
test_bf.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_binary_alternation.py pre-commit 2025-02-21 13:39:05 +01:00
test_binary_matrix.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_bitwise_arithmetic.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_caesar_cipher.py formatting 2025-01-25 18:51:28 +01:00
test_calendar_arithmetic.py post merge formatting 2025-02-02 15:24:39 +01:00
test_chain_sum.py add ProductsDataset (multiplication tasks) 2025-02-13 17:59:02 +01:00
test_circuit_logic.py feat: Add scoring method & unit tests for circuit logic dataset 2025-02-16 22:48:51 +01:00
test_coaching.py use native types List->list, Dict->dict, Set->set, Tuple->tuple 2025-02-21 15:15:38 +01:00
test_codeio.py First version of CodeI/O reasoning data (#264) 2025-03-05 22:34:11 +01:00
test_color_cube_rotation.py fix CCC scoring 2025-02-26 12:54:40 +01:00
test_complex_arithmetic.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_composite.py reasoning-gym-server & cli tool (#154) 2025-02-19 22:41:33 +01:00
test_count_bits.py feat(env): Count Bits Curriculum (#267) 2025-03-05 22:44:04 +01:00
test_count_primes.py Fix primes representation in count_primes dataset metadata 2025-02-26 14:58:21 +02:00
test_countdown.py Fixed countdown score_answer (#265) 2025-03-05 22:30:12 +01:00
test_course_schedule.py feat(env): Course Schedule Curriculum (#266) 2025-03-05 22:42:46 +01:00
test_cryptarithm.py use correct signature for CryptarithmDataset.score_answer() method 2025-02-20 11:55:32 +01:00
test_dataset.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_dataset_common.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_decimal_arithmetic.py add to init 2025-02-20 10:51:00 +01:00
test_decimal_chain_sum.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_dice.py lint again 2025-02-11 13:00:12 +01:00
test_emoji_mystery.py added tests 2025-02-21 17:58:13 +00:00
test_family_relationships.py feat: Add mother-in-law and father-in-law relationship detection 2025-01-27 21:24:35 +01:00
test_figlet_fonts.py test: Add deterministic test for FigletFontDataset generation 2025-01-30 00:59:57 +01:00
test_fraction_simplification.py formatting 2025-01-24 10:34:07 +01:00
test_futoshiki.py more tolerant parsing of futoshiki answers 2025-02-16 14:23:40 +01:00
test_game_of_life.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_game_of_life_halting.py [Env] Game of Life Halting Prediction (#272) 2025-03-07 10:05:12 +01:00
test_gcd.py formatting 2025-01-24 10:34:07 +01:00
test_graph_color.py add graph coloring 2025-02-13 01:28:09 +01:00
test_group_anagrams.py test malformed json answer 2025-02-06 10:13:02 +01:00
test_gsm_symbolic.py updated algorithmics dataset (#269) 2025-03-05 23:32:53 +01:00
test_intermediate_integration.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_isomorphic_strings.py isomorphic strings 2025-02-07 18:23:34 +01:00
test_jugs.py fix jugs unit test 2025-02-20 23:09:46 +01:00
test_knight_swap.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_knights_knaves.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_largest_island.py largest island curriculum (#270) 2025-03-05 22:45:35 +01:00
test_lcm.py formatting 2025-01-24 10:34:07 +01:00
test_leg_counting.py formatting 2025-01-24 10:34:07 +01:00
test_letter_counting.py formatting 2025-01-24 10:34:07 +01:00
test_letter_jumble.py more dynamic scoring for jumble (#246) 2025-03-01 18:50:59 +01:00
test_list_functions.py Commit more changes 2025-02-21 00:37:29 -06:00
test_mahjong_puzzle.py feat(env): Mahjong Puzzle Curriculum (#263) 2025-03-05 22:28:02 +01:00
test_manipulate_matrix.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_maze.py add reasoning_gym.create_dataset({name}, ...) global factory function 2025-01-25 00:58:34 +01:00
test_mini_sudoku.py formatting 2025-01-24 10:34:07 +01:00
test_n_queens.py feat(env): NQueens Curriculum (#262) 2025-03-05 15:05:17 +01:00
test_needle_haystack.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_number_filtering.py formatting 2025-01-24 10:34:07 +01:00
test_number_format.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_number_sequences.py formatting, cleanup 2025-01-24 17:12:42 +01:00
test_number_sorting.py formatting 2025-01-24 10:34:07 +01:00
test_palindrome.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_palindrome_partitioning.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_polynomial_equations.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_polynomial_multiplication.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_pool_matrix.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_power_function.py updated algorithmics dataset (#269) 2025-03-05 23:32:53 +01:00
test_prime_factorization.py normalize answer and partial reward 2025-02-09 11:13:23 +01:00
test_products.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_propositional_logic.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_puzzle24.py Added puzzle24 closes #208 (#268) 2025-03-05 22:36:37 +01:00
test_quantum_lock.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_ransom_note.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_rearc.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_rectangle_count.py add rectangle count dataset 2025-02-11 13:56:27 +01:00
test_rotate_matrix.py generalize to k rotations 2025-02-08 15:14:04 +01:00
test_rotting_oranges.py rotten oranges 2025-02-20 22:33:39 +01:00
test_rubiks_cube.py seed test config 2025-02-27 10:44:28 +01:00
test_rush_hour.py fix handling of walls, add unit test 2025-02-14 23:29:17 +01:00
test_self_reference.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_sentence_reordering.py formatting 2025-02-16 16:18:39 +01:00
test_shortest_path.py shortest path curriculum (#271) 2025-03-05 22:46:10 +01:00
test_simple_equations.py fix operators configuration 2025-01-24 19:44:46 +01:00
test_simple_geometry.py lint 2025-02-01 17:01:11 +01:00
test_simple_integration.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_sokoban.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_spell_backward.py rename word_reversal.py -> word_sequence_reversal.py 2025-01-26 11:57:50 +01:00
test_spiral_matrix.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_string_insertion.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_string_manipulation.py Fix more conflict 2025-02-13 21:24:05 -06:00
test_string_splitting.py remove template from test 2025-02-14 21:58:45 +01:00
test_string_synthesis.py string synthesis 2025-02-13 16:33:28 +01:00
test_sudoku.py formatting 2025-01-24 10:34:07 +01:00
test_syllogisms.py lint 2025-02-08 17:22:55 +01:00
test_time_intervals.py Add time interval dataset class 2025-02-01 02:10:48 +01:00
test_tower_of_hanoi.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_tsumego.py fix & simplify score_answer() of TsumegoDataset 2025-02-26 19:04:30 +01:00
test_utils.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_word_ladder.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_word_sequence_reversal.py rename word_reversal.py -> word_sequence_reversal.py 2025-01-26 11:57:50 +01:00
test_word_sorting.py Minor question template & score_answer improvements (#261) 2025-03-04 21:55:09 +01:00
test_zebra.py test: Add deterministic test for ZebraDataset generation 2025-02-03 22:59:23 +01:00