diff --git a/GALLERY.md b/GALLERY.md
index eeec0119..9e56a205 100644
--- a/GALLERY.md
+++ b/GALLERY.md
@@ -18,7 +18,6 @@ This gallery shows examples from all available datasets using their default conf
 - [fraction_simplification](#fraction_simplification)
 - [game_of_life](#game_of_life)
 - [gcd](#gcd)
-- [gsm_symbolic](#gsm_symbolic)
 - [intermediate_integration](#intermediate_integration)
 - [lcm](#lcm)
 - [leg_counting](#leg_counting)
@@ -843,32 +842,6 @@ Metadata: {'numbers': [297, 30], 'result': 3}
 
 ````
 
-### gsm_symbolic
-Default configuration:
-```python
-seed = 42
-size = 500
-```
-
-Example tasks:
-````
-Example 1:
-Question: There are currently 16 orange balls, 12 yellow balls, and 44 blue balls in the shop. orange balls cost $13, yellow balls cost $10 and blue balls cost $6. How much will the shop have received after all the balls are sold?
-Answer: 592
-Metadata: {'difficulty': 1.0, 'answer_value': 592, 'answer_cot': 'For the orange balls, 16 balls * $13/ball = $208.\nFor the yellow balls, 12 balls * $10/ball = $120.\nFor the blue balls, 44 balls * $6/ball = $264.\nFor all balls, $208 + $120 + $264 = $592.\n#### 592', 'variables': {'store': 'shop', 'colors': ['orange', 'yellow', 'blue'], 'quantities': [16, 12, 44], 'prices': [13, 10, 6], 'currency': '$', 'subtotals': [208, 120, 264], 'total': 592}}
-
-Example 2:
-Question: A plumber works for 3 weeks every month and for 4 days every week. If he gets paid £150 every day, how much does he earn if he works for a year?
-Answer: 21600
-Metadata: {'difficulty': 1.0, 'answer_value': 21600, 'answer_cot': 'The plumber works for 4 days every week and works for 3 weeks every month so he works for 4 days/week * 3 weeks/month = 12 days/month\nIf he earns £150 every day he then earns £150/day * 12 days/month = £1800/month\nA year is equal to 12 months so every year he earns £1800/month * 12 months/year = £21600\n#### 21600', 'variables': {'occupation': 'plumber', 'weeks_per_month': 3, 'days_per_week': 4, 'pay_per_day': 150, 'currency': '£', 'days_per_month': 12, 'monthly_pay': 1800}}
-
-Example 3:
-Question: Ava sliced an mango into 33 pieces. She ate 5 slice, her cousin ate 7 more than her, and her brother ate 4 more than her cousin. How many slices of mango did they all eat?
-Answer: 33
-Metadata: {'difficulty': 1.0, 'answer_value': 33, 'answer_cot': 'Her cousin ate 5 + 7 = 12 slices.\nHer brother ate 12 + 4 = 16 slices.\nThey ate a total of 5 + 12 + 16 = 33 slices.\n#### 33', 'variables': {'name': 'Ava', 'fruit': 'mango', 'total_slices': 33, 'first_person_slices': 5, 'second_person_extra': 7, 'third_person_extra': 4, 'sibling1': 'cousin', 'sibling2': 'brother', 'total_eaten': 33}}
-
-````
-
 ### intermediate_integration
 Generates intermediate integration problem - either
     by substitution or by parts
@@ -2061,12 +2034,12 @@ Example tasks:
 ````
 Example 1:
 Question: Transform the word ladder 'HAND' to 'GLEE' by changing one letter at a time.
-Answer: HAND,RAND,REND,FEND,FEED,FLED,FLEE,GLEE
+Answer: HAND,RAND,REND,REED,FEED,FLED,FLEE,GLEE
 Metadata: {'start_word': 'HAND', 'end_word': 'GLEE', 'word_length': 4, 'chain_length': 8}
 
 Example 2:
 Question: Transform the word ladder 'JAZZ' to 'DORM' by changing one letter at a time.
-Answer: JAZZ,JIZZ,FIZZ,FUZZ,FUZE,FAZE,FARE,FARM,FORM,DORM
+Answer: JAZZ,JIZZ,FIZZ,FUZZ,FUZE,FAZE,FARE,FORE,FORM,DORM
 Metadata: {'start_word': 'JAZZ', 'end_word': 'DORM', 'word_length': 4, 'chain_length': 10}
 
 Example 3:
@@ -2157,21 +2130,21 @@ Example tasks:
 ````
 Example 1:
 Question: This is a logic puzzle. There are 4 houses (numbered 1 on the left, 4 on the right), from the perspective of someone standing across the street from them. Each has a different person in them. They have different characteristics:
- - Each person has a unique name: arnold, eric, alice, peter
- - People use different phone models: samsung galaxy s21, iphone 13, google pixel 6, oneplus 9
- - Each person has a favorite drink: tea, water, milk, coffee
- - The people keep different animals: fish, cat, horse, bird
+ - Each person has a unique name: alice, eric, arnold, peter
+ - People use different phone models: samsung galaxy s21, oneplus 9, google pixel 6, iphone 13
+ - Each person has a favorite drink: coffee, water, tea, milk
+ - The people keep different animals: horse, cat, fish, bird
 
-1. The one who only drinks water is Peter.
-2. The cat lover is in the second house.
-3. The coffee drinker is the fish enthusiast.
-4. The person who uses a OnePlus 9 is the tea drinker.
-5. Peter is directly left of Arnold.
-6. The person who keeps horses is in the fourth house.
-7. The person who keeps horses is Alice.
-8. Alice is the person who uses a Google Pixel 6.
-9. The person who uses a Samsung Galaxy S21 is the one who only drinks water.
-10. Peter is in the first house.
+1. Peter is the one who only drinks water.
+9. The fish enthusiast is directly left of the person who keeps horses.
+7. The bird keeper is Peter.
+6. Alice is in the fourth house.
+2. The tea drinker is the person who uses a OnePlus 9.
+3. The person who uses an iPhone 13 is the fish enthusiast.
+5. The person who uses an iPhone 13 is directly left of the person who uses a Google Pixel 6.
+8. The coffee drinker is the person who uses an iPhone 13.
+4. The tea drinker and the person who uses an iPhone 13 are next to each other.
+10. Eric and the person who uses a Google Pixel 6 are next to each other.
 
 What is Name of the person who lives in House 1?
 Answer: peter
@@ -2179,20 +2152,21 @@ Metadata: {'num_people': 4, 'num_characteristics': 4}
 
 Example 2:
 Question: This is a logic puzzle. There are 4 houses (numbered 1 on the left, 4 on the right), from the perspective of someone standing across the street from them. Each has a different person in them. They have different characteristics:
- - Each person has a unique name: alice, eric, arnold, peter
- - Each mother is accompanied by their child: fred, bella, samantha, meredith
- - The people are of nationalities: norwegian, swede, brit, dane
- - Everyone has something different for lunch: spaghetti, grilled cheese, pizza, stew
+ - Each person has a unique name: arnold, peter, eric, alice
+ - Each mother is accompanied by their child: meredith, samantha, bella, fred
+ - The people are of nationalities: dane, norwegian, brit, swede
+ - Everyone has something different for lunch: stew, spaghetti, pizza, grilled cheese
 
-1. The person who loves the stew is Eric.
-2. The person's child is named Fred is directly left of the person who loves the spaghetti eater.
-3. The person's child is named Samantha is Peter.
-4. The person who is a pizza lover is the person's child is named Meredith.
-5. The person's child is named Meredith is directly left of Eric.
-6. The British person is the person's child is named Meredith.
+1. The Dane is in the second house.
+8. The Norwegian is the person who loves the spaghetti eater.
+5. The person who is a pizza lover is the person's child is named Meredith.
+2. Peter is directly left of the person who loves eating grilled cheese.
+3. The British person is Alice.
+9. The Swedish person is in the fourth house.
+6. The person who is a pizza lover and Eric are next to each other.
 7. The person's child is named Samantha is in the third house.
-8. Arnold is the Swedish person.
-9. The person's child is named Samantha is the Norwegian.
+10. The person who is a pizza lover is in the first house.
+4. Eric is the person's child is named Fred.
 
 What is Name of the person who lives in House 1?
 Answer: alice
@@ -2200,21 +2174,21 @@ Metadata: {'num_people': 4, 'num_characteristics': 4}
 
 Example 3:
 Question: This is a logic puzzle. There are 4 houses (numbered 1 on the left, 4 on the right), from the perspective of someone standing across the street from them. Each has a different person in them. They have different characteristics:
- - Each person has a unique name: alice, peter, eric, arnold
- - Everyone has a different favorite cigar: prince, dunhill, pall mall, blue master
- - Everyone has something different for lunch: stew, pizza, spaghetti, grilled cheese
- - Each person has a favorite color: green, red, yellow, white
+ - Each person has a unique name: arnold, eric, peter, alice
+ - Everyone has a different favorite cigar: blue master, pall mall, dunhill, prince
+ - Everyone has something different for lunch: spaghetti, pizza, stew, grilled cheese
+ - Each person has a favorite color: yellow, white, red, green
 
-1. Eric is the person who loves white.
-2. Alice and the Dunhill smoker are next to each other.
+7. The person whose favorite color is green is the person who loves the spaghetti eater.
+5. The Dunhill smoker is the person who loves the stew.
+4. The person who loves yellow is the Dunhill smoker.
 3. The person who loves the stew is Arnold.
-4. The person whose favorite color is green is directly left of the person who loves the stew.
-5. The person who smokes Blue Master is Alice.
-6. Alice is the person who loves the spaghetti eater.
-7. The person partial to Pall Mall is directly left of Eric.
-8. The Prince smoker is in the fourth house.
-9. The person who loves yellow is in the second house.
-10. Arnold and the person who loves eating grilled cheese are next to each other.
+1. The person whose favorite color is green is Alice.
+2. The person partial to Pall Mall is Peter.
+9. The person who smokes Blue Master is in the first house.
+10. Peter is directly left of the person who loves white.
+8. The person who loves eating grilled cheese is the person whose favorite color is red.
+6. The person partial to Pall Mall is in the third house.
 
 What is Name of the person who lives in House 1?
 Answer: alice
diff --git a/reasoning_gym/arithmetic/__init__.py b/reasoning_gym/arithmetic/__init__.py
index 9a6d775a..6d615efa 100644
--- a/reasoning_gym/arithmetic/__init__.py
+++ b/reasoning_gym/arithmetic/__init__.py
@@ -12,7 +12,8 @@ from .calendar_arithmetic import CalendarArithmeticConfig, CalendarArithmeticDat
 from .chain_sum import ChainSum, ChainSumConfig
 from .fraction_simplification import FractionSimplificationConfig, FractionSimplificationDataset
 from .gcd import GCDConfig, GCDDataset
-from .gsm_symbolic.gsm_symbolic_datasets import GSMSymbolicDataset, GSMSymbolicDatasetConfig
+
+# from .gsm_symbolic.gsm_symbolic_datasets import GSMSymbolicDataset, GSMSymbolicDatasetConfig
 from .lcm import LCMConfig, LCMDataset
 from .leg_counting import LegCountingConfig, LegCountingDataset
 from .prime_factorization import PrimeFactorizationConfig, PrimeFactorizationDataset
@@ -38,8 +39,8 @@ __all__ = [
     "LegCountingDataset",
     "PrimeFactorizationConfig",
     "PrimeFactorizationDataset",
-    "GSMSymbolicDatasetConfig",
-    "GSMSymbolicDataset",
+    # "GSMSymbolicDatasetConfig",
+    # "GSMSymbolicDataset",
     "TimeIntervalsConfig",
     "TimeIntervalsDataset",
 ]
diff --git a/reasoning_gym/logic/contrib/logic_puzzle/generate.py b/reasoning_gym/logic/contrib/logic_puzzle/generate.py
index 28f679cd..ab365235 100644
--- a/reasoning_gym/logic/contrib/logic_puzzle/generate.py
+++ b/reasoning_gym/logic/contrib/logic_puzzle/generate.py
@@ -4,14 +4,10 @@ puzzle_generator.py
 This is a driver script that can be used to generate new zebra puzzles.
 """
 
-import json
-import pickle
-import sys
-
 # from tqdm import tqdm
 from itertools import product
-from random import choices, randint, sample, seed, shuffle
-from typing import Dict, Iterable, List, Optional, Set, Tuple, Type
+from random import Random
+from typing import Dict, Iterable, List, Set, Tuple, Type
 
 from tabulate import tabulate
 
@@ -72,7 +68,7 @@ def generate_consecutive_beside(puzzle: Puzzle, solution: Dict[Literal, int]) ->
         for pair in pairs:
             # consecutive is just a more informative version of beside, but they have same structure
             # because of this, don't include both
-            if randint(0, 1) == 0:
+            if puzzle.rng.randint(0, 1) == 0:
                 clues.add(consecutive(pair[0], pair[1], puzzle.houses))
             else:
                 clues.add(beside(pair[0], pair[1], puzzle.houses))
@@ -169,7 +165,7 @@ def try_to_remove(puzzle: Puzzle, clues: Set[Clue], n: int, must_have=set()) ->
         return weights.get(type(clue), 1)
 
     weights = [weight(clue) for clue in clues]
-    candidates: Set[Clue] = set(choices(list(clues), weights, k=n))
+    candidates: Set[Clue] = set(puzzle.rng.choices(list(clues), weights, k=n))
     candidates = candidates - must_have
     clues = clues.difference(candidates)
     if has_unique_solution(puzzle, clues):
@@ -191,7 +187,7 @@ def reduce_individually(
     and added to `removed`. If no clues can be removed, we return the original two sets.
     """
 
-    candidates = set(sample(list(clues), len(clues)))
+    candidates = set(puzzle.rng.sample(list(clues), len(clues)))
     for clue in candidates:
         if clue not in must_have:
             clues.remove(clue)
@@ -239,7 +235,7 @@ def reduce_clues(puzzle: Puzzle, clues: Set[Clue], must_have=set()) -> Tuple[Set
     """
 
     # this is a stupid way to shuffle the set of clues without modifying it
-    minimal_clues = set(sample(list(clues), k=len(clues)))
+    minimal_clues = set(puzzle.rng.sample(list(clues), k=len(clues)))
     while True:
         # print(f"There are {len(minimal_clues)} clues in ba sing se")
 
@@ -278,7 +274,7 @@ def reduce_clues(puzzle: Puzzle, clues: Set[Clue], must_have=set()) -> Tuple[Set
     return minimal_clues, removed_clues
 
 
-def question_generation(col_name, table_data):
+def question_generation(rng: Random, col_name, table_data):
     values_by_cols = {}
     for row in table_data:
         for idx, value in enumerate(row):
@@ -294,7 +290,7 @@ def question_generation(col_name, table_data):
                 continue
             question = f"What is {col} of the person who lives in House {row[0]}?"
             options = values_by_cols[col][:]
-            shuffle(options)
+            rng.shuffle(options)
             truth = row[cid]
             assert truth in options
             questions_data.append(
@@ -306,18 +302,18 @@ def question_generation(col_name, table_data):
     return questions_data
 
 
-def generate_solution_dict(selected_elements: List[Literal], n: int) -> Dict[Literal, int]:
+def generate_solution_dict(rng: Random, selected_elements: List[Literal], n: int) -> Dict[Literal, int]:
     solution = {}
     house_ids = list(range(1, n + 1))
     for element in selected_elements:
-        shuffle(house_ids)
-        attributes: List[element] = list(element.__members__.values())
+        rng.shuffle(house_ids)
+        attributes: List[Literal] = list(element.__members__.values())
         for i in range(n):
             solution[attributes[i]] = house_ids[i]
     return solution
 
 
-def wrap_up_dict(random_elements, solution, puzzle, reduced, extra_clues, context, K, M):
+def wrap_up_dict(rng: Random, random_elements, solution, puzzle, reduced, extra_clues, context, K, M):
     col_names = [e.__name__ for e in random_elements]
     house_data = {}
     for item, house in solution.items():
@@ -337,7 +333,7 @@ def wrap_up_dict(random_elements, solution, puzzle, reduced, extra_clues, contex
     table = tabulate(table_data, headers=col_names, tablefmt="grid")
 
     ## Generate multiple-choice questions
-    q_data = question_generation(col_names, table_data)
+    q_data = question_generation(rng, col_names, table_data)
     all_in_one = {}
     all_in_one["size"] = f"{K}*{M}"
     all_in_one["puzzle_context"] = context
@@ -358,7 +354,7 @@ def check_correctness(p):
     return set(solution_set) == set(_first_solution)
 
 
-def generate_puzzle(K=2, M=3, mode="train"):
+def generate_puzzle(rng: Random, K=2, M=3):
     elements = [Color, Nationality, Animal, Drink, Cigar, Food, Flower, PhoneModel, Children, Smoothie]
     clue_types = [
         generate_found_at,
@@ -366,12 +362,12 @@ def generate_puzzle(K=2, M=3, mode="train"):
         generate_consecutive_beside,
     ]
 
-    shuffle(elements)
+    rng.shuffle(elements)
     random_elements = [Name] + elements[: M - 1]
-    solution = generate_solution_dict(random_elements, K)
+    solution = generate_solution_dict(rng, random_elements, K)
 
     # set up the puzzle with default constraints
-    puzzle = Puzzle(element_types=random_elements, elements=solution.keys(), n_houses=K).set_constraints()
+    puzzle = Puzzle(rng=rng, element_types=random_elements, elements=solution.keys(), n_houses=K).set_constraints()
     puzzle.solution = solution
     context = str(puzzle)
 
@@ -383,68 +379,11 @@ def generate_puzzle(K=2, M=3, mode="train"):
 
     reduced, _ = reduce_clues(puzzle, clues)
     extra_clues = clues - reduced
-    extra_clues = set(sample(list(extra_clues), min(len(extra_clues), 30)))
+    extra_clues = set(rng.sample(list(extra_clues), min(len(extra_clues), 30)))
     for clue in reduced:
         puzzle.add_clue(clue)
 
     assert has_unique_solution(puzzle, puzzle.clues, remove_after=False)
     assert check_correctness(puzzle)
-    all_in_one = wrap_up_dict(random_elements, solution, puzzle, reduced, extra_clues, context, K, M)
+    all_in_one = wrap_up_dict(rng, random_elements, solution, puzzle, reduced, extra_clues, context, K, M)
     return all_in_one, puzzle
-
-
-# def main():
-#     mode = sys.argv[1]
-#     print(f"mode={mode}")
-#     if mode.startswith("train"):
-#         seed(1337)
-#         N = 30
-#         if mode.endswith("_large"):
-#             N = 150
-#         if mode.endswith("_xl"):
-#             N = 1000
-#         Ks = [2,3,4]
-#         Ms = [2,3,4]
-
-#         if mode.endswith("_xxl"):
-#             N = 500
-#             Ks = [2,3,4,5,6]
-#             Ms = [2,3,4,5,6]
-
-#     elif mode == "dev" or mode.startswith("test_"):
-#         seed(42+len(mode))
-#         N = 10
-#         Ks = [2,3,4,5]
-#         Ms = [2,3,4,5]
-#         if mode.startswith("test_id_xl"):
-#             Ks = [2,3,4,5,6]
-#             Ms = [2,3,4,5,6]
-#         if mode.startswith("test_id_xxl"):
-#             Ks = [2,3,4,5,6,7]
-#             Ms = [2,3,4,5,6,7]
-#         if mode.endswith("_50"):
-#             N = 50
-
-#     instances = []
-#     puzzle_objs = []
-#     for K, M, idx in tqdm(list(product(Ks, Ms, list(range(N))))):
-#         if mode.startswith("test_id_xl"):
-#             if K != 6 and M != 6:
-#                 continue
-#         if mode.startswith("test_id_xxl"):
-#             if K != 7 and M != 7:
-#                 continue
-#         instance, puzzle = generate_puzzle(K, M, mode)
-#         instance["idx"] = f"lgp-{mode}-{K}x{M}-{idx}"
-#         instances.append(instance)
-#         puzzle_objs.append({"idx": instance["idx"], "puzzle": puzzle})
-
-#     with open(f"logic_grid_puzzles.{mode}.pkl", "wb") as f:
-#         pickle.dump(puzzle_objs, f)
-
-#     with open(f"logic_grid_puzzles.{mode}.json", "w") as f:
-#         json.dump(instances, f, indent=2)
-
-
-if __name__ == "__main__":
-    main()
diff --git a/reasoning_gym/logic/contrib/logic_puzzle/graph/reasoning_path.py b/reasoning_gym/logic/contrib/logic_puzzle/graph/reasoning_path.py
index 18a286c9..cc63a80e 100644
--- a/reasoning_gym/logic/contrib/logic_puzzle/graph/reasoning_path.py
+++ b/reasoning_gym/logic/contrib/logic_puzzle/graph/reasoning_path.py
@@ -26,7 +26,6 @@ def logic_grid_puzzle(inputfile, ground_truth, size, lower_part, higher_part):
     reasoning_result = []
     answers = json.load(open(ground_truth, "r"))
     puzzles = pickle.load(open(inputfile, "rb"))
-    cell_difficulty = {}
     mode = inputfile[inputfile.find("puzzles.") + 8 : inputfile.find(".pkl")]
     print("Number of puzzles", len(answers))
     assert len(answers) == len(puzzles)
diff --git a/reasoning_gym/logic/contrib/logic_puzzle/puzzle.py b/reasoning_gym/logic/contrib/logic_puzzle/puzzle.py
index a72c2bf7..a6e5f018 100644
--- a/reasoning_gym/logic/contrib/logic_puzzle/puzzle.py
+++ b/reasoning_gym/logic/contrib/logic_puzzle/puzzle.py
@@ -6,8 +6,8 @@ Solve the Einstein puzzle using Raymond Hettinger's approach.
 from __future__ import annotations
 
 from contextlib import contextmanager
-from random import shuffle
-from typing import Dict, Generator, Iterable, List, Set, Tuple, Type
+from random import Random
+from typing import Generator, Iterable, List, Set, Tuple, Type
 
 from reasoning_gym.logic.contrib.logic_puzzle.clues import (
     Clue,
@@ -58,6 +58,7 @@ class Puzzle:
     def __init__(
         self,
         *,
+        rng: Random,
         element_types: Iterable[Type[Literal]],
         elements: Iterable[Literal] = None,
         n_houses: int = 5,
@@ -73,6 +74,7 @@ class Puzzle:
         ones.
         """
 
+        self.rng = rng
         self.element_classes = list(element_types)
         if elements is None:
             self.literals = [el for el_class in self.element_classes for el in el_class]
@@ -145,15 +147,17 @@ class Puzzle:
         s += f"They have different characteristics:\n"
         for element_type in self.element_classes:
             literals = [l for l in self.literals if isinstance(l, element_type)]
-            shuffle(literals)
+            self.rng.shuffle(literals)
             desc = element_type.description()
             idx = desc.index(":") if ":" in desc else None
             desc = desc[:idx]
             s += f" - {desc}: " + ", ".join(e.name.replace("_", " ") for e in literals) + "\n"
 
         s += "\n"
-        for i, clue in enumerate(self.clues):
-            s += f"{i + 1}. {clue}\n"
+
+        clues = sorted(f"{i + 1}. {clue}\n" for i, clue in enumerate(self.clues))
+        self.rng.shuffle(clues)
+        s += "".join(clues)
 
         return s
 
@@ -196,7 +200,7 @@ if __name__ == "__main__":
     literals: List[Literal] = [el for group in enum_classes for el in group]
 
     # set up the puzzle with constraints and clues
-    puzzle = Puzzle(element_types=[Color, Nationality, Drink, Cigar, Animal])
+    puzzle = Puzzle(rng=Random(), element_types=[Color, Nationality, Drink, Cigar, Animal])
 
     puzzle = (
         puzzle.set_constraints()
@@ -246,7 +250,7 @@ if __name__ == "__main__":
     literals = [el for group in enum_classes for el in group]
 
     # set up the puzzle with constraints and clues
-    puzzle = Puzzle(element_types=[Mother, Children, Flower, Food])
+    puzzle = Puzzle(rng=Random(), element_types=[Mother, Children, Flower, Food])
 
     puzzle = (
         puzzle.set_constraints()
diff --git a/reasoning_gym/logic/zebra_puzzles.py b/reasoning_gym/logic/zebra_puzzles.py
index 3ba177a7..9992f25e 100644
--- a/reasoning_gym/logic/zebra_puzzles.py
+++ b/reasoning_gym/logic/zebra_puzzles.py
@@ -1,6 +1,6 @@
 from dataclasses import dataclass
-from random import Random, seed
-from typing import Dict, List, Optional, Tuple
+from random import Random
+from typing import Dict, Optional
 
 from ..factory import ProceduralDataset, register_dataset
 from .contrib.logic_puzzle.generate import generate_puzzle
@@ -36,11 +36,11 @@ class ZebraDataset(ProceduralDataset):
                 - answer: str, a solution string
                 - metadata: dict with generation parameters
         """
-        seed(self.seed + idx)
+        rng = Random(self.seed + idx)
 
         K = self.config.num_people
         M = self.config.num_characteristics
-        instance, puzzle = generate_puzzle(K, M, "train")
+        instance, puzzle = generate_puzzle(rng, K, M)
         q = instance["questions"][0]["question"]
         answer = instance["questions"][0]["answer"]
         question = str(puzzle) + "\n" + q
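
The change above replaces calls to the module-level `random` functions (`seed`, `shuffle`, `sample`, `choices`, `randint`) with a `Random` instance that is created per item from `self.seed + idx` and passed down explicitly. The following is a minimal sketch of that pattern, not the project's code: `make_item` and its shuffled list are hypothetical stand-ins for `ZebraDataset.__getitem__` and `generate_puzzle`.

```python
import random
from random import Random


def make_item(seed: int, idx: int) -> list:
    """Build one item from a generator that is private to (seed, idx).

    Hypothetical helper for illustration only.
    """
    rng = Random(seed + idx)  # local generator; global `random` state is untouched
    values = list(range(10))
    rng.shuffle(values)  # every random draw goes through `rng`
    return values


# The same (seed, idx) pair yields the same item on every call.
first = make_item(42, 3)
assert make_item(42, 3) == first

# Reseeding or consuming the global generator in between has no effect,
# because make_item never reads the module-level state.
random.seed(0)
random.random()
assert make_item(42, 3) == first
```

With the previous `seed(self.seed + idx)` approach, any other code drawing from the global generator between the `seed` call and the puzzle generation could change the result; a local `Random` avoids that coupling.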