Training Data Modules

Scope: `pca.training.targets`, `pca.training.dataset`, `pca.training.data`

Purpose

Converts self-play JSONL into a stable dataclass schema and batch tensors. Fields keep default values so that older JSONL files remain readable, and the resulting batches feed both policy/value training and belief training.

Modules

| Module | Role | Implementation Details |
| --- | --- | --- |
| `pca.training.targets` | JSONL schema | Defines `SearchTrainingTarget`, `BeliefTrainingTarget`, `AuxPrizeTrainingTarget`, and `SelfPlayRecord`. |
| `pca.training.dataset` | compatibility facade | Re-exports the main API of `pca.training.data` so that the old import path keeps working. |
| `pca.training.data.records` | JSONL loader | Restores dataclasses from dicts and fills in fields missing from older records. Also provides the search-usable and belief-usable filters. |
| `pca.training.data.search_collate` | search batch | Collects action padding, target policy, selected action, value, aux prize, and integrated belief into a `SearchBatch`. |
| `pca.training.data.belief_collate` | belief batch | Converts the card ids of each belief target to a multi-hot vector and builds a `BeliefBatch`. |
| `pca.training.data.collate_utils` | collate helpers | `multi_hot`, `card_ids_to_indices`, and object row padding. |
| `pca.training.data.weights` | record weights | Implements the teacher policy weight, the low-progress downweight, and the passive deck-out filter. |
| `pca.training.data.types` | batch dataclasses | `SearchBatch` and `BeliefBatch`. |
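
The collate helpers listed above rely on two common techniques: multi-hot encoding of index lists and padding variable-length rows to a common width. The following is a minimal pure-Python sketch of both ideas, not the actual `pca` implementation. The names `multi_hot_sketch` and `pad_rows_sketch`, their signatures, and the use of plain lists instead of tensors are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def multi_hot_sketch(indices: Sequence[int], size: int) -> List[float]:
    """Return a vector of length `size` with 1.0 at every listed index."""
    vec = [0.0] * size
    for i in indices:
        if not 0 <= i < size:
            raise ValueError(f"index {i} out of range for size {size}")
        vec[i] = 1.0  # duplicate indices collapse to a single 1.0
    return vec


def pad_rows_sketch(
    rows: Sequence[Sequence[float]], pad_value: float = 0.0
) -> Tuple[List[List[float]], List[List[bool]]]:
    """Pad variable-length rows to a common width; return (padded, mask)."""
    width = max((len(r) for r in rows), default=0)
    padded = [list(r) + [pad_value] * (width - len(r)) for r in rows]
    # The mask marks real entries (True) versus padding (False).
    mask = [[True] * len(r) + [False] * (width - len(r)) for r in rows]
    return padded, mask


# Hypothetical inputs: card indices 0 and 2 (2 appears twice), and two
# states with 2 and 3 legal actions respectively.
hand = multi_hot_sketch([0, 2, 2], size=4)
policies, mask = pad_rows_sketch([[0.7, 0.3], [0.2, 0.5, 0.3]])
```

Returning a mask alongside the padded rows is one way to let a loss function ignore padded action slots; whether `SearchBatch` carries such a mask is not stated here.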

Public API

| API | Usage |
| --- | --- |
| `load_records_jsonl(path)` | Reads a JSONL file into a list. |
| `iter_records_jsonl(path)` | Streaming iterator over the records of a JSONL file. |
| `usable_search_records(records)` | Extracts the records usable for policy/value training. |
| `usable_belief_records(records)` | Extracts the records usable for belief training. |
| `collate_search_batch(records, ...)` | Builds a policy/value training batch. |
| `collate_belief_batch(records, ...)` | Builds a belief training batch. |
| `record_policy_weight(record, ...)` | Computes the per-record policy loss weight. |
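
The difference between a list loader and a streaming iterator matters once a self-play run no longer fits comfortably in memory. The sketch below illustrates the general pattern with the standard library only. `iter_jsonl_sketch` and `batched_sketch` are hypothetical helpers that yield plain dicts; they are not the signatures of `iter_records_jsonl` or of the collate functions, which return dataclasses and batches.

```python
import io
import json
from itertools import islice
from typing import Iterable, Iterator, List


def iter_jsonl_sketch(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty line, without buffering the file."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


def batched_sketch(items: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Group a stream into lists of at most `batch_size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


# An in-memory stand-in for an open JSONL file (hypothetical content).
raw = io.StringIO('{"turn": 1}\n\n{"turn": 2}\n{"turn": 3}\n')
batches = list(batched_sketch(iter_jsonl_sketch(raw), batch_size=2))
```

With this shape, only one batch of records is held in memory at a time, whereas a list loader materializes the whole file before any filtering or collation starts.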

Usage

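```python
from pca.training.dataset import load_records_jsonl, usable_search_records, collate_search_batch

# Load every record, then keep only those usable for policy/value training.
records = usable_search_records(load_records_jsonl("data/selfplay/run.jsonl"))
# Collate the first 32 usable records into one batch.
batch = collate_search_batch(records[:32])
```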

Notes

When extending the JSONL schema, add the new field to the dataclass with a default value.
The loader prioritizes not breaking older JSONL files.
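
The rule above can be illustrated with a small sketch. `RecordSketch` and `record_from_dict_sketch` are hypothetical stand-ins, not the real `SelfPlayRecord` or loader, and the field names are invented for the example. Silently dropping unknown keys is also an assumption; the actual loader may treat them differently.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class RecordSketch:
    state_id: str                       # present since the first schema version
    value: float = 0.0                  # default keeps value-less records loadable
    belief_card_ids: List[int] = field(default_factory=list)  # field added later


def record_from_dict_sketch(raw: Dict[str, Any]) -> RecordSketch:
    """Build a record from a parsed JSONL line, tolerating missing fields."""
    known = {f.name for f in fields(RecordSketch)}
    # Assumption: keys the dataclass does not know about are ignored.
    return RecordSketch(**{k: v for k, v in raw.items() if k in known})


# A line written before `belief_card_ids` existed still loads.
old = record_from_dict_sketch({"state_id": "s1", "value": 1.0})
# A newer line with an extra key loads too; missing `value` falls back to 0.0.
new = record_from_dict_sketch({"state_id": "s2", "belief_card_ids": [3, 5], "extra": 1})
```

Using `default_factory` for list fields avoids sharing one mutable default across records, which is why the sketch does not write `= []` directly.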