JP Patent App. 2026-046625 (Subset B)

PSDP — Within-language Parallelism

Phase-Synchronous Deterministic Parallelism. Parallelize a sequential program in the same language, without changing a single bit of its output. A commutator-norm phase-sync mechanism gives an algebraic guarantee of determinism under parallel execution.

Headline numbers

3.38 ×

compute-heavy
NENC kernel sweep (Java 8), near-linear scaling with core count

2.17 ×

graph-class kernels
GRA sweep (Java 8), bandwidth-bound, plateaus accordingly

1.02 ×

database / TPC
I/O bound, but bit-exactness is preserved

3072 / 3072

AVX-512 blocks
32-lane SIMD, bit-exact PASS, ±5% vs AVX2 (memory-bound)

11

safety mechanisms
claims 29–32, exposed as 5 profiles in SlimeSyCUDA (GAME_MIN to SAFETY_MAX)

12+

languages supported
each Subset A target language ships with a paired PSDP variant

Mechanism — commutator-norm phase synchronization

A sufficient condition for parallel execution to produce the same result as sequential execution is derived from the commutator norm ‖[A, B]‖ of the relation operators involved. At runtime, we enforce phase synchronization to keep the commutator below a threshold, and the Böttcher–Wenzel inequality then yields a closed-form upper bound on numerical drift. This is what eliminates the familiar phenomenon of "numbers change when the code goes parallel."
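For reference, the Böttcher–Wenzel inequality states that ‖AB − BA‖_F ≤ √2 ‖A‖_F ‖B‖_F for square matrices in the Frobenius norm. The text above does not say how the relation operators are represented or which norm is used, so the following is only a minimal sketch under stated assumptions: operators are dense square matrices, the norm is Frobenius, and the names `CommutatorBound`, `mayReorder`, and the threshold `epsilon` are hypothetical illustrations rather than part of PSDP.

```java
final class CommutatorBound {
    // Plain O(n^3) product of two n x n matrices.
    static double[][] multiply(double[][] x, double[][] y) {
        int n = x.length;
        double[][] out = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double s = 0.0;
                for (int k = 0; k < n; k++) {
                    s += x[i][k] * y[k][j];
                }
                out[i][j] = s;
            }
        }
        return out;
    }

    // Frobenius norm: square root of the sum of squared entries.
    static double frobenius(double[][] m) {
        double s = 0.0;
        for (double[] row : m) {
            for (double v : row) {
                s += v * v;
            }
        }
        return Math.sqrt(s);
    }

    // ||[A, B]||_F where [A, B] = AB - BA.
    static double commutatorNorm(double[][] a, double[][] b) {
        double[][] ab = multiply(a, b);
        double[][] ba = multiply(b, a);
        int n = a.length;
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                c[i][j] = ab[i][j] - ba[i][j];
            }
        }
        return frobenius(c);
    }

    // Boettcher-Wenzel upper bound: sqrt(2) * ||A||_F * ||B||_F.
    static double bottcherWenzelBound(double[][] a, double[][] b) {
        return Math.sqrt(2.0) * frobenius(a) * frobenius(b);
    }

    // Hypothetical gate: treat two steps as safely reorderable when the
    // commutator norm stays at or below the threshold epsilon (assumption).
    static boolean mayReorder(double[][] a, double[][] b, double epsilon) {
        return commutatorNorm(a, b) <= epsilon;
    }

    public static void main(String[] args) {
        double[][] a = {{0, 1}, {0, 0}};
        double[][] b = {{0, 0}, {1, 0}};
        System.out.println("commutator norm = " + commutatorNorm(a, b));
        System.out.println("upper bound     = " + bottcherWenzelBound(a, b));
        System.out.println("may reorder     = " + mayReorder(a, b, 1e-9));
    }
}
```

Diagonal matrices commute, so their commutator norm is zero and the gate passes for any non-negative threshold; the two matrices in `main` do not commute, so the gate rejects them.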

// PSDP example (excerpt from OrderBatchProcessor_PSDP.java)
// The bench verifies that the SHA-256 of the input/output exactly
// matches the sequential original (OrderBatchProcessor_ORIGINAL.java).
import java.util.List;

public class OrderBatchProcessor {
    public Result process(List<Order> orders) {
        return orders.parallelStream()             // the only parallel call
            .map(this::settle)                     // per-order step, logic not rewritten
            .collect(PSDP.phaseSyncReduce(...));   // phase-sync reduce
    }
}

The parallelization touches only a handful of API calls. Logic is not rewritten. A regression bench checks that sequential and parallel forms produce identical SHA-256 outputs.
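A regression check of the kind described above could be sketched as follows. This is an illustration only, not the actual bench: `settle` is a hypothetical pure stand-in for the real settlement logic, and the serialization format (decimal values joined by newlines, UTF-8) is an assumption. It relies on the fact that an ordered stream collected with `Collectors.toList()` preserves encounter order even when run in parallel.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class Sha256RegressionCheck {
    // Hypothetical per-order step; a pure function of its input.
    static long settle(int orderId) {
        return (long) orderId * 31L + 7L;
    }

    // Serialize results in encounter order (assumed format: one value per line).
    static String render(List<Long> settled) {
        return settled.stream().map(String::valueOf).collect(Collectors.joining("\n"));
    }

    // Lower-case hex SHA-256 of the UTF-8 bytes of the text.
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 support is required of every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    static String runSequential(int n) {
        List<Long> out = IntStream.range(0, n)
            .mapToLong(Sha256RegressionCheck::settle)
            .boxed()
            .collect(Collectors.toList());
        return sha256Hex(render(out));
    }

    static String runParallel(int n) {
        List<Long> out = IntStream.range(0, n)
            .parallel()                              // only this line differs
            .mapToLong(Sha256RegressionCheck::settle)
            .boxed()
            .collect(Collectors.toList());
        return sha256Hex(render(out));
    }

    public static void main(String[] args) {
        String seq = runSequential(100_000);
        String par = runParallel(100_000);
        System.out.println(seq.equals(par) ? "PASS " + seq : "FAIL");
    }
}
```

Comparing digests rather than full outputs keeps the check cheap to store and diff across versions, while still flagging any single-bit change in the serialized result.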

Three layers

PSDP applies the same principle at three different layers. Identity of result is established via "phase sync + commutator-norm threshold" at every layer.

Layer 1: Algorithm	Algorithmic parallelization	Sequential batch → parallel batch. Java parallelStream and equivalents are introduced "without changing the output."
Layer 2: QP	Quantization-parameter parallelism	Parallel QP control for video/signal pipelines, used in SlimeCodec and SlimeSyCUDA.
Layer 3: SIMD	Instruction-level parallelism	AVX-512 32-lane svt_av1_quantize_fp variant: bit-exact PASS on 3072 blocks, ±5% vs AVX2 (memory-bound).

Eleven safety mechanisms (claims 29–32)

We classify the situations where parallel execution might leave deterministic territory into 11 categories, and provide a safety mechanism for each. SlimeSyCUDA (the GPU extension) exposes these as five staged profiles:

GAME_MIN	Lowest overhead, intended for games. Safety mechanisms minimized; frame time is the priority.
BALANCED	Default profile. General-purpose parallelization.
STRICT	Strict profile for finance / scientific workloads. Bit-exact verification on every loop.
AUDIT	Full audit logging. Connects to the Subset A audit chain for round-trip proof.
SAFETY_MAX	All 11 mechanisms enabled. Intended for mission-critical applications such as aviation and medical systems.

Languages with PSDP support

In addition to the five Subset A targets, the research implementation extends to a wider set of languages (23 converters under track_c/converter_*). Because both "sequential" and "PSDP-parallel" forms are emitted from the same Slot IR, migration (Subset A) and parallelization (Subset B) can be combined in a single tooling pipeline.

Java 8 / 17 / 21, Rust, C / C++, C#, Kotlin, Kotlin coroutines, Go, Scala, Clojure, Erlang, Python, Node.js, PHP, Swift, FORTRAN, Common Lisp

Benchmarks (Java 8, core sweep)

Category	Kernel	Speed-up / result	Notes
Compute	NENC (numerical equivalence)	3.38 ×	nearly linear in core count, CPU-bound
Graph	GRA (graph kernels)	2.17 ×	memory-bandwidth bound, bit-exactness preserved
Database	TPC (transactions)	1.02 ×	I/O bound, but result invariance is guaranteed
SIMD	svt_av1_quantize_fp (AVX-512)	3072 / 3072 blocks	bit-exact PASS, ±5% vs AVX2 (memory-bound)

Note: we do not accept the customary trade-off of "result drifts in exchange for speed-up." Bit-exactness is an absolute constraint. The 1.02× ceiling on the database benchmark is an I/O-bound physical limit, not a cost imposed by the safety mechanisms.
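The usual source of drift is that floating-point addition is not associative, so a reduction whose grouping depends on thread scheduling can change the low-order bits. The sketch below shows that effect and one generic way to avoid it: fix the chunk boundaries and the combine order so they do not depend on core count. This is a standard fixed-order reduction offered for illustration, not the phase-sync mechanism itself; the class name `FixedOrderSum` and the chunk size are assumptions. Note that it makes the serial and parallel runs of the same chunked reduction agree bit for bit; it does not by itself reproduce the result of a plain left-to-right loop.

```java
import java.util.stream.IntStream;

final class FixedOrderSum {
    // Chunk size is fixed and independent of the number of cores (assumption).
    static final int CHUNK = 1024;

    // Sum one chunk strictly left to right.
    static double chunkSum(double[] data, int chunkIndex) {
        int from = chunkIndex * CHUNK;
        int to = Math.min(from + CHUNK, data.length);
        double s = 0.0;
        for (int i = from; i < to; i++) {
            s += data[i];
        }
        return s;
    }

    // Chunks may be computed in parallel, but the partial sums are stored by
    // chunk index and combined in index order, so the grouping never changes.
    static double sum(double[] data, boolean parallel) {
        int chunks = (data.length + CHUNK - 1) / CHUNK;
        IntStream indices = IntStream.range(0, chunks);
        if (parallel) {
            indices = indices.parallel();
        }
        double[] partial = indices.mapToDouble(c -> chunkSum(data, c)).toArray();
        double total = 0.0;
        for (double p : partial) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        // Non-associativity: the two groupings differ in the last bit.
        System.out.println(((0.1 + 0.2) + 0.3) == (0.1 + (0.2 + 0.3)));

        double[] data = new double[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = 0.1 * i;
        }
        long seqBits = Double.doubleToLongBits(sum(data, false));
        long parBits = Double.doubleToLongBits(sum(data, true));
        System.out.println(seqBits == parBits ? "bit-identical" : "drift");
    }
}
```

Comparing `Double.doubleToLongBits` values rather than using a tolerance is what makes the check bit-exact instead of approximately equal.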

Audit suitability

Bit-exact	Sequential and parallel outputs match by SHA-256. The standard failure mode of "the numbers shift slightly when we go parallel" is eliminated.
Phase-sync guarantee	‖[A, B]‖ ≤ ε is enforced at runtime. The Böttcher–Wenzel inequality then yields an algebraic upper bound on numerical drift.
11 safety mechanisms	Covered by claims 29–32, exposed as 5 staged profiles (GAME_MIN through SAFETY_MAX) for application-specific tuning.
Audit-chain coupling	Connects to the Subset A audit chain (claim 9), giving a single pipeline that produces a bidirectional proof for both transformation and parallelization.
Regression resilience	Same input + same version → bit-identical SHA-256. Output does not drift across parallel or GPU execution.

Typical use cases

Financial batches	Parallelize nightly batch jobs to shorten the window, while guaranteeing "not a single yen of drift." The SAFETY_MAX profile is intended for audit settings.
Scientific computing	FFT, Conv2D, LU decomposition and similar numerical kernels are parallelized bit-exactly. The familiar problem "we can't compare the results because parallel changed them" simply does not arise.
Video coding	Apply bit-exact parallelization to SlimeCodec QP control and the SVT-AV1 AVX-512 quantizer. Encoder parallel performance and reproducibility coexist.
Games / real-time	SlimeSyCUDA (GPU variant) exposes the GAME_MIN profile to minimize safety overhead and prioritize frame time.

Relationship with Subset A: migrate legacy code with Subset A (cross-language transpilation), then apply PSDP (Subset B) to the migrated code in the same pipeline. Migration and parallelization happen in a single tool, giving one path from legacy code to "audit-grade parallel modern systems."

Technical specifications

Patent	JP Patent App. 2026-046625 (Subset B = PSDP) / JP Patent App. 2026-046620 (Subset A coupling)
Claims	Phase synchronization (Layers 1–3) / 11 safety mechanisms (claims 29–32) / Audit chain coupling (claim 9)
Paper	PSDP Paper JP v5d (910 KB PDF, 2026-03-04)
Implementations	Java 8 PoC / AVX-512 MVP-A (svt_av1_quantize_fp 32-lane) / Rust + WASM demo (index_psdp.html) / 12+ language converters bundled
Standard tests	NENC / GRA / TPC kernel sweeps (Java 8) / 3072 AVX-512 blocks bit-exact
License model	Combined licensing with Subset A. Converter is licensed; converted output is unlicensed. Ed25519 3-hop activation.
