Dimension Reducers

Making mathematics robust

We build tools that formalize, stress-test, verify, and structure mathematical knowledge — for LLM training, for automated refereeing, and for retrieval that understands mathematical structure, not just text.

By the numbers
3,300+ research citations
700K+ papers indexed
31K proofs audited
97.4% semantic accuracy
[Hero graphic: high-dimensional data (clusters A, B, C) mapped to latent structure]

Three pillars of mathematics robustification

01

Refereeing at scale

Automated auditing of mathematical claims in papers and LLM outputs. We find errors, flag counterexamples, and verify proofs — at a throughput no human review process can match.

02

Torture testing

Adversarial stress-testing of LLM mathematical reasoning and research papers. We generate the hardest edge cases, verify solutions, and produce clean training data from the wreckage.

03

Structured RAG

Semantic search that exploits mathematical structure — not just text similarity. Our retrieval understands theorems, definitions, and proof dependencies, enabling AI systems to reason over mathematics, not just pattern-match.

The platform

Live tools for mathematics robustification — refereeing, torture testing, structured retrieval, and dimensionality reduction.

Live · Core product

DiRe-JAX — Dimension Reduction Suite

Dimensionality reduction library that improves on UMAP in both speed and accuracy. Built on JAX for GPU-accelerated computation, with theoretical improvements grounded in differential geometry. A minimal JAX sketch follows this card.

GPU
JAX-accelerated
>UMAP
accuracy gains
View Project →
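As a taste of why the JAX backend matters, here is a self-contained sketch of GPU-ready dimension reduction in JAX. It uses a plain jit-compiled PCA projection as a stand-in; it is not DiRe-JAX's actual algorithm or API, just an illustration of the compile-once, run-on-accelerator pattern the library builds on.

```python
from functools import partial

import jax
import jax.numpy as jnp

# Illustrative stand-in only: a jit-compiled linear projection (PCA),
# not DiRe-JAX's actual algorithm or API.
@partial(jax.jit, static_argnames="k")
def pca_project(X, k=2):
    """Project rows of X onto their top-k principal directions."""
    Xc = X - X.mean(axis=0)               # center the cloud
    _, vecs = jnp.linalg.eigh(Xc.T @ Xc)  # eigenvectors, ascending eigenvalues
    return Xc @ vecs[:, -k:]              # top-k directions are the last columns

key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (10_000, 512))  # toy high-dimensional cloud
Y = pca_project(X)                         # (10_000, 2); runs on GPU/TPU if present
print(Y.shape)
```
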
Live · Search

arXiv Math Semantic Search

AI-powered search and Q&A over 700,000+ arXiv mathematics papers. Ask natural-language questions and get AI-synthesized answers with citations. Retrieval is hybrid, combining BGE embeddings with full-text search, and multiple LLM backends are supported. A sketch of the hybrid scoring idea follows this card.

729K
papers indexed
290M
chunks embedded
<1s
query latency
Try It Live →
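The hybrid idea in miniature: fuse a dense similarity score with a lexical score by a weighted sum. This is a generic sketch of the pattern, not the production system; the `embed` and `lexical_score` callables and the `alpha` weight are illustrative assumptions.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def hybrid_search(query, docs, embed, lexical_score, alpha=0.6):
    """Rank docs by a weighted fusion of dense and lexical scores.

    embed(text) -> np.ndarray (e.g. a BGE-style encoder);
    lexical_score(query, text) -> float in [0, 1] (e.g. normalized BM25).
    alpha trades semantic similarity against exact-term matching."""
    q = embed(query)
    scored = [
        (alpha * cosine(q, d["vector"])
         + (1 - alpha) * lexical_score(query, d["text"]), d)
        for d in docs
    ]
    return [d for _, d in sorted(scored, key=lambda s: s[0], reverse=True)]
```
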
Live · Verification

arXiv Proof Audit Database

AI system that automatically audits mathematical proofs in arXiv papers for errors and counterexamples. Analyzed 31,000+ papers across math.DS and math.GT categories. 70% of flagged papers contain explicit counterexamples to their main claims.

31K+
papers analyzed
~380
errors flagged
Explore Database →
Live · Stress Testing

Mathematics Torture Chamber

Adversarial stress-testing for LLMs and research papers. Generates hard edge-case problems, finds counterexamples to claimed theorems, and produces verified problem–solution pairs for LLM training pipelines. Includes a referee mode for paper-level auditing. A sketch of the generate-and-verify loop follows this card.

LLMs
stress-tested
Papers
refereed
Verified
training data
Try It Live →
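Schematically, the torture loop is a generate-solve-verify gate: only pairs that survive an independent checker become training data, and failures seed the next round. The callable names below are illustrative assumptions, not the product's API.

```python
from dataclasses import dataclass

@dataclass
class VerifiedPair:
    problem: str
    solution: str
    checker: str   # which verifier signed off (e.g. numeric, symbolic, Lean)

def torture_loop(generate, solve, verify, n_rounds=1000):
    """Keep only problem-solution pairs that pass independent verification.

    generate() -> problem text; solve(problem) -> candidate solution;
    verify(problem, solution) -> (ok: bool, checker_name: str).
    Failures are exactly the 'wreckage' worth keeping as hard cases."""
    kept, failures = [], []
    for _ in range(n_rounds):
        problem = generate()                     # adversarial edge case
        solution = solve(problem)                # LLM under test
        ok, checker = verify(problem, solution)  # independent check
        if ok:
            kept.append(VerifiedPair(problem, solution, checker))
        else:
            failures.append(problem)             # feeds the next round
    return kept, failures
```
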
Live · Formalization

Lean 4 Formalization Pipeline

Paste any mathematical statement, get a semantically verified Lean 4 formalization with a compiler-checked proof. Our agentic pipeline catches the semantic drift that makes raw theorem provers prove the wrong theorem, achieving 97.4% semantic accuracy where the best single LLM reaches 85%. A toy example of semantic drift follows this card.

97.4%
semantic accuracy
~20s
per formalization
Try It Live →
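A toy illustration, in Lean 4, of the semantic drift the pipeline is built to catch. Both theorems below compile without sorry, but only the first formalizes the informal claim "every natural number n satisfies n ≤ n + 1"; the second quietly weakens it to an existence statement. This is a contrived example, not pipeline output.

```lean
-- Informal claim: "every natural number n satisfies n ≤ n + 1"

-- Faithful formalization: universally quantified, as intended.
theorem le_succ_faithful (n : Nat) : n ≤ n + 1 :=
  Nat.le_succ n

-- Drifted formalization: compiles sorry-free, yet proves a weaker
-- statement (existence instead of universality).
theorem le_succ_drifted : ∃ n : Nat, n ≤ n + 1 :=
  ⟨0, Nat.le_succ 0⟩
```
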
Live · Bibliometrics

Citation Analyzer

Given a researcher's name, the analyzer pulls citation metrics from OpenAlex, infers sub-fields, and contextualizes everything against empirical benchmarks, accounting for field differences, research breadth, and citation distribution shape. It reports per-topic h-index decomposition, live sub-field percentile ranks, and distribution diagnostics (Gini, spikiness, top-k concentration); a sketch of two of these diagnostics follows this card.

200M+
works in OpenAlex
4,500
research topics
Live
empirical CDFs
Try It Live → GitHub →
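Two of the named diagnostics in code, using the standard formulas: the Gini coefficient and the top-k citation share, computed over a raw citation-count vector. The spikiness metric and the OpenAlex integration are not reproduced here.

```python
import numpy as np

def gini(citations):
    """Gini coefficient of a citation distribution
    (0 = perfectly equal, 1 = one paper holds all citations)."""
    x = np.sort(np.asarray(citations, dtype=float))  # ascending
    n = len(x)
    if n == 0 or x.sum() == 0:
        return 0.0
    i = np.arange(1, n + 1)                 # ranks of the sorted values
    return float((2 * i - n - 1) @ x / (n * x.sum()))

def top_k_share(citations, k=5):
    """Fraction of all citations carried by the k most-cited papers."""
    x = np.sort(np.asarray(citations, dtype=float))[::-1]
    total = x.sum()
    return float(x[:k].sum() / total) if total > 0 else 0.0

# Toy profile: one hit paper dominating a long tail
counts = [1200, 90, 40, 12, 8, 5, 3, 2, 1, 0]
print(f"Gini: {gini(counts):.2f}, top-5 share: {top_k_share(counts):.2f}")
```
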
Research · Benchmarks

Pólya-Szegő Formalization Benchmark

823 problems from Pólya-Szegő's Problems and Theorems in Analysis, evaluated across 8 formalization systems. Key finding: Aristotle achieves 97.6% sorry-free proofs but only 67.3% semantic correctness — it proves the wrong theorem a third of the time. Our pipeline closes that gap.

823
benchmark problems
8
models evaluated
97.4%
our pipeline accuracy
Read Write-up → View Benchmark →

How robustification works

Mathematical content goes in. Verified, structured, machine-ready knowledge comes out.

01

Ingest

Mathematical content from any source: arXiv papers, LLM outputs, training corpora, textbooks. We parse LaTeX, natural language, and code. A minimal LaTeX-extraction sketch follows this step.
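
A minimal illustration of the ingest step, assuming the simplest case: pulling theorem-environment bodies out of LaTeX source with a regular expression. Real ingestion has to handle nested environments, macros, and malformed markup, which this sketch does not.

```python
import re

# Matches \begin{theorem}...\end{theorem} (and lemma/proposition variants).
# DOTALL lets '.' span line breaks inside the environment body.
THEOREM_ENV = re.compile(
    r"\\begin\{(theorem|lemma|proposition)\}(.*?)\\end\{\1\}",
    re.DOTALL,
)

def extract_statements(latex_source: str):
    """Return (kind, body) pairs for each theorem-like environment."""
    return [(m.group(1), m.group(2).strip())
            for m in THEOREM_ENV.finditer(latex_source)]

sample = r"""
\begin{theorem}
Every natural number $n$ satisfies $n \le n + 1$.
\end{theorem}
"""
print(extract_statements(sample))
```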

02

Structure

We represent mathematical objects as mathematical objects (theorems, definitions, proof dependencies) rather than bags of words. This is where our RAG gains its edge; a sketch of such a structured representation follows this step.
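
One minimal way to make that concrete: statements as typed nodes with explicit dependency edges, so retrieval can traverse proof structure instead of matching words. The schema below is an illustrative sketch, not our production data model.

```python
from dataclasses import dataclass, field

@dataclass
class Statement:
    """A mathematical statement as a structured node, not a text blob."""
    ident: str                                     # e.g. "thm:banach"
    kind: str                                      # "theorem" | "definition" | "lemma"
    text: str                                      # the statement itself
    uses: list[str] = field(default_factory=list)  # dependency edges

def proof_closure(ident, index):
    """Everything a statement transitively depends on.

    This is the structure a bag-of-words retriever cannot see:
    retrieving a theorem pulls in its whole dependency cone."""
    seen, stack = set(), [ident]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(index[node].uses)
    return seen

index = {
    "def:metric": Statement("def:metric", "definition", "A metric space is ..."),
    "thm:banach": Statement("thm:banach", "theorem",
                            "Every contraction on a complete metric space ...",
                            uses=["def:metric"]),
}
print(proof_closure("thm:banach", index))  # {'thm:banach', 'def:metric'}
```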

03

Verify

Automated refereeing: counterexample search, proof auditing, adversarial stress-testing. We find the errors that human reviewers miss and that LLMs confidently generate. A toy counterexample search follows this step.
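
The simplest member of that toolbox, in miniature: randomized counterexample search against a claimed statement. The "theorem" below (sin x ≤ x for all real x) is deliberately false for negative x, so the search finds a violating input almost immediately; real search mixes random, boundary, and symbolic probes.

```python
import math
import random

def claimed_theorem(x: float) -> bool:
    """A deliberately false claim: sin(x) <= x for all real x."""
    return math.sin(x) <= x

def find_counterexample(claim, trials=10_000, lo=-10.0, hi=10.0, seed=0):
    """Randomized counterexample search over an interval of inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(lo, hi)
        if not claim(x):
            return x          # first violating input found
    return None               # no violation found: evidence, not proof

x = find_counterexample(claimed_theorem)
if x is not None:
    print(f"counterexample: x = {x:.3f}, sin(x) = {math.sin(x):.3f} > x")
```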

04

Deliver

Clean training data with verified solutions. Audit reports with specific error citations. Searchable knowledge bases with mathematical structure preserved.

Published & reproducible

All Research →
arXiv · 2024

Ideal Polyhedra Volume Toolkit

Algorithms for ideal convex polyhedra in hyperbolic 3-space using Rivin's variational characterization. Volume distributions follow a Beta distribution; maximal configurations exhibit rational dihedral angles.

Benchmark · 2026

From 67% to 97%: Semantic Verification of Theorem Provers

823 Pólya-Szegő problems reveal that Aristotle proves 97.6% sorry-free — but only 67.3% are the right theorem. Our agentic pipeline recovers 97.4% semantic accuracy across 8 models.

Benchmark · 2025

Math OCR: LLMs vs. Mathpix

Comprehensive benchmark of LLM-based vs. specialist OCR for mathematical content. Key finding: Gemini Flash is 6× cheaper than Mathpix and more accurate.

Build on verified mathematics

We work with AI labs, publishers, and research groups who need mathematical knowledge they can trust. If you are training models, curating datasets, or building on mathematical content — let's talk.

Start the Conversation → igor@dimensionreducers.com