We build tools that stress-test, verify, and structure mathematical knowledge — for LLM training, for automated refereeing, and for retrieval that understands mathematical structure, not just text.
Automated auditing of mathematical claims in papers and LLM outputs. We find errors, flag counterexamples, and verify proofs — at a throughput no human review process can match.
Adversarial stress-testing of LLM mathematical reasoning and research papers. We generate the hardest edge cases, verify solutions, and produce clean training data from the wreckage.
Semantic search that exploits mathematical structure — not just text similarity. Our retrieval understands theorems, definitions, and proof dependencies, enabling AI systems to reason over mathematics, not just pattern-match.
Live tools for stress-testing and structuring mathematics: refereeing, torture testing, structured retrieval, and dimensionality reduction.
Advanced dimensionality reduction library that improves upon UMAP in both performance and accuracy. Built on JAX for GPU-accelerated computation, with theoretical improvements grounded in differential geometry.
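A minimal sketch of the kind of computation JAX makes cheap here: a UMAP-style attraction/repulsion loss, differentiated and jit-compiled end to end. The loss, curve parameters, and function names are illustrative assumptions, not this library's API.

    import jax
    import jax.numpy as jnp

    def umap_style_loss(emb, pos_pairs, neg_pairs, a=1.577, b=0.895):
        """Attraction/repulsion cross-entropy on the low-dimensional embedding.
        pos_pairs / neg_pairs: (m, 2) integer index arrays of neighbor and
        sampled non-neighbor pairs; a, b are the usual UMAP curve parameters."""
        def q(pairs):  # low-dimensional similarity in (0, 1)
            d2 = jnp.sum((emb[pairs[:, 0]] - emb[pairs[:, 1]]) ** 2, axis=-1)
            return 1.0 / (1.0 + a * (d2 + 1e-6) ** b)
        attract = -jnp.log(q(pos_pairs)).mean()             # pull neighbors together
        repel = -jnp.log(1.0 - q(neg_pairs) + 1e-6).mean()  # push non-neighbors apart
        return attract + repel

    @jax.jit  # the whole gradient step compiles into one fused GPU computation
    def step(emb, pos_pairs, neg_pairs, lr=0.1):
        return emb - lr * jax.grad(umap_style_loss)(emb, pos_pairs, neg_pairs)

Looping step over freshly sampled negative pairs reproduces the shape of UMAP's stochastic optimization, compiled end to end.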
AI-powered search and Q&A over 700,000+ arXiv mathematics papers. Ask natural-language questions — get AI-synthesized answers with citations. Hybrid BGE embeddings + full-text search. Multi-LLM support.
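One standard way to combine a dense embedding ranking with a full-text ranking is reciprocal rank fusion, sketched below with made-up arXiv ids; the fusion actually used in the product may differ.

    from collections import defaultdict

    def reciprocal_rank_fusion(dense_ranked, keyword_ranked, k=60):
        """Fuse an embedding-similarity ranking with a full-text ranking.
        Each argument is a list of document ids, best first; RRF combines
        the two without needing their raw scores to be comparable."""
        scores = defaultdict(float)
        for ranking in (dense_ranked, keyword_ranked):
            for rank, doc_id in enumerate(ranking):
                scores[doc_id] += 1.0 / (k + rank + 1)
        return sorted(scores, key=scores.get, reverse=True)

    # e.g. fuse BGE-embedding neighbors with full-text hits (made-up ids)
    hybrid = reciprocal_rank_fusion(
        ["2301.00001", "1907.12345", "2005.54321"],
        ["1907.12345", "2206.99999", "2301.00001"],
    )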
AI system that automatically audits mathematical proofs in arXiv papers for errors and counterexamples. Analyzed 31,000+ papers across math.DS and math.GT categories. 70% of flagged papers contain explicit counterexamples to their main claims.
Adversarial stress-testing for LLMs and research papers. Generates hard edge-case problems, finds counterexamples to claimed theorems, and produces verified problem–solution pairs for LLM training pipelines. Includes a referee mode for paper-level auditing.
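In outline, such a pipeline is a generate, solve, verify loop. The callables below are hypothetical stand-ins for the edge-case generator, the model under test, and an independent checker; they are not this product's API.

    def stress_test(generate, solve, verify, n_rounds=1000):
        """Generate -> solve -> verify loop.
        generate() proposes a hard edge-case problem, solve() is the
        model under test, and verify() is an independent checker
        (a CAS, a proof checker, or a counterexample search)."""
        verified_pairs, failures = [], []
        for _ in range(n_rounds):
            problem = generate()
            answer = solve(problem)
            if verify(problem, answer):
                verified_pairs.append((problem, answer))  # clean training data
            else:
                failures.append((problem, answer))        # audit material
        return verified_pairs, failures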
Head-to-head comparison of 7 frontier LLMs on Lean 4 formalization of 222 classical analysis problems. GPT-5.4 and Gemini 3.1 Pro lead at ~38%; extended chain-of-thought appears to hurt formalization precision.
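For a sense of the task, here is a statement of the benchmark's flavor formalized in Lean 4 against Mathlib. This is our own illustrative example, not one of the 222 benchmark problems.

    import Mathlib

    -- A continuous function on [0, 1] attains its maximum: the kind of
    -- classical-analysis statement models are asked to formalize and prove.
    example (f : ℝ → ℝ) (hf : ContinuousOn f (Set.Icc 0 1)) :
        ∃ x ∈ Set.Icc (0 : ℝ) 1, ∀ y ∈ Set.Icc (0 : ℝ) 1, f y ≤ f x :=
      isCompact_Icc.exists_forall_ge
        ⟨0, Set.mem_Icc.mpr ⟨le_refl 0, zero_le_one⟩⟩ hf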
Mathematical content goes in. Verified, structured, machine-ready knowledge comes out.
Mathematical content from any source — arXiv papers, LLM outputs, training corpora, textbooks. We parse LaTeX, natural language, and code.
We represent mathematical objects as mathematical objects — theorems, definitions, proof dependencies — not bags of words. This is where our RAG gains its edge.
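A minimal sketch of what that means concretely: a statement as a graph node with explicit dependency edges. Field names and labels are illustrative, not our production schema.

    from dataclasses import dataclass, field

    @dataclass
    class Statement:
        """A theorem or definition as a graph node, not a bag of words."""
        kind: str                 # "theorem", "definition", "lemma", ...
        label: str                # e.g. "thm:banach-fixed-point"
        text: str                 # LaTeX or natural-language body
        uses: list[str] = field(default_factory=list)  # labels it depends on

    banach = Statement(
        kind="theorem",
        label="thm:banach-fixed-point",
        text="A contraction on a complete metric space has a unique fixed point.",
        uses=["def:complete-metric-space", "def:contraction"],
    )

Retrieval can then walk the uses edges to fetch exactly the definitions a theorem depends on, rather than hoping text overlap surfaces them.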
Automated refereeing: counterexample search, proof auditing, adversarial stress-testing. We find the errors that human reviewers miss and that LLMs confidently generate.
Clean training data with verified solutions. Audit reports with specific error citations. Searchable knowledge bases with mathematical structure preserved.
Algorithms for ideal convex polyhedra in hyperbolic 3-space using Rivin's variational characterization. Volume distributions follow a Beta distribution; maximal configurations exhibit rational dihedral angles.
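For context, a sketch of Rivin's characterization as we recall it; the precise combinatorial conditions are in Rivin's paper.

    % Conditions for exterior dihedral angles \theta(e) to be realized
    % by a convex ideal polyhedron in hyperbolic 3-space (Rivin):
    \begin{align*}
      & 0 < \theta(e) < \pi              && \text{for every edge } e, \\
      & \sum_{e \ni v} \theta(e) = 2\pi  && \text{around every ideal vertex } v, \\
      & \sum_{e \in \gamma} \theta(e) > 2\pi
        && \text{for every closed cut } \gamma \text{ not encircling a single vertex.}
    \end{align*}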
Using the Aristotle system, we verified 80 of 80 LLM-generated Lean proofs, bridging the "30× gap" between informal and formal mathematical reasoning.
Comprehensive benchmark of LLM-based vs. specialist OCR for mathematical content. Key finding: Gemini Flash is 6× cheaper than Mathpix and more accurate.
We work with AI labs, publishers, and research groups who need mathematical knowledge they can trust. If you are training models, curating datasets, or building on mathematical content — let's talk.
Start the Conversation: igor@dimensionreducers.com