Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Page Not Found

Page not found. Your pixels are in another canvas.

Jupyter notebook markdown generator

Posts

What to expect from this blog

Published: August 11, 2022

This is the first post of my blog. I am still unsure how frequently posts will come, but they will be about two topics:

Discussion of projects I am working on
Things I found while metal detecting

publications

$k$-NIM Trees: Characterization and Enumeration

Submitted, 2022

Among those real symmetric matrices whose graph is a given tree $T$, the maximum multiplicity $M(T)$ that can be attained by an eigenvalue is known to be the path cover number of $T$. We say that a tree is $k$-NIM if, whenever an eigenvalue attains a multiplicity of $k - 1$ less than the maximum multiplicity, all other multiplicities are 1. $1$-NIM trees are known as NIM trees, and a characterization for NIM trees is already known. Here we provide a graph-theoretic characterization of $k$-NIM trees for each $k \geq 1$ and count them. It follows from the characterization that $k$-NIM trees exist on $n$ vertices only when $k = 1, 2, 3$. In the case $k = 3$, the only $3$-NIM trees are simple stars.
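As a small illustration of the path cover number mentioned in the abstract, here is a minimal sketch that computes it for a tree with a standard greedy pass from the leaves upward. This is not taken from the paper; the function name `path_cover_number` and the adjacency-dictionary input are assumptions made for the example.

```python
from typing import Dict, List, Optional


def path_cover_number(adj: Dict[int, List[int]], root: int) -> int:
    """Minimum number of vertex-disjoint paths covering every vertex of a tree.

    `adj` maps each vertex to its neighbours; `root` is any vertex of the tree.
    """
    # Pre-order traversal: every vertex appears after its parent.
    parent: Dict[int, Optional[int]] = {root: None}
    order: List[int] = []
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                stack.append(w)

    paths = 0
    # open_end[v] is True when v is an endpoint of its path and can still
    # be extended towards its parent.
    open_end: Dict[int, bool] = {}
    for v in reversed(order):  # children are handled before their parent
        k = sum(1 for w in adj[v] if w != parent[v] and open_end[w])
        if k == 0:
            paths += 1          # v starts a new path
            open_end[v] = True
        elif k == 1:
            open_end[v] = True  # v extends one child's path
        else:
            paths -= 1          # v joins two child paths into one
            open_end[v] = False
    return paths


# A star with four leaves (centre 0) and a path on four vertices.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

For the star above the function returns 3, and for the path it returns 1; by the result quoted in the abstract, these values are also $M(T)$ for the two trees.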

Download here

An In-Context Learning Agent for Formal Theorem-Proving

Conference on Language Modeling, 2024

We present an in-context learning agent for formal theorem-proving in environments like Lean and Coq. Current state-of-the-art models for the problem are finetuned on environment-specific proof data. By contrast, our approach, called COPRA, repeatedly asks a high-capacity, general-purpose large language model (GPT-4) to propose tactic applications from within a stateful backtracking search. Proposed tactics are executed in the underlying proof environment. Feedback from the execution is used to build the prompt for the next model query, along with selected information from the search history and lemmas retrieved from an external database. We evaluate our implementation of COPRA on the miniF2F benchmark for Lean and a set of Coq tasks from the CompCert project. On these benchmarks, COPRA significantly outperforms few-shot invocations of GPT-4. It also compares favorably against finetuning-based approaches, outperforming ReProver, a state-of-the-art finetuned approach for Lean, in terms of the pass@1 metric.
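The loop described in the abstract (propose a tactic, execute it, feed the result back into the next query, backtrack on dead ends) can be sketched on a toy problem. Everything below is hypothetical and is not COPRA's code or interface: the "proof environment" is an integer that must be reduced to a goal value, `apply_tactic` stands in for Lean or Coq, and `propose` stands in for the language-model query.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical tactic names for the toy environment.
TACTICS = ["halve", "sub5"]


def apply_tactic(state: int, tactic: str) -> Tuple[Optional[int], str]:
    """Execute a tactic; return (new_state, feedback). new_state is None on failure."""
    if tactic == "halve":
        if state % 2 != 0:
            return None, f"{state} is odd"
        return state // 2, "ok"
    if tactic == "sub5":
        if state - 5 < 1:
            return None, "result would drop below 1"
        return state - 5, "ok"
    return None, "unknown tactic"


def propose(state: int, failures: List[str]) -> Optional[str]:
    """Stand-in for the model query: choose the first tactic that the
    accumulated feedback for this state has not already ruled out."""
    tried = {entry.split(":")[0] for entry in failures}
    for tactic in TACTICS:
        if tactic not in tried:
            return tactic
    return None


def search(state: int, goal: int, dead: Set[int],
           depth: int = 0, max_depth: int = 20) -> Optional[List[str]]:
    """Depth-first search with backtracking; returns a tactic list or None."""
    if state == goal:
        return []
    if depth >= max_depth or state in dead:
        return None
    failures: List[str] = []  # feedback gathered for this state so far
    while True:
        tactic = propose(state, failures)
        if tactic is None:
            dead.add(state)  # remember the dead end for later queries
            return None
        new_state, feedback = apply_tactic(state, tactic)
        if new_state is None:
            failures.append(f"{tactic}: {feedback}")
            continue
        rest = search(new_state, goal, dead, depth + 1, max_depth)
        if rest is not None:
            return [tactic] + rest
        failures.append(f"{tactic}: led to a dead end")
```

Calling `search(12, 1, set())` returns `['halve', 'sub5']`: the search first halves twice to reach 3, finds that both tactics fail there, backtracks to 6, and succeeds with `sub5`.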

Download here

PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition

NeurIPS 2024, ICML AI for Math Workshop **Best Paper Award**, 2024

We present PutnamBench, a new multilingual benchmark for evaluating the ability of neural theorem-provers to solve competition mathematics problems. PutnamBench consists of 1697 hand-constructed formalizations of 640 theorems sourced from the William Lowell Putnam Mathematical Competition, the premier undergraduate-level mathematics competition in North America. All the theorems have formalizations in Lean 4 and Isabelle; a substantial subset also has Coq formalizations. Proving the theorems requires significant problem-solving ability and proficiency in a broad range of topics taught in undergraduate mathematics courses. We use PutnamBench to evaluate several established neural and symbolic theorem-provers. These approaches can only solve a handful of the PutnamBench problems, establishing the benchmark as a difficult open challenge for research on neural theorem-proving.

Download here

George Tsoukalas

Pages

Page Not Found

About me!

Archive Layout with Content

Posts by Category

Posts by Collection

CV

Page not in menu

Page Archive

Publications

Sitemap

Posts by Tags

Talk map

Talks and presentations

Teaching

Terms and Privacy Policy

Blog posts

Jupyter notebook markdown generator
