Math starts with counting. Ten-fingered humans have 10 symbols to write numbers: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. But fingerless computers prefer binary, which is just 2 symbols: 0 and 1. We both know how to go from one number to the next. In the 19-th century, Giuseppe Peano wrote down the rules of arithmetic based on successors. The successor of 0 is 1. The succesor of 1 is 2 (= 10 in binary). Et cetera. So to put humans and computers on the same page, Lean uses Peano's postulates to count, add, subtract, and multiply the natural numbers. Check out the natural number game from the Lean community.
In the early 20th century, math had a crisis. Bertrand Russell pointed out a logical contradiction in the basis of math: sets. A set is a collection of abstract objects, like numbers. We can use curly brackets for a set, like the natural numbers
$$\htmlData{tooltip=Some prefer to exclude 0 from the naturals but no biggie.}{\mathbb{N} = \{0, 1, 2, ...\},} $$
and the symbol $\in$ for elements of the set, like
$$\htmlData{tooltip=50 is in the naturals but 3.14159... is not in.}{50 \in \mathbb{N}, \text{ but } \pi \notin \mathbb{N}.}$$
We want to define sets by properties, like
$$\{ x \in \mathbb{N}, x > 50\},$$
the set of all natural numbers bigger than 50.
Russell's paradox occurs when you let sets be elements of themselves.
The set of all sets that do not contain themselves,
$$X = \{ x \in \text{Sets}, x \notin x \}$$
creates a non-question: is $X \in X$? If true, then false, and vice versa, just like
This sentence is false.
cannot be true nor false.
Religion encounters the same phenomenon in defining an all-powerful god, when you ask
Can God create a rock too heavy for God to lift?
If all-powerful, both creating and lifting
should work, a contradiction.
The crisis in math was resolved by enhancing the concept of sets. One such enhancement is
Type Theory. Types are just like sets, but there is a hierarchy that prevents Russell's paradox.
$\mathbb{N}$ is a Type in Lean. And regular Types of Type(0) are elements of a higher type, Type(1).
The type level in parentheses is called a universe, which organizes mathematical objects to maintain
logical consistency. Other systems to avoid Russell's paradox are used in other formalization
languages. Coq gets it name from calculus of constructions
, and HOL from higher-order logic
.
Lean uses Type Theory, which may be preferred by computer scientists who have a similar notion in data types, like Bool and Int. Computers crash when programs encounter contradictions, so formal verification, a way of checking programs by strict type-checking, is akin to formal mathematics. Basically computer scientists and mathematicians have the same problem with the same solution: the results are not perfectly trustworthy unless formalized.
Formalization is slow. Human language has developed for thousands if not millions of years, and human mathematicians have an arsenal of shortcuts still to teach the computers. The ratio of formal language to informal language in programming math is known as the de Bruijn factor. Professional mathematicians operate on the premise that they could explain everything formally, with painstaking detail, but there is a trade-off between progress and perfection. The de Bruijn factor tends to be between 1 and 10,000. In the Liquid Tensor Experiment, mathematician Peter Scholze stated a result in roughly 5 lines of human language that took 80,000 lines of code to formalize. The Lean language uses tactics to prune the gap, putting meticulous code into single words that help computers catch up to human shortcuts. The #print theorem command unravels tactics into fully formal computer language in case checking is desired. Artificial intelligence may discover a mathematical proof which is too long for humans to check, even in human language, so the trade-off betweeen trust and testing will always persist.
The mathematical proofs that AI finds will only be valuable if they are understandable to humans, which can happen to various degrees. A classic calculator can display about 10 digits of approximation to $\pi$, which would take a human infant quite a long time to learn individually and be a waste of time to do by hand. Trust is collective, and the people who make calculators can be trusted to implement circuits that do as advertised, including negative consequences for competitors who get it wrong. One common practical value of higher math is cryptographic, allowing you to buy things online. Perfect code cannot be hacked, but hackability is an imperfect test contingent on the skill and resources of the hacker. At Mathgod we endeavor to reduce the gap between computer and human understanding, championing transparency and explainability. We propose a scale that rates the gap. $\pi$ is a number known to humans for centuries if not millenia, so calculators that spit out digits of $\pi$ are a 1 out of 10 on our scale. Fermat's Last Theorem, actively being formalized, was proved without computers, so once formalized it also is 1 out of 10. If some result in fact turns out impossible to formalize, perhaps requiring a new language or philosophical choice, we rank it 0 out of 10 on the Mathgod scale (not intended as an insult). Some modern results, such as the Four Color Theorem, were, like estimating many digits of pi, too tedious to achieve by hand, but their acceptance is not as universal as digits of $\pi$ because the algorithms are more complicated and not understood by as many people. We rate those 2 out of 10. We intend the scale to have plenty of room at the top and be human-centric, but have scientific basis. Using the de Bruijn factor, recency of proof, and breadth of acceptance and practical value, we propose to rate computer proofs of mathematical results, as much a social exercise as a numerical one, inviting debate.