} % <body of poem with lines separated by '\\'> % \end{poem} \newenvironment{question}[1]% {\begin{proof}[#1]}% {\renewcommand{\qed}{}\end{proof}} % Correct usage: \begin{question}{<name of question>} % <text of question> % \end{question} % Uncomment "\makeindex" if you need to recompile the index. % (See The LaTeX Manual for information on how to do so.) % The given index was modified by hand to get nice formatting... %\makeindex \begin{document} \frontmatter \title{ A Problem Course \\ in \\ Mathematical Logic \\ {\em Version 1.6\/} } \author{Stefan Bilaniuk} \date{????} \address{Department of Mathematics\newline \indent Trent University\newline \indent Peterborough, Ontario\newline \indent Canada K9J 7B8} \email{sbilaniuk@trentu.ca} \thanks{} \keywords{logic, computability, incompleteness} \subjclass{03} \begin{abstract} This is a text for a problem-oriented course on mathematical logic and computability.\\ \\ \\ \\ Copyright \copyright\ 1994-2003 Stefan Bilaniuk.\\ Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled ``GNU Free Documentation License''.\\ \\ \\ \\ This work was typeset with \LaTeX, using the \AmS-\LaTeX\ and \AmS Fonts packages of the American Mathematical Society. \end{abstract} \maketitle \tableofcontents % % Preface to "A Problem Course in Mathematical Logic" % \chapter*{Preface} This book is a free text intended to be the basis for a problem-oriented course(s) in mathematical logic and computability for students with some degree of mathematical sophistication. 
Parts I and II cover the basics of propositional and first-order logic respectively, Part III covers the basics of computability using Turing machines and recursive functions, and Part IV covers G\"odel's Incompleteness Theorems. They can be used in various ways for courses of various lengths and mixes of material. The author typically uses Parts I and II for a one-term course on mathematical logic, Part III for a one-term course on computability, and/or much of Part III together with Part IV for a one-term course on computability and incompleteness. In keeping with the modified Moore-method, this book supplies definitions, problems, and statements of results, along with some explanations, examples, and hints. The intent is for the students, individually or in groups, to learn the material by solving the problems and proving the results for themselves. Besides constructive criticism, it will probably be necessary for the instructor to supply further hints or direct the students to other sources from time to time. Just how this text is used will, of course, depend on the instructor and students in question. However, it is probably {\em not\/} appropriate for a conventional lecture-based course nor for a really large class. The material presented in this text is somewhat stripped-down. Various concepts and topics that are often covered in introductory mathematical logic and computability courses are given very short shrift or omitted entirely.\footnote{Future versions of both volumes may include more -- or less! -- material. Feel free to send suggestions, corrections, criticisms, and the like --- I'll feel free to ignore them or use them.} Instructors might consider having students do projects on additional material if they wish to cover it. \subsection*{Prerequisites} The material in this text is largely self-contained, though some knowledge of (very basic) set theory and elementary number theory is assumed at several points. 
A few problems and examples draw on concepts from other parts of mathematics; students who are not already familiar with these should consult texts in the appropriate subjects for the necessary definitions. What is really needed to get anywhere with all of the material developed here is competence in handling abstraction and proofs, including proofs by induction. The experience provided by a rigorous introductory course in abstract algebra, analysis, or discrete mathematics ought to be sufficient. \subsection*{Chapter Dependencies} The following diagram indicates how the parts and chapters depend on one another, with the exception of a few isolated problems or subsections. \vspace{5mm} \begin{picture}(283,283) \put(47,257){\framebox(21,21){1}} \put(173,257){\framebox(21,21){10}} \put(5,215){\framebox(21,21){2}} \put(89,215){\framebox(21,21){3}} \put(131,215){\framebox(21,21){11}} \put(215,215){\framebox(21,21){12}} \put(47,173){\framebox(21,21){4}} \put(173,173){\framebox(21,21){13}} \put(47,131){\framebox(21,21){5}} \put(173,131){\framebox(21,21){14}} \put(5,89){\framebox(21,21){6}} \put(89,89){\framebox(21,21){7}} \put(173,89){\framebox(21,21){15}} \put(47,47){\framebox(21,21){8}} \put(131,47){\framebox(21,21){16}} \put(215,47){\framebox(21,21){17}} \put(47,5){\framebox(21,21){9}} \put(173,5){\framebox(21,21){18}} \put(0,168){\dashbox(115,115)[lb]{I}} \put(0,0){\dashbox(115,157)[lb]{II}} \put(126,126){\dashbox(115,157)[lb]{III}} \put(126,0){\dashbox(115,115)[lb]{IV}} \put(58,257){\vector(-2,-1){42}} % 1 -> 2 \put(58,257){\vector(2,-1){42}} % 1 -> 3 \put(184,257){\vector(-2,-1){42}} % 10 -> 11 \put(184,257){\vector(2,-1){42}} % 10 -> 12 \put(16,215){\vector(2,-1){42}} % 2 -> 4 \put(100,215){\vector(-2,-1){42}} % 3 -> 4 \put(100,215){\vector(0,-1){105}} % 3 ->7 \put(226,215){\vector(-2,-1){42}} % 12 -> 13 \put(184,173){\vector(0,-1){21}} % 13 -> 14 \put(58,131){\vector(-2,-1){42}} % 5 -> 6 \put(58,131){\vector(2,-1){42}} % 5 -> 7 \put(184,131){\vector(0,-1){21}} % 
14 -> 15 \put(110,100){\vector(1,0){63}} % 7 -> 15 \put(16,89){\vector(2,-1){42}} % 6 -> 8 \put(100,89){\vector(-2,-1){42}} % 7 -> 8 \put(184,89){\vector(-2,-1){42}} % 15 -> 16 \put(184,89){\vector(2,-1){42}} % 15 -> 17 \put(58,47){\vector(0,-1){21}} % 8 -> 9 \put(142,47){\vector(2,-1){42}} % 16 -> 18 \put(226,47){\vector(-2,-1){42}} % 17 -> 18 \end{picture} \subsection*{Acknowledgements} Various people and institutions deserve some credit for this text. Foremost are all the people who developed the subject, even though almost no attempt has been made to give due credit to those who developed and refined the ideas, results, and proofs mentioned in this work. In mitigation, it would often be difficult to assign credit fairly because many people were involved, frequently having interacted in complicated ways. Those interested in who did what should start by consulting other texts or reference works covering similar material. In particular, a number of the key papers in the development of modern mathematical logic can be found in \cite{JvH:FFG} and \cite{DA:U}. Others who should be acknowledged include my teachers and colleagues; my students at Trent University who suffered, suffer, and will suffer through assorted versions of this text; Trent University and the taxpayers of Ontario, who paid my salary; Ohio University, where I spent my sabbatical in 1995--96; and all the people and organizations who developed the software and hardware with which this book was prepared. Gregory H. Moore, whose mathematical logic course convinced me that I wanted to do the stuff, deserves particular mention. Any blame properly accrues to the author. 
\subsection*{Availability} The URL of the home page for {\em A Problem Course In Mathematical Logic\/}, with links to \LaTeX, PostScript, and Portable Document Format (pdf) files of the latest available release is: \begin{itemize} \item[] {\tt http://euclid.trentu.ca/math/sb/pcml/} \end{itemize} Please note that to typeset the \LaTeX\ source files, you will need the \AmS-\LaTeX\ and \AmS Fonts packages in addition to \LaTeX. If you have any problems, feel free to contact the author for assistance, preferably by e-mail: \begin{verse} Stefan Bilaniuk\\ Department of Mathematics\\ Trent University\\ Peterborough, Ontario\\ K9J 7B8\\ {\em e-mail\/}: {\tt sbilaniuk@trentu.ca}\\ \end{verse} \subsection*{Conditions} See the {\em GNU Free Documentation License\/} in Appendix~\ref{app:gnufdl} for what you can do with this text. The gist is that you are free to copy, distribute, and use it unchanged, but there are some restrictions on what you can do if you wish to make changes. If you wish to use this text in a manner not covered by the {\em GNU Free Documentation License\/}, please contact the author. \subsection*{Author's Opinion} It's not great, but the price is right! % % Introduction to "A Problem Course in Mathematical Logic" % \chapter*{Introduction} What sets mathematics aside from other disciplines is its reliance on proof as the principal technique for determining truth, where science, for example, relies on (carefully analyzed) experience. So what is a proof? Practically speaking, a proof is any reasoned argument accepted as such by other mathematicians.\footnote{If you are not a mathematician, gentle reader, you are hereby temporarily promoted.} A more precise definition is needed, however, if one wishes to discover what mathematical reasoning can -- or cannot -- accomplish in principle. 
This is one of the reasons for studying mathematical logic, which is also pursued for its own sake and in order to find new tools to use in the rest of mathematics and in related fields. In any case, mathematical logic \index{mathematical logic} \index{logic mathematical} is concerned with formalizing and analyzing the kinds of reasoning used in the rest of mathematics. The point of mathematical logic is not to try to do mathematics {\em per se\/} completely formally --- the practical problems involved in doing so are usually such as to make this an exercise in frustration --- but to study formal logical systems as mathematical objects in their own right in order to (informally!) prove things about them. For this reason, the formal systems developed in this part and the next are optimized to be easy to prove things about, rather than to be easy to use. Natural deductive \index{logic natural deductive} \index{natural deductive logic} systems such as those developed by philosophers to formalize logical reasoning are equally capable in principle and much easier to actually use, but harder to prove things about. Part of the problem with formalizing mathematical reasoning is the necessity of precisely specifying the language(s) in which it is to be done. The natural languages\index{language natural} spoken by humans won't do: they are so complex and continually changing as to be impossible to pin down completely. By contrast,\index{language formal} the languages which underlie formal logical systems are, like programming languages, rigidly defined but much simpler and less flexible than natural languages. A formal logical system also requires the careful specification of the allowable rules of reasoning, plus some notion of how to interpret statements in the underlying language and determine their truth. The real fun lies in the relationship between interpretation of statements, truth, and reasoning. 
The {\em de facto\/} standard for formalizing mathematical systems is first-order logic, \index{logic first-order} \index{first-order logic} and the main thrust of this text is studying it with a view to understanding some of its basic features and limitations. More specifically, Part I of this text is concerned with propositional logic\index{logic propositional}\index{propositional logic}, developed here as a warm-up for the development of first-order logic proper in Part II. Propositional logic \index{logic propositional} \index{propositional logic} attempts to make precise the relationships that certain connectives like {\em not\/},\index{not} {\em and\/},\index{and} {\em or\/},\index{or} and {\em if \dots then\/}\index{if \dots then} are used to express in English. While it has uses, propositional logic is not powerful enough to formalize most mathematical discourse. For one thing, it cannot handle the concepts expressed by the quantifiers {\em all\/}\index{all} and {\em there is\/}\index{there is}. First-order logic \index{logic first-order} \index{first-order logic} adds these notions to those propositional logic handles, and suffices, in principle, to formalize most mathematical reasoning. The greater flexibility and power of first-order logic makes it a good deal more complicated to work with, both in syntax and semantics. However, a number of results about propositional logic carry over to first-order logic with little change. Given that first-order logic can be used to formalize most mathematical reasoning, it provides a natural context in which to ask whether such reasoning can be automated. This question is the {\em Entscheidungsproblem\/}\footnote{ {\em Entscheidungsproblem\/} $\equiv$ decision problem. 
\index{Entscheidungsproblem} \index{decision problem} }: \begin{question}{Entscheidungsproblem} \index{Entscheidungsproblem} Given a set $\Sigma$ of hypotheses and some statement $\varphi$, is there an effective method for determining whether or not the hypotheses in $\Sigma$ suffice to prove $\varphi$? \end{question} Historically, this question arose out of David Hilbert's scheme to secure the foundations of mathematics by axiomatizing mathematics in first-order logic, showing that the axioms in question do not give rise to any contradictions, and that they suffice to prove or disprove every statement (which is where the Entscheidungsproblem comes in). If the answer to the Entscheidungsproblem were ``yes'' in general, the effective method(s) in question might put mathematicians out of business\dots Of course, the statement of the problem begs the question of what ``effective method'' is supposed to mean. In the course of trying to find a suitable formalization of the notion of ``effective method'', mathematicians developed several different abstract models of computation in the 1930's, including recursive functions, $\lambda$-calculus, Turing machines, and grammars\footnote{The development of the theory of computation thus actually began before the development of electronic digital computers. In fact, the computers and programming languages we use today owe much to the abstract models of computation which preceded them. For example, the standard von Neumann architecture for digital computers was inspired by Turing machines and the programming language LISP borrows much of its structure from $\lambda$-calculus.}. Although these models are very different from each other in spirit and formal definition, it turned out that they were all essentially equivalent in what they could do. This suggested the (empirical, not mathematical!) 
principle: \begin{question}{Church's Thesis} \index{Church's Thesis} A function is effectively computable in principle in the real world if and only if it is computable by (any) one of the abstract models mentioned above. \end{question} Part III explores two of the standard formalizations of the notion of ``effective method'', namely Turing machines \index{Turing machine}\index{machine Turing} and recursive functions\index{recursive functions}\index{functions recursive}, showing, among other things, that these two formalizations are actually equivalent. Part IV then uses the tools developed in Parts II and III to answer the Entscheidungsproblem for first-order logic. The answer to the general problem is negative, by the way, though decision procedures do exist for propositional logic, and for some particular first-order languages and sets of hypotheses in these languages. \subsection*{Prerequisites} In principle, not much is needed by way of prior mathematical knowledge to define and prove the basic facts about propositional logic and computability. Some knowledge of the natural numbers and a little set theory suffices; the former will be assumed and the latter is very briefly summarized in Appendix~\ref{ap:sets}. (\cite{JH:OST} is a good introduction to basic set theory in a style not unlike this book's; \cite{PH:NST} is a good one in a more conventional mode.) Competence in handling abstraction and proofs, especially proofs by induction, will be needed, however. In principle, the experience provided by a rigorous introductory course in algebra, analysis, or discrete mathematics ought to be sufficient. \subsection*{Other Sources and Further Reading} \cite{BE:LPL}, \cite{DA:CU}, \cite{HE:MIL}, \cite{JM:IML}, and \cite{YM:CML} are texts which go over large parts of the material covered here (and often much more besides), while \cite{JB:HML} and \cite{CK:MT} are good references for more advanced material. 
A number of the key papers in the development of modern mathematical logic and related topics can be found in \cite{JvH:FFG} and \cite{DA:U}. Entertaining accounts of some related topics may be found in \cite{DH:GEB}, \cite{RP:ENM} and \cite{RP:SOTM}. Those interested in natural deductive systems might try \cite{MB:LB}, which has a very clean presentation. \mainmatter % % Part I of "A Problem Course in Mathematical Logic" % \part{Propositional Logic} % % First chapter of "A Problem Course in Mathematical Logic" % \chapter{Language} \label{ch:one} Propositional logic\index{propositional logic} \index{logic propositional} (sometimes called sentential logic\index{sentential logic}\index{logic sentential}) attempts to formalize the reasoning that can be done with connectives like {\em not\/}, {\em and\/}, {\em or\/}, and {\em if \dots then\/}. We will define the formal language of propositional logic\index{language propositional}, $\mathcal{L}_P$\index{$\mathcal{L}_P$}, by specifying its symbols and rules for assembling these symbols into the formulas of the language. \begin{defn} \label{d:symb} \index{symbols} The {\em symbols\/} of $\mathcal{L}_P$ are: \begin{enumerate} \item Parentheses: ( and ). \index{parentheses} \index{$($} \index{$)$} \item Connectives: $\lnot$ and $\to$. \index{connectives} \index{$\lnot$} \index{$\to$} \item Atomic formulas: $A_0$, $A_1$, $A_2$, \dots, $A_n$, \dots \index{formulas atomic} \index{atomic formulas} \index{$A_n$} \end{enumerate} \end{defn} We still need to specify the ways in which the symbols of $\mathcal{L}_P$ can be put together. \begin{defn} \label{d:form} \index{formula} The {\em formulas\/} of $\mathcal{L}_P$ are those finite sequences or strings of the symbols given in Definition~\ref{d:symb} which satisfy the following rules: \begin{enumerate} \item Every atomic formula is a formula. \item If $\alpha$ is a formula, then $(\lnot \alpha)$ is a formula. 
\item If $\alpha$ and $\beta$ are formulas, then $(\alpha \to \beta)$ is a formula. \item No other sequence of symbols is a formula. \end{enumerate} \end{defn} We will often use lower-case Greek characters\index{Greek characters} to represent formulas, as we did in the definition above, and upper-case Greek characters to represent sets of formulas.\footnote{The Greek alphabet is given in Appendix~\ref{ap:greek}.} All formulas in Chapters~\ref{ch:one}--\ref{ch:four} will be assumed to be formulas of $\mathcal{L}_P$ unless stated otherwise. What do these definitions mean? The parentheses are just punctuation:\index{punctuation} their only purpose is to group other symbols together. (One could get by without them; see Problem~\ref{p:pn}.) $\lnot$\index{$\lnot$} and $\to$\index{$\to$} are supposed to represent the connectives\index{connectives} {\em not\/}\index{not} and {\em if \dots then\/}\index{if \dots then} respectively. The atomic formulas\index{atomic formulas}\index{formulas atomic}, $A_0$, $A_1$, \dots,\index{$A_n$} are meant to represent statements that cannot be broken down any further using our connectives, such as ``The moon is made of cheese.'' Thus, one might translate the English sentence ``If the moon is red, it is not made of cheese'' into the formula $(A_0 \to (\lnot A_1))$ of $\mathcal{L}_P$ by using $A_0$ to represent ``The moon is red'' and $A_1$ to represent ``The moon is made of cheese.'' Note that the truth of the formula depends on the interpretation of the atomic formulas which appear in it. Using the interpretations just given of $A_0$ and $A_1$, the formula $(A_0 \to (\lnot A_1))$ is true, but if we instead interpret $A_0$ and $A_1$ as ``My telephone is ringing'' and ``Someone is calling me'', respectively, $(A_0 \to (\lnot A_1))$ is false. Definition~\ref{d:form} says that every atomic formula is a formula and every other formula is built from shorter formulas using the connectives and parentheses in particular ways. 
For example, $A_{1123}$, $(A_2 \to (\lnot A_0))$, and $(((\lnot A_1) \to (A_1 \to A_7) ) \to A_7)$ are all formulas, but $X_3$, $(A_5)$, $()\lnot A_{41}$, $A_5 \to A_7$, and $(A_2 \to (\lnot A_0)$ are not. \begin{prob} \label{p:one1} Why are the following {\em not\/} formulas of $\mathcal{L}_P$? There might be more than one reason\dots \begin{enumerate} \item $A_{-56}$ \item $(Y \to A)$ \item $(A_7 \leftarrow A_4)$ \item $A_7 \to (\lnot A_5))$ \item $(A_8 A_9 \to A_{1043998}$ \item $(((\lnot A_1) \to (A_\ell \to A_7) \to A_7)$ \end{enumerate} \end{prob} \begin{prob} \label{p:lrp} Show that every formula of $\mathcal{L}_P$ has the same number of left parentheses as it has of right parentheses. \end{prob} \begin{prob} \label{p:one3} Suppose $\alpha$ is any formula of $\mathcal{L}_P$. Let $\ell(\alpha)$ be the length of $\alpha$ as a sequence of symbols and let $p(\alpha)$ be the number of parentheses (counting both left and right parentheses) in $\alpha$. What are the minimum and maximum values of $p(\alpha) / \ell(\alpha)$? \end{prob} \begin{prob} \label{p:one4} Suppose $\alpha$ is any formula of $\mathcal{L}_P$. Let $s(\alpha)$ be the number of atomic formulas in $\alpha$ (counting repetitions) and let $c(\alpha)$ be the number of occurrences of $\to$ in $\alpha$. Show that $s(\alpha) = c(\alpha) + 1$. \end{prob} \begin{prob} \label{p:lof} What are the possible lengths of formulas of $\mathcal{L}_P$? Prove it. \end{prob} \begin{prob} \label{p:pn} \index{parentheses doing without} Find a way for doing without parentheses or other punctuation symbols in defining a formal language for propositional logic. \end{prob} \begin{prop} \label{p:foc} Show that the set of formulas of $\mathcal{L}_P$ is countable. \end{prop} \subsection*{Informal Conventions} At first glance, $\mathcal{L}_P$ may not seem capable of breaking down English sentences with connectives other than {\em not\/} and {\em if \dots then\/}. 
However, the sense of many other connectives\index{connectives} can be captured by these two by using suitable circumlocutions. We will use the symbols $\land$\index{$\land$}, $\lor$\index{$\lor$}, and $\fromto$\index{$\fromto$} to represent {\em and\/}\index{and}, {\em or\/}\index{or},\footnote{We will use {\em or\/} inclusively, so that ``$A$ or $B$'' is still true if both of $A$ and $B$ are true.} and {\em if and only if\/}\index{if and only if} respectively. Since they are not among the symbols of $\mathcal{L}_P$, we will use them as abbreviations\index{abbreviations} for certain constructions involving only $\lnot$ and $\to$. Namely, \begin{itemize} \item $(\alpha \land \beta)$ is short for $(\lnot (\alpha \to (\lnot \beta)))$, \item $(\alpha \lor \beta)$ is short for $( (\lnot \alpha) \to \beta)$, and \item $(\alpha \fromto \beta)$ is short for $((\alpha \to \beta) \land (\beta \to \alpha))$. \end{itemize} Interpreting $A_0$ and $A_1$ as before, for example, one could translate the English sentence ``The moon is red and made of cheese'' as $(A_0 \land A_1)$. (Of course this is really $(\lnot (A_0 \to (\lnot A_1)))$, {\em i.e.\/} ``It is not the case that if the moon is red, it is not made of cheese.'') $\land$, $\lor$, and $\fromto$ were not included among the official symbols of $\mathcal{L}_P$ partly because we can get by without them and partly because leaving them out makes it easier to prove things about $\mathcal{L}_P$. \begin{prob} \label{p:one8} Take a couple of English sentences with several connectives and translate them into formulas of $\mathcal{L}_P$. You may use $\land$, $\lor$, and $\fromto$ if appropriate. \end{prob} \begin{prob} \label{p:one9} Write out $((\alpha \lor \beta) \land (\beta \to \alpha))$ using only $\lnot$ and $\to$. 
\end{prob} For the sake of readability, we will occasionally use some informal conventions that let us get away with writing fewer parentheses:\index{parentheses conventions}\index{conventions, parentheses} \begin{itemize} \item We will usually drop the outermost parentheses in a formula, writing $\alpha \to \beta$ instead of $(\alpha \to \beta)$ and $\lnot \alpha$ instead of $(\lnot \alpha)$. \item We will let $\lnot$ take precedence over $\to$ when parentheses are missing, so $\lnot \alpha \to \beta$ is short for $((\lnot\alpha) \to \beta)$, and fit the informal connectives into this scheme by letting the order of precedence be $\lnot$, $\land$, $\lor$, $\to$, and $\fromto$. \item Finally, we will group repetitions of $\to$, $\lor$, $\land$, or $\fromto$ to the right when parentheses are missing, so $\alpha \to \beta \to \gamma$ is short for $(\alpha \to (\beta \to \gamma))$. \end{itemize} Just like formulas using $\lor$, $\land$, or $\fromto$, formulas in which parentheses have been omitted as above are not official formulas of $\mathcal{L}_P$; rather, they are convenient abbreviations for official formulas of $\mathcal{L}_P$. Note that a precedent for the precedence convention can be found in the way that $\cdot$ commonly takes precedence over $+$ in writing arithmetic formulas. \begin{prob} \label{p:one10} Write out $\lnot (\alpha \fromto \lnot \delta ) \land \beta \to \lnot \alpha \to \gamma$ first with the missing parentheses included and then as an official formula of $\mathcal{L}_P$. \end{prob} The following notion will be needed later on. \begin{defn} \label{d:subf} \index{subformula} \index{$\mathcal{S}$} Suppose $\varphi$ is a formula of $\mathcal{L}_P$. The set of {\em subformulas\/} of $\varphi$, $\mathcal{S}(\varphi)$, is defined as follows. \begin{enumerate} \item If $\varphi$ is an atomic formula, then $\mathcal{S}(\varphi) = \{ \varphi \}$. \item If $\varphi$ is $(\lnot \alpha)$, then $\mathcal{S}(\varphi) = \mathcal{S}(\alpha) \cup \{ (\lnot\alpha) \}$. 
\item If $\varphi$ is $(\alpha \to \beta)$, then $\mathcal{S}(\varphi) = \mathcal{S}(\alpha) \cup \mathcal{S}(\beta) \cup \{ (\alpha \to \beta) \}$. \end{enumerate} \end{defn} For example, if $\varphi$ is $(((\lnot A_1) \to A_7) \to (A_8 \to A_1))$, then $\mathcal{S}(\varphi)$ includes $A_1$, $A_7$, $A_8$, $(\lnot A_1)$, $(A_8 \to A_1)$, $((\lnot A_1) \to A_7)$, and $(((\lnot A_1) \to A_7) \to (A_8 \to A_1))$ itself. Note that if you write out a formula with all the official parentheses, then the subformulas are just the parts of the formula enclosed by matching parentheses, plus the atomic formulas. In particular, every formula is a subformula of itself. Note that some subformulas of formulas involving our informal abbreviations $\lor$, $\land$, or $\fromto$ will be most conveniently written using these abbreviations. For example, if $\psi$ is $A_4 \to A_1 \lor A_4$, then \[ \mathcal{S}(\psi) = \{\, A_1,\, A_4,\, (\lnot A_1),\, (A_1 \lor A_4),\, (A_4 \to (A_1 \lor A_4)) \,\}\, . \] (As an exercise, where did $(\lnot A_1)$ come from?) \begin{prob} \label{p:one11} Find all the subformulas of each of the following formulas. \begin{enumerate} \item $(\lnot ((\lnot A_{56}) \to A_{56}))$ \item $A_9 \to A_8 \to \lnot (A_{78} \to \lnot \lnot A_0)$ \item $\lnot A_0 \land \lnot A_1 \fromto \lnot (A_0 \lor A_1)$ \end{enumerate} \end{prob} \subsection*{Unique Readability} The slightly paranoid --- er, truly rigorous --- might ask whether Definitions \ref{d:symb} and \ref{d:form} actually ensure that the formulas of $\mathcal{L}_P$ are unambiguous, {\em i.e.\/} can be read in only one way according to the rules given in Definition \ref{d:form}. To actually prove this one must add to Definition \ref{d:symb} the requirement that all the symbols of $\mathcal{L}_P$ are distinct and that no symbol is a subsequence of any other symbol. 
With this addition, one can prove the following: \begin{thm}[Unique Readability Theorem] \label{t:ur} \index{Unique Readability Theorem} \index{formula unique readability} \index{unique readability of formulas} A formula of $\mathcal{L}_P$ must satisfy exactly one of conditions 1--3 in Definition \ref{d:form}. \end{thm} % % Second chapter of "A Problem Course in Mathematical Logic" % \chapter{Truth Assignments} \label{ch:two} Whether a given formula $\varphi$ of $\mathcal{L}_P$ is true or false usually depends on how we interpret the atomic formulas which appear in $\varphi$. For example, if $\varphi$ is the atomic formula $A_2$ and we interpret it as ``$2 + 2 = 4$'', it is true, but if we interpret it as ``The moon is made of cheese'', it is false. Since we don't want to commit ourselves to a single interpretation --- after all, we're really interested in general logical relationships --- we will define how any assignment of {\em truth values\/}\index{truth values} $T$\index{$T$} (``true'') and $F$\index{$F$} (``false'') to atomic formulas of $\mathcal{L}_P$ can be extended to all other formulas. We will also get a reasonable definition of what it means for a formula of $\mathcal{L}_P$ to follow logically from other formulas. \begin{defn} \label{d:tras} \index{truth assignment} \index{assignment truth} A {\em truth assignment\/} is a function $v$ whose domain is the set of all formulas of $\mathcal{L}_P$ and whose range is the set $\{ T, F \}$ of truth values, such that: \begin{enumerate} \item $v(A_n)$ is defined for every atomic formula $A_n$. 
\item For any formula $\alpha$, \begin{displaymath} v(\, (\lnot\alpha)\, ) = \begin{cases} T & \text{if $v(\alpha) = F$} \\ F & \text{if $v(\alpha) = T$.} \end{cases} \end{displaymath} \item For any formulas $\alpha$ and $\beta$, \begin{displaymath} v(\, (\alpha \to \beta)\, ) = \begin{cases} F & \text{if $v(\alpha)=T$ and $v(\beta)=F$} \\ T & \text{otherwise.} \end{cases} \end{displaymath} \end{enumerate} \end{defn} Given interpretations of all the atomic formulas of $\mathcal{L}_P$, the corresponding truth assignment would give each atomic formula representing a true statement the value $T$ and every atomic formula representing a false statement the value $F$. Note that we have not defined how to handle any truth values besides $T$ and $F$ in $\mathcal{L}_P$. Logics with other truth values have uses, but are not relevant in most of mathematics. For an example of how non-atomic formulas are given truth values on the basis of the truth values given to their components, suppose $v$ is a truth assignment such that $v(A_0) = T$ and $v(A_1) = F$. Then $v(\, ((\lnot A_1) \to (A_0 \to A_1))\, )$ is determined from $v(\, (\lnot A_1)\, )$ and $v(\, (A_0 \to A_1)\, )$ according to clause 3 of Definition~\ref{d:tras}. In turn, $v(\, (\lnot A_1)\, )$ is determined from $v(A_1)$ according to clause 2 and $v(\, (A_0 \to A_1)\, )$ is determined from $v(A_1)$ and $v(A_0)$ according to clause 3. Finally, by clause 1, our truth assignment must be defined for all atomic formulas to begin with; in this case, $v(A_0) = T$ and $v(A_1) = F$. Thus $v(\, (\lnot A_1)\, ) = T$ and $v(\, (A_0 \to A_1)\, ) = F$, so $v(\, ((\lnot A_1) \to (A_0 \to A_1))\, ) = F$. A convenient way to write out the determination of the truth value of a formula on a given truth assignment is to use a {\em truth table\/}\index{truth table}: list all the subformulas of the given formula across the top in order of length and then fill in their truth values on the bottom from left to right. 
Except for the atomic formulas at the extreme left, the truth value of each subformula will depend on the truth values of the subformulas to its left. For the example above, one gets something like: \[\begin{array}{c|c|c|c|c} A_0 & A_1 & (\lnot A_1) & (A_0 \to A_1) & ((\lnot A_1) \to (A_0 \to A_1)) \\ \hline T & F & T & F & F \end{array}\] \begin{prob} \label{p:two1} Suppose $v$ is a truth assignment such that $v(A_0) = v(A_2) = T$ and $v(A_1) = v(A_3) = F$. Find $v(\alpha)$ if $\alpha$ is: \begin{enumerate} \item $\lnot A_2 \to \lnot A_3$ \item $\lnot A_2 \to A_3$ \item $\lnot ( \lnot A_0 \to A_1)$ \item $A_0 \lor A_1$ \item $A_0 \land A_1$ \end{enumerate} \end{prob} The use of finite truth tables to determine what truth value a particular truth assignment gives a particular formula is justified by the following proposition, which asserts that only the truth values of the atomic formulas in the formula matter. \begin{prop} \label{p:tav} Suppose $\delta$ is any formula and $u$ and $v$ are truth assignments such that $u(A_n) = v(A_n)$ for all atomic formulas $A_n$ which occur in $\delta$. Then $u(\delta) = v(\delta)$. \end{prop} \begin{cor} \label{c:tav} Suppose $u$ and $v$ are truth assignments such that $u(A_n) = v(A_n)$ for every atomic formula $A_n$. Then $u = v$, {\em i.e.\/} $u(\varphi) = v(\varphi)$ for every formula $\varphi$. \end{cor} \begin{prop} \label{p:tif} If $\alpha$ and $\beta$ are formulas and $v$ is a truth assignment, then: \begin{enumerate} \item $v(\lnot \alpha) = T$ if and only if $v(\alpha) = F$; \item $v(\alpha \to \beta) = T$ if and only if $v(\beta) = T$ whenever $v(\alpha) = T$; \item $v(\alpha \land \beta) = T$ if and only if $v(\alpha) = T$ and $v(\beta) = T$; \item $v(\alpha \lor \beta) = T$ if and only if $v(\alpha) = T$ or $v(\beta) = T$; and \item $v(\alpha \fromto \beta) = T$ if and only if $v(\alpha) = v(\beta)$. 
\end{enumerate} \end{prop} Truth tables\index{truth table} are often used even when the formula in question is not broken down all the way into atomic formulas. For example, if $\alpha$ and $\beta$ are any formulas and we know that $\alpha$ is true but $\beta$ is false, then the truth of $(\alpha \to (\lnot \beta))$ can be determined by means of the following table: \[ \begin{array}{c|c|c|c} \alpha & \beta & (\lnot \beta) & (\alpha \to (\lnot \beta)) \\ \hline T & F & T & T \end{array} \] \begin{defn} If $v$ is a truth assignment and $\varphi$ is a formula, we will often say that $v$ {\em satisfies\/}\index{satisfies} $\varphi$ if $v(\varphi) = T$. Similarly, if $\Sigma$ is a set of formulas, we will often say that $v$ satisfies $\Sigma$ if $v(\sigma) = T$ for every $\sigma \in \Sigma$. We will say that $\varphi$ (respectively, $\Sigma$) is {\em satisfiable\/}\index{satisfiable} if there is some truth assignment which satisfies it. \end{defn} \begin{defn} \label{d:taco} \index{tautology} \index{contradiction} A formula $\varphi$ is a {\em tautology\/} if it is satisfied by every truth assignment. A formula $\psi$ is a {\em contradiction\/} if there is no truth assignment which satisfies it. \end{defn} For example, $(A_4 \to A_4)$ is a tautology while $(\lnot (A_4 \to A_4))$ is a contradiction, and $A_4$ is a formula which is neither. One can check whether a given formula is a tautology, contradiction, or neither, by grinding out a complete truth table\index{truth table} for it, with a separate line for each possible assignment of truth values to the atomic subformulas of the formula. For $A_3 \to (A_4 \to A_3)$ this gives \index{truth table} \[\begin{array}{c|c|c|c} A_3 & A_4 & A_4 \to A_3 & A_3 \to (A_4 \to A_3) \\ \hline T & T & T & T \\ T & F & T & T \\ F & T & F & T \\ F & F & T & T \end{array}\] so $A_3 \to (A_4 \to A_3)$ is a tautology. 
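As a further illustration of the same procedure (an example added here for contrast), the complete truth table for $A_3 \to A_4$ is \[\begin{array}{c|c|c} A_3 & A_4 & A_3 \to A_4 \\ \hline T & T & T \\ T & F & F \\ F & T & T \\ F & F & T \end{array}\] Since the last column contains both a $T$ and an $F$, some truth assignments satisfy $A_3 \to A_4$ and some do not, so this formula is neither a tautology nor a contradiction.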
Note that, by Proposition \ref{p:tav}, we need only consider the possible truth values of the atomic sentences which actually occur in a given formula. One can often use truth tables\index{truth table} to determine whether a given formula is a tautology or a contradiction even when it is not broken down all the way into atomic formulas. For example, if $\alpha$ is any formula, then the table \[\begin{array}{c|c|c} \alpha & (\alpha \to \alpha) & (\lnot (\alpha \to \alpha)) \\ \hline T & T & F \\ F & T & F \end{array}\] demonstrates that $(\lnot (\alpha \to \alpha))$ is a contradiction, no matter which formula of $\mathcal{L}_P$ $\alpha$ actually is. \begin{prop} \label{p:two5} If $\alpha$ is any formula, then $((\lnot \alpha) \lor \alpha)$ is a tautology and $((\lnot \alpha) \land \alpha)$ is a contradiction. \end{prop} \begin{prop} \label{p:two6} A formula $\beta$ is a tautology if and only if $\lnot \beta$ is a contradiction. \end{prop} After all this warmup, we are finally in a position to define what it means for one formula to follow logically from other formulas. \begin{defn} \label{d:imp} \index{implies} \index{$\models$} \index{$\nmodels$} A set of formulas $\Sigma$ {\em implies\/} a formula $\varphi$, written as $\Sigma \models \varphi$, if every truth assignment $v$ which satisfies $\Sigma$ also satisfies $\varphi$. We will often write $\Sigma \nmodels \varphi$ if it is not the case that $\Sigma \models \varphi$. In the case where $\Sigma$ is empty, we will usually write $\models \varphi$ instead of $\emptyset \models \varphi$. Similarly, if $\Delta$ and $\Gamma$ are sets of formulas, then $\Delta$ {\em implies\/} $\Gamma$, written as $\Delta \models \Gamma$, if every truth assignment $v$ which satisfies $\Delta$ also satisfies $\Gamma$. \end{defn} For example, $\{\, A_3 ,\, (A_3 \to \lnot A_7) \,\} \models \lnot A_7$, but $\{\, A_8 ,\, (A_5 \to A_8) \,\} \nmodels A_5$. 
(There is a truth assignment which makes $A_8$ and $A_5 \to A_8$ true, but $A_5$ false.) Note that a formula $\varphi$ is a tautology if and only if $\models \varphi$, and a contradiction if and only if $\models (\lnot \varphi)$. \begin{prop} \label{p:two6a} If $\Gamma$ and $\Sigma$ are sets of formulas such that $\Gamma \subseteq \Sigma$, then $\Sigma \models \Gamma$. \end{prop} \begin{prob} \label{p:two7} How can one check whether or not $\Sigma \models \varphi$ for a formula $\varphi$ and a finite set of formulas $\Sigma$? \end{prob} \begin{prop} \label{p:moto} Suppose $\Sigma$ is a set of formulas and $\psi$ and $\rho$ are formulas. Then $\Sigma \cup \{\psi\} \models \rho$ if and only if $\Sigma \models \psi \to \rho$. \end{prop} \begin{prop} \label{p:sanc} A set of formulas $\Sigma$ is satisfiable if and only if there is no contradiction $\chi$ such that $\Sigma \models \chi$. \end{prop} % % Third chapter of "A Problem Course in Mathematical Logic" % \chapter{Deductions} \label{ch:three} In this chapter we develop a way of defining logical implication that does not rely on any notion of truth, but only on manipulating sequences of formulas, namely formal proofs or deductions. (Of course, any way of defining logical implication had better be compatible with that given in Chapter \ref{ch:two}.) To define these, we first specify a suitable set of formulas which we can use freely as premisses in deductions. \begin{defn} \index{axiom schema} \index{axiom} \index{A1} \index{A2} \index{A3} The three {\em axiom schemas\/} of $\mathcal{L}_P$ are: \begin{description} \item[A1] $(\alpha \to (\beta \to \alpha))$ \item[A2] $((\alpha \to (\beta \to \gamma)) \to ((\alpha \to \beta) \to (\alpha \to \gamma)))$ \item[A3] $(((\lnot\beta)\to (\lnot\alpha)) \to ( ((\lnot\beta) \to \alpha) \to \beta ) )$.
\end{description} Replacing $\alpha$, $\beta$, and $\gamma$ by particular formulas of $\mathcal{L}_P$ in any one of the schemas A1, A2, or A3 gives an {\em axiom\/} of $\mathcal{L}_P$. \end{defn} For example, $(A_1 \to (A_4 \to A_1))$ is an axiom, being an instance of axiom schema A1, but $(A_9 \to (\lnot A_0))$ is not an axiom as it is not an instance of any of the schemas. As had better be the case, every axiom is always true: \begin{prop} \label{p:axta} Every axiom of $\mathcal{L}_P$ is a tautology. \end{prop} Second, we specify our one (and only!) rule of inference\index{rule of inference}\index{inference rule}.\footnote{Natural deductive systems, which are usually more convenient to actually execute deductions in than the system being developed here, compensate for having few or no axioms by having many rules of inference.} \begin{defn}[Modus Ponens] \index{Modus Ponens} Given the formulas $\varphi$ and $(\varphi \to \psi)$, one may infer $\psi$. \end{defn} We will usually refer to Modus Ponens by its initials, MP.\index{MP} Like any rule of inference worth its salt, MP preserves truth. \begin{prop} \label{p:snd} Suppose $\varphi$ and $\psi$ are formulas. Then $\{\, \varphi ,\, (\varphi \to \psi) \,\} \models \psi$. \end{prop} With axioms and a rule of inference in hand, we can execute formal proofs in $\mathcal{L}_P$. \begin{defn} \label{d:ded} \index{deduction} \index{proof} Let $\Sigma$ be a set of formulas. A {\em deduction\/} or {\em proof\/} from $\Sigma$ in $\mathcal{L}_P$ is a finite sequence $\varphi_1 \varphi_2 \dots \varphi_n$ of formulas such that for each $k \le n$, \begin{enumerate} \item $\varphi_k$ is an axiom, or \item $\varphi_k \in \Sigma$, or \item there are $i,j < k$ such that $\varphi_k$ follows from $\varphi_i$ and $\varphi_j$ by MP. \end{enumerate} A formula of $\Sigma$ appearing in the deduction is called a {\em premiss\/}\index{premiss}.
$\Sigma$ {\em proves\/}\index{proves} a formula $\alpha$, written as $\Sigma \proves \alpha$,\index{$\proves$} if $\alpha$ is the last formula of a deduction from $\Sigma$. We'll usually write $\proves \alpha$ for $\emptyset \proves \alpha$, and take $\Sigma \proves \Delta$ to mean that $\Sigma \proves \delta$ for every formula $\delta \in \Delta$. \end{defn} In order to make it easier to verify that an alleged deduction really is one, we will number the formulas in a deduction, write them out in order on separate lines, and give a justification for each formula. Like the additional connectives and conventions for dropping parentheses in Chapter \ref{ch:one}, this is not officially a part of the definition of a deduction. \begin{exmp} \label{e:one} Let us show that $\proves \varphi \to \varphi$. \begin{enumerate} \item $(\varphi \to ((\varphi \to \varphi) \to \varphi)) \to ((\varphi \to (\varphi \to \varphi)) \to (\varphi \to \varphi))$ \hfill A2 \item $\varphi \to ((\varphi \to \varphi) \to \varphi)$ \hfill A1 \item $(\varphi \to (\varphi \to \varphi)) \to (\varphi \to \varphi)$ \hfill 1,2 MP \item $\varphi \to (\varphi \to \varphi)$ \hfill A1 \item $\varphi \to \varphi$ \hfill 3,4 MP \end{enumerate} Hence $\proves \varphi \to \varphi$, as desired. Note the indication, beside each mention of MP, of the formulas from which formulas 3 and 5 follow. \end{exmp} \begin{exmp} \label{e:two} Let us show that $\{\, \alpha \to \beta,\, \beta \to \gamma \,\} \proves \alpha \to \gamma$.
\begin{enumerate} \item $(\beta \to \gamma) \to (\alpha \to (\beta \to \gamma))$ \hfill A1 \item $\beta \to \gamma$ \hfill Premiss \item $\alpha \to (\beta \to \gamma)$ \hfill 1,2 MP \item $(\alpha \to (\beta \to \gamma)) \to ((\alpha \to \beta) \to (\alpha \to \gamma))$ \hfill A2 \item $(\alpha \to \beta) \to (\alpha \to \gamma)$ \hfill 4,3 MP \item $\alpha \to \beta$ \hfill Premiss \item $\alpha \to \gamma$ \hfill 5,6 MP \end{enumerate} Hence $\{\, \alpha \to \beta,\, \beta \to \gamma \,\} \proves \alpha \to \gamma$, as desired. \end{exmp} It is frequently convenient to save time and effort by simply referring to a deduction one has already done instead of writing it again as part of another deduction. If you do so, please make sure you appeal only to deductions that have already been carried out. \begin{exmp} \label{e:three} Let us show that $\proves (\lnot \alpha \to \alpha) \to \alpha$. \begin{enumerate} \item $( \lnot \alpha \to \lnot \alpha) \to (( \lnot \alpha \to \alpha ) \to \alpha )$ \hfill A3 \item $\lnot\alpha \to \lnot\alpha$ \hfill Example~\ref{e:one} \item $(\lnot \alpha \to \alpha) \to \alpha$ \hfill 1,2 MP \end{enumerate} Hence $\proves (\lnot \alpha \to \alpha) \to \alpha$, as desired. To be completely formal, one would have to insert the deduction given in Example \ref{e:one} (with $\varphi$ replaced by $\lnot \alpha$ throughout) in place of line 2 above and renumber the old line 3. \end{exmp} \begin{prob} \label{p:ded} Show that if $\alpha$, $\beta$, and $\gamma$ are formulas, then \begin{enumerate} \item $\{\, \alpha \to (\beta \to \gamma),\, \beta\,\} \proves \alpha \to \gamma$ \item $\proves \alpha \lor \lnot \alpha$ \end{enumerate} \end{prob} \begin{exmp} \label{e:four} Let us show that $\proves \lnot\lnot \beta \to \beta$. 
\begin{enumerate} \item $(\lnot\beta \to \lnot\lnot\beta) \to ((\lnot\beta \to \lnot\beta) \to \beta)$ \hfill A3 \item $\lnot\lnot\beta \to (\lnot\beta \to \lnot\lnot\beta)$ \hfill A1 \item $\lnot\lnot\beta \to ((\lnot\beta \to \lnot\beta) \to \beta)$ \hfill 1,2 Example~\ref{e:two} \item $\lnot\beta \to \lnot\beta$ \hfill Example~\ref{e:one} \item $\lnot\lnot \beta \to \beta$ \hfill 3,4 Problem~\ref{p:ded}.1 \end{enumerate} Hence $\proves \lnot\lnot \beta \to \beta$, as desired. \end{exmp} Certain general facts are sometimes handy: \begin{prop} \label{p:three3a} If $\varphi_1 \varphi_2 \dots \varphi_n$ is a deduction of $\mathcal{L}_P$, then $\varphi_1 \dots \varphi_\ell$ is also a deduction of $\mathcal{L}_P$ for any $\ell$ such that $1 \le \ell \le n$. \end{prop} \begin{prop} \label{p:dmp} If $\Gamma \proves \delta$ and $\Gamma \proves \delta \to \beta$, then $\Gamma \proves \beta$. \end{prop} \begin{prop} \label{p:three5} If $\Gamma \subseteq \Delta$ and $\Gamma \proves \alpha$, then $\Delta \proves \alpha$. \end{prop} \begin{prop} \label{p:three6} If $\Gamma \proves \Delta$ and $\Delta \proves \sigma$, then $\Gamma \proves \sigma$. \end{prop} The following theorem often lets one take substantial shortcuts when trying to show that certain deductions exist in $\mathcal{L}_P$, even though it doesn't give us the deductions explicitly. \begin{thm}[Deduction Theorem] \label{t:ded} \index{Deduction Theorem} If $\Sigma$ is any set of formulas and $\alpha$ and $\beta$ are any formulas, then $\Sigma \proves \alpha \to \beta$ if and only if $\Sigma \cup \{ \alpha \} \proves \beta$. \end{thm} \begin{exmp} \label{e:five} Let us show that $\proves \varphi \to \varphi$. By the Deduction Theorem it is enough to show that $\{ \varphi \} \proves \varphi$, which is trivial: \begin{enumerate} \item $\varphi$ \hfill Premiss \end{enumerate} Compare this to the deduction in Example~\ref{e:one}. 
\end{exmp} \begin{prob} \label{p:prov} Appealing to previous deductions and the Deduction Theorem if you wish, show that: \begin{enumerate} \item $\{ \delta, \lnot\delta \} \proves \gamma$ \item $\proves \varphi \to \lnot\lnot \varphi$ \item $\proves (\lnot \beta \to \lnot \alpha) \to (\alpha \to \beta)$ \item $\proves (\alpha \to \beta) \to (\lnot \beta \to \lnot \alpha)$ \item $\proves (\beta \to \lnot \alpha) \to (\alpha \to \lnot \beta)$ \item $\proves (\lnot \beta \to \alpha) \to (\lnot \alpha \to \beta)$ \item $\proves \sigma \to (\sigma \lor \tau)$ \item $\{ \alpha \land \beta \} \proves \beta$ \item $\{ \alpha \land \beta \} \proves \alpha$ \end{enumerate} \end{prob} % % Chapter 4 of "A Problem Course in Mathematical Logic" % \chapter{Soundness and Completeness} \label{ch:four} How are deduction and implication related, given that they were defined in completely different ways? We have some evidence that they behave alike; compare, for example, Proposition~\ref{p:moto} and the Deduction Theorem. It had better be the case that if there is a deduction of a formula $\varphi$ from a set of premisses $\Sigma$, then $\varphi$ is implied by $\Sigma$. (Otherwise, what's the point of defining deductions?) It would also be nice for the converse to hold: whenever $\varphi$ is implied by $\Sigma$, there is a deduction of $\varphi$ from $\Sigma$. (So anything which is true can be proved.) The Soundness and Completeness Theorems say that both ways do hold, so $\Sigma \proves \varphi$ if and only if $\Sigma \models \varphi$, {\em i.e.\/} $\proves$ and $\models$ are equivalent for propositional logic. One direction is relatively straightforward to prove\dots \begin{thm}[Soundness Theorem] \label{t:psnd} \index{Soundness Theorem} If $\Delta$ is a set of formulas and $\alpha$ is a formula such that $\Delta \proves \alpha$, then $\Delta \models \alpha$. \end{thm} \dots but for the other direction we need some additional concepts. 
\begin{defn} \label{d:cons} A set of formulas $\Gamma$ is {\em inconsistent\/}\index{inconsistent} if $\Gamma \proves \lnot(\alpha \to \alpha)$ for some formula $\alpha$, and {\em consistent\/}\index{consistent} if it is not inconsistent. \end{defn} For example, $\{ A_{41} \}$ is consistent by Proposition~\ref{p:stoc}, but it follows from Problem~\ref{p:prov} that $\{ A_{13}, \lnot A_{13} \}$ is inconsistent. \begin{prop} \label{p:stoc} If a set of formulas is satisfiable, then it is consistent. \end{prop} \begin{prop} \label{p:inca} Suppose $\Delta$ is an inconsistent set of formulas. Then $\Delta \proves \psi$ for any formula $\psi$. \end{prop} \begin{prop} \label{p:cmp} Suppose $\Sigma$ is an inconsistent set of formulas. Then there is a finite subset $\Delta$ of $\Sigma$ such that $\Delta$ is inconsistent. \end{prop} \begin{cor} \label{c:cmp} A set of formulas $\Gamma$ is consistent if and only if every finite subset of $\Gamma$ is consistent. \end{cor} To obtain the Completeness Theorem requires one more definition. \begin{defn} \label{d:mxc} \index{maximally consistent} \index{consistent maximally} A set of formulas $\Sigma$ is {\em maximally consistent} if $\Sigma$ is consistent but $\Sigma \cup \{\varphi\}$ is inconsistent for any $\varphi \notin \Sigma$. \end{defn} That is, a set of formulas is maximally consistent if it is consistent, but there is no way to add any other formula to it and keep it consistent. \begin{prob} \label{p:emc} Suppose $v$ is a truth assignment. Show that $\Sigma = \{\, \varphi \mid v(\varphi) = T \,\}$ is maximally consistent. \end{prob} We will need some facts concerning maximally consistent theories. \begin{prop} \label{p:inmc} If $\Sigma$ is a maximally consistent set of formulas, $\varphi$ is a formula, and $\Sigma \proves \varphi$, then $\varphi \in \Sigma$. \end{prop} \begin{prop} \label{p:nimc} Suppose $\Sigma$ is a maximally consistent set of formulas and $\varphi$ is a formula. 
Then $\lnot\varphi \in \Sigma$ if and only if $\varphi \notin \Sigma$. \end{prop} \begin{prop} \label{p:iimc} Suppose $\Sigma$ is a maximally consistent set of formulas and $\varphi$ and $\psi$ are formulas. Then $\varphi \to \psi \in \Sigma$ if and only if $\varphi \notin \Sigma$ or $\psi \in \Sigma$. \end{prop} It is important to know that any consistent set of formulas can be expanded to a maximally consistent set. \begin{thm} \label{t:exmc} Suppose $\Gamma$ is a consistent set of formulas. Then there is a maximally consistent set of formulas $\Sigma$ such that $\Gamma \subseteq \Sigma$. \end{thm} Now for the main event! \begin{thm} \label{t:saco} A set of formulas is consistent if and only if it is satisfiable. \end{thm} Theorem~\ref{t:saco} gives the equivalence between $\proves$ and $\models$ in slightly disguised form. \begin{thm}[Completeness Theorem] \label{t:pcmpl} \index{Completeness Theorem} If $\Delta$ is a set of formulas and $\alpha$ is a formula such that $\Delta \models \alpha$, then $\Delta \proves \alpha$. \end{thm} It follows that anything provable from a given set of premisses must be true if the premisses are, and {\em vice versa\/}. The fact that $\proves$ and $\models$ are actually equivalent can be very convenient in situations where one is easier to use than the other. For example, most parts of Problems~\ref{p:ded} and \ref{p:prov} are much easier to do with truth tables instead of deductions, even if one makes use of the Deduction Theorem. Finally, one more consequence of Theorem~\ref{t:saco}. \begin{thm}[Compactness Theorem] \label{t:pcpct} \index{Compactness Theorem} A set of formulas $\Gamma$ is satisfiable if and only if every finite subset of $\Gamma$ is satisfiable. \end{thm} We will not look at any uses of the Compactness Theorem now, but we will consider a few applications of its counterpart for first-order logic in Chapter~\ref{ch:nine}. 
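To illustrate the earlier remark that truth tables can stand in for deductions, here is a sketch (added for illustration, and treating $\alpha$, $\beta$, and $\gamma$ as arbitrary formulas in the manner of Chapter~\ref{ch:two}) of how the conclusion of Example~\ref{e:two} could be reached semantically: \[\begin{array}{c|c|c|c|c|c} \alpha & \beta & \gamma & \alpha \to \beta & \beta \to \gamma & \alpha \to \gamma \\ \hline T & T & T & T & T & T \\ T & T & F & T & F & F \\ T & F & T & F & T & T \\ T & F & F & F & T & F \\ F & T & T & T & T & T \\ F & T & F & T & F & T \\ F & F & T & T & T & T \\ F & F & F & T & T & T \end{array}\] In every row where both $\alpha \to \beta$ and $\beta \to \gamma$ receive the value $T$ (the first, fifth, seventh, and eighth rows), $\alpha \to \gamma$ also receives the value $T$. Hence $\{\, \alpha \to \beta,\, \beta \to \gamma \,\} \models \alpha \to \gamma$, and the Completeness Theorem then gives $\{\, \alpha \to \beta,\, \beta \to \gamma \,\} \proves \alpha \to \gamma$ without exhibiting a deduction.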
\chapter*{Hints for Chapters 1--4} % % Hints for Chapter 1 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:one}} \begin{clue}{p:one1} Symbols not in the language, unbalanced parentheses, lack of connectives\dots \end{clue} \begin{clue}{p:lrp} The key idea is to exploit the recursive structure of Definition~\ref{d:form} and proceed by induction on the length of the formula or on the number of connectives in the formula. As this is an idea that will be needed repeatedly in Parts I, II, and IV, here is a skeleton of the argument in this case: \begin{proof} By induction on $n$, the number of connectives ({\em i.e.\/} occurrences of $\lnot$ and/or $\to$) in a formula $\varphi$ of $\mathcal{L}_P$, we will show that any formula $\varphi$ must have just as many left parentheses as right parentheses. {\em Base step:\/} ($n=0$) If $\varphi$ is a formula with no connectives, then it must be atomic. (Why?) Since an atomic formula has no parentheses at all, it has just as many left parentheses as right parentheses. {\em Induction hypothesis:\/} ($n \le k$) Assume that any formula with $n \le k$ connectives has just as many left parentheses as right parentheses. {\em Induction step:\/} ($n=k+1$) Suppose $\varphi$ is a formula with $n=k+1$ connectives. It follows from Definition~\ref{d:form} that $\varphi$ must be either \begin{enumerate} \item $(\lnot \alpha)$ for some formula $\alpha$ with $k$ connectives or \item $(\beta \to \gamma)$ for some formulas $\beta$ and $\gamma$ which have $\le k$ connectives each. \end{enumerate} (Why?) We handle the two cases separately: \begin{enumerate} \item By the induction hypothesis, $\alpha$ has just as many left parentheses as right parentheses. Since $\varphi$, {\em i.e.\/} $(\lnot \alpha)$, has one more left parenthesis and one more right parenthesis than $\alpha$, it must have just as many left parentheses as right parentheses as well.
\item By the induction hypothesis, $\beta$ and $\gamma$ each have the same number of left parentheses as right parentheses. Since $\varphi$, {\em i.e.\/} $(\beta \to \gamma)$, has one more left parenthesis and one more right parenthesis than $\beta$ and $\gamma$ together have, it must have just as many left parentheses as right parentheses as well. \end{enumerate} It follows by induction that every formula $\varphi$ of $\mathcal{L}_P$ has just as many left parentheses as right parentheses. \end{proof} \end{clue} \begin{clue}{p:one3} Compute $p(\alpha) / \ell(\alpha)$ for a number of examples and look for patterns. Getting a minimum value should be pretty easy. \end{clue} \begin{clue}{p:one4} Proceed by induction on the length of or on the number of connectives in the formula. \end{clue} \begin{clue}{p:lof} Construct examples of formulas of all the short lengths that you can, and then see how you can make longer formulas out of short ones. \end{clue} \begin{clue}{p:pn} Hewlett-Packard sells calculators that use such a trick. A similar one is used in Definition \ref{d:ter}. \end{clue} \begin{clue}{p:foc} Observe that $\mathcal{L}_P$ has countably many symbols and that every formula is a finite sequence of symbols. The relevant facts from set theory are given in Appendix \ref{ap:sets}. \end{clue} \begin{clue}{p:one8} Stick several simple statements together with suitable connectives. \end{clue} \begin{clue}{p:one9} This should be straightforward. \end{clue} \begin{clue}{p:one10} Ditto. \end{clue} \begin{clue}{p:one11} To make sure you get all the subformulas, write out the formula in official form with all the parentheses. \end{clue} \begin{clue}{t:ur} Proceed by induction on the length or number of connectives of the formula. \end{clue} % % Hints for Chapter 2 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:two}} \begin{clue}{p:two1} Use truth tables.
\end{clue} \begin{clue}{p:tav} Proceed by induction on the length of $\delta$ or on the number of connectives in $\delta$. \end{clue} \begin{clue}{c:tav} Use Proposition \ref{p:tav}. \end{clue} \begin{clue}{p:tif} In each case, unwind Definition \ref{d:tras} and the definitions of the abbreviations. \end{clue} \begin{clue}{p:two5} Use truth tables. \end{clue} \begin{clue}{p:two6} Use Definition \ref{d:taco} and Proposition \ref{p:tif}. \end{clue} \begin{clue}{p:two6a} If a truth assignment satisfies every formula in $\Sigma$ and every formula in $\Gamma$ is also in $\Sigma$, then\dots \end{clue} \begin{clue}{p:two7} Grinding out an appropriate truth table will do the job. Why is it important that $\Sigma$ be finite here? \end{clue} \begin{clue}{p:moto} Use Definition \ref{d:imp} and Proposition \ref{p:tif}. \end{clue} \begin{clue}{p:sanc} Use Definitions \ref{d:taco} and \ref{d:imp}. If you have trouble trying to prove one of the two directions directly, try proving its contrapositive instead. \end{clue} % % Hints for Chapter 3 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:three}} \begin{clue}{p:axta} Truth tables are probably the best way to do this. \end{clue} \begin{clue}{p:snd} Look up Proposition \ref{p:tif}. \end{clue} \begin{clue}{p:ded} There are usually many different deductions with a given conclusion, so you shouldn't take the following hints as gospel. \begin{enumerate} \item Use A2 and A1. \item Recall what $\lor$ abbreviates. \end{enumerate} \end{clue} \begin{clue}{p:three3a} You need to check that $\varphi_1 \dots \varphi_\ell$ satisfies the three conditions of Definition~\ref{d:ded}; you know $\varphi_1 \dots \varphi_n$ does. \end{clue} \begin{clue}{p:dmp} Put together a deduction of $\beta$ from $\Gamma$ from the deductions of $\delta$ and $\delta \to \beta$ from $\Gamma$. \end{clue} \begin{clue}{p:three5} Examine Definition \ref{d:ded} carefully. 
\end{clue} \begin{clue}{p:three6} The key idea is similar to that for proving Proposition \ref{p:dmp}. \end{clue} \begin{clue}{t:ded} One direction follows from Proposition \ref{p:dmp}. For the other direction, proceed by induction on the length of the shortest proof of $\beta$ from $\Sigma \cup \{ \alpha \}$. \end{clue} \begin{clue}{p:prov} Again, don't take these hints as gospel. Try using the Deduction Theorem in each case, plus \begin{enumerate} \item A3. \item A3 and Problem \ref{p:ded}. \item A3. \item A3, Problem \ref{p:ded}, and Example \ref{e:two}. \item Some of the above parts and Problem \ref{p:ded}. \item Ditto. \item Use the definition of $\lor$ and one of the above parts. \item Use the definition of $\land$ and one of the above parts. \item Aim for $\lnot\alpha \to (\alpha \to \lnot\beta)$ as an intermediate step. \end{enumerate} \end{clue} % % Hints for Chapter 4 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:four}} \begin{clue}{t:psnd} Use induction on the length of the deduction and Proposition \ref{p:snd}. \end{clue} \begin{clue}{p:stoc} Assume, by way of contradiction, that the given set of formulas is inconsistent. Use the Soundness Theorem to show that it can't be satisfiable. \end{clue} \begin{clue}{p:inca} First show that $\{ \lnot(\alpha \to \alpha) \} \proves \psi$. \end{clue} \begin{clue}{p:cmp} Note that deductions are finite sequences of formulas. \end{clue} \begin{clue}{c:cmp} Use Proposition \ref{p:cmp}. \end{clue} \begin{clue}{p:emc} Use Proposition \ref{p:stoc}, the definition of $\Sigma$, and Proposition \ref{p:tif}. \end{clue} \begin{clue}{p:inmc} Assume, by way of contradiction, that $\varphi \notin \Sigma$. Use Definition \ref{d:mxc} and the Deduction Theorem to show that $\Sigma$ must be inconsistent. \end{clue} \begin{clue}{p:nimc} Use Definition \ref{d:mxc} and Problem \ref{p:prov}. \end{clue} \begin{clue}{p:iimc} Use Definition \ref{d:mxc} and Proposition \ref{p:nimc}. 
\end{clue} \begin{clue}{t:exmc} Use Proposition \ref{p:foc} and induction on a list of all the formulas of $\mathcal{L}_P$. \end{clue} \begin{clue}{t:saco} One direction is just Proposition \ref{p:stoc}. For the other, expand the set of formulas in question to a maximally consistent set of formulas $\Sigma$ using Theorem \ref{t:exmc}, and define a truth assignment $v$ by setting $v(A_n) = T$ if and only if $A_n \in \Sigma$. Now use induction on the length of $\varphi$ to show that $\varphi \in \Sigma$ if and only if $v$ satisfies $\varphi$. \end{clue} \begin{clue}{t:pcmpl} Prove the contrapositive using Theorem \ref{t:saco}. \end{clue} \begin{clue}{t:pcpct} Put Corollary \ref{c:cmp} together with Theorem \ref{t:saco}. \end{clue} % % Part II of "A Problem Course in Mathematical Logic" % \part{First-Order Logic} % % Chapter 5 of "A Problem Course in Mathematical Logic" % \chapter{Languages} \label{ch:five} As noted in the Introduction, propositional logic has obvious deficiencies as a tool for mathematical reasoning. First-order logic\index{first-order logic}\index{logic first-order} remedies enough of these to be adequate for formalizing most ordinary mathematics. It does have enough in common with propositional logic to let us recycle some of the material in Chapters 1--4. A few informal words about how first-order languages\index{first-order languages}\index{language first-order} work are in order. In mathematics one often deals with structures consisting of a set of elements plus various operations on them or relations among them. To cite three common examples, a group is a set of elements plus a binary operation on these elements satisfying certain conditions, a field is a set of elements plus two binary operations on these elements satisfying certain conditions, and a graph is a set of elements plus a binary relation with certain properties. 
In most such cases, one frequently uses symbols naming the operations or relations in question, symbols for variables which range over the set of elements, symbols for logical connectives such as {\em not\/} and {\em for all\/}, plus auxiliary symbols such as parentheses, to write formulas which express some fact about the structure in question. For example, if $(G,\cdot)$ is a group, one might express the associative law by writing something like \[ \forall x\, \forall y\, \forall z\; x\cdot (y\cdot z) = (x\cdot y)\cdot z\, , \] it being understood that the variables range over the set $G$ of group elements. A formal language to do as much will require some or all of these: symbols for various logical notions and for variables, some for functions or relations, plus auxiliary symbols. It will also be necessary to specify rules for putting the symbols together to make formulas, for interpreting the meaning and determining the truth of these formulas, and for making inferences in deductions. For a concrete example, consider elementary number theory. The set of elements under discussion is the set of natural numbers $\mathbb N = \{\, 0,1,2,3,4, \dots \}$. One might need symbols or names for certain interesting numbers, say $0$ and $1$; for variables over $\mathbb{N}$ such as $n$ and $x$; for functions on $\mathbb{N}$, say $\cdot$ and $+$; and for relations, say $=$, $<$, and $|$. In addition, one is likely to need symbols for punctuation, such as $($ and $)$; for logical connectives, such as $\lnot$ and $\to$; and for quantifiers, such as $\forall$ (``for all'') and $\exists$ (``there exists''). A statement of mathematical English such as ``For all $n$ and $m$, if $n$ divides $m$, then $n$ is less than or equal to $m$'' can then be written as a cool formula like \[ \forall n \forall m \, (n \mid m \to (n < m \lor n = m)) \, . \] The extra power of first-order logic comes at a price: greater complexity.
First, there are many first-order languages one might wish to use, practically one for each subject, or even problem, in mathematics.\footnote{It is possible to formalize almost all of mathematics in a single first-order language, like that of set theory or category theory. However, trying to actually do most mathematics in such a language is so hard as to be pointless.} We will set up our definitions and general results, however, to apply to a wide range of them.\footnote{Specifically, to countable one-sorted first-order languages with equality.} As with $\mathcal{L}_P$, our formal language for propositional logic, first-order languages are defined by specifying their symbols and how these may be assembled into formulas. \begin{defn} \label{d:sym} \index{symbols} \index{symbols logical} \index{symbols non-logical} The {\em symbols\/} of a first-order language $\mathcal{L}$\index{$\mathcal{L}$} include: \begin{enumerate} \item Parentheses: $($ and $)$. \index{parentheses} \index{$($} \index{$)$} \item Connectives: $\lnot$ and $\to$. \index{connectives} \index{$\lnot$} \index{$\to$} \item Quantifier: $\forall$. \index{quantifier universal} \index{$\forall$} \item Variables: $v_0$, $v_1$, $v_2$, \dots, $v_n$, \dots \index{variable} \index{$v_n$} \item Equality: $=$. \index{equality} \index{$=$} \item A (possibly empty) set of {\em constant\/} symbols. \index{constant} \item For each $k \ge 1$, a (possibly empty) set of {\em $k$-place function\/} symbols. \index{function $k$-place} \index{function} \item For each $k \ge 1$, a (possibly empty) set of {\em $k$-place relation\/} (or {\em predicate\/}) symbols. \index{relation $k$-place} \index{relation} \index{predicate} \end{enumerate} The symbols described in parts 1--5 are the {\em logical\/} symbols of $\mathcal{L}$, shared by every first-order language, and the rest are the {\em non-logical\/} symbols of $\mathcal{L}$, which usually depend on the language's intended use.
\end{defn} \begin{note} It is possible to define first-order languages without $=$, so $=$ is considered a non-logical symbol by many authors. While such languages have some uses, they are uncommon in ordinary mathematics. Observe that any first-order language $\mathcal{L}$ has countably many logical symbols. It may have uncountably many symbols if it has uncountably many non-logical symbols. {\em Unless explicitly stated otherwise, we will assume that every first-order language we encounter has only countably many non-logical symbols.\/} Most of the results we will prove actually hold for countable and uncountable first-order languages alike, but some require heavier machinery to prove for uncountable languages. \end{note} Just as in $\mathcal{L}_P$, the parentheses are just punctuation\index{punctuation} while the connectives, $\lnot$\index{$\lnot$} and $\to$\index{$\to$}, are intended to express {\em not\/}\index{not} and {\em if \dots then\/}\index{if \dots then}. However, the rest of the symbols are new and are intended to express ideas that cannot be handled by $\mathcal{L}_P$. The quantifier\index{quantifier universal} symbol, $\forall$\index{$\forall$}, is meant to represent {\em for all\/}\index{for all}, and is intended to be used with the variable symbols, {\em e.g.\/} $\forall v_4$. The constant\index{constant} symbols are meant to be names for particular elements of the structure under discussion. $k$-place function\index{function $k$-place} symbols are meant to name particular functions which map $k$-tuples of elements of the structure to elements of the structure. $k$-place relation\index{relation $k$-place} symbols are intended to name particular $k$-place relations among elements of the structure.\footnote{Intuitively, a relation or predicate\index{predicate} expresses some (possibly arbitrary) relationship among one or more objects. 
For example, ``$n$ is prime'' is a 1-place relation on the natural numbers, $<$ is a 2-place or binary relation\index{relation binary} on the rationals, and $\vec{a} \times (\vec{b} \times \vec{c}) = \vec{0}$ is a 3-place relation on $\mathbb{R}^3$. Formally, a $k$-place relation\index{relation $k$-place} on a set $X$ is just a subset of $X^k$, {\em i.e.\/} the collection of sequences of length $k$ of elements of $X$ for which the relation is true.} Finally, $=$\index{$=$} is a special binary relation\index{relation binary} symbol intended to represent equality\index{equality}. \begin{exmp} \label{e:lannt} Since the logical symbols are always the same, first-order languages are usually defined by specifying the non-logical symbols. A formal language for elementary number theory like that unofficially described above, call it $\mathcal{L}_{NT}$\index{$\mathcal{L}_{NT}$}, can be defined as follows. \begin{itemize} \item Constant symbols: $0$ and $1$ \item Two $2$-place function symbols: $+$ and $\cdot$ \item Two binary relation symbols: $<$ and $|$ \end{itemize} Each of these symbols is intended to represent the same thing it does in informal mathematical usage: $0$ and $1$ are intended to be names for the numbers zero and one, $+$ and $\cdot$ names for the operations of addition and multiplication, and $<$ and $|$ names for the relations ``less than'' and ``divides''. (Note that we could, in principle, interpret things completely differently -- let $0$ represent the number forty-one, $+$ the operation of exponentiation, and so on -- or even use the language to talk about a different structure -- say the real numbers, $\mathbb{R}$, with $0$, $1$, $+$, $\cdot$, and $<$ representing what they usually do and, just for fun, $|$ interpreted as ``is not equal to''. More on this in Chapter \ref{ch:six}.) We will usually use the same symbols in our formal languages that we use informally for various common mathematical objects. 
This convention\index{convention for common symbols} can occasionally cause confusion if it is not clear whether an expression involving these symbols is supposed to be an expression in a formal language or not. \end{exmp} \begin{exmp} \label{e:lan} Here are some other first-order languages. Recall that we need only specify the non-logical symbols in each case and note that some parts of Definitions \ref{d:ter} and \ref{d:for} may be irrelevant for a given language\index{language} if it is missing the appropriate sorts of non-logical symbols. \begin{enumerate} \item The language of pure equality, $\mathcal{L}_=$:\index{$\mathcal{L}_=$} \begin{itemize} \item No non-logical symbols at all. \end{itemize} \item A language for fields, $\mathcal{L}_F$:\index{$\mathcal{L}_F$} \begin{itemize} \item Constant symbols: $0$, $1$ \item $2$-place function symbols: $+$, $\cdot$ \end{itemize} \item A language for set theory, $\mathcal{L}_S$:\index{$\mathcal{L}_S$} \begin{itemize} \item $2$-place relation symbol: $\in$ \end{itemize} \item A language for linear orders, $\mathcal{L}_O$:\index{$\mathcal{L}_O$} \begin{itemize} \item $2$-place relation symbol: $<$ \end{itemize} \item Another language for elementary number theory, $\mathcal{L}_N$:\index{$\mathcal{L}_N$} \begin{itemize} \item Constant symbol: $0$ \item $1$-place function symbol: $S$ \item $2$-place function symbols: $+$, $\cdot$, $E$ \end{itemize} Here $0$ is intended to represent zero, $S$ the successor function, {\em i.e.\/} $S(n) = n + 1$, and $E$ the exponential function, {\em i.e.\/} $E(n,m) = n^m$. \item A ``worst-case'' countable language, $\mathcal{L}_1$:\index{$\mathcal{L}_1$} \begin{itemize} \item Constant symbols: $c_1$, $c_2$, $c_3$, \dots \item For each $k \ge 1$, $k$-place function symbols: $f^k_1$, $f^k_2$, $f^k_3$, \dots \item For each $k \ge 1$, $k$-place relation symbols: $P^k_1$, $P^k_2$, $P^k_3$, \dots \end{itemize} This language has no use except as an abstract example. 
\end{enumerate} \end{exmp} It remains to specify how to form valid formulas from the symbols of a first-order language $\mathcal{L}$. This will be more complicated than it was for $\mathcal{L}_P$. In fact, we first need to define a type of expression in $\mathcal{L}$ which has no counterpart in propositional logic. \begin{defn} \label{d:ter} \index{term} The {\em terms\/} of a first-order language $\mathcal{L}$ are those finite sequences of symbols of $\mathcal{L}$ which satisfy the following rules: \begin{enumerate} \item Every variable symbol $v_n$ is a term. \item Every constant symbol $c$ is a term. \item If $f$ is a $k$-place function symbol and $t_1$, \dots, $t_k$ are terms, then $f t_1 \dots t_k$ is also a term. \item Nothing else is a term. \end{enumerate} \end{defn} That is, a term is an expression which represents some (possibly indeterminate) element of the structure under discussion. For example, in $\mathcal{L}_{NT}$ or $\mathcal{L}_N$, $+ v_0 v_1$ (informally, $v_0 + v_1$ ) is a term, though precisely which natural number it represents depends on what values are assigned to the variables $v_0$ and $v_1$. \begin{prob} \label{p:five1} Which of the following are terms of one of the languages defined in Examples \ref{e:lannt} and \ref{e:lan}? If so, which of these language(s) are they terms of; if not, why not? \begin{enumerate} \item $\cdot v_2$ \item $+ 0 \cdot + v_6 1 1$ \item $|1+v_30$ \item $(<E101 \to +11)$ \item $++\cdot +00000$ \item $f^3_4f^2_7 c_4 v_9 c_1 v_4$ \item $\cdot v_5 (+1v_8)$ \item $< v_6 v_2$ \item $1 + 0$ \end{enumerate} \end{prob} Note that in languages with no function symbols all terms have length one. \begin{prob} \label{p:five2} Choose one of the languages defined in Examples \ref{e:lannt} and \ref{e:lan} which has terms of length greater than one and determine the possible lengths of terms of this language. \end{prob} \begin{prop} \label{p:fcmt} The set of terms of a countable first-order language $\mathcal{L}$ is countable. 
\end{prop} Having defined terms, we can finally define first-order formulas. \begin{defn} \label{d:for} \index{formula} The {\em formulas\/} of a first-order language $\mathcal{L}$ are the finite sequences of the symbols of $\mathcal{L}$ satisfying the following rules: \begin{enumerate} \item If $P$ is a $k$-place relation symbol and $t_1$, \dots, $t_k$ are terms, then $P t_1 \dots t_k$ is a formula. \item If $t_1$ and $t_2$ are terms, then $= t_1 t_2$ is a formula. \item If $\alpha$ is a formula, then $(\lnot \alpha)$ is a formula. \item If $\alpha$ and $\beta$ are formulas, then $(\alpha \to \beta)$ is a formula. \item If $\varphi$ is a formula and $v_n$ is a variable, then $\forall v_n \varphi$ is a formula. \item Nothing else is a formula. \end{enumerate} Formulas of form 1 or 2 will often be referred to as the {\em atomic formulas\/}\index{atomic formulas} \index{formulas atomic} of $\mathcal{L}$. \end{defn} Note that three of the conditions in Definition \ref{d:for} are borrowed directly from propositional logic. As before, we will exploit the way formulas are built up in making definitions and in proving results by induction on the length of a formula. We will also recycle the use of lower-case Greek characters\index{Greek characters} to refer to formulas and of upper-case Greek characters to refer to sets of formulas. \begin{prob} \label{p:five4} Which of the following are formulas of one of the languages defined in Examples \ref{e:lannt} and \ref{e:lan}? If so, which of these language(s) are they formulas of; if not, why not? 
\begin{enumerate} \item $= 0 + v_7 \cdot 1 v_3$ \item $(\lnot = v_1 v_1)$ \item $(| v_2 0 \to \cdot 0 1)$ \item $(\lnot \forall v_5 (= v_5 v_5))$ \item $< +01 |v_1v_3$ \item $(v_3 = v_3 \to \forall v_5 \, v_3 = v_5)$ \item $\forall v_6 (= v_6 0 \to \forall v_9 (\lnot | v_9 v_6))$ \item $\forall v_8 < +11 v_4$ \end{enumerate} \end{prob} \begin{prob} \label{p:five5} Show that every formula of a first-order language has the same number of left parentheses as of right parentheses. \end{prob} \begin{prob} \label{p:five6} Choose one of the languages defined in Examples \ref{e:lannt} and \ref{e:lan} and determine the possible lengths of formulas of this language. \end{prob} \begin{prop} \label{p:five7} A countable first-order language $\mathcal{L}$ has countably many formulas. \end{prop} In practice, devising a formal language intended to deal with a particular (kind of) structure isn't the end of the job: one must also specify axioms\index{axiom} in the language that the structure(s) one wishes to study should satisfy. Defining satisfaction is officially done in the next chapter, but it is usually straightforward to unofficially figure out what a formula in the language is supposed to mean. \begin{prob} \label{p:for} In each case, write down a formula of the given language expressing the given informal statement. \begin{enumerate} \item ``Addition is associative'' in $\mathcal{L}_F$. \item ``There is an empty set'' in $\mathcal{L}_S$. \item ``Between any two distinct elements there is a third element'' in $\mathcal{L}_O$. \item ``$n^0 = 1$ for every $n$ different from $0$'' in $\mathcal{L}_N$. \item ``There is only one thing'' in $\mathcal{L}_=$. \end{enumerate} \end{prob} \begin{prob} \label{p:fole} Define first-order languages to deal with the following structures and, in each case, an appropriate set of axioms in your language: \begin{enumerate} \item Groups. \item Graphs. \item Vector spaces. 
\end{enumerate} \end{prob} We will need a few additional concepts and facts about formulas of first-order logic later on. First, what are the subformulas of a formula? \begin{prob} \label{p:five10} \index{subformula} Define the set of subformulas of a formula $\varphi$ of a first-order language $\mathcal{L}$. \end{prob} For example, if $\varphi$ is \[ (((\lnot \forall v_1\, (\lnot =v_1c_7) ) \to P^2_3 v_5 v_8) \to \forall v_8 ( = v_8 f^3_5 c_0 v_1 v_5 \to P^1_2 v_8 )) \] in the language $\mathcal{L}_1$, then the set of subformulas of $\varphi$, $\mathcal{S}(\varphi)$, ought to include \begin{itemize} \item $=v_1c_7$, $P^2_3 v_5 v_8$, $= v_8 f^3_5 c_0 v_1 v_5$, $P^1_2 v_8$, \item $(\lnot =v_1c_7)$, $(= v_8 f^3_5 c_0 v_1 v_5 \to P^1_2 v_8)$, \item $\forall v_1\, (\lnot =v_1c_7)$, $\forall v_8 (= v_8 f^3_5 c_0 v_1 v_5 \to P^1_2 v_8)$, \item $(\lnot \forall v_1\, (\lnot =v_1c_7))$, \item $((\lnot \forall v_1\, (\lnot =v_1c_7) ) \to P^2_3 v_5 v_8)$, and \item $(((\lnot \forall v_1\, (\lnot =v_1c_7) ) \to P^2_3 v_5 v_8) \to \forall v_8 (= v_8 f^3_5 c_0 v_1 v_5 \to P^1_2 v_8 ))$ itself. \end{itemize} Second, we will need a concept that has no counterpart in propositional logic. \begin{defn} \label{d:frv} \index{variable free} \index{free variable} \index{variable bound} \index{bound variable} Suppose $x$ is a variable of a first-order language $\mathcal{L}$. Then $x$ {\em occurs free\/} in a formula $\varphi$ of $\mathcal{L}$ is defined as follows: \begin{enumerate} \item If $\varphi$ is atomic, then $x$ occurs free in $\varphi$ if and only if $x$ occurs in $\varphi$. \item If $\varphi$ is $(\lnot \alpha)$, then $x$ occurs free in $\varphi$ if and only if $x$ occurs free in $\alpha$. \item If $\varphi$ is $(\beta \to \delta)$, then $x$ occurs free in $\varphi$ if and only if $x$ occurs free in $\beta$ or in $\delta$. \item If $\varphi$ is $\forall v_k \, \psi$, then $x$ occurs free in $\varphi$ if and only if $x$ is different from $v_k$ and $x$ occurs free in $\psi$. 
\end{enumerate} An occurrence of $x$ in $\varphi$ which is not free is said to be {\em bound\/}. A formula $\sigma$ of $\mathcal{L}$ in which no variable occurs free is said to be a {\em sentence\/}.\index{sentence} \end{defn} Part 4 is the key: it asserts that an occurrence of a variable $x$ is bound instead of free if it is in the ``scope'' of an occurrence of $\forall x$. For example, $v_7$ is free in $\forall v_5 \, = v_5 v_7$, but $v_5$ is not. Different occurrences of a given variable in a formula may be free or bound, depending on where they are; {\em e.g.\/} $v_6$ occurs both free and bound in $\forall v_0 \, (= v_0 f^1_3 v_6 \to (\lnot \forall v_6 \, P^1_9 v_6))$. \begin{prob} \label{p:five11} \index{scope of a quantifier} \index{quantifier, scope of} Give a precise definition of the scope of a quantifier. \end{prob} Note the distinction between sentences and ordinary formulas introduced in the last part of Definition~\ref{d:frv}. As we shall see, sentences are often more tractable and useful theoretically than ordinary formulas. \begin{prob} \label{p:five12} Which of the formulas you gave in solving Problem~\ref{p:for} are sentences? \end{prob} Finally, we will eventually need to consider a relationship between first-order languages. \begin{defn} \label{d:exlan} \index{extension of a language} \index{language extension of} A first-order language $\mathcal{L}'$ is an {\em extension\/} of a first-order language $\mathcal{L}$, sometimes written as $\mathcal{L} \subseteq \mathcal{L}'$, if every non-logical symbol of $\mathcal{L}$ is a non-logical symbol of the same kind of $\mathcal{L}'$. \end{defn} For example, every first-order language is an extension of $\mathcal{L}_=$. \begin{prob} \label{p:five13} Which of the languages given in Example \ref{e:lan} are extensions of other languages given in Example \ref{e:lan}? \end{prob} \begin{prop} \label{p:exlan} Suppose $\mathcal{L}$ is a first-order language and $\mathcal{L}'$ is an extension of $\mathcal{L}$. 
Then every formula $\varphi$ of $\mathcal{L}$ is a formula of $\mathcal{L}'$. \end{prop} \subsection*{Common Conventions} As with propositional logic, we will often use abbreviations\index{abbreviations} and informal conventions to simplify the writing of formulas in first-order languages. In particular, we will use the same additional connectives we used in propositional logic, plus an additional quantifier, $\exists$ (``there exists''): \begin{itemize} \item $(\alpha \land \beta)$ is short for $(\lnot (\alpha \to (\lnot \beta)))$\index{$\land$}. \item $(\alpha \lor \beta)$ is short for $( (\lnot \alpha) \to \beta)$\index{$\lor$}. \item $(\alpha \fromto \beta)$ is short for $((\alpha \to \beta) \land (\beta \to \alpha))$\index{$\fromto$}. \item $\exists v_k \varphi$ is short for $(\lnot \forall v_k (\lnot \varphi))$\index{$\exists$}. \end{itemize} ($\forall$\index{$\forall$} is often called the universal quantifier \index{universal quantifier} \index{quantifier universal} and $\exists$ is often called the existential quantifier.) \index{existential quantifier} \index{quantifier existential} Parentheses \index{parentheses conventions} \index{conventions, parentheses} will often be omitted in formulas according to the same conventions we used in propositional logic, with the modification that $\forall$ and $\exists$ take precedence over all the logical connectives: \begin{itemize} \item We will usually drop the outermost parentheses in a formula, writing $\alpha \to \beta$ instead of $(\alpha \to \beta)$ and $\lnot \alpha$ instead of $(\lnot \alpha)$. \item We will let $\forall$ take precedence over $\lnot$, and $\lnot$ take precedence over $\to$ when parentheses are missing, and fit the informal abbreviations into this scheme by letting the order of precedence be $\forall$, $\exists$, $\lnot$, $\land$, $\lor$, $\to$, and $\fromto$. 
\item Finally, we will group repetitions of $\to$, $\lor$, $\land$, or $\fromto$ to the right when parentheses are missing, so $\alpha \to \beta \to \gamma$ is short for $(\alpha \to (\beta \to \gamma))$. \end{itemize} For example, $\exists v_k \lnot \alpha \to \forall v_n \beta$ is short for $((\lnot \forall v_k (\lnot (\lnot\alpha))) \to \forall v_n \beta)$. On the other hand, we will sometimes add parentheses and arrange things in unofficial ways to make terms and formulas easier to read. In particular we will often write \begin{enumerate} \item $f(t_1,\dots,t_k)$ for $ft_1\dots t_k$ if $f$ is a $k$-place function symbol and $t_1$, \dots, $t_k$ are terms, \item $s \circ t$ for $\circ st$ if $\circ$ is a $2$-place function symbol and $s$ and $t$ are terms, \item $P(t_1, \dots, t_k)$ for $Pt_1 \dots t_k$ if $P$ is a $k$-place relation symbol and $t_1$, \dots, $t_k$ are terms, \item $s \bullet t$ for $\bullet st$ if $\bullet$ is a $2$-place relation symbol and $s$ and $t$ are terms, \item $s=t$ for $=st$ if $s$ and $t$ are terms, and \item enclose terms in parentheses to group them. \end{enumerate} Thus, we could write the formula $= +1 \cdot 0 v_6 \cdot 11$ of $\mathcal{L}_{NT}$ as $1 + (0 \cdot v_6) = 1 \cdot 1$. As was observed in Example \ref{e:lannt}, it is customary in devising a formal language to recycle the same symbols used informally for the given objects. In situations where we want to talk about symbols without committing ourselves to a particular one, such as when talking about first-order languages in general, we will often use ``generic'' choices: \begin{itemize} \item $a$, $b$, $c$, \dots for constant\index{constant} symbols; \item $x$, $y$, $z$, \dots for variable\index{variable} symbols; \item $f$, $g$, $h$, \dots for function\index{function} symbols; \item $P$, $Q$, $R$, \dots for relation\index{relation} symbols; and \item $r$, $s$, $t$, \dots for generic terms\index{term}. 
\end{itemize} These can be thought of as variables in the metalanguage\footnote{The metalanguage is the language\index{language}, mathematical English in this case, in which we talk {\em about\/} a language. The theorems we prove about formal logic are, strictly speaking, metatheorems\index{metatheorem}, as opposed to the theorems\index{theorem} proved within a formal logical system. For more of this kind of stuff, read some philosophy\dots}\index{metalanguage} ranging over different kinds of objects of first-order logic, much as we're already using lower-case Greek characters as variables which range over formulas. (In fact, we have already used some of these conventions in this chapter\dots) \subsection*{Unique Readability} The slightly paranoid might ask whether Definitions \ref{d:sym}, \ref{d:ter} and \ref{d:for} actually ensure that the terms and formulas of a first-order language $\mathcal{L}$ are unambiguous, {\em i.e.\/} cannot be read in more than one way. As with $\mathcal{L}_P$, to actually prove this one must assume that all the symbols of $\mathcal{L}$ are distinct and that no symbol is a subsequence of any other symbol. It then follows that: \begin{thm} \label{t:urt} \index{unique readibility of terms} Any term of a first-order language $\mathcal{L}$ satisfies exactly one of conditions 1--3 in Definition \ref{d:ter}. \end{thm} \begin{thm}[Unique Readability Theorem] \label{t:urf} \index{unique readability of formulas} \index{formula unique readability} \index{Unique Readability Theorem} Any formula of a first-order language satisfies exactly one of conditions 1--5 in Definition \ref{d:for}. \end{thm} % % Chapter 6 of "A Problem Course in Mathematical Logic" % \chapter{Structures and Models} \label{ch:six} Defining truth and implication in first-order logic is a lot harder than it was in propositional logic. 
First-order languages are intended to deal with mathematical objects like groups or linear orders, so it makes little sense to speak of the truth of a formula without specifying a context. For example, one can write down a formula expressing the commutative law in a language for group theory, $\forall x\, \forall y\, x \cdot y = y \cdot x$, but whether it is true or not depends on which group we're dealing with. It follows that we need to make precise which mathematical objects or structures a given first-order language can be used to discuss and how, given a suitable structure, formulas in the language are to be interpreted. Such a structure for a given language should supply most of the ingredients needed to interpret formulas of the language. Throughout this chapter, let $\mathcal{L}$ be an arbitrary fixed countable first-order language. All formulas will be assumed to be formulas of $\mathcal{L}$ unless stated otherwise. \begin{defn} \label{d:str} \index{structure} A {\em structure\/} $\mathfrak{M}$ for $\mathcal{L}$ consists of the following: \begin{enumerate} \item A non-empty set $M$, often written as $|\mathfrak{M}|$, called the {\em universe\/} of $\mathfrak{M}$.\index{universe} \item For each constant symbol $c$ of $\mathcal{L}$, an element $c^{\mathfrak{M}}$ of $M$.\index{constant} \item For each $k$-place function symbol $f$ of $\mathcal{L}$, a function $f^{\mathfrak{M}} : M^k \to M$, {\em i.e.\/} a $k$-place function on $M$.\index{function} \item For each $k$-place relation symbol $P$ of $\mathcal{L}$, a relation $P^{\mathfrak{M}} \subseteq M^k$, {\em i.e.\/} a $k$-place relation on $M$.\index{relation} \end{enumerate} \end{defn} That is, a structure supplies an underlying set of elements plus interpretations for the various non-logical symbols of the language: constant symbols are interpreted by particular elements of the underlying set, function symbols by functions on this set, and relation symbols by relations among elements of this set. 
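
As a small illustration of Definition~\ref{d:str} (a sketch only; the name $\mathfrak{A}$ is chosen here for convenience and is not notation used elsewhere in the text), the intended interpretation of the language $\mathcal{L}_{NT}$ of Example~\ref{e:lannt} can be packaged as a structure $\mathfrak{A} = (\mathbb{N}, 0, 1, +, \cdot, <, |)$ by taking
\[
\begin{aligned}
|\mathfrak{A}| &= \mathbb{N}, \\
0^{\mathfrak{A}} &= \text{zero}, \qquad 1^{\mathfrak{A}} = \text{one}, \\
+^{\mathfrak{A}}(a,b) &= a + b, \qquad \cdot^{\mathfrak{A}}(a,b) = ab, \\
<^{\mathfrak{A}} &= \{\, (a,b) \in \mathbb{N}^2 \mid \text{$a$ is less than $b$} \,\}, \\
|^{\mathfrak{A}} &= \{\, (a,b) \in \mathbb{N}^2 \mid \text{$a$ divides $b$} \,\}.
\end{aligned}
\]
Here the superscripted symbols on the left are symbols of $\mathcal{L}_{NT}$, while the right-hand sides refer to the usual numbers, operations, and relations on $\mathbb{N}$.
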
It is customary to use upper-case ``gothic'' characters\index{gothic characters} such as $\mathfrak{M}$\index{$\mathfrak{M}$} and $\mathfrak{N}$\index{$\mathfrak{N}$} for structures. For example, consider $\mathfrak{Q} = (\mathbb{Q}, <)$, where $<$ is the usual ``less than'' relation on the rationals. This is a structure for $\mathcal{L}_O$, the language for linear orders defined in Example~\ref{e:lan}; it supplies a $2$-place relation to interpret the language's $2$-place relation symbol. $\mathfrak{Q}$ is {\em not\/} the only possible structure for $\mathcal{L}_O$: $(\mathbb{R}, < )$, $(\{0\}, \emptyset)$, and $(\mathbb{N}, \mathbb{N}^2)$ are three others among infinitely many more. (Note that the relation interpreting the symbol $<$ need not be a linear order on the universe: $\mathbb{N}^2$, for instance, is not. One can ensure that a structure satisfy various conditions beyond what Definition~\ref{d:str} guarantees by requiring appropriate formulas to be true when interpreted in the structure.) On the other hand, $(\mathbb{R})$ is not a structure for $\mathcal{L}_O$ because it lacks a binary relation to interpret the symbol $<$ by, while $(\mathbb{N}, 0, 1, +, \cdot, |, <)$ is not a structure for $\mathcal{L}_O$ because it has two binary relations where $\mathcal{L}_O$ has a symbol only for one, plus constants and functions for which $\mathcal{L}_O$ lacks symbols. \begin{prob} \label{p:six1} The first-order languages referred to below were all defined in Example~\ref{e:lan}. \begin{enumerate} \item Is $(\emptyset)$ a structure for $\mathcal{L}_=$? \item Determine whether $\mathfrak Q = (\mathbb{Q}, <)$ is a structure for each of $\mathcal{L}_=$, $\mathcal{L}_F$, and $\mathcal{L}_S$. \item Give three different structures for $\mathcal{L}_F$ which are not fields. \end{enumerate} \end{prob} To determine what it means for a given formula to be true in a structure for the corresponding language, we will also need to specify how to interpret the variables when they occur free. 
(Bound variables have the associated quantifier to tell us what to do.) \begin{defn} \label{d:ass} \index{assignment} Let $V = \{\, v_0, v_1, v_2, \dots \,\}$ be the set of all variable\index{variable} symbols of $\mathcal{L}$ and suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$. A function $s : V \to |\mathfrak{M}|$ is said to be an {\em assignment\/} for $\mathfrak{M}$. \end{defn} Note that these are {\em not\/} truth assignments like those for $\mathcal{L}_P$. An assignment just interprets each variable in the language by an element of the universe of the structure. Also, as long as the universe of the structure has more than one element, any variable can be interpreted in more than one way. Hence there are usually many different possible assignments for a given structure. \begin{exmp} \label{e:as} Consider the structure $\mathfrak{R} = (\mathbb{R},0,1,+,\cdot)$ for $\mathcal{L}_F$. Each of the following functions $V \to \mathbb{R}$ is an assignment for $\mathfrak{R}$: \begin{enumerate} \item $p(v_n) = \pi$ for each $n$, \item $r(v_n) = e^n$ for each $n$, and \item $s(v_n) = n + 1$ for each $n$. \end{enumerate} In fact, {\em every\/} function $V \to \mathbb{R}$ is an assignment for $\mathfrak{R}$. \end{exmp} In order to use assignments to determine whether formulas are true in a structure, we need to know how to use an assignment to interpret each term of the language as an element of the universe. \begin{defn} \label{d:exas} \index{assignment extended} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$ and $s \colon V \to |\mathfrak{M}|$ is an assignment for $\mathfrak{M}$. Let $T$ be the set of all terms\index{term} of $\mathcal{L}$. 
Then the {\em extended assignment\/} $\mathbf{s} \colon T \to |\mathfrak{M}|$ is defined inductively as follows: \begin{enumerate} \item For each variable $x$, $\mathbf{s}(x) = s(x)$.\index{variable} \item For each constant symbol $c$, $\mathbf{s}(c) = c^{\mathfrak{M}}$.\index{constant} \item For every $k$-place function symbol $f$ and terms $t_1$, \dots, $t_k$, \[ \mathbf{s}(f t_1 \dots t_k) = f^{\mathfrak{M}} (\mathbf{s}(t_1), \dots, \mathbf{s}(t_k) ). \]\index{function} \end{enumerate} \end{defn} \begin{exmp} \label{e:exas} Let $\mathfrak{R}$ be the structure for $\mathcal{L}_F$ given in Example \ref{e:as}, and let $\mathbf{p}$, $\mathbf{r}$, and $\mathbf{s}$ be the extended assignments corresponding to the assignments $p$, $r$, and $s$ defined in Example \ref{e:as}. Consider the term $+ \cdot v_6 v_0 + 0 v_3$ of $\mathcal{L}_F$. Then: \begin{enumerate} \item $\mathbf{p}(+ \cdot v_6 v_0 + 0 v_3) = \pi^2 + \pi$, \item $\mathbf{r}(+ \cdot v_6 v_0 + 0 v_3) = e^6 + e^3$, and \item $\mathbf{s}(+ \cdot v_6 v_0 + 0 v_3) = 11$. \end{enumerate} Here's why for the last one: since $s(v_6) = 7$, $s(v_0) = 1$, $s(v_3) = 4$, and $\mathbf{s}(0) = 0$ (by part 2 of Definition \ref{d:exas}), it follows from part 3 of Definition \ref{d:exas} that $\mathbf{s}(+ \cdot v_6 v_0 + 0 v_3) = (7 \cdot 1) + (0 + 4) = 7 + 4 = 11$. \end{exmp} \begin{prob} \label{pb:exas} $\mathfrak{N} = (\mathbb{N}, 0, S, +, \cdot, E)$ is a structure for $\mathcal{L}_N$. Let $s \colon V \to \mathbb{N}$ be the assignment defined by $s(v_k) = k + 1$. What are $\mathbf{s}( E + v_{19} v_1 \cdot 0 v_{45})$ and $\mathbf{s}(SSS + E 0 v_6 v_7 )$? \end{prob} \begin{prop} \label{p:eau} $\mathbf s$ is unique, {\em i.e.\/} given an assignment $s$, no other function $T \to |\mathfrak{M}|$ satisfies conditions 1--3 in Definition~\ref{d:exas}. \end{prop} With Definitions \ref{d:ass} and \ref{d:exas} in hand, we can take our first cut at defining what it means for a first-order formula to be true. 
\begin{defn} \label{d:sat} \index{assignment} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$, $s$ is an assignment for $\mathfrak{M}$, and $\varphi$ is a formula of $\mathcal{L}$. Then $\mathfrak{M} \models \varphi [s]$ is defined as follows:\index{$\models$} \begin{enumerate} \item If $\varphi$ is $t_1 = t_2$ for some terms $t_1$ and $t_2$, then $\mathfrak{M} \models \varphi [s]$ if and only if $\mathbf{s}(t_1) = \mathbf{s}(t_2)$. \item If $\varphi$ is $P t_1 \dots t_k$ for some $k$-place relation symbol $P$ and terms $t_1$, \dots, $t_k$, then $\mathfrak{M} \models \varphi [s]$ if and only if $(\mathbf{s}(t_1), \dots, \mathbf{s}(t_k)) \in P^{\mathfrak{M}}$, {\em i.e.\/} $P^{\mathfrak{M}}$ is true of $(\mathbf{s}(t_1), \dots, \mathbf{s}(t_k))$. \item If $\varphi$ is $(\lnot \psi)$ for some formula $\psi$, then $\mathfrak{M} \models \varphi [s]$ if and only if it is not the case that $\mathfrak{M} \models \psi [s]$. \item If $\varphi$ is $(\alpha \to \beta)$, then $\mathfrak{M} \models \varphi [s]$ if and only if $\mathfrak{M} \models \beta [s]$ whenever $\mathfrak{M} \models \alpha [s]$, {\em i.e.\/} unless $\mathfrak{M} \models \alpha [s]$ but not $\mathfrak{M} \models \beta [s]$. \item If $\varphi$ is $\forall x \, \delta$ for some variable $x$, then $\mathfrak{M} \models \varphi [s]$ if and only if for all $m \in |\mathfrak{M}|$, $\mathfrak{M} \models \delta [s(x|m)]$, where $s(x|m)$ is the assignment given by \begin{displaymath} s(x|m)(v_k) = \begin{cases} s(v_k) & \text{if $v_k$ is different from $x$} \\ m & \text{if $v_k$ is $x$.} \end{cases} \end{displaymath} \end{enumerate} If $\mathfrak{M} \models \varphi [s]$, we shall say that $\mathfrak{M}$ {\em satisfies $\varphi$ on assignment\/}\index{satisfies} $s$ or that $\varphi$ {\em is true in $\mathfrak{M}$ on assignment\/}\index{truth in a structure} $s$. 
We will often write $\mathfrak{M} \nmodels \varphi [s]$ if it is not the case that $\mathfrak{M} \models \varphi [s]$.\index{$\nmodels$} Also, if $\Gamma$ is a set of formulas of $\mathcal{L}$, we shall take $\mathfrak{M} \models \Gamma [s]$ to mean that $\mathfrak{M} \models \gamma [s]$ for every formula $\gamma$ in $\Gamma$ and say that $\mathfrak{M}$ {\em satisfies $\Gamma$ on assignment\/} $s$. Similarly, we shall take $\mathfrak{M} \nmodels \Gamma [s]$ to mean that $\mathfrak{M} \nmodels \gamma [s]$ for {\em some\/} formula $\gamma$ in $\Gamma$. \end{defn} Clauses 1 and 2 are pretty straightforward and clauses 3 and 4 are essentially identical to the corresponding parts of Definition~\ref{d:tras}. The key clause is 5, which says that $\forall$ should be interpreted as ``for all elements of the universe''. \begin{exmp} Let $\mathfrak{R}$ be the structure for $\mathcal{L}_F$ and $s$ the assignment for $\mathfrak{R}$ given in Example \ref{e:as}, and consider the formula $\forall v_1\, (= v_3 \cdot 0 v_1 \to = v_3 0)$ of $\mathcal{L}_F$. 
We can verify that $\mathfrak{R} \models \forall v_1\, (= v_3 \cdot 0 v_1 \to = v_3 0) \, [s]$ as follows: \[ \begin{aligned} \mbox{} &\mathfrak{R} \models \forall v_1\, (= v_3 \cdot 0 v_1 \to = v_3 0) \, [s] \\ \iff &\text{for all $a \in |\mathfrak{R}|$,\ } \mathfrak{R} \models (= v_3 \cdot 0 v_1 \to = v_3 0) \, [s(v_1|a)] \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $\mathfrak{R} \models = v_3 \cdot 0 v_1 \, [s(v_1|a)]$,} \\ & \;\;\; \text{then $\mathfrak{R} \models = v_3 0 \, [s(v_1|a)]$} \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $\mathbf{s}(v_1|a)(v_3) = \mathbf{s}(v_1|a)(\cdot 0 v_1)$,} \\ & \;\;\; \text{then $\mathbf{s}(v_1|a)(v_3) = \mathbf{s}(v_1|a)(0)$} \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $\mathbf{s}(v_3) = \mathbf{s}(v_1|a)(0) \cdot \mathbf{s}(v_1|a)(v_1)$, then $\mathbf{s}(v_3) = 0$} \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $s(v_3) = 0 \cdot a$, then $s(v_3) = 0$} \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $4 = 0 \cdot a$, then $4 = 0$} \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $4 = 0$, then $4 = 0$} \\ \end{aligned} \] \dots which last is true whether $4 = 0$ is true or false. \end{exmp} \begin{prob} \label{p:six4} Let $\mathfrak{N}$ be the structure for $\mathcal{L}_N$ in Problem \ref{pb:exas}. Let $p : V \to \mathbb{N}$ be defined by $p(v_{2k}) = k$ and $p(v_{2k+1}) = k$. Verify that \begin{enumerate} \item $\mathfrak{N} \models \forall w \, (\lnot Sw = 0) \, [p]$ and \item $\mathfrak{N} \nmodels \forall x \exists y \, x + y = 0 \, [p]$. \end{enumerate} \end{prob} \begin{prop} \label{p:six5} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$, $s$ is an assignment for $\mathfrak{M}$, $x$ is a variable, and $\varphi$ is a formula of a first-order language $\mathcal{L}$. Then $\mathfrak{M} \models \exists x\, \varphi [s]$ if and only if $\mathfrak{M} \models \varphi [s(x|m)]$ for some $m \in |\mathfrak{M}|$. 
\end{prop} Working with particular assignments is difficult but, while sometimes unavoidable, not always necessary. \begin{defn} \label{d:mod} \index{model} \index{true in a structure} \index{$\models$} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$, and $\varphi$ a formula of $\mathcal{L}$. Then $\mathfrak{M} \models \varphi$ if and only if $\mathfrak{M} \models \varphi [s]$ for every assignment $s : V \to |\mathfrak{M}|$ for $\mathfrak{M}$. We say that $\mathfrak{M}$ is a {\em model\/} of $\varphi$, or that $\varphi$ is {\em true\/} in $\mathfrak{M}$, if $\mathfrak{M} \models \varphi$. We will often write $\mathfrak{M} \nmodels \psi$ if it is not the case that $\mathfrak{M} \models \psi$. Similarly, if $\Gamma$ is a set of formulas, we will write $\mathfrak{M} \models \Gamma$ if $\mathfrak{M} \models \gamma$ for every formula $\gamma \in \Gamma$, and say that $\mathfrak{M}$ is a {\em model\/}\index{model} of $\Gamma$ or that $\mathfrak{M}$ {\em satisfies\/}\index{satisfies} $\Gamma$. A formula or set of formulas is {\em satisfiable\/}\index{satisfiable} if there is some structure $\mathfrak{M}$ which satisfies it. We will often write $\mathfrak{M} \nmodels \Gamma$ if it is not the case that $\mathfrak{M} \models \Gamma$.\index{$\nmodels$} \end{defn} \begin{note} $\mathfrak{M} \nmodels \varphi$ does {\em not\/} mean that for every assignment $s : V \to |\mathfrak{M}|$, it is not the case that $\mathfrak{M} \models \varphi [s]$. It only means that there is {\em some\/} assignment $r : V \to |\mathfrak{M}|$ for which $\mathfrak{M} \models \varphi [r]$ is not true. \end{note} \begin{prob} \label{p:ord} $\mathfrak{Q} = (\mathbb{Q},<)$ is a structure for $\mathcal{L}_O$. For each of the following formulas $\varphi$ of $\mathcal{L}_O$, determine whether or not $\mathfrak{Q} \models \varphi$.
\begin{enumerate} \item $\forall v_0\, \exists v_2\, v_0 < v_2$ \item $\exists v_1\, \forall v_3\, (v_1 < v_3 \to v_1 = v_3)$ \item $\forall v_4\, \forall v_5\, \forall v_6 (v_4 < v_5 \to (v_5 < v_6 \to v_4 < v_6))$ \end{enumerate} \end{prob} The following facts are counterparts of sorts for Proposition~\ref{p:tav}. Their point is that what a given assignment does with a given term or formula depends only on the assignment's values on the (free) variables of the term or formula. \begin{lem} \label{l:six7} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$, $t$ is a term of $\mathcal{L}$, and $r$ and $s$ are assignments for $\mathfrak{M}$ such that $r(x) = s(x)$ for every variable $x$ which occurs in $t$. Then $\mathbf{r}(t) = \mathbf{s}(t)$. \end{lem} \begin{prop} \label{p:six8} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$, $\varphi$ is a formula of $\mathcal{L}$, and $r$ and $s$ are assignments for $\mathfrak{M}$ such that $r(x) = s(x)$ for every variable $x$ which occurs free in $\varphi$. Then $\mathfrak{M} \models \varphi [r]$ if and only if $\mathfrak{M} \models \varphi [s]$. \end{prop} \begin{cor} \label{c:six9} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$ and $\sigma$ is a sentence of $\mathcal{L}$. Then $\mathfrak{M} \models \sigma$ if and only if there is some assignment $s : V \to |\mathfrak{M}|$ for $\mathfrak{M}$ such that $\mathfrak{M} \models \sigma [s]$. \end{cor} Thus sentences are true or false in a structure independently of any particular assignment. This does not necessarily make life easier when trying to verify whether a sentence is true in a structure -- try doing Problem~\ref{p:ord} again with the above results in hand -- but it does let us simplify things on occasion when proving things about sentences rather than formulas. We recycle a sense in which we used $\models$\index{$\models$} in propositional logic. 
\begin{defn} Suppose $\Gamma$ is a set of formulas of $\mathcal{L}$ and $\psi$ is a formula of $\mathcal{L}$. Then $\Gamma$ {\em implies\/}\index{implies} $\psi$, written as $\Gamma \models \psi$, if $\mathfrak{M} \models \psi$ whenever $\mathfrak{M} \models \Gamma$ for every structure $\mathfrak{M}$ for $\mathcal{L}$. Similarly, if $\Gamma$ and $\Delta$ are sets of formulas of $\mathcal{L}$, then $\Gamma$ {\em implies\/} $\Delta$, written as $\Gamma \models \Delta$, if $\mathfrak{M} \models \Delta$ whenever $\mathfrak{M} \models \Gamma$ for every structure $\mathfrak{M}$ for $\mathcal{L}$. We will usually write $\models \dots$ for $\emptyset \models \dots$. \end{defn} \begin{prop} \label{p:inf} Suppose $\alpha$ and $\beta$ are formulas of some first-order language. Then $\{\, (\alpha \to \beta),\, \alpha\, \} \models \beta$. \end{prop} \begin{prop} \label{p:six12} Suppose $\Sigma$ is a set of formulas and $\psi$ and $\rho$ are formulas of some first-order language. Then $\Sigma \cup \{\psi\} \models \rho$ if and only if $\Sigma \models (\psi \to \rho)$. \end{prop} \begin{defn} A formula $\psi$ of $\mathcal{L}$ is a {\em tautology\/}\index{tautology} if it is true in every structure, {\em i.e.\/} if $\models \psi$. $\psi$ is a {\em contradiction\/}\index{contradiction} if $\lnot \psi$ is a tautology, {\em i.e.\/} if $\models \lnot \psi$. \end{defn} For some trivial examples, let $\varphi$ be a formula of $\mathcal{L}$ and $\mathfrak{M}$ a structure for $\mathcal{L}$. Then $\mathfrak{M} \models \{ \varphi \}$ if and only if $\mathfrak{M} \models \varphi$, so it must be the case that $\{ \varphi \} \models \varphi$. It is also easy to check that $\varphi \to \varphi$ is a tautology and $\lnot (\varphi \to \varphi)$ is a contradiction. \begin{prob} \label{p:taut} Show that $\forall y\, y = y$ is a tautology and that $\exists y\, \lnot y = y$ is a contradiction. \end{prob} \begin{prob} \label{p:cont} Suppose $\varphi$ is a contradiction. 
Show that $\mathfrak{M} \models \varphi [s]$ is false for every structure $\mathfrak{M}$ and assignment $s : V \to |\mathfrak{M}|$ for $\mathfrak{M}$. \end{prob} \begin{prob} \label{p:six13} Show that a set of formulas $\Sigma$ is satisfiable if and only if there is no contradiction $\chi$ such that $\Sigma \models \chi$. \end{prob} The following fact is a counterpart of Proposition \ref{p:tif}. \begin{prop} \label{p:mif} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$ and $\alpha$ and $\beta$ are sentences of $\mathcal{L}$. Then: \begin{enumerate} \item $\mathfrak{M} \models \lnot\alpha$ if and only if $\mathfrak{M} \nmodels \alpha$. \item $\mathfrak{M} \models \alpha \to \beta$ if and only if $\mathfrak{M} \models \beta$ whenever $\mathfrak{M} \models \alpha$. \item $\mathfrak{M} \models \alpha \lor \beta$ if and only if $\mathfrak{M} \models \alpha$ or $\mathfrak{M} \models \beta$. \item $\mathfrak{M} \models \alpha \land \beta$ if and only if $\mathfrak{M} \models \alpha$ and $\mathfrak{M} \models \beta$. \item $\mathfrak{M} \models \alpha \fromto \beta$ if and only if $\mathfrak{M} \models \alpha$ exactly when $\mathfrak{M} \models \beta$. \item $\mathfrak{M} \models \forall x\, \alpha$ if and only if $\mathfrak{M} \models \alpha$. \item $\mathfrak{M} \models \exists x\, \alpha$ if and only if there is some $m \in |\mathfrak{M}|$ so that $\mathfrak{M} \models \alpha\, [s(x|m)]$ for every assignment $s$ for $\mathfrak{M}$. \end{enumerate} \end{prop} \begin{prob} \label{p:mif2} How much of Proposition \ref{p:mif} must remain true if $\alpha$ and $\beta$ are not sentences? \end{prob} Recall that by Proposition \ref{p:exlan} a formula of a first-order language is also a formula of any extension of the language. The following relationship between extension languages and satisfiability will be needed later on. 
\begin{prop} \label{p:exsat} Suppose $\mathcal{L}$ is a first-order language, $\mathcal{L}'$ is an extension of $\mathcal{L}$, and $\Gamma$ is a set of formulas of $\mathcal{L}$. Then $\Gamma$ is satisfiable in a structure for $\mathcal{L}$ if and only if $\Gamma$ is satisfiable in a structure for $\mathcal{L}'$. \end{prop} One last bit of terminology\dots \begin{defn} \label{d:ax} \index{axiom} \index{theory} \index{$\text{Th}$} If $\mathfrak{M}$ is a structure for $\mathcal{L}$, then the {\em theory\/} of $\mathfrak{M}$ is just the set of all sentences of $\mathcal{L}$ true in $\mathfrak{M}$, {\em i.e.\/} \[ \text{Th}(\mathfrak{M}) = \{\, \tau \mid \tau \text{\ is a sentence and\ } \mathfrak{M} \models \tau \,\}. \] If $\Delta$ is a set of sentences and $\mathcal{S}$ is a collection of structures, then $\Delta$ is a set of (non-logical) {\it axioms\/} for $\mathcal{S}$ if for every structure $\mathfrak{M}$, $\mathfrak{M} \in \mathcal{S}$ if and only if $\mathfrak{M} \models \Delta$. \end{defn} \begin{exmp} Consider the sentence $\exists x\, \exists y\, ( (\lnot x = y) \land \forall z\, (z = x \lor z = y))$ of $\mathcal{L}_=$. Every structure of $\mathcal{L}_=$ satisfying this sentence must have exactly two elements in its universe, so $\{\, \exists x\, \exists y\, ( (\lnot x = y) \land \forall z\, (z = x \lor z = y)) \,\}$ is a set of non-logical axioms for the collection of sets of cardinality $2$: \[ \{\, \mathfrak{M} \mid \mathfrak{M} \text{\ is a structure for\ } \mathcal{L}_= \text{\ with exactly $2$ elements} \,\} \, . \] \end{exmp} \begin{prob} \label{p:six16} In each case, find a suitable language and a set of axioms in it for the given collection of structures. \begin{enumerate} \item Sets of size 3. \item Bipartite graphs. \item Commutative groups. \item Fields of characteristic 5. 
\end{enumerate} \end{prob}
%
% Chapter 7 of "A Problem Course in Mathematical Logic"
%
\chapter{Deductions} \label{ch:seven} Deductions in first-order logic are not unlike deductions in propositional logic. Of course, some changes are necessary to handle the various additional features of first-order logic, especially quantifiers. In particular, one of the new axioms requires a tricky preliminary definition. Roughly, the problem is that we need to know when we can replace occurrences of a variable in a formula by a term without letting any variable in the term get captured by a quantifier. Throughout this chapter, let $\mathcal{L}$ be a fixed arbitrary first-order language. Unless stated otherwise, all formulas will be assumed to be formulas of $\mathcal{L}$. \begin{defn} \label{d:subs} \index{substitutable} Suppose $x$ is a variable, $t$ is a term, and $\varphi$ is a formula. Then {\em $t$ is substitutable for $x$ in $\varphi$\/} is defined as follows: \begin{enumerate} \item If $\varphi$ is atomic, then $t$ is substitutable for $x$ in $\varphi$. \item If $\varphi$ is $(\lnot \psi)$, then $t$ is substitutable for $x$ in $\varphi$ if and only if $t$ is substitutable for $x$ in $\psi$. \item If $\varphi$ is $(\alpha \to \beta)$, then $t$ is substitutable for $x$ in $\varphi$ if and only if $t$ is substitutable for $x$ in $\alpha$ and $t$ is substitutable for $x$ in $\beta$. \item If $\varphi$ is $\forall y \, \delta$, then $t$ is substitutable for $x$ in $\varphi$ if and only if either \begin{enumerate} \item $x$ does not occur free in $\varphi$, or \item $y$ does not occur in $t$ and $t$ is substitutable for $x$ in $\delta$. \end{enumerate} \end{enumerate} \end{defn} For example, $x$ is always substitutable for itself in any formula $\varphi$ and $\varphi^x_x$ is just $\varphi$ (see Problem~\ref{p:subs}).
On the other hand, $y$ is not substitutable for $x$ in $\forall y\, x = y$ because if $x$ were to be replaced by $y$, the new instance of $y$ would be ``captured'' by the quantifier $\forall y$. This makes a difference to the truth of the formula. The truth of $\forall y\, x = y$ depends on the structure in which it is interpreted --- it's true if the universe has only one element and false otherwise --- but $\forall y\, y = y$ is a tautology by Problem~\ref{p:taut} so it is true in any structure whatsoever. This sort of difficulty makes it necessary to be careful when substituting for variables. \begin{defn} \label{d:subst} \index{substitution} Suppose $x$ is a variable, $t$ is a term, and $\varphi$ is a formula. If $t$ is substitutable for $x$ in $\varphi$, then $\varphi^x_t$ ({\em i.e.\/} $\varphi$ with $t$ substituted for $x$) is defined as follows:\index{$\varphi^x_t$} \begin{enumerate} \item If $\varphi$ is atomic, then $\varphi^x_t$ is the formula obtained by replacing each occurrence of $x$ in $\varphi$ by $t$. \item If $\varphi$ is $(\lnot \psi)$, then $\varphi^x_t$ is the formula $(\lnot \psi^x_t)$. \item If $\varphi$ is $(\alpha \to \beta)$, then $\varphi^x_t$ is the formula $(\alpha^x_t \to \beta^x_t)$. \item If $\varphi$ is $\forall y \, \delta$, then $\varphi^x_t$ is the formula \begin{enumerate} \item $\forall y \, \delta$ if $x$ is $y$, and \item $\forall y \, \delta^x_t$ if $x$ isn't $y$. \end{enumerate} \end{enumerate} \end{defn} \begin{prob} \label{p:subs} \begin{enumerate} \item Is $x$ substitutable for $z$ in $\psi$ if $\psi$ is $z = x \to \forall z\, z = x$? If so, what is $\psi^z_x$? \item Show that if $t$ is any term and $\sigma$ is a sentence, then $t$ is substitutable in $\sigma$ for any variable $x$. What is $\sigma^x_t$? \item Show that if $t$ is a term in which no variable occurs that occurs in the formula $\varphi$, then $t$ is substitutable in $\varphi$ for any variable $x$. 
\item Show that $x$ is substitutable for $x$ in $\varphi$ for any variable $x$ and any formula $\varphi$, and that $\varphi^x_x$ is just $\varphi$. \end{enumerate} \end{prob} Along with the notion of substitutability, we need an additional notion in order to define the logical axioms of $\mathcal{L}$. \begin{defn} \index{generalization} If $\varphi$ is any formula and $x_1$, \dots, $x_n$ are any variables, then $\forall x_1 \dots \forall x_n \, \varphi$ is said to be a {\em generalization\/} of $\varphi$. \end{defn} For example, $\forall y\, \forall x\, (x = y \to fx = fy)$ and $\forall z\, (x = y \to fx = fy)$ are (different) generalizations of $x = y \to fx = fy$, but $\forall x\, \exists y\, (x = y \to fx = fy)$ is not. Note that the variables being quantified don't have to occur in the formula being generalized. \begin{lem} \label{l:gen} Any generalization of a tautology is a tautology. \end{lem} \begin{defn} \label{d:axs} \index{axiom schema} Every first-order language $\mathcal{L}$ has eight {\em logical axiom schema\/}: \begin{description} \item[A1] $(\alpha \to (\beta \to \alpha))$ \index{A1} \item[A2] $((\alpha \to (\beta \to \gamma)) \to ((\alpha \to \beta) \to (\alpha \to \gamma)))$ \index{A2} \item[A3] $(((\lnot \beta)\to (\lnot \alpha)) \to (((\lnot \beta) \to \alpha) \to \beta))$ \index{A3} \item[A4] $(\forall x \, \alpha \to \alpha^x_t)$, if $t$ is substitutable for $x$ in $\alpha$. \index{A4} \item[A5] $(\forall x \, (\alpha \to \beta) \to (\forall x \, \alpha \to \forall x \, \beta))$ \index{A5} \item[A6] $(\alpha \to \forall x \, \alpha)$, if $x$ does not occur free in $\alpha$. \index{A6} \item[A7] $x = x$ \index{A7} \item[A8] $(x = y \to (\alpha \to \beta))$, if $\alpha$ is atomic and $\beta$ is obtained from $\alpha$ by replacing some occurrences (possibly all or none) of $x$ in $\alpha$ by $y$. 
\index{A8} \end{description} Plugging in any particular formulas of $\mathcal{L}$ for $\alpha$, $\beta$, and $\gamma$, and any particular variables for $x$ and $y$, in any of A1--A8 gives a {\em logical axiom\/}\index{logical axiom}\index{axiom logical} of $\mathcal{L}$. In addition, any generalization of a logical axiom of $\mathcal{L}$ is also a logical axiom of $\mathcal{L}$. \end{defn} The reason for calling the instances of A1--A8 the logical axioms, instead of just axioms, is to avoid conflict with Definition~\ref{d:ax}. \begin{prob} \label{p:seven3} Determine whether or not each of the following formulas is a logical axiom. \begin{enumerate} \item $\forall x\, \forall z\, (x = y \to (x = c \to x = y))$ \item $x = y \to (y = z \to z = x)$ \item $\forall z\, (x = y \to (x = c \to y = c))$ \item $\forall w\, \exists x\, (Pwx \to Pww) \to \exists x\, (Pxx \to Pxx)$ \item $\forall x\, (\forall x\, c = fxc \to \forall x\, \forall x\, c = fxc)$ \item $(\exists x\, Px \to \exists y\, \forall z\, Rzfy) \to ( (\exists x\, Px \to \forall y\, \lnot \forall z\, Rzfy) \to \forall x\, \lnot Px)$ \end{enumerate} \end{prob} \begin{prop} \label{p:seven4} Every logical axiom is a tautology. \end{prop} Note that we have recycled our axiom schemas A1--A3 from propositional logic. We will also recycle MP as the sole rule of inference\index{rule of inference} for first-order logic. \begin{defn}[Modus Ponens] \index{Modus Ponens} Given the formulas $\varphi$ and $(\varphi \to \psi)$, one may infer $\psi$. \end{defn} As in propositional logic, we will usually refer to Modus Ponens by its initials, MP\index{MP}. That MP preserves truth in the sense of Chapter~\ref{ch:six} follows from Proposition~\ref{p:inf}. Using the logical axioms and MP, we can execute deductions in first-order logic just as we did in propositional logic. \begin{defn} \index{deduction} \index{proof} Let $\Delta$ be a set of formulas of the first-order language $\mathcal{L}$.
A {\em deduction\/} or {\em proof\/} from $\Delta$ in $\mathcal{L}$ is a finite sequence $\varphi_1 \varphi_2 \dots \varphi_n$ of formulas of $\mathcal{L}$ such that for each $k \le n$, \begin{enumerate} \item $\varphi_k$ is a logical axiom, or \item $\varphi_k \in \Delta$, or \item there are $i,j < k$ such that $\varphi_k$ follows from $\varphi_i$ and $\varphi_j$ by MP. \end{enumerate} A formula of $\Delta$ appearing in the deduction is usually referred to as a {\em premiss\/}\index{premiss} of the deduction. $\Delta$ {\em proves\/}\index{proves} a formula $\alpha$, written as $\Delta \proves \alpha$,\index{$\proves$} if $\alpha$ is the last formula of a deduction from $\Delta$. We'll usually write $\proves \alpha$ instead of $\emptyset \proves \alpha$. Finally, if $\Gamma$ and $\Delta$ are sets of formulas, we'll take $\Gamma \proves \Delta$ to mean that $\Gamma \proves \delta$ for every formula $\delta \in \Delta$. \end{defn} \begin{note} We have reused the axiom schema, the rule of inference, and the definition of deduction from propositional logic. It follows that any deduction of propositional logic can be converted into a deduction of first-order logic simply by replacing the formulas of $\mathcal{L}_P$ occurring in the deduction by first-order formulas. Feel free to appeal to the deductions in the exercises and problems of Chapter~\ref{ch:three}. {\em You should probably review the Examples and Problems of Chapter~\ref{ch:three} before going on, since most of the rest of this Chapter concentrates on what is {\em different\/} about deductions in first-order logic.\/} \end{note} \begin{exmp} \label{e:apf} We'll show that $\{ \alpha \} \proves \exists x\, \alpha$ for any first-order formula $\alpha$ and any variable $x$. 
\begin{enumerate} \item $(\forall x\, \lnot\alpha \to \lnot\alpha) \to (\alpha \to \lnot \forall x\, \lnot \alpha)$ \hfill Problem~\ref{p:prov}.5 \item $\forall x\, \lnot\alpha \to \lnot\alpha$ \hfill A4 \item $\alpha \to \lnot \forall x\, \lnot \alpha$ \hfill 1,2 MP \item $\alpha$ \hfill Premiss \item $\lnot \forall x\, \lnot \alpha$ \hfill 3,4 MP \item $\exists x\, \alpha$ \hfill Definition of $\exists$ \end{enumerate} Strictly speaking, the last line is just for our convenience, like $\exists$ itself. \end{exmp} \begin{prob} \label{p:deds} Show that: \begin{enumerate} \item $\proves \forall x\, \varphi \to \forall y\, \varphi^x_y$, if $y$ does not occur at all in $\varphi$. \item $\proves \alpha \lor \lnot \alpha$. \item $\{ c = d \} \proves \forall z\, Qazc \to Qazd$. \item $\proves x = y \to y = x$. \item $\{ \exists x\, \alpha \} \proves \alpha$ if $x$ does not occur free in $\alpha$. \end{enumerate} \end{prob} Many general facts about deductions can be recycled from propositional logic, including the Deduction Theorem. \begin{prop} \label{p:seven5a} If $\varphi_1 \varphi_2 \dots \varphi_n$ is a deduction of $\mathcal{L}$, then $\varphi_1 \dots \varphi_\ell$ is also a deduction of $\mathcal{L}$ for any $\ell$ such that $1 \le \ell \le n$. \end{prop} \begin{prop} \label{p:seven6} If $\Gamma \proves \delta$ and $\Gamma \proves \delta \to \beta$, then $\Gamma \proves \beta$. \end{prop} \begin{prop} \label{p:bim} If $\Gamma \subseteq \Delta$ and $\Gamma \proves \alpha$, then $\Delta \proves \alpha$. \end{prop} \begin{prop} \label{p:seven8} If $\Gamma \proves \Delta$ and $\Delta \proves \sigma$, then $\Gamma \proves \sigma$. \end{prop} \begin{thm}[Deduction Theorem] \label{t:fded} \index{Deduction Theorem} If $\Sigma$ is any set of formulas and $\alpha$ and $\beta$ are any formulas, then $\Sigma \proves \alpha \to \beta$ if and only if $\Sigma \cup \{ \alpha \} \proves \beta$.
\end{thm} Just as in propositional logic, the Deduction Theorem is useful because it often lets us take shortcuts when trying to show that deductions exist. There is also another result about first-order deductions which often supplies useful shortcuts. \begin{thm}[Generalization Theorem] \label{t:gen} \index{Generalization Theorem} Suppose $x$ is a variable, $\Gamma$ is a set of formulas in which $x$ does not occur free, and $\varphi$ is a formula such that $\Gamma \proves \varphi$. Then $\Gamma \proves \forall x \, \varphi$. \end{thm} \begin{thm}[Generalization On Constants] \label{t:genc} \index{Generalization On Constants} Suppose that $c$ is a constant symbol, $\Gamma$ is a set of formulas in which $c$ does not occur, and $\varphi$ is a formula such that $\Gamma \proves \varphi$. Then there is a variable $x$ which does not occur in $\varphi$ such that $\Gamma \proves \forall x \, \varphi^c_x$.\footnote{$\varphi^c_x$ is $\varphi$ with every occurrence of the constant $c$ replaced by $x$.} Moreover, there is a deduction of $\forall x \, \varphi^c_x$ from $\Gamma$ in which $c$ does not occur. \end{thm} \begin{exmp} We'll show that if $\varphi$ and $\psi$ are any formulas, $x$ is any variable, and $\proves \varphi \to \psi$, then $\proves \forall x\, \varphi \to \forall x\, \psi$. Since $x$ does not occur free in any formula of $\emptyset$, it follows from $\proves \varphi \to \psi$ by the Generalization Theorem that $\proves \forall x\, (\varphi \to \psi)$. But then \begin{enumerate} \item $\forall x\, (\varphi \to \psi)$ \hfill above \item $\forall x\, (\varphi \to \psi) \to (\forall x\, \varphi \to \forall x\, \psi)$ \hfill A5 \item $\forall x\, \varphi \to \forall x\, \psi$ \hfill 1,2 MP \end{enumerate} is the tail end of a deduction of $\forall x\, \varphi \to \forall x\, \psi$ from $\emptyset$. \end{exmp} \begin{prob} \label{p:seven12} Show that: \begin{enumerate} \item $\proves \forall x\, \forall y\, \forall z\, ( x = y \to (y = z \to x = z) )$.
\item $\proves \forall x\, \alpha \to \exists x\, \alpha$. \item $\proves \exists x \, \gamma \to \forall x\, \gamma$ if $x$ does not occur free in $\gamma$. \end{enumerate} \end{prob} We conclude with a bit of terminology. \begin{defn} \index{theory} \index{$\text{Th}$} If $\Sigma$ is a set of sentences, then the {\em theory\/} of $\Sigma$ is \[ \mathrm{Th}(\Sigma) = \{\, \tau \mid \tau \text{\ is a sentence and\ } \Sigma \proves \tau \,\}. \] \end{defn} That is, the theory of $\Sigma$ is just the collection of all sentences which can be proved from $\Sigma$.
%
% Chapter 8 of "A Problem Course in Mathematical Logic"
%
\chapter{Soundness and Completeness} \label{ch:eight} As with propositional logic, first-order logic had better satisfy the Soundness Theorem and it is desirable that it satisfy the Completeness Theorem. These theorems do hold for first-order logic. The Soundness Theorem is proved in a way similar to its counterpart for propositional logic, but the Completeness Theorem will require a fair bit of additional work.\footnote{This is not too surprising because of the greater complexity of first-order logic. Also, it turns out that first-order logic is about as powerful as a logic can get and still have the Completeness Theorem hold.} It is in this extra work that the distinction between formulas and sentences becomes useful. Let $\mathcal{L}$ be a fixed countable first-order language throughout this chapter. All formulas will be assumed to be formulas of $\mathcal{L}$ unless stated otherwise. First, we rehash many of the definitions and facts we proved for propositional logic in Chapter~\ref{ch:four} for first-order logic. \begin{thm}[Soundness Theorem] \label{t:fsnd} \index{Soundness Theorem} If $\alpha$ is a sentence and $\Delta$ is a set of sentences such that $\Delta \proves \alpha$, then $\Delta \models \alpha$.
\end{thm} \begin{defn} \index{consistent} \index{inconsistent} A set of sentences $\Gamma$ is {\em inconsistent\/} if $\Gamma \proves \lnot (\psi \to \psi)$ for some formula $\psi$, and is {\em consistent\/} if it is not inconsistent. \end{defn} Recall that a set of sentences $\Gamma$ is satisfiable if $\mathfrak{M} \models \Gamma$ for some structure $\mathfrak{M}$. \begin{prop} \label{p:sacon} If a set of sentences $\Gamma$ is satisfiable, then it is consistent. \end{prop} \begin{prop} \label{p:eight4} Suppose $\Delta$ is an inconsistent set of sentences. Then $\Delta \proves \psi$ for any formula $\psi$. \end{prop} \begin{prop} \label{p:eight5} Suppose $\Sigma$ is an inconsistent set of sentences. Then there is a finite subset $\Delta$ of $\Sigma$ such that $\Delta$ is inconsistent. \end{prop} \begin{cor} \label{c:eight6} A set of sentences $\Gamma$ is consistent if and only if every finite subset of $\Gamma$ is consistent. \end{cor} \begin{defn} \index{maximally consistent} \index{consistent maximally} A set of sentences $\Sigma$ is {\em maximally consistent} if $\Sigma$ is consistent but $\Sigma \cup \{\tau\}$ is inconsistent whenever $\tau$ is a sentence such that $\tau \notin \Sigma$. \end{defn} One quick way of finding examples of maximally consistent sets is given by the following proposition. \begin{prop} \label{p:smac} If $\mathfrak{M}$ is a structure, then $\text{Th}(\mathfrak{M})$ is a maximally consistent set of sentences. \end{prop} \begin{exmp} \label{e:maxcon} $\mathfrak{M} = \left( \{ 5 \} \right)$ is a structure for $\mathcal{L}_=$, so $\text{Th}(\mathfrak{M})$ is a maximally consistent set of sentences. Since it turns out that $\text{Th}(\mathfrak{M}) = \text{Th}\left( \{\, \forall x\, \forall y\, x=y \,\} \right)$, this also gives us an example of a set of sentences $\Sigma = \{\, \forall x\, \forall y\, x=y \,\}$ such that $\text{Th}(\Sigma)$ is maximally consistent. 
\end{exmp} \begin{prop} \label{p:eight8} If $\Sigma$ is a maximally consistent set of sentences, $\tau$ is a sentence, and $\Sigma \proves \tau$, then $\tau \in \Sigma$. \end{prop} \begin{prop} \label{p:eight9} Suppose $\Sigma$ is a maximally consistent set of sentences and $\tau$ is a sentence. Then $\lnot\tau \in \Sigma$ if and only if $\tau \notin \Sigma$. \end{prop} \begin{prop} \label{p:eight10} Suppose $\Sigma$ is a maximally consistent set of sentences and $\varphi$ and $\psi$ are any sentences. Then $\varphi \to \psi \in \Sigma$ if and only if $\varphi \notin \Sigma$ or $\psi \in \Sigma$. \end{prop} \begin{thm} \label{t:etmc} Suppose $\Gamma$ is a consistent set of sentences. Then there is a maximally consistent set of sentences $\Sigma$ with $\Gamma \subseteq \Sigma$. \end{thm} The counterparts of these notions and facts for propositional logic sufficed to prove the Completeness Theorem, but here we will need some additional tools. The basic problem is that instead of defining a suitable truth assignment from a maximally consistent set of formulas, we need to construct a suitable structure from a maximally consistent set of sentences. Unfortunately, structures for first-order languages are usually more complex than truth assignments for propositional logic. The following definition supplies the key new idea we will use to prove the Completeness Theorem. \begin{defn} \index{witnesses} Suppose $\Sigma$ is a set of sentences and $C$ is a set of (some of the) constant symbols of $\mathcal{L}$. Then $C$ is a {\em set of witnesses\/} for $\Sigma$ in $\mathcal{L}$ if for every formula $\varphi$ of $\mathcal{L}$ with at most one free variable $x$, there is a constant symbol $c \in C$ such that $\Sigma \proves \exists x\, \varphi \to \varphi^x_c$. \end{defn} The idea is that every element of the universe which $\Sigma$ proves must exist is named, or ``witnessed'', by a constant symbol in $C$. 
Note that if $\Sigma \proves \lnot \exists x\, \varphi$, then $\Sigma \proves \exists x\, \varphi \to \varphi^x_c$ for any constant symbol $c$. \begin{prop} \label{p:eight14} Suppose $\Gamma$ and $\Sigma$ are sets of sentences of $\mathcal{L}$, $\Gamma \subseteq \Sigma$, and $C$ is a set of witnesses for $\Gamma$ in $\mathcal{L}$. Then $C$ is a set of witnesses for $\Sigma$ in $\mathcal{L}$. \end{prop} \begin{exmp} \label{e:rw} Let $\mathcal{L}'_O$ be the first-order language with a single 2-place relation symbol, $<$, and countably many constant symbols, $c_q$ for each $q \in \mathbb{Q}$. Let $\Sigma$ include all the sentences \begin{enumerate} \item $c_p < c_q$, for every $p,q \in \mathbb{Q}$ such that $p < q$, \item $\forall x\, (\lnot x < x)$, \item $\forall x\, \forall y\, (x < y \lor x = y \lor y < x)$, \item $\forall x\, \forall y\, \forall z\, (x < y \to (y < z \to x < z))$, \item $\forall x\, \forall y\, (x < y \to \exists z\, (x < z \land z < y))$, \item $\forall x\, \exists y\, (x < y)$, and \item $\forall x\, \exists y\, (y < x)$. \end{enumerate} In effect, $\Sigma$ asserts that $<$ is a linear order on the universe (2--4) which is dense (5) and has no endpoints (6--7), and which has a suborder isomorphic to $\mathbb{Q}$ (1). Then $C = \{\, c_q \mid q \in \mathbb{Q} \,\}$ is a set of witnesses for $\Sigma$ in $\mathcal{L}'_O$. \end{exmp} In the example above, one can ``reverse-engineer'' a model for the set of sentences in question from the set of witnesses simply by letting the universe of the structure {\em be\/} the set of witnesses. One can also define the necessary relation interpreting $<$ in a pretty obvious way from $\Sigma$.\footnote{Note, however, that an isomorphic copy of $\mathbb{Q}$ is not the only structure for $\mathcal{L}'_O$ satisfying $\Sigma$. 
For example, $\mathfrak{R} = (\mathbb{R},<, q + \pi \colon q \in \mathbb{Q})$ will also satisfy $\Sigma$ if we interpret $c_q$ by $q + \pi$.} This example is obviously contrived: there are no constant symbols around which are not witnesses, $\Sigma$ proves that distinct constant symbols aren't equal to each other, there is little by way of non-logical symbols needing interpretation, and $\Sigma$ explicitly includes everything we need to know about $<$. In general, trying to build a model for a set of sentences $\Sigma$ in this way runs into a number of problems. First, how do we know whether $\Sigma$ has a set of witnesses at all? Many first-order languages have few or no constant symbols, after all. Second, if $\Sigma$ has a set of witnesses $C$, it's unlikely that we'll be able to get away with just letting the universe of the model be $C$. What if $\Sigma \proves c = d$ for some distinct witnesses $c$ and $d$? Third, how do we handle interpreting constant symbols which are not in $C$? Fourth, what if $\Sigma$ doesn't prove enough about whatever relation and function symbols exist to let us define interpretations of them in the structure under construction? (Imagine, if you like, that someone hands you a copy of Joyce's {\em Ulysses\/} and asks you to produce a complete road map of Dublin on the basis of the book. Even if it has no geographic contradictions, you are unlikely to find all the information in the novel needed to do the job.) Finally, even if $\Sigma$ does prove all we need to define functions and relations on the universe to interpret the function and relation symbols, just how do we do it? Getting around all these difficulties requires a fair bit of work. One can get around many by sticking to maximally consistent sets of sentences in suitable languages. \begin{lem} \label{l:eight12} Suppose $\Sigma$ is a set of sentences, $\varphi$ is any formula, and $x$ is any variable.
Then $\Sigma \proves \varphi$ if and only if $\Sigma \proves \forall x\, \varphi$. \end{lem} \begin{thm} \label{t:exmcw} Suppose $\Gamma$ is a consistent set of sentences of $\mathcal{L}$. Let $C$ be an infinite countable set of constant symbols which are {\em not\/} symbols of $\mathcal{L}$, and let $\mathcal{L}' = \mathcal{L} \cup C$ be the language obtained by adding the constant symbols in $C$ to the symbols of $\mathcal{L}$. Then there is a maximally consistent set $\Sigma$ of sentences of $\mathcal{L}'$ such that $\Gamma \subseteq \Sigma$ and $C$ is a set of witnesses for $\Sigma$. \end{thm} This theorem allows one to use a certain measure of brute force: No set of witnesses? Just add one! The set of sentences doesn't decide enough? Decide {\em everything\/} one way or the other! \begin{thm} \label{t:mfc} Suppose $\Sigma$ is a maximally consistent set of sentences and $C$ is a set of witnesses for $\Sigma$. Then there is a structure $\mathfrak{M}$ such that $\mathfrak{M} \models \Sigma$. \end{thm} The important part here is to define $\mathfrak{M}$ --- proving that $\mathfrak{M} \models \Sigma$ is tedious but fairly straightforward if you have the right definition. Proposition \ref{p:exsat} now lets us deduce the fact we really need. \begin{cor} \label{p:eight17} Suppose $\Gamma$ is a consistent set of sentences of a first-order language $\mathcal{L}$. Then there is a structure $\mathfrak{M}$ for $\mathcal{L}$ satisfying $\Gamma$. \end{cor} With the above facts in hand, we can rejoin our proof of Soundness and Completeness, already in progress: \begin{thm} \label{t:sacof} A set of sentences $\Sigma$ in $\mathcal{L}$ is consistent if and only if it is satisfiable. \end{thm} The rest works just like it did for propositional logic. \begin{thm}[Completeness Theorem] \label{t:fcmpl} \index{Completeness Theorem} If $\alpha$ is a sentence and $\Delta$ is a set of sentences such that $\Delta \models \alpha$, then $\Delta \proves \alpha$. 
\end{thm} It follows that in a first-order logic, as in propositional logic, a sentence is implied by some set of premisses if and only if it has a proof from those premisses. \begin{thm}[Compactness Theorem] \label{p:fcmpct} \index{Compactness Theorem} A set of sentences $\Delta$ is satisfiable if and only if every finite subset of $\Delta$ is satisfiable. \end{thm} % % Chapter 9 of "A Problem Course in Mathematical Logic" % \chapter{Applications of Compactness} \label{ch:nine} \index{Compactness Theorem, applications of} After wading through the preceding chapters, it should be obvious that first-order logic is, in principle, adequate for the job it was originally developed for: the essentially philosophical exercise of formalizing most of mathematics. As something of a bonus, first-order logic can supply useful tools for doing ``real'' mathematics. The Compactness Theorem is the simplest of these tools and glimpses of two ways of using it are provided below. \subsection*{From the finite to the infinite} Perhaps the simplest use of the Compactness Theorem is to show that if there exist arbitrarily large finite objects of some type, then there must also be an infinite object of this type. \begin{exmp} \label{e:com} We will use the Compactness Theorem to show that there is an infinite commutative group in which every element is of order $2$, {\em i.e.\/} such that $g \cdot g = e$ for every element $g$. Let $\mathcal{L}_G$\index{$\mathcal{L}_G$} be the first-order language with just two non-logical symbols: \begin{itemize} \item Constant symbol: $e$ \item 2-place function symbol: $\cdot$ \end{itemize} Here $e$ is intended to name the group's identity element and $\cdot$ the group operation. 
Let $\Sigma$ be the set of sentences of $\mathcal{L}_G$ including: \begin{enumerate} \item The axioms for a commutative group: \begin{itemize} \item $\forall x\, x\cdot e = x$ \item $\forall x\, \exists y\, x \cdot y = e$ \item $\forall x\, \forall y\, \forall z\, x \cdot (y \cdot z) = (x \cdot y) \cdot z$ \item $\forall x\, \forall y\, y \cdot x = x \cdot y$ \end{itemize} \item A sentence which asserts that every element of the universe is of order $2$: \begin{itemize} \item $\forall x\, x \cdot x = e$ \end{itemize} \item For each $n \ge 2$, a sentence, $\sigma_n$, which asserts that there are at least $n$ different elements in the universe: \begin{itemize} \item $\exists x_1\, \dots \exists x_n\, ( (\lnot x_1 = x_2) \land (\lnot x_1 = x_3) \land \dots \land (\lnot x_{n-1} = x_n))$ \end{itemize} \end{enumerate} We claim that every finite subset of $\Sigma$ is satisfiable. The most direct way to verify this is to show how, given a finite subset $\Delta$ of $\Sigma$, to produce a model $\mathfrak{M}$ of $\Delta$. Let $n$ be the largest integer such that $\sigma_n \in \Delta \cup \{ \sigma_2 \}$ (Why is there such an $n$?) and choose an integer $k$ such that $2^k \ge n$. Define a structure $(G,\circ)$ for $\mathcal{L}_G$ as follows: \begin{itemize} \item $G = \{\, \langle a_\ell \mid 1 \le \ell \le k \rangle \mid a_\ell = 0 \text{\ or\ } 1 \,\}$ \item $\langle a_\ell \mid 1 \le \ell \le k \rangle \circ \langle b_\ell \mid 1 \le \ell \le k \rangle = \langle a_\ell + b_\ell \pmod 2 \mid 1 \le \ell \le k \rangle$ \end{itemize} That is, $G$ is the set of binary sequences of length $k$ and $\circ$ is coordinatewise addition modulo $2$ of these sequences. It is easy to check that $(G,\circ)$ is a commutative group with $2^k$ elements in which every element has order $2$. Hence $(G,\circ) \models \Delta$, so $\Delta$ is satisfiable. Since every finite subset of $\Sigma$ is satisfiable, it follows by the Compactness Theorem that $\Sigma$ is satisfiable. 
A model of $\Sigma$, however, must be an infinite commutative group in which every element is of order $2$. (To be sure, it is quite easy to build such a group directly; for example, by using coordinatewise addition modulo $2$ of infinite binary sequences.) \end{exmp} \begin{prob} \label{p:nine1} Use the Compactness Theorem to show that there is an infinite \begin{enumerate} \item bipartite graph, \item non-commutative group, and \item field of characteristic 3, \end{enumerate} and also give concrete examples of such objects. \end{prob} Most applications of this method, including the ones above, are not really interesting: it is usually more valuable, and often easier, to directly construct examples of the infinite objects in question rather than just show such must exist. Sometimes, though, the technique can be used to obtain a non-trivial result more easily than by direct methods. We'll use it to prove an important result from graph theory, Ramsey's Theorem. Some definitions first: \begin{defn} If $X$ is a set, let the set of unordered pairs of elements of $X$ be $[X]^2 = \{\, \{a,b\} \mid a,b \in X \text{\ and\ } a \ne b \,\}$. (See Definition~\ref{d:sed}.) \begin{enumerate} \item A {\em graph\/} \index{graph} is a pair $(V,E)$ such that $V$ is a non-empty set and $E \subseteq [V]^2$. Elements of $V$ are called {\em vertices\/}\index{vertex} of the graph and elements of $E$ are called {\em edges\/}\index{edge}. \item A {\em subgraph\/}\index{subgraph} of $(V,E)$ is a pair $(U,F)$, where $U \subset V$ and $F = E \cap [U]^2$. \item A subgraph $(U,F)$ of $(V,E)$ is a {\em clique\/}\index{clique} if $F = [U]^2$. \item A subgraph $(U,F)$ of $(V,E)$ is an {\em independent set\/}\index{independent set} if $F = \emptyset$. \end{enumerate} \end{defn} That is, a graph is some collection of vertices, some of which are joined to one another. A subgraph is just a subset of the vertices, together with all edges joining vertices of this subset in the whole graph. 
It is a clique if it happens that the original graph joined every vertex in the subgraph to all other vertices in the subgraph, and an independent set if it happens that the original graph joined none of the vertices in the subgraph to each other. The question of when a graph must have a clique or independent set of a given size is of some interest in many applications, especially in dealing with colouring problems. \begin{thm}[Ramsey's Theorem] \label{t:ram} \index{Ramsey's Theorem} For every $n \ge 1$ there is an integer $R_n$ such that any graph with at least $R_n$ vertices has a clique with $n$ vertices or an independent set with $n$ vertices. \end{thm} $R_n$\index{$R_n$} is the {\em $n$th Ramsey number\/}.\index{Ramsey number} It is easy to see that $R_1 = 1$ and $R_2 = 2$, but $R_3$ is already $6$, and $R_n$ grows very quickly as a function of $n$ thereafter. Ramsey's Theorem is fairly hard to prove directly, but the corresponding result for infinite graphs is comparatively straightforward. \begin{lem} \label{l:irt} \index{Infinite Ramsey's Theorem} \index{Ramsey's Theorem Infinite} If $(V,E)$ is a graph with infinitely many vertices, then it has an infinite clique or an infinite independent set. \end{lem} A relatively quick way to prove Ramsey's Theorem is to first prove its infinite counterpart, Lemma \ref{l:irt}, and then get Ramsey's Theorem out of it by way of the Compactness Theorem. (If you're an ambitious minimalist, you can try to do this using the Compactness Theorem for propositional logic instead!) \subsection*{Elementary equivalence and non-standard models} One of the common uses for the Compactness Theorem is to construct ``non-standard'' models\index{non-standard model} of the theories satisfied by various standard mathematical structures. Such a model satisfies all the same first-order sentences as the standard model, but differs from it in some way not expressible in the first-order language in question. 
This brings home one of the intrinsic limitations of first-order logic: it can't always tell essentially different structures apart. Of course, we need to define just what constitutes essential difference.

\begin{defn}
Suppose $\mathcal{L}$ is a first-order language and $\mathfrak{N}$ and $\mathfrak{M}$ are two structures for $\mathcal{L}$. Then $\mathfrak{N}$ and $\mathfrak{M}$ are:
\begin{enumerate}
\item {\em isomorphic\/},\index{isomorphism of structures} written as $\mathfrak{N} \cong \mathfrak{M}$, if there is a function $F \colon |\mathfrak{N}| \to |\mathfrak{M}|$ such that
\begin{enumerate}
\item $F$ is $1-1$ and onto,
\item $F(c^{\mathfrak{N}}) = c^{\mathfrak{M}}$ for every constant symbol $c$ of $\mathcal{L}$,
\item $F(f^{\mathfrak{N}}(a_1, \dots, a_k)) = f^{\mathfrak{M}}(F(a_1), \dots, F(a_k))$ for every $k$-place function symbol $f$ of $\mathcal{L}$ and elements $a_1, \dots, a_k \in |\mathfrak{N}|$, and
\item $P^{\mathfrak{N}}(a_1, \dots, a_k)$ holds if and only if $P^{\mathfrak{M}}(F(a_1), \dots, F(a_k))$ holds, for every $k$-place relation symbol $P$ of $\mathcal{L}$ and elements $a_1$, \dots, $a_k$ of $|\mathfrak{N}|$;
\end{enumerate}
and
\item {\em elementarily equivalent\/},\index{elementary equivalence} \index{equivalence, elementary} written as $\mathfrak{N} \equiv \mathfrak{M}$, if $\text{\rm Th}(\mathfrak{N}) = \text{\rm Th}(\mathfrak{M})$, {\em i.e.\/} if $\mathfrak{N} \models \sigma$ if and only if $\mathfrak{M} \models \sigma$ for every sentence $\sigma$ of $\mathcal{L}$.
\end{enumerate}
\end{defn}

That is, two structures for a given language are isomorphic if they are structurally identical and elementarily equivalent if no statement in the language can distinguish between them. Isomorphic structures are elementarily equivalent:

\begin{prop} \label{p:nine4}
Suppose $\mathcal{L}$ is a first-order language and $\mathfrak{N}$ and $\mathfrak{M}$ are structures for $\mathcal{L}$ such that $\mathfrak{N} \cong \mathfrak{M}$.
Then $\mathfrak{N} \equiv \mathfrak{M}$.
\end{prop}

However, as the following application of the Compactness Theorem shows, elementarily equivalent structures need not be isomorphic:

\begin{exmp}
Note that $\mathfrak{C} = (\mathbb{N})$ is an infinite structure for $\mathcal{L}_=$. Expand $\mathcal{L}_=$ to $\mathcal{L}_R$ by adding a constant symbol $c_r$ for every real number $r$, and let $\Sigma$ be the set of sentences of $\mathcal{L}_R$ including
\begin{itemize}
\item every sentence $\tau$ of $\text{\rm Th}(\mathfrak{C})$, {\em i.e.\/} such that $\mathfrak{C} \models \tau$, and
\item $\lnot c_r = c_s$ for every pair of real numbers $r$ and $s$ such that $r \ne s$.
\end{itemize}
Every finite subset of $\Sigma$ is satisfiable. (Why?) Thus, by the Compactness Theorem, there is a structure $\mathfrak{U}'$ for $\mathcal{L}_R$ satisfying $\Sigma$, and hence $\text{\rm Th}(\mathfrak{C})$. The structure $\mathfrak{U}$ obtained by dropping the interpretations of all the constant symbols $c_r$ from $\mathfrak{U}'$ is then a structure for $\mathcal{L}_=$ which satisfies $\text{\rm Th}(\mathfrak{C})$. Note that $|\mathfrak{U}| = |\mathfrak{U}'|$ is at least as large as the set of all real numbers $\mathbb{R}$, since $\mathfrak{U}'$ requires a distinct element of the universe to interpret each constant symbol $c_r$ of $\mathcal{L}_R$.

Since $\text{\rm Th}(\mathfrak{C})$ is a maximally consistent set of sentences of $\mathcal{L}_=$ by Problem \ref{p:smac}, it follows from the above that $\mathfrak{C} \equiv \mathfrak{U}$. On the other hand, $\mathfrak{C}$ cannot be isomorphic to $\mathfrak{U}$ because there cannot be an onto map from a countable set, such as $\mathbb{N} = |\mathfrak{C}|$, to a set which is at least as large as $\mathbb{R}$, such as $|\mathfrak{U}|$.
\end{exmp}

In general, the method used above can be used to show that if a set of sentences in a first-order language has an infinite model, it has many different ones.
In $\mathcal{L}_=$ that is essentially all that can happen:

\begin{prop} \label{p:nine5}
Two structures for $\mathcal{L}_=$ are elementarily equivalent if and only if they are isomorphic or both infinite.
\end{prop}

\begin{prob} \label{p:nine6}
Let $\mathfrak{N} = (\mathbb{N}, 0, 1, S, +, \cdot, E)$ be the standard structure for $\mathcal{L}_N$. Use the Compactness Theorem to show there is a structure $\mathfrak{M}$ for $\mathcal{L}_N$ such that $\mathfrak{N} \equiv \mathfrak{M}$ but not $\mathfrak{N} \cong \mathfrak{M}$.
\end{prob}

Note that because $\mathfrak{N}$ and $\mathfrak{M}$ both satisfy $\text{\rm Th}(\mathfrak{N})$, which is maximally consistent by Problem \ref{p:smac}, there is absolutely no way of telling them apart in $\mathcal{L}_N$.

\begin{prop} \label{p:nine7}
Every model of $\text{\rm Th}(\mathfrak{N})$ which is {\em not\/} isomorphic to $\mathfrak{N}$ has
\begin{enumerate}
\item an isomorphic copy of $\mathfrak{N}$ embedded in it,
\item an infinite number, {\em i.e.\/} one larger than all of those in the copy of $\mathfrak{N}$, and
\item an infinite decreasing sequence.
\end{enumerate}
\end{prop}

The apparent limitation of first-order logic that non-isomorphic structures may be elementarily equivalent can actually be useful. A non-standard model\index{non-standard model} may have features that make it easier to work with than the standard model one is really interested in. Since both structures satisfy exactly the same sentences, if one uses these features to prove that some statement expressible in the given first-order language is true about the non-standard structure, one gets for free that it must be true of the standard structure as well. A prime example of this idea is the use of non-standard models of the real numbers\index{non-standard models of the reals} containing infinitesimals (numbers which are infinitely small but different from zero) in some areas of analysis.
\begin{thm} \label{t:nsr}
Let $\mathfrak{R} = (\mathbb{R}, 0, 1, +, \cdot)$ be the field of real numbers, considered as a structure for $\mathcal{L}_F$. Then there is a model of $\text{\rm Th}(\mathfrak{R})$ which contains a copy of $\mathbb{R}$ and in which there is an infinitesimal\index{infinitesimal}.
\end{thm}

The non-standard models of the real numbers\index{non-standard models of the reals} actually used in analysis are usually obtained in more sophisticated ways in order to have more information about their internal structure. It is interesting to note that infinitesimals were the intuition behind calculus for Leibniz when it was first invented, but no one was able to put their use on a rigorous footing until Abraham Robinson did so in the early 1960s.

\chapter*{Hints for Chapters 5--9}

%
% Hints for Chapter 5 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:five}}

\begin{clue}{p:five1}
Try to disassemble each string using Definition \ref{d:ter}. Note that some might be valid terms of more than one of the given languages.
\end{clue}

\begin{clue}{p:five2}
This is similar to Problem \ref{p:lof}.
\end{clue}

\begin{clue}{p:fcmt}
This is similar to Proposition \ref{p:foc}.
\end{clue}

\begin{clue}{p:five4}
Try to disassemble each string using Definitions \ref{d:ter} and \ref{d:for}. Note that some might be valid formulas of more than one of the given languages.
\end{clue}

\begin{clue}{p:five5}
This is just like Problem \ref{p:lrp}.
\end{clue}

\begin{clue}{p:five6}
This is similar to Problem \ref{p:lof}. You may wish to use your solution to Problem \ref{p:five2}.
\end{clue}

\begin{clue}{p:five7}
This is similar to Proposition \ref{p:foc}.
\end{clue}

\begin{clue}{p:for}
You might want to rephrase some of the given statements to make them easier to formalize.
\begin{enumerate}
\item Look up associativity if you need to.
\item ``There is an object such that every object is not in it.''
\item This should be easy.
\item Ditto.
\item ``Any two things must be the same thing.'' \end{enumerate} \end{clue} \begin{clue}{p:fole} If necessary, don't hesitate to look up the definitions of the given structures. \begin{enumerate} \item Read the discussion at the beginning of the chapter. \item You really need only one non-logical symbol. \item There are two sorts of objects in a vector space, the vectors themselves and the scalars of the field, which you need to be able to tell apart. \end{enumerate} \end{clue} \begin{clue}{p:five10} Use Definition \ref{d:for} in the same way that Definition \ref{d:form} was used in Definition \ref{d:subf}. \end{clue} \begin{clue}{p:five11} The scope of a quantifier ought to be a certain subformula of the formula in which the quantifier occurs. \end{clue} \begin{clue}{p:five12} Check to see whether they satisfy Definition \ref{d:frv}. \end{clue} \begin{clue}{p:five13} Check to see which pairs satisfy Definition \ref{d:exlan}. \end{clue} \begin{clue}{p:exlan} Proceed by induction on the length of $\varphi$ using Definition \ref{d:for}. \end{clue} \begin{clue}{t:urt} This is similar to Theorem \ref{t:ur}. \end{clue} \begin{clue}{t:urf} This is similar to Theorem \ref{t:ur} and uses Theorem~\ref{t:urt}. \end{clue} % % Hints for Chapter 6 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:six}} \begin{clue}{p:six1} In each case, apply Definition \ref{d:str}. \begin{enumerate} \item This should be easy. \item Ditto. \item Invent objects which are completely different except that they happen to have the right number of the right kind of components. \end{enumerate} \end{clue} \begin{clue}{pb:exas} Figure out the relevant values of $s(v_n)$ and apply Definition \ref{d:exas}. \end{clue} \begin{clue}{p:eau} Suppose $\mathbf{s}$ and $\mathbf{r}$ both extend the assignment $s$. Show that $\mathbf{s}(t) = \mathbf{r}(t)$ by induction on the length of the term $t$. 
\end{clue} \begin{clue}{p:six4} Unwind the formulas using Definition \ref{d:sat} to get informal statements whose truth you can determine. \end{clue} \begin{clue}{p:six5} Unwind the abbreviation $\exists$ and use Definition~\ref{d:sat}. \end{clue} \begin{clue}{p:ord} Unwind each of the formulas using Definitions~\ref{d:sat} and \ref{d:mod} to get informal statements whose truth you can determine. \end{clue} \begin{clue}{l:six7} This is much like Proposition \ref{p:eau}. \end{clue} \begin{clue}{p:six8} Proceed by induction on the length of the formula using Definition \ref{d:sat} and Lemma \ref{l:six7}. \end{clue} \begin{clue}{c:six9} How many free variables does a sentence have? \end{clue} \begin{clue}{p:inf} Use Definition \ref{d:sat}. \end{clue} \begin{clue}{p:taut} Unwind the sentences in question using Definition \ref{d:sat}. \end{clue} \begin{clue}{p:six12} Use Definitions \ref{d:sat} and \ref{d:mod}; the proof is similar in form to the proof of Proposition \ref{p:moto}. \end{clue} \begin{clue}{p:six13} Use Definitions \ref{d:sat} and \ref{d:mod}; the proof is similar in form to the proof for Problem \ref{p:sanc}. \end{clue} \begin{clue}{p:mif} Use Definitions \ref{d:sat} and \ref{d:mod} in each case, plus the meanings of our abbreviations. \end{clue} \begin{clue}{p:exsat} In one direction, you need to add appropriate objects to a structure; in the other, delete them. In both cases, you still have to verify that $\Gamma$ is still satisfied. \end{clue} \begin{clue}{p:six16} Here are some appropriate languages. \begin{enumerate} \item $\mathcal{L}_=$ \item Modify your language for graph theory from Problem~\ref{p:fole} by adding a 1-place relation symbol. \item Use your language for group theory from Problem~\ref{p:fole}. \item $\mathcal{L}_F$ \end{enumerate} \end{clue} % % Chapter 7 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:seven}} \begin{clue}{p:subs} \begin{enumerate} \item Use Definition \ref{d:subs}. \item Ditto. 
\item Ditto. \item Proceed by induction on the length of the formula $\varphi$. \end{enumerate} \end{clue} \begin{clue}{l:gen} Use the definitions and facts about $\models$ from Chapter \ref{ch:six}. \end{clue} \begin{clue}{p:seven3} Check each case against the schema in Definition \ref{d:axs}. Don't forget that any generalization of a logical axiom is also a logical axiom. \end{clue} \begin{clue}{p:seven4} You need to show that any instance of the schemas A1--A8 is a tautology and then apply Lemma \ref{l:gen}. That each instance of schemas A1--A3 is a tautology follows from Proposition \ref{p:mif}. For A4--A8 you'll have to use the definitions and facts about $\models$ from Chapter 6. \end{clue} \begin{clue}{p:deds} You may wish to appeal to the deductions that you made or were given in Chapter \ref{ch:three}. \begin{enumerate} \item Try using A4 and A6. \item You don't need A4--A8 here. \item Try using A4 and A8. \item A8 is the key; you may need it more than once. \item This is just A6 in disguise. \end{enumerate} \end{clue} \begin{clue}{p:seven5a} This is just like its counterpart for propositional logic. \end{clue} \begin{clue}{p:seven6} Ditto. \end{clue} \begin{clue}{p:bim} Ditto. \end{clue} \begin{clue}{p:seven8} Ditto. \end{clue} \begin{clue}{t:fded} Ditto. \end{clue} \begin{clue}{t:gen} Proceed by induction on the length of the shortest proof of $\varphi$ from $\Gamma$. \end{clue} \begin{clue}{t:genc} Ditto. \end{clue} \begin{clue}{p:seven12} As usual, don't take the following suggestions as gospel. \begin{enumerate} \item Try using A8. \item Start with Example \ref{e:apf}. \item Start with part of Problem \ref{p:deds}. \end{enumerate} \end{clue} % % Hints for Chapter 8 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:eight}} \begin{clue}{t:fsnd} This is similar to the proof of the Soundness Theorem for propositional logic, using Proposition \ref{p:inf} in place of Proposition \ref{p:snd}. 
\end{clue}

\begin{clue}{p:sacon}
This is similar to its counterpart for propositional logic, Proposition \ref{p:stoc}. Use Proposition \ref{p:inf} instead of Proposition \ref{p:snd}.
\end{clue}

\begin{clue}{p:eight4}
This is just like its counterpart for propositional logic.
\end{clue}

\begin{clue}{p:eight5}
Ditto.
\end{clue}

\begin{clue}{c:eight6}
Ditto.
\end{clue}

\begin{clue}{p:smac}
This is a counterpart to Problem \ref{p:emc}; use Proposition \ref{p:sacon} instead of Proposition \ref{p:stoc} and Proposition \ref{p:mif} instead of Proposition \ref{p:tif}.
\end{clue}

\begin{clue}{p:eight8}
This is just like its counterpart for propositional logic.
\end{clue}

\begin{clue}{p:eight9}
Ditto.
\end{clue}

\begin{clue}{p:eight10}
Ditto.
\end{clue}

\begin{clue}{t:etmc}
This is much like its counterpart for propositional logic, Theorem \ref{t:exmc}.
\end{clue}

\begin{clue}{p:eight14}
Use Proposition \ref{p:bim}.
\end{clue}

\begin{clue}{l:eight12}
Use the Generalization Theorem for the hard direction.
\end{clue}

\begin{clue}{t:exmcw}
This is essentially a souped-up version of Theorem~\ref{t:etmc}. To ensure that $C$ is a set of witnesses of the maximally consistent set of sentences, enumerate all the formulas $\varphi$ of $\mathcal{L}'$ with one free variable and take care of one at each step in the inductive construction.
\end{clue}

\begin{clue}{t:mfc}
To construct the required structure, $\mathfrak{M}$, proceed as follows. Define an equivalence relation $\sim$ on $C$ by setting $c \sim d$ if and only if $c = d \in \Sigma$, and let $[c] = \{\, a \in C \mid a \sim c \,\}$ be the equivalence class of $c \in C$. The universe of $\mathfrak{M}$ will be $M = \{\, [c] \mid c \in C \,\}$. For each $k$-place function symbol $f$ define $f^{\mathfrak{M}}$ by setting $f^{\mathfrak{M}}([a_1], \dots, [a_k]) =[b]$ if and only if $fa_1\dots a_k = b$ is in $\Sigma$. Define the interpretations of constant symbols and relation symbols in a similar way.
You need to show that all these things are well-defined, and then show that $\mathfrak{M} \models \Sigma$.
\end{clue}

\begin{clue}{p:eight17}
Expand $\Gamma$ to a maximally consistent set of sentences with a set of witnesses in a suitable extension of $\mathcal{L}$, apply Theorem \ref{t:mfc}, and then cut down the resulting structure to one for $\mathcal{L}$.
\end{clue}

\begin{clue}{t:sacof}
One direction is just Proposition \ref{p:sacon}. For the other, use Corollary \ref{p:eight17}.
\end{clue}

\begin{clue}{t:fcmpl}
This follows from Theorem \ref{t:sacof} in the same way that the Completeness Theorem for propositional logic followed from Theorem \ref{t:saco}.
\end{clue}

\begin{clue}{p:fcmpct}
This follows from Theorem \ref{t:sacof} in the same way that the Compactness Theorem for propositional logic followed from Theorem \ref{t:saco}.
\end{clue}

%
% Hints for Chapter 9 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:nine}}

\begin{clue}{p:nine1}
In each case, apply the trick used in Example~\ref{e:com}. For definitions and the concrete examples, consult texts on combinatorics and abstract algebra.
\end{clue}

\begin{clue}{t:ram}
Suppose Ramsey's Theorem fails for some $n$. Use the Compactness Theorem to get a contradiction to Lemma \ref{l:irt} by showing there must be an infinite graph with no clique or independent set of size $n$.
\end{clue}

\begin{clue}{l:irt}
Inductively define a sequence $a_0$, $a_1$, \dots, of vertices so that for every $n$, either it is the case that for all $k > n$ there is an edge joining $a_n$ to $a_k$ or it is the case that for all $k > n$ there is no edge joining $a_n$ to $a_k$. There will then be a subsequence of the sequence which is an infinite clique or a subsequence which is an infinite independent set.
\end{clue}

\begin{clue}{p:nine4}
The key is to figure out how, given an assignment for one structure, one should define the corresponding assignment in the other structure.
After that, proceed by induction using the definition of satisfaction.
\end{clue}

\begin{clue}{p:nine5}
When are two finite structures for $\mathcal{L}_=$ elementarily equivalent?
\end{clue}

\begin{clue}{p:nine6}
In a suitable expanded language, consider $\text{\rm Th}(\mathfrak{N})$ together with the sentences $\exists x\, 0 + x = c$, $\exists x\, S0 + x = c$, $\exists x\, SS0 + x = c$, \dots
\end{clue}

\begin{clue}{p:nine7}
Suppose $\mathfrak{M} \models \text{\rm Th}(\mathfrak{N})$ but is not isomorphic to $\mathfrak{N}$.
\begin{enumerate}
\item Consider the subset of $|\mathfrak{M}|$ given by $0^{\mathfrak{M}}$, $S^{\mathfrak{M}}(0^{\mathfrak{M}})$, $S^{\mathfrak{M}}(S^{\mathfrak{M}}(0^{\mathfrak{M}}))$, \dots
\item If it didn't have one, it would be a copy of $\mathfrak{N}$.
\item Start with an infinite number and work down.
\end{enumerate}
\end{clue}

\begin{clue}{t:nsr}
Expand $\mathcal{L}_F$ by throwing in a constant symbol for every real number, plus an extra one, and take it from there.
\end{clue}

%
% Part III of "A Problem Course in Mathematical Logic"
%
\part{Computability}

%
% Tenth chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Turing Machines} \label{ch:ten}

Of the various ways to formalize the notion of an ``effective method'', the most commonly used are the simple abstract computers called Turing machines,\index{machine, Turing}\index{Turing machine} which were introduced more or less simultaneously by Alan Turing and Emil Post in 1936.\footnote{Both papers are reprinted in \cite{DA:U}. Post's brief paper gives a particularly lucid informal description.} Like most real-life digital computers, Turing machines have two main parts, a processing unit and a memory (which doubles as the input/output device), which we will consider separately before seeing how they interact. The memory can be thought of as an infinite tape which is divided up into cells like the frames of a movie. The Turing machine proper is the processing unit.
It has a scanner\index{scanner} or head\index{head} which can read from or write to a single cell of the tape, and which can be moved to the left or right one cell at a time.

\subsection*{Tapes}
To keep things simple, in this chapter we will only allow Turing machines to read and write the symbols $0$ and $1$. (One symbol per cell!) Moreover, we will allow the tape to be infinite in only one direction. That these restrictions do not affect what a Turing machine can, in principle, compute follows from the results in the next chapter.

\begin{defn} \label{d:tape} \index{tape}
A {\em tape\/} is an infinite sequence
$$ \mathbf{a} = a_0\, a_1\, a_2\, a_3 \dots $$
such that for each integer $i \ge 0$ the {\em cell\/} $a_i \in \{0,1\}$. The $i$th cell is said to be {\em blank\/}\index{blank cell}\index{cell, blank} if $a_i$ is $0$, and {\em marked\/}\index{marked cell}\index{cell, marked} if $a_i$ is $1$.
\end{defn}

A blank tape \index{tape, blank}\index{blank tape} is one in which every cell is $0$.

\begin{exmp}\label{TM:tape}
A blank tape looks like:
$$ 000000000000000000000000 \cdots $$
The $0$th cell is the leftmost one, cell $1$ is the one immediately to the right, cell $2$ is the one immediately to the right of cell $1$, and so on.

The following is a slightly more exciting tape:
$$ 0101101110001000000000000000 \cdots $$
In this case, cell $1$ is marked ({\it i.e.\/} contains a $1$), as are cells $3$, $4$, $6$, $7$, $8$, and $12$; all the rest are blank ({\it i.e.\/} contain a $0$).
\end{exmp}

\begin{prob} \label{ten:tape}
Write down tapes satisfying the following.
\begin{enumerate}
\item Entirely blank except for cells $3$, $12$, and $20$.
\item Entirely marked except for cells $0$, $2$, and $3$.
\item Entirely blank except that 1025 is written out in binary just to the right of cell $2$.
\end{enumerate}
\end{prob}

To keep track of which cell the Turing machine's scanner is at, plus which instruction the Turing machine is to execute next, we will usually attach additional information to our description of the tape.

\begin{defn} \index{tape position} \index{position, tape}
A {\em tape position\/} is a triple $(s,i,\mathbf{a})$, where $s$ and $i$ are natural numbers with $s > 0$, and $\mathbf{a}$ is a tape. Given a tape position $(s,i,\mathbf{a})$, we will refer to cell $i$ as the {\em scanned cell\/}\index{scanned cell}\index{cell, scanned} and to $s$ as the {\em state\/}\index{state}.
\end{defn}

Note that if $(s,i,\mathbf{a})$ is a tape position, then the corresponding Turing machine's scanner is presently reading $a_i$ (which is one of $0$ or $1$).

\subsection*{Conventions for tapes}
Unless stated otherwise, we will assume that all but finitely many cells of any given tape are blank, and that any cells not explicitly described or displayed are blank. We will usually depict as little of a tape as possible and omit the $\cdots$s we used above. Thus
$$ 0101101110001 $$
represents the tape given in Example \ref{TM:tape}. In many cases we will also use $z^n$ to abbreviate $n$ consecutive copies of $z$, so the same tape could be represented by
$$ 0101^201^30^31\, . $$
Similarly, if $\sigma$ is a finite sequence of elements of $\{0,1\}$, we may write $\sigma^n$ for the sequence consisting of $n$ copies of $\sigma$ stuck together end-to-end. For example, $(010)^3$ is short for $010010010$.

In displaying tape positions we will usually underline the scanned cell and write $s$ to the left of the tape. For example, we would display the tape position using the tape from Example \ref{TM:tape} with cell $3$ being scanned and state $2$ as follows:
$$ 2 \colon 010\underline{1}101110001 $$
Note that in this example, the scanner is reading a $1$.
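
To illustrate how these conventions combine, here is one more small worked case; the particular tape, scanned cell, and state are chosen purely for illustration and are not used elsewhere in the text. The abbreviated tape $1^30(10)^21$ unpacks as
$$ 1^3\,0\,(10)^2\,1 = 111\,0\,1010\,1 = 111010101 \, , $$
so cells $0$, $1$, $2$, $4$, $6$, and $8$ are marked and all other cells are blank. The tape position with this tape, cell $4$ being scanned, and state $5$ would then be displayed as
$$ 5 \colon 1110\underline{1}0101 \, , $$
and in this position the scanner is again reading a $1$.
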
\begin{prob} \label{ten:tppos} Using the tapes you gave in the corresponding part of Problem \ref{ten:tape}, write down tape positions satisfying the following conditions. \begin{enumerate} \item Cell $7$ being scanned and state $4$. \item Cell $4$ being scanned and state $3$. \item Cell $3$ being scanned and state $413$. \end{enumerate} \end{prob} \subsection*{Turing machines} The ``processing unit'' of a Turing machine is just a finite list of specifications describing what the machine will do in various situations. (Remember, this is an {\em abstract\/} computer\dots) The formal definition may not seem to amount to this at first glance. \begin{defn} \label{d:TM} \index{machine, Turing} \index{Turing machine} A {\em Turing machine\/} is a function $M$ such that for some natural number $n$, \begin{eqnarray*} \mathrm{dom}(M) & \subseteq & \{1,\dots,n\} \times \{0,1\} \\ & = & \{\, (s,b) \mid 1 \le s \le n \text{\ and\ } b \in \{0,1\} \,\} \end{eqnarray*} and \begin{eqnarray*} \mathrm{ran}(M) & \subseteq & \{0,1\} \times \{-1,1\} \times \{1,\dots,n\} \\ & = & \{\, (c,d,t) \mid c \in \{0,1\} \text{\ and\ } d \in \{-1,1\} \text{\ and\ }1 \le t \le n \,\}\, . \end{eqnarray*} Note that $M$ does not have to be defined for all possible pairs $$ (s,b) \in \{1,\dots,n\} \times \{0,1\} \, . $$ We will sometimes refer to a Turing machine simply as a {\em machine\/} \index{machine} or {\rm TM\/} \index{TM}. If $n \ge 1$ is least such that $M$ satisfies the definition above, we shall say that $M$ is an {\em $n$-state Turing machine\/}\index{$n$-state Turing machine}\index{Turing machine $n$-state} and that $\{1,\dots,n\}$ is the set of {\em states\/}\index{state} of $M$. \end{defn} Intuitively, we have a processing unit which has a finite list of basic instructions, the states, which it can execute. 
Given a combination of current state and the symbol marked in the currently scanned cell of the tape, the list specifies \begin{itemize} \item a symbol to be written in the currently scanned cell, overwriting the symbol being read, then \item a move of the scanner one cell to the left or right, and then \item the next instruction to be executed. \end{itemize} That is, $M(s,c) = (b,d,t)$ means that if our machine is in state $s$ ({\em i.e.\/} executing instruction number $s$) and the scanner is presently reading a $c$ in cell $i$, then the machine $M$ should \begin{itemize} \item set $a_i = b$ ({\em i.e.\/} write $b$ instead of $c$ in the scanned cell), then \item move the scanner to $a_{i+d}$ ({\em i.e.\/} move one cell left if $d = -1$ and one cell right if $d = 1$), and then \item enter state $t$ ({\em i.e.\/} go to instruction $t$). \end{itemize} If our processor isn't equipped to handle input $c$ for instruction $s$ ({\em i.e.\/} $M(s,c)$ is undefined), then the computation in progress will simply stop dead or {\it halt\/}.\index{halt, Turing machine}\index{Turing machine, halt} \begin{exmp}\label{TM:M} We will usually present Turing machines in the form of a table\index{Turing machine, table}\index{table, Turing machine}, with a row for each state and a column for each possible entry in the scanned cell. Instead of $-1$ and $1$, we will usually use $L$ and $R$ when writing such tables in order to make them more readable. Thus the table \begin{center} \mbox{ \begin{tabular}{c|c|c} $M$ & $0$ & $1$ \\ \hline $1$ & $1R2$ & $0R1$ \\ $2$ & $0L2$ & \end{tabular} } \end{center} \noindent defines a Turing machine $M$ with two states such that $M(1,0) = (1,1,2)$, $M(1,1) = (0,1,1)$, and $M(2,0) = (0,-1,2)$, but $M(2,1)$ is undefined. In this case $M$ has domain $\{\, (1,0),\, (1,1),\, (2,0) \,\}$ and range $\{\, (1,1,2),\, (0,1,1),\, (0,-1,2) \,\}$. 
If the machine $M$ were faced with the tape position
$$ 1 \colon 010\underline{0}1111 \, , $$
it would, since it is in state $1$ while scanning a cell containing $0$,
\begin{itemize}
\item write a $1$ in the scanned cell,
\item move the scanner one cell to the right, and
\item go to state $2$.
\end{itemize}
This would give the new tape position
$$ 2 \colon 0101\underline{1}111 \, . $$
Since $M$ doesn't know what to do on input $1$ in state $2$, it would then halt, ending the computation.
\end{exmp}

\begin{prob} \label{ten:TMs}
In each case, give the table of a Turing machine $M$ meeting the given requirement.
\begin{enumerate}
\item $M$ has three states.
\item $M$ changes $0$ to $1$ and {\em vice versa\/} in any cell it scans.
\item $M$ is as simple as possible. How many possibilities are there here?
\end{enumerate}
\end{prob}

\subsection*{Computations}
Informally, a computation is a sequence of actions of a machine $M$ on a tape according to the rules above, starting with instruction $1$ and the scanner at cell $0$ on the given tape. A computation ends (or {\em halts\/}\index{halt, Turing machine}\index{Turing machine, halt}) when and if the machine encounters a tape position which it does not know what to do in. If it never halts, and doesn't {\it crash\/}\index{crash, Turing machine}\index{Turing machine, crash} by running the scanner off the left end of the tape\footnote{Be warned that most authors prefer to treat running the scanner off the left end of the tape as being just another way of halting. Halting with the scanner on the tape is more convenient, however, when putting together different Turing machines to make more complex ones.} either, the computation will never end. The formal definition makes all this seem much more formidable.

\begin{defn} \label{TM:pc}
Suppose $M$ is a Turing machine.
Then:
\begin{itemize}
\item If $p = (s,i,\mathbf{a})$ is a tape position and $M(s,a_i) = (b,d,t)$ is defined, then $\mathbf{M}(p) = (t,i+d,\mathbf{a}')$ is the {\em successor tape position\/}, \index{successor tape position} \index{tape position, successor} where $a'_i = b$ and $a'_j = a_j$ whenever $j \ne i$.
\item A {\em partial computation\/} \index{partial computation} \index{computation, partial} with respect to $M$ is a sequence $p_1 p_2 \dots p_k$ of tape positions such that $p_{\ell+1} = \mathbf M(p_\ell)$ for each $\ell < k$.
\item A partial computation $p_1 p_2 \dots p_k$ with respect to $M$ is a {\em computation\/}\index{computation} (with respect to $M$) with {\em input tape\/} \index{input tape} \index{tape, input} $\mathbf{a}$ if $p_1 = (1,0,\mathbf{a})$ and $\mathbf M(p_k)$ is undefined (and {\it not\/} because the scanner would run off the left end of the tape). The {\em output tape\/} \index{output tape} \index{tape, output} of the computation is the tape of the final tape position $p_k$.
\end{itemize}
\end{defn}

Note that a partial computation is a computation only if the Turing machine halts but doesn't crash in the final tape position. The requirement that it halt means that any computation can have only finitely many steps. Unless stated otherwise, we will assume that every partial computation on a given input begins in state $1$. We will often omit the ``partial'' when speaking of computations that might not strictly satisfy the definition of computation.

\begin{exmp}
Let's see the machine $M$ of Example \ref{TM:M} perform a computation. Our input tape will be $\mathbf{a} = 1 1 0 0$, that is, the tape which is entirely blank except that $a_0 = a_1 = 1$.
The initial tape position of the computation of $M$ with input tape $\mathbf{a}$ is:
$$ 1 \colon \underline{1} 1 0 0 $$
The subsequent steps in the computation are:
\begin{align*}
& 1 \colon 0 \underline{1} 0 0 \\
& 1 \colon 0 0 \underline{0} 0 \\
& 2 \colon 0 0 1 \underline{0} \\
& 2 \colon 0 0 \underline{1}
\end{align*}
We leave it to the reader to check that this is indeed a partial computation with respect to $M$. Since $M(2,1)$ is undefined the process terminates at this point and this partial computation is therefore a computation.
\end{exmp}

\begin{prob} \label{ten:comp}
Give the (partial) computation of the Turing machine $M$ of Example \ref{TM:M} starting in state $1$ with the input tape:
\begin{enumerate}
\item $\underline{0} 0$
\item $1 \underline{1} 0$
\item The tape with all cells marked and cell $5$ being scanned.
\end{enumerate}
\end{prob}

\begin{prob} \label{ten:inps}
For which possible input tapes does the partial computation of the Turing machine $M$ of Example \ref{TM:M} eventually terminate? Explain why.
\end{prob}

\begin{prob} \label{ten:mrks}
Find a Turing machine that (eventually!) fills a blank input tape with the pattern $010110001011000101100\dots$.
\end{prob}

\begin{prob} \label{ten:runs}
Find a Turing machine that never halts (or crashes), no matter what is on the tape.
\end{prob}

\subsection*{Building Turing Machines}
It will be useful later on to have a library of Turing machines that manipulate blocks of $1$s in various ways, and very useful to be able to combine machines performing simpler tasks to perform more complex ones.

\begin{exmp} \label{ex:STU}
The Turing machine $S$ given below is intended to halt with output $01^k\underline{0}$ on input $\underline{0}1^k$, if $k>0$; that is, it just moves past a single block of $1$s without disturbing it.
\begin{center}
\mbox{
\begin{tabular}{c|c|c}
$S$ & $0$ & $1$ \\ \hline
$1$ & $0R2$ & \\
$2$ & & $1R2$ \\
\end{tabular}
}
\end{center}
Trace this machine's computation on, say, input $\underline{0}1^3$ to see how it works.

The following machine, which is itself a variation on $S$, does the reverse of what $S$ does: on input $01^k\underline{0}$ it halts with output $\underline{0}1^k$.
\begin{center}
\mbox{
\begin{tabular}{c|c|c}
$T$ & $0$ & $1$ \\ \hline
$1$ & $0L2$ & \\
$2$ & & $1L2$ \\
\end{tabular}
}
\end{center}

We can combine $S$ and $T$ into a machine $U$ which does nothing to a block of $1$s: given input $\underline{0}1^k$ it halts with output $\underline{0}1^k$. (Of course, a better way to do nothing is to really do nothing!)
\begin{center}
\mbox{
\begin{tabular}{c|c|c}
$U$ & $0$ & $1$ \\ \hline
$1$ & $0R2$ & \\
$2$ & $0L3$ & $1R2$ \\
$3$ & & $1L3$
\end{tabular}
}
\end{center}
Note how the states of $T$ had to be renumbered to make the combination work.
\end{exmp}

\begin{exmp} \label{ex:P22}
The Turing machine $P$ given below is intended to move a block of $1$s: on input $\underline{0}0^n1^k$, where $n \ge 0$ and $k > 0$, it halts with output $\underline{0}1^k$.
\begin{center}
\mbox{
\begin{tabular}{c|c|c}
$P$ & $0$ & $1$ \\ \hline
$1$ & $0R2$ & \\
$2$ & $1R3$ & $1L8$ \\
$3$ & $0R3$ & $0R4$ \\
$4$ & $0R7$ & $1L5$ \\
$5$ & $0L5$ & $1R6$ \\
$6$ & $1R3$ & \\
$7$ & $0L7$ & $1L8$ \\
$8$ & & $1L8$ \\
\end{tabular}
}
\end{center}
Trace $P$'s computation on, say, input $\underline{0} 0^3 1^3$ to see how it works. Trace it on inputs $\underline{0} 1^2$ and $\underline{0}0^21$ as well to see how it handles certain special cases.
\end{exmp}

\begin{note}
In both Examples \ref{ex:STU} and \ref{ex:P22} we do not really care what the given machines do on other inputs, so long as they perform as intended on the particular inputs we are concerned with.
\end{note}

\begin{prob} \label{ten:combTMs}
We can combine the machine $P$ of Example~\ref{ex:P22} with the machines $S$ and $T$ of Example~\ref{ex:STU} to get the following machine.
\begin{center}
\mbox{
\begin{tabular}{c|c|c}
$R$ & $0$ & $1$ \\ \hline
$1$ & $0R2$ & \\
$2$ & $0R3$ & $1R2$ \\
$3$ & $1R4$ & $1L9$ \\
$4$ & $0R4$ & $0R5$ \\
$5$ & $0R8$ & $1L6$ \\
$6$ & $0L6$ & $1R7$ \\
$7$ & $1R4$ & \\
$8$ & $0L8$ & $1L9$ \\
$9$ & $0L10$ & $1L9$ \\
$10$ & & $1L10$
\end{tabular}
}
\end{center}
What task involving blocks of $1$s is this machine intended to perform?
\end{prob}

\begin{prob} \label{ten:mTMs}
In each case, devise a Turing machine that:
\begin{enumerate}
\item Halts with output $\underline{0}1^4$ on input $\underline{0}$.
\item Halts with output $01^n\underline{0}$ on input $\underline{0}0^n1$.
\item Halts with output $\underline{0}1^{2n}$ on input $\underline{0}1^n$.
\item Halts with output $\underline{0}(10)^n$ on input $\underline{0}1^n$.
\item Halts with output $\underline{0}1^m$ on input $\underline{0}1^n01^m$ whenever $n,m > 0$.
\item Halts with output $\underline{0}1^m01^n01^k$ on input $\underline{0}1^n01^k01^m$, if $n,m,k > 0$.
\item Halts with output $\underline{0}1^m01^n01^k01^m01^n01^k$ on input $\underline{0}1^m01^n01^k$, if $n,m,k > 0$.
\item On input $\underline{0}1^m01^n$, where $m,n >0$, halts with output $\underline{0}1$ if $m \ne n$ and output $\underline{0}11$ if $m = n$.
\end{enumerate}
It doesn't matter what the machine you define in each case may do on other inputs, so long as it does the right thing on the given one(s).
\end{prob}

%
% Eleventh chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Variations and Simulations} \label{ch:eleven}
The definition of a Turing machine given in Chapter \ref{ch:ten} is arbitrary in a number of ways, among them the use of the symbols $0$ and $1$, a single read-write scanner, and a single one-way infinite tape.
One could further restrict the definition we gave by allowing
\begin{itemize}
\item the machine to move the scanner\index{scanner} only to one of left or right in each state,
\end{itemize}
or expand it by allowing the use of
\begin{itemize}
\item any finite alphabet\index{alphabet} of at least two symbols,
\item separate read and write heads\index{head separate},
\item multiple heads\index{head multiple},
\item two-way infinite tapes, \index{two-way infinite tape} \index{tape two-way infinite}
\item multiple tapes, \index{tape multiple}
\item two- and higher-dimensional tapes, \index{tape higher-dimensional}
\end{itemize}
or various combinations of these, among many other possibilities. We will construct a number of Turing machines that simulate others with additional features; this will show that various of the modifications mentioned above do not really change what the machines can compute. (In fact, none of them turns out to change it.)

\begin{exmp} \label{sim:LR}
Consider the following Turing machine:
\begin{center}
\mbox{
\begin{tabular}{c|c|c}
$M$ & $0$ & $1$ \\ \hline
$1$ & $1R2$ & $0L1$ \\
$2$ & $0L2$ & $1L1$
\end{tabular}
}
\end{center}
Note that in state $1$, this machine may move the scanner to either the left or the right, depending on the contents of the cell being scanned. We will construct a Turing machine using the same alphabet that emulates the action of $M$ on any input, but which moves the scanner to only one of left or right in each state. There is no problem with state $2$ of $M$, by the way, because in state $2$ $M$ always moves the scanner to the left.

The basic idea is to add some states to $M$ which replace part of the description of state $1$.
\begin{center}
\mbox{
\begin{tabular}{c|c|c}
$M'$ & $0$ & $1$ \\ \hline
$1$ & $1R2$ & $0R3$ \\
$2$ & $0L2$ & $1L1$ \\
$3$ & $0L4$ & $1L4$ \\
$4$ & $0L1$ &
\end{tabular}
}
\end{center}
This machine is just like $M$ except that in state $1$ with input $1$, instead of moving the scanner to the left and going to state $1$, the machine moves the scanner to the right and goes to the new state $3$. States $3$ and $4$ do nothing between them except move the scanner two cells to the left without changing the tape, thus putting it where $M$ would have put it, and then entering state $1$, as $M$ would have.
\end{exmp}

\begin{prob} \label{p:eleven1}
Compare the computations of the machines $M$ and $M'$ of Example \ref{sim:LR} on the input tapes
\begin{enumerate}
\item $0$
\item $011$
\end{enumerate}
and explain why it is not necessary to define $M'$ for state $4$ on input $1$.
\end{prob}

\begin{prob} \label{p:eleven2}
Explain in detail how, given an arbitrary Turing machine $M$, one can construct a machine $M'$ that simulates what $M$ does on any input, but which moves the scanner only to one of left or right in each state.
\end{prob}

It should be obvious that the converse, simulating a Turing machine that moves the scanner only to one of left or right in each state by an ordinary Turing machine, is easy to the point of being trivial.

It is often very convenient to add additional symbols to the alphabet that Turing machines are permitted to use. For example, one might want to have special symbols to use as place markers in the course of a computation. (For a more spectacular application, see Example~\ref{sim:2way} below.) It is conventional to include $0$, the ``blank'' symbol, in an alphabet used by a Turing machine, but otherwise any finite set of symbols goes.

\begin{prob} \label{p:eleven1a}
How do you need to change Definitions \ref{d:tape} and \ref{d:TM} to define Turing machines using a finite alphabet $\Sigma$?
\end{prob}

While allowing arbitrary alphabets is often convenient when designing a machine to perform some task, it doesn't actually change what can, in principle, be computed.

\begin{exmp} \label{sim:alph}
Consider the machine $W$ below which uses the alphabet $\{0,x,y,z\}$.
\begin{center}
\mbox{
\begin{tabular}{c|c|c|c|c}
$W$ & $0$ & $x$ & $y$ & $z$\\ \hline
$1$ & $0R1$ & $xR1$ & $0L2$ & $zR1$\\
\end{tabular}
}
\end{center}
For example, on input $\underline{0}xzyxy$, $W$ will eventually halt with output $0x\underline{z}0xy$. Note that state $2$ of $W$ is used only to halt, so we don't bother to make a row for it on the table.

To simulate $W$ with a machine $Z$ using the alphabet $\{0,1\}$, we first have to decide how to represent $W$'s tape. We will use the following scheme, arbitrarily chosen among a number of alternatives. Every cell of $W$'s tape will be represented by two consecutive cells of $Z$'s tape, with a $0$ on $W$'s tape being stored as $00$ on $Z$'s, an $x$ as $01$, a $y$ as $10$, and a $z$ as $11$. Thus, if $W$ had input tape $\underline{0}xzyxy$, the corresponding input tape for $Z$ would be $\underline{0}0 01 11 10 01 10$.

Designing the machine $Z$ that simulates the action of $W$ on the representation of $W$'s tape is a little tricky. In the example below, each state of $W$ corresponds to a ``subroutine'' of states of $Z$ which between them read the information in each representation of a cell of $W$'s tape and take appropriate action.
\begin{center} \mbox{ \begin{tabular}{c|c|c} $Z$ & $0$ & $1$ \\ \hline $1$ & $0R2$ & $1R3$ \\ $2$ & $0L4$ & $1L6$ \\ $3$ & $0L8$ & $1L13$ \\ $4$ & $0R5$ & \\ $5$ & $0R1$ & \\ $6$ & $0R7$ & \\ $7$ & & $1R1$ \\ $8$ & & $0R9$ \\ $9$ & $0L10$ & \\ $10$ & $0L11$ & \\ $11$ & $0L12$ & $1L12$ \\ $12$ & $0L15$ & $1L15$ \\ $13$ & & $1R14$ \\ $14$ & & $1R1$ \end{tabular} } \end{center} States $1$--$3$ of $Z$ read the input for state $1$ of $W$ and then pass on control to subroutines handling each entry for state $1$ in $W$'s table. Thus states $4$--$5$ of $Z$ take action for state $1$ of $W$ on input $0$, states $6$--$7$ of $Z$ take action for state $1$ of $W$ on input $x$, states $8$--$12$ of $Z$ take action for state $1$ of $W$ on input $y$, and states $13$--$14$ take action for state $1$ of $W$ on input $z$. State $15$ of $Z$ does what state $2$ of $W$ does: nothing but halt. \end{exmp} \begin{prob} \label{p:eleven2a} Trace the (partial) computations of $W$, and their counterparts for $Z$, for the input $\underline{0}xzyxy$ for $W$. Why is the subroutine for state $1$ of $W$ on input $y$ so much longer than the others? How much can you simplify it? \end{prob} \begin{prob} \label{p:eleven3} Given a Turing machine $M$ with an arbitrary alphabet $\Sigma$, explain in detail how to construct a machine $N$ with alphabet $\{0,1\}$ that simulates $M$. \end{prob} Doing the converse of this problem, simulating a Turing machine with alphabet $\{0,1\}$ by one using an arbitrary alphabet, is pretty easy. To define Turing machines with two-way infinite tapes \index{two-way infinite tape} \index{tape, two-way infinite} we need only change Definition~\ref{d:tape}: instead of having tapes $\mathbf{a} = a_0 a_1a_2 \ldots$ indexed by $\mathbb{N}$, we let them be $\mathbf{b} = \ldots b_{-2} b_{-1} b_0 b_1 b_2 \ldots$ indexed by $\mathbb{Z}$. 
In defining computations for machines with two-way infinite tapes, we adopt the same conventions that we did for machines with one-way infinite tapes, such as having the scanner start off scanning cell $0$ on the input tape. The only real difference is that a machine with a two-way infinite tape cannot crash\index{crash} by running off the left end of the tape; it can only stop by halting.\index{halt}

\begin{exmp} \label{sim:2way}
Consider the following two-way infinite tape Turing machine with alphabet $\{0,1\}$:
\begin{center}
\mbox{
\begin{tabular}{c|c|c}
$T$ & $0$ & $1$ \\ \hline
$1$ & $1L1$ & $0R2$ \\
$2$ & $0R2$ & $1L1$ \\
\end{tabular}
}
\end{center}
To emulate $T$ with a Turing machine $O$ that has a one-way infinite tape, we need to decide how to represent a two-way infinite tape on a one-way infinite tape. This is easier to do if we allow ourselves to use an alphabet for $O$ other than $\{0,1\}$, chosen with malice aforethought:
\[ \{\, \tover{0}{\mathrm{S}},\, \tover{1}{\mathrm{S}},\, \tover{0}{0},\, \tover{0}{1},\, \tover{1}{0},\, \tover{1}{1} \,\} \]
We can now represent the tape $\mathbf{a} = \ldots a_{-2} a_{-1} a_0 a_1 a_2 \ldots$ for $T$ by the tape $\mathbf{a}' = \tover{a_0}{\mathrm{S}}\, \tover{a_1}{a_{-1}}\, \tover{a_2}{a_{-2}}\, \ldots$ for $O$. In effect, this trick allows us to split $O$'s tape into two tracks, each of which accommodates half of the tape of $T$.

To define $O$, we split each state of $T$ into a pair of states for $O$, one for the lower track and one for the upper track. One must take care to keep various details straight: when $O$ changes a ``cell'' on one track, it should not change the corresponding ``cell'' on the other track; directions are reversed on the lower track; one has to ``turn a corner'' moving past cell $0$; and so on.
\begin{center}
\mbox{
\begin{tabular}{c|c|c|c|c|c|c|c}
$O$ & $0$ & $\tover{0}{\mathrm{S}}$ & $\tover{0}{0}$ & $\tover{0}{1}$ & $\tover{1}{\mathrm{S}}$ & $\tover{1}{0}$ & $\tover{1}{1}$ \\ \hline
$1$ & $\tover{1}{0}L1$ & $\tover{1}{\mathrm{S}}R3$ & $\tover{1}{0}L1$ & $\tover{1}{1}L1$ & $\tover{0}{\mathrm{S}}R2$ & $\tover{0}{0}R2$ & $\tover{0}{1}R2$ \\
$2$ & $\tover{0}{0}R2$ & $\tover{0}{\mathrm{S}}R2$ & $\tover{0}{0}R2$ & $\tover{0}{1}R2$ & $\tover{1}{\mathrm{S}}R3$ & $\tover{1}{0}L1$ & $\tover{1}{1}L1$ \\
$3$ & $\tover{0}{1}R3$ & $\tover{1}{\mathrm{S}}R3$ & $\tover{0}{1}R3$ & $\tover{0}{0}L4$ & $\tover{0}{\mathrm{S}}R2$ & $\tover{1}{1}R3$ & $\tover{1}{0}L4$ \\
$4$ & $\tover{0}{0}L4$ & $\tover{0}{\mathrm{S}}R2$ & $\tover{0}{0}L4$ & $\tover{0}{1}R3$ & $\tover{1}{\mathrm{S}}R3$ & $\tover{1}{0}L4$ & $\tover{1}{1}R3$ \\
\end{tabular}
}
\end{center}
States $1$ and $3$ are the upper- and lower-track versions, respectively, of $T$'s state $1$; states $2$ and $4$ are the upper- and lower-track versions, respectively, of $T$'s state $2$. We leave it to the reader to check that $O$ actually does simulate $T$\dots
\end{exmp}

\begin{prob} \label{p:eleven4}
Trace the (partial) computations of $T$, and their counterparts for $O$, for each of the following input tapes for $T$:
\begin{enumerate}
\item $\underline{0}$ ({\em i.e.\/} a blank tape)
\item $1\underline{0}$
\item $\dots 111\underline{1}111 \dots$ ({\em i.e.\/} every cell marked with $1$)
\end{enumerate}
\end{prob}

\begin{prob} \label{p:eleven5}
Explain in detail how, given a Turing machine $N$ with alphabet $\Sigma$ and a two-way infinite tape, one can construct a Turing machine $P$ with a one-way infinite tape that simulates $N$.
\end{prob}

\begin{prob} \label{p:eleven5a}
Explain in detail how, given a Turing machine $P$ with alphabet $\Sigma$ and a one-way infinite tape, one can construct a Turing machine $N$ with a two-way infinite tape that simulates $P$.
\end{prob}

Combining the techniques we've used so far, we could simulate any Turing machine with a two-way infinite tape and arbitrary alphabet by a Turing machine with a one-way infinite tape and alphabet $\{0,1\}$.

\begin{prob} \label{p:eleven6}
Give a precise definition for Turing machines with two tapes. Explain how, given any such machine, one could construct a single-tape machine to simulate it.
\end{prob}

\begin{prob} \label{p:eleven7}
Give a precise definition for Turing machines with two-dimensional tapes. Explain how, given any such machine, one could construct a single-tape machine to simulate it.
\end{prob}

These results, and others like them, imply that none of the variant types of Turing machines mentioned at the start of this chapter differ essentially in what they can, in principle, compute.

In Chapter~\ref{ch:fourteen} we will construct a Turing machine that can simulate {\it any\/} (standard) Turing machine.

%
% Twelfth chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Computable and Non-Computable Functions} \label{ch:twelve}
A lot of computational problems in the real world have to do with doing arithmetic, and any notion of computation that can't deal with arithmetic is unlikely to be of great use.

\subsection*{Notation and conventions}
To keep things as simple as possible, we will stick to computations involving the {\em natural numbers\/}\index{natural numbers}, {\em i.e.\/} the non-negative integers, the set of which is usually denoted by $\mathbb{N} = \{\, 0, 1, 2, \dots \,\}$.\index{$\mathbb{N}$} The set of all $k$-tuples $(n_1,\dots,n_k)$ of natural numbers is denoted by $\mathbb{N}^k$.\index{$\mathbb{N}^k$} For all practical purposes, we may take $\mathbb{N}^1$ to be $\mathbb{N}$ by identifying the $1$-tuple $(n)$ with the natural number $n$.
For $k \ge 1$, $f$ is a {\em $k$-place function\/} \index{$k$-place function} \index{function $k$-place} (from the natural numbers to the natural numbers), often written as $f \colon \mathbb{N}^k \to \mathbb{N}$,\index{$f \colon \mathbb{N}^k \to \mathbb{N}$} if it associates a value, $f(n_1,\dots,n_k)$, to each $k$-tuple $(n_1,n_2,\dots,n_k) \in \mathbb{N}^k$. Strictly speaking, though we will frequently forget to be explicit about it, we will often be working with $k$-place {\em partial functions\/} \index{partial function} \index{function partial} which might not be defined for all the $k$-tuples in $\mathbb{N}^k$. If $f$ is a $k$-place partial function, the {\em domain\/} \index{domain of a function} \index{function domain of} of $f$ is the set $$ \mathrm{dom}(f) = \{\, (n_1, \dots, n_k) \in \mathbb{N}^k \mid \text{$f(n_1, \dots, n_k)$ is defined} \,\} \, . $$ Similarly, the {\em range\/} \index{range of a function} of $f$ is the set $$ \mathrm{ran}(f) = \{\, f(n_1, \dots, n_k) \in \mathbb{N} \mid (n_1, \dots, n_k) \in \mathrm{dom}(f) \,\} \, . $$ In subsequent chapters we will also work with relations on the natural numbers. Recall that a {\em $k$-place relation\/} \index{$k$-place relation} \index{relation $k$-place} on $\mathbb{N}$ is formally a subset $P$ of $\mathbb{N}^k$; $P(n_1,\dots,n_k)$ is {\em true\/} if $(n_1,\dots,n_k) \in P$ and {\em false\/} otherwise. In particular, a $1$-place relation is really just a subset of $\mathbb{N}$. Relations and functions are closely related. All one needs to know about a $k$-place function $f$ can be obtained from the $(k+1)$-place relation $P_f$ given by $$ P_f(n_1,\dots,n_k,n_{k+1}) \iff f(n_1,\dots,n_k) = n_{k+1} \, . 
$$
Similarly, all one needs to know about the $k$-place relation $P$ can be obtained from its {\em characteristic function\/} \index{characteristic function} \index{relation characteristic function}:
$$ \chi_P (n_1,\dots,n_k) = \begin{cases} 1 & \text{if $P(n_1,\dots,n_k)$ is true;} \\ 0 & \text{if $P(n_1,\dots,n_k)$ is false.} \end{cases} $$

The basic convention for representing natural numbers on the tape of a standard Turing machine is a slight variation of {\em unary notation\/} \index{unary notation}: $n$ is represented by $1^{n+1}$. (Why would using $1^n$ be a bad idea?) A $k$-tuple $(n_1, n_2, \ldots, n_k) \in \mathbb{N}^k$ will be represented by $1^{n_1 + 1} 0 1^{n_2 + 1} 0 \ldots 0 1^{n_k + 1}$, {\em i.e.\/} with the representations of the individual numbers separated by $0$s. This scheme is inefficient in its use of space --- compared to binary notation, for example --- but it is simple and can be implemented on Turing machines restricted to the alphabet $\{1\}$.

\subsection*{Turing computable functions}
With suitable conventions for representing the input and output of a function on the natural numbers on the tape of a Turing machine in hand, we can define what it means for a function to be computable by a Turing machine.

\begin{defn} \label{D:TC}
A $k$-place function $f$ is {\em Turing computable\/}, \index{Turing computable function} \index{function Turing computable} or just {\em computable\/}, \index{computable function} \index{function computable} if there is a Turing machine $M$ such that for any $k$-tuple $(n_1, \ldots, n_k) \in \mathrm{dom}(f)$ the computation of $M$ with input tape $\underline{0} 1^{n_1 + 1} 0 1^{n_2 + 1} \ldots 0 1^{n_k + 1}$ eventually halts with output tape $\underline{0} 1^{f(n_1,\dots,n_k) + 1}$. Such a machine $M$ is said to {\em compute\/} $f$.
\end{defn}

Note that for a Turing machine $M$ to compute a function $f$, $M$ need only do the right thing on the right kind of input: what $M$ does in other situations does not matter. In particular, it does not matter what $M$ might do with a $k$-tuple which is not in the domain of $f$.

\begin{exmp} \index{identity function} \index{function identity} \index{$i_{\mathbb{N}}$}
The identity function $i_{\mathbb{N}} \colon \mathbb{N} \to \mathbb{N}$, {\em i.e.\/} $i_{\mathbb{N}}(n) = n$, is computable. It is computed by $M = \emptyset$, the Turing machine with an empty table that does absolutely nothing on any input.
\end{exmp}

\begin{exmp} \label{ex:projmach}
The projection function $\pi^2_1 : \mathbb{N}^2 \to \mathbb{N}$ given by $\pi^2_1(n,m) = n$ is computed by the Turing machine:
\begin{center}
\mbox{
\begin{tabular}{c|c|c}
$P^2_1$ & $0$ & $1$ \\ \hline
$1$ & $0R2$ & \\
$2$ & $0R3$ & $1R2$ \\
$3$ & $0L4$ & $0R3$ \\
$4$ & $0L4$ & $1L5$ \\
$5$ & & $1L5$
\end{tabular}
}
\end{center}
\noindent $P^2_1$ acts as follows: it moves to the right past the first block of $1$s without disturbing it, erases the second block of $1$s, and then returns to the left of the first block and halts.

The projection function $\pi^2_2 : \mathbb{N}^2 \to \mathbb{N}$ given by $\pi^2_2(n,m) = m$ is also computable: the Turing machine $P$ of Example \ref{ex:P22} does the job.
\end{exmp}

\begin{prob} \label{C:c}
Find Turing machines that compute the following functions and explain how they work.
\begin{enumerate}
\item $\mathsc{O}(n) = 0$. \index{$\mathsc{O}$}
\item $\mathsc{S}(n) = n + 1$. \index{$\mathsc{S}$}
\item $\mathsc{Sum}(n,m) = n + m$. \index{$\mathsc{Sum}$}
\item $\mathsc{Pred}(n) = \begin{cases} n - 1 & n \ge 1 \\ 0 & n = 0 \end{cases}$. \index{$\mathsc{Pred}$}
\item $\mathsc{Diff}(n,m) = \begin{cases} n - m & n \ge m \\ 0 & n < m \end{cases}$. \index{$\mathsc{Diff}$}
\item $\pi^3_2 (p,q,r) = q$.
\item $\pi^k_i (a_1, \dots, a_i, \dots, a_k) = a_i$.
\end{enumerate}
\end{prob}

We will consider methods for building functions computable by Turing machines out of simpler ones later on.

\subsection*{A non-computable function}
In the meantime, it is worth asking whether or not every function on the natural numbers is computable. No such luck!

\begin{prob} \label{C:inc}
Show that there is some $1$-place function $f : \mathbb{N} \to \mathbb{N}$ which is not computable by comparing the number of such functions to the number of Turing machines.
\end{prob}

The argument hinted at above is unsatisfying in that it tells us there is a non-computable function without actually producing an explicit example. We can have some fun on the way to one.

\begin{defn}[Busy Beaver Competition]
A machine $M$ is an {\em $n$-state entry in the busy beaver competition\/} \index{busy beaver competition} \index{busy beaver competition $n$-state entry} \index{$n$-state entry in busy beaver competition} if:
\begin{itemize}
\item $M$ has a two-way infinite tape and alphabet $\{1\}$ (see Chapter~\ref{ch:eleven});
\item $M$ has $n+1$ states, but state $n+1$ is used only for halting (so both $M(n+1,0)$ and $M(n+1,1)$ are undefined);
\item $M$ eventually halts when given a blank input tape.
\end{itemize}
$M$'s {\em score\/} \index{score in busy beaver competition} \index{busy beaver competition score in} in the competition is the number of $1$s on the output tape of its computation from a blank input tape. The greatest possible score of an $n$-state entry in the competition is denoted by $\Sigma(n)$.
\end{defn}

Note that there are only finitely many possible $n$-state entries in the busy beaver competition because there are only finitely many $(n+1)$-state Turing machines with alphabet $\{1\}$. Since there is at least one $n$-state entry in the busy beaver competition for every $n \ge 0$, it follows that $\Sigma(n)$ is well-defined for each $n \in \mathbb{N}$.
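
To get a rough feeling for how many candidates there are, here is a crude count (a sketch of our own, meant as an upper bound rather than an exact figure). An $n$-state entry is determined by its table, which has one entry for each of the $2n$ pairs $(s,b)$ with $1 \le s \le n$ and $b \in \{0,1\}$. Each entry is either left undefined or is one of the $2 \cdot 2 \cdot (n+1)$ triples $(c,d,t)$ with $c \in \{0,1\}$, $d \in \{-1,1\}$, and $1 \le t \le n+1$. Hence
$$ \text{the number of $n$-state entries} \le \left( 4(n+1) + 1 \right)^{2n} = (4n+5)^{2n} \, . $$
For $n = 0$ this bound is $5^0 = 1$, and for $n = 2$ it is $13^4 = 28561$. Not every such table is an entry, of course, since an entry must also halt when given a blank input tape.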
\begin{exmp} $M = \emptyset$ is the {\em only\/} $0$-state entry in the busy beaver competition, so $\Sigma(0) = 0$. \end{exmp} \begin{exmp} \label{bb:exs} The machine $P$ given by \begin{center} \mbox{ \begin{tabular}{c|c|c} $P$ & $0$ & $1$ \\ \hline $1$ & $1R2$ & $1L2$ \\ $2$ & $1L1$ & $1L3$ \end{tabular} } \end{center} \noindent is a $2$-state entry in the busy beaver competition with a score of $4$, so $\Sigma(2) \ge 4$. \end{exmp} The function $\Sigma$ grows extremely quickly. It is known that $\Sigma(0) = 0$, $\Sigma(1) = 1$, $\Sigma(2) = 4$, $\Sigma(3) = 6$, and $\Sigma(4) = 13$. The value of $\Sigma(5)$ is still unknown, but must be quite large.\footnote{The best score known to the author by a $5$-state entry in the busy beaver competition is $4098$. One of the two machines achieving this score does so in a computation that takes over $40$ million steps! The other requires only $11$ million or so\dots} \begin{prob} \label{prob:BB} Show that: \begin{enumerate} \item The $2$-state entry given in Example \ref{bb:exs} actually scores $4$. \item $\Sigma(1) = 1$. \item $\Sigma(3) \ge 6$. \item $\Sigma(n) < \Sigma(n+1)$ for every $n \in \mathbb{N}$. \end{enumerate} \end{prob} \begin{prob} \label{prob:mBB} Devise as high-scoring $4$- and $5$-state entries in the busy beaver competition as you can. \end{prob} The serious point of the busy beaver competition is that the function $\Sigma$ is {\em not\/} a Turing computable function. \begin{prop} \label{p:twelve5} $\Sigma$ is not computable by any Turing machine. \end{prop} Anyone interested in learning more about the busy beaver competition should start by reading the paper \cite{TR:ONCF} in which it was first introduced. \subsection*{Building more computable functions} One of the most common methods for assembling functions from simpler ones in many parts of mathematics is composition. It turns out that compositions of computable functions are computable. 
\begin{defn} \index{composition} \index{function composition of} Suppose that $m,k \ge 1$, $g$ is an $m$-place function, and $h_1$, \dots, $h_m$ are $k$-place functions. Then the $k$-place function $f$ is said to be obtained from $g$, $h_1$, \dots, $h_m$ by {\em composition\/}, written as $$ f = g \circ (h_1, \dots, h_m) \, , $$ if for all $(n_1, \ldots, n_k) \in \mathbb N^k$, $$ f(n_1, \ldots, n_k) = g(h_1(n_1, \ldots, n_k), \ldots, h_m(n_1, \ldots, n_k)). $$ \end{defn} \begin{exmp} \label{e:con} The constant function $c^1_1$, where $c^1_1(n) = 1$ for all $n$, can be obtained by composition from the functions $\mathsc{S}$ and $\mathsc{O}$. For any $n \in \mathbb N$, $$ c^1_1(n) = (\mathsc{S} \circ \mathsc{O})(n) = \mathsc{S}(\mathsc{O}(n)) = \mathsc{S}(0) = 0 + 1 = 1 \, . $$ \end{exmp} \begin{prob} \label{p:cnstfns} \index{constant function}\index{function constant} Suppose $k \ge 1$ and $a \in \mathbb{N}$. Use composition to define the constant function $c^k_a$, where $c^k_a (n_1, \ldots, n_k) = a$ for all $(n_1, \ldots, n_k) \in \mathbb N^k$, from functions already known to be computable. \end{prob} \begin{prop} \label{p:compcomp} Suppose that $1 \le k$, $1 \le m$, $g$ is a Turing computable $m$-place function, and $h_1$, \dots, $h_m$ are Turing computable $k$-place functions. Then $g \circ (h_1, \dots, h_m)$ is also Turing computable. \end{prop} Starting with a small set of computable functions, and applying computable ways (such as composition) of building functions from simpler ones, we will build up a useful collection of computable functions. This will also provide a characterization of computable functions which does not mention any type of computing device. The ``small set of computable functions'' that will be the fundamental building blocks is infinite only because it includes all the projection functions. 
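
Projection functions earn their place among the building blocks because composing with them lets one rearrange or discard arguments. As an illustrative sketch (it takes for granted the function $\mathsc{Sum}$ of Problem~\ref{C:c}, and the name $f$ is used only here), let $m = k = 2$ and $f = \mathsc{Sum} \circ (\pi^2_2, \pi^2_1)$. Then for all $(n_1,n_2) \in \mathbb{N}^2$,
$$ f(n_1,n_2) = \mathsc{Sum}(\pi^2_2(n_1,n_2), \pi^2_1(n_1,n_2)) = \mathsc{Sum}(n_2,n_1) = n_2 + n_1 \, . $$
For instance, $f(3,5) = \mathsc{Sum}(5,3) = 8$.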
\begin{defn} \label{d:initfns} The following are the {\em initial functions\/}: \index{initial function} \index{function initial} \begin{itemize} \item $\mathsc{O}$, the $1$-place function such that $\mathsc{O}(n) = 0$ for all $n \in \mathbb{N}$; \index{$\mathsc{O}$} \item $\mathsc{S}$, the $1$-place function such that $\mathsc{S}(n) = n + 1$ for all $n \in \mathbb{N}$; \index{$\mathsc{S}$} and, \item for each $k \ge 1$ and $1 \le i \le k$, $\pi^k_i$, the $k$-place function such that $\pi^k_i (n_1, \ldots, n_k) = n_i$ for all $(n_1, \ldots, n_k) \in \mathbb{N}^k$. \index{$\pi^k_i$} \end{itemize} $\mathsc{O}$ is often referred to as the {\em zero function\/}, \index{zero function} \index{function zero} $\mathsc{S}$ is the {\em successor function\/}, \index{successor function} \index{function successor} and the functions $\pi^k_i$ are called the {\em projection functions\/}. \index{projection function} \index{function projection} \end{defn} Note that $\pi^1_1$ is just the identity function on $\mathbb{N}$. We have already shown, in Problem~\ref{C:c}, that all the initial functions are computable. It follows from Proposition~\ref{p:compcomp} that every function defined from the initial functions using composition (any number of times) is computable too. Since one can build relatively few functions from the initial functions using only composition\dots \begin{prop} \label{p:complin} Suppose $f$ is a $1$-place function obtained from the initial functions by finitely many applications of composition. Then there is a constant $c \in \mathbb{N}$ such that $f(n) \le n + c$ for all $n \in \mathbb{N}$. \end{prop} \dots in the next chapter we will add other methods of building functions to our repertoire that will allow us to build all computable functions from the initial functions. 
% % Thirteenth chapter of "A Problem Course in Mathematical Logic" % \chapter{Recursive Functions} \label{ch:thirteen} We will add two other methods of building computable functions from computable functions to composition, and show that one can use the three methods to construct all computable functions on $\mathbb{N}$ from the initial functions. \subsection*{Primitive recursion} The second of our methods is simply called recursion in most parts of mathematics and computer science. Historically, the term ``primitive recursion'' has been used to distinguish it from the other recursive method of defining functions that we will consider, namely unbounded minimalization. ... Primitive recursion boils down to defining a function inductively, using different functions to tell us what to do at the base and inductive steps. Together with composition, it suffices to build up just about all familiar arithmetic functions from the initial functions. \begin{defn} \index{primitive recursion}\index{recursion primitive}\index{function primitive recursion} Suppose that $k \ge 1$, $g$ is a $k$-place function, and $h$ is a $(k+2)$-place function. Let $f$ be the $(k+1)$-place function such that \begin{enumerate} \item $f(n_1, \ldots, n_k, 0) = g(n_1, \ldots, n_k)$ and \item $f(n_1, \ldots, n_k, m + 1) = h\left(n_1, \ldots, n_k, m, f(n_1, \ldots, n_k, m)\right)$ \end{enumerate} for every $(n_1, \ldots, n_k) \in \mathbb{N}^k$ and $m \in \mathbb{N}$. Then $f$ is said to be obtained from $g$ and $h$ by {\em primitive recursion\/}. \end{defn} That is, the initial values of $f$ are given by $g$, and the rest are given by $h$ operating on the given input and the preceding value of $f$. For a start, primitive recursion and composition let us define addition and multiplication from the initial functions.
\begin{exmp} $\mathsc{Sum}(n,m) = n + m$ is obtained by primitive recursion from the initial function $\pi^1_1$ and the composition $\mathsc{S} \circ \pi^3_3$ of initial functions as follows: \index{$\mathsc{Sum}$} \begin{itemize} \item $\mathsc{Sum}(n,0) = \pi^1_1(n)$; \item $\mathsc{Sum}(n,m+1) = (\mathsc{S} \circ \pi^3_3)(n,m,\mathsc{Sum}(n,m))$. \end{itemize} To see that this works, one can proceed by induction on $m$: At the base step, $m = 0$, we have $$ \mathsc{Sum}(n,0) = \pi^1_1(n) = n = n + 0 \, . $$ Assume that $m \ge 0$ and $\mathsc{Sum}(n,m) = n + m$. Then \begin{align*} \mathsc{Sum}(n,m+1) &= (\mathsc{S} \circ \pi^3_3)(n,m,\mathsc{Sum}(n,m)) \\ &= \mathsc{S}(\pi^3_3(n,m,\mathsc{Sum}(n,m))) \\ &= \mathsc{S}(\mathsc{Sum}(n,m)) \\ &= \mathsc{Sum}(n,m) + 1 \\ &= n + m + 1 \, , \end{align*} as desired. \end{exmp} As addition is to the successor function, so multiplication is to addition. \begin{exmp} $\mathsc{Mult}(n,m) = nm$ is obtained by primitive recursion from $\mathsc{O}$ and $\mathsc{Sum} \circ (\pi^3_3, \pi^3_1)$: \begin{itemize} \item $\mathsc{Mult}(n,0) = \mathsc{O}(n)$; \item $\mathsc{Mult}(n,m+1) = (\mathsc{Sum} \circ (\pi^3_3, \pi^3_1))(n,m,\mathsc{Mult}(n,m))$. \end{itemize} \index{$\mathsc{Mult}$} We leave it to the reader to check that this works. \end{exmp} \begin{prob} \label{pr:fns} Use composition and primitive recursion to obtain each of the following functions from the initial functions or other functions already obtained from the initial functions. \begin{enumerate} \item $\mathsc{Exp}(n,m) = n^m$ \index{$\mathsc{Exp}$} \item $\mathsc{Pred}(n)$ (defined in Problem \ref{C:c}) \index{$\mathsc{Pred}$} \item $\mathsc{Diff}(n,m)$ (defined in Problem \ref{C:c}) \index{$\mathsc{Diff}$} \item $\mathsc{Fact}(n) = n!$ \index{$\mathsc{Fact}$} \end{enumerate} \end{prob} \begin{prop} \label{p:thirteen5} Suppose $k \ge 1$, $g$ is a Turing computable $k$-place function, and $h$ is a Turing computable $(k+2)$-place function. 
If $f$ is obtained from $g$ and $h$ by primitive recursion, then $f$ is also Turing computable. \end{prop} \subsection*{Primitive recursive functions and relations} The collection of functions which can be obtained from the initial functions by (possibly repeatedly) using composition and primitive recursion is useful enough to have a name. \begin{defn} \index{primitive recursive function}\index{function primitive recursive} A function $f$ is {\em primitive recursive\/} if it can be defined from the initial functions by finitely many applications of the operations of composition and primitive recursion. \end{defn} So we already know that all the initial functions, addition, and multiplication, among others, are primitive recursive. \begin{prob} \label{p:thirteen6} Show that each of the following functions is primitive recursive. \begin{enumerate} \item For any $k \ge 0$ and primitive recursive $(k+1)$-place function $g$, the $(k+1)$-place function $f$ given by \begin{align*} f(n_1, \ldots, n_k,m) &= \prod_{i=0}^m g(n_1, \ldots, n_k, i) \\ &= g(n_1, \ldots, n_k, 0) \cdot \ldots \cdot g(n_1, \ldots, n_k, m) \, . \end{align*} \item For any constant $a \in \mathbb{N}$, $\chi_{\{a\}}(n) = \begin{cases} 0 & n \ne a \\ 1 & n = a \, . \end{cases}$ \item $h(n_1, \ldots, n_k) = \begin{cases} f(n_1, \ldots, n_k) & (n_1, \ldots, n_k) \ne (c_1, \ldots, c_k) \\ a & (n_1, \ldots, n_k) = (c_1, \ldots, c_k) \end{cases}$, if $f$ is a primitive recursive $k$-place function and $a, c_1, \dots, c_k \in \mathbb{N}$ are constants. \end{enumerate} \end{prob} \begin{thm} \label{t:thirteen7} Every primitive recursive function is Turing computable. \end{thm} Be warned, however, that there are computable functions which are not primitive recursive. We can extend the idea of ``primitive recursive'' to relations by using their characteristic functions. \begin{defn} \index{relation primitive recursive} \index{primitive recursive relation} Suppose $k \ge 1$.
A $k$-place relation $P \subseteq \mathbb N^k$ is {\em primitive recursive\/} if its characteristic function $$ \chi_P(n_1,\dots,n_k) = \begin{cases} 1 & (n_1,\dots,n_k) \in P \\ 0 & (n_1,\dots,n_k) \notin P \end{cases} $$ is primitive recursive. \end{defn} \begin{exmp} $P = \{2\} \subset \mathbb N$ is primitive recursive since $\chi_{\{2\}}$ is primitive recursive by Problem \ref{p:thirteen6}. \end{exmp} \begin{prob} \label{r:rfs} Show that the following relations and functions are primitive recursive. \begin{enumerate} \item $\lnot P$, {\em i.e.\/} $\mathbb{N}^k \setminus P$, if $P$ is a primitive recursive $k$-place relation. \index{$\lnot P$} \index{$\mathbb{N}^k \setminus P$} \item $P \lor Q$, {\em i.e.\/} $P \cup Q$, if $P$ and $Q$ are primitive recursive $k$-place relations. \index{$P \lor Q$} \index{$P \cup Q$} \item $P \land Q$, {\em i.e.\/} $P \cap Q$, if $P$ and $Q$ are primitive recursive $k$-place relations. \index{$P \land Q$} \index{$P \cap Q$} \item $\mathsc{Equal}$, where $\mathsc{Equal}(n,m) \iff n = m$. \index{$\mathsc{Equal}$} \item $h(n_1,\dots,n_k,m) = \sum_{i=0}^m g(n_1,\dots,n_k,i)$, for any $k \ge 0$ and primitive recursive $(k+1)$-place function $g$. \item $\mathsc{Div}$, where $\mathsc{Div}(n,m) \iff n \mid m$. \index{$\mathsc{Div}$} \item $\mathsc{IsPrime}$, where $\mathsc{IsPrime}(n) \iff n \text{\ is prime}$. \index{$\mathsc{IsPrime}$} \item $\mathsc{Prime}(k) = p_k$, where $p_0 = 1$ and $p_k$ is the $k$th prime if $k \ge 1$. \index{$\mathsc{Prime}$} \item $\mathsc{Power}(n,m) = k$, where $k \ge 0$ is maximal such that $n^k \mid m$. \index{$\mathsc{Power}$} \item $\mathsc{Length}(n) = \ell$, where $\ell$ is maximal such that $p_\ell \mid n$. \index{$\mathsc{Length}$} \item $\mathsc{Element}(n,i) = n_i$, if $n = p_1^{n_1}\dots p_k^{n_k}$ (and $n_i = 0$ if $i > k$).
\index{$\mathsc{Element}$} \item $\mathsc{Subseq}(n,i,j) = \begin{cases} p_i^{n_i} p_{i+1}^{n_{i+1}} \dots p_j^{n_j} & \text{if\ } 1 \le i \le j \le k \\ 0 & \text{otherwise} \end{cases}$, \index{$\mathsc{Subseq}$} whenever $n = p_1^{n_1}\dots p_k^{n_k}$. \item $\mathsc{Concat}(n,m) = p_1^{n_1} \dots p_k^{n_k} p_{k+1}^{m_1} \dots p_{k+\ell}^{m_\ell}$, if $n = p_1^{n_1}\dots p_k^{n_k}$ and $m = p_1^{m_1} \dots p_\ell^{m_\ell}$. \end{enumerate} \end{prob} Parts of Problem \ref{r:rfs} give us tools for representing finite sequences of integers by single integers, as well as some tools for manipulating these representations. This lets us reduce, in principle, all problems involving primitive recursive functions and relations to problems involving only $1$-place primitive recursive functions and relations. \begin{thm} \label{r:cd} A $k$-place function $g$ is primitive recursive if and only if the $1$-place function $h$ given by $h(n) = g(n_1, \dots, n_k)$ if $n = p_1^{n_1}\dots p_k^{n_k}$ is primitive recursive. \end{thm} \begin{note} It doesn't matter what the function $h$ may do on an $n$ which does not represent a sequence of length $k$. \end{note} \begin{cor} \label{c:thirteen10} A $k$-place relation $P$ is primitive recursive if and only if the $1$-place relation $P'$ is primitive recursive, where $$ (n_1, \dots, n_k) \in P \iff p_1^{n_1}\dots p_k^{n_k} \in P' \, . $$ \end{cor} \subsection*{A computable but not primitive recursive function} While primitive recursion and composition do not quite suffice to build all Turing computable functions from the initial functions, they are powerful enough that specific counterexamples are not all that easy to find.
\begin{exmp}[Ackermann's Function] \index{Ackermann's Function} \index{$\mathsc{A}$} \index{$\alpha$} \label{pr:ack} Define the $2$-place function $\mathsc{A}$ as follows: \begin{itemize} \item $\mathsc{A}(0,\ell) = \mathsc{S}(\ell)$ \item $\mathsc{A}(\mathsc{S}(k),0) = \mathsc{A}(k,1)$ \item $\mathsc{A}(\mathsc{S}(k),\mathsc{S}(\ell)) = \mathsc{A}(k,\mathsc{A}(\mathsc{S}(k),\ell))$ \end{itemize} Given $\mathsc{A}$, define the $1$-place function $\alpha$ by $\alpha(n) = \mathsc{A}(n,n)$. It isn't too hard to show that $\mathsc{A}$, and hence also $\alpha$, are Turing computable. However, though it takes considerable effort to prove it, $\alpha$ grows faster with $n$ than any primitive recursive function. (Try working out the first few values of $\alpha$\dots) \end{exmp} \begin{prob} \label{p:thirteen11} Show that the functions $\mathsc{A}$ and $\alpha$ defined in Example \ref{pr:ack} are Turing computable. \end{prob} If you are very ambitious, you can try to prove the following theorem. \begin{thm} \label{t:thirteen12} Suppose $\alpha$ is the function defined in Example \ref{pr:ack} and $f$ is any primitive recursive function. Then there is an $n \in \mathbb{N}$ such that for all $k > n$, $\alpha(k) > f(k)$. \end{thm} \begin{cor} \label{c:thirteen13} The function $\alpha$ defined in Example \ref{pr:ack} is not primitive recursive. \end{cor} \noindent\dots but if you aren't, you can still try the following exercise. \begin{prob} \label{p:thirteen14} Informally, define a computable function which must be different from every primitive recursive function. \end{prob} \subsection*{Unbounded minimalization} The last of our three methods of building computable functions from computable functions is unbounded minimalization. The functions which can be defined from the initial functions using unbounded minimalization, as well as composition and primitive recursion, turn out to be precisely the Turing computable functions.
Unbounded minimalization is the counterpart for functions of ``brute force'' algorithms that try every possibility until they succeed. (Which, of course, they might not\dots) \begin{defn} \index{unbounded minimalization} \index{minimalization unbounded} \index{function unbounded minimalization of} Suppose $k \ge 1$ and $g$ is a $(k+1)$-place function. Then the {\em unbounded minimalization\/} of $g$ is the $k$-place function $f$ defined by $$ f(n_1, \ldots, n_k) = m \text{ where $m$ is least so that $g(n_1, \ldots, n_k,m) = 0$.} $$ This is often written as $f(n_1, \ldots, n_k) = \mu m [g(n_1, \ldots, n_k, m) = 0]$. \end{defn} \begin{note} If there is no $m$ such that $g(n_1, \ldots, n_k,m) = 0$, then the unbounded minimalization of $g$ is not defined on $(n_1, \ldots, n_k)$. This is one reason we will occasionally need to deal with partial functions. \end{note} If the unbounded minimalization of a computable function is to be computable, we have a problem even if we ask for some default output ($0$, say) to ensure that it is defined for all $k$-tuples. The obvious procedure which tests successive values of $g$ to find the needed $m$ will run forever if there is no such $m$, and the incomputability of the Halting Problem suggests that other procedures won't necessarily succeed either. It follows that it is desirable to be careful, so far as possible, which functions unbounded minimalization is applied to. \begin{defn} \index{regular function} \index{function regular} A $(k+1)$-place function $g$ is said to be {\em regular\/} if for every $(n_1, \ldots, n_k) \in \mathbb N^k$, there is at least one $m \in \mathbb N$ so that $g(n_1, \ldots, n_k, m) = 0$. \end{defn} That is, $g$ is regular precisely if the obvious strategy of computing $g(n_1, \dots, n_k, m)$ for $m = 0$, $1$, \dots in succession until an $m$ is found with $g(n_1, \dots, n_k, m) = 0$ always succeeds.
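
For a concrete instance of this strategy (a sketch only; the functions $g$ and $g'$ here are chosen purely for illustration and are not used elsewhere), let $g(n,m) = \mathsc{Diff}(n, \mathsc{Mult}(m,m))$, so that $g(n,m) = 0$ exactly when $m^2 \ge n$. Since $m = n$ always satisfies this, $g$ is regular. Taking $n = 5$, the search runs as follows:
\begin{align*}
g(5,0) &= 5 - 0 = 5 \, , \\
g(5,1) &= 5 - 1 = 4 \, , \\
g(5,2) &= 5 - 4 = 1 \, , \\
g(5,3) &= 0 \quad \text{(since $5 < 9$)} \, ,
\end{align*}
so $\mu m [g(5,m) = 0] = 3$. In general, $\mu m [g(n,m) = 0]$ is the least $m$ with $m^2 \ge n$. By contrast, $g'(n,m) = \mathsc{Sum}(n,m)$ is not regular: $g'(n,m) = 0$ only when $n = m = 0$, so its unbounded minimalization is defined at $n = 0$ and nowhere else.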
\begin{prop} \label{p:thirteen15} If $g$ is a Turing computable regular $(k+1)$-place function, then the unbounded minimalization of $g$ is also Turing computable. \end{prop} While unbounded minimalization adds something essentially new to our repertoire, it is worth noticing that {\em bounded minimalization\/} \index{bounded minimalization} \index{minimalization bounded} \index{function bounded minimalization of} does not. \begin{prob} \label{p:thirteen16} Suppose $g$ is a $(k+1)$-place primitive recursive regular function such that for some primitive recursive $k$-place function $h$, $$ \mu m [g(n_1, \ldots, n_k, m) = 0] \le h(n_1, \ldots, n_k) $$ for all $(n_1, \ldots, n_k) \in \mathbb{N}^k$. Show that $\mu m [g(n_1, \ldots, n_k, m) = 0]$ is also primitive recursive. \end{prob} \subsection*{Recursive functions and relations} We can finally define an equivalent notion of computability for functions on the natural numbers which makes no mention of any computational device. \begin{defn} \index{recursive function}\index{function recursive} A $k$-place function $f$ is {\em recursive\/} if it can be defined from the initial functions by finitely many applications of composition, primitive recursion, and the unbounded minimalization of regular functions. Similarly, a $k$-place partial function is {\em recursive\/} if it can be defined from the initial functions by finitely many applications of composition, primitive recursion, and the unbounded minimalization of (possibly non-regular) functions. \end{defn} In particular, every primitive recursive function is a recursive function. \begin{thm} \label{RF:TC} Every recursive function is Turing computable. \end{thm} We shall show that every Turing computable function is recursive later on. Similarly to primitive recursive relations we have the following.
\begin{defn} \label{df:rr} \index{recursive relation} \index{relation recursive} \index{Turing computable relation} \index{relation Turing computable} A $k$-place relation $P$ is said to be {\em recursive\/} ({\em Turing computable\/}) if its characteristic function $\chi_P$ is recursive (Turing computable). \end{defn} Since every recursive function is Turing computable, and {\em vice versa\/}, ``recursive'' is just a synonym of ``Turing computable'', for functions and relations alike. Also, similarly to Theorem \ref{r:cd} and Corollary \ref{c:thirteen10} we have the following. \begin{thm} \label{t:thirteen17} A $k$-place function $g$ is recursive if and only if the $1$-place function $h$ given by $h(n) = g(n_1, \dots, n_k)$ if $n = p_1^{n_1}\dots p_k^{n_k}$ is recursive. \end{thm} As before, it doesn't really matter what the function $h$ does on an $n$ which does not represent a sequence of length $k$. \begin{cor} \label{c:thirteen18} A $k$-place relation $P$ is recursive if and only if the $1$-place relation $P'$ is recursive, where $$ (n_1, \dots, n_k) \in P \iff p_1^{n_1}\dots p_k^{n_k} \in P' \, . $$ \end{cor} % % Fourteenth chapter of "A Problem Course in Mathematical Logic" % \chapter{Characterizing Computability} \label{ch:fourteen} By putting together some of the ideas in Chapters \ref{ch:twelve} and \ref{ch:thirteen}, we can use recursive functions to simulate Turing machines. This will let us show that Turing computable functions are recursive, completing the argument that Turing machines and recursive functions are essentially equivalent models of computation. We will also use these techniques to construct a {\em universal Turing machine\/}\index{universal Turing machine}\index{Turing machine universal} (or {\em UTM\/}\index{UTM}): a machine $U$ that, when given as input (a suitable description of) some Turing machine $M$ and an input tape $\mathbf a$ for $M$, simulates the computation of $M$ on input $\mathbf a$.
In effect, a universal Turing machine is a single piece of hardware that lets us treat other Turing machines as software. \subsection*{Turing computable functions are recursive} Our basic strategy is to show that any Turing machine can be simulated by some recursive function. Since recursive functions operate on integers, we will need to encode the tape positions of Turing machines, as well as Turing machines themselves, by integers. For simplicity, we shall stick to Turing machines with alphabet $\{1\}$; we already know from Chapter~\ref{ch:eleven} that such machines can simulate Turing machines with bigger alphabets. \begin{defn} \label{TP:gcode} \index{code tape position} \index{tape position code} Suppose $(s,i,\mathbf{a})$ is a tape position such that all but finitely many cells of $\mathbf{a}$ are blank. Let $n$ be any positive integer such that $a_k = 0$ for all $k > n$. Then the {\em code\/} of $(s,i,\mathbf{a})$ is $$ \ulcorner (s,i,\mathbf{a}) \urcorner = 2^s 3^i 5^{a_0} 7^{a_1} 11^{a_2} \dots p_{n+3}^{a_n} \, . $$ \end{defn} \begin{exmp} \label{TP:gcd} Consider the tape position $(2,1,1001)$. Then $$ \ulcorner (2,1,1001) \urcorner = 2^2 3^1 5^1 7^0 11^0 13^1 = 780 \, . $$ \end{exmp} \begin{prob} \label{p:fourteen6} Find the codes of the following tape positions. \begin{enumerate} \item $(1,0,\mathbf{a})$, where $\mathbf a$ is entirely blank. \item $(4,3,\mathbf{a})$, where $\mathbf a$ is $1011100101$. \end{enumerate} \end{prob} \begin{prob} \label{p:fourteen7} What is the tape position whose code is $10314720$? \end{prob} When dealing with computations, we will also need to encode sequences of tape positions by integers. \begin{defn} \label{TPS:gc} \index{code sequence of tape positions} \index{tape positions code of a sequence} \index{sequence of tape positions code} Suppose $t_1 t_2 \dots t_n$ is a sequence of tape positions.
Then the {\em code\/} of this sequence is $$ \ulcorner t_1 t_2 \dots t_n \urcorner = 2^{\ulcorner t_1 \urcorner} 3^{\ulcorner t_2 \urcorner} \ldots p_n^{\ulcorner t_n \urcorner} \, . $$ \end{defn} \begin{note} Both tape positions and sequences of tape positions have unique codes. \end{note} \begin{prob} \label{p:fourteen8} Pick some (short!) sequence of tape positions and find its code. \end{prob} Having defined how to represent tape positions as integers, we now need to manipulate these representations using recursive functions. The recursive functions and relations in Problems \ref{p:thirteen6} and \ref{r:rfs} provide most of the necessary tools. \begin{prob} \label{p:fourteen8a} Show that both of the following relations are primitive recursive. \begin{enumerate} \item $\mathsc{TapePos}$, where $\mathsc{TapePos}(n)$ $\iff$ $n$ is the code of a tape position. \index{$\mathsc{TapePos}$} \item $\mathsc{TapePosSeq}$, where $\mathsc{TapePosSeq}(n)$ $\iff$ $n$ is the code of a sequence of tape positions. \index{$\mathsc{TapePosSeq}$} \end{enumerate} \end{prob} \begin{prob} \label{p:fourteen9} Show that each of the following is primitive recursive. 
\begin{enumerate} \item The $4$-place function $\mathsc{Entry}$\index{$\mathsc{Entry}$} such that \begin{align*} &\mathsc{Entry}(j,w,t,n) \\ &= \begin{cases} \ulcorner (t,i+w-1,\mathbf{a}') \urcorner & \text{if $n = \ulcorner (s,i,\mathbf{a}) \urcorner$, $j \in \{0,1\}$,} \\ & \text{$w \in \{0,2\}$, $i+w-1 \ge 0$, and $t \ge 1$,} \\ & \text{where $a_k' = a_k$ for $k \ne i$ and $a_i' = j$;} \\ 0 & \text{otherwise.} \end{cases} \end{align*} \item For any Turing machine $M$ with alphabet $\{1\}$, the $1$-place function $\mathsc{Step}_M$\index{$\mathsc{Step}_M$} such that $$ \mathsc{Step}_M(n) = \begin{cases} \ulcorner \mathbf{M}(s,i,\mathbf{a}) \urcorner & \text{if $n = \ulcorner (s,i,\mathbf{a}) \urcorner$ and} \\ & \text{$\mathbf{M}(s,i,\mathbf{a})$ is defined;} \\ 0 & \text{otherwise.} \end{cases} $$ \item For any Turing machine $M$ with alphabet $\{1\}$, the $1$-place relation $\mathsc{Comp}_M$\index{$\mathsc{Comp}_M$}, where $$ \mathsc{Comp}_M(n) \iff \text{$n$ is the code of a computation of $M$.} $$ \end{enumerate} \end{prob} The functions and relations above may be primitive recursive, but the last big step in showing that Turing computable functions are recursive requires unbounded minimalization. \begin{prop} \label{p:fourteen9a} For any Turing machine $M$ with alphabet $\{1\}$, the $1$-place (partial) function $\mathsc{Sim}_M$\index{$\mathsc{Sim}_M$} is recursive, where $$ \mathsc{Sim}_M(n) = \ulcorner (t,j,\mathbf{b}) \urcorner $$ if $n = \ulcorner (1,0,\mathbf{a}) \urcorner$ for some input tape $\mathbf{a}$ and $M$ eventually halts in position $(t,j,\mathbf{b})$ on input $\mathbf{a}$. (Note that $\mathsc{Sim}_M(n)$ may be undefined if $n \ne \ulcorner (1,0,\mathbf{a}) \urcorner$ for an input tape $\mathbf{a}$, or if $M$ does not eventually halt on input $\mathbf{a}$.) 
\end{prop} \begin{lem} \label{l:fourteen9b} Show that the following functions are primitive recursive: \begin{enumerate} \item For any fixed $k \ge 1$, $\mathsc{Code}_k (n_1,\ldots,n_k) = \ulcorner (1,0,01^{n_1}0\ldots 01^{n_k}) \urcorner$. \index{$\mathsc{Code}_k$} \item $\mathsc{Decode}(t) = n$ if $t = \ulcorner (s,i,01^{n+1}) \urcorner$ (and anything you like otherwise). \index{$\mathsc{Decode}$} \end{enumerate} \end{lem} \begin{thm} \label{t:fourteen10} Any $k$-place Turing computable function is recursive. \end{thm} \begin{cor} \label{c:TCiffRec} A function $f : \mathbb{N}^k \to \mathbb{N}$ is Turing computable if and only if it is recursive. \end{cor} Thus Turing machines and recursive functions are essentially equivalent models of computation. \subsection*{A universal Turing machine} One can push the techniques used above a little farther to get a recursive function that can simulate {\em any\/} Turing machine. Since every recursive function can be computed by some Turing machine, this effectively gives us a universal Turing machine. \index{universal Turing machine} \index{Turing machine universal} \begin{prob} \label{p:fourteen11} \index{code Turing machine} \index{Turing machine code} Devise a suitable definition for the code $\ulcorner M \urcorner$ of a Turing machine $M$ with alphabet $\{1\}$. \end{prob} \begin{prob} \label{p:fourteen12} Show, using your definition of $\ulcorner M \urcorner$ from Problem \ref{p:fourteen11}, that the following are primitive recursive.
\begin{enumerate} \item The $2$-place function $\mathsc{Step}$, where \index{$\mathsc{Step}$} $$ \mathsc{Step}(m,n) = \begin{cases} \ulcorner \mathbf{M}(s,i,\mathbf{a}) \urcorner & \text{if $m = \ulcorner M \urcorner$ for some machine $M$,} \\ & \text{$n = \ulcorner (s,i,\mathbf{a}) \urcorner$, \& $\mathbf{M}(s,i,\mathbf{a})$ is defined;} \\ 0 & \text{otherwise.} \end{cases} $$ \item The $2$-place relation $\mathsc{Comp}$, where \index{$\mathsc{Comp}$} $$ \mathsc{Comp}(m,n) \iff m = \ulcorner M \urcorner $$ for some Turing machine $M$ and $n$ is the code of a computation of $M$. \end{enumerate} \end{prob} \begin{prop} \label{p:fourteen12a} The $2$-place (partial) function $\mathsc{Sim}$ is recursive, where, for any Turing machine $M$ with alphabet $\{1\}$ and input tape $\mathbf{a}$ for $M$, \index{$\mathsc{Sim}$} $$ \mathsc{Sim}(\ulcorner M \urcorner,\ulcorner (1,0,\mathbf{a}) \urcorner) = \ulcorner (t,j,\mathbf{b}) \urcorner $$ if $M$ eventually halts in position $(t,j,\mathbf{b})$ on input $\mathbf{a}$. (Note that $\mathsc{Sim}(m,n)$ may be undefined if $m$ is not the code of some Turing machine $M$, or if $n \ne \ulcorner (1,0,\mathbf{a}) \urcorner$ for an input tape $\mathbf{a}$, or if $M$ does not eventually halt on input $\mathbf{a}$.) \end{prop} \begin{cor} \label{c:UTM} There is a Turing machine $U$ which can simulate any Turing machine $M$. \end{cor} \begin{cor} \label{c:URM} There is a recursive function $f$ which can compute any other recursive function. \end{cor} \subsection*{The Halting Problem} An effective method to determine whether or not a given machine will eventually halt on a given input --- short of waiting forever! --- would be nice to have. For example, assuming Church's Thesis is true, such a method would let us identify computer programs which have infinite loops before we attempt to execute them. 
\begin{question}{The Halting Problem} \index{Halting Problem} Given a Turing machine $M$ and an input tape $\mathbf{a}$, is there an effective method to determine whether or not $M$ eventually halts on input $\mathbf{a}$? \end{question} Given that we are using Turing machines to formalize the notion of an effective method, one of the difficulties with solving the Halting Problem is representing a given Turing machine and its input tape as input for another machine. As this is one of the things that was accomplished in the course of constructing a universal Turing machine, we can now formulate a precise version of the Halting Problem and solve it. \begin{question}{The Halting Problem} \index{Halting Problem} Is there a Turing machine $T$ which, for any Turing machine $M$ with alphabet $\{1\}$ and tape $\mathbf{a}$ for $M$, halts on input $$ \underline{0} 1^{\ulcorner M \urcorner + 1} 0 1^{\ulcorner (1,0,\mathbf{a}) \urcorner + 1} $$ with output $\underline{0} 11$ if $M$ halts on input $\mathbf{a}$, and with output $\underline{0} 1$ if $M$ does not halt on input $\mathbf{a}$? \end{question} Note that this precise version of the Halting Problem is equivalent to the informal one above only if Church's Thesis is true. \begin{prob} \label{HP:coder} Show that there is a Turing machine $C$ which, for any Turing machine $M$ with alphabet $\{1\}$, on input $$ \underline{0} 1^{\ulcorner M \urcorner + 1} $$ eventually halts with output $$ \underline{0} 1^{\ulcorner M \urcorner + 1} 0 1^{\ulcorner (1,0,0 1^{\ulcorner M \urcorner + 1}) \urcorner + 1} \, . $$ \end{prob} \begin{thm} \label{HP:no} The answer to (the precise version of) the Halting Problem is ``No.'' \end{thm} \subsection*{Recursively enumerable sets} The following notion is of particular interest in the advanced study of computability.
\begin{defn} \label{df:re} \index{recursively enumerable} \index{r.e.} A subset ({\em i.e.\/} a $1$-place relation) $P$ of $\mathbb{N}$ is {\em recursively enumerable\/}, often abbreviated as {\em r.e.\/}, if there is a $1$-place recursive function $f$ such that $P = \mathrm{im}(f) = \{\, f(n) \mid n \in \mathbb{N} \,\}$. \end{defn}
Since the image of any recursive $1$-place function is recursively enumerable by definition, we do not lack for examples. For one, the set $E$ of even natural numbers is recursively enumerable, since it is the image of $f(n) = \mathsc{Mult}(\mathsc{S}(\mathsc{S}(\mathsc{O}(n))),n)$.
\begin{prop} \label{p:fourteen13} If $P$ is a $1$-place recursive relation, then $P$ is recursively enumerable. \end{prop}
This proposition is not reversible, but it does come close.
\begin{prop} \label{p:fourteen14} $P \subseteq \mathbb{N}$ is recursive if and only if both $P$ and $\mathbb{N} \setminus P$ are recursively enumerable. \end{prop}
\begin{prob} \label{p:fourteen15} Find an example of a recursively enumerable set which is not recursive. \end{prob}
\begin{prob} \label{p:fourteen16} Is $P \subseteq \mathbb N$ primitive recursive if and only if both $P$ and $\mathbb N \setminus P$ are enumerable by primitive recursive functions? \end{prob}
\begin{prob} \label{p:fourteen17} Show that $P \subseteq \mathbb N$ is recursively enumerable if and only if there is a $1$-place recursive partial function $g$ such that $P = \mathrm{dom}(g) = \{\, n \mid g(n) \text{\ is defined} \,\}$. \end{prob}
\chapter*{Hints for Chapters 10--14}
%
% Hints for Chapter 10 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:ten}}
\begin{clue}{ten:tape} This should be easy\dots \end{clue}
\begin{clue}{ten:tppos} Ditto. \end{clue}
\begin{clue}{ten:TMs} \begin{enumerate} \item Any machine with the given alphabet and a table with three non-empty rows will do.
\item Every entry in the table in the $0$ column must write a $1$ in the scanned cell; similarly, every entry in the $1$ column must write a $0$ in the scanned cell. \item What's the simplest possible table for a given alphabet? \end{enumerate} \end{clue} \begin{clue}{ten:comp} Unwind the definitions step by step in each case. Not all of these are computations\dots \end{clue} \begin{clue}{ten:inps} Examine your solutions to the previous problem and, if necessary, take the computations a little farther. \end{clue} \begin{clue}{ten:mrks} Have the machine run on forever to the right, writing down the desired pattern as it goes no matter what may be on the tape already. \end{clue} \begin{clue}{ten:runs} Consider your solution to Problem \ref{ten:mrks} for one possible approach. It should be easy to find simpler solutions, though. \end{clue} \begin{clue}{ten:combTMs} Consider the tasks $S$ and $T$ are intended to perform. \end{clue} \begin{clue}{ten:mTMs} \begin{enumerate} \item Use four states to write the $1$s, one for each. \item The input has a convenient marker. \item Run back and forth to move one marker $n$ cells {\em from\/} the block of $1$'s while moving another {\em through\/} the block, and then fill in. \item Modify the previous machine by having it delete every other $1$ after writing out $1^{2n}$. \item Run back and forth to move the right block of $1$s cell by cell to the desired position. \item Run back and forth to move the left block of $1$s cell by cell past the other two, and then apply a minor modification of the machine in part 5. \item Variations on the ideas used in part 6 should do the job. \item Run back and forth between the blocks, moving a marker through each. After the race between the markers to the ends of their respective blocks has been decided, erase everything and write down the desired output. 
\end{enumerate} \end{clue}
%
% Hints for Chapter 11 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:eleven}}
\begin{clue}{p:eleven1} This ought to be easy. \end{clue}
\begin{clue}{p:eleven2} Generalize the technique of Example \ref{sim:LR}, adding two new states to help with each old state that may cause a move in different directions. You do have to be a bit careful not to make a machine that would run off the end of the tape when the original would not. \end{clue}
\begin{clue}{p:eleven1a} You only need to change the parts of the definitions involving the symbols $0$ and $1$. \end{clue}
\begin{clue}{p:eleven2a} If you have trouble figuring out the subroutine of $Z$ simulating state $1$ of $W$ on input $y$, try tracing the partial computations of $W$ and $Z$ on other tapes involving $y$. \end{clue}
\begin{clue}{p:eleven3} Generalize the concepts used in Example~\ref{sim:alph}. Note that the simulation must operate with coded versions of $M$'s tape, unless $\Sigma = \{ 1 \}$. The key idea is to use the tape of the simulator in blocks of some fixed size, with the patterns of $0$s and $1$s in each block corresponding to elements of $\Sigma$. \end{clue}
\begin{clue}{p:eleven4} This should be straightforward, if somewhat tedious. You do need to be careful in coming up with the appropriate input tapes for $O$. \end{clue}
\begin{clue}{p:eleven5} Generalize the technique of Example \ref{sim:2way}, splitting up the tape of the simulator into upper and lower tracks and splitting each state of $N$ into two states in $P$. You will need to be quite careful in describing just how the latter is to be done. \end{clue}
\begin{clue}{p:eleven5a} This is mostly pretty easy. The only problem is to devise $N$ so that one can tell from its output whether $P$ halted or crashed, and this is easy to indicate using some extra symbol in $N$'s alphabet.
\end{clue}
\begin{clue}{p:eleven6} If you're in doubt, go with one read/write scanner for each tape, and have each entry in the table of a two-tape machine take both scanners into account. Simulating such a machine is really just a variation on the techniques used in Example \ref{sim:2way}. \end{clue}
\begin{clue}{p:eleven7} Such a machine should be able to move its scanner to cells up and down from the current one, as well as to the side. (Diagonally too, if you want to!) Simulating such a machine on a single-tape machine is a challenge. You might find it easier to first describe how to simulate it on a suitable multiple-tape machine. \end{clue}
%
% Hints for Chapter 12 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:twelve}}
\begin{clue}{C:c} \begin{enumerate} \item Delete most of the input. \item Add a one to the far end of the input. \item Add a little to the input, and delete a little more elsewhere. \item Delete a little from the input {\em most\/} of the time. \item Run back and forth between the two blocks in the input, deleting until one side disappears. Clean up appropriately! (This is a relative of Problem~\ref{ten:mTMs}.8.) \item Delete two of the blocks and move the remaining one. \item This is just a souped-up version of the machine immediately preceding\dots \end{enumerate} \end{clue}
\begin{clue}{C:inc} There are just as many functions $\mathbb{N} \to \mathbb{N}$ as there are real numbers, but only as many Turing machines as there are natural numbers. \end{clue}
\begin{clue}{prob:BB} \begin{enumerate} \item Trace the computation through step-by-step. \item Consider the scores of each of the $1$-state entries in the busy beaver competition. \item Find a $3$-state entry in the busy beaver competition which scores six. \item Show how to turn an $n$-state entry in the busy beaver competition into an $(n+1)$-state entry that scores just one better.
\end{enumerate} \end{clue}
\begin{clue}{prob:mBB} You could start by looking at modifications of the $3$-state entry you devised in Problem~\ref{prob:BB}.3, but you will probably want to do some serious fiddling to do better than what Problem~\ref{prob:BB}.4 does from there. \end{clue}
\begin{clue}{p:twelve5} Suppose $\Sigma$ was computable by a Turing machine $M$. Modify $M$ to get an $n$-state entry in the busy beaver competition for some $n$ which achieves a score greater than $\Sigma(n)$. The key idea is to add a ``pre-processor'' to $M$ which writes a block with more $1$s than the number of states that $M$ and the pre-processor have between them. \end{clue}
\begin{clue}{p:cnstfns} Generalize Example \ref{e:con}. \end{clue}
\begin{clue}{p:compcomp} Use machines computing $g$, $h_1$, \dots, $h_m$ as sub-machines of the machine computing the composition. You might also find sub-machines that copy the original input and various stages of the output useful. It is important that each sub-machine gets all the data it needs and does not damage the data needed by other sub-machines. \end{clue}
\begin{clue}{p:complin} Proceed by induction on the number of applications of composition used to define $f$ from the initial functions. \end{clue}
%
% Hints for Chapter 13 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:thirteen}}
\begin{clue}{pr:fns} \begin{enumerate} \item Exponentiation is to multiplication as multiplication is to addition. \item This is straightforward except for taking care of $\mathsc{Pred}(0) = \mathsc{Pred}(1) = 0$. \item $\mathsc{Diff}$ is to $\mathsc{Pred}$ as $\mathsc{Sum}$ is to $\mathsc{S}$. \item This is straightforward if you let $0! = 1$. \end{enumerate} \end{clue}
\begin{clue}{p:thirteen5} Machines used to compute $g$ and $h$ are the principal parts of the machine computing $f$, along with parts to copy, move, and/or delete data on the tape between stages in the recursive process.
\end{clue} \begin{clue}{p:thirteen6} \begin{enumerate} \item $f$ is to $g$ as $\mathsc{Fact}$ is to the identity function. \item Use $\mathsc{Diff}$ and a suitable constant function as the basic building blocks. \item This is a slight generalization of the preceding part. \end{enumerate} \end{clue} \begin{clue}{t:thirteen7} Proceed by induction on the number of applications of primitive recursion and composition. \end{clue} \begin{clue}{r:rfs} \begin{enumerate} \item Use a composition including $\mathsc{Diff}$, $\chi_{P}$, and a suitable constant function. \item A suitable composition will do the job; it's just a little harder than it looks. \item A suitable composition will do the job; it's rather more straightforward than the previous part. \item Note that $n = m$ exactly when $n - m = 0 = m - n$. \item Adapt your solution from the first part of Problem~\ref{p:thirteen6}. \item First devise a characteristic function for the relation $$ \mathsc{Product}(n,k,m) \iff nk = m \, , $$ and then sum up. \item Use $\chi_{\mathsc{Div}}$ and sum up. \item Use $\mathsc{IsPrime}$ and some ingenuity. \item Use $\mathsc{Exp}$ and $\mathsc{Div}$ and some more ingenuity. \item A suitable combination of $\mathsc{Prime}$ with other things will do. \item A suitable combination of $\mathsc{Prime}$ and $\mathsc{Power}$ will do. \item Throw the kitchen sink at this one\dots \item Ditto. \end{enumerate} \end{clue} \begin{clue}{r:cd} In each direction, use a composition of functions already known to be primitive recursive to modify the input as necessary. \end{clue} \begin{clue}{c:thirteen10} A straightforward application of Theorem \ref{r:cd}. \end{clue} \begin{clue}{p:thirteen11} This is not unlike, though a little more complicated than, showing that primitive recursion preserves computability. \end{clue} \begin{clue}{t:thirteen12} It's {\em not\/} easy! Look it up\dots \end{clue} \begin{clue}{c:thirteen13} This is a very easy consequence of Theorem \ref{t:thirteen12}. 
\end{clue} \begin{clue}{p:thirteen14} Listing the definitions of all possible primitive recursive functions is a computable task. Now borrow a trick from Cantor's proof that the real numbers are uncountable. (A formal argument to this effect could be made using techniques similar to those used to show that all Turing computable functions are recursive in the next chapter.) \end{clue} \begin{clue}{p:thirteen15} The strategy should be easy. Make sure that at each stage you preserve a copy of the original input for use at later stages. \end{clue} \begin{clue}{p:thirteen16} The primitive recursive function you define only needs to check values of $g(n_1,\dots,n_k,m)$ for $m$ such that $0 \le m \le h(n_1,\dots,n_k)$, but it still needs to pick the least $m$ such that $g(n_1,\dots,n_k,m) = 0$. \end{clue} \begin{clue}{RF:TC} This is very similar to Theorem \ref{t:thirteen7}. \end{clue} \begin{clue}{t:thirteen17} This is virtually identical to Theorem \ref{r:cd}. \end{clue} \begin{clue}{c:thirteen18} This is virtually identical to Corollary \ref{c:thirteen10}. \end{clue} % % Hints for Chapter 14 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:fourteen}} \begin{clue}{p:fourteen6} Emulate Example \ref{TP:gcd} in both parts. \end{clue} \begin{clue}{p:fourteen7} Write out the prime power expansion of the given number and unwind Definition \ref{TP:gcode}. \end{clue} \begin{clue}{p:fourteen8} Find the codes of each of the positions in the sequence you chose and then apply Definition \ref{TPS:gc}. \end{clue} \begin{clue}{p:fourteen8a} \begin{enumerate} \item $\chi_{\mathsc{TapePos}}(n) = 1$ \index{$\mathsc{TapePos}$} exactly when the power of $2$ in the prime power expansion of $n$ is at least $1$ and every other prime appears in the expansion with a power of $0$ or $1$. This can be achieved with a composition of recursive functions from Problems \ref{p:thirteen6} and \ref{r:rfs}. 
\item $\chi_{\mathsc{TapePosSeq}}(n)=1$ exactly when $n$ is the code of a sequence of tape positions, {\em i.e.\/} every power in the prime power expansion of $n$ is the code of a tape position. \index{$\mathsc{TapePosSeq}$} \end{enumerate} \end{clue}
\begin{clue}{p:fourteen9} \begin{enumerate} \item If the input is of the correct form, make the necessary changes to the prime power expansion of $n$ using the tools in Problem \ref{r:rfs}. \item Piece $\mathsc{Step}_M$ together by cases using the function $\mathsc{Entry}$ in each case. The piecing-together works a lot like redefining a function at a particular point in Problem \ref{p:thirteen6}. \item If the input is of the correct form, use the function $\mathsc{Step}_M$ to check that the successive elements of the sequence of tape positions are correct. \end{enumerate} \end{clue}
\begin{clue}{p:fourteen9a} The key idea is to use unbounded minimalization on $\chi_{\mathsc{Comp}}$, with some additions to make sure the computation found (if any) starts with the given input, and then to extract the output from the code of the computation. \end{clue}
\begin{clue}{l:fourteen9b} \begin{enumerate} \item To define $\mathsc{Code}_k$\index{$\mathsc{Code}_k$}, consider what $\ulcorner (1,0,01^{n_1}0\ldots 01^{n_k}) \urcorner$ is as a prime power expansion, and arrange a suitable composition to obtain it from $(n_1,\dots,n_k)$. \item To define $\mathsc{Decode}$\index{$\mathsc{Decode}$} you only need to count how many powers of primes other than $3$ in the prime-power expansion of $\ulcorner (s,i,01^{n+1}) \urcorner$ are equal to $1$. \end{enumerate} \end{clue}
\begin{clue}{t:fourteen10} Use Proposition \ref{p:fourteen9a} and Lemma \ref{l:fourteen9b}. \end{clue}
\begin{clue}{c:TCiffRec} This follows directly from Theorems \ref{RF:TC} and \ref{t:fourteen10}. \end{clue}
\begin{clue}{p:fourteen11} Take some creative inspiration from Definitions \ref{TP:gcode} and \ref{TPS:gc}.
For example, if $(s,i) \in \mathrm{dom}(M)$ and $M(s,i) = (j,d,t)$, you could let the code of $M(s,i)$ be $$ \ulcorner M(s,i) \urcorner = 2^s 3^i 5^j 7^{d+1} 11^t \, . $$ \end{clue}
\begin{clue}{p:fourteen12} Much of what you need for both parts is just what was needed for Problem \ref{p:fourteen9}, except that $\mathsc{Step}$\index{$\mathsc{Step}$} is probably easier to define than $\mathsc{Step}_M$\index{$\mathsc{Step}_M$} was. (Define it as a composition\dots) The additional ingredients mainly have to do with using $m = \ulcorner M \urcorner$ properly. \end{clue}
\begin{clue}{p:fourteen12a} Essentially, this is to Problem \ref{p:fourteen12} as proving Proposition \ref{p:fourteen9a} is to Problem \ref{p:fourteen9}. \end{clue}
\begin{clue}{c:UTM} The machine that computes $\mathsc{Sim}$\index{$\mathsc{Sim}$} does the job. \end{clue}
\begin{clue}{c:URM} A modification of $\mathsc{Sim}$\index{$\mathsc{Sim}$} does the job. The modifications are needed to handle appropriate input and output. Check Theorem \ref{t:thirteen17} for some ideas on what may be appropriate. \end{clue}
\begin{clue}{HP:coder} This can be done directly, but may be easier to think of in terms of recursive functions. \end{clue}
\begin{clue}{HP:no} Suppose the answer was yes and such a machine $T$ did exist. Create a machine $U$ as follows. Give $T$ the machine $C$ from Problem \ref{HP:coder} as a pre-processor and alter its behaviour by having it run forever if $M$ halts and halt if $M$ runs forever. What will $U$ do when it gets itself as input? \end{clue}
\begin{clue}{p:fourteen13} Use $\chi_P$ to help define a function $f$ such that $\mathrm{im}(f) = P$. \end{clue}
\begin{clue}{p:fourteen14} One direction is an easy application of Proposition \ref{p:fourteen13}. For the other, given an $n \in \mathbb{N}$, run the functions enumerating $P$ and $\mathbb{N} \setminus P$ concurrently until one or the other outputs $n$.
\end{clue} \begin{clue}{p:fourteen15} Consider the set of natural numbers coding (according to some scheme you must devise) Turing machines together with input tapes on which they halt. \end{clue} \begin{clue}{p:fourteen16} See how far you can adapt your argument for Proposition \ref{p:fourteen14}. \end{clue} \begin{clue}{p:fourteen17} This may well be easier to think of in terms of Turing machines. Run a Turing machine that computes $g$ for a few steps on the first possible input, a few on the second, a few more on the first, a few more on the second, a few on the third, a few more on the first, \dots \end{clue} % % Part IV of "A Problem Course in Mathematical Logic" % \part{Incompleteness} % % Fifteenth chapter of "A Problem Course in Mathematical Logic" % \chapter{Preliminaries} \label{ch:fifteen} It was mentioned in the Introduction that one of the motivations for the development of notions of computability was the following question. \begin{proof}[Entscheidungsproblem] \index{Entscheidungsproblem} Given a reasonable set $\Sigma$ of formulas of a first-order language $\mathcal{L}$ and a formula $\varphi$ of $\mathcal{L}$, is there an effective method for determining whether or not $\Sigma \proves \varphi$? \end{proof} Armed with knowledge of first-order logic on the one hand and of computability on the other, we are in a position to formulate this question precisely and then solve it. To cut to the chase, the answer is usually ``no''. G\"odel's Incompleteness Theorem \index{G\"odel Incompleteness Theorem} \index{Incompleteness Theorem} asserts, roughly, that given any set of axioms in a first-order language which are computable and also powerful enough to prove certain facts about arithmetic, it is possible to formulate statements in the language whose truth is not decided by the axioms. In particular, it turns out that no consistent set of axioms can hope to prove its own consistency. We will tackle the Incompleteness Theorem in three stages. 
First, we will code the formulas and proofs of a first-order language as numbers and show that the functions and relations involved are recursive. This will, in particular, make it possible for us to define a ``computable set of axioms'' precisely. Second, we will show that all recursive functions and relations can be defined by first-order formulas in the presence of a fairly minimal set of axioms about elementary number theory. Finally, by putting recursive functions talking about first-order formulas together with first-order formulas defining recursive functions, we will manufacture a self-referential sentence which asserts its own unprovability.
\begin{note} It will be assumed in what follows that you are familiar with the basics of the syntax and semantics of first-order languages, as laid out in Chapters 5--8 of this text. Even if you are already familiar with the material, you may wish to look over Chapters 5--8 to familiarize yourself with the notation, definitions, and conventions used here, or at least keep them handy in case you need to check some such point. \end{note}
\subsection*{A language for first-order number theory} To keep things as concrete as possible we will work with and in the following language for first-order number theory, mentioned in Example~\ref{e:lan}.
\begin{defn} \label{d:LN} \index{$\mathcal{L}_N$} \index{language for first-order number theory} \index{first-order language for number theory} \index{number theory first-order language for} $\mathcal{L}_N$ is the first-order language with the following symbols: \begin{enumerate} \item Parentheses: $($ and $)$ \item Connectives: $\lnot$ and $\to$ \item Quantifier: $\forall$ \item Equality: $=$ \item Variable symbols: $v_0$, $v_1$, $v_2$, \dots \item Constant symbol: $0$ \item $1$-place function symbol: $S$ \item $2$-place function symbols: $+$, $\cdot$, and $E$.
\end{enumerate} \end{defn}
The non-logical symbols of $\mathcal{L}_N$, $0$, $S$, $+$, $\cdot$, and $E$, are intended to name, respectively, the number zero, and the successor, addition, multiplication, and exponentiation functions on the natural numbers. That is, the (standard!) structure this language is intended to discuss is $\mathfrak{N} = (\mathbb{N},0,\mathsc{S},+,\cdot,\mathsc{E})$. \index{$\mathfrak{N}$}
\subsection*{Completeness} The notion of completeness used in the Incompleteness Theorem is different from the one used in the Completeness Theorem.\footnote{Which, to confuse the issue, was also first proved by Kurt G\"odel.} ``Completeness'' in the latter sense is a property of a logic: it asserts that whenever $\Gamma \models \sigma$ ({\em i.e.\/} the truth of the sentence $\sigma$ follows from that of the set of sentences $\Gamma$), $\Gamma \proves \sigma$ ({\em i.e.\/} there is a deduction of $\sigma$ from $\Gamma$). The sense of ``completeness'' in the Incompleteness Theorem, defined below, is a property of a set of sentences.
\begin{defn} \label{d:cplt} \index{completeness} \index{complete set of sentences} A set of sentences $\Sigma$ of a first-order language $\mathcal{L}$ is said to be {\em complete\/} if for every sentence $\tau$ either $\Sigma \proves \tau$ or $\Sigma \proves \lnot \tau$. \end{defn}
That is, a set of sentences, or non-logical axioms, is complete if it suffices to prove or disprove every sentence of the language in question.
\begin{prop} \label{p:fifteen1} A consistent set $\Sigma$ of sentences of a first-order language $\mathcal{L}$ is complete if and only if the theory of $\Sigma$, $$ \mathrm{Th}(\Sigma) = \{\, \tau \mid \text{$\tau$ is a sentence of $\mathcal{L}$ and $\Sigma \proves \tau$} \,\} \, , $$ is maximally consistent.
\index{theory of a set of sentences} \index{$\mathrm{Th}(\Sigma)$} \end{prop}
%
% Sixteenth chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Coding First-Order Logic} \label{ch:sixteen}
We will encode the symbols, formulas, and deductions of $\mathcal{L}_N$ as natural numbers in such a way that the operations necessary to manipulate these codes are recursive. Although we will do so just for $\mathcal{L}_N$, any countable first-order language can be coded in a similar way.
\subsection*{G\"odel coding} The basic approach of the coding scheme we will use was devised by G\"odel in the course of his proof of the Incompleteness Theorem.
\begin{defn} \label{df:gcsym} \index{G\"odel code of symbols of $\mathcal{L}_N$} To each symbol $s$ of $\mathcal L_N$ we assign a unique positive integer $\ulcorner s \urcorner$, the {\em G\"odel code\/} of $s$, as follows: \begin{enumerate} \item $\ulcorner ( \urcorner = 1$ and $\ulcorner ) \urcorner = 2$ \item $\ulcorner \lnot \urcorner = 3$ and $\ulcorner \to \urcorner = 4$ \item $\ulcorner \forall \urcorner = 5$ \item $\ulcorner = \urcorner = 6$ \item $\ulcorner v_k \urcorner = k + 12$ \item $\ulcorner 0 \urcorner = 7$ \item $\ulcorner S \urcorner = 8$ \item $\ulcorner + \urcorner = 9$, $\ulcorner \cdot \urcorner = 10$, and $\ulcorner E \urcorner = 11$ \end{enumerate} \end{defn}
Note that each positive integer is the G\"odel code of one and only one symbol of $\mathcal{L}_N$. We will also need to code sequences of the symbols of $\mathcal{L}_N$, such as terms and formulas, as numbers, not to mention sequences of sequences of symbols of $\mathcal{L}_N$, such as deductions.
\begin{defn} \label{df:gcfor} \index{G\"odel code of sequences} Suppose $s_1 s_2 \dots s_k$ is a sequence of symbols of $\mathcal{L}_N$. Then the {\em G\"odel code\/} of this sequence is $$ \ulcorner s_1 \dots s_k \urcorner = p_1^{\ulcorner s_1 \urcorner} \dots p_k^{\ulcorner s_k \urcorner} \, , $$ where $p_n$ is the $n$th prime number.
Similarly, if $\sigma_1 \sigma_2 \dots \sigma_\ell$ is a sequence of sequences of symbols of $\mathcal{L}_N$, then the {\em G\"odel code\/} of this sequence is $$ \ulcorner \sigma_1 \dots \sigma_\ell \urcorner = p_1^{\ulcorner \sigma_1 \urcorner} \dots p_\ell^{\ulcorner \sigma_\ell \urcorner} \, . $$ \end{defn}
\begin{exmp} The code of the formula $\forall v_1\, = \cdot v_1 S 0 v_1$ (the official form of $\forall v_1\, v_1 \cdot S 0 = v_1$), $\ulcorner \forall v_1\, = \cdot v_1 S 0 v_1 \urcorner$, works out to \begin{align*} & 2^{\ulcorner \forall \urcorner} 3^{\ulcorner v_1 \urcorner} 5^{\ulcorner = \urcorner} 7^{\ulcorner \cdot \urcorner} 11^{\ulcorner v_1 \urcorner} 13^{\ulcorner S \urcorner} 17^{\ulcorner 0 \urcorner} 19^{\ulcorner v_1 \urcorner} = 2^5 3^{13} 5^6 7^{10} 11^{13} 13^8 17^7 19^{13} \\ & = {\scriptstyle 109425289274918632559342112641443058962750733001979829025245569500000} \, . \end{align*} This is {\em not\/} the most efficient conceivable coding scheme! \end{exmp}
\begin{exmp} \label{ex:formseqcode} The code of the sequence of formulas \begin{center} \begin{tabular}{ll} $=00$ & {\em i.e.\/} $0 = 0$ \\ $(=00 \to =S0S0)$ & {\em i.e.\/} $0 = 0 \to S0 = S0$ \\ $=S0S0$ & {\em i.e.\/} $S0 = S0$ \end{tabular} \end{center} works out to \begin{align*} 2^{\ulcorner =00 \urcorner} & 3^{\ulcorner (=00 \to =S0S0) \urcorner} 5^{\ulcorner =S0S0 \urcorner} \\ & = 2^{2^{\ulcorner = \urcorner} 3^{\ulcorner 0 \urcorner} 5^{\ulcorner 0 \urcorner}} \\ & \;\;\;\;\;\;\; \cdot 3^{2^{\ulcorner ( \urcorner} 3^{\ulcorner = \urcorner} 5^{\ulcorner 0 \urcorner} 7^{\ulcorner 0 \urcorner} 11^{\ulcorner \to \urcorner} 13^{\ulcorner = \urcorner} 17^{\ulcorner S \urcorner} 19^{\ulcorner 0 \urcorner} 23^{\ulcorner S \urcorner} 29^{\ulcorner 0 \urcorner} 31^{\ulcorner ) \urcorner}} \\ & \;\;\;\;\;\;\; \cdot 5^{2^{\ulcorner = \urcorner} 3^{\ulcorner S \urcorner} 5^{\ulcorner 0 \urcorner} 7^{\ulcorner S \urcorner} 11^{\ulcorner 0 \urcorner}} \\ & = 2^{2^6 3^7 5^7} 3^{2^1 3^6 5^7 7^7
11^4 13^6 17^8 19^7 23^8 29^7 31^2} 5^{2^6 3^8 5^7 7^8 11^7} \, , \end{align*} which is large enough not to be worth the bother of working it out explicitly. \end{exmp} \begin{prob} \label{p:formcodes} Pick a short sequence of short formulas of $\mathcal{L}_N$ and find the code of the sequence. \end{prob} A particular integer $n$ may simultaneously be the G\"odel code of a symbol, a sequence of symbols, and a sequence of sequences of symbols of $\mathcal{L}_N$. We shall rely on context to avoid confusion, but, with some more work, one could set things up so that no integer was the code of more than one kind of thing. In any case, we will be most interested in the cases where sequences of symbols are (official) terms or formulas and where sequences of sequences of symbols are sequences of (official) formulas. In these cases things are a little simpler. \begin{prob} \label{p:dualcodes} Is there a natural number $n$ which is simultaneously the code of a symbol of $\mathcal{L}_N$, the code of a formula of $\mathcal{L}_N$, and the code of a sequence of formulas of $\mathcal{L}_N$? If not, how many of these three things can a natural number be? \end{prob} \subsection*{Recursive operations on G\"odel codes} We will need to know that various relations and functions which recognize and manipulate G\"odel codes are recursive, and hence computable. \begin{prob} \label{pr:loth} Show that each of the following relations is primitive recursive. \begin{enumerate} \item $\mathsc{Term}(n) \iff n = \ulcorner t \urcorner$ for some term $t$ of $\mathcal{L}_N$. \index{$\mathsc{Term}$} \item $\mathsc{Formula}(n) \iff n = \ulcorner \varphi \urcorner$ for some formula $\varphi$ of $\mathcal{L}_N$. \index{$\mathsc{Formula}$} \item $\mathsc{Sentence}(n) \iff n = \ulcorner \sigma \urcorner$ for some sentence $\sigma$ of $\mathcal{L}_N$. \index{$\mathsc{Sentence}$} \item $\mathsc{Logical}(n) \iff n = \ulcorner \gamma \urcorner$ for some logical axiom $\gamma$ of $\mathcal{L}_N$. 
\index{$\mathsc{Logical}$} \end{enumerate} \end{prob} Using these relations as building blocks, we will develop relations and functions to handle deductions of $\mathcal{L}_N$. First, though, we need to make ``a computable set of formulas'' precise. \begin{defn} \index{recursive set of formulas} \index{recursively enumerable set of formulas} \index{computable set of formulas} \index{$\ulcorner \Delta \urcorner$} A set $\Delta$ of formulas of $\mathcal{L}_N$ is said to be {\em recursive\/} if the set of G\"odel codes of formulas of $\Delta$, $$ \ulcorner \Delta \urcorner = \{\, \ulcorner \delta \urcorner \mid \delta \in \Delta \,\} \, , $$ is a recursive subset of $\mathbb{N}$ ({\em i.e.\/} a recursive $1$-place relation). Similarly, $\Delta$ is said to be {\em recursively enumerable\/} if $\ulcorner \Delta \urcorner$ is recursively enumerable. \end{defn} \begin{prob} \label{pr:loac} Suppose $\Delta$ is a recursive set of sentences of $\mathcal{L}_N$. Show that each of the following relations is recursive. \begin{enumerate} \item $\mathsc{Premiss}_\Delta(n) \iff n = \ulcorner \beta \urcorner$ for some formula $\beta$ of $\mathcal{L}_N$ which is either a logical axiom or in $\Delta$. \index{$\mathsc{Premiss}_\Delta$} \item $\mathsc{Formulas}(n) \iff n = \ulcorner \varphi_1 \dots \varphi_k \urcorner$ for some sequence $\varphi_1 \dots \varphi_k$ of formulas of $\mathcal{L}_N$. \index{$\mathsc{Formulas}$} \item $\mathsc{Inference}(n,i,j) \iff n = \ulcorner \varphi_1 \dots \varphi_k \urcorner$ for some sequence $\varphi_1 \dots \varphi_k$ of formulas of $\mathcal{L}_N$, $1 \le i,j \le k$, and $\varphi_k$ follows from $\varphi_i$ and $\varphi_j$ by Modus Ponens. \index{$\mathsc{Inference}$} \item $\mathsc{Deduction}_\Delta(n) \iff n = \ulcorner \varphi_1 \dots \varphi_k \urcorner$ for a deduction $\varphi_1 \dots \varphi_k$ from $\Delta$ in $\mathcal{L}_N$. 
\index{$\mathsc{Deduction}_\Delta$} \item $\mathsc{Conclusion}_\Delta(n,m) \iff n = \ulcorner \varphi_1 \dots \varphi_k \urcorner$ for a deduction $\varphi_1 \dots \varphi_k$ from $\Delta$ in $\mathcal{L}_N$ and $m = \ulcorner \varphi_k \urcorner$. \index{$\mathsc{Conclusion}_\Delta$} \end{enumerate} If $\ulcorner \Delta \urcorner$ is primitive recursive, which of these are primitive recursive? \end{prob}
It is at this point that the connection between computability and completeness begins to appear.
\begin{thm} \label{t:sixteen3} Suppose $\Delta$ is a recursive set of sentences of $\mathcal{L}_N$. Then $\ulcorner \mathrm{Th}(\Delta) \urcorner$ is \begin{enumerate} \item recursively enumerable, and \item recursive if and only if $\Delta$ is complete. \end{enumerate} \end{thm}
\begin{note} It follows that if $\Delta$ is not complete, then $\ulcorner \mathrm{Th}(\Delta) \urcorner$ is an example of a recursively enumerable but not recursive set. \end{note}
%
% Seventeenth chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Defining Recursive Functions In Arithmetic} \label{ch:seventeen}
The definitions and results in Chapter \ref{ch:sixteen} let us use natural numbers and recursive functions to code and manipulate formulas of $\mathcal{L}_N$. We will also need complementary results that let us use terms and formulas of $\mathcal{L}_N$ to represent and manipulate natural numbers and recursive functions.
\subsection*{Axioms for basic arithmetic} We will define a set of non-logical axioms in $\mathcal{L}_N$ which prove enough about the operations of successor, addition, multiplication, and exponentiation to let us define all the recursive functions using formulas of $\mathcal{L}_N$. The non-logical axioms in question essentially guarantee that basic arithmetic works properly.
\begin{defn} \label{df:lna} Let $\mathcal{A}$ be the following set of sentences of $\mathcal{L}_N$, written out in official form.
\index{$\mathcal{A}$} \begin{description} \item[N1] $\forall v_0\, (\lnot = Sv_0 0)$ \index{N1} \item[N2] $\forall v_0\, ((\lnot = v_0 0) \to (\lnot \forall v_1\, (\lnot = Sv_1 v_0)))$ \index{N2} \item[N3] $\forall v_0 \forall v_1 \, (= Sv_0 Sv_1 \to = v_0 v_1)$ \index{N3} \item[N4] $\forall v_0\, = + v_0 0 v_0$ \index{N4} \item[N5] $\forall v_0 \forall v_1\, = + v_0 Sv_1 S + v_0 v_1$ \index{N5} \item[N6] $\forall v_0 \, = \cdot v_0 0 0$ \index{N6} \item[N7] $\forall v_0 \forall v_1\, = \cdot v_0 Sv_1 + \cdot v_0 v_1 v_0$ \index{N7} \item[N8] $\forall v_0\, = Ev_0 0 S0$ \index{N8} \item[N9] $\forall v_0 \forall v_1\, = E v_0 Sv_1 \cdot E v_0 v_1 v_0$ \index{N9} \end{description} \end{defn} Translated from the official forms, $\mathcal{A}$ consists of the following axioms about the natural numbers: \begin{description} \item[N1] For all $n$, $n + 1 \ne 0$. \index{N1} \item[N2] For all $n$, if $n \ne 0$ then there is a $k$ such that $k + 1 = n$. \index{N2} \item[N3] For all $n$ and $k$, $n + 1 = k + 1$ implies that $n = k$. \index{N3} \item[N4] For all $n$, $n + 0 = n$. \index{N4} \item[N5] For all $n$ and $k$, $n + (k + 1) = (n + k) + 1$. \index{N5} \item[N6] For all $n$, $n \cdot 0 = 0$. \index{N6} \item[N7] For all $n$ and $k$, $n \cdot (k + 1) = (n \cdot k) + n$. \index{N7} \item[N8] For all $n$, $n^0 = 1$. \index{N8} \item[N9] For all $n$ and $k$, $n^{k+1} = (n^k) \cdot n$. \index{N9} \end{description} Each of the axioms in $\mathcal{A}$ is true of the natural numbers: \begin{prop} \label{p:axioms} $\mathfrak{N} \models \mathcal{A}$, where $\mathfrak{N} = (\mathbb{N},0,\mathsc{S},+,\cdot,\mathsc{E})$ is the structure consisting of the natural numbers with the usual zero and the usual successor, addition, multiplication, and exponentiation operations. \end{prop} However, $\mathcal{A}$ is a long way from being able to prove all the sentences of first-order arithmetic true in $\mathfrak{N}$.
For example, though we won't prove it, it turns out that $\mathcal{A}$ is not enough to ensure that induction works: that for every formula $\varphi$ with at most the variable $x$ free, if $\varphi^x_0$ and $\forall y\, (\varphi^x_y \to \varphi^x_{Sy})$ hold, then so does $\forall x\, \varphi$. On the other hand, neither $\mathcal{L}_N$ nor $\mathcal{A}$ is quite as minimal as it might be. For example, with some (considerable) extra effort one could do without $E$ and define it from $\cdot$ and $+$. \subsection*{Representing functions and relations} For convenience, we will adopt the following conventions. First, we will often abbreviate the term of $\mathcal{L}_N$ consisting of $m$ $S$s followed by $0$ by $S^m0$.\index{$S^m0$} For example, $S^30$ abbreviates $SSS0$. The term $S^m0$ is a convenient name for the natural number $m$ in the language $\mathcal L_N$ since the interpretation of $S^m0$ in $\mathfrak{N}$ is $m$: \begin{lem} \label{l:inter} For every $m \in \mathbb{N}$ and every assignment $s$ for $\mathfrak{N}$, $\mathbf{s}(S^m0) = m$. \end{lem} Second, if $\varphi$ is a formula of $\mathcal{L}_N$ with all of its free variables among $v_1$, \dots, $v_k$, and $m_1$, \dots, $m_k$ are natural numbers, we will write $\varphi(S^{m_1}0, \dots, S^{m_k}0)$ \index{$\varphi(S^{m_1}0, \dots, S^{m_k}0)$} for the sentence $\varphi^{v_1 \dots v_k}_{S^{m_1}0, \dots, S^{m_k}0}$, {\em i.e.\/} $\varphi$ with $S^{m_i}0$ substituted for every free occurrence of $v_i$. Since the term $S^{m_i}0$ involves no variables, it is substitutable for $v_i$ in $\varphi$. \begin{defn} \label{df:rcna} Suppose $\Sigma$ is a set of sentences of $\mathcal{L}_N$.
A $k$-place function $f$ is said to be {\em representable\/} \index{representable, function} in $\mathrm{Th}(\Sigma) = \{\, \tau \mid \Sigma \proves \tau \,\}$ if there is a formula $\varphi$ of $\mathcal{L}_N$ with at most $v_1$, \dots, $v_k$, and $v_{k+1}$ as free variables such that $$ \begin{aligned} f(n_1,\dots,n_k) = m &\iff \varphi(S^{n_1}0,\dots,S^{n_k}0,S^m0) \in \mathrm{Th}(\Sigma) \\ &\iff \Sigma \proves \varphi(S^{n_1}0,\dots,S^{n_k}0,S^m0) \end{aligned} $$ for all $n_1$, \dots, $n_k$, and $m$ in $\mathbb{N}$. The formula $\varphi$ is said to {\em represent\/}\index{represent, function} $f$ in $\mathrm{Th}(\Sigma)$. \end{defn} We will use this definition mainly with $\Sigma = \mathcal{A}$. \begin{exmp} \label{ex:frcf} The constant function $c^1_3$ given by $c^1_3(n) = 3$ is representable in $\mathrm{Th}(\mathcal{A})$; $v_2 = S^30$ is a formula representing it. Note that this formula has no free variable for the input of the $1$-place function, but then the input is irrelevant\dots To see that $v_2 = S^30$ really does represent $c^1_3$ in $\mathrm{Th}(\mathcal{A})$, we need to verify that \begin{align*} c^1_3(n) = m & \iff \mathcal{A} \proves v_2 = S^30 (S^n0,S^m0) \\ & \iff \mathcal{A} \proves S^m0 = S^30 \end{align*} for all $n,m \in \mathbb{N}$. In one direction, suppose that $c^1_3(n) = m$. Then, by the definition of $c^1_3$, we must have $m = 3$. Now \begin{enumerate} \item $\forall x \, x = x \,\to\, S^30 = S^30$ \hfill A4 \item $\forall x \, x = x$ \hfill A8 \item $S^30 = S^30$ \hfill 1,2 MP \end{enumerate} is a deduction of $S^30 = S^30$ from $\mathcal{A}$. Hence if $c^1_3(n) = m$, then $\mathcal{A} \proves S^m0 = S^30$. In the other direction, suppose that $\mathcal{A} \proves S^m0 = S^30$. Since $\mathfrak{N} \models \mathcal{A}$, it follows that $\mathfrak{N} \models S^m0 = S^30$. It follows from Lemma \ref{l:inter} that $m = 3$, so $c^1_3(n) = m$. Hence if $\mathcal{A} \proves S^m0 = S^30$, then $c^1_3(n) = m$.
\end{exmp} \begin{prob} \label{p:reppi32} Show that the projection function $\pi^3_2$ can be represented in $\mathrm{Th}(\mathcal{A})$. \end{prob} \begin{defn} \label{df:rcnab} A $k$-place relation $P \subseteq \mathbb{N}^k$ is said to be {\em representable\/} \index{representable, relation} in $\mathrm{Th}(\Sigma)$ if there is a formula $\psi$ of $\mathcal L_N$ with at most $v_1$, \dots, $v_k$ as free variables such that $$ \begin{aligned} P(n_1,\dots,n_k) &\iff \psi(S^{n_1}0,\dots,S^{n_k}0) \in \mathrm{Th}(\Sigma) \\ &\iff \Sigma \proves \psi(S^{n_1}0,\dots,S^{n_k}0) \end{aligned} $$ for all $n_1$, \dots, $n_k$ in $\mathbb N$. The formula $\psi$ is said to {\em represent\/}\index{represent, relation} $P$ in $\mathrm{Th}(\Sigma)$. \end{defn} We will also use this definition mainly with $\Sigma = \mathcal{A}$. \begin{exmp} \label{ex:frcr} Almost the same formula, $v_1 = S^30$, serves to represent the set --- {\em i.e.\/} $1$-place relation --- $\{ 3 \}$ in $\mathrm{Th}(\mathcal{A})$. Showing that $v_1 = S^30$ really does represent $\{ 3 \}$ in $\mathrm{Th}(\mathcal{A})$ is virtually identical to the corresponding argument in Example \ref{ex:frcf}. \end{exmp} \begin{prob} \label{p:repdif} Explain why $v_2 = SSS0$ does not represent the set $\{ 3 \}$ in $\mathrm{Th}(\mathcal{A})$ and $v_1 = SSS0$ does not represent the constant function $c^1_3$ in $\mathrm{Th}(\mathcal{A})$. \end{prob} \begin{prob} Show that the set of all even numbers is representable in $\mathrm{Th}(\mathcal{A})$. \end{prob} \begin{prob} \label{p:seventeen2} Show that the initial functions are representable in $\mathrm{Th}(\mathcal{A})$: \begin{enumerate} \item The zero function $\mathsc{O}(n) = 0$. \index{$\mathsc{O}$} \item The successor function $\mathsc{S}(n) = n + 1$. \index{$\mathsc{S}$} \item For every positive $k$ and $i \le k$, the projection function $\pi^k_i$.
\index{$\pi^k_i$} \end{enumerate} \end{prob} It turns out that all recursive functions and relations are representable in $\mathrm{Th}(\mathcal{A})$. \begin{prop} \label{p:seventeen3} A $k$-place function $f$ is representable in $\mathrm{Th}(\mathcal{A})$ if and only if the $k+1$-place relation $P_f$ defined by $$ P_f(n_1, \dots, n_k, n_{k+1}) \iff f(n_1, \dots, n_k) = n_{k+1} $$ is representable in $\mathrm{Th}(\mathcal{A})$. Also, a relation $P \subseteq \mathbb N^k$ is representable in $\mathrm{Th}(\mathcal{A})$ if and only if its characteristic function $\chi_P$ is representable in $\mathrm{Th}(\mathcal{A})$. \end{prop} \begin{prop} \label{p:seventeen4} Suppose $g_1$, \dots, $g_m$ are $k$-place functions and $h$ is an $m$-place function, all of them representable in $\mathrm{Th}(\mathcal{A})$. Then $f = h \circ (g_1,\dots,g_m)$ is a $k$-place function representable in $\mathrm{Th}(\mathcal{A})$. \end{prop} \begin{prop} \label{p:seventeen5} Suppose $g$ is a $k+1$-place regular function which is representable in $\mathrm{Th}(\mathcal{A})$. Then the unbounded minimalization of $g$ is a $k$-place function representable in $\mathrm{Th}(\mathcal{A})$. \end{prop} Between them, the above results supply most of what is needed to conclude that all recursive functions and relations on the natural numbers are representable. The exception is showing that functions defined by primitive recursion from representable functions are also representable, which requires some additional effort. The basic problem is that it is not obvious how a formula defining a function can get at previous values of the function. To accomplish this, we will borrow a trick from Chapter \ref{ch:thirteen}. \begin{prob} \label{p:seventeen6} Show that each of the following relations and functions (first defined in Problem \ref{r:rfs}) is representable in $\mathrm{Th}(\mathcal{A})$. 
\begin{enumerate} \item $\mathsc{Div}(n,m) \iff n \mid m$ \index{$\mathsc{Div}$} \item $\mathsc{IsPrime}(n) \iff n \text{ is prime}$ \index{$\mathsc{IsPrime}$} \item $\mathsc{Prime}(k) = p_k$, where $p_0 = 1$ and $p_k$ is the $k$th prime if $k \ge 1$. \index{$\mathsc{Prime}$} \item $\mathsc{Power}(n,m) = k$, where $k \ge 0$ is maximal such that $n^k \mid m$. \index{$\mathsc{Power}$} \item $\mathsc{Length}(n) = \ell$, where $\ell$ is maximal such that $p_\ell \mid n$. \index{$\mathsc{Length}$} \item $\mathsc{Element}(n,i) = n_i$, where $n = p_1^{n_1} \dots p_k^{n_k}$ (and $n_i = 0$ if $i > k$). \index{$\mathsc{Element}$} \end{enumerate} \end{prob} Using the representable functions and relations given above, we can represent a ``history function'' of any representable function\dots \begin{prob} \label{p:seventeen7} Suppose $f$ is a $k+1$-place function representable in $\mathrm{Th}(\mathcal{A})$. Show that \begin{align*} F(n_1,\dots,n_k,m) &= p_1^{f(n_1,\dots,n_k,0)} \dots p_{m+1}^{f(n_1,\dots,n_k,m)} \\ &= \prod_{i=0}^m p_{i+1}^{f(n_1,\dots,n_k,i)} \end{align*} is also representable in $\mathrm{Th}(\mathcal{A})$. \end{prob} \noindent\dots and use it! \begin{prop} \label{p:seventeen8} Suppose $g$ is a $k$-place function and $h$ is a $k+2$-place function, both representable in $\mathrm{Th}(\mathcal{A})$. Then the $k+1$-place function $f$ defined by primitive recursion from $g$ and $h$ is also representable in $\mathrm{Th}(\mathcal{A})$. \end{prop} \begin{thm} \label{th:rfra} Recursive functions are representable in $\mathrm{Th}(\mathcal{A})$. \end{thm} In particular, it follows that there are formulas of $\mathcal{L}_N$ representing each of the functions from Chapter \ref{ch:sixteen} for manipulating the codes of formulas. This will permit us to construct formulas which encode assertions about terms, formulas, and deductions; we will ultimately prove the Incompleteness Theorem by showing there is a formula which codes its own unprovability.
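
As a small illustration of the kind of computation inside $\mathrm{Th}(\mathcal{A})$ that underlies these representability results, here is a sketch of how N4 and N5 yield one particular fact about addition. It is written informally, with infix $+$ and $=$ in place of the official forms, and it suppresses the purely logical steps (instantiating the universal quantifiers and using the equality axioms), so it should be read as an outline under these assumptions rather than as a complete deduction.
\begin{align*}
SS0 + S0 &= S(SS0 + 0) && \text{by N5, with $SS0$ for $v_0$ and $0$ for $v_1$} \\
         &= SSS0       && \text{by N4, with $SS0$ for $v_0$}
\end{align*}
Thus $\mathcal{A} \proves S^20 + S^10 = S^30$. This is one instance of what would have to be checked, for all inputs, to show that a formula such as $v_1 + v_2 = v_3$ represents addition in $\mathrm{Th}(\mathcal{A})$; the converse direction would use $\mathfrak{N} \models \mathcal{A}$ and Lemma \ref{l:inter}, much as in Example \ref{ex:frcf}.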
\subsection*{Representability} We conclude with some more general facts about representability. \begin{prop} \label{p:seventeen1a} Suppose $\Sigma$ is a set of sentences of $\mathcal{L}_N$ and $f$ is a $k$-place function which is representable in $\mathrm{Th}(\Sigma)$. Then $\Sigma$ must be consistent. \end{prop} \begin{prob} \label{p:seventeen1b} If $\Sigma$ is a set of sentences of $\mathcal{L}_N$ and $P$ is a $k$-place relation which is representable in $\mathrm{Th}(\Sigma)$, does $\Sigma$ have to be consistent? \end{prob} \begin{prop} \label{p:seventeen1} Suppose $\Sigma$ and $\Gamma$ are consistent sets of sentences of $\mathcal{L}_N$ and $\Sigma \proves \Gamma$, {\em i.e.\/} $\Sigma \proves \gamma$ for every $\gamma \in \Gamma$. Then every function and relation which is representable in $\mathrm{Th}(\Gamma)$ is representable in $\mathrm{Th}(\Sigma)$. \end{prop} This lets us use everything we can do with representability in $\mathrm{Th}(\mathcal{A})$ with any set of axioms in $\mathcal{L}_N$ that is at least as powerful as $\mathcal{A}$. \begin{cor} \label{p:representability} Functions and relations which are representable in $\mathrm{Th}(\mathcal{A})$ are also representable in $\mathrm{Th}(\Sigma)$, for any consistent set of sentences $\Sigma$ such that $\Sigma \proves \mathcal{A}$. \end{cor} % % Chapter 18 of "A Problem Course in Mathematical Logic" % \chapter{The Incompleteness Theorem} \label{ch:eighteen} The material in Chapter \ref{ch:sixteen} effectively allows us to use recursive functions to manipulate coded formulas of $\mathcal{L}_N$, while the material in Chapter \ref{ch:seventeen} allows us to represent recursive functions using formulas of $\mathcal{L}_N$. Combining these techniques allows us to use formulas of $\mathcal{L}_N$ to refer to and manipulate codes of formulas of $\mathcal{L}_N$. This is the key to proving G\"odel's Incompleteness Theorem and related results.
In particular, we will need to know one further trick about manipulating the codes of formulas recursively, that the operation of substituting (the code of) the term $S^k0$ into (the code of) a formula with one free variable is recursive. \begin{prob} \label{pb:rpfn} Show that the function $$ \mathsc{Sub}(n,k) = \begin{cases} \ulcorner \varphi(S^k0) \urcorner & \text{if $n = \ulcorner \varphi \urcorner$ for a formula $\varphi$ of $\mathcal{L}_N$} \\ & \text{with at most $v_1$ free} \\ 0 & \text{otherwise} \end{cases} $$ \index{$\mathsc{Sub}$} is recursive, and hence representable in $\mathrm{Th}(\mathcal{A})$. \end{prob} In order to combine the results from Chapter \ref{ch:sixteen} with those from Chapter \ref{ch:seventeen}, we will also need to know the following. \begin{lem} \label{p:eighteen1} $\mathcal{A}$ is a recursive set of sentences of $\mathcal{L}_N$. \end{lem} \subsection*{The First Incompleteness Theorem} The key result needed to prove the First Incompleteness Theorem (another will follow shortly!) is the following lemma. It asserts, in effect, that for any statement about (the code of) some sentence, there is a sentence $\sigma$ which is true or false exactly when the statement is true or false of (the code of) $\sigma$. This fact will allow us to show that the self-referential sentence we will need to verify the Incompleteness Theorem exists. \begin{lem}[Fixed-Point Lemma] \index{Fixed-Point Lemma} \label{l:fpl} Suppose $\varphi$ is a formula of $\mathcal{L}_N$ with only $v_1$ as a free variable. Then there is a sentence $\sigma$ of $\mathcal{L}_N$ such that $$ \mathcal{A} \proves \sigma \fromto \varphi(S^{\ulcorner \sigma \urcorner}0) \, . $$ \end{lem} Note that $\sigma$ must be different from the sentence $\varphi(S^{\ulcorner \sigma \urcorner}0)$: there is no way to find a formula $\varphi$ with one free variable and an integer $k$ such that $\ulcorner \varphi(S^k0) \urcorner = k$.
(Think about how G\"odel codes are defined\dots) With the Fixed-Point Lemma in hand, G\"odel's First Incompleteness Theorem can be put away in fairly short order. \begin{thm}[G\"odel's First Incompleteness Theorem] \index{G\"odel's First Incompleteness Theorem} \index{Incompleteness Theorem, G\"odel's First} \label{t:GIT} Suppose $\Sigma$ is a consistent recursive set of sentences of $\mathcal{L}_N$ such that $\Sigma \proves \mathcal{A}$. Then $\Sigma$ is not complete. \end{thm} That is, any consistent set of sentences which proves at least as much about the natural numbers as $\mathcal{A}$ does can't be both complete and recursive. The First Incompleteness Theorem has many variations, corollaries, and relatives, a few of which will be mentioned below. \cite{RS:GIT} is a good place to learn about more of them. \begin{cor} \label{p:eighteen5} \begin{enumerate} \item Let $\Gamma$ be a complete set of sentences of $\mathcal{L}_N$ such that $\Gamma \cup \mathcal{A}$ is consistent. Then $\Gamma$ is not recursive. \item Let $\Delta$ be a recursive set of sentences such that $\Delta \cup \mathcal{A}$ is consistent. Then $\Delta$ is not complete. \item The theory of $\mathfrak{N}$, \index{theory of $\mathfrak{N}$} \index{$\mathrm{Th}(\mathfrak{N})$} $$ \mathrm{Th}(\mathfrak{N}) =\{\, \sigma \mid \text{$\sigma$ is a sentence of $\mathcal{L}_N$ and $\mathfrak{N} \models \sigma$} \,\} \, , $$ is not recursive. \end{enumerate} \end{cor} There is nothing really special about working in $\mathcal{L}_N$. The proof of G\"odel's Incompleteness Theorem can be executed for any first-order language and recursive set of axioms which allow one to code and prove enough facts about arithmetic. In particular, it can be done whenever the language and axioms are powerful enough --- as in Zermelo-Fraenkel set theory, for example --- to define the natural numbers and prove some modest facts about them.
\subsection*{The Second Incompleteness Theorem} G\"odel also proved a strengthened version of the Incompleteness Theorem which asserts that, in particular, a consistent recursive set of sentences $\Sigma$ of $\mathcal{L}_N$ cannot prove its own consistency. To get at it, we need to express the statement ``$\Sigma$ is consistent'' in $\mathcal{L}_N$. \begin{prob} \label{p:eighteen6} \index{$\mathrm{Con}(\Sigma)$} Suppose $\Sigma$ is a recursive set of sentences of $\mathcal{L}_N$. Find a sentence of $\mathcal{L}_N$, which we'll denote by $\mathrm{Con}(\Sigma)$, such that $\Sigma$ is consistent if and only if $\mathcal{A} \proves \mathrm{Con}(\Sigma)$. \end{prob} \begin{thm}[G\"odel's Second Incompleteness Theorem] \index{G\"odel's Second Incompleteness Theorem} \index{Incompleteness Theorem, G\"odel's Second} \label{t:GSIT} Let $\Sigma$ be a consistent recursive set of sentences of $\mathcal{L}_N$ such that $\Sigma \proves \mathcal{A}$. Then $\Sigma \nproves \mathrm{Con}(\Sigma)$. \end{thm} As with the First Incompleteness Theorem, the Second Incompleteness Theorem holds for any recursive set of sentences in a first-order language which allow one to code and prove enough facts about arithmetic. The perverse consequence of the Second Incompleteness Theorem is that only an inconsistent set of axioms can prove its own consistency. \subsection*{Truth and definability} A close relative of the Incompleteness Theorem is the assertion that truth in $\mathfrak{N} = (\mathbb{N},0,\mathsc{S},+,\cdot,\mathsc{E})$ is not definable in $\mathfrak{N}$. To make sense of this, of course, we need to sort out what ``truth'' and ``definable in $\mathfrak{N}$'' mean here. ``Truth'' means what it usually does in first-order logic: all we mean when we say that a sentence $\sigma$ of $\mathcal{L}_N$ is true in $\mathfrak{N}$ is that $\sigma$ is true when interpreted as a statement about the natural numbers with the usual operations.
That is, $\sigma$ is true in $\mathfrak{N}$ exactly when $\mathfrak{N}$ satisfies $\sigma$, {\em i.e.\/} exactly when $\mathfrak{N} \models \sigma$. ``Definable in $\mathfrak{N}$'' we do have to define\dots \begin{defn} \index{definable relation} \index{relation definable in $\mathfrak{N}$} A $k$-place relation $P$ is {\em definable\/} in $\mathfrak{N}$ if there is a formula $\varphi$ of $\mathcal{L}_N$ with at most $v_1$, \dots, $v_k$ as free variables such that $$ P(n_1,\dots,n_k) \iff \mathfrak{N} \models \varphi [s(v_1|n_1)\dots(v_k|n_k)] $$ for every assignment $s$ of $\mathfrak{N}$. The formula $\varphi$ is said to {\em define\/} $P$ in $\mathfrak{N}$. \end{defn} A definition of ``function definable in $\mathfrak{N}$'' \index{function definable in $\mathfrak{N}$} \index{definable function} could be made in a similar way, of course. Definability is a close relative of representability: \begin{prop} \label{p:eighteen8} Suppose $P$ is a $k$-place relation which is representable in $\mathrm{Th}(\mathcal{A})$. Then $P$ is definable in $\mathfrak{N}$. \end{prop} \begin{prob} \label{p:eighteen9} Is the converse to Proposition \ref{p:eighteen8} true? \end{prob} The question of whether truth in $\mathfrak{N}$ is definable is then the question of whether the set of G\"odel codes of sentences of $\mathcal{L}_N$ true in $\mathfrak{N}$, $$ \ulcorner \mathrm{Th}(\mathfrak{N}) \urcorner = \{\, \ulcorner \sigma \urcorner \mid \text{$\sigma$ is a sentence of $\mathcal{L}_N$ and $\mathfrak N \models \sigma$} \,\} \, , $$ is definable in $\mathfrak{N}$. It isn't: \begin{thm}[Tarski's Undefinability Theorem] \index{Tarski's Undefinability Theorem} \index{Undefinability Theorem, Tarski's} \label{t:TUT} $\ulcorner \mathrm{Th}(\mathfrak{N}) \urcorner$ is \linebreak not definable in $\mathfrak{N}$. \end{thm} \subsection*{The implications} G\"odel's Incompleteness Theorems have some serious consequences.
Since almost all of mathematics can be formalized in first-order logic, the First Incompleteness Theorem implies that there is no effective procedure that will find and prove all theorems. This might be considered as job security for research mathematicians. The Second Incompleteness Theorem, on the other hand, implies that we can never be completely sure that any reasonable set of axioms is actually consistent unless we take a more powerful set of axioms on faith. It follows that one can never be completely sure --- faith aside --- that the theorems proved in mathematics are really true. This might be considered as job security for philosophers of mathematics. We leave the question of who gets job security from Tarski's Undefinability Theorem to you, gentle reader\dots \chapter*{Hints for Chapters 15--18} % % Hints for Chapter 15 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:fifteen}} \begin{clue}{p:fifteen1} Compare Definition \ref{d:cplt} with the definition of maximal consistency. \end{clue} % % Hints Chapter 16 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:sixteen}} \begin{clue}{p:formcodes} Do what is done in Example \ref{ex:formseqcode} for some other sequence of formulas. \end{clue} \begin{clue}{p:dualcodes} You need to unwind Definitions \ref{df:gcsym} and \ref{df:gcfor}, keeping in mind that you are dealing with formulas and sequences of formulas, not just arbitrary sequences of symbols of $\mathcal{L}_N$ or sequences of sequences of symbols. \end{clue} \begin{clue}{pr:loth} In each case, use Definitions \ref{df:gcsym} and \ref{df:gcfor}, along with the appropriate definitions from first-order logic and the tools developed in Problems \ref{p:thirteen6} and \ref{r:rfs}.
\begin{enumerate} \item Recall that in $\mathcal{L}_N$, a term is either a variable symbol, {\it i.e.\/} $v_k$ for some $k$, the constant symbol $0$, of the form $St$ for some (shorter) term $t$, or of the form $+t_1t_2$, $\cdot t_1t_2$, or $Et_1t_2$ for some (shorter) terms $t_1$ and $t_2$. $\chi_{\mathsc{Term}}(n)$ needs to check the length of the sequence coded by $n$. If this is of length $1$, it will need to check if the symbol coded is $0$ or $v_k$ for some $k$; otherwise, it needs to check if the sequence coded by $n$ begins with an $S$, $+$, $\cdot$, or $E$, and then whether the rest of the sequence consists of one or two valid terms. Primitive recursion is likely to be necessary in the latter case if you can't figure out how to do it using the tools from Problems \ref{p:thirteen6} and \ref{r:rfs}. \item This is similar to showing $\mathsc{Term}(n)$ is primitive recursive. Recall that in $\mathcal L_N$, a formula is of the form either $=t_1t_2$ for some terms $t_1$ and $t_2$, $(\lnot\alpha)$ for some (shorter) formula $\alpha$, $(\beta \to \gamma)$ for some (shorter) formulas $\beta$ and $\gamma$, or $\forall v_i\, \delta$ for some variable symbol $v_i$ and some (shorter) formula $\delta$. $\chi_{\mathsc{Formula}}(n)$ needs to check the first symbol of the sequence coded by $n$ to identify which case ought to apply and then take it from there. \item Recall that a sentence is just a formula with no free variable; that is, every occurrence of a variable is in the scope of a quantifier. \item Each logical axiom is an instance of one of the schema A1--A8, or is a generalization thereof. \end{enumerate} \end{clue} \begin{clue}{pr:loac} In each case, use Definitions \ref{df:gcsym} and \ref{df:gcfor}, together with the appropriate definitions from first-order logic and the tools developed in Problems \ref{r:rfs} and \ref{pr:loth}.
\begin{enumerate} \item $\ulcorner \Delta \urcorner$ is recursive and $\mathsc{Logical}$ is primitive recursive, so\dots \item All $\chi_{\mathsc{Formulas}}(n)$ has to do is check that every element of the sequence coded by $n$ is the code of a formula, and $\mathsc{Formula}$ is already known to be primitive recursive. \item $\chi_{\mathsc{Inference}}(n)$ needs to check that $n$ is the code of a sequence of formulas, with the additional property that either $\varphi_i$ is $(\varphi_j \to \varphi_k)$ or $\varphi_j$ is $(\varphi_i \to \varphi_k)$. Part of what goes into $\chi_{\mathsc{Formula}}(n)$ may be handy for checking the additional property. \item Recall that a deduction from $\Delta$ is a sequence of formulas $\varphi_1 \dots \varphi_k$ where each formula is either a premiss or follows from preceding formulas by Modus Ponens. \item $\chi_{\mathsc{Conclusion}}(n,m)$ needs to check that $n$ is the code of a deduction and that $m$ is the code of the last formula in that deduction. \end{enumerate} They're all primitive recursive if $\ulcorner \Delta \urcorner$ is, by the way. \end{clue} \begin{clue}{t:sixteen3} \begin{enumerate} \item Use unbounded minimalization and the relations in Problem \ref{pr:loac} to define a function which, given $n$, returns the $n$th integer which codes an element of $\mathrm{Th}(\Delta)$. \item If $\Delta$ is complete, then for any sentence $\sigma$, either $\ulcorner \sigma \urcorner$ or $\ulcorner \lnot \sigma \urcorner$ must eventually turn up in an enumeration of $\ulcorner \mathrm{Th}(\Delta) \urcorner$. The other direction is really just a matter of unwinding the definitions involved. \end{enumerate} \end{clue} % % Hints for Chapter 17 of "A Problem Course in Mathematical Logic} % \subsection*{Hints for Chapter~\ref{ch:seventeen}} \begin{clue}{p:seventeen1} Every deduction from $\Gamma$ can be replaced by a deduction from $\Sigma$ with the same conclusion.
\end{clue} \begin{clue}{p:seventeen1a} If $\Sigma$ were inconsistent it would prove entirely too much\dots \end{clue} \begin{clue}{p:seventeen2} \begin{enumerate} \item Adapt Example \ref{ex:frcf}. \item Use the $1$-place function symbol $S$ of $\mathcal{L}_N$. \item There is much less to this part than meets the eye\dots \end{enumerate} \end{clue} \begin{clue}{p:seventeen3} In each case, you need to use the given representing formula to define the one you need. \end{clue} \begin{clue}{p:seventeen4} String together the formulas representing $g_1$, \dots, $g_m$, and $h$ with $\land$s and put some existential quantifiers in front. \end{clue} \begin{clue}{p:seventeen5} First show that $<$ is representable in $\mathrm{Th}(\mathcal{A})$ and then exploit this fact. \end{clue} \begin{clue}{p:seventeen6} \begin{enumerate} \item $n \mid m$ if and only if there is some $k$ such that $n \cdot k = m$. \item $n$ is prime if and only if there is no $\ell$ such that $\ell \mid n$ and $1 < \ell < n$. \item $p_k$ is the first prime with exactly $k-1$ primes less than it. \item Note that $k$ must be minimal such that $n^{k+1} \nmid m$. \item You'll need a couple of the previous parts. \item Ditto. \end{enumerate} \end{clue} \begin{clue}{p:seventeen7} Problem \ref{p:seventeen6} has most of the ingredients needed here. \end{clue} \begin{clue}{p:seventeen8} Problems \ref{p:seventeen6} and \ref{p:seventeen7} have most of the necessary ingredients between them. \end{clue} \begin{clue}{th:rfra} Proceed by induction on the numbers of applications of composition, primitive recursion, and unbounded minimalization in the recursive definition of $f$, using the previous results in Chapter \ref{ch:seventeen} at the basis and induction steps. \end{clue} % % Hints for Chapter 18 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:eighteen}} \begin{clue}{p:eighteen1} $\mathcal{A}$ is a {\em finite\/} set of sentences.
\end{clue} \begin{clue}{pb:rpfn} First show that recognizing that a formula has at most $v_1$ as a free variable is recursive. The rest boils down to checking that substituting a term for a free variable is also recursive, which has already had to be done in the solutions to Problem \ref{pr:loth}. \end{clue} \begin{clue}{l:fpl} Let $\psi$ be the formula (with at most $v_1$, $v_2$, and $v_3$ free) which represents the function $\mathsc{Sub}$ of Problem \ref{pb:rpfn} in $\mathrm{Th}(\mathcal{A})$. Then the formula $\forall v_3\, ( \psi^{v_2}_{v_1} \to \varphi^{v_1}_{v_3} )$ has only one variable free, namely $v_1$, and is very close to being the sentence $\sigma$ needed. To obtain $\sigma$ you need to substitute $S^k0$ for a suitable $k$ for $v_1$. \end{clue} \begin{clue}{t:GIT} Try to prove this by contradiction. Observe first that if $\Sigma$ is recursive, then $\ulcorner \mathrm{Th}(\Sigma) \urcorner$ is representable in $\mathrm{Th}(\mathcal{A})$. \end{clue} \begin{clue}{p:eighteen5} \begin{enumerate} \item If $\Gamma$ were recursive, you could get a contradiction to the Incompleteness Theorem. \item If $\Delta$ were complete, it couldn't also be recursive. \item Note that $\mathcal{A} \subset \mathrm{Th}(\mathfrak{N})$. \end{enumerate} \end{clue} \begin{clue}{p:eighteen6} Modify the formula representing the function $\mathsc{Conclusion}_\Sigma$ (defined in Problem \ref{pr:loac}) to get $\mathrm{Con}(\Sigma)$. \end{clue} \begin{clue}{t:GSIT} Try to do a proof by contradiction in three stages. First, find a formula $\varphi$ (with just $v_1$ free) that represents ``$n$ is the code of a sentence which cannot be proven from $\Sigma$'' and use the Fixed-Point Lemma to find a sentence $\tau$ such that $\Sigma \proves \tau \fromto \varphi(S^{\ulcorner \tau \urcorner}0)$. Second, show that if $\Sigma$ is consistent, then $\Sigma \nproves \tau$. Third --- the {\em hard\/} part --- show that $\Sigma \proves \mathrm{Con}(\Sigma) \to \varphi(S^{\ulcorner \tau \urcorner}0)$.
This leads directly to a contradiction. \end{clue} \begin{clue}{p:eighteen8} Note that $\mathfrak{N} \models \mathcal{A}$. \end{clue} \begin{clue}{p:eighteen9} If the converse were true, $\mathcal{A}$ would run afoul of the (First) Incompleteness Theorem. \end{clue} \begin{clue}{t:TUT} Suppose, by way of contradiction, that $\ulcorner \mathrm{Th}(\mathfrak{N}) \urcorner$ was definable in $\mathfrak{N}$. Now follow the proof of the (First) Incompleteness Theorem as closely as you can. \end{clue} % % Appendices % \part*{Appendices} \appendix % % Appendix to "A Problem Course in Mathematical Logic" % \chapter{A Little Set Theory} \label{ap:sets} \index{set theory} This appendix is meant to provide an informal summary of the notation, definitions, and facts about sets needed in Chapters \ref{ch:one}--\ref{ch:nine}. For a proper introduction to elementary set theory, try \cite{PH:NST} or \cite{JH:OST}. \begin{defn} \label{d:sed} Suppose $X$ and $Y$ are sets. Then \begin{enumerate} \item $a \in X$ means that $a$ is an {\em element\/} of ({\em i.e.\/} a thing in) the set $X$.\index{element} \index{$\in$} \item $X$ is a subset of $Y$, written as $X \subseteq Y$, if $a \in Y$ for every $a \in X$.\index{subset} \index{$\subseteq$} \item The {\em union\/} of $X$ and $Y$ is $X \cup Y = \{\, a \mid a \in X \text{\ or\ } a \in Y \,\}$.\index{union} \index{$\cup$} \item The {\em intersection\/} of $X$ and $Y$ is $X \cap Y = \{\, a \mid a \in X \text{\ and\ } a \in Y \,\}$.\index{intersection} \index{$\cap$} \item The {\em complement of\/} $Y$ {\em relative to\/} $X$ is $X \setminus Y = \{\, a \mid a \in X \text{\ and\ } a \notin Y \,\}$. \index{complement} \index{$\setminus$} \item The {\em cross product\/} of $X$ and $Y$ is $X \times Y = \{\, (a,b) \mid a \in X \text{\ and\ } b \in Y \,\}$. \index{cross product} \index{$\times$} \item The {\em power set\/} of $X$ is $\mathcal{P}(X) = \{\, Z \mid Z \subseteq X \,\}$.
\index{power set} \index{$\mathcal{P}$} \item $[X]^k = \{\, Z \mid Z \subseteq X \text{\ and\ } |Z| = k \,\}$ is the set of subsets of $X$ of size $k$. \index{$[X]^k$} \end{enumerate} \end{defn} If all the sets being dealt with are subsets of some fixed set $Z$, the complement of $Y$, $\Bar{Y}$\index{$\Bar{Y}$}, is usually taken to mean the complement of $Y$ relative to $Z$. It may sometimes be necessary to take unions, intersections, and cross products of more than two sets. \begin{defn} Suppose $A$ is a set and $\mathbf{X} = \{\, X_a \mid a \in A \,\}$ is a family of sets indexed by $A$. Then \begin{enumerate} \item The union of $\mathbf{X}$ is the set $\bigcup \mathbf{X} = \{\, z \mid \exists a \in A \colon z \in X_a \,\}$.\index{union} \item The intersection of $\mathbf{X}$ is the set $\bigcap \mathbf{X} = \{\, z \mid \forall a \in A \colon z \in X_a \,\}$.\index{intersection} \item The cross product of $\mathbf{X}$ is the set of sequences (indexed by $A$) $\prod \mathbf{X} = \prod_{a \in A} X_a = \{\, (\, z_a \mid a \in A \,) \mid \forall a \in A \colon z_a \in X_a \,\}$.\index{cross product} \index{$\prod$} \end{enumerate} We will denote the cross product of a set $X$ with itself taken $n$ times ({\em i.e.\/} the set of all sequences of length $n$ of elements of $X$) by $X^n$. \index{$X^n$} \end{defn} \begin{defn} If $X$ is any set, a {\em $k$-place relation on $X$\/} is a subset $R \subseteq X^k$. \index{relation, $k$-place} \end{defn} For example, the set $E = \{\, 0, 2, 4, \dots \,\}$ of even natural numbers is a $1$-place relation on $\mathbb{N}$, $D = \{\, (x,y) \in \mathbb{N}^2 \mid x \text{\ divides\ } y \,\}$ is a $2$-place relation on $\mathbb{N}$, and $S = \{\, (a,b,c) \in \mathbb{N}^3 \mid a + b = c \,\}$ is a $3$-place relation on $\mathbb{N}$.
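To check the definition against these examples (the particular tuples are chosen here only for illustration): $(3,12) \in D$ because $12 = 3 \cdot 4$, while $(12,3) \notin D$ because $12$ does not divide $3$; similarly, $(2,3,5) \in S$ because $2 + 3 = 5$, while $(2,3,6) \notin S$. Note that the order of the entries matters.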
$2$-place relations are usually called binary relations.\index{relation, binary} \begin{defn} A set $X$ is {\em finite\/}\index{finite} if there is some $n \in \mathbb{N}$ such that $X$ has $n$ elements, and is {\em infinite\/}\index{infinite} otherwise. $X$ is {\em countable\/}\index{countable} if it is infinite and there is a 1-1 onto function $f : \mathbb{N} \to X$, and {\em uncountable\/}\index{uncountable} if it is infinite but not countable. \end{defn} Various infinite sets occur frequently in mathematics, such as $\mathbb{N}$\index{$\mathbb{N}$} (the natural numbers), $\mathbb{Q}$\index{$\mathbb{Q}$} (the rational numbers), and $\mathbb{R}$\index{$\mathbb{R}$} (the real numbers). Many of these are uncountable, such as $\mathbb{R}$. The basic facts about countable sets needed to do the problems are the following. \begin{prop} \begin{enumerate} \item If $X$ is a countable set and $Y \subseteq X$, then $Y$ is either finite or countable. \item Suppose $\mathbf{X} = \{\, X_n \mid n \in \mathbb{N} \,\}$ is a finite or countable family of sets such that each $X_n$ is either finite or countable. Then $\bigcup \mathbf{X}$ is also finite or countable. \item If $X$ is a non-empty finite or countable set, then $X^n$ is finite or countable for each $n \ge 1$. \item If $X$ is a non-empty finite or countable set, then the set of all finite sequences of elements of $X$, $X^{<\omega} = \bigcup_{n \in \mathbb{N}} X^n$, is countable. \end{enumerate} \end{prop} The properly sceptical reader will note that setting up propositional or first-order logic formally requires that we have some set theory in hand, but formalizing set theory itself requires one to have first-order logic.\footnote{Which came first, the chicken\index{chicken} or the egg\index{egg}? Since, biblically speaking, ``In the beginning was the Word'',\index{Word}\index{John} maybe we ought to plump for alphabetical order.
Which begs the question: In which alphabet?} % % Appendix to "A Problem Course in Mathematical Logic" % \chapter{The Greek Alphabet} \label{ap:greek} \index{Greek characters} \begin{center} \mbox{ \begin{tabular}{cccl} $\text{A}$ & $\alpha$ & & alpha \\ $\text{B}$ & $\beta$ & & beta \\ $\Gamma$ & $\gamma$ & & gamma \\ $\Delta$ & $\delta$ & & delta \\ $\text{E}$ & $\epsilon$ & $\varepsilon$ & epsilon \\ $\text{Z}$ & $\zeta$ & & zeta \\ $\text{H}$ & $\eta$ & & eta \\ $\Theta$ & $\theta$ & $\vartheta$ & theta \\ $\text{I}$ & $\iota$ & & iota \\ $\text{K}$ & $\kappa$ & & kappa \\ $\Lambda$ & $\lambda$ & & lambda \\ $\text{M}$ & $\mu$ & & mu \\ $\text{N}$ & $\nu$ & & nu \\ $\text{O}$ & $o$ & & omicron \\ $\Xi$ & $\xi$ & & xi \\ $\Pi$ & $\pi$ & $\varpi$ & pi \\ $\text{P}$ & $\rho$ & $\varrho$ & rho \\ $\Sigma$ & $\sigma$ & $\varsigma$ & sigma \\ $\text{T}$ & $\tau$ & & tau \\ $\Upsilon$ & $\upsilon$ & & upsilon \\ $\Phi$ & $\phi$ & $\varphi$ & phi \\ $\text{X}$ & $\chi$ & & chi \\ $\Psi$ & $\psi$ & & psi \\ $\Omega$ & $\omega$ & & omega \end{tabular} } \end{center} % % Appendix to "A Problem Course in Mathematical Logic" % \chapter{Logic Limericks} \label{ap:lim} \index{limericks} \begin{poem}{Deduction Theorem} \index{Deduction Theorem} A Theorem fine is Deduction,\\ For it allows work-reduction:\\ To show ``A implies B'',\\ Assume A and prove B;\\ Quite often a simpler production. \end{poem} \begin{poem}{Generalization Theorem} \index{Generalization Theorem} When in premiss the variable's bound,\\ To get a ``for all'' without wound,\\ Generalization.\\ For civilization\\ Could use some help for reasoning sound. \end{poem} \begin{poem}{Soundness Theorem} \index{Soundness Theorem} It's a critical logical creed:\\ Always check that it's safe to proceed. \\ To tell us deductions \\ Are truthful productions, \\ It's the Soundness of logic we need. \end{poem} \begin{poem}{Completeness Theorem} \index{Completeness Theorem} The Completeness of logics is G\"odel's. 
\\ 'Tis advice for looking for m\"odels: \\ They're always existent \\ For statements consistent, \\ Most helpful for logical lab\"ors. \end{poem} % % Appendix to "A Problem Course in Mathematical Logic" % \chapter{GNU Free Documentation License} \label{app:gnufdl} \begin{center} Version 1.2, November 2002\\ \begin{quotation} {\footnotesize Copyright \copyright 2000,2001,2002 Free Software Foundation, Inc.\\ \noindent 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA\\ \noindent Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. } \end{quotation} \end{center} \setcounter{section}{-1} \section{PREAMBLE} The purpose of this License is to make a manual, textbook, or other functional and useful document ``free'' in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others. This License is a kind of ``copyleft'', which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software. We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference. 
\section{APPLICABILITY AND DEFINITIONS} This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The ``Document'', below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as ``you''. You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law. A ``Modified Version'' of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language. A ``Secondary Section'' is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them. The ``Invariant Sections'' are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none. 
The ``Cover Texts'' are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words. A ``Transparent'' copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not ``Transparent'' is called ``Opaque''. Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only. The ``Title Page'' means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. 
For works in formats which do not have any title page as such, ``Title Page'' means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text. A section ``Entitled XYZ'' means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as ``Acknowledgements'', ``Dedications'', ``Endorsements'', or ``History''.) To ``Preserve the Title'' of such a section when you modify the Document means that it remains a section ``Entitled XYZ'' according to this definition. The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License. \section{VERBATIM COPYING} You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3. You may also lend copies, under the same conditions stated above, and you may publicly display copies. 
\section{COPYING IN QUANTITY} If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects. If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages. If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public. 
It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document. \section{MODIFICATIONS} You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version: \begin{itemize} \item[A.] Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission. \item[B.] List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement. \item[C.] State on the Title page the name of the publisher of the Modified Version, as the publisher. \item[D.] Preserve all the copyright notices of the Document. \item[E.] Add an appropriate copyright notice for your modifications adjacent to the other copyright notices. \item[F.] Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below. \item[G.] Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document's license notice. \item[H.] 
Include an unaltered copy of this License. \item[I.] Preserve the section Entitled ``History'', Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled ``History'' in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence. \item[J.] Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the ``History'' section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission. \item[K.] For any section Entitled ``Acknowledgements'' or ``Dedications'', Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein. \item[L.] Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles. \item[M.] Delete any section Entitled ``Endorsements''. Such a section may not be included in the Modified Version. \item[N.] Do not retitle any existing section to be Entitled ``Endorsements'' or to conflict in title with any Invariant Section. \item[O.] Preserve any Warranty Disclaimers. \end{itemize} If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant.
To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles. You may add a section Entitled ``Endorsements'', provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard. You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one. The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version. \section{COMBINING DOCUMENTS} You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers. The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. 
If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work. In the combination, you must combine any sections Entitled ``History'' in the various original documents, forming one section Entitled ``History''; likewise combine any sections Entitled ``Acknowledgements'', and any sections Entitled ``Dedications''. You must delete all sections Entitled ``Endorsements.'' \section{COLLECTIONS OF DOCUMENTS} You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects. You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document. \section{AGGREGATION WITH INDEPENDENT WORKS} A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an ``aggregate'' if the copyright resulting from the compilation is not used to limit the legal rights of the compilation's users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document's Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate. \section{TRANSLATION} Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail. If a section in the Document is Entitled ``Acknowledgements'', ``Dedications'', or ``History'', the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title. \section{TERMINATION} You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
\section{FUTURE REVISIONS OF THIS LICENSE} The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/. Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License ``or any later version'' applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. \backmatter % % Bibliography for "A Problem Course in Mathematical Logic" % \begin{thebibliography}{99} \bibitem{JB:HML} Jon Barwise (ed.), {\em Handbook of Mathematical Logic\/}, North Holland, Amsterdam, 1977, ISBN 0-7204-2285-X. \bibitem{BE:LPL} J. Barwise and J. Etchemendy, {\em Language, Proof and Logic\/}, Seven Bridges Press, New York, 2000, ISBN 1-889119-08-3. \bibitem{MB:LB} Merrie Bergman, James Moor, and Jack Nelson, {\em The Logic Book\/}, Random House, NY, 1980, ISBN 0-394-32323-8. \bibitem{CK:MT} C.C. Chang and H.J. Keisler, {\em Model Theory\/}, third ed., North Holland, Amsterdam, 1990. \bibitem{DA:CU} Martin Davis, {\em Computability and Unsolvability\/}, McGraw-Hill, New York, 1958; Dover, New York, 1982, ISBN 0-486-61471-9. \bibitem{DA:U} Martin Davis (ed.), {\em The Undecidable; Basic Papers On Undecidable Propositions, Unsolvable Problems And Computable Functions\/}, Raven Press, New York, 1965. \bibitem{HE:MIL} Herbert~B. Enderton, {\em A Mathematical Introduction to Logic\/}, Academic Press, New York, 1972. \bibitem{PH:NST} Paul~R. 
Halmos, {\em Naive Set Theory\/}, Undergraduate Texts in Mathematics, Springer-Verlag, New York, 1974, ISBN 0-387-90092-6. \bibitem{JvH:FFG} Jean van Heijenoort, {\em From Frege to G\"odel\/}, Harvard University Press, Cambridge, 1967, ISBN 0-674-32449-8. \bibitem{JH:OST} James~M. Henle, {\em An Outline of Set Theory\/}, Problem Books in Mathematics, Springer-Verlag, New York, 1986, ISBN 0-387-96368-5. \bibitem{DH:GEB} Douglas~R. Hofstadter, {\em G\"odel, Escher, Bach\/}, Random House, New York, 1979, ISBN 0-394-74502-7. \bibitem{JM:IML} Jerome Malitz, {\em Introduction to Mathematical Logic\/}, Springer-Verlag, New York, 1979, ISBN 0-387-90346-1. \bibitem{YM:CML} Yu.I. Manin, {\em A Course in Mathematical Logic\/}, Graduate Texts in Mathematics~53, Springer-Verlag, New York, 1977, ISBN 0-387-90243-0. \bibitem{RP:ENM} Roger Penrose, {\em The Emperor's New Mind\/}, Oxford University Press, Oxford, 1989. \bibitem{RP:SOTM} Roger Penrose, {\em Shadows of the Mind\/}, Oxford University Press, Oxford, 1994, ISBN 0-09-958211-2. \bibitem{TR:ONCF} T. Rado, {\em On non-computable functions\/}, Bell System Tech. J. {\bf 41} (1962), 877--884. \bibitem{RS:GIT} Raymond~M. Smullyan, {\em G\"odel's Incompleteness Theorems\/}, Oxford University Press, Oxford, 1992, ISBN 0-19-504672-2.
\end{thebibliography} % % Index for "A Problem Course in Mathematical Logic" % \begin{theindex} \item $($, 3, 24 \item $)$, 3, 24 \item $=$, 24, 25 \item $\cap$, 89, 133 \item $\cup$, 89, 133 \item $\exists$, 30 \item $\forall$, 24, 25, 30 \item $\fromto$, 5, 30 \item $\in$, 133 \item $\land$, 5, 30, 89 \item $\lnot P$, 89 \item $\lnot$, 3, 24, 25, 89 \item $\lor$, 5, 30, 89 \item $\models$, 10, 35, 37, 38 \item $\nmodels$, 10, 36, 37 \item $\prod$, 133 \item $\proves$, 12, 43 \item $\setminus$, 133 \item $\subseteq$, 133 \item $\times$, 133 \item $\to$, 3, 24, 25 \indexspace \item $\mathcal{A}$, 117 \item A1, 11, 42 \item A2, 11, 42 \item A3, 11, 42 \item A4, 42 \item A5, 42 \item A6, 42 \item A7, 42 \item A8, 43 \item $A_n$, 3 \item $\mathrm{Con}(\Sigma)$, 124 \item $\ulcorner \Delta \urcorner$, 115 \item $\mathrm{dom}(f)$, 81 \item $F$, 7 \item $f \colon \mathbb{N}^k \to \mathbb{N}$, 81 \item $\varphi(S^{m_1}0, \dots, S^{m_k}0)$, 118 \item $\varphi^x_t$, 42 \item $\mathcal{L}$, 24 \item $\mathcal{L}_1$, 26 \item $\mathcal{L}_=$, 26 \item $\mathcal{L}_F$, 26 \item $\mathcal{L}_G$, 53 \item $\mathcal{L}_N$, 26, 112 \item $\mathcal{L}_O$, 26 \item $\mathcal{L}_P$, 3 \item $\mathcal{L}_S$, 26 \item $\mathcal{L}_{NT}$, 25 \item $\mathfrak{M}$, 33 \item $\mathbb{N}$, 81, 134 \item $\mathfrak{N}$, 33, 112 \item N1, 117 \item N2, 117 \item N3, 117 \item N4, 117 \item N5, 117 \item N6, 117 \item N7, 117 \item N8, 117 \item N9, 117 \item $\mathbb{N}^k$, 81 \item $\mathbb{N}^k \setminus P$, 89 \item $\mathcal{P}$, 133 \item $P \cap Q$, 89 \item $P \cup Q$, 89 \item $P \land Q$, 89 \item $P \lor Q$, 89 \item $\pi^k_i$, 85, 120 \item $\mathbb{Q}$, 134 \item $\mathbb{R}$, 134 \item $\mathrm{ran}(f)$, 81 \item $R_n$, 55 \item $\mathcal{S}$, 6 \item $S^m0$, 118 \item $T$, 7 \item $\text{Th}$, 39, 45 \item $\mathrm{Th}(\Sigma)$, 112 \item $\mathrm{Th}(\mathfrak{N})$, 124 \item $v_n$, 24 \item $X^n$, 133 \item $[X]^k$, 133 \item $\Bar{Y}$, 133 \indexspace \item $\mathsc{A}$, 90 
\item $\alpha$, 90 \item $\mathsc{Code}_k$, 97, 106 \item $\mathsc{Comp}$, 98 \item $\mathsc{Comp}_M$, 96 \item $\mathsc{Conclusion}_\Delta$, 115 \item $\mathsc{Decode}$, 97, 106 \item $\mathsc{Deduction}_\Delta$, 115 \item $\mathsc{Diff}$, 83, 88 \item $\mathsc{Div}$, 90, 120 \item $\mathsc{Element}$, 90, 120 \item $\mathsc{Entry}$, 96 \item $\mathsc{Equal}$, 89 \item $\mathsc{Exp}$, 88 \item $\mathsc{Fact}$, 88 \item $\mathsc{Formulas}$, 115 \item $\mathsc{Formula}$, 115 \item $i_{\mathbb{N}}$, 82 \item $\mathsc{Inference}$, 115 \item $\mathsc{IsPrime}$, 90, 120 \item $\mathsc{Length}$, 90, 120 \item $\mathsc{Logical}$, 115 \item $\mathsc{Mult}$, 88 \item $\mathsc{O}$, 83, 85, 120 \item $\mathsc{Power}$, 90, 120 \item $\mathsc{Pred}$, 83, 88 \item $\mathsc{Premiss}_\Delta$, 115 \item $\mathsc{Prime}$, 90, 120 \item $\mathsc{SIM}$, 106, 107 \item $\mathsc{Sentence}$, 115 \item $\mathsc{Sim}$, 98 \item $\mathsc{Sim}_M$, 97 \item $\mathsc{Step}$, 97, 106 \item $\mathsc{Step}_M$, 96, 106 \item $\mathsc{Subseq}$, 90 \item $\mathsc{Sub}$, 123 \item $\mathsc{Sum}$, 83, 87 \item $\mathsc{S}$, 83, 85, 120 \item $\mathsc{TapePosSeq}$, 96, 106 \item $\mathsc{TapePos}$, 96, 105 \item $\mathsc{Term}$, 115 \indexspace \item abbreviations, 5, 30 \item Ackerman's Function, 90 \item all, x \item alphabet, 75 \item and, x, 5 \item assignment, 7, 34, 35 \subitem extended, 35 \subitem truth, 7 \item atomic formulas, 3, 27 \item axiom, 11, 28, 39 \subitem for basic arithmetic, 117 \subsubitem N1, 117 \subsubitem N2, 117 \subsubitem N3, 117 \subsubitem N4, 117 \subsubitem N5, 117 \subsubitem N6, 117 \subsubitem N7, 117 \subsubitem N8, 117 \subsubitem N9, 117 \subitem logical, 43 \subitem schema, 11, 42 \subsubitem A1, 11, 42 \subsubitem A2, 11, 42 \subsubitem A3, 11, 42 \subsubitem A4, 42 \subsubitem A5, 42 \subsubitem A6, 42 \subsubitem A7, 42 \subsubitem A8, 43 \indexspace \item blank cell, 67 \item blank tape, 67 \item bound variable, 29 \item bounded minimalization, 92 \item busy 
beaver competition, 83 \subitem $n$-state entry, 83 \subitem score in, 83 \indexspace \item cell, 67 \subitem blank, 67 \subitem marked, 67 \subitem scanned, 68 \item characteristic function, 82 \item chicken, 134 \item Church's Thesis, xi \item clique, 55 \item code \subitem G\"odel, 113 \subsubitem of sequences, 113 \subsubitem of symbols of $\mathcal{L}_N$, 113 \subitem of a sequence of tape positions, 96 \subitem of a tape position, 95 \subitem of a Turing machine, 97 \item Compactness Theorem, 16, 51 \subitem applications of, 53 \item complement, 133 \item complete set of sentences, 112 \item completeness, 112 \item Completeness Theorem, 16, 50, 137 \item composition, 85 \item computable \subitem function, 82 \subitem set of formulas, 115 \item computation, 71 \subitem partial, 71 \item connectives, 3, 4, 24 \item consistent, 15, 47 \subitem maximally, 15, 48 \item constant, 24, 25, 31, 33, 35 \item constant function, 85 \item contradiction, 9, 38 \item convention \subitem common symbols, 25 \subitem parentheses, 5, 30 \item countable, 134 \item crash, 70, 78 \item cross product, 133 \indexspace \item decision problem, x \item deduction, 12, 43 \item Deduction Theorem, 13, 44, 137 \item definable \subitem function, 125 \subitem relation, 125 \item domain (of a function), 81 \indexspace \item edge, 54 \item egg, 134 \item element, 133 \item elementary equivalence, 56 \item Entscheidungsproblem, x, 111 \item equality, 24, 25 \item equivalence \subitem elementary, 56 \item existential quantifier, 30 \item extension of a language, 30 \indexspace \item finite, 134 \item first-order \subitem language for number theory, 112 \subitem languages, 23 \subitem logic, x, 23 \item Fixed-Point Lemma, 123 \item for all, 25 \item formula, 3, 27 \subitem atomic, 3, 27 \subitem unique readability, 6, 32 \item free variable, 29 \item function, 24, 31, 33, 35 \subitem $k$-place, 24, 25, 81 \subitem bounded minimalization of, 92 \subitem composition of, 85 \subitem computable, 82 
\subitem constant, 85 \subitem definable in $\mathfrak{N}$, 125 \subitem domain of, 81 \subitem identity, 82 \subitem initial, 85 \subitem partial, 81 \subitem primitive recursion of, 87 \subitem primitive recursive, 88 \subitem projection, 85 \subitem recursive, x, 92 \subitem regular, 92 \subitem successor, 85 \subitem Turing computable, 82 \subitem unbounded minimalization of, 91 \subitem zero, 85 \indexspace \item G\"odel code \subitem of sequences, 113 \subitem of symbols of $\mathcal{L}_N$, 113 \item G\"odel Incompleteness Theorem, 111 \subitem First Incompleteness Theorem, 124 \subitem Second Incompleteness Theorem, 124 \item generalization, 42 \item Generalization Theorem, 45, 137 \item On Constants, 45 \item gothic characters, 33 \item graph, 54 \item Greek characters, 3, 28, 135 \indexspace \item halt, 70, 78 \item Halting Problem, 98 \item head, 67 \subitem multiple, 75 \subitem separate, 75 \indexspace \item identity function, 82 \item if \dots then, x, 3, 25 \item if and only if, 5 \item implies, 10, 38 \item Incompleteness Theorem, 111 \subitem G\"odel's First, 124 \subitem G\"odel's Second, 124 \item inconsistent, 15, 47 \item independent set, 55 \item inference rule, 11 \item infinite, 134 \item Infinite Ramsey's Theorem, 55 \item infinitesimal, 57 \item initial function, 85 \item input tape, 71 \item intersection, 133 \item isomorphism of structures, 55 \indexspace \item John, 134 \indexspace \item $k$-place function, 81 \item $k$-place relation, 81 \indexspace \item language, 26, 31 \subitem extension of, 30 \subitem first-order, 23 \subitem first-order number theory, 112 \subitem formal, ix \subitem natural, ix \subitem propositional, 3 \item limericks, 137 \item logic \subitem first-order, x, 23 \item mathematical, ix \item natural deductive, ix \item predicate, 3 \item propositional, x, 3 \item sentential, 3 \item logical axiom, 43 \indexspace \item machine, 69 \subitem Turing, xi, 67, 69 \item marked cell, 67 \item mathematical logic, ix \item 
maximally consistent, 15, 48 \item metalanguage, 31 \item metatheorem, 31 \item minimalization \subitem bounded, 92 \subitem unbounded, 91 \item model, 37 \item Modus Ponens, 11, 43 \item MP, 11, 43 \indexspace \item natural deductive logic, ix \item natural numbers, 81 \item non-standard model, 55, 57 \subitem of the real numbers, 57 \item not, x, 3, 25 \item $n$-state \subitem Turing machine, 69 \subitem entry in busy beaver competition, 83 \item number theory \subitem first-order language for, 112 \indexspace \item or, x, 5 \item output tape, 71 \indexspace \item parentheses, 3, 24 \subitem conventions, 5, 30 \subitem doing without, 4 \item partial \subitem computation, 71 \subitem function, 81 \item position \subitem tape, 68 \item power set, 133 \item predicate, 24, 25 \item predicate logic, 3 \item premiss, 12, 43 \item primitive recursion, 87 \item primitive recursive \subitem function, 88 \subitem recursive relation, 89 \item projection function, 85 \item proof, 12, 43 \item propositional logic, x, 3 \item proves, 12, 43 \item punctuation, 3, 25 \indexspace \item quantifier \subitem existential, 30 \subitem scope of, 30 \subitem universal, 24, 25, 30 \indexspace \item Ramsey number, 55 \item Ramsey's Theorem, 55 \subitem Infinite, 55 \item range of a function, 81 \item r.e., 99 \item recursion primitive, 87 \item recursive \subitem function, 92 \subitem functions, xi \subitem relation, 93 \subitem set of formulas, 115 \item recursively enumerable, 99 \subitem set of formulas, 115 \item regular function, 92 \item relation, 24, 31, 33 \item binary, 25, 134 \item characteristic function of, 82 \item definable in $\mathfrak{N}$, 125 \item $k$-place, 24, 25, 81, 133 \item primitive recursive, 89 \item recursive, 93 \item Turing computable, 93 \item represent (in $\mathrm{Th}(\Sigma)$) \subitem a function, 118 \subitem a relation, 119 \item representable (in $\mathrm{Th}(\Sigma)$) \subitem function, 118 \subitem relation, 119 \item rule of inference, 11, 43 
\indexspace \item satisfiable, 9, 37 \item satisfies, 9, 36, 37 \item scanned cell, 68 \item scanner, 67, 75 \item scope of a quantifier, 30 \item score \subitem in busy beaver competition, 83 \item sentence, 29 \item sentential logic, 3 \item sequence of tape positions \subitem code of, 96 \item set theory, 133 \item Soundness Theorem, 15, 47, 137 \item state, 68, 69 \item structure, 33 \item subformula, 6, 29 \item subgraph, 54 \item subset, 133 \item substitutable, 41 \item substitution, 41 \item successor \subitem function, 85 \subitem tape position, 71 \item symbols, 3, 24 \subitem logical, 24 \subitem non-logical, 24 \indexspace \item table \subitem of a Turing machine, 70 \item tape, 67 \subitem blank, 67 \subitem higher-dimensional, 75 \subitem input, 71 \subitem multiple, 75 \subitem output, 71 \subitem tape position, 68 \subsubitem code of, 95 \subsubitem code of a sequence of, 96 \subsubitem successor, 71 \subitem two-way infinite, 75, 78 \item Tarski's Undefinability Theorem, 125 \item tautology, 9, 38 \item term, 26, 31, 35 \item theorem, 31 \item theory, 39, 45 \subitem of $\mathfrak{N}$, 124 \subitem of a set of sentences, 112 \item there is, x \item TM, 69 \item true in a structure, 37 \item truth \subitem assignment, 7 \subitem in a structure, 36, 37 \subitem table, 8, 9 \subitem values, 7 \item Turing computable \subitem function, 82 \subitem relation, 93 \item Turing machine, xi, 67, 69 \subitem code of, 97 \subitem crash, 70 \subitem halt, 70 \subitem $n$-state, 69 \subitem table for, 70 \subitem universal, 95, 97 \item two-way infinite tape, 75, 78 \indexspace \item unary notation, 82 \item unbounded minimalization, 91 \item uncountable, 134 \item Undefinability Theorem, Tarski's, 125 \item union, 133 \item unique readability \subitem of formulas, 6, 32 \subitem of terms, 32 \item Unique Readability Theorem, 6, 32 \item universal \subitem quantifier, 30 \subitem Turing machine, 95, 97 \item universe (of a structure), 33 \item UTM, 95 
\indexspace \item variable, 24, 31, 34, 35 \subitem bound, 29 \subitem free, 29 \item vertex, 54 \indexspace \item witnesses, 48 \item Word, 134 \indexspace \item zero function, 85 \end{theindex} \end{document}

%&LaTeX %================================================================= % % This is the file pcml-16.tex, the source file for A Problem % Course in Mathematical Logic [Version 1.6] % % Copyright (c) 1994-2003 by Stefan Bilaniuk. % Permission is granted to copy, distribute and/or modify this % document under the terms of the GNU Free Documentation License, % Version 1.2 or any later version published by the Free Software % Foundation; with no Invariant Sections, no Front-Cover Texts, % and no Back-Cover Texts. A copy of the license is included in % the section entitled "GNU Free Documentation License". % %================================================================= % % Typeset using LaTeX with the AMS-LaTeX 1.2 and AMS-Fonts 2.1 % packages. If you have problems, please contact the author. % %================================================================= % % The latest version of A Problem Course in Mathematical Logic can % be found at: http://www.trentu.ca/mathematics/sb/pcml/ % %================================================================= % % Stefan Bilaniuk % Department of Mathematics office: 705 748-1011 x1474 % Trent University home: 705 755-1193 % Peterborough, Ontario % Canada K9J 7B8 e-mail: sbilaniuk@trentu.ca % % home page: http://www.trentu.ca/mathematics/sb/ % %================================================================= % If you change the document class or the size option, you may % need to recompile the index. 
\documentclass[12pt]{amsbook}
\usepackage{amssymb}

\newcommand{\proves}{\vdash}
\newcommand{\nproves}{\nvdash}
\newcommand{\nmodels}{\nvDash}
\newcommand{\restricted}{\!\upharpoonright\!}
\newcommand{\fromto}{\leftrightarrow}
\newcommand{\tover}[2]{\genfrac{}{}{0pt}{2}{#1}{#2}}
\renewcommand{\thepart}{\Roman{part}}
\renewcommand{\thechapter}{\arabic{chapter}}
\providecommand{\mathsc}[1]{\ensuremath{\text{\textsc{#1}}}}
\setcounter{tocdepth}{0}

\theoremstyle{plain}
\newtheorem{thm}{Theorem}[chapter]
\newtheorem{prop}[thm]{Proposition}
\newtheorem{cor}[thm]{Corollary}
\newtheorem{lem}[thm]{Lemma}
\newtheorem{prob}[thm]{Problem}

\theoremstyle{definition}
\newtheorem{defn}{Definition}[chapter]
\newtheorem{exmp}{Example}[chapter]

\theoremstyle{remark}
\newtheorem*{rem}{Remark}
%\renewcommand{\therem}{}
\newtheorem*{note}{Note}
%\renewcommand{\thenote}{}

\newenvironment{clue}[1]%
{\begin{proof}[\ref{#1}]}%
{\renewcommand{\qed}{}\end{proof}}
% Correct usage: \begin{clue}{}
%
% \end{clue}

\newenvironment{poem}[1]%
{\begin{verse} \mbox{}\\ {\bf #1}\\ \mbox{}\\}%
{\end{verse}}
% Correct usage: \begin{poem}{}
% <body of poem with lines separated by '\\'>
% \end{poem}

\newenvironment{question}[1]%
{\begin{proof}[#1]}%
{\renewcommand{\qed}{}\end{proof}}
% Correct usage: \begin{question}{<name of question>}
% <text of question>
% \end{question}

% Uncomment "\makeindex" if you need to recompile the index.
% (See The LaTeX Manual for information on how to do so.)
% The given index was modified by hand to get nice formatting...
%\makeindex

\begin{document}

\frontmatter

\title{ A Problem Course \\ in \\ Mathematical Logic \\ {\em Version 1.6\/} }
\author{Stefan Bilaniuk}
\date{????}
\address{Department of Mathematics\newline \indent Trent University\newline \indent Peterborough, Ontario\newline \indent Canada K9J 7B8}
\email{sbilaniuk@trentu.ca}
\thanks{}
\keywords{logic, computability, incompleteness}
\subjclass{03}

\begin{abstract}
This is a text for a problem-oriented course on mathematical logic and computability.\\ \\ \\ \\
Copyright \copyright\ 1994-2003 Stefan Bilaniuk.\\
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled ``GNU Free Documentation License''.\\ \\ \\ \\
This work was typeset with \LaTeX, using the \AmS-\LaTeX\ and \AmS Fonts packages of the American Mathematical Society.
\end{abstract}

\maketitle

\tableofcontents

%
% Preface to "A Problem Course in Mathematical Logic"
%
\chapter*{Preface}

This book is a free text intended to be the basis for a problem-oriented course(s) in mathematical logic and computability for students with some degree of mathematical sophistication. Parts I and II cover the basics of propositional and first-order logic respectively, Part III covers the basics of computability using Turing machines and recursive functions, and Part IV covers G\"odel's Incompleteness Theorems. They can be used in various ways for courses of various lengths and mixes of material. The author typically uses Parts I and II for a one-term course on mathematical logic, Part III for a one-term course on computability, and/or much of Part III together with Part IV for a one-term course on computability and incompleteness.
In keeping with the modified Moore-method, this book supplies definitions, problems, and statements of results, along with some explanations, examples, and hints. The intent is for the students, individually or in groups, to learn the material by solving the problems and proving the results for themselves. Besides constructive criticism, it will probably be necessary for the instructor to supply further hints or direct the students to other sources from time to time. Just how this text is used will, of course, depend on the instructor and students in question. However, it is probably {\em not\/} appropriate for a conventional lecture-based course nor for a really large class. The material presented in this text is somewhat stripped-down. Various concepts and topics that are often covered in introductory mathematical logic and computability courses are given very short shrift or omitted entirely.\footnote{Future versions of both volumes may include more -- or less! -- material. Feel free to send suggestions, corrections, criticisms, and the like --- I'll feel free to ignore them or use them.} Instructors might consider having students do projects on additional material if they wish to cover it. \subsection*{Prerequisites} The material in this text is largely self-contained, though some knowledge of (very basic) set theory and elementary number theory is assumed at several points. A few problems and examples draw on concepts from other parts of mathematics; students who are not already familiar with these should consult texts in the appropriate subjects for the necessary definitions. What is really needed to get anywhere with all of the material developed here is competence in handling abstraction and proofs, including proofs by induction. The experience provided by a rigorous introductory course in abstract algebra, analysis, or discrete mathematics ought to be sufficient.
\subsection*{Chapter Dependencies}
The following diagram indicates how the parts and chapters depend on one another, with the exception of a few isolated problems or subsections.

\vspace{5mm}

\begin{picture}(283,283)
\put(47,257){\framebox(21,21){1}}
\put(173,257){\framebox(21,21){10}}
\put(5,215){\framebox(21,21){2}}
\put(89,215){\framebox(21,21){3}}
\put(131,215){\framebox(21,21){11}}
\put(215,215){\framebox(21,21){12}}
\put(47,173){\framebox(21,21){4}}
\put(173,173){\framebox(21,21){13}}
\put(47,131){\framebox(21,21){5}}
\put(173,131){\framebox(21,21){14}}
\put(5,89){\framebox(21,21){6}}
\put(89,89){\framebox(21,21){7}}
\put(173,89){\framebox(21,21){15}}
\put(47,47){\framebox(21,21){8}}
\put(131,47){\framebox(21,21){16}}
\put(215,47){\framebox(21,21){17}}
\put(47,5){\framebox(21,21){9}}
\put(173,5){\framebox(21,21){18}}
\put(0,168){\dashbox(115,115)[lb]{I}}
\put(0,0){\dashbox(115,157)[lb]{II}}
\put(126,126){\dashbox(115,157)[lb]{III}}
\put(126,0){\dashbox(115,115)[lb]{IV}}
\put(58,257){\vector(-2,-1){42}} % 1 -> 2
\put(58,257){\vector(2,-1){42}} % 1 -> 3
\put(184,257){\vector(-2,-1){42}} % 10 -> 11
\put(184,257){\vector(2,-1){42}} % 10 -> 12
\put(16,215){\vector(2,-1){42}} % 2 -> 4
\put(100,215){\vector(-2,-1){42}} % 3 -> 4
\put(100,215){\vector(0,-1){105}} % 3 -> 7
\put(226,215){\vector(-2,-1){42}} % 12 -> 13
\put(184,173){\vector(0,-1){21}} % 13 -> 14
\put(58,131){\vector(-2,-1){42}} % 5 -> 6
\put(58,131){\vector(2,-1){42}} % 5 -> 7
\put(184,131){\vector(0,-1){21}} % 14 -> 15
\put(110,100){\vector(1,0){63}} % 7 -> 15
\put(16,89){\vector(2,-1){42}} % 6 -> 8
\put(100,89){\vector(-2,-1){42}} % 7 -> 8
\put(184,89){\vector(-2,-1){42}} % 15 -> 16
\put(184,89){\vector(2,-1){42}} % 15 -> 17
\put(58,47){\vector(0,-1){21}} % 8 -> 9
\put(142,47){\vector(2,-1){42}} % 16 -> 18
\put(226,47){\vector(-2,-1){42}} % 17 -> 18
\end{picture}

\subsection*{Acknowledgements}
Various people and institutions deserve some credit for this text.
Foremost are all the people who developed the subject, even though almost no attempt has been made to give due credit to those who developed and refined the ideas, results, and proofs mentioned in this work. In mitigation, it would often be difficult to assign credit fairly because many people were involved, frequently having interacted in complicated ways. Those interested in who did what should start by consulting other texts or reference works covering similar material. In particular, a number of the key papers in the development of modern mathematical logic can be found in \cite{JvH:FFG} and \cite{DA:U}. Others who should be acknowledged include my teachers and colleagues; my students at Trent University who suffered, suffer, and will suffer through assorted versions of this text; Trent University and the taxpayers of Ontario, who paid my salary; Ohio University, where I spent my sabbatical in 1995--96; all the people and organizations who developed the software and hardware with which this book was prepared. Gregory H. Moore, whose mathematical logic course convinced me that I wanted to do the stuff, deserves particular mention. Any blame properly accrues to the author. \subsection*{Availability} The URL of the home page for {\em A Problem Course In Mathematical Logic\/}, with links to \LaTeX, PostScript, and Portable Document Format (pdf) files of the latest available release is: \begin{itemize} \item[] {\tt http://euclid.trentu.ca/math/sb/pcml/} \end{itemize} Please note that to typeset the \LaTeX\ source files, you will need the \AmS-\LaTeX\ and \AmS Fonts packages in addition to \LaTeX. 
If you have any problems, feel free to contact the author for assistance, preferably by e-mail:
\begin{verse}
Stefan Bilaniuk\\
Department of Mathematics\\
Trent University\\
Peterborough, Ontario\\
K9J 7B8\\
{\em e-mail\/}: {\tt sbilaniuk@trentu.ca}\\
\end{verse}

\subsection*{Conditions}
See the {\em GNU Free Documentation License\/} in Appendix~\ref{app:gnufdl} for what you can do with this text. The gist is that you are free to copy, distribute, and use it unchanged, but there are some restrictions on what you can do if you wish to make changes. If you wish to use this text in a manner not covered by the {\em GNU Free Documentation License\/}, please contact the author.

\subsection*{Author's Opinion}
It's not great, but the price is right!

%
% Introduction to "A Problem Course in Mathematical Logic"
%
\chapter*{Introduction}

What sets mathematics apart from other disciplines is its reliance on proof as the principal technique for determining truth, where science, for example, relies on (carefully analyzed) experience. So what is a proof? Practically speaking, a proof is any reasoned argument accepted as such by other mathematicians.\footnote{If you are not a mathematician, gentle reader, you are hereby temporarily promoted.} A more precise definition is needed, however, if one wishes to discover what mathematical reasoning can -- or cannot -- accomplish in principle. This is one of the reasons for studying mathematical logic, which is also pursued for its own sake and in order to find new tools to use in the rest of mathematics and in related fields. In any case, mathematical logic \index{mathematical logic} \index{logic mathematical} is concerned with formalizing and analyzing the kinds of reasoning used in the rest of mathematics.
The point of mathematical logic is not to try to do mathematics {\em per se\/} completely formally --- the practical problems involved in doing so are usually such as to make this an exercise in frustration --- but to study formal logical systems as mathematical objects in their own right in order to (informally!) prove things about them. For this reason, the formal systems developed in this part and the next are optimized to be easy to prove things about, rather than to be easy to use. Natural deductive \index{logic natural deductive} \index{natural deductive logic} systems such as those developed by philosophers to formalize logical reasoning are equally capable in principle and much easier to actually use, but harder to prove things about. Part of the problem with formalizing mathematical reasoning is the necessity of precisely specifying the language(s) in which it is to be done. The natural languages\index{language natural} spoken by humans won't do: they are so complex and continually changing as to be impossible to pin down completely. By contrast,\index{language formal} the languages which underlie formal logical systems are, like programming languages, rigidly defined but much simpler and less flexible than natural languages. A formal logical system also requires the careful specification of the allowable rules of reasoning, plus some notion of how to interpret statements in the underlying language and determine their truth. The real fun lies in the relationship between interpretation of statements, truth, and reasoning. The {\em de facto\/} standard for formalizing mathematical systems is first-order logic, \index{logic first-order} \index{first-order logic} and the main thrust of this text is studying it with a view to understanding some of its basic features and limitations.
More specifically, Part I of this text is concerned with propositional logic\index{logic propositional}\index{propositional logic}, developed here as a warm-up for the development of first-order logic proper in Part II. Propositional logic \index{logic propositional} \index{propositional logic} attempts to make precise the relationships that certain connectives like {\em not\/},\index{not} {\em and\/},\index{and} {\em or\/},\index{or} and {\em if \dots then\/}\index{if \dots then} are used to express in English. While it has uses, propositional logic is not powerful enough to formalize most mathematical discourse. For one thing, it cannot handle the concepts expressed by the quantifiers {\em all\/}\index{all} and {\em there is\/}\index{there is}. First-order logic \index{logic first-order} \index{first-order logic} adds these notions to those propositional logic handles, and suffices, in principle, to formalize most mathematical reasoning. The greater flexibility and power of first-order logic make it a good deal more complicated to work with, both in syntax and semantics. However, a number of results about propositional logic carry over to first-order logic with little change. Given that first-order logic can be used to formalize most mathematical reasoning, it provides a natural context in which to ask whether such reasoning can be automated. This question is the {\em Entscheidungsproblem\/}\footnote{ {\em Entscheidungsproblem\/} $\equiv$ decision problem. \index{Entscheidungsproblem} \index{decision problem} }: \begin{question}{Entscheidungsproblem} \index{Entscheidungsproblem} Given a set $\Sigma$ of hypotheses and some statement $\varphi$, is there an effective method for determining whether or not the hypotheses in $\Sigma$ suffice to prove $\varphi$?
\end{question} Historically, this question arose out of David Hilbert's scheme to secure the foundations of mathematics by axiomatizing mathematics in first-order logic, showing that the axioms in question do not give rise to any contradictions, and that they suffice to prove or disprove every statement (which is where the Entscheidungsproblem comes in). If the answer to the Entscheidungsproblem were ``yes'' in general, the effective method(s) in question might put mathematicians out of business\dots Of course, the statement of the problem begs the question of what ``effective method'' is supposed to mean. In the course of trying to find a suitable formalization of the notion of ``effective method'', mathematicians developed several different abstract models of computation in the 1930's, including recursive functions, $\lambda$-calculus, Turing machines, and grammars\footnote{The development of the theory of computation thus actually began before the development of electronic digital computers. In fact, the computers and programming languages we use today owe much to the abstract models of computation which preceded them. For example, the standard von Neumann architecture for digital computers was inspired by Turing machines and the programming language LISP borrows much of its structure from $\lambda$-calculus.}. Although these models are very different from each other in spirit and formal definition, it turned out that they were all essentially equivalent in what they could do. This suggested the (empirical, not mathematical!) principle: \begin{question}{Church's Thesis} \index{Church's Thesis} A function is effectively computable in principle in the real world if and only if it is computable by (any) one of the abstract models mentioned above. 
\end{question} Part III explores two of the standard formalizations of the notion of ``effective method'', namely Turing machines \index{Turing machine}\index{machine Turing} and recursive functions\index{recursive functions}\index{functions recursive}, showing, among other things, that these two formalizations are actually equivalent. Part IV then uses the tools developed in Parts II and III to answer the Entscheidungsproblem for first-order logic. The answer to the general problem is negative, by the way, though decision procedures do exist for propositional logic, and for some particular first-order languages and sets of hypotheses in these languages. \subsection*{Prerequisites} In principle, not much is needed by way of prior mathematical knowledge to define and prove the basic facts about propositional logic and computability. Some knowledge of the natural numbers and a little set theory suffices; the former will be assumed and the latter is very briefly summarized in Appendix~\ref{ap:sets}. (\cite{JH:OST} is a good introduction to basic set theory in a style not unlike this book's; \cite{PH:NST} is a good one in a more conventional mode.) Competence in handling abstraction and proofs, especially proofs by induction, will be needed, however. In principle, the experience provided by a rigorous introductory course in algebra, analysis, or discrete mathematics ought to be sufficient. \subsection*{Other Sources and Further Reading} \cite{BE:LPL}, \cite{DA:CU}, \cite{HE:MIL}, \cite{JM:IML}, and \cite{YM:CML} are texts which go over large parts of the material covered here (and often much more besides), while \cite{JB:HML} and \cite{CK:MT} are good references for more advanced material. A number of the key papers in the development of modern mathematical logic and related topics can be found in \cite{JvH:FFG} and \cite{DA:U}. Entertaining accounts of some related topics may be found in \cite{DH:GEB}, \cite{RP:ENM}, and \cite{RP:SOTM}.
Those interested in natural deductive systems might try \cite{MB:LB}, which has a very clean presentation.

\mainmatter

%
% Part I of "A Problem Course in Mathematical Logic"
%
\part{Propositional Logic}

%
% First chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Language} \label{ch:one}

Propositional logic\index{propositional logic} \index{logic propositional} (sometimes called sentential logic\index{sentential logic}\index{logic sentential}, and not to be confused with predicate logic\index{predicate logic}\index{logic predicate}, which is another name for first-order logic) attempts to formalize the reasoning that can be done with connectives like {\em not\/}, {\em and\/}, {\em or\/}, and {\em if \dots then\/}. We will define the formal language of propositional logic\index{language propositional}, $\mathcal{L}_P$\index{$\mathcal{L}_P$}, by specifying its symbols and rules for assembling these symbols into the formulas of the language.

\begin{defn} \label{d:symb} \index{symbols}
The {\em symbols\/} of $\mathcal{L}_P$ are:
\begin{enumerate}
\item Parentheses: ( and ). \index{parentheses} \index{$($} \index{$)$}
\item Connectives: $\lnot$ and $\to$. \index{connectives} \index{$\lnot$} \index{$\to$}
\item Atomic formulas: $A_0$, $A_1$, $A_2$, \dots, $A_n$, \dots \index{formulas atomic} \index{atomic formulas} \index{$A_n$}
\end{enumerate}
\end{defn}

We still need to specify the ways in which the symbols of $\mathcal{L}_P$ can be put together.

\begin{defn} \label{d:form} \index{formula}
The {\em formulas\/} of $\mathcal{L}_P$ are those finite sequences or strings of the symbols given in Definition~\ref{d:symb} which satisfy the following rules:
\begin{enumerate}
\item Every atomic formula is a formula.
\item If $\alpha$ is a formula, then $(\lnot \alpha)$ is a formula.
\item If $\alpha$ and $\beta$ are formulas, then $(\alpha \to \beta)$ is a formula.
\item No other sequence of symbols is a formula.
\end{enumerate} \end{defn} We will often use lower-case Greek characters\index{Greek characters} to represent formulas, as we did in the definition above, and upper-case Greek characters to represent sets of formulas.\footnote{The Greek alphabet is given in Appendix~\ref{ap:greek}.} All formulas in Chapters~\ref{ch:one}--\ref{ch:four} will be assumed to be formulas of $\mathcal{L}_P$ unless stated otherwise. What do these definitions mean? The parentheses are just punctuation:\index{punctuation} their only purpose is to group other symbols together. (One could get by without them; see Problem~\ref{p:pn}.) $\lnot$\index{$\lnot$} and $\to$\index{$\to$} are supposed to represent the connectives\index{connectives} {\em not\/}\index{not} and {\em if \dots then\/}\index{if \dots then} respectively. The atomic formulas\index{atomic formulas}\index{formulas atomic}, $A_0$, $A_1$, \dots,\index{$A_n$} are meant to represent statements that cannot be broken down any further using our connectives, such as ``The moon is made of cheese.'' Thus, one might translate the English sentence ``If the moon is red, it is not made of cheese'' into the formula $(A_0 \to (\lnot A_1))$ of $\mathcal{L}_P$ by using $A_0$ to represent ``The moon is red'' and $A_1$ to represent ``The moon is made of cheese.'' Note that the truth of the formula depends on the interpretation of the atomic formulas which appear in it. Using the interpretations just given of $A_0$ and $A_1$, the formula $(A_0 \to (\lnot A_1))$ is true, but if we instead use $A_0$ and $A_1$ to represent ``My telephone is ringing'' and ``Someone is calling me'', respectively, $(A_0 \to (\lnot A_1))$ is false. Definition~\ref{d:form} says that every atomic formula is a formula and every other formula is built from shorter formulas using the connectives and parentheses in particular ways.
For example, $A_{1123}$, $(A_2 \to (\lnot A_0))$, and $(((\lnot A_1) \to (A_1 \to A_7) ) \to A_7)$ are all formulas, but $X_3$, $(A_5)$, $()\lnot A_{41}$, $A_5 \to A_7$, and $(A_2 \to (\lnot A_0)$ are not. \begin{prob} \label{p:one1} Why are the following {\em not\/} formulas of $\mathcal{L}_P$? There might be more than one reason\dots \begin{enumerate} \item $A_{-56}$ \item $(Y \to A)$ \item $(A_7 \leftarrow A_4)$ \item $A_7 \to (\lnot A_5))$ \item $(A_8 A_9 \to A_{1043998}$ \item $(((\lnot A_1) \to (A_\ell \to A_7) \to A_7)$ \end{enumerate} \end{prob} \begin{prob} \label{p:lrp} Show that every formula of $\mathcal{L}_P$ has the same number of left parentheses as it has of right parentheses. \end{prob} \begin{prob} \label{p:one3} Suppose $\alpha$ is any formula of $\mathcal{L}_P$. Let $\ell(\alpha)$ be the length of $\alpha$ as a sequence of symbols and let $p(\alpha)$ be the number of parentheses (counting both left and right parentheses) in $\alpha$. What are the minimum and maximum values of $p(\alpha) / \ell(\alpha)$? \end{prob} \begin{prob} \label{p:one4} Suppose $\alpha$ is any formula of $\mathcal{L}_P$. Let $s(\alpha)$ be the number of atomic formulas in $\alpha$ (counting repetitions) and let $c(\alpha)$ be the number of occurrences of $\to$ in $\alpha$. Show that $s(\alpha) = c(\alpha) + 1$. \end{prob} \begin{prob} \label{p:lof} What are the possible lengths of formulas of $\mathcal{L}_P$? Prove it. \end{prob} \begin{prob} \label{p:pn} \index{parentheses doing without} Find a way of doing without parentheses or other punctuation symbols in defining a formal language for propositional logic. \end{prob} \begin{prop} \label{p:foc} The set of formulas of $\mathcal{L}_P$ is countable. \end{prop} \subsection*{Informal Conventions} At first glance, $\mathcal{L}_P$ may not seem capable of breaking down English sentences with connectives other than {\em not\/} and {\em if \dots then\/}.
However, the sense of many other connectives\index{connectives} can be captured by these two by using suitable circumlocutions. We will use the symbols $\land$\index{$\land$}, $\lor$\index{$\lor$}, and $\fromto$\index{$\fromto$} to represent {\em and\/}\index{and}, {\em or\/}\index{or},\footnote{We will use {\em or\/} inclusively, so that ``$A$ or $B$'' is still true if both of $A$ and $B$ are true.} and {\em if and only if\/}\index{if and only if} respectively. Since they are not among the symbols of $\mathcal{L}_P$, we will use them as abbreviations\index{abbreviations} for certain constructions involving only $\lnot$ and $\to$. Namely, \begin{itemize} \item $(\alpha \land \beta)$ is short for $(\lnot (\alpha \to (\lnot \beta)))$, \item $(\alpha \lor \beta)$ is short for $( (\lnot \alpha) \to \beta)$, and \item $(\alpha \fromto \beta)$ is short for $((\alpha \to \beta) \land (\beta \to \alpha))$. \end{itemize} Interpreting $A_0$ and $A_1$ as before, for example, one could translate the English sentence ``The moon is red and made of cheese'' as $(A_0 \land A_1)$. (Of course this is really $(\lnot (A_0 \to (\lnot A_1)))$, {\em i.e.\/} ``It is not the case that if the moon is red, it is not made of cheese.'') $\land$, $\lor$, and $\fromto$ were not included among the official symbols of $\mathcal{L}_P$ partly because we can get by without them and partly because leaving them out makes it easier to prove things about $\mathcal{L}_P$. \begin{prob} \label{p:one8} Take a couple of English sentences with several connectives and translate them into formulas of $\mathcal{L}_P$. You may use $\land$, $\lor$, and $\fromto$ if appropriate. \end{prob} \begin{prob} \label{p:one9} Write out $((\alpha \lor \beta) \land (\beta \to \alpha))$ using only $\lnot$ and $\to$.
\end{prob} For the sake of readability, we will occasionally use some informal conventions that let us get away with writing fewer parentheses:\index{parentheses conventions}\index{conventions, parentheses} \begin{itemize} \item We will usually drop the outermost parentheses in a formula, writing $\alpha \to \beta$ instead of $(\alpha \to \beta)$ and $\lnot \alpha$ instead of $(\lnot \alpha)$. \item We will let $\lnot$ take precedence over $\to$ when parentheses are missing, so $\lnot \alpha \to \beta$ is short for $((\lnot\alpha) \to \beta)$, and fit the informal connectives into this scheme by letting the order of precedence be $\lnot$, $\land$, $\lor$, $\to$, and $\fromto$. \item Finally, we will group repetitions of $\to$, $\lor$, $\land$, or $\fromto$ to the right when parentheses are missing, so $\alpha \to \beta \to \gamma$ is short for $(\alpha \to (\beta \to \gamma))$. \end{itemize} Just like formulas using $\lor$, $\land$, or $\fromto$, formulas in which parentheses have been omitted as above are not official formulas of $\mathcal{L}_P$; rather, they are convenient abbreviations for official formulas of $\mathcal{L}_P$. Note that a precedent for the precedence convention can be found in the way that $\cdot$ commonly takes precedence over $+$ in writing arithmetic formulas. \begin{prob} \label{p:one10} Write out $\lnot (\alpha \fromto \lnot \delta ) \land \beta \to \lnot \alpha \to \gamma$ first with the missing parentheses included and then as an official formula of $\mathcal{L}_P$. \end{prob} The following notion will be needed later on. \begin{defn} \label{d:subf} \index{subformula} \index{$\mathcal{S}$} Suppose $\varphi$ is a formula of $\mathcal{L}_P$. The set of {\em subformulas\/} of $\varphi$, $\mathcal{S}(\varphi)$, is defined as follows. \begin{enumerate} \item If $\varphi$ is an atomic formula, then $\mathcal{S}(\varphi) = \{ \varphi \}$. \item If $\varphi$ is $(\lnot \alpha)$, then $\mathcal{S}(\varphi) = \mathcal{S}(\alpha) \cup \{ (\lnot\alpha) \}$.
\item If $\varphi$ is $(\alpha \to \beta)$, then $\mathcal{S}(\varphi) = \mathcal{S}(\alpha) \cup \mathcal{S}(\beta) \cup \{ (\alpha \to \beta) \}$. \end{enumerate} \end{defn} For example, if $\varphi$ is $(((\lnot A_1) \to A_7) \to (A_8 \to A_1))$, then $\mathcal{S}(\varphi)$ includes $A_1$, $A_7$, $A_8$, $(\lnot A_1)$, $(A_8 \to A_1)$, $((\lnot A_1) \to A_7)$, and $(((\lnot A_1) \to A_7) \to (A_8 \to A_1))$ itself. Note that if you write out a formula with all the official parentheses, then the subformulas are just the parts of the formula enclosed by matching parentheses, plus the atomic formulas. In particular, every formula is a subformula of itself. Note that some subformulas of formulas involving our informal abbreviations $\lor$, $\land$, or $\fromto$ will be most conveniently written using these abbreviations. For example, if $\psi$ is $A_4 \to A_1 \lor A_4$, then \[ \mathcal{S}(\psi) = \{\, A_1,\, A_4,\, (\lnot A_1),\, (A_1 \lor A_4),\, (A_4 \to (A_1 \lor A_4)) \,\}\, . \] (As an exercise, where did $(\lnot A_1)$ come from?) \begin{prob} \label{p:one11} Find all the subformulas of each of the following formulas. \begin{enumerate} \item $(\lnot ((\lnot A_{56}) \to A_{56}))$ \item $A_9 \to A_8 \to \lnot (A_{78} \to \lnot \lnot A_0)$ \item $\lnot A_0 \land \lnot A_1 \fromto \lnot (A_0 \lor A_1)$ \end{enumerate} \end{prob} \subsection*{Unique Readability} The slightly paranoid --- er, truly rigorous --- might ask whether Definitions \ref{d:symb} and \ref{d:form} actually ensure that the formulas of $\mathcal{L}_P$ are unambiguous, {\em i.e.\/} can be read in only one way according to the rules given in Definition \ref{d:form}. To actually prove this one must add to Definition \ref{d:symb} the requirement that all the symbols of $\mathcal{L}_P$ are distinct and that no symbol is a subsequence of any other symbol. 
With this addition, one can prove the following: \begin{thm}[Unique Readability Theorem] \label{t:ur} \index{Unique Readability Theorem} \index{formula unique readability} \index{unique readability of formulas} A formula of $\mathcal{L}_P$ must satisfy exactly one of conditions 1--3 in Definition \ref{d:form}. \end{thm} % % Second chapter of "A Problem Course in Mathematical Logic" % \chapter{Truth Assignments} \label{ch:two} Whether a given formula $\varphi$ of $\mathcal{L}_P$ is true or false usually depends on how we interpret the atomic formulas which appear in $\varphi$. For example, if $\varphi$ is the atomic formula $A_2$ and we interpret it as ``$2 + 2 = 4$'', it is true, but if we interpret it as ``The moon is made of cheese'', it is false. Since we don't want to commit ourselves to a single interpretation --- after all, we're really interested in general logical relationships --- we will define how any assignment of {\em truth values\/}\index{truth values} $T$\index{$T$} (``true'') and $F$\index{$F$} (``false'') to atomic formulas of $\mathcal{L}_P$ can be extended to all other formulas. We will also get a reasonable definition of what it means for a formula of $\mathcal{L}_P$ to follow logically from other formulas. \begin{defn} \label{d:tras} \index{truth assignment} \index{assignment truth} A {\em truth assignment\/} is a function $v$ whose domain is the set of all formulas of $\mathcal{L}_P$ and whose range is the set $\{ T, F \}$ of truth values, such that: \begin{enumerate} \item $v(A_n)$ is defined for every atomic formula $A_n$. 
\item For any formula $\alpha$, \begin{displaymath} v(\, (\lnot\alpha)\, ) = \begin{cases} T & \text{if $v(\alpha) = F$} \\ F & \text{if $v(\alpha) = T$.} \end{cases} \end{displaymath} \item For any formulas $\alpha$ and $\beta$, \begin{displaymath} v(\, (\alpha \to \beta)\, ) = \begin{cases} F & \text{if $v(\alpha)=T$ and $v(\beta)=F$} \\ T & \text{otherwise.} \end{cases} \end{displaymath} \end{enumerate} \end{defn} Given interpretations of all the atomic formulas of $\mathcal{L}_P$, the corresponding truth assignment would give each atomic formula representing a true statement the value $T$ and every atomic formula representing a false statement the value $F$. Note that we have not defined how to handle any truth values besides $T$ and $F$ in $\mathcal{L}_P$. Logics with other truth values have uses, but are not relevant in most of mathematics. For an example of how non-atomic formulas are given truth values on the basis of the truth values given to their components, suppose $v$ is a truth assignment such that $v(A_0) = T$ and $v(A_1) = F$. Then $v(\, ((\lnot A_1) \to (A_0 \to A_1))\, )$ is determined from $v(\, (\lnot A_1)\, )$ and $v(\, (A_0 \to A_1)\, )$ according to clause 3 of Definition~\ref{d:tras}. In turn, $v(\, (\lnot A_1)\, )$ is determined from $v(A_1)$ according to clause 2 and $v(\, (A_0 \to A_1)\, )$ is determined from $v(A_1)$ and $v(A_0)$ according to clause 3. Finally, by clause 1, our truth assignment must be defined for all atomic formulas to begin with; in this case, $v(A_0) = T$ and $v(A_1) = F$. Thus $v(\, (\lnot A_1)\, ) = T$ and $v(\, (A_0 \to A_1)\, ) = F$, so $v(\, ((\lnot A_1) \to (A_0 \to A_1))\, ) = F$. A convenient way to write out the determination of the truth value of a formula on a given truth assignment is to use a {\em truth table\/}\index{truth table}: list all the subformulas of the given formula across the top in order of length and then fill in their truth values on the bottom from left to right.
Except for the atomic formulas at the extreme left, the truth value of each subformula will depend on the truth values of the subformulas to its left. For the example above, one gets something like: \[\begin{array}{c|c|c|c|c} A_0 & A_1 & (\lnot A_1) & (A_0 \to A_1) & ((\lnot A_1) \to (A_0 \to A_1)) \\ \hline T & F & T & F & F \end{array}\] \begin{prob} \label{p:two1} Suppose $v$ is a truth assignment such that $v(A_0) = v(A_2) = T$ and $v(A_1) = v(A_3) = F$. Find $v(\alpha)$ if $\alpha$ is: \begin{enumerate} \item $\lnot A_2 \to \lnot A_3$ \item $\lnot A_2 \to A_3$ \item $\lnot ( \lnot A_0 \to A_1)$ \item $A_0 \lor A_1$ \item $A_0 \land A_1$ \end{enumerate} \end{prob} The use of finite truth tables to determine what truth value a particular truth assignment gives a particular formula is justified by the following proposition, which asserts that only the truth values of the atomic formulas in the formula matter. \begin{prop} \label{p:tav} Suppose $\delta$ is any formula and $u$ and $v$ are truth assignments such that $u(A_n) = v(A_n)$ for all atomic formulas $A_n$ which occur in $\delta$. Then $u(\delta) = v(\delta)$. \end{prop} \begin{cor} \label{c:tav} Suppose $u$ and $v$ are truth assignments such that $u(A_n) = v(A_n)$ for every atomic formula $A_n$. Then $u = v$, {\em i.e.\/} $u(\varphi) = v(\varphi)$ for every formula $\varphi$. \end{cor} \begin{prop} \label{p:tif} If $\alpha$ and $\beta$ are formulas and $v$ is a truth assignment, then: \begin{enumerate} \item $v(\lnot \alpha) = T$ if and only if $v(\alpha) = F$; \item $v(\alpha \to \beta) = T$ if and only if $v(\beta) = T$ whenever $v(\alpha) = T$; \item $v(\alpha \land \beta) = T$ if and only if $v(\alpha) = T$ and $v(\beta) = T$; \item $v(\alpha \lor \beta) = T$ if and only if $v(\alpha) = T$ or $v(\beta) = T$; and \item $v(\alpha \fromto \beta) = T$ if and only if $v(\alpha) = v(\beta)$.
\end{enumerate} \end{prop} Truth tables\index{truth table} are often used even when the formula in question is not broken down all the way into atomic formulas. For example, if $\alpha$ and $\beta$ are any formulas and we know that $\alpha$ is true but $\beta$ is false, then the truth of $(\alpha \to (\lnot \beta))$ can be determined by means of the following table: \[ \begin{array}{c|c|c|c} \alpha & \beta & (\lnot \beta) & (\alpha \to (\lnot \beta)) \\ \hline T & F & T & T \end{array} \] \begin{defn} If $v$ is a truth assignment and $\varphi$ is a formula, we will often say that $v$ {\em satisfies\/}\index{satisfies} $\varphi$ if $v(\varphi) = T$. Similarly, if $\Sigma$ is a set of formulas, we will often say that $v$ satisfies $\Sigma$ if $v(\sigma) = T$ for every $\sigma \in \Sigma$. We will say that $\varphi$ (respectively, $\Sigma$) is {\em satisfiable\/}\index{satisfiable} if there is some truth assignment which satisfies it. \end{defn} \begin{defn} \label{d:taco} \index{tautology} \index{contradiction} A formula $\varphi$ is a {\em tautology\/} if it is satisfied by every truth assignment. A formula $\psi$ is a {\em contradiction\/} if there is no truth assignment which satisfies it. \end{defn} For example, $(A_4 \to A_4)$ is a tautology while $(\lnot (A_4 \to A_4))$ is a contradiction, and $A_4$ is a formula which is neither. One can check whether a given formula is a tautology, contradiction, or neither, by grinding out a complete truth table\index{truth table} for it, with a separate line for each possible assignment of truth values to the atomic subformulas of the formula. For $A_3 \to (A_4 \to A_3)$ this gives \index{truth table} \[\begin{array}{c|c|c|c} A_3 & A_4 & A_4 \to A_3 & A_3 \to (A_4 \to A_3) \\ \hline T & T & T & T \\ T & F & T & T \\ F & T & F & T \\ F & F & T & T \end{array}\] so $A_3 \to (A_4 \to A_3)$ is a tautology. 
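For comparison, here is a further worked illustration (added as a sketch, not one of the numbered results) of a formula whose complete truth table has both $T$ and $F$ in its last column. For $(A_0 \to A_1) \to A_0$ one gets \[\begin{array}{c|c|c|c} A_0 & A_1 & A_0 \to A_1 & (A_0 \to A_1) \to A_0 \\ \hline T & T & T & T \\ T & F & F & T \\ F & T & T & F \\ F & F & T & F \end{array}\] The first two lines show that this formula is satisfiable, hence not a contradiction, and the last two lines show that it is not a tautology.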
Note that, by Proposition \ref{p:tav}, we need only consider the possible truth values of the atomic sentences which actually occur in a given formula. One can often use truth tables\index{truth table} to determine whether a given formula is a tautology or a contradiction even when it is not broken down all the way into atomic formulas. For example, if $\alpha$ is any formula, then the table \[\begin{array}{c|c|c} \alpha & (\alpha \to \alpha) & (\lnot (\alpha \to \alpha)) \\ \hline T & T & F \\ F & T & F \end{array}\] demonstrates that $(\lnot (\alpha \to \alpha))$ is a contradiction, no matter which formula of $\mathcal{L}_P$ $\alpha$ actually is. \begin{prop} \label{p:two5} If $\alpha$ is any formula, then $((\lnot \alpha) \lor \alpha)$ is a tautology and $((\lnot \alpha) \land \alpha)$ is a contradiction. \end{prop} \begin{prop} \label{p:two6} A formula $\beta$ is a tautology if and only if $\lnot \beta$ is a contradiction. \end{prop} After all this warmup, we are finally in a position to define what it means for one formula to follow logically from other formulas. \begin{defn} \label{d:imp} \index{implies} \index{$\models$} \index{$\nmodels$} A set of formulas $\Sigma$ {\em implies\/} a formula $\varphi$, written as $\Sigma \models \varphi$, if every truth assignment $v$ which satisfies $\Sigma$ also satisfies $\varphi$. We will often write $\Sigma \nmodels \varphi$ if it is not the case that $\Sigma \models \varphi$. In the case where $\Sigma$ is empty, we will usually write $\models \varphi$ instead of $\emptyset \models \varphi$. Similarly, if $\Delta$ and $\Gamma$ are sets of formulas, then $\Delta$ {\em implies\/} $\Gamma$, written as $\Delta \models \Gamma$, if every truth assignment $v$ which satisfies $\Delta$ also satisfies $\Gamma$. \end{defn} For example, $\{\, A_3 ,\, (A_3 \to \lnot A_7) \,\} \models \lnot A_7$, but $\{\, A_8 ,\, (A_5 \to A_8) \,\} \nmodels A_5$. 
(There is a truth assignment which makes $A_8$ and $A_5 \to A_8$ true, but $A_5$ false.) Note that a formula $\varphi$ is a tautology if and only if $\models \varphi$, and a contradiction if and only if $\models (\lnot \varphi)$. \begin{prop} \label{p:two6a} If $\Gamma$ and $\Sigma$ are sets of formulas such that $\Gamma \subseteq \Sigma$, then $\Sigma \models \Gamma$. \end{prop} \begin{prob} \label{p:two7} How can one check whether or not $\Sigma \models \varphi$ for a formula $\varphi$ and a finite set of formulas $\Sigma$? \end{prob} \begin{prop} \label{p:moto} Suppose $\Sigma$ is a set of formulas and $\psi$ and $\rho$ are formulas. Then $\Sigma \cup \{\psi\} \models \rho$ if and only if $\Sigma \models \psi \to \rho$. \end{prop} \begin{prop} \label{p:sanc} A set of formulas $\Sigma$ is satisfiable if and only if there is no contradiction $\chi$ such that $\Sigma \models \chi$. \end{prop} % % Third chapter of "A Problem Course in Mathematical Logic" % \chapter{Deductions} \label{ch:three} In this chapter we develop a way of defining logical implication that does not rely on any notion of truth, but only on manipulating sequences of formulas, namely formal proofs or deductions. (Of course, any way of defining logical implication had better be compatible with that given in Chapter \ref{ch:two}.) To define these, we first specify a suitable set of formulas which we can use freely as premisses in deductions. \begin{defn} \index{axiom schema} \index{axiom} \index{A1} \index{A2} \index{A3} The three {\em axiom schema\/} of $\mathcal{L}_P$ are: \begin{description} \item[A1] $(\alpha \to (\beta \to \alpha))$ \item[A2] $((\alpha \to (\beta \to \gamma)) \to ((\alpha \to \beta) \to (\alpha \to \gamma)))$ \item[A3] $(((\lnot\beta)\to (\lnot\alpha)) \to ( ((\lnot\beta) \to \alpha) \to \beta ) )$. 
\end{description} Replacing $\alpha$, $\beta$, and $\gamma$ by particular formulas of $\mathcal{L}_P$ in any one of the schemas A1, A2, or A3 gives an {\em axiom\/} of $\mathcal{L}_P$. \end{defn} For example, $(A_1 \to (A_4 \to A_1))$ is an axiom, being an instance of axiom schema A1, but $(A_9 \to (\lnot A_0))$ is not an axiom as it is not an instance of any of the schemas. As had better be the case, every axiom is always true: \begin{prop} \label{p:axta} Every axiom of $\mathcal{L}_P$ is a tautology. \end{prop} Second, we specify our one (and only!) rule of inference\index{rule of inference}\index{inference rule}.\footnote{Natural deductive systems, which are usually more convenient to actually execute deductions in than the system being developed here, compensate for having few or no axioms by having many rules of inference.} \begin{defn}[Modus Ponens] \index{Modus Ponens} Given the formulas $\varphi$ and $(\varphi \to \psi)$, one may infer $\psi$. \end{defn} We will usually refer to Modus Ponens by its initials, MP.\index{MP} Like any rule of inference worth its salt, MP preserves truth. \begin{prop} \label{p:snd} Suppose $\varphi$ and $\psi$ are formulas. Then $\{\, \varphi ,\, (\varphi \to \psi) \,\} \models \psi$. \end{prop} With axioms and a rule of inference in hand, we can execute formal proofs in $\mathcal{L}_P$. \begin{defn} \label{d:ded} \index{deduction} \index{proof} Let $\Sigma$ be a set of formulas. A {\em deduction\/} or {\em proof\/} from $\Sigma$ in $\mathcal{L}_P$ is a finite sequence $\varphi_1 \varphi_2 \dots \varphi_n$ of formulas such that for each $k \le n$, \begin{enumerate} \item $\varphi_k$ is an axiom, or \item $\varphi_k \in \Sigma$, or \item there are $i,j < k$ such that $\varphi_k$ follows from $\varphi_i$ and $\varphi_j$ by MP. \end{enumerate} A formula of $\Sigma$ appearing in the deduction is called a {\em premiss\/}\index{premiss}.
$\Sigma$ {\em proves\/}\index{proves} a formula $\alpha$, written as $\Sigma \proves \alpha$,\index{$\proves$} if $\alpha$ is the last formula of a deduction from $\Sigma$. We'll usually write $\proves \alpha$ for $\emptyset \proves \alpha$, and take $\Sigma \proves \Delta$ to mean that $\Sigma \proves \delta$ for every formula $\delta \in \Delta$. \end{defn} In order to make it easier to verify that an alleged deduction really is one, we will number the formulas in a deduction, write them out in order on separate lines, and give a justification for each formula. Like the additional connectives and conventions for dropping parentheses in Chapter \ref{ch:one}, this is not officially a part of the definition of a deduction. \begin{exmp} \label{e:one} Let us show that $\proves \varphi \to \varphi$. \begin{enumerate} \item $(\varphi \to ((\varphi \to \varphi) \to \varphi)) \to ((\varphi \to (\varphi \to \varphi)) \to (\varphi \to \varphi))$ \hfill A2 \item $\varphi \to ((\varphi \to \varphi) \to \varphi)$ \hfill A1 \item $(\varphi \to (\varphi \to \varphi)) \to (\varphi \to \varphi)$ \hfill 1,2 MP \item $\varphi \to (\varphi \to \varphi)$ \hfill A1 \item $\varphi \to \varphi$ \hfill 3,4 MP \end{enumerate} Hence $\proves \varphi \to \varphi$, as desired. Note the indication, beside each mention of MP, of the formulas from which formulas 3 and 5 follow. \end{exmp} \begin{exmp} \label{e:two} Let us show that $\{\, \alpha \to \beta,\, \beta \to \gamma \,\} \proves \alpha \to \gamma$.
\begin{enumerate} \item $(\beta \to \gamma) \to (\alpha \to (\beta \to \gamma))$ \hfill A1 \item $\beta \to \gamma$ \hfill Premiss \item $\alpha \to (\beta \to \gamma)$ \hfill 1,2 MP \item $(\alpha \to (\beta \to \gamma)) \to ((\alpha \to \beta) \to (\alpha \to \gamma))$ \hfill A2 \item $(\alpha \to \beta) \to (\alpha \to \gamma)$ \hfill 4,3 MP \item $\alpha \to \beta$ \hfill Premiss \item $\alpha \to \gamma$ \hfill 5,6 MP \end{enumerate} Hence $\{\, \alpha \to \beta,\, \beta \to \gamma \,\} \proves \alpha \to \gamma$, as desired. \end{exmp} It is frequently convenient to save time and effort by simply referring to a deduction one has already done instead of writing it again as part of another deduction. If you do so, please make sure you appeal only to deductions that have already been carried out. \begin{exmp} \label{e:three} Let us show that $\proves (\lnot \alpha \to \alpha) \to \alpha$. \begin{enumerate} \item $( \lnot \alpha \to \lnot \alpha) \to (( \lnot \alpha \to \alpha ) \to \alpha )$ \hfill A3 \item $\lnot\alpha \to \lnot\alpha$ \hfill Example~\ref{e:one} \item $(\lnot \alpha \to \alpha) \to \alpha$ \hfill 1,2 MP \end{enumerate} Hence $\proves (\lnot \alpha \to \alpha) \to \alpha$, as desired. To be completely formal, one would have to insert the deduction given in Example \ref{e:one} (with $\varphi$ replaced by $\lnot \alpha$ throughout) in place of line 2 above and renumber the old line 3. \end{exmp} \begin{prob} \label{p:ded} Show that if $\alpha$, $\beta$, and $\gamma$ are formulas, then \begin{enumerate} \item $\{\, \alpha \to (\beta \to \gamma),\, \beta\,\} \proves \alpha \to \gamma$ \item $\proves \alpha \lor \lnot \alpha$ \end{enumerate} \end{prob} \begin{exmp} \label{e:four} Let us show that $\proves \lnot\lnot \beta \to \beta$. 
\begin{enumerate} \item $(\lnot\beta \to \lnot\lnot\beta) \to ((\lnot\beta \to \lnot\beta) \to \beta)$ \hfill A3 \item $\lnot\lnot\beta \to (\lnot\beta \to \lnot\lnot\beta)$ \hfill A1 \item $\lnot\lnot\beta \to ((\lnot\beta \to \lnot\beta) \to \beta)$ \hfill 1,2 Example~\ref{e:two} \item $\lnot\beta \to \lnot\beta$ \hfill Example~\ref{e:one} \item $\lnot\lnot \beta \to \beta$ \hfill 3,4 Problem~\ref{p:ded}.1 \end{enumerate} Hence $\proves \lnot\lnot \beta \to \beta$, as desired. \end{exmp} Certain general facts are sometimes handy: \begin{prop} \label{p:three3a} If $\varphi_1 \varphi_2 \dots \varphi_n$ is a deduction of $\mathcal{L}_P$, then $\varphi_1 \dots \varphi_\ell$ is also a deduction of $\mathcal{L}_P$ for any $\ell$ such that $1 \le \ell \le n$. \end{prop} \begin{prop} \label{p:dmp} If $\Gamma \proves \delta$ and $\Gamma \proves \delta \to \beta$, then $\Gamma \proves \beta$. \end{prop} \begin{prop} \label{p:three5} If $\Gamma \subseteq \Delta$ and $\Gamma \proves \alpha$, then $\Delta \proves \alpha$. \end{prop} \begin{prop} \label{p:three6} If $\Gamma \proves \Delta$ and $\Delta \proves \sigma$, then $\Gamma \proves \sigma$. \end{prop} The following theorem often lets one take substantial shortcuts when trying to show that certain deductions exist in $\mathcal{L}_P$, even though it doesn't give us the deductions explicitly. \begin{thm}[Deduction Theorem] \label{t:ded} \index{Deduction Theorem} If $\Sigma$ is any set of formulas and $\alpha$ and $\beta$ are any formulas, then $\Sigma \proves \alpha \to \beta$ if and only if $\Sigma \cup \{ \alpha \} \proves \beta$. \end{thm} \begin{exmp} \label{e:five} Let us show that $\proves \varphi \to \varphi$. By the Deduction Theorem it is enough to show that $\{ \varphi \} \proves \varphi$, which is trivial: \begin{enumerate} \item $\varphi$ \hfill Premiss \end{enumerate} Compare this to the deduction in Example~\ref{e:one}. 
\end{exmp} \begin{prob} \label{p:prov} Appealing to previous deductions and the Deduction Theorem if you wish, show that: \begin{enumerate} \item $\{ \delta, \lnot\delta \} \proves \gamma$ \item $\proves \varphi \to \lnot\lnot \varphi$ \item $\proves (\lnot \beta \to \lnot \alpha) \to (\alpha \to \beta)$ \item $\proves (\alpha \to \beta) \to (\lnot \beta \to \lnot \alpha)$ \item $\proves (\beta \to \lnot \alpha) \to (\alpha \to \lnot \beta)$ \item $\proves (\lnot \beta \to \alpha) \to (\lnot \alpha \to \beta)$ \item $\proves \sigma \to (\sigma \lor \tau)$ \item $\{ \alpha \land \beta \} \proves \beta$ \item $\{ \alpha \land \beta \} \proves \alpha$ \end{enumerate} \end{prob} % % Chapter 4 of "A Problem Course in Mathematical Logic" % \chapter{Soundness and Completeness} \label{ch:four} How are deduction and implication related, given that they were defined in completely different ways? We have some evidence that they behave alike; compare, for example, Proposition~\ref{p:moto} and the Deduction Theorem. It had better be the case that if there is a deduction of a formula $\varphi$ from a set of premisses $\Sigma$, then $\varphi$ is implied by $\Sigma$. (Otherwise, what's the point of defining deductions?) It would also be nice for the converse to hold: whenever $\varphi$ is implied by $\Sigma$, there is a deduction of $\varphi$ from $\Sigma$. (So anything which is true can be proved.) The Soundness and Completeness Theorems say that both ways do hold, so $\Sigma \proves \varphi$ if and only if $\Sigma \models \varphi$, {\em i.e.\/} $\proves$ and $\models$ are equivalent for propositional logic. One direction is relatively straightforward to prove\dots \begin{thm}[Soundness Theorem] \label{t:psnd} \index{Soundness Theorem} If $\Delta$ is a set of formulas and $\alpha$ is a formula such that $\Delta \proves \alpha$, then $\Delta \models \alpha$. \end{thm} \dots but for the other direction we need some additional concepts. 
\begin{defn} \label{d:cons} A set of formulas $\Gamma$ is {\em inconsistent\/}\index{inconsistent} if $\Gamma \proves \lnot(\alpha \to \alpha)$ for some formula $\alpha$, and {\em consistent\/}\index{consistent} if it is not inconsistent. \end{defn} For example, $\{ A_{41} \}$ is consistent by Proposition~\ref{p:stoc}, but it follows from Problem~\ref{p:prov} that $\{ A_{13}, \lnot A_{13} \}$ is inconsistent. \begin{prop} \label{p:stoc} If a set of formulas is satisfiable, then it is consistent. \end{prop} \begin{prop} \label{p:inca} Suppose $\Delta$ is an inconsistent set of formulas. Then $\Delta \proves \psi$ for any formula $\psi$. \end{prop} \begin{prop} \label{p:cmp} Suppose $\Sigma$ is an inconsistent set of formulas. Then there is a finite subset $\Delta$ of $\Sigma$ such that $\Delta$ is inconsistent. \end{prop} \begin{cor} \label{c:cmp} A set of formulas $\Gamma$ is consistent if and only if every finite subset of $\Gamma$ is consistent. \end{cor} To obtain the Completeness Theorem requires one more definition. \begin{defn} \label{d:mxc} \index{maximally consistent} \index{consistent maximally} A set of formulas $\Sigma$ is {\em maximally consistent} if $\Sigma$ is consistent but $\Sigma \cup \{\varphi\}$ is inconsistent for any $\varphi \notin \Sigma$. \end{defn} That is, a set of formulas is maximally consistent if it is consistent, but there is no way to add any other formula to it and keep it consistent. \begin{prob} \label{p:emc} Suppose $v$ is a truth assignment. Show that $\Sigma = \{\, \varphi \mid v(\varphi) = T \,\}$ is maximally consistent. \end{prob} We will need some facts concerning maximally consistent theories. \begin{prop} \label{p:inmc} If $\Sigma$ is a maximally consistent set of formulas, $\varphi$ is a formula, and $\Sigma \proves \varphi$, then $\varphi \in \Sigma$. \end{prop} \begin{prop} \label{p:nimc} Suppose $\Sigma$ is a maximally consistent set of formulas and $\varphi$ is a formula. 
Then $\lnot\varphi \in \Sigma$ if and only if $\varphi \notin \Sigma$. \end{prop} \begin{prop} \label{p:iimc} Suppose $\Sigma$ is a maximally consistent set of formulas and $\varphi$ and $\psi$ are formulas. Then $\varphi \to \psi \in \Sigma$ if and only if $\varphi \notin \Sigma$ or $\psi \in \Sigma$. \end{prop} It is important to know that any consistent set of formulas can be expanded to a maximally consistent set. \begin{thm} \label{t:exmc} Suppose $\Gamma$ is a consistent set of formulas. Then there is a maximally consistent set of formulas $\Sigma$ such that $\Gamma \subseteq \Sigma$. \end{thm} Now for the main event! \begin{thm} \label{t:saco} A set of formulas is consistent if and only if it is satisfiable. \end{thm} Theorem~\ref{t:saco} gives the equivalence between $\proves$ and $\models$ in slightly disguised form. \begin{thm}[Completeness Theorem] \label{t:pcmpl} \index{Completeness Theorem} If $\Delta$ is a set of formulas and $\alpha$ is a formula such that $\Delta \models \alpha$, then $\Delta \proves \alpha$. \end{thm} It follows that anything provable from a given set of premisses must be true if the premisses are, and {\em vice versa\/}. The fact that $\proves$ and $\models$ are actually equivalent can be very convenient in situations where one is easier to use than the other. For example, most parts of Problems~\ref{p:ded} and \ref{p:prov} are much easier to do with truth tables instead of deductions, even if one makes use of the Deduction Theorem. Finally, one more consequence of Theorem~\ref{t:saco}. \begin{thm}[Compactness Theorem] \label{t:pcpct} \index{Compactness Theorem} A set of formulas $\Gamma$ is satisfiable if and only if every finite subset of $\Gamma$ is satisfiable. \end{thm} We will not look at any uses of the Compactness Theorem now, but we will consider a few applications of its counterpart for first-order logic in Chapter~\ref{ch:nine}. 
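The earlier remark that $\proves$ and $\models$ can be traded for one another can be illustrated by a small worked example (added here as a sketch; it is not among the numbered problems). To see that $\{\, \alpha \to \beta ,\, \lnot\beta \,\} \proves \lnot\alpha$ for any formulas $\alpha$ and $\beta$, it suffices by the Completeness Theorem to check that $\{\, \alpha \to \beta ,\, \lnot\beta \,\} \models \lnot\alpha$: \[\begin{array}{c|c|c|c|c} \alpha & \beta & \alpha \to \beta & \lnot\beta & \lnot\alpha \\ \hline T & T & T & F & F \\ T & F & F & T & F \\ F & T & T & F & T \\ F & F & T & T & T \end{array}\] Only the last line makes both $\alpha \to \beta$ and $\lnot\beta$ true, and on that line $\lnot\alpha$ is true as well. Note that this argument shows that a deduction exists without exhibiting one.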
\chapter*{Hints for Chapters 1--4} % % Hints for Chapter 1 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:one}} \begin{clue}{p:one1} Symbols not in the language, unbalanced parentheses, lack of connectives\dots \end{clue} \begin{clue}{p:lrp} The key idea is to exploit the recursive structure of Definition~\ref{d:form} and proceed by induction on the length of the formula or on the number of connectives in the formula. As this is an idea that will be needed repeatedly in Parts I, II, and IV, here is a skeleton of the argument in this case: \begin{proof} By induction on $n$, the number of connectives ({\em i.e.\/} occurrences of $\lnot$ and/or $\to$) in a formula $\varphi$ of $\mathcal{L}_P$, we will show that any formula $\varphi$ must have just as many left parentheses as right parentheses. {\em Base step:\/} ($n=0$) If $\varphi$ is a formula with no connectives, then it must be atomic. (Why?) Since an atomic formula has no parentheses at all, it has just as many left parentheses as right parentheses. {\em Induction hypothesis:\/} ($n \le k$) Assume that any formula with $n \le k$ connectives has just as many left parentheses as right parentheses. {\em Induction step:\/} ($n=k+1$) Suppose $\varphi$ is a formula with $n=k+1$ connectives. It follows from Definition~\ref{d:form} that $\varphi$ must be either \begin{enumerate} \item $(\lnot \alpha)$ for some formula $\alpha$ with $k$ connectives or \item $(\beta \to \gamma)$ for some formulas $\beta$ and $\gamma$ which have $\le k$ connectives each. \end{enumerate} (Why?) We handle the two cases separately: \begin{enumerate} \item By the induction hypothesis, $\alpha$ has just as many left parentheses as right parentheses. Since $\varphi$, {\em i.e.\/} $(\lnot \alpha)$, has one more left parenthesis and one more right parenthesis than $\alpha$, it must have just as many left parentheses as right parentheses as well.
\item By the induction hypothesis, $\beta$ and $\gamma$ each have the same number of left parentheses as right parentheses. Since $\varphi$, {\em i.e.\/} $(\beta \to \gamma)$, has one more left parenthesis and one more right parenthesis than $\beta$ and $\gamma$ together have, it must have just as many left parentheses as right parentheses as well. \end{enumerate} It follows by induction that every formula $\varphi$ of $\mathcal{L}_P$ has just as many left parentheses as right parentheses. \end{proof} \end{clue} \begin{clue}{p:one3} Compute $p(\alpha) / \ell(\alpha)$ for a number of examples and look for patterns. Getting a minimum value should be pretty easy. \end{clue} \begin{clue}{p:one4} Proceed by induction on the length of or on the number of connectives in the formula. \end{clue} \begin{clue}{p:lof} Construct examples of formulas of all the short lengths that you can, and then see how you can make longer formulas out of short ones. \end{clue} \begin{clue}{p:pn} Hewlett-Packard sells calculators that use such a trick. A similar one is used in Definition \ref{d:ter}. \end{clue} \begin{clue}{p:foc} Observe that $\mathcal{L}_P$ has countably many symbols and that every formula is a finite sequence of symbols. The relevant facts from set theory are given in Appendix \ref{ap:sets}. \end{clue} \begin{clue}{p:one8} Stick several simple statements together with suitable connectives. \end{clue} \begin{clue}{p:one9} This should be straightforward. \end{clue} \begin{clue}{p:one10} Ditto. \end{clue} \begin{clue}{p:one11} To make sure you get all the subformulas, write out the formula in official form with all the parentheses. \end{clue} \begin{clue}{t:ur} Proceed by induction on the length or number of connectives of the formula. \end{clue} % % Hints for Chapter 2 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:two}} \begin{clue}{p:two1} Use truth tables.
\end{clue} \begin{clue}{p:tav} Proceed by induction on the length of $\delta$ or on the number of connectives in $\delta$. \end{clue} \begin{clue}{c:tav} Use Proposition \ref{p:tav}. \end{clue} \begin{clue}{p:tif} In each case, unwind Definition \ref{d:tras} and the definitions of the abbreviations. \end{clue} \begin{clue}{p:two5} Use truth tables. \end{clue} \begin{clue}{p:two6} Use Definition \ref{d:taco} and Proposition \ref{p:tif}. \end{clue} \begin{clue}{p:two6a} If a truth assignment satisfies every formula in $\Sigma$ and every formula in $\Gamma$ is also in $\Sigma$, then\dots \end{clue} \begin{clue}{p:two7} Grinding out an appropriate truth table will do the job. Why is it important that $\Sigma$ be finite here? \end{clue} \begin{clue}{p:moto} Use Definition \ref{d:imp} and Proposition \ref{p:tif}. \end{clue} \begin{clue}{p:sanc} Use Definitions \ref{d:taco} and \ref{d:imp}. If you have trouble trying to prove one of the two directions directly, try proving its contrapositive instead. \end{clue} % % Hints for Chapter 3 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:three}} \begin{clue}{p:axta} Truth tables are probably the best way to do this. \end{clue} \begin{clue}{p:snd} Look up Proposition \ref{p:tif}. \end{clue} \begin{clue}{p:ded} There are usually many different deductions with a given conclusion, so you shouldn't take the following hints as gospel. \begin{enumerate} \item Use A2 and A1. \item Recall what $\lor$ abbreviates. \end{enumerate} \end{clue} \begin{clue}{p:three3a} You need to check that $\varphi_1 \dots \varphi_\ell$ satisfies the three conditions of Definition~\ref{d:ded}; you know $\varphi_1 \dots \varphi_n$ does. \end{clue} \begin{clue}{p:dmp} Put together a deduction of $\beta$ from $\Gamma$ from the deductions of $\delta$ and $\delta \to \beta$ from $\Gamma$. \end{clue} \begin{clue}{p:three5} Examine Definition \ref{d:ded} carefully. 
\end{clue} \begin{clue}{p:three6} The key idea is similar to that for proving Proposition \ref{p:dmp}. \end{clue} \begin{clue}{t:ded} One direction follows from Proposition \ref{p:dmp}. For the other direction, proceed by induction on the length of the shortest proof of $\beta$ from $\Sigma \cup \{ \alpha \}$. \end{clue} \begin{clue}{p:prov} Again, don't take these hints as gospel. Try using the Deduction Theorem in each case, plus \begin{enumerate} \item A3. \item A3 and Problem \ref{p:ded}. \item A3. \item A3, Problem \ref{p:ded}, and Example \ref{e:two}. \item Some of the above parts and Problem \ref{p:ded}. \item Ditto. \item Use the definition of $\lor$ and one of the above parts. \item Use the definition of $\land$ and one of the above parts. \item Aim for $\lnot\alpha \to (\alpha \to \lnot\beta)$ as an intermediate step. \end{enumerate} \end{clue} % % Hints for Chapter 4 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:four}} \begin{clue}{t:psnd} Use induction on the length of the deduction and Proposition \ref{p:snd}. \end{clue} \begin{clue}{p:stoc} Assume, by way of contradiction, that the given set of formulas is inconsistent. Use the Soundness Theorem to show that it can't be satisfiable. \end{clue} \begin{clue}{p:inca} First show that $\{ \lnot(\alpha \to \alpha) \} \proves \psi$. \end{clue} \begin{clue}{p:cmp} Note that deductions are finite sequences of formulas. \end{clue} \begin{clue}{c:cmp} Use Proposition \ref{p:cmp}. \end{clue} \begin{clue}{p:emc} Use Proposition \ref{p:stoc}, the definition of $\Sigma$, and Proposition \ref{p:tif}. \end{clue} \begin{clue}{p:inmc} Assume, by way of contradiction, that $\varphi \notin \Sigma$. Use Definition \ref{d:mxc} and the Deduction Theorem to show that $\Sigma$ must be inconsistent. \end{clue} \begin{clue}{p:nimc} Use Definition \ref{d:mxc} and Problem \ref{p:prov}. \end{clue} \begin{clue}{p:iimc} Use Definition \ref{d:mxc} and Proposition \ref{p:nimc}. 
\end{clue} \begin{clue}{t:exmc} Use Proposition \ref{p:foc} and induction on a list of all the formulas of $\mathcal{L}_P$. \end{clue} \begin{clue}{t:saco} One direction is just Proposition \ref{p:stoc}. For the other, expand the set of formulas in question to a maximally consistent set of formulas $\Sigma$ using Theorem \ref{t:exmc}, and define a truth assignment $v$ by setting $v(A_n) = T$ if and only if $A_n \in \Sigma$. Now use induction on the length of $\varphi$ to show that $\varphi \in \Sigma$ if and only if $v$ satisfies $\varphi$. \end{clue} \begin{clue}{t:pcmpl} Prove the contrapositive using Theorem \ref{t:saco}. \end{clue} \begin{clue}{t:pcpct} Put Corollary \ref{c:cmp} together with Theorem \ref{t:saco}. \end{clue} % % Part II of "A Problem Course in Mathematical Logic" % \part{First-Order Logic} % % Chapter 5 of "A Problem Course in Mathematical Logic" % \chapter{Languages} \label{ch:five} As noted in the Introduction, propositional logic has obvious deficiencies as a tool for mathematical reasoning. First-order logic\index{first-order logic}\index{logic first-order} remedies enough of these to be adequate for formalizing most ordinary mathematics. It does have enough in common with propositional logic to let us recycle some of the material in Chapters 1--4. A few informal words about how first-order languages\index{first-order languages}\index{language first-order} work are in order. In mathematics one often deals with structures consisting of a set of elements plus various operations on them or relations among them. To cite three common examples, a group is a set of elements plus a binary operation on these elements satisfying certain conditions, a field is a set of elements plus two binary operations on these elements satisfying certain conditions, and a graph is a set of elements plus a binary relation with certain properties. 
In most such cases, one frequently uses symbols naming the operations or relations in question, symbols for variables which range over the set of elements, symbols for logical connectives such as {\em not\/} and {\em for all\/}, plus auxiliary symbols such as parentheses, to write formulas which express some fact about the structure in question. For example, if $(G,\cdot)$ is a group, one might express the associative law by writing something like \[ \forall x\, \forall y\, \forall z\; x\cdot (y\cdot z) = (x\cdot y)\cdot z\, , \] it being understood that the variables range over the set $G$ of group elements. A formal language to do as much will require some or all of these: symbols for various logical notions and for variables, some for functions or relations, plus auxiliary symbols. It will also be necessary to specify rules for putting the symbols together to make formulas, for interpreting the meaning and determining the truth of these formulas, and for making inferences in deductions. For a concrete example, consider elementary number theory. The set of elements under discussion is the set of natural numbers $\mathbb N = \{\, 0,1,2,3,4, \dots \}$. One might need symbols or names for certain interesting numbers, say $0$ and $1$; for variables over $\mathbb{N}$ such as $n$ and $x$; for functions on $\mathbb{N}$, say $\cdot$ and $+$; and for relations, say $=$, $<$, and $|$. In addition, one is likely to need symbols for punctuation, such as $($ and $)$; for logical connectives, such as $\lnot$ and $\to$; and for quantifiers, such as $\forall$ (``for all'') and $\exists$ (``there exists''). A statement of mathematical English such as ``For all $n$ and $m$, if $n$ divides $m$, then $n$ is less than or equal to $m$'' can then be written as a cool formula like \[ \forall n \forall m \, (n \mid m \to (n < m \lor n = m)) \, . \] The extra power of first-order logic comes at a price: greater complexity.
First, there are many first-order languages one might wish to use, practically one for each subject, or even problem, in mathematics.\footnote{It is possible to formalize almost all of mathematics in a single first-order language, like that of set theory or category theory. However, trying to actually do most mathematics in such a language is so hard as to be pointless.} We will set up our definitions and general results, however, to apply to a wide range of them.\footnote{Specifically, to countable one-sorted first-order languages with equality.} As with $\mathcal{L}_P$, our formal language for propositional logic, first-order languages are defined by specifying their symbols and how these may be assembled into formulas. \begin{defn} \label{d:sym} \index{symbols} \index{symbols logical} \index{symbols non-logical} The {\em symbols\/} of a first-order language $\mathcal{L}$\index{$\mathcal{L}$} include: \begin{enumerate} \item Parentheses: $($ and $)$. \index{parentheses} \index{$($} \index{$)$} \item Connectives: $\lnot$ and $\to$. \index{connectives} \index{$\lnot$} \index{$\to$} \item Quantifier: $\forall$. \index{quantifier universal} \index{$\forall$} \item Variables: $v_0$, $v_1$, $v_2$, \dots, $v_n$, \dots \index{variable} \index{$v_n$} \item Equality: $=$. \index{equality} \index{$=$} \item A (possibly empty) set of {\em constant\/} symbols. \index{constant} \item For each $k \ge 1$, a (possibly empty) set of {\em $k$-place function\/} symbols. \index{function $k$-place} \index{function} \item For each $k \ge 1$, a (possibly empty) set of {\em $k$-place relation\/} (or {\em predicate\/}) symbols. \index{relation $k$-place} \index{relation} \index{predicate} \end{enumerate} The symbols described in parts 1--5 are the {\em logical\/} symbols of $\mathcal{L}$, shared by every first-order language, and the rest are the {\em non-logical\/} symbols of $\mathcal{L}$, which usually depend on the language's intended use.
\end{defn} \begin{note} It is possible to define first-order languages without $=$, so $=$ is considered a non-logical symbol by many authors. While such languages have some uses, they are uncommon in ordinary mathematics. Observe that any first-order language $\mathcal{L}$ has countably many logical symbols. It may have uncountably many symbols if it has uncountably many non-logical symbols. {\em Unless explicitly stated otherwise, we will assume that every first-order language we encounter has only countably many non-logical symbols.\/} Most of the results we will prove actually hold for countable and uncountable first-order languages alike, but some require heavier machinery to prove for uncountable languages. \end{note} Just as in $\mathcal{L}_P$, the parentheses are just punctuation\index{punctuation} while the connectives, $\lnot$\index{$\lnot$} and $\to$\index{$\to$}, are intended to express {\em not\/}\index{not} and {\em if \dots then\/}\index{if \dots then}. However, the rest of the symbols are new and are intended to express ideas that cannot be handled by $\mathcal{L}_P$. The quantifier\index{quantifier universal} symbol, $\forall$\index{$\forall$}, is meant to represent {\em for all\/}\index{for all}, and is intended to be used with the variable symbols, {\em e.g.\/} $\forall v_4$. The constant\index{constant} symbols are meant to be names for particular elements of the structure under discussion. $k$-place function\index{function $k$-place} symbols are meant to name particular functions which map $k$-tuples of elements of the structure to elements of the structure. $k$-place relation\index{relation $k$-place} symbols are intended to name particular $k$-place relations among elements of the structure.\footnote{Intuitively, a relation or predicate\index{predicate} expresses some (possibly arbitrary) relationship among one or more objects. 
For example, ``$n$ is prime'' is a 1-place relation on the natural numbers, $<$ is a 2-place or binary relation\index{relation binary} on the rationals, and $\vec{a} \times (\vec{b} \times \vec{c}) = \vec{0}$ is a 3-place relation on $\mathbb{R}^3$. Formally, a $k$-place relation\index{relation $k$-place} on a set $X$ is just a subset of $X^k$, {\em i.e.\/} the collection of sequences of length $k$ of elements of $X$ for which the relation is true.} Finally, $=$\index{$=$} is a special binary relation\index{relation binary} symbol intended to represent equality\index{equality}. \begin{exmp} \label{e:lannt} Since the logical symbols are always the same, first-order languages are usually defined by specifying the non-logical symbols. A formal language for elementary number theory like that unofficially described above, call it $\mathcal{L}_{NT}$\index{$\mathcal{L}_{NT}$}, can be defined as follows. \begin{itemize} \item Constant symbols: $0$ and $1$ \item Two $2$-place function symbols: $+$ and $\cdot$ \item Two binary relation symbols: $<$ and $|$ \end{itemize} Each of these symbols is intended to represent the same thing it does in informal mathematical usage: $0$ and $1$ are intended to be names for the numbers zero and one, $+$ and $\cdot$ names for the operations of addition and multiplication, and $<$ and $|$ names for the relations ``less than'' and ``divides''. (Note that we could, in principle, interpret things completely differently -- let $0$ represent the number forty-one, $+$ the operation of exponentiation, and so on -- or even use the language to talk about a different structure -- say the real numbers, $\mathbb{R}$, with $0$, $1$, $+$, $\cdot$, and $<$ representing what they usually do and, just for fun, $|$ interpreted as ``is not equal to''. More on this in Chapter \ref{ch:six}.) We will usually use the same symbols in our formal languages that we use informally for various common mathematical objects.
This convention\index{convention for common symbols} can occasionally cause confusion if it is not clear whether an expression involving these symbols is supposed to be an expression in a formal language or not. \end{exmp} \begin{exmp} \label{e:lan} Here are some other first-order languages. Recall that we need only specify the non-logical symbols in each case and note that some parts of Definitions \ref{d:ter} and \ref{d:for} may be irrelevant for a given language\index{language} if it is missing the appropriate sorts of non-logical symbols. \begin{enumerate} \item The language of pure equality, $\mathcal{L}_=$:\index{$\mathcal{L}_=$} \begin{itemize} \item No non-logical symbols at all. \end{itemize} \item A language for fields, $\mathcal{L}_F$:\index{$\mathcal{L}_F$} \begin{itemize} \item Constant symbols: $0$, $1$ \item $2$-place function symbols: $+$, $\cdot$ \end{itemize} \item A language for set theory, $\mathcal{L}_S$:\index{$\mathcal{L}_S$} \begin{itemize} \item $2$-place relation symbol: $\in$ \end{itemize} \item A language for linear orders, $\mathcal{L}_O$:\index{$\mathcal{L}_O$} \begin{itemize} \item $2$-place relation symbol: $<$ \end{itemize} \item Another language for elementary number theory, $\mathcal{L}_N$:\index{$\mathcal{L}_N$} \begin{itemize} \item Constant symbol: $0$ \item $1$-place function symbol: $S$ \item $2$-place function symbols: $+$, $\cdot$, $E$ \end{itemize} Here $0$ is intended to represent zero, $S$ the successor function, {\em i.e.\/} $S(n) = n + 1$, and $E$ the exponential function, {\em i.e.\/} $E(n,m) = n^m$. \item A ``worst-case'' countable language, $\mathcal{L}_1$:\index{$\mathcal{L}_1$} \begin{itemize} \item Constant symbols: $c_1$, $c_2$, $c_3$, \dots \item For each $k \ge 1$, $k$-place function symbols: $f^k_1$, $f^k_2$, $f^k_3$, \dots \item For each $k \ge 1$, $k$-place relation symbols: $P^k_1$, $P^k_2$, $P^k_3$, \dots \end{itemize} This language has no use except as an abstract example. 
\end{enumerate} \end{exmp} It remains to specify how to form valid formulas from the symbols of a first-order language $\mathcal{L}$. This will be more complicated than it was for $\mathcal{L}_P$. In fact, we first need to define a type of expression in $\mathcal{L}$ which has no counterpart in propositional logic. \begin{defn} \label{d:ter} \index{term} The {\em terms\/} of a first-order language $\mathcal{L}$ are those finite sequences of symbols of $\mathcal{L}$ which satisfy the following rules: \begin{enumerate} \item Every variable symbol $v_n$ is a term. \item Every constant symbol $c$ is a term. \item If $f$ is a $k$-place function symbol and $t_1$, \dots, $t_k$ are terms, then $f t_1 \dots t_k$ is also a term. \item Nothing else is a term. \end{enumerate} \end{defn} That is, a term is an expression which represents some (possibly indeterminate) element of the structure under discussion. For example, in $\mathcal{L}_{NT}$ or $\mathcal{L}_N$, $+ v_0 v_1$ (informally, $v_0 + v_1$ ) is a term, though precisely which natural number it represents depends on what values are assigned to the variables $v_0$ and $v_1$. \begin{prob} \label{p:five1} Which of the following are terms of one of the languages defined in Examples \ref{e:lannt} and \ref{e:lan}? If so, which of these language(s) are they terms of; if not, why not? \begin{enumerate} \item $\cdot v_2$ \item $+ 0 \cdot + v_6 1 1$ \item $|1+v_30$ \item $(<E101 \to +11)$ \item $++\cdot +00000$ \item $f^3_4f^2_7 c_4 v_9 c_1 v_4$ \item $\cdot v_5 (+1v_8)$ \item $< v_6 v_2$ \item $1 + 0$ \end{enumerate} \end{prob} Note that in languages with no function symbols all terms have length one. \begin{prob} \label{p:five2} Choose one of the languages defined in Examples \ref{e:lannt} and \ref{e:lan} which has terms of length greater than one and determine the possible lengths of terms of this language. \end{prob} \begin{prop} \label{p:fcmt} The set of terms of a countable first-order language $\mathcal{L}$ is countable. 
\end{prop} Having defined terms, we can finally define first-order formulas. \begin{defn} \label{d:for} \index{formula} The {\em formulas\/} of a first-order language $\mathcal{L}$ are the finite sequences of the symbols of $\mathcal{L}$ satisfying the following rules: \begin{enumerate} \item If $P$ is a $k$-place relation symbol and $t_1$, \dots, $t_k$ are terms, then $P t_1 \dots t_k$ is a formula. \item If $t_1$ and $t_2$ are terms, then $= t_1 t_2$ is a formula. \item If $\alpha$ is a formula, then $(\lnot \alpha)$ is a formula. \item If $\alpha$ and $\beta$ are formulas, then $(\alpha \to \beta)$ is a formula. \item If $\varphi$ is a formula and $v_n$ is a variable, then $\forall v_n \varphi$ is a formula. \item Nothing else is a formula. \end{enumerate} Formulas of form 1 or 2 will often be referred to as the {\em atomic formulas\/}\index{atomic formulas} \index{formulas atomic} of $\mathcal{L}$. \end{defn} Note that three of the conditions in Definition \ref{d:for} are borrowed directly from propositional logic. As before, we will exploit the way formulas are built up in making definitions and in proving results by induction on the length of a formula. We will also recycle the use of lower-case Greek characters\index{Greek characters} to refer to formulas and of upper-case Greek characters to refer to sets of formulas. \begin{prob} \label{p:five4} Which of the following are formulas of one of the languages defined in Examples \ref{e:lannt} and \ref{e:lan}? If so, which of these language(s) are they formulas of; if not, why not?
\begin{enumerate} \item $= 0 + v_7 \cdot 1 v_3$ \item $(\lnot = v_1 v_1)$ \item $(| v_2 0 \to \cdot 0 1)$ \item $(\lnot \forall v_5 (= v_5 v_5))$ \item $< +01 |v_1v_3$ \item $(v_3 = v_3 \to \forall v_5 \, v_3 = v_5)$ \item $\forall v_6 (= v_6 0 \to \forall v_9 (\lnot | v_9 v_6))$ \item $\forall v_8 < +11 v_4$ \end{enumerate} \end{prob} \begin{prob} \label{p:five5} Show that every formula of a first-order language has the same number of left parentheses as of right parentheses. \end{prob} \begin{prob} \label{p:five6} Choose one of the languages defined in Examples \ref{e:lannt} and \ref{e:lan} and determine the possible lengths of formulas of this language. \end{prob} \begin{prop} \label{p:five7} A countable first-order language $\mathcal{L}$ has countably many formulas. \end{prop} In practice, devising a formal language intended to deal with a particular (kind of) structure isn't the end of the job: one must also specify axioms\index{axiom} in the language that the structure(s) one wishes to study should satisfy. Defining satisfaction is officially done in the next chapter, but it is usually straightforward to unofficially figure out what a formula in the language is supposed to mean. \begin{prob} \label{p:for} In each case, write down a formula of the given language expressing the given informal statement. \begin{enumerate} \item ``Addition is associative'' in $\mathcal{L}_F$. \item ``There is an empty set'' in $\mathcal{L}_S$. \item ``Between any two distinct elements there is a third element'' in $\mathcal{L}_O$. \item ``$n^0 = 1$ for every $n$ different from $0$'' in $\mathcal{L}_N$. \item ``There is only one thing'' in $\mathcal{L}_=$. \end{enumerate} \end{prob} \begin{prob} \label{p:fole} Define first-order languages to deal with the following structures and, in each case, an appropriate set of axioms in your language: \begin{enumerate} \item Groups. \item Graphs. \item Vector spaces. 
\end{enumerate} \end{prob} We will need a few additional concepts and facts about formulas of first-order logic later on. First, what are the subformulas of a formula? \begin{prob} \label{p:five10} \index{subformula} Define the set of subformulas of a formula $\varphi$ of a first-order language $\mathcal{L}$. \end{prob} For example, if $\varphi$ is \[ (((\lnot \forall v_1\, (\lnot =v_1c_7) ) \to P^2_3 v_5 v_8) \to \forall v_8 ( = v_8 f^3_5 c_0 v_1 v_5 \to P^1_2 v_8 )) \] in the language $\mathcal{L}_1$, then the set of subformulas of $\varphi$, $\mathcal{S}(\varphi)$, ought to include \begin{itemize} \item $=v_1c_7$, $P^2_3 v_5 v_8$, $= v_8 f^3_5 c_0 v_1 v_5$, $P^1_2 v_8$, \item $(\lnot =v_1c_7)$, $(= v_8 f^3_5 c_0 v_1 v_5 \to P^1_2 v_8)$, \item $\forall v_1\, (\lnot =v_1c_7)$, $\forall v_8 (= v_8 f^3_5 c_0 v_1 v_5 \to P^1_2 v_8)$, \item $(\lnot \forall v_1\, (\lnot =v_1c_7))$, \item $((\lnot \forall v_1\, (\lnot =v_1c_7) ) \to P^2_3 v_5 v_8)$, and \item $(((\lnot \forall v_1\, (\lnot =v_1c_7) ) \to P^2_3 v_5 v_8) \to \forall v_8 (= v_8 f^3_5 c_0 v_1 v_5 \to P^1_2 v_8 ))$ itself. \end{itemize} Second, we will need a concept that has no counterpart in propositional logic. \begin{defn} \label{d:frv} \index{variable free} \index{free variable} \index{variable bound} \index{bound variable} Suppose $x$ is a variable of a first-order language $\mathcal{L}$. Then whether $x$ {\em occurs free\/} in a formula $\varphi$ of $\mathcal{L}$ is defined as follows: \begin{enumerate} \item If $\varphi$ is atomic, then $x$ occurs free in $\varphi$ if and only if $x$ occurs in $\varphi$. \item If $\varphi$ is $(\lnot \alpha)$, then $x$ occurs free in $\varphi$ if and only if $x$ occurs free in $\alpha$. \item If $\varphi$ is $(\beta \to \delta)$, then $x$ occurs free in $\varphi$ if and only if $x$ occurs free in $\beta$ or in $\delta$. \item If $\varphi$ is $\forall v_k \, \psi$, then $x$ occurs free in $\varphi$ if and only if $x$ is different from $v_k$ and $x$ occurs free in $\psi$.
\end{enumerate} An occurrence of $x$ in $\varphi$ which is not free is said to be {\em bound\/}. A formula $\sigma$ of $\mathcal{L}$ in which no variable occurs free is said to be a {\em sentence\/}.\index{sentence} \end{defn} Part 4 is the key: it asserts that an occurrence of a variable $x$ is bound instead of free if it is in the ``scope'' of an occurrence of $\forall x$. For example, $v_7$ is free in $\forall v_5 \, = v_5 v_7$, but $v_5$ is not. Different occurrences of a given variable in a formula may be free or bound, depending on where they are; {\em e.g.\/} $v_6$ occurs both free and bound in $\forall v_0 \, (= v_0 f^1_3 v_6 \to (\lnot \forall v_6 \, P^1_9 v_6))$. \begin{prob} \label{p:five11} \index{scope of a quantifier} \index{quantifier, scope of} Give a precise definition of the scope of a quantifier. \end{prob} Note the distinction between sentences and ordinary formulas introduced in the last part of Definition~\ref{d:frv}. As we shall see, sentences are often more tractable and useful theoretically than ordinary formulas. \begin{prob} \label{p:five12} Which of the formulas you gave in solving Problem~\ref{p:for} are sentences? \end{prob} Finally, we will eventually need to consider a relationship between first-order languages. \begin{defn} \label{d:exlan} \index{extension of a language} \index{language extension of} A first-order language $\mathcal{L}'$ is an {\em extension\/} of a first-order language $\mathcal{L}$, sometimes written as $\mathcal{L} \subseteq \mathcal{L}'$, if every non-logical symbol of $\mathcal{L}$ is a non-logical symbol of the same kind of $\mathcal{L}'$. \end{defn} For example, every first-order language is an extension of $\mathcal{L}_=$. \begin{prob} \label{p:five13} Which of the languages given in Example \ref{e:lan} are extensions of other languages given in Example \ref{e:lan}? \end{prob} \begin{prop} \label{p:exlan} Suppose $\mathcal{L}$ is a first-order language and $\mathcal{L}'$ is an extension of $\mathcal{L}$.
Then every formula $\varphi$ of $\mathcal{L}$ is a formula of $\mathcal{L}'$. \end{prop} \subsection*{Common Conventions} As with propositional logic, we will often use abbreviations\index{abbreviations} and informal conventions to simplify the writing of formulas in first-order languages. In particular, we will use the same additional connectives we used in propositional logic, plus an additional quantifier, $\exists$ (``there exists''): \begin{itemize} \item $(\alpha \land \beta)$ is short for $(\lnot (\alpha \to (\lnot \beta)))$\index{$\land$}. \item $(\alpha \lor \beta)$ is short for $( (\lnot \alpha) \to \beta)$\index{$\lor$}. \item $(\alpha \fromto \beta)$ is short for $((\alpha \to \beta) \land (\beta \to \alpha))$\index{$\fromto$}. \item $\exists v_k \varphi$ is short for $(\lnot \forall v_k (\lnot \varphi))$\index{$\exists$}. \end{itemize} ($\forall$\index{$\forall$} is often called the universal quantifier \index{universal quantifier} \index{quantifier universal} and $\exists$ is often called the existential quantifier.) \index{existential quantifier} \index{quantifier existential} Parentheses \index{parentheses conventions} \index{conventions, parentheses} will often be omitted in formulas according to the same conventions we used in propositional logic, with the modification that $\forall$ and $\exists$ take precedence over all the logical connectives: \begin{itemize} \item We will usually drop the outermost parentheses in a formula, writing $\alpha \to \beta$ instead of $(\alpha \to \beta)$ and $\lnot \alpha$ instead of $(\lnot \alpha)$. \item We will let $\forall$ take precedence over $\lnot$, and $\lnot$ take precedence over $\to$ when parentheses are missing, and fit the informal abbreviations into this scheme by letting the order of precedence be $\forall$, $\exists$, $\lnot$, $\land$, $\lor$, $\to$, and $\fromto$. 
\item Finally, we will group repetitions of $\to$, $\lor$, $\land$, or $\fromto$ to the right when parentheses are missing, so $\alpha \to \beta \to \gamma$ is short for $(\alpha \to (\beta \to \gamma))$. \end{itemize} For example, $\exists v_k \lnot \alpha \to \forall v_n \beta$ is short for $((\lnot \forall v_k (\lnot (\lnot\alpha))) \to \forall v_n \beta)$. On the other hand, we will sometimes add parentheses and arrange things in unofficial ways to make terms and formulas easier to read. In particular we will often write \begin{enumerate} \item $f(t_1,\dots,t_k)$ for $ft_1\dots t_k$ if $f$ is a $k$-place function symbol and $t_1$, \dots, $t_k$ are terms, \item $s \circ t$ for $\circ st$ if $\circ$ is a $2$-place function symbol and $s$ and $t$ are terms, \item $P(t_1, \dots, t_k)$ for $Pt_1 \dots t_k$ if $P$ is a $k$-place relation symbol and $t_1$, \dots, $t_k$ are terms, \item $s \bullet t$ for $\bullet st$ if $\bullet$ is a $2$-place relation symbol and $s$ and $t$ are terms, \item $s=t$ for $=st$ if $s$ and $t$ are terms, and \item enclose terms in parentheses to group them. \end{enumerate} Thus, we could write the formula $= +1 \cdot 0 v_6 \cdot 11$ of $\mathcal{L}_{NT}$ as $1 + (0 \cdot v_6) = 1 \cdot 1$. As was observed in Example \ref{e:lannt}, it is customary in devising a formal language to recycle the same symbols used informally for the given objects. In situations where we want to talk about symbols without committing ourselves to a particular one, such as when talking about first-order languages in general, we will often use ``generic'' choices: \begin{itemize} \item $a$, $b$, $c$, \dots for constant\index{constant} symbols; \item $x$, $y$, $z$, \dots for variable\index{variable} symbols; \item $f$, $g$, $h$, \dots for function\index{function} symbols; \item $P$, $Q$, $R$, \dots for relation\index{relation} symbols; and \item $r$, $s$, $t$, \dots for generic terms\index{term}.
\end{itemize} These can be thought of as variables in the metalanguage\footnote{The metalanguage is the language\index{language}, mathematical English in this case, in which we talk {\em about\/} a language. The theorems we prove about formal logic are, strictly speaking, metatheorems\index{metatheorem}, as opposed to the theorems\index{theorem} proved within a formal logical system. For more of this kind of stuff, read some philosophy\dots}\index{metalanguage} ranging over different kinds of objects of first-order logic, much as we're already using lower-case Greek characters as variables which range over formulas. (In fact, we have already used some of these conventions in this chapter\dots) \subsection*{Unique Readability} The slightly paranoid might ask whether Definitions \ref{d:sym}, \ref{d:ter} and \ref{d:for} actually ensure that the terms and formulas of a first-order language $\mathcal{L}$ are unambiguous, {\em i.e.\/} cannot be read in more than one way. As with $\mathcal{L}_P$, to actually prove this one must assume that all the symbols of $\mathcal{L}$ are distinct and that no symbol is a subsequence of any other symbol. It then follows that: \begin{thm} \label{t:urt} \index{unique readability of terms} Any term of a first-order language $\mathcal{L}$ satisfies exactly one of conditions 1--3 in Definition \ref{d:ter}. \end{thm} \begin{thm}[Unique Readability Theorem] \label{t:urf} \index{unique readability of formulas} \index{formula unique readability} \index{Unique Readability Theorem} Any formula of a first-order language satisfies exactly one of conditions 1--5 in Definition \ref{d:for}. \end{thm} % % Chapter 6 of "A Problem Course in Mathematical Logic" % \chapter{Structures and Models} \label{ch:six} Defining truth and implication in first-order logic is a lot harder than it was in propositional logic.
First-order languages are intended to deal with mathematical objects like groups or linear orders, so it makes little sense to speak of the truth of a formula without specifying a context. For example, one can write down a formula expressing the commutative law in a language for group theory, $\forall x\, \forall y\, x \cdot y = y \cdot x$, but whether it is true or not depends on which group we're dealing with. It follows that we need to make precise which mathematical objects or structures a given first-order language can be used to discuss and how, given a suitable structure, formulas in the language are to be interpreted. Such a structure for a given language should supply most of the ingredients needed to interpret formulas of the language. Throughout this chapter, let $\mathcal{L}$ be an arbitrary fixed countable first-order language. All formulas will be assumed to be formulas of $\mathcal{L}$ unless stated otherwise. \begin{defn} \label{d:str} \index{structure} A {\em structure\/} $\mathfrak{M}$ for $\mathcal{L}$ consists of the following: \begin{enumerate} \item A non-empty set $M$, often written as $|\mathfrak{M}|$, called the {\em universe\/} of $\mathfrak{M}$.\index{universe} \item For each constant symbol $c$ of $\mathcal{L}$, an element $c^{\mathfrak{M}}$ of $M$.\index{constant} \item For each $k$-place function symbol $f$ of $\mathcal{L}$, a function $f^{\mathfrak{M}} : M^k \to M$, {\em i.e.\/} a $k$-place function on $M$.\index{function} \item For each $k$-place relation symbol $P$ of $\mathcal{L}$, a relation $P^{\mathfrak{M}} \subseteq M^k$, {\em i.e.\/} a $k$-place relation on $M$.\index{relation} \end{enumerate} \end{defn} That is, a structure supplies an underlying set of elements plus interpretations for the various non-logical symbols of the language: constant symbols are interpreted by particular elements of the underlying set, function symbols by functions on this set, and relation symbols by relations among elements of this set. 
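
To illustrate how the parts of Definition~\ref{d:str} fit together, here is an informal sketch for the language $\mathcal{L}_{NT}$ of Example~\ref{e:lannt}, assuming each symbol is given its intended meaning (this is only one of many possible structures for $\mathcal{L}_{NT}$, and the name $\mathfrak{M}$ is used here just for the sake of the illustration). Take the universe to be $M = \mathbb{N}$ and let
\begin{itemize}
\item $0^{\mathfrak{M}} = 0$ and $1^{\mathfrak{M}} = 1$, which are elements of $M$;
\item $+^{\mathfrak{M}}(a,b) = a + b$ and $\cdot^{\mathfrak{M}}(a,b) = ab$, which are functions $M^2 \to M$; and
\item $<^{\mathfrak{M}} = \{\, (a,b) \in M^2 : a < b \,\}$ and $|^{\mathfrak{M}} = \{\, (a,b) \in M^2 : a \mbox{ divides } b \,\}$, which are subsets of $M^2$.
\end{itemize}
Each clause of the definition is used once for each non-logical symbol of the corresponding kind, and nothing is supplied for the logical symbols.
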
It is customary to use upper-case ``gothic'' characters\index{gothic characters} such as $\mathfrak{M}$\index{$\mathfrak{M}$} and $\mathfrak{N}$\index{$\mathfrak{N}$} for structures. For example, consider $\mathfrak{Q} = (\mathbb{Q}, <)$, where $<$ is the usual ``less than'' relation on the rationals. This is a structure for $\mathcal{L}_O$, the language for linear orders defined in Example~\ref{e:lan}; it supplies a $2$-place relation to interpret the language's $2$-place relation symbol. $\mathfrak{Q}$ is {\em not\/} the only possible structure for $\mathcal{L}_O$: $(\mathbb{R}, < )$, $(\{0\}, \emptyset)$, and $(\mathbb{N}, \mathbb{N}^2)$ are three others among infinitely many more. (Note that in the last of these cases the relation symbol $<$ is interpreted by a relation on the universe which is not a linear order. One can ensure that a structure satisfy various conditions beyond what Definition~\ref{d:str} guarantees by requiring appropriate formulas to be true when interpreted in the structure.) On the other hand, $(\mathbb{R})$ is not a structure for $\mathcal{L}_O$ because it lacks a binary relation to interpret the symbol $<$ by, while $(\mathbb{N}, 0, 1, +, \cdot, |, <)$ is not a structure for $\mathcal{L}_O$ because it has two binary relations where $\mathcal{L}_O$ has a symbol only for one, plus constants and functions for which $\mathcal{L}_O$ lacks symbols. \begin{prob} \label{p:six1} The first-order languages referred to below were all defined in Example~\ref{e:lan}. \begin{enumerate} \item Is $(\emptyset)$ a structure for $\mathcal{L}_=$? \item Determine whether $\mathfrak Q = (\mathbb{Q}, <)$ is a structure for each of $\mathcal{L}_=$, $\mathcal{L}_F$, and $\mathcal{L}_S$. \item Give three different structures for $\mathcal{L}_F$ which are not fields. \end{enumerate} \end{prob} To determine what it means for a given formula to be true in a structure for the corresponding language, we will also need to specify how to interpret the variables when they occur free.
(Bound variables have the associated quantifier to tell us what to do.) \begin{defn} \label{d:ass} \index{assignment} Let $V = \{\, v_0, v_1, v_2, \dots \,\}$ be the set of all variable\index{variable} symbols of $\mathcal{L}$ and suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$. A function $s : V \to |\mathfrak{M}|$ is said to be an {\em assignment\/} for $\mathfrak{M}$. \end{defn} Note that these are {\em not\/} truth assignments like those for $\mathcal{L}_P$. An assignment just interprets each variable in the language by an element of the universe of the structure. Also, as long as the universe of the structure has more than one element, any variable can be interpreted in more than one way. Hence there are usually many different possible assignments for a given structure. \begin{exmp} \label{e:as} Consider the structure $\mathfrak{R} = (\mathbb{R},0,1,+,\cdot)$ for $\mathcal{L}_F$. Each of the following functions $V \to \mathbb{R}$ is an assignment for $\mathfrak{R}$: \begin{enumerate} \item $p(v_n) = \pi$ for each $n$, \item $r(v_n) = e^n$ for each $n$, and \item $s(v_n) = n + 1$ for each $n$. \end{enumerate} In fact, {\em every\/} function $V \to \mathbb{R}$ is an assignment for $\mathfrak{R}$. \end{exmp} In order to use assignments to determine whether formulas are true in a structure, we need to know how to use an assignment to interpret each term of the language as an element of the universe. \begin{defn} \label{d:exas} \index{assignment extended} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$ and $s \colon V \to |\mathfrak{M}|$ is an assignment for $\mathfrak{M}$. Let $T$ be the set of all terms\index{term} of $\mathcal{L}$. 
Then the {\em extended assignment\/} $\mathbf{s} \colon T \to |\mathfrak{M}|$ is defined inductively as follows: \begin{enumerate} \item For each variable $x$, $\mathbf{s}(x) = s(x)$.\index{variable} \item For each constant symbol $c$, $\mathbf{s}(c) = c^{\mathfrak{M}}$.\index{constant} \item For every $k$-place function symbol $f$ and terms $t_1$, \dots, $t_k$, \[ \mathbf{s}(f t_1 \dots t_k) = f^{\mathfrak{M}} (\mathbf{s}(t_1), \dots, \mathbf{s}(t_k) ). \]\index{function} \end{enumerate} \end{defn} \begin{exmp} \label{e:exas} Let $\mathfrak{R}$ be the structure for $\mathcal{L}_F$ given in Example \ref{e:as}, and let $\mathbf{p}$, $\mathbf{r}$, and $\mathbf{s}$ be the extended assignments corresponding to the assignments $p$, $r$, and $s$ defined in Example \ref{e:as}. Consider the term $+ \cdot v_6 v_0 + 0 v_3$ of $\mathcal{L}_F$. Then: \begin{enumerate} \item $\mathbf{p}(+ \cdot v_6 v_0 + 0 v_3) = \pi^2 + \pi$, \item $\mathbf{r}(+ \cdot v_6 v_0 + 0 v_3) = e^6 + e^3$, and \item $\mathbf{s}(+ \cdot v_6 v_0 + 0 v_3) = 11$. \end{enumerate} Here's why for the last one: since $s(v_6) = 7$, $s(v_0) = 1$, $s(v_3) = 4$, and $\mathbf{s}(0) = 0$ (by part 2 of Definition \ref{d:exas}), it follows from part 3 of Definition \ref{d:exas} that $\mathbf{s}(+ \cdot v_6 v_0 + 0 v_3) = (7 \cdot 1) + (0 + 4) = 7 + 4 = 11$. \end{exmp} \begin{prob} \label{pb:exas} $\mathfrak{N} = (\mathbb{N}, 0, S, +, \cdot, E)$ is a structure for $\mathcal{L}_N$. Let $s \colon V \to \mathbb{N}$ be the assignment defined by $s(v_k) = k + 1$. What are $\mathbf{s}( E + v_{19} v_1 \cdot 0 v_{45})$ and $\mathbf{s}(SSS + E 0 v_6 v_7 )$? \end{prob} \begin{prop} \label{p:eau} $\mathbf s$ is unique, {\em i.e.\/} given an assignment $s$, no other function $T \to |\mathfrak{M}|$ satisfies conditions 1--3 in Definition~\ref{d:exas}. \end{prop} With Definitions \ref{d:ass} and \ref{d:exas} in hand, we can take our first cut at defining what it means for a first-order formula to be true. 
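Before doing so, it may help to see the recursion in Definition~\ref{d:exas} unwound once in full. As an illustrative sketch (the grouping of subterms below is just the unique reading of the prefix notation), the value of $\mathbf{r}$ given in Example~\ref{e:exas} can be computed from the outside in:
\[ \begin{aligned}
\mathbf{r}(+ \cdot v_6 v_0 + 0 v_3) &= \mathbf{r}(\cdot v_6 v_0) + \mathbf{r}(+ 0 v_3) \\
&= (\mathbf{r}(v_6) \cdot \mathbf{r}(v_0)) + (\mathbf{r}(0) + \mathbf{r}(v_3)) \\
&= (r(v_6) \cdot r(v_0)) + (0 + r(v_3)) \\
&= (e^6 \cdot e^0) + (0 + e^3) = e^6 + e^3 .
\end{aligned} \]
The first two steps use part 3 of the definition, and the third uses parts 1 and 2. On the right-hand sides, $+$ and $\cdot$ denote the usual operations on $\mathbb{R}$ rather than symbols of $\mathcal{L}_F$.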
\begin{defn} \label{d:sat} \index{assignment} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$, $s$ is an assignment for $\mathfrak{M}$, and $\varphi$ is a formula of $\mathcal{L}$. Then $\mathfrak{M} \models \varphi [s]$ is defined as follows:\index{$\models$} \begin{enumerate} \item If $\varphi$ is $t_1 = t_2$ for some terms $t_1$ and $t_2$, then $\mathfrak{M} \models \varphi [s]$ if and only if $\mathbf{s}(t_1) = \mathbf{s}(t_2)$. \item If $\varphi$ is $P t_1 \dots t_k$ for some $k$-place relation symbol $P$ and terms $t_1$, \dots, $t_k$, then $\mathfrak{M} \models \varphi [s]$ if and only if $(\mathbf{s}(t_1), \dots, \mathbf{s}(t_k)) \in P^{\mathfrak{M}}$, {\em i.e.\/} $P^{\mathfrak{M}}$ is true of $(\mathbf{s}(t_1), \dots, \mathbf{s}(t_k))$. \item If $\varphi$ is $(\lnot \psi)$ for some formula $\psi$, then $\mathfrak{M} \models \varphi [s]$ if and only if it is not the case that $\mathfrak{M} \models \psi [s]$. \item If $\varphi$ is $(\alpha \to \beta)$, then $\mathfrak{M} \models \varphi [s]$ if and only if $\mathfrak{M} \models \beta [s]$ whenever $\mathfrak{M} \models \alpha [s]$, {\em i.e.\/} unless $\mathfrak{M} \models \alpha [s]$ but not $\mathfrak{M} \models \beta [s]$. \item If $\varphi$ is $\forall x \, \delta$ for some variable $x$, then $\mathfrak{M} \models \varphi [s]$ if and only if for all $m \in |\mathfrak{M}|$, $\mathfrak{M} \models \delta [s(x|m)]$, where $s(x|m)$ is the assignment given by \begin{displaymath} s(x|m)(v_k) = \begin{cases} s(v_k) & \text{if $v_k$ is different from $x$} \\ m & \text{if $v_k$ is $x$.} \end{cases} \end{displaymath} \end{enumerate} If $\mathfrak{M} \models \varphi [s]$, we shall say that $\mathfrak{M}$ {\em satisfies $\varphi$ on assignment\/}\index{satisfies} $s$ or that $\varphi$ {\em is true in $\mathfrak{M}$ on assignment\/}\index{truth in a structure} $s$. 
We will often write $\mathfrak{M} \nmodels \varphi [s]$ if it is not the case that $\mathfrak{M} \models \varphi [s]$.\index{$\nmodels$} Also, if $\Gamma$ is a set of formulas of $\mathcal{L}$, we shall take $\mathfrak{M} \models \Gamma [s]$ to mean that $\mathfrak{M} \models \gamma [s]$ for every formula $\gamma$ in $\Gamma$ and say that $\mathfrak{M}$ {\em satisfies $\Gamma$ on assignment\/} $s$. Similarly, we shall take $\mathfrak{M} \nmodels \Gamma [s]$ to mean that $\mathfrak{M} \nmodels \gamma [s]$ for {\em some\/} formula $\gamma$ in $\Gamma$. \end{defn} Clauses 1 and 2 are pretty straightforward and clauses 3 and 4 are essentially identical to the corresponding parts of Definition~\ref{d:tras}. The key clause is 5, which says that $\forall$ should be interpreted as ``for all elements of the universe''. \begin{exmp} Let $\mathfrak{R}$ be the structure for $\mathcal{L}_F$ and $s$ the assignment for $\mathfrak{R}$ given in Example \ref{e:as}, and consider the formula $\forall v_1\, (= v_3 \cdot 0 v_1 \to = v_3 0)$ of $\mathcal{L}_F$. 
We can verify that $\mathfrak{R} \models \forall v_1\, (= v_3 \cdot 0 v_1 \to = v_3 0) \, [s]$ as follows: \[ \begin{aligned} \mbox{} &\mathfrak{R} \models \forall v_1\, (= v_3 \cdot 0 v_1 \to = v_3 0) \, [s] \\ \iff &\text{for all $a \in |\mathfrak{R}|$,\ } \mathfrak{R} \models (= v_3 \cdot 0 v_1 \to = v_3 0) \, [s(v_1|a)] \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $\mathfrak{R} \models = v_3 \cdot 0 v_1 \, [s(v_1|a)]$,} \\ & \;\;\; \text{then $\mathfrak{R} \models = v_3 0 \, [s(v_1|a)]$} \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $\mathbf{s}(v_1|a)(v_3) = \mathbf{s}(v_1|a)(\cdot 0 v_1)$,} \\ & \;\;\; \text{then $\mathbf{s}(v_1|a)(v_3) = \mathbf{s}(v_1|a)(0)$} \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $\mathbf{s}(v_3) = \mathbf{s}(v_1|a)(0) \cdot \mathbf{s}(v_1|a)(v_1)$, then $\mathbf{s}(v_3) = 0$} \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $s(v_3) = 0 \cdot a$, then $s(v_3) = 0$} \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $4 = 0 \cdot a$, then $4 = 0$} \\ \iff &\text{for all $a \in |\mathfrak{R}|$, if $4 = 0$, then $4 = 0$} \\ \end{aligned} \] \dots which last is true whether or not $4 = 0$ is true or false. \end{exmp} \begin{prob} \label{p:six4} Let $\mathfrak{N}$ be the structure for $\mathcal{L}_N$ in Problem \ref{pb:exas}. Let $p : V \to \mathbb{N}$ be defined by $p(v_{2k}) = k$ and $p(v_{2k+1}) = k$. Verify that \begin{enumerate} \item $\mathfrak{N} \models \forall w \, (\lnot Sw = 0) \, [p]$ and \item $\mathfrak{N} \nmodels \forall x \exists y \, x + y = 0 \, [p]$. \end{enumerate} \end{prob} \begin{prop} \label{p:six5} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$, $s$ is an assignment for $\mathfrak{M}$, $x$ is a variable, and $\varphi$ is a formula of a first-order language $\mathcal{L}$. Then $\mathfrak{M} \models \exists x\, \varphi [s]$ if and only if $\mathfrak{M} \models \varphi [s(x|m)]$ for some $m \in |\mathfrak{M}|$. 
\end{prop} Working with particular assignments is difficult but, while sometimes unavoidable, not always necessary. \begin{defn} \label{d:mod} \index{model} \index{true in a structure} \index{$\models$} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$, and $\varphi$ a formula of $\mathcal{L}$. Then $\mathfrak{M} \models \varphi$ if and only if $\mathfrak{M} \models \varphi [s]$ for every assignment $s : V \to |\mathfrak{M}|$ for $\mathfrak{M}$. We say that $\mathfrak{M}$ is a {\em model\/} of $\varphi$, or that $\varphi$ is {\em true\/} in $\mathfrak{M}$, if $\mathfrak{M} \models \varphi$. We will often write $\mathfrak{M} \nmodels \psi$ if it is not the case that $\mathfrak{M} \models \psi$. Similarly, if $\Gamma$ is a set of formulas, we will write $\mathfrak{M} \models \Gamma$ if $\mathfrak{M} \models \gamma$ for every formula $\gamma \in \Gamma$, and say that $\mathfrak{M}$ is a {\em model\/}\index{model} of $\Gamma$ or that $\mathfrak{M}$ {\em satisfies\/}\index{satisfies} $\Gamma$. A formula or set of formulas is {\em satisfiable\/}\index{satisfiable} if there is some structure $\mathfrak{M}$ which satisfies it. We will often write $\mathfrak{M} \nmodels \Gamma$ if it is not the case that $\mathfrak{M} \models \Gamma$.\index{$\nmodels$} \end{defn} \begin{note} $\mathfrak{M} \nmodels \varphi$ does {\em not\/} mean that for every assignment $s : V \to |\mathfrak{M}|$, it is not the case that $\mathfrak{M} \models \varphi [s]$. It only means that there is {\em some\/} assignment $r : V \to |\mathfrak{M}|$ for which $\mathfrak{M} \models \varphi [r]$ is not true. \end{note} \begin{prob} \label{p:ord} $\mathfrak{Q} = (\mathbb{Q},<)$ is a structure for $\mathcal{L}_O$. For each of the following formulas $\varphi$ of $\mathcal{L}_O$, determine whether or not $\mathfrak{Q} \models \varphi$.
\begin{enumerate} \item $\forall v_0\, \exists v_2\, v_0 < v_2$ \item $\exists v_1\, \forall v_3\, (v_1 < v_3 \to v_1 = v_3)$ \item $\forall v_4\, \forall v_5\, \forall v_6 (v_4 < v_5 \to (v_5 < v_6 \to v_4 < v_6))$ \end{enumerate} \end{prob} The following facts are counterparts of sorts for Proposition~\ref{p:tav}. Their point is that what a given assignment does with a given term or formula depends only on the assignment's values on the (free) variables of the term or formula. \begin{lem} \label{l:six7} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$, $t$ is a term of $\mathcal{L}$, and $r$ and $s$ are assignments for $\mathfrak{M}$ such that $r(x) = s(x)$ for every variable $x$ which occurs in $t$. Then $\mathbf{r}(t) = \mathbf{s}(t)$. \end{lem} \begin{prop} \label{p:six8} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$, $\varphi$ is a formula of $\mathcal{L}$, and $r$ and $s$ are assignments for $\mathfrak{M}$ such that $r(x) = s(x)$ for every variable $x$ which occurs free in $\varphi$. Then $\mathfrak{M} \models \varphi [r]$ if and only if $\mathfrak{M} \models \varphi [s]$. \end{prop} \begin{cor} \label{c:six9} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$ and $\sigma$ is a sentence of $\mathcal{L}$. Then $\mathfrak{M} \models \sigma$ if and only if there is some assignment $s : V \to |\mathfrak{M}|$ for $\mathfrak{M}$ such that $\mathfrak{M} \models \sigma [s]$. \end{cor} Thus sentences are true or false in a structure independently of any particular assignment. This does not necessarily make life easier when trying to verify whether a sentence is true in a structure -- try doing Problem~\ref{p:ord} again with the above results in hand -- but it does let us simplify things on occasion when proving things about sentences rather than formulas. We recycle a sense in which we used $\models$\index{$\models$} in propositional logic. 
\begin{defn} Suppose $\Gamma$ is a set of formulas of $\mathcal{L}$ and $\psi$ is a formula of $\mathcal{L}$. Then $\Gamma$ {\em implies\/}\index{implies} $\psi$, written as $\Gamma \models \psi$, if $\mathfrak{M} \models \psi$ whenever $\mathfrak{M} \models \Gamma$ for every structure $\mathfrak{M}$ for $\mathcal{L}$. Similarly, if $\Gamma$ and $\Delta$ are sets of formulas of $\mathcal{L}$, then $\Gamma$ {\em implies\/} $\Delta$, written as $\Gamma \models \Delta$, if $\mathfrak{M} \models \Delta$ whenever $\mathfrak{M} \models \Gamma$ for every structure $\mathfrak{M}$ for $\mathcal{L}$. We will usually write $\models \dots$ for $\emptyset \models \dots$. \end{defn} \begin{prop} \label{p:inf} Suppose $\alpha$ and $\beta$ are formulas of some first-order language. Then $\{\, (\alpha \to \beta),\, \alpha\, \} \models \beta$. \end{prop} \begin{prop} \label{p:six12} Suppose $\Sigma$ is a set of formulas and $\psi$ and $\rho$ are formulas of some first-order language. Then $\Sigma \cup \{\psi\} \models \rho$ if and only if $\Sigma \models (\psi \to \rho)$. \end{prop} \begin{defn} A formula $\psi$ of $\mathcal{L}$ is a {\em tautology\/}\index{tautology} if it is true in every structure, {\em i.e.\/} if $\models \psi$. $\psi$ is a {\em contradiction\/}\index{contradiction} if $\lnot \psi$ is a tautology, {\em i.e.\/} if $\models \lnot \psi$. \end{defn} For some trivial examples, let $\varphi$ be a formula of $\mathcal{L}$ and $\mathfrak{M}$ a structure for $\mathcal{L}$. Then $\mathfrak{M} \models \{ \varphi \}$ if and only if $\mathfrak{M} \models \varphi$, so it must be the case that $\{ \varphi \} \models \varphi$. It is also easy to check that $\varphi \to \varphi$ is a tautology and $\lnot (\varphi \to \varphi)$ is a contradiction. \begin{prob} \label{p:taut} Show that $\forall y\, y = y$ is a tautology and that $\exists y\, \lnot y = y$ is a contradiction. \end{prob} \begin{prob} \label{p:cont} Suppose $\varphi$ is a contradiction. 
Show that $\mathfrak{M} \models \varphi [s]$ is false for every structure $\mathfrak{M}$ and assignment $s : V \to |\mathfrak{M}|$ for $\mathfrak{M}$. \end{prob} \begin{prob} \label{p:six13} Show that a set of formulas $\Sigma$ is satisfiable if and only if there is no contradiction $\chi$ such that $\Sigma \models \chi$. \end{prob} The following fact is a counterpart of Proposition \ref{p:tif}. \begin{prop} \label{p:mif} Suppose $\mathfrak{M}$ is a structure for $\mathcal{L}$ and $\alpha$ and $\beta$ are sentences of $\mathcal{L}$. Then: \begin{enumerate} \item $\mathfrak{M} \models \lnot\alpha$ if and only if $\mathfrak{M} \nmodels \alpha$. \item $\mathfrak{M} \models \alpha \to \beta$ if and only if $\mathfrak{M} \models \beta$ whenever $\mathfrak{M} \models \alpha$. \item $\mathfrak{M} \models \alpha \lor \beta$ if and only if $\mathfrak{M} \models \alpha$ or $\mathfrak{M} \models \beta$. \item $\mathfrak{M} \models \alpha \land \beta$ if and only if $\mathfrak{M} \models \alpha$ and $\mathfrak{M} \models \beta$. \item $\mathfrak{M} \models \alpha \fromto \beta$ if and only if $\mathfrak{M} \models \alpha$ exactly when $\mathfrak{M} \models \beta$. \item $\mathfrak{M} \models \forall x\, \alpha$ if and only if $\mathfrak{M} \models \alpha$. \item $\mathfrak{M} \models \exists x\, \alpha$ if and only if there is some $m \in |\mathfrak{M}|$ so that $\mathfrak{M} \models \alpha\, [s(x|m)]$ for every assignment $s$ for $\mathfrak{M}$. \end{enumerate} \end{prop} \begin{prob} \label{p:mif2} How much of Proposition \ref{p:mif} must remain true if $\alpha$ and $\beta$ are not sentences? \end{prob} Recall that by Proposition \ref{p:exlan} a formula of a first-order language is also a formula of any extension of the language. The following relationship between extension languages and satisfiability will be needed later on. 
\begin{prop} \label{p:exsat} Suppose $\mathcal{L}$ is a first-order language, $\mathcal{L}'$ is an extension of $\mathcal{L}$, and $\Gamma$ is a set of formulas of $\mathcal{L}$. Then $\Gamma$ is satisfiable in a structure for $\mathcal{L}$ if and only if $\Gamma$ is satisfiable in a structure for $\mathcal{L}'$. \end{prop} One last bit of terminology\dots \begin{defn} \label{d:ax} \index{axiom} \index{theory} \index{$\text{Th}$} If $\mathfrak{M}$ is a structure for $\mathcal{L}$, then the {\em theory\/} of $\mathfrak{M}$ is just the set of all sentences of $\mathcal{L}$ true in $\mathfrak{M}$, {\em i.e.\/} \[ \text{Th}(\mathfrak{M}) = \{\, \tau \mid \tau \text{\ is a sentence and\ } \mathfrak{M} \models \tau \,\}. \] If $\Delta$ is a set of sentences and $\mathcal{S}$ is a collection of structures, then $\Delta$ is a set of (non-logical) {\it axioms\/} for $\mathcal{S}$ if for every structure $\mathfrak{M}$, $\mathfrak{M} \in \mathcal{S}$ if and only if $\mathfrak{M} \models \Delta$. \end{defn} \begin{exmp} Consider the sentence $\exists x\, \exists y\, ( (\lnot x = y) \land \forall z\, (z = x \lor z = y))$ of $\mathcal{L}_=$. Every structure of $\mathcal{L}_=$ satisfying this sentence must have exactly two elements in its universe, so $\{\, \exists x\, \exists y\, ( (\lnot x = y) \land \forall z\, (z = x \lor z = y)) \,\}$ is a set of non-logical axioms for the collection of sets of cardinality $2$: \[ \{\, \mathfrak{M} \mid \mathfrak{M} \text{\ is a structure for\ } \mathcal{L}_= \text{\ with exactly $2$ elements} \,\} \, . \] \end{exmp} \begin{prob} \label{p:six16} In each case, find a suitable language and a set of axioms in it for the given collection of structures. \begin{enumerate} \item Sets of size 3. \item Bipartite graphs. \item Commutative groups. \item Fields of characteristic 5. 
\end{enumerate} \end{prob}
%
% Chapter 7 of "A Problem Course in Mathematical Logic"
%
\chapter{Deductions} \label{ch:seven} Deductions in first-order logic are not unlike deductions in propositional logic. Of course, some changes are necessary to handle the various additional features of first-order logic, especially quantifiers. In particular, one of the new axioms requires a tricky preliminary definition. Roughly, the problem is that we need to know when we can replace occurrences of a variable in a formula by a term without letting any variable in the term get captured by a quantifier. Throughout this chapter, let $\mathcal{L}$ be a fixed arbitrary first-order language. Unless stated otherwise, all formulas will be assumed to be formulas of $\mathcal{L}$. \begin{defn} \label{d:subs} \index{substitutable} Suppose $x$ is a variable, $t$ is a term, and $\varphi$ is a formula. Then {\em $t$ is substitutable for $x$ in $\varphi$\/} is defined as follows: \begin{enumerate} \item If $\varphi$ is atomic, then $t$ is substitutable for $x$ in $\varphi$. \item If $\varphi$ is $(\lnot \psi)$, then $t$ is substitutable for $x$ in $\varphi$ if and only if $t$ is substitutable for $x$ in $\psi$. \item If $\varphi$ is $(\alpha \to \beta)$, then $t$ is substitutable for $x$ in $\varphi$ if and only if $t$ is substitutable for $x$ in $\alpha$ and $t$ is substitutable for $x$ in $\beta$. \item If $\varphi$ is $\forall y \, \delta$, then $t$ is substitutable for $x$ in $\varphi$ if and only if either \begin{enumerate} \item $x$ does not occur free in $\varphi$, or \item $y$ does not occur in $t$ and $t$ is substitutable for $x$ in $\delta$. \end{enumerate} \end{enumerate} \end{defn} For example, $x$ is always substitutable for itself in any formula $\varphi$ and $\varphi^x_x$ is just $\varphi$ (see Problem~\ref{p:subs}).
On the other hand, $y$ is not substitutable for $x$ in $\forall y\, x = y$ because if $x$ were to be replaced by $y$, the new instance of $y$ would be ``captured'' by the quantifier $\forall y$. This makes a difference to the truth of the formula. The truth of $\forall y\, x = y$ depends on the structure in which it is interpreted --- it's true if the universe has only one element and false otherwise --- but $\forall y\, y = y$ is a tautology by Problem~\ref{p:taut} so it is true in any structure whatsoever. This sort of difficulty makes it necessary to be careful when substituting for variables. \begin{defn} \label{d:subst} \index{substitution} Suppose $x$ is a variable, $t$ is a term, and $\varphi$ is a formula. If $t$ is substitutable for $x$ in $\varphi$, then $\varphi^x_t$ ({\em i.e.\/} $\varphi$ with $t$ substituted for $x$) is defined as follows:\index{$\varphi^x_t$} \begin{enumerate} \item If $\varphi$ is atomic, then $\varphi^x_t$ is the formula obtained by replacing each occurrence of $x$ in $\varphi$ by $t$. \item If $\varphi$ is $(\lnot \psi)$, then $\varphi^x_t$ is the formula $(\lnot \psi^x_t)$. \item If $\varphi$ is $(\alpha \to \beta)$, then $\varphi^x_t$ is the formula $(\alpha^x_t \to \beta^x_t)$. \item If $\varphi$ is $\forall y \, \delta$, then $\varphi^x_t$ is the formula \begin{enumerate} \item $\forall y \, \delta$ if $x$ is $y$, and \item $\forall y \, \delta^x_t$ if $x$ isn't $y$. \end{enumerate} \end{enumerate} \end{defn} \begin{prob} \label{p:subs} \begin{enumerate} \item Is $x$ substitutable for $z$ in $\psi$ if $\psi$ is $z = x \to \forall z\, z = x$? If so, what is $\psi^z_x$? \item Show that if $t$ is any term and $\sigma$ is a sentence, then $t$ is substitutable in $\sigma$ for any variable $x$. What is $\sigma^x_t$? \item Show that if $t$ is a term in which no variable occurs that occurs in the formula $\varphi$, then $t$ is substitutable in $\varphi$ for any variable $x$. 
\item Show that $x$ is substitutable for $x$ in $\varphi$ for any variable $x$ and any formula $\varphi$, and that $\varphi^x_x$ is just $\varphi$. \end{enumerate} \end{prob} Along with the notion of substitutability, we need an additional notion in order to define the logical axioms of $\mathcal{L}$. \begin{defn} \index{generalization} If $\varphi$ is any formula and $x_1$, \dots, $x_n$ are any variables, then $\forall x_1 \dots \forall x_n \, \varphi$ is said to be a {\em generalization\/} of $\varphi$. \end{defn} For example, $\forall y\, \forall x\, (x = y \to fx = fy)$ and $\forall z\, (x = y \to fx = fy)$ are (different) generalizations of $x = y \to fx = fy$, but $\forall x\, \exists y\, (x = y \to fx = fy)$ is not. Note that the variables being quantified don't have to occur in the formula being generalized. \begin{lem} \label{l:gen} Any generalization of a tautology is a tautology. \end{lem} \begin{defn} \label{d:axs} \index{axiom schema} Every first-order language $\mathcal{L}$ has eight {\em logical axiom schema\/}: \begin{description} \item[A1] $(\alpha \to (\beta \to \alpha))$ \index{A1} \item[A2] $((\alpha \to (\beta \to \gamma)) \to ((\alpha \to \beta) \to (\alpha \to \gamma)))$ \index{A2} \item[A3] $(((\lnot \beta)\to (\lnot \alpha)) \to (((\lnot \beta) \to \alpha) \to \beta))$ \index{A3} \item[A4] $(\forall x \, \alpha \to \alpha^x_t)$, if $t$ is substitutable for $x$ in $\alpha$. \index{A4} \item[A5] $(\forall x \, (\alpha \to \beta) \to (\forall x \, \alpha \to \forall x \, \beta))$ \index{A5} \item[A6] $(\alpha \to \forall x \, \alpha)$, if $x$ does not occur free in $\alpha$. \index{A6} \item[A7] $x = x$ \index{A7} \item[A8] $(x = y \to (\alpha \to \beta))$, if $\alpha$ is atomic and $\beta$ is obtained from $\alpha$ by replacing some occurrences (possibly all or none) of $x$ in $\alpha$ by $y$. 
\index{A8} \end{description} Plugging in any particular formulas of $\mathcal{L}$ for $\alpha$, $\beta$, and $\gamma$, any particular variables for $x$ and $y$, and any particular term for $t$, in any of A1--A8 gives a {\em logical axiom\/}\index{logical axiom}\index{axiom logical} of $\mathcal{L}$. In addition, any generalization of a logical axiom of $\mathcal{L}$ is also a logical axiom of $\mathcal{L}$. \end{defn} The reason for calling the instances of A1--A8 the logical axioms, instead of just axioms, is to avoid conflict with Definition~\ref{d:ax}. \begin{prob} \label{p:seven3} Determine whether or not each of the following formulas is a logical axiom. \begin{enumerate} \item $\forall x\, \forall z\, (x = y \to (x = c \to x = y))$ \item $x = y \to (y = z \to z = x)$ \item $\forall z\, (x = y \to (x = c \to y = c))$ \item $\forall w\, \exists x\, (Pwx \to Pww) \to \exists x\, (Pxx \to Pxx)$ \item $\forall x\, (\forall x\, c = fxc \to \forall x\, \forall x\, c = fxc)$ \item $(\exists x\, Px \to \exists y\, \forall z\, Rzfy) \to ( (\exists x\, Px \to \forall y\, \lnot \forall z\, Rzfy) \to \forall x\, \lnot Px)$ \end{enumerate} \end{prob} \begin{prop} \label{p:seven4} Every logical axiom is a tautology. \end{prop} Note that we have recycled our axiom schemas A1--A3 from propositional logic. We will also recycle MP as the sole rule of inference\index{rule of inference} for first-order logic. \begin{defn}[Modus Ponens] \index{Modus Ponens} Given the formulas $\varphi$ and $(\varphi \to \psi)$, one may infer $\psi$. \end{defn} As in propositional logic, we will usually refer to Modus Ponens by its initials, MP\index{MP}. That MP preserves truth in the sense of Chapter \ref{ch:six} follows from Proposition \ref{p:inf}. Using the logical axioms and MP, we can execute deductions in first-order logic just as we did in propositional logic. \begin{defn} \index{deduction} \index{proof} Let $\Delta$ be a set of formulas of the first-order language $\mathcal{L}$.
A {\em deduction\/} or {\em proof\/} from $\Delta$ in $\mathcal{L}$ is a finite sequence $\varphi_1 \varphi_2 \dots \varphi_n$ of formulas of $\mathcal{L}$ such that for each $k \le n$, \begin{enumerate} \item $\varphi_k$ is a logical axiom, or \item $\varphi_k \in \Delta$, or \item there are $i,j < k$ such that $\varphi_k$ follows from $\varphi_i$ and $\varphi_j$ by MP. \end{enumerate} A formula of $\Delta$ appearing in the deduction is usually referred to as a {\em premiss\/}\index{premiss} of the deduction. $\Delta$ {\em proves\/}\index{proves} a formula $\alpha$, written as $\Delta \proves \alpha$,\index{$\proves$} if $\alpha$ is the last formula of a deduction from $\Delta$. We'll usually write $\proves \alpha$ instead of $\emptyset \proves \alpha$. Finally, if $\Gamma$ and $\Delta$ are sets of formulas, we'll take $\Gamma \proves \Delta$ to mean that $\Gamma \proves \delta$ for every formula $\delta \in \Delta$. \end{defn} \begin{note} We have reused the axiom schema, the rule of inference, and the definition of deduction from propositional logic. It follows that any deduction of propositional logic can be converted into a deduction of first-order logic simply by replacing the formulas of $\mathcal{L}_P$ occurring in the deduction by first-order formulas. Feel free to appeal to the deductions in the exercises and problems of Chapter~\ref{ch:three}. {\em You should probably review the Examples and Problems of Chapter~\ref{ch:three} before going on, since most of the rest of this Chapter concentrates on what is {\em different\/} about deductions in first-order logic.\/} \end{note} \begin{exmp} \label{e:apf} We'll show that $\{ \alpha \} \proves \exists x\, \alpha$ for any first-order formula $\alpha$ and any variable $x$. 
\begin{enumerate} \item $(\forall x\, \lnot\alpha \to \lnot\alpha) \to (\alpha \to \lnot \forall x\, \lnot \alpha)$ \hfill Problem~\ref{p:prov}.5 \item $\forall x\, \lnot\alpha \to \lnot\alpha$ \hfill A4 \item $\alpha \to \lnot \forall x\, \lnot \alpha$ \hfill 1,2 MP \item $\alpha$ \hfill Premiss \item $\lnot \forall x\, \lnot \alpha$ \hfill 3,4 MP \item $\exists x\, \alpha$ \hfill Definition of $\exists$ \end{enumerate} Strictly speaking, the last line is just for our convenience, like $\exists$ itself. \end{exmp} \begin{prob} \label{p:deds} Show that: \begin{enumerate} \item $\proves \forall x\, \varphi \to \forall y\, \varphi^x_y$, if $y$ does not occur at all in $\varphi$. \item $\proves \alpha \lor \lnot \alpha$. \item $\{ c = d \} \proves \forall z\, Qazc \to Qazd$. \item $\proves x = y \to y = x$. \item $\{ \exists x\, \alpha \} \proves \alpha$ if $x$ does not occur free in $\alpha$. \end{enumerate} \end{prob} Many general facts about deductions can be recycled from propositional logic, including the Deduction Theorem. \begin{prop} \label{p:seven5a} If $\varphi_1 \varphi_2 \dots \varphi_n$ is a deduction of $\mathcal{L}$, then $\varphi_1 \dots \varphi_\ell$ is also a deduction of $\mathcal{L}$ for any $\ell$ such that $1 \le \ell \le n$. \end{prop} \begin{prop} \label{p:seven6} If $\Gamma \proves \delta$ and $\Gamma \proves \delta \to \beta$, then $\Gamma \proves \beta$. \end{prop} \begin{prop} \label{p:bim} If $\Gamma \subseteq \Delta$ and $\Gamma \proves \alpha$, then $\Delta \proves \alpha$. \end{prop} \begin{prop} \label{p:seven8} If $\Gamma \proves \Delta$ and $\Delta \proves \sigma$, then $\Gamma \proves \sigma$. \end{prop} \begin{thm}[Deduction Theorem] \label{t:fded} \index{Deduction Theorem} If $\Sigma$ is any set of formulas and $\alpha$ and $\beta$ are any formulas, then $\Sigma \proves \alpha \to \beta$ if and only if $\Sigma \cup \{ \alpha \} \proves \beta$.
\end{thm} Just as in propositional logic, the Deduction Theorem is useful because it often lets us take shortcuts when trying to show that deductions exist. There is also another result about first-order deductions which often supplies useful shortcuts. \begin{thm}[Generalization Theorem] \label{t:gen} \index{Generalization Theorem} Suppose $x$ is a variable, $\Gamma$ is a set of formulas in which $x$ does not occur free, and $\varphi$ is a formula such that $\Gamma \proves \varphi$. Then $\Gamma \proves \forall x \, \varphi$. \end{thm} \begin{thm}[Generalization On Constants] \label{t:genc} \index{Generalization On Constants} Suppose that $c$ is a constant symbol, $\Gamma$ is a set of formulas in which $c$ does not occur, and $\varphi$ is a formula such that $\Gamma \proves \varphi$. Then there is a variable $x$ which does not occur in $\varphi$ such that $\Gamma \proves \forall x \, \varphi^c_x$.\footnote{$\varphi^c_x$ is $\varphi$ with every occurrence of the constant $c$ replaced by $x$.} Moreover, there is a deduction of $\forall x \, \varphi^c_x$ from $\Gamma$ in which $c$ does not occur. \end{thm} \begin{exmp} We'll show that if $\varphi$ and $\psi$ are any formulas, $x$ is any variable, and $\proves \varphi \to \psi$, then $\proves \forall x\, \varphi \to \forall x\, \psi$. Since $x$ does not occur free in any formula of $\emptyset$, it follows from $\proves \varphi \to \psi$ by the Generalization Theorem that $\proves \forall x\, (\varphi \to \psi)$. But then \begin{enumerate} \item $\forall x\, (\varphi \to \psi)$ \hfill above \item $\forall x\, (\varphi \to \psi) \to (\forall x\, \varphi \to \forall x\, \psi)$ \hfill A5 \item $\forall x\, \varphi \to \forall x\, \psi$ \hfill 1,2 MP \end{enumerate} is the tail end of a deduction of $\forall x\, \varphi \to \forall x\, \psi$ from $\emptyset$. \end{exmp} \begin{prob} \label{p:seven12} Show that: \begin{enumerate} \item $\proves \forall x\, \forall y\, \forall z\, ( x = y \to (y = z \to x = z) )$.
\item $\proves \forall x\, \alpha \to \exists x\, \alpha$. \item $\proves \exists x \, \gamma \to \forall x\, \gamma$ if $x$ does not occur free in $\gamma$. \end{enumerate} \end{prob} We conclude with a bit of terminology. \begin{defn} \index{theory} \index{$\text{Th}$} If $\Sigma$ is a set of sentences, then the {\em theory\/} of $\Sigma$ is \[ \mathrm{Th}(\Sigma) = \{\, \tau \mid \tau \text{\ is a sentence and\ } \Sigma \proves \tau \,\}. \] \end{defn} That is, the theory of $\Sigma$ is just the collection of all sentences which can be proved from $\Sigma$.
%
% Chapter 8 of "A Problem Course in Mathematical Logic"
%
\chapter{Soundness and Completeness} \label{ch:eight} As with propositional logic, first-order logic had better satisfy the Soundness Theorem and it is desirable that it satisfy the Completeness Theorem. These theorems do hold for first-order logic. The Soundness Theorem is proved in a way similar to its counterpart for propositional logic, but the Completeness Theorem will require a fair bit of additional work.\footnote{This is not too surprising because of the greater complexity of first-order logic. Also, it turns out that first-order logic is about as powerful as a logic can get and still have the Completeness Theorem hold.} It is in this extra work that the distinction between formulas and sentences becomes useful. Let $\mathcal{L}$ be a fixed countable first-order language throughout this chapter. All formulas will be assumed to be formulas of $\mathcal{L}$ unless stated otherwise. First, we rehash many of the definitions and facts we proved for propositional logic in Chapter~\ref{ch:four} for first-order logic. \begin{thm}[Soundness Theorem] \label{t:fsnd} \index{Soundness Theorem} If $\alpha$ is a sentence and $\Delta$ is a set of sentences such that $\Delta \proves \alpha$, then $\Delta \models \alpha$.
\end{thm} \begin{defn} \index{consistent} \index{inconsistent} A set of sentences $\Gamma$ is {\em inconsistent\/} if $\Gamma \proves \lnot (\psi \to \psi)$ for some formula $\psi$, and is {\em consistent\/} if it is not inconsistent. \end{defn} Recall that a set of sentences $\Gamma$ is satisfiable if $\mathfrak{M} \models \Gamma$ for some structure $\mathfrak{M}$. \begin{prop} \label{p:sacon} If a set of sentences $\Gamma$ is satisfiable, then it is consistent. \end{prop} \begin{prop} \label{p:eight4} Suppose $\Delta$ is an inconsistent set of sentences. Then $\Delta \proves \psi$ for any formula $\psi$. \end{prop} \begin{prop} \label{p:eight5} Suppose $\Sigma$ is an inconsistent set of sentences. Then there is a finite subset $\Delta$ of $\Sigma$ such that $\Delta$ is inconsistent. \end{prop} \begin{cor} \label{c:eight6} A set of sentences $\Gamma$ is consistent if and only if every finite subset of $\Gamma$ is consistent. \end{cor} \begin{defn} \index{maximally consistent} \index{consistent maximally} A set of sentences $\Sigma$ is {\em maximally consistent} if $\Sigma$ is consistent but $\Sigma \cup \{\tau\}$ is inconsistent whenever $\tau$ is a sentence such that $\tau \notin \Sigma$. \end{defn} One quick way of finding examples of maximally consistent sets is given by the following proposition. \begin{prop} \label{p:smac} If $\mathfrak{M}$ is a structure, then $\text{Th}(\mathfrak{M})$ is a maximally consistent set of sentences. \end{prop} \begin{exmp} \label{e:maxcon} $\mathfrak{M} = \left( \{ 5 \} \right)$ is a structure for $\mathcal{L}_=$, so $\text{Th}(\mathfrak{M})$ is a maximally consistent set of sentences. Since it turns out that $\text{Th}(\mathfrak{M}) = \text{Th}\left( \{\, \forall x\, \forall y\, x=y \,\} \right)$, this also gives us an example of a set of sentences $\Sigma = \{\, \forall x\, \forall y\, x=y \,\}$ such that $\text{Th}(\Sigma)$ is maximally consistent. 
\end{exmp} \begin{prop} \label{p:eight8} If $\Sigma$ is a maximally consistent set of sentences, $\tau$ is a sentence, and $\Sigma \proves \tau$, then $\tau \in \Sigma$. \end{prop} \begin{prop} \label{p:eight9} Suppose $\Sigma$ is a maximally consistent set of sentences and $\tau$ is a sentence. Then $\lnot\tau \in \Sigma$ if and only if $\tau \notin \Sigma$. \end{prop} \begin{prop} \label{p:eight10} Suppose $\Sigma$ is a maximally consistent set of sentences and $\varphi$ and $\psi$ are any sentences. Then $\varphi \to \psi \in \Sigma$ if and only if $\varphi \notin \Sigma$ or $\psi \in \Sigma$. \end{prop} \begin{thm} \label{t:etmc} Suppose $\Gamma$ is a consistent set of sentences. Then there is a maximally consistent set of sentences $\Sigma$ with $\Gamma \subseteq \Sigma$. \end{thm} The counterparts of these notions and facts for propositional logic sufficed to prove the Completeness Theorem, but here we will need some additional tools. The basic problem is that instead of defining a suitable truth assignment from a maximally consistent set of formulas, we need to construct a suitable structure from a maximally consistent set of sentences. Unfortunately, structures for first-order languages are usually more complex than truth assignments for propositional logic. The following definition supplies the key new idea we will use to prove the Completeness Theorem. \begin{defn} \index{witnesses} Suppose $\Sigma$ is a set of sentences and $C$ is a set of (some of the) constant symbols of $\mathcal{L}$. Then $C$ is a {\em set of witnesses\/} for $\Sigma$ in $\mathcal{L}$ if for every formula $\varphi$ of $\mathcal{L}$ with at most one free variable $x$, there is a constant symbol $c \in C$ such that $\Sigma \proves \exists x\, \varphi \to \varphi^x_c$. \end{defn} The idea is that every element of the universe which $\Sigma$ proves must exist is named, or ``witnessed'', by a constant symbol in $C$. 
Note that if $\Sigma \proves \lnot \exists x\, \varphi$, then $\Sigma \proves \exists x\, \varphi \to \varphi^x_c$ for any constant symbol $c$. \begin{prop} \label{p:eight14} Suppose $\Gamma$ and $\Sigma$ are sets of sentences of $\mathcal{L}$, $\Gamma \subseteq \Sigma$, and $C$ is a set of witnesses for $\Gamma$ in $\mathcal{L}$. Then $C$ is a set of witnesses for $\Sigma$ in $\mathcal{L}$. \end{prop} \begin{exmp} \label{e:rw} Let $\mathcal{L}'_O$ be the first-order language with a single 2-place relation symbol, $<$, and countably many constant symbols, $c_q$ for each $q \in \mathbb{Q}$. Let $\Sigma$ include all the sentences \begin{enumerate} \item $c_p < c_q$, for every $p,q \in \mathbb{Q}$ such that $p < q$, \item $\forall x\, (\lnot x < x)$, \item $\forall x\, \forall y\, (x < y \lor x = y \lor y < x)$, \item $\forall x\, \forall y\, \forall z\, (x < y \to (y < z \to x < z))$, \item $\forall x\, \forall y\, (x < y \to \exists z\, (x < z \land z < y))$, \item $\forall x\, \exists y\, (x < y)$, and \item $\forall x\, \exists y\, (y < x)$. \end{enumerate} In effect, $\Sigma$ asserts that $<$ is a linear order on the universe (2--4) which is dense (5) and has no endpoints (6--7), and which has a suborder isomorphic to $\mathbb{Q}$ (1). Then $C = \{\, c_q \mid q \in \mathbb{Q} \,\}$ is a set of witnesses for $\Sigma$ in $\mathcal{L}'_O$. \end{exmp} In the example above, one can ``reverse-engineer'' a model for the set of sentences in question from the set of witnesses simply by letting the universe of the structure {\em be\/} the set of witnesses. One can also define the necessary relation interpreting $<$ in a pretty obvious way from $\Sigma$.\footnote{Note, however, that an isomorphic copy of $\mathbb{Q}$ is not the only structure for $\mathcal{L}'_O$ satisfying $\Sigma$. 
For example, $\mathfrak{R} = (\mathbb{R},<, q + \pi \colon q \in \mathbb{Q})$ will also satisfy $\Sigma$ if we interpret $c_q$ by $q + \pi$.} This example is obviously contrived: there are no constant symbols around which are not witnesses, $\Sigma$ proves that distinct constant symbols aren't equal to each other, there is little by way of non-logical symbols needing interpretation, and $\Sigma$ explicitly includes everything we need to know about $<$. In general, trying to build a model for a set of sentences $\Sigma$ in this way runs into a number of problems. First, how do we know whether $\Sigma$ has a set of witnesses at all? Many first-order languages have few or no constant symbols, after all. Second, if $\Sigma$ has a set of witnesses $C$, it's unlikely that we'll be able to get away with just letting the universe of the model be $C$. What if $\Sigma \proves c = d$ for some distinct witnesses $c$ and $d$? Third, how do we handle interpreting constant symbols which are not in $C$? Fourth, what if $\Sigma$ doesn't prove enough about whatever relation and function symbols exist to let us define interpretations of them in the structure under construction? (Imagine, if you like, that someone hands you a copy of Joyce's {\em Ulysses\/} and asks you to produce a complete road map of Dublin on the basis of the book. Even if it has no geographic contradictions, you are unlikely to find all the information in the novel needed to do the job.) Finally, even if $\Sigma$ does prove all we need to define functions and relations on the universe to interpret the function and relation symbols, just how do we do it? Getting around all these difficulties requires a fair bit of work. One can get around many by sticking to maximally consistent sets of sentences in suitable languages. \begin{lem} \label{l:eight12} Suppose $\Sigma$ is a set of sentences, $\varphi$ is any formula, and $x$ is any variable.
Then $\Sigma \proves \varphi$ if and only if $\Sigma \proves \forall x\, \varphi$. \end{lem} \begin{thm} \label{t:exmcw} Suppose $\Gamma$ is a consistent set of sentences of $\mathcal{L}$. Let $C$ be an infinite countable set of constant symbols which are {\em not\/} symbols of $\mathcal{L}$, and let $\mathcal{L}' = \mathcal{L} \cup C$ be the language obtained by adding the constant symbols in $C$ to the symbols of $\mathcal{L}$. Then there is a maximally consistent set $\Sigma$ of sentences of $\mathcal{L}'$ such that $\Gamma \subseteq \Sigma$ and $C$ is a set of witnesses for $\Sigma$. \end{thm} This theorem allows one to use a certain measure of brute force: No set of witnesses? Just add one! The set of sentences doesn't decide enough? Decide {\em everything\/} one way or the other! \begin{thm} \label{t:mfc} Suppose $\Sigma$ is a maximally consistent set of sentences and $C$ is a set of witnesses for $\Sigma$. Then there is a structure $\mathfrak{M}$ such that $\mathfrak{M} \models \Sigma$. \end{thm} The important part here is to define $\mathfrak{M}$ --- proving that $\mathfrak{M} \models \Sigma$ is tedious but fairly straightforward if you have the right definition. Proposition \ref{p:exsat} now lets us deduce the fact we really need. \begin{cor} \label{p:eight17} Suppose $\Gamma$ is a consistent set of sentences of a first-order language $\mathcal{L}$. Then there is a structure $\mathfrak{M}$ for $\mathcal{L}$ satisfying $\Gamma$. \end{cor} With the above facts in hand, we can rejoin our proof of Soundness and Completeness, already in progress: \begin{thm} \label{t:sacof} A set of sentences $\Sigma$ in $\mathcal{L}$ is consistent if and only if it is satisfiable. \end{thm} The rest works just like it did for propositional logic. \begin{thm}[Completeness Theorem] \label{t:fcmpl} \index{Completeness Theorem} If $\alpha$ is a sentence and $\Delta$ is a set of sentences such that $\Delta \models \alpha$, then $\Delta \proves \alpha$. 
\end{thm} It follows that in a first-order logic, as in propositional logic, a sentence is implied by some set of premisses if and only if it has a proof from those premisses. \begin{thm}[Compactness Theorem] \label{p:fcmpct} \index{Compactness Theorem} A set of sentences $\Delta$ is satisfiable if and only if every finite subset of $\Delta$ is satisfiable. \end{thm}
%
% Chapter 9 of "A Problem Course in Mathematical Logic"
%
\chapter{Applications of Compactness} \label{ch:nine} \index{Compactness Theorem, applications of} After wading through the preceding chapters, it should be obvious that first-order logic is, in principle, adequate for the job it was originally developed for: the essentially philosophical exercise of formalizing most of mathematics. As something of a bonus, first-order logic can supply useful tools for doing ``real'' mathematics. The Compactness Theorem is the simplest of these tools and glimpses of two ways of using it are provided below. \subsection*{From the finite to the infinite} Perhaps the simplest use of the Compactness Theorem is to show that if there exist arbitrarily large finite objects of some type, then there must also be an infinite object of this type. \begin{exmp} \label{e:com} We will use the Compactness Theorem to show that there is an infinite commutative group in which every element is of order $2$, {\em i.e.\/} such that $g \cdot g = e$ for every element $g$. Let $\mathcal{L}_G$\index{$\mathcal{L}_G$} be the first-order language with just two non-logical symbols: \begin{itemize} \item Constant symbol: $e$ \item 2-place function symbol: $\cdot$ \end{itemize} Here $e$ is intended to name the group's identity element and $\cdot$ the group operation.
Let $\Sigma$ be the set of sentences of $\mathcal{L}_G$ including: \begin{enumerate} \item The axioms for a commutative group: \begin{itemize} \item $\forall x\, x\cdot e = x$ \item $\forall x\, \exists y\, x \cdot y = e$ \item $\forall x\, \forall y\, \forall z\, x \cdot (y \cdot z) = (x \cdot y) \cdot z$ \item $\forall x\, \forall y\, y \cdot x = x \cdot y$ \end{itemize} \item A sentence which asserts that every element of the universe is of order $2$: \begin{itemize} \item $\forall x\, x \cdot x = e$ \end{itemize} \item For each $n \ge 2$, a sentence, $\sigma_n$, which asserts that there are at least $n$ different elements in the universe: \begin{itemize} \item $\exists x_1\, \dots \exists x_n\, ( (\lnot x_1 = x_2) \land (\lnot x_1 = x_3) \land \dots \land (\lnot x_{n-1} = x_n))$ \end{itemize} \end{enumerate} We claim that every finite subset of $\Sigma$ is satisfiable. The most direct way to verify this is to show how, given a finite subset $\Delta$ of $\Sigma$, to produce a model $\mathfrak{M}$ of $\Delta$. Let $n$ be the largest integer such that $\sigma_n \in \Delta \cup \{ \sigma_2 \}$ (Why is there such an $n$?) and choose an integer $k$ such that $2^k \ge n$. Define a structure $(G,\circ)$ for $\mathcal{L}_G$ as follows: \begin{itemize} \item $G = \{\, \langle a_\ell \mid 1 \le \ell \le k \rangle \mid a_\ell = 0 \text{\ or\ } 1 \,\}$ \item $\langle a_\ell \mid 1 \le \ell \le k \rangle \circ \langle b_\ell \mid 1 \le \ell \le k \rangle = \langle a_\ell + b_\ell \pmod 2 \mid 1 \le \ell \le k \rangle$ \end{itemize} That is, $G$ is the set of binary sequences of length $k$ and $\circ$ is coordinatewise addition modulo $2$ of these sequences. It is easy to check that $(G,\circ)$ is a commutative group with $2^k$ elements in which every element has order $2$. Hence $(G,\circ) \models \Delta$, so $\Delta$ is satisfiable. Since every finite subset of $\Sigma$ is satisfiable, it follows by the Compactness Theorem that $\Sigma$ is satisfiable. 
A model of $\Sigma$, however, must be an infinite commutative group in which every element is of order $2$. (To be sure, it is quite easy to build such a group directly; for example, by using coordinatewise addition modulo $2$ of infinite binary sequences.) \end{exmp} \begin{prob} \label{p:nine1} Use the Compactness Theorem to show that there is an infinite \begin{enumerate} \item bipartite graph, \item non-commutative group, and \item field of characteristic 3, \end{enumerate} and also give concrete examples of such objects. \end{prob} Most applications of this method, including the ones above, are not really interesting: it is usually more valuable, and often easier, to directly construct examples of the infinite objects in question rather than just show such must exist. Sometimes, though, the technique can be used to obtain a non-trivial result more easily than by direct methods. We'll use it to prove an important result from graph theory, Ramsey's Theorem. Some definitions first: \begin{defn} If $X$ is a set, let the set of unordered pairs of elements of $X$ be $[X]^2 = \{\, \{a,b\} \mid a,b \in X \text{\ and\ } a \ne b \,\}$. (See Definition~\ref{d:sed}.) \begin{enumerate} \item A {\em graph\/} \index{graph} is a pair $(V,E)$ such that $V$ is a non-empty set and $E \subseteq [V]^2$. Elements of $V$ are called {\em vertices\/}\index{vertex} of the graph and elements of $E$ are called {\em edges\/}\index{edge}. \item A {\em subgraph\/}\index{subgraph} of $(V,E)$ is a pair $(U,F)$, where $U \subset V$ and $F = E \cap [U]^2$. \item A subgraph $(U,F)$ of $(V,E)$ is a {\em clique\/}\index{clique} if $F = [U]^2$. \item A subgraph $(U,F)$ of $(V,E)$ is an {\em independent set\/}\index{independent set} if $F = \emptyset$. \end{enumerate} \end{defn} That is, a graph is some collection of vertices, some of which are joined to one another. A subgraph is just a subset of the vertices, together with all edges joining vertices of this subset in the whole graph. 
It is a clique if it happens that the original graph joined every vertex in the subgraph to all other vertices in the subgraph, and an independent set if it happens that the original graph joined none of the vertices in the subgraph to each other. The question of when a graph must have a clique or independent set of a given size is of some interest in many applications, especially in dealing with colouring problems. \begin{thm}[Ramsey's Theorem] \label{t:ram} \index{Ramsey's Theorem} For every $n \ge 1$ there is an integer $R_n$ such that any graph with at least $R_n$ vertices has a clique with $n$ vertices or an independent set with $n$ vertices. \end{thm} $R_n$\index{$R_n$} is the {\em $n$th Ramsey number\/}.\index{Ramsey number} It is easy to see that $R_1 = 1$ and $R_2 = 2$, but $R_3$ is already $6$, and $R_n$ grows very quickly as a function of $n$ thereafter. Ramsey's Theorem is fairly hard to prove directly, but the corresponding result for infinite graphs is comparatively straightforward. \begin{lem} \label{l:irt} \index{Infinite Ramsey's Theorem} \index{Ramsey's Theorem Infinite} If $(V,E)$ is a graph with infinitely many vertices, then it has an infinite clique or an infinite independent set. \end{lem} A relatively quick way to prove Ramsey's Theorem is to first prove its infinite counterpart, Lemma \ref{l:irt}, and then get Ramsey's Theorem out of it by way of the Compactness Theorem. (If you're an ambitious minimalist, you can try to do this using the Compactness Theorem for propositional logic instead!) \subsection*{Elementary equivalence and non-standard models} One of the common uses for the Compactness Theorem is to construct ``non-standard'' models\index{non-standard model} of the theories satisfied by various standard mathematical structures. Such a model satisfies all the same first-order sentences as the standard model, but differs from it in some way not expressible in the first-order language in question. 
This brings home one of the intrinsic limitations of first-order logic: it can't always tell essentially different structures apart. Of course, we need to define just what constitutes essential difference. \begin{defn} Suppose $\mathcal{L}$ is a first-order language and $\mathfrak{N}$ and $\mathfrak{M}$ are two structures for $\mathcal{L}$. Then $\mathfrak{N}$ and $\mathfrak{M}$ are: \begin{enumerate} \item {\em isomorphic\/},\index{isomorphism of structures} written as $\mathfrak{N} \cong \mathfrak{M}$, if there is a function $F \colon |\mathfrak{N}| \to |\mathfrak{M}|$ such that \begin{enumerate} \item $F$ is $1-1$ and onto, \item $F(c^{\mathfrak{N}}) = c^{\mathfrak{M}}$ for every constant symbol $c$ of $\mathcal{L}$, \item $F(f^{\mathfrak{N}}(a_1, \dots, a_k)) = f^{\mathfrak{M}}(F(a_1), \dots, F(a_k))$ for every $k$-place function symbol $f$ of $\mathcal{L}$ and elements $a_1, \dots, a_k \in |\mathfrak{N}|$, and \item $P^{\mathfrak{N}}(a_1, \dots, a_k)$ holds if and only if $P^{\mathfrak{M}}(F(a_1), \dots, F(a_k))$ holds for every $k$-place relation symbol $P$ of $\mathcal{L}$ and elements $a_1$, \dots, $a_k$ of $|\mathfrak{N}|$; \end{enumerate} and \item {\em elementarily equivalent\/},\index{elementary equivalence} \index{equivalence, elementary} written as $\mathfrak{N} \equiv \mathfrak{M}$, if $\text{\rm Th}(\mathfrak{N}) = \text{\rm Th}(\mathfrak{M})$, {\em i.e.\/} if $\mathfrak{N} \models \sigma$ if and only if $\mathfrak{M} \models \sigma$ for every sentence $\sigma$ of $\mathcal{L}$. \end{enumerate} \end{defn} That is, two structures for a given language are isomorphic if they are structurally identical and elementarily equivalent if no statement in the language can distinguish between them. Isomorphic structures are elementarily equivalent: \begin{prop} \label{p:nine4} Suppose $\mathcal{L}$ is a first-order language and $\mathfrak{N}$ and $\mathfrak{M}$ are structures for $\mathcal{L}$ such that $\mathfrak{N} \cong \mathfrak{M}$.
Then $\mathfrak{N} \equiv \mathfrak{M}$. \end{prop} However, as the following application of the Compactness Theorem shows, elementarily equivalent structures need not be isomorphic: \begin{exmp} Note that $\mathfrak{C} = (\mathbb{N})$ is an infinite structure for $\mathcal{L}_=$. Expand $\mathcal{L}_=$ to $\mathcal{L}_R$ by adding a constant symbol $c_r$ for every real number $r$, and let $\Sigma$ be the set of sentences of $\mathcal{L}_R$ including \begin{itemize} \item every sentence $\tau$ of $\text{\rm Th}(\mathfrak{C})$, {\em i.e.\/} such that $\mathfrak{C} \models \tau$, and \item $\lnot c_r = c_s$ for every pair of real numbers $r$ and $s$ such that $r \ne s$. \end{itemize} Every finite subset of $\Sigma$ is satisfiable. (Why?) Thus, by the Compactness Theorem, there is a structure $\mathfrak{U}'$ for $\mathcal{L}_R$ satisfying $\Sigma$, and hence $\text{\rm Th}(\mathfrak{C})$. The structure $\mathfrak{U}$ obtained by dropping the interpretations of all the constant symbols $c_r$ from $\mathfrak{U}'$ is then a structure for $\mathcal{L}_=$ which satisfies $\text{\rm Th}(\mathfrak{C})$. Note that $|\mathfrak{U}| = |\mathfrak{U}'|$ is at least as large as the set of all real numbers $\mathbb{R}$, since $\mathfrak{U}'$ requires a distinct element of the universe to interpret each constant symbol $c_r$ of $\mathcal{L}_R$. Since $\text{\rm Th}(\mathfrak{C})$ is a maximally consistent set of sentences of $\mathcal{L}_=$ by Proposition \ref{p:smac}, it follows from the above that $\mathfrak{C} \equiv \mathfrak{U}$. On the other hand, $\mathfrak{C}$ cannot be isomorphic to $\mathfrak{U}$ because there cannot be an onto map between a countable set, such as $\mathbb{N} = |\mathfrak{C}|$, and a set which is at least as large as $\mathbb{R}$, such as $|\mathfrak{U}|$. \end{exmp} In general, the method used above can be used to show that if a set of sentences in a first-order language has an infinite model, it has many different ones.
In $\mathcal{L}_=$ that is essentially all that can happen: \begin{prop} \label{p:nine5} Two structures for $\mathcal{L}_=$ are elementarily equivalent if and only if they are isomorphic or both infinite. \end{prop} \begin{prob} \label{p:nine6} Let $\mathfrak{N} = (\mathbb{N}, 0, 1, S, +, \cdot, E)$ be the standard structure for $\mathcal{L}_N$. Use the Compactness Theorem to show there is a structure $\mathfrak{M}$ for $\mathcal{L}_N$ such that $\mathfrak{N} \equiv \mathfrak{M}$ but not $\mathfrak{N} \cong \mathfrak{M}$. \end{prob} Note that because $\mathfrak{N}$ and $\mathfrak{M}$ both satisfy $\text{\rm Th}(\mathfrak{N})$, which is maximally consistent by Proposition \ref{p:smac}, there is absolutely no way of telling them apart in $\mathcal{L}_N$. \begin{prop} \label{p:nine7} Every model of $\text{\rm Th}(\mathfrak{N})$ which is {\em not\/} isomorphic to $\mathfrak{N}$ has \begin{enumerate} \item an isomorphic copy of $\mathfrak{N}$ embedded in it, \item an infinite number, {\em i.e.\/} one larger than all of those in the copy of $\mathfrak{N}$, and \item an infinite decreasing sequence. \end{enumerate} \end{prop} The apparent limitation of first-order logic that non-isomorphic structures may be elementarily equivalent can actually be useful. A non-standard model\index{non-standard model} may have features that make it easier to work with than the standard model one is really interested in. Since both structures satisfy exactly the same sentences, if one uses these features to prove that some statement expressible in the given first-order language is true about the non-standard structure, one gets for free that it must be true of the standard structure as well. A prime example of this idea is the use of non-standard models of the real numbers\index{non-standard models of the reals} containing infinitesimals (numbers which are infinitely small but different from zero) in some areas of analysis.
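To make the notion of an infinitesimal a little more concrete, here is one possible way to express it in terms of the field operations alone. This is only a sketch: the abbreviations $x < y$ and $\underline{n}$ are assumptions introduced here for illustration, not notation fixed elsewhere in this text. In the field of real numbers, $a < b$ holds exactly when $b - a$ is a non-zero square, so one may treat $x < y$ as an abbreviation for
\[ \exists z\, \left( (\lnot z = 0) \land x + z \cdot z = y \right) . \]
Writing $\underline{n}$ for the term $1 + \dots + 1$ with $n$ summands, an element $\varepsilon$ of a model of the theory of the real field could then be called a positive infinitesimal if
\[ 0 < \varepsilon \quad \text{and} \quad \underline{n} \cdot \varepsilon < 1 \quad \text{for every\ } n \ge 1 . \]
No real number satisfies all of these conditions at once, though any finitely many of them are satisfied by a sufficiently small positive real number.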
\begin{thm} \label{t:nsr} Let $\mathfrak{R} = (\mathbb{R}, 0, 1, +, \cdot)$ be the field of real numbers, considered as a structure for $\mathcal{L}_F$. Then there is a model of $\text{Th}(\mathfrak{R})$ which contains a copy of $\mathbb{R}$ and in which there is an infinitesimal\index{infinitesimal}. \end{thm} The non-standard models of the real numbers\index{non-standard models of the reals} actually used in analysis are usually obtained in more sophisticated ways in order to have more information about their internal structure. It is interesting to note that infinitesimals were the intuition behind calculus for Leibniz when it was first invented, but no one was able to put their use on a rigorous footing until Abraham Robinson did so in the early 1960s. \chapter*{Hints for Chapters 5--9}
%
% Hints for Chapter 5 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:five}} \begin{clue}{p:five1} Try to disassemble each string using Definition \ref{d:ter}. Note that some might be valid terms of more than one of the given languages. \end{clue} \begin{clue}{p:five2} This is similar to Problem \ref{p:lof}. \end{clue} \begin{clue}{p:fcmt} This is similar to Proposition \ref{p:foc}. \end{clue} \begin{clue}{p:five4} Try to disassemble each string using Definitions \ref{d:ter} and \ref{d:for}. Note that some might be valid formulas of more than one of the given languages. \end{clue} \begin{clue}{p:five5} This is just like Problem \ref{p:lrp}. \end{clue} \begin{clue}{p:five6} This is similar to Problem \ref{p:lof}. You may wish to use your solution to Problem \ref{p:five2}. \end{clue} \begin{clue}{p:five7} This is similar to Proposition \ref{p:foc}. \end{clue} \begin{clue}{p:for} You might want to rephrase some of the given statements to make them easier to formalize. \begin{enumerate} \item Look up associativity if you need to. \item ``There is an object such that every object is not in it.'' \item This should be easy. \item Ditto.
\item ``Any two things must be the same thing.'' \end{enumerate} \end{clue} \begin{clue}{p:fole} If necessary, don't hesitate to look up the definitions of the given structures. \begin{enumerate} \item Read the discussion at the beginning of the chapter. \item You really need only one non-logical symbol. \item There are two sorts of objects in a vector space, the vectors themselves and the scalars of the field, which you need to be able to tell apart. \end{enumerate} \end{clue} \begin{clue}{p:five10} Use Definition \ref{d:for} in the same way that Definition \ref{d:form} was used in Definition \ref{d:subf}. \end{clue} \begin{clue}{p:five11} The scope of a quantifier ought to be a certain subformula of the formula in which the quantifier occurs. \end{clue} \begin{clue}{p:five12} Check to see whether they satisfy Definition \ref{d:frv}. \end{clue} \begin{clue}{p:five13} Check to see which pairs satisfy Definition \ref{d:exlan}. \end{clue} \begin{clue}{p:exlan} Proceed by induction on the length of $\varphi$ using Definition \ref{d:for}. \end{clue} \begin{clue}{t:urt} This is similar to Theorem \ref{t:ur}. \end{clue} \begin{clue}{t:urf} This is similar to Theorem \ref{t:ur} and uses Theorem~\ref{t:urt}. \end{clue}
%
% Hints for Chapter 6 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:six}} \begin{clue}{p:six1} In each case, apply Definition \ref{d:str}. \begin{enumerate} \item This should be easy. \item Ditto. \item Invent objects which are completely different except that they happen to have the right number of the right kind of components. \end{enumerate} \end{clue} \begin{clue}{pb:exas} Figure out the relevant values of $s(v_n)$ and apply Definition \ref{d:exas}. \end{clue} \begin{clue}{p:eau} Suppose $\mathbf{s}$ and $\mathbf{r}$ both extend the assignment $s$. Show that $\mathbf{s}(t) = \mathbf{r}(t)$ by induction on the length of the term $t$.
\end{clue} \begin{clue}{p:six4} Unwind the formulas using Definition \ref{d:sat} to get informal statements whose truth you can determine. \end{clue} \begin{clue}{p:six5} Unwind the abbreviation $\exists$ and use Definition~\ref{d:sat}. \end{clue} \begin{clue}{p:ord} Unwind each of the formulas using Definitions~\ref{d:sat} and \ref{d:mod} to get informal statements whose truth you can determine. \end{clue} \begin{clue}{l:six7} This is much like Proposition \ref{p:eau}. \end{clue} \begin{clue}{p:six8} Proceed by induction on the length of the formula using Definition \ref{d:sat} and Lemma \ref{l:six7}. \end{clue} \begin{clue}{c:six9} How many free variables does a sentence have? \end{clue} \begin{clue}{p:inf} Use Definition \ref{d:sat}. \end{clue} \begin{clue}{p:taut} Unwind the sentences in question using Definition \ref{d:sat}. \end{clue} \begin{clue}{p:six12} Use Definitions \ref{d:sat} and \ref{d:mod}; the proof is similar in form to the proof of Proposition \ref{p:moto}. \end{clue} \begin{clue}{p:six13} Use Definitions \ref{d:sat} and \ref{d:mod}; the proof is similar in form to the proof for Problem \ref{p:sanc}. \end{clue} \begin{clue}{p:mif} Use Definitions \ref{d:sat} and \ref{d:mod} in each case, plus the meanings of our abbreviations. \end{clue} \begin{clue}{p:exsat} In one direction, you need to add appropriate objects to a structure; in the other, delete them. In both cases, you still have to verify that $\Gamma$ is still satisfied. \end{clue} \begin{clue}{p:six16} Here are some appropriate languages. \begin{enumerate} \item $\mathcal{L}_=$ \item Modify your language for graph theory from Problem~\ref{p:fole} by adding a 1-place relation symbol. \item Use your language for group theory from Problem~\ref{p:fole}. \item $\mathcal{L}_F$ \end{enumerate} \end{clue}
%
% Hints for Chapter 7 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:seven}} \begin{clue}{p:subs} \begin{enumerate} \item Use Definition \ref{d:subs}. \item Ditto.
\item Ditto. \item Proceed by induction on the length of the formula $\varphi$. \end{enumerate} \end{clue} \begin{clue}{l:gen} Use the definitions and facts about $\models$ from Chapter \ref{ch:six}. \end{clue} \begin{clue}{p:seven3} Check each case against the schema in Definition \ref{d:axs}. Don't forget that any generalization of a logical axiom is also a logical axiom. \end{clue} \begin{clue}{p:seven4} You need to show that any instance of the schemas A1--A8 is a tautology and then apply Lemma \ref{l:gen}. That each instance of schemas A1--A3 is a tautology follows from Proposition \ref{p:mif}. For A4--A8 you'll have to use the definitions and facts about $\models$ from Chapter \ref{ch:six}. \end{clue} \begin{clue}{p:deds} You may wish to appeal to the deductions that you made or were given in Chapter \ref{ch:three}. \begin{enumerate} \item Try using A4 and A6. \item You don't need A4--A8 here. \item Try using A4 and A8. \item A8 is the key; you may need it more than once. \item This is just A6 in disguise. \end{enumerate} \end{clue} \begin{clue}{p:seven5a} This is just like its counterpart for propositional logic. \end{clue} \begin{clue}{p:seven6} Ditto. \end{clue} \begin{clue}{p:bim} Ditto. \end{clue} \begin{clue}{p:seven8} Ditto. \end{clue} \begin{clue}{t:fded} Ditto. \end{clue} \begin{clue}{t:gen} Proceed by induction on the length of the shortest proof of $\varphi$ from $\Gamma$. \end{clue} \begin{clue}{t:genc} Ditto. \end{clue} \begin{clue}{p:seven12} As usual, don't take the following suggestions as gospel. \begin{enumerate} \item Try using A8. \item Start with Example \ref{e:apf}. \item Start with part of Problem \ref{p:deds}. \end{enumerate} \end{clue}
%
% Hints for Chapter 8 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:eight}} \begin{clue}{t:fsnd} This is similar to the proof of the Soundness Theorem for propositional logic, using Proposition \ref{p:inf} in place of Proposition \ref{p:snd}.
\end{clue} \begin{clue}{p:sacon} This is similar to its counterpart for propositional logic, Proposition \ref{p:stoc}. Use Proposition \ref{p:inf} instead of Proposition \ref{p:snd}. \end{clue} \begin{clue}{p:eight4} This is just like its counterpart for propositional logic. \end{clue} \begin{clue}{p:eight5} Ditto. \end{clue} \begin{clue}{c:eight6} Ditto. \end{clue} \begin{clue}{p:smac} This is a counterpart to Problem \ref{p:emc}; use Proposition \ref{p:sacon} instead of Proposition \ref{p:stoc} and Proposition \ref{p:mif} instead of Proposition \ref{p:tif}. \end{clue} \begin{clue}{p:eight8} This is just like its counterpart for propositional logic. \end{clue} \begin{clue}{p:eight9} Ditto. \end{clue} \begin{clue}{p:eight10} Ditto. \end{clue} \begin{clue}{t:etmc} This is much like its counterpart for propositional logic, Theorem \ref{t:exmc}. \end{clue} \begin{clue}{p:eight14} Use Proposition \ref{p:bim}. \end{clue} \begin{clue}{l:eight12} Use the Generalization Theorem for the hard direction. \end{clue} \begin{clue}{t:exmcw} This is essentially a souped-up version of Theorem~\ref{t:etmc}. To ensure that $C$ is a set of witnesses of the maximally consistent set of sentences, enumerate all the formulas $\varphi$ of $\mathcal{L}'$ with one free variable and take care of one at each step in the inductive construction. \end{clue} \begin{clue}{t:mfc} To construct the required structure, $\mathfrak{M}$, proceed as follows. Define an equivalence relation $\sim$ on $C$ by setting $c \sim d$ if and only if $c = d \in \Sigma$, and let $[c] = \{\, a \in C \mid a \sim c \,\}$ be the equivalence class of $c \in C$. The universe of $\mathfrak{M}$ will be $M = \{\, [c] \mid c \in C \,\}$. For each $k$-place function symbol $f$ define $f^{\mathfrak{M}}$ by setting $f^{\mathfrak{M}}([a_1], \dots, [a_k]) =[b]$ if and only if $fa_1\dots a_k = b$ is in $\Sigma$. Define the interpretations of constant symbols and relation symbols in a similar way.
You need to show that all these things are well-defined, and then show that $\mathfrak{M} \models \Sigma$. \end{clue} \begin{clue}{p:eight17} Expand $\Gamma$ to a maximally consistent set of sentences with a set of witnesses in a suitable extension of $\mathcal{L}$, apply Theorem \ref{t:mfc}, and then cut down the resulting structure to one for $\mathcal{L}$. \end{clue} \begin{clue}{t:sacof} One direction is just Proposition \ref{p:sacon}. For the other, use Corollary \ref{p:eight17}. \end{clue} \begin{clue}{t:fcmpl} This follows from Theorem \ref{t:sacof} in the same way that the Completeness Theorem for propositional logic followed from Theorem \ref{t:saco}. \end{clue} \begin{clue}{p:fcmpct} This follows from Theorem \ref{t:sacof} in the same way that the Compactness Theorem for propositional logic followed from Theorem \ref{t:saco}. \end{clue}
%
% Hints for Chapter 9 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:nine}} \begin{clue}{p:nine1} In each case, apply the trick used in Example~\ref{e:com}. For definitions and the concrete examples, consult texts on combinatorics and abstract algebra. \end{clue} \begin{clue}{t:ram} Suppose Ramsey's Theorem fails for some $n$. Use the Compactness Theorem to get a contradiction to Lemma \ref{l:irt} by showing there must be an infinite graph with no clique or independent set of size $n$. \end{clue} \begin{clue}{l:irt} Inductively define a sequence $a_0$, $a_1$, \dots, of vertices so that for every $n$, either it is the case that for all $k > n$ there is an edge joining $a_n$ to $a_k$ or it is the case that for all $k > n$ there is no edge joining $a_n$ to $a_k$. There will then be a subsequence of the sequence which is an infinite clique or a subsequence which is an infinite independent set. \end{clue} \begin{clue}{p:nine4} The key is to figure out how, given an assignment for one structure, one should define the corresponding assignment in the other structure.
After that, proceed by induction using the definition of satisfaction. \end{clue} \begin{clue}{p:nine5} When are two finite structures for $\mathcal{L}_=$ elementarily equivalent? \end{clue} \begin{clue}{p:nine6} In a suitable expanded language, consider $\text{\rm Th}(\mathfrak{N})$ together with the sentences $\exists x\, 0 + x = c$, $\exists x\, S0 + x = c$, $\exists x\, SS0 + x = c$, \dots \end{clue} \begin{clue}{p:nine7} Suppose $\mathfrak{M} \models \text{\rm Th}(\mathfrak{N})$ but is not isomorphic to $\mathfrak{N}$. \begin{enumerate} \item Consider the subset of $|\mathfrak{M}|$ given by $0^{\mathfrak{M}}$, $S^{\mathfrak{M}}(0^{\mathfrak{M}})$, $S^{\mathfrak{M}}(S^{\mathfrak{M}}(0^{\mathfrak{M}}))$, \dots \item If it didn't have one, it would be a copy of $\mathfrak{N}$. \item Start with an infinite number and work down. \end{enumerate} \end{clue} \begin{clue}{t:nsr} Expand $\mathcal{L}_F$ by throwing in a constant symbol for every real number, plus an extra one, and take it from there. \end{clue}
%
% Part III of "A Problem Course in Mathematical Logic"
%
\part{Computability}
%
% Tenth chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Turing Machines} \label{ch:ten} Of the various ways to formalize the notion of an ``effective method'', the most commonly used are the simple abstract computers called Turing machines,\index{machine, Turing}\index{Turing machine} which were introduced more or less simultaneously by Alan Turing and Emil Post in 1936.\footnote{Both papers are reprinted in \cite{DA:U}. Post's brief paper gives a particularly lucid informal description.} Like most real-life digital computers, Turing machines have two main parts, a processing unit and a memory (which doubles as the input/output device), which we will consider separately before seeing how they interact. The memory can be thought of as an infinite tape which is divided up into cells like the frames of a movie. The Turing machine proper is the processing unit.
It has a scanner\index{scanner} or head\index{head} which can read from or write to a single cell of the tape, and which can be moved to the left or right one cell at a time. \subsection*{Tapes} To keep things simple, in this chapter we will only allow Turing machines to read and write the symbols $0$ and $1$. (One symbol per cell!) Moreover, we will allow the tape to be infinite in only one direction. That these restrictions do not affect what a Turing machine can, in principle, compute follows from the results in the next chapter. \begin{defn} \label{d:tape} \index{tape} A {\em tape\/} is an infinite sequence $$ \mathbf{a} = a_0\, a_1\, a_2\, a_3 \dots $$ such that for each natural number $i$ the {\em cell\/} $a_i \in \{0,1\}$. The $i$th cell is said to be {\em blank\/}\index{blank cell}\index{cell, blank} if $a_i$ is $0$, and {\em marked\/}\index{marked cell}\index{cell, marked} if $a_i$ is $1$. \end{defn} A blank tape \index{tape, blank}\index{blank tape} is one in which every cell is $0$. \begin{exmp}\label{TM:tape} A blank tape looks like: $$ 000000000000000000000000 \cdots $$ The $0$th cell is the leftmost one, cell $1$ is the one immediately to the right, cell $2$ is the one immediately to the right of cell $1$, and so on. The following is a slightly more exciting tape: $$ 0101101110001000000000000000 \cdots $$ In this case, cell $1$ is marked ({\it i.e.\/} contains a $1$), as are cells $3$, $4$, $6$, $7$, $8$, and $12$; all the rest are blank ({\it i.e.\/} contain a $0$). \end{exmp} \begin{prob} \label{ten:tape} Write down tapes satisfying the following. \begin{enumerate} \item Entirely blank except for cells $3$, $12$, and $20$. \item Entirely marked except for cells $0$, $2$, and $3$. \item Entirely blank except that 1025 is written out in binary just to the right of cell $2$.
\end{enumerate} \end{prob} To keep track of which cell the Turing machine's scanner is at, plus which instruction the Turing machine is to execute next, we will usually attach additional information to our description of the tape. \begin{defn} \index{tape position} \index{position, tape} A {\em tape position\/} is a triple $(s,i,\mathbf{a})$, where $s$ and $i$ are natural numbers with $s > 0$, and $\mathbf{a}$ is a tape. Given a tape position $(s,i,\mathbf{a})$, we will refer to cell $i$ as the {\em scanned cell\/}\index{scanned cell}\index{cell, scanned} and to $s$ as the {\em state\/}\index{state}. \end{defn} Note that if $(s,i,\mathbf{a})$ is a tape position, then the corresponding Turing machine's scanner is presently reading $a_i$ (which is one of $0$ or $1$). \subsection*{Conventions for tapes} Unless stated otherwise, we will assume that all but finitely many cells of any given tape are blank, and that any cells not explicitly described or displayed are blank. We will usually depict as little of a tape as possible and omit the $\cdots$s we used above. Thus $$ 0101101110001 $$ represents the tape given in Example \ref{TM:tape}. In many cases we will also use $z^n$ to abbreviate $n$ consecutive copies of $z$, so the same tape could be represented by $$ 0101^201^30^31\, . $$ Similarly, if $\sigma$ is a finite sequence of elements of $\{0,1\}$, we may write $\sigma^n$ for the sequence consisting of $n$ copies of $\sigma$ stuck together end-to-end. For example, $(010)^3$ is short for $010010010$. In displaying tape positions we will usually underline the scanned cell and write $s$ to the left of the tape. For example, we would display the tape position using the tape from Example \ref{TM:tape} with cell $3$ being scanned and state $2$ as follows: $$ 2 \colon 010\underline{1}101110001 $$ Note that in this example, the scanner is reading a $1$.
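As a further illustration of these conventions, here is how to unpack an abbreviated tape position into the triple it stands for. The particular tape position is an arbitrary one chosen for this sketch and is not used elsewhere in the text. Consider $$ 3 \colon 0(10)^2\underline{1}1^20^21 \, . $$ Expanding the abbreviations, $0(10)^2$ fills cells $0$ through $4$ with $01010$, the underlined $1$ is in cell $5$, $1^2$ fills cells $6$ and $7$, $0^2$ fills cells $8$ and $9$, and the final $1$ is in cell $10$. Written out in full, this is $$ 3 \colon 01010\underline{1}11001 \, , $$ so it is the tape position $(3,5,\mathbf{a})$, where $\mathbf{a}$ is the tape in which cells $1$, $3$, $5$, $6$, $7$, and $10$ are marked and all other cells are blank. The state is $3$, and the scanner is reading the $1$ in cell $5$.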
\begin{prob} \label{ten:tppos} Using the tapes you gave in the corresponding part of Problem \ref{ten:tape}, write down tape positions satisfying the following conditions. \begin{enumerate} \item Cell $7$ being scanned and state $4$. \item Cell $4$ being scanned and state $3$. \item Cell $3$ being scanned and state $413$. \end{enumerate} \end{prob} \subsection*{Turing machines} The ``processing unit'' of a Turing machine is just a finite list of specifications describing what the machine will do in various situations. (Remember, this is an {\em abstract\/} computer\dots) The formal definition may not seem to amount to this at first glance. \begin{defn} \label{d:TM} \index{machine, Turing} \index{Turing machine} A {\em Turing machine\/} is a function $M$ such that for some natural number $n$, \begin{eqnarray*} \mathrm{dom}(M) & \subseteq & \{1,\dots,n\} \times \{0,1\} \\ & = & \{\, (s,b) \mid 1 \le s \le n \text{\ and\ } b \in \{0,1\} \,\} \end{eqnarray*} and \begin{eqnarray*} \mathrm{ran}(M) & \subseteq & \{0,1\} \times \{-1,1\} \times \{1,\dots,n\} \\ & = & \{\, (c,d,t) \mid c \in \{0,1\} \text{\ and\ } d \in \{-1,1\} \text{\ and\ }1 \le t \le n \,\}\, . \end{eqnarray*} Note that $M$ does not have to be defined for all possible pairs $$ (s,b) \in \{1,\dots,n\} \times \{0,1\} \, . $$ We will sometimes refer to a Turing machine simply as a {\em machine\/} \index{machine} or {\rm TM\/} \index{TM}. If $n \ge 1$ is least such that $M$ satisfies the definition above, we shall say that $M$ is an {\em $n$-state Turing machine\/}\index{$n$-state Turing machine}\index{Turing machine $n$-state} and that $\{1,\dots,n\}$ is the set of {\em states\/}\index{state} of $M$. \end{defn} Intuitively, we have a processing unit which has a finite list of basic instructions, the states, which it can execute. 
Given a combination of current state and the symbol marked in the currently scanned cell of the tape, the list specifies \begin{itemize} \item a symbol to be written in the currently scanned cell, overwriting the symbol being read, then \item a move of the scanner one cell to the left or right, and then \item the next instruction to be executed. \end{itemize} That is, $M(s,c) = (b,d,t)$ means that if our machine is in state $s$ ({\em i.e.\/} executing instruction number $s$) and the scanner is presently reading a $c$ in cell $i$, then the machine $M$ should \begin{itemize} \item set $a_i = b$ ({\em i.e.\/} write $b$ instead of $c$ in the scanned cell), then \item move the scanner to $a_{i+d}$ ({\em i.e.\/} move one cell left if $d = -1$ and one cell right if $d = 1$), and then \item enter state $t$ ({\em i.e.\/} go to instruction $t$). \end{itemize} If our processor isn't equipped to handle input $c$ for instruction $s$ ({\em i.e.\/} $M(s,c)$ is undefined), then the computation in progress will simply stop dead or {\it halt\/}.\index{halt, Turing machine}\index{Turing machine, halt} \begin{exmp}\label{TM:M} We will usually present Turing machines in the form of a table\index{Turing machine, table}\index{table, Turing machine}, with a row for each state and a column for each possible entry in the scanned cell. Instead of $-1$ and $1$, we will usually use $L$ and $R$ when writing such tables in order to make them more readable. Thus the table \begin{center} \mbox{ \begin{tabular}{c|c|c} $M$ & $0$ & $1$ \\ \hline $1$ & $1R2$ & $0R1$ \\ $2$ & $0L2$ & \end{tabular} } \end{center} \noindent defines a Turing machine $M$ with two states such that $M(1,0) = (1,1,2)$, $M(1,1) = (0,1,1)$, and $M(2,0) = (0,-1,2)$, but $M(2,1)$ is undefined. In this case $M$ has domain $\{\, (1,0),\, (1,1),\, (2,0) \,\}$ and range $\{\, (1,1,2),\, (0,1,1),\, (0,-1,2) \,\}$. 
If the machine $M$ were faced with the tape position $$ 1 \colon 010\underline{0}1111 \, , $$ it would, since it was in state $1$ while scanning a cell containing $0$, \begin{itemize} \item write a $1$ in the scanned cell, \item move the scanner one cell to the right, and \item go to state $2$. \end{itemize} This would give the new tape position $$ 2 \colon 0101\underline{1}111 \, . $$ Since $M$ doesn't know what to do on input $1$ in state $2$, it would then halt, ending the computation. \end{exmp} \begin{prob} \label{ten:TMs} In each case, give the table of a Turing machine $M$ meeting the given requirement. \begin{enumerate} \item $M$ has three states. \item $M$ changes $0$ to $1$ and {\em vice versa\/} in any cell it scans. \item $M$ is as simple as possible. How many possibilities are there here? \end{enumerate} \end{prob} \subsection*{Computations} Informally, a computation is a sequence of actions of a machine $M$ on a tape according to the rules above, starting with instruction $1$ and the scanner at cell $0$ on the given tape. A computation ends (or {\em halts\/}\index{halt, Turing machine}\index{Turing machine, halt}) when and if the machine encounters a tape position which it does not know what to do in. If it never halts, and doesn't {\it crash\/}\index{crash, Turing machine}\index{Turing machine, crash} by running the scanner off the left end of the tape\footnote{Be warned that most authors prefer to treat running the scanner off the left end of the tape as being just another way of halting. Halting with the scanner on the tape is more convenient, however, when putting together different Turing machines to make more complex ones.} either, the computation will never end. The formal definition makes all this seem much more formidable. \begin{defn} \label{TM:pc} Suppose $M$ is a Turing machine.
Then: \begin{itemize} \item If $p = (s,i,\mathbf{a})$ is a tape position and $M(s,a_i) = (b,d,t)$ is defined, then $\mathbf{M}(p) = (t,i+d,\mathbf{a}')$ is the {\em successor tape position\/}, \index{successor tape position} \index{tape position, successor} where $a'_i = b$ and $a'_j = a_j$ whenever $j \ne i$. \item A {\em partial computation\/} \index{partial computation} \index{computation, partial} with respect to $M$ is a sequence $p_1 p_2 \dots p_k$ of tape positions such that $p_{\ell+1} = \mathbf M(p_\ell)$ for each $\ell < k$. \item A partial computation $p_1 p_2 \dots p_k$ with respect to $M$ is a {\em computation\/}\index{computation} (with respect to $M$) with {\em input tape\/} \index{input tape} \index{tape, input} $\mathbf{a}$ if $p_1 = (1,0,\mathbf{a})$ and $\mathbf M(p_k)$ is undefined (and {\it not\/} because the scanner would run off the end of the tape). The {\em output tape\/} \index{output tape} \index{tape, output} of the computation is the tape of the final tape position $p_k$. \end{itemize} \end{defn} Note that a partial computation is a computation only if the Turing machine halts but doesn't crash in the final tape position. The requirement that it halt means that any computation can have only finitely many steps. Unless stated otherwise, we will assume that every partial computation on a given input begins in state $1$. We will often omit the ``partial'' when speaking of computations that might not strictly satisfy the definition of computation. \begin{exmp} Let's see the machine $M$ of Example \ref{TM:M} perform a computation. Our input tape will be $\mathbf{a} = 1 1 0 0$, that is, the tape which is entirely blank except that $a_0 = a_1 = 1$.
The initial tape position of the computation of $M$ with input tape $\mathbf{a}$ is: $$ 1 \colon \underline{1} 1 0 0 $$ The subsequent steps in the computation are: \begin{align*} & 1 \colon 0 \underline{1} 0 0 \\ & 1 \colon 0 0 \underline{0} 0 \\ & 2 \colon 0 0 1 \underline{0} \\ & 2 \colon 0 0 \underline{1} \end{align*} We leave it to the reader to check that this is indeed a partial computation with respect to $M$. Since $M(2,1)$ is undefined the process terminates at this point and this partial computation is therefore a computation. \end{exmp} \begin{prob} \label{ten:comp} Give the (partial) computation of the Turing machine $M$ of Example \ref{TM:M} starting in state $1$ with the input tape: \begin{enumerate} \item $\underline{0} 0$ \item $1 \underline{1} 0$ \item The tape with all cells marked and cell $5$ being scanned. \end{enumerate} \end{prob} \begin{prob} \label{ten:inps} For which possible input tapes does the partial computation of the Turing machine $M$ of Example \ref{TM:M} eventually terminate? Explain why. \end{prob} \begin{prob} \label{ten:mrks} Find a Turing machine that (eventually!) fills a blank input tape with the pattern $010110001011000101100\dots$. \end{prob} \begin{prob} \label{ten:runs} Find a Turing machine that never halts (or crashes), no matter what is on the tape. \end{prob} \subsection*{Building Turing Machines} It will be useful later on to have a library of Turing machines that manipulate blocks of $1$s in various ways, and very useful to be able to combine machines performing simpler tasks to perform more complex ones. \begin{exmp} \label{ex:STU} The Turing machine $S$ given below is intended to halt with output $01^k\underline{0}$ on input $\underline{0}1^k$, if $k>0$; that is, it just moves past a single block of $1$s without disturbing it.
\begin{center} \mbox{ \begin{tabular}{c|c|c} $S$ & $0$ & $1$ \\ \hline $1$ & $0R2$ & \\ $2$ & & $1R2$ \\ \end{tabular} } \end{center} Trace this machine's computation on, say, input $\underline{0}1^3$ to see how it works. The following machine, which is itself a variation on $S$, does the reverse of what $S$ does: on input $01^k\underline{0}$ it halts with output $\underline{0}1^k$. \begin{center} \mbox{ \begin{tabular}{c|c|c} $T$ & $0$ & $1$ \\ \hline $1$ & $0L2$ & \\ $2$ & & $1L2$ \\ \end{tabular} } \end{center} We can combine $S$ and $T$ into a machine $U$ which does nothing to a block of $1$s: given input $\underline{0}1^k$ it halts with output $\underline{0}1^k$. (Of course, a better way to do nothing is to really do nothing!) \begin{center} \mbox{ \begin{tabular}{c|c|c} $U$ & $0$ & $1$ \\ \hline $1$ & $0R2$ & \\ $2$ & $0L3$ & $1R2$ \\ $3$ & & $1L3$ \end{tabular} } \end{center} Note how the states of $T$ had to be renumbered to make the combination work. \end{exmp} \begin{exmp} \label{ex:P22} The Turing machine $P$ given below is intended to move a block of $1$s: on input $\underline{0}0^n1^k$, where $n \ge 0$ and $k > 0$, it halts with output $\underline{0}1^k$. \begin{center} \mbox{ \begin{tabular}{c|c|c} $P$ & $0$ & $1$ \\ \hline $1$ & $0R2$ & \\ $2$ & $1R3$ & $1L8$ \\ $3$ & $0R3$ & $0R4$ \\ $4$ & $0R7$ & $1L5$ \\ $5$ & $0L5$ & $1R6$ \\ $6$ & $1R3$ & \\ $7$ & $0L7$ & $1L8$ \\ $8$ & & $1L8$ \\ \end{tabular} } \end{center} Trace $P$'s computation on, say, input $\underline{0} 0^3 1^3$ to see how it works. Trace it on inputs $\underline{0} 1^2$ and $\underline{0}0^21$ as well to see how it handles certain special cases. \end{exmp} \begin{note} In both Examples \ref{ex:STU} and \ref{ex:P22} we do not really care what the given machines do on other inputs, so long as they perform as intended on the particular inputs we are concerned with.
\end{note} \begin{prob} \label{ten:combTMs} We can combine the machine $P$ of Example~\ref{ex:P22} with the machines $S$ and $T$ of Example~\ref{ex:STU} to get the following machine. \begin{center} \mbox{ \begin{tabular}{c|c|c} $R$ & $0$ & $1$ \\ \hline $1$ & $0R2$ & \\ $2$ & $0R3$ & $1R2$ \\ $3$ & $1R4$ & $1L9$ \\ $4$ & $0R4$ & $0R5$ \\ $5$ & $0R8$ & $1L6$ \\ $6$ & $0L6$ & $1R7$ \\ $7$ & $1R4$ & \\ $8$ & $0L8$ & $1L9$ \\ $9$ & $0L10$ & $1L9$ \\ $10$ & & $1L10$ \end{tabular} } \end{center} What task involving blocks of $1$s is this machine intended to perform? \end{prob} \begin{prob} \label{ten:mTMs} In each case, devise a Turing machine that: \begin{enumerate} \item Halts with output $\underline{0}1^4$ on input $\underline{0}$. \item Halts with output $01^n\underline{0}$ on input $\underline{0}0^n1$. \item Halts with output $\underline{0}1^{2n}$ on input $\underline{0}1^n$. \item Halts with output $\underline{0}(10)^n$ on input $\underline{0}1^n$. \item Halts with output $\underline{0}1^m$ on input $\underline{0}1^n01^m$ whenever $n,m > 0$. \item Halts with output $\underline{0}1^m01^n01^k$ on input $\underline{0}1^n01^k01^m$, if $n,m,k > 0$. \item Halts with output $\underline{0}1^m01^n01^k01^m01^n01^k$ on input $\underline{0}1^m01^n01^k$, if $n,m,k > 0$. \item On input $\underline{0}1^m01^n$, where $m,n >0$, halts with output $\underline{0}1$ if $m \ne n$ and output $\underline{0}11$ if $m = n$. \end{enumerate} It doesn't matter what the machine you define in each case may do on other inputs, so long as it does the right thing on the given one(s). \end{prob}
%
% Eleventh chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Variations and Simulations} \label{ch:eleven} The definition of a Turing machine given in Chapter \ref{ch:ten} is arbitrary in a number of ways, among them the use of the symbols $0$ and $1$, a single read-write scanner, and a single one-way infinite tape.
One could further restrict the definition we gave by allowing \begin{itemize} \item the machine to move the scanner\index{scanner} only to one of left or right in each state, \end{itemize} or expand it by allowing the use of \begin{itemize} \item any finite alphabet\index{alphabet} of at least two symbols, \item separate read and write heads\index{head separate}, \item multiple heads\index{head multiple}, \item two-way infinite tapes, \index{two-way infinite tape} \index{tape two-way infinite} \item multiple tapes, \index{tape multiple} \item two- and higher-dimensional tapes, \index{tape higher-dimensional} \end{itemize} or various combinations of these, among many other possibilities. We will construct a number of Turing machines that simulate others with additional features; this will show that various of the modifications mentioned above do not really change what the machines can compute. (In fact, none of them turn out to do so.) \begin{exmp} \label{sim:LR} Consider the following Turing machine: \begin{center} \mbox{ \begin{tabular}{c|c|c} $M$ & $0$ & $1$ \\ \hline $1$ & $1R2$ & $0L1$ \\ $2$ & $0L2$ & $1L1$ \end{tabular} } \end{center} Note that in state $1$, this machine may move the scanner to either the left or the right, depending on the contents of the cell being scanned. We will construct a Turing machine using the same alphabet that emulates the action of $M$ on any input, but which moves the scanner to only one of left or right in each state. There is no problem with state $2$ of $M$, by the way, because in state $2$ $M$ always moves the scanner to the left. The basic idea is to add some states to $M$ which replace part of the description of state $1$.
\begin{center} \mbox{ \begin{tabular}{c|c|c} $M'$ & $0$ & $1$ \\ \hline $1$ & $1R2$ & $0R3$ \\ $2$ & $0L2$ & $1L1$ \\ $3$ & $0L4$ & $1L4$ \\ $4$ & $0L1$ & \end{tabular} } \end{center} This machine is just like $M$ except that in state $1$ with input $1$, instead of moving the scanner to the left and going to state $1$, the machine moves the scanner to the right and goes to the new state $3$. States $3$ and $4$ do nothing between them except move the scanner two cells to the left without changing the tape, thus putting it where $M$ would have put it, and then entering state $1$, as $M$ would have. \end{exmp} \begin{prob} \label{p:eleven1} Compare the computations of the machines $M$ and $M'$ of Example \ref{sim:LR} on the input tapes \begin{enumerate} \item $0$ \item $011$ \end{enumerate} and explain why it is not necessary to define $M'$ for state $4$ on input $1$. \end{prob} \begin{prob} \label{p:eleven2} Explain in detail how, given an arbitrary Turing machine $M$, one can construct a machine $M'$ that simulates what $M$ does on any input, but which moves the scanner only to one of left or right in each state. \end{prob} It should be obvious that the converse, simulating a Turing machine that moves the scanner only to one of left or right in each state by an ordinary Turing machine, is easy to the point of being trivial. It is often very convenient to add additional symbols to the alphabet that Turing machines are permitted to use. For example, one might want to have special symbols to use as place markers in the course of a computation. (For a more spectacular application, see Example~\ref{sim:2way} below.) It is conventional to include $0$, the ``blank'' symbol, in an alphabet used by a Turing machine, but otherwise any finite set of symbols goes. \begin{prob} \label{p:eleven1a} How do you need to change Definitions \ref{d:tape} and \ref{d:TM} to define Turing machines using a finite alphabet $\Sigma$?
\end{prob} While allowing arbitrary alphabets is often convenient when designing a machine to perform some task, it doesn't actually change what can, in principle, be computed. \begin{exmp} \label{sim:alph} Consider the machine $W$ below which uses the alphabet $\{0,x,y,z\}$. \begin{center} \mbox{ \begin{tabular}{c|c|c|c|c} $W$ & $0$ & $x$ & $y$ & $z$\\ \hline $1$ & $0R1$ & $xR1$ & $0L2$ & $zR1$\\ \end{tabular} } \end{center} For example, on input $\underline{0}xzyxy$, $W$ will eventually halt with output $0x\underline{z}0xy$. Note that state $2$ of $W$ is used only to halt, so we don't bother to make a row for it on the table. To simulate $W$ with a machine $Z$ using the alphabet $\{0,1\}$, we first have to decide how to represent $W$'s tape. We will use the following scheme, arbitrarily chosen among a number of alternatives. Every cell of $W$'s tape will be represented by two consecutive cells of $Z$'s tape, with a $0$ on $W$'s tape being stored as $00$ on $Z$'s, an $x$ as $01$, a $y$ as $10$, and a $z$ as $11$. Thus, if $W$ had input tape $\underline{0}xzyxy$, the corresponding input tape for $Z$ would be $\underline{0}0 01 11 10 01 10$. Designing the machine $Z$ that simulates the action of $W$ on the representation of $W$'s tape is a little tricky. In the example below, each state of $W$ corresponds to a ``subroutine'' of states of $Z$ which between them read the information in each representation of a cell of $W$'s tape and take appropriate action.
\begin{center} \mbox{ \begin{tabular}{c|c|c} $Z$ & $0$ & $1$ \\ \hline $1$ & $0R2$ & $1R3$ \\ $2$ & $0L4$ & $1L6$ \\ $3$ & $0L8$ & $1L13$ \\ $4$ & $0R5$ & \\ $5$ & $0R1$ & \\ $6$ & $0R7$ & \\ $7$ & & $1R1$ \\ $8$ & & $0R9$ \\ $9$ & $0L10$ & \\ $10$ & $0L11$ & \\ $11$ & $0L12$ & $1L12$ \\ $12$ & $0L15$ & $1L15$ \\ $13$ & & $1R14$ \\ $14$ & & $1R1$ \end{tabular} } \end{center} States $1$--$3$ of $Z$ read the input for state $1$ of $W$ and then pass on control to subroutines handling each entry for state $1$ in $W$'s table. Thus states $4$--$5$ of $Z$ take action for state $1$ of $W$ on input $0$, states $6$--$7$ of $Z$ take action for state $1$ of $W$ on input $x$, states $8$--$12$ of $Z$ take action for state $1$ of $W$ on input $y$, and states $13$--$14$ take action for state $1$ of $W$ on input $z$. State $15$ of $Z$ does what state $2$ of $W$ does: nothing but halt. \end{exmp} \begin{prob} \label{p:eleven2a} Trace the (partial) computations of $W$, and their counterparts for $Z$, for the input $\underline{0}xzyxy$ for $W$. Why is the subroutine for state $1$ of $W$ on input $y$ so much longer than the others? How much can you simplify it? \end{prob} \begin{prob} \label{p:eleven3} Given a Turing machine $M$ with an arbitrary alphabet $\Sigma$, explain in detail how to construct a machine $N$ with alphabet $\{0,1\}$ that simulates $M$. \end{prob} Doing the converse of this problem, simulating a Turing machine with alphabet $\{0,1\}$ by one using an arbitrary alphabet, is pretty easy. To define Turing machines with two-way infinite tapes \index{two-way infinite tape} \index{tape, two-way infinite} we need only change Definition~\ref{d:tape}: instead of having tapes $\mathbf{a} = a_0 a_1a_2 \ldots$ indexed by $\mathbb{N}$, we let them be $\mathbf{b} = \ldots b_{-2} b_{-1} b_0 b_1 b_2 \ldots$ indexed by $\mathbb{Z}$. 
In defining computations for machines with two-way infinite tapes, we adopt the same conventions that we did for machines with one-way infinite tapes, such as having the scanner start off scanning cell $0$ on the input tape. The only real difference is that a machine with a two-way infinite tape cannot crash\index{crash} by running off the left end of the tape; it can only stop by halting.\index{halt} \begin{exmp} \label{sim:2way} Consider the following two-way infinite tape Turing machine with alphabet $\{0,1\}$: \begin{center} \mbox{ \begin{tabular}{c|c|c} $T$ & $0$ & $1$ \\ \hline $1$ & $1L1$ & $0R2$ \\ $2$ & $0R2$ & $1L1$ \\ \end{tabular} } \end{center} To emulate $T$ with a Turing machine $O$ that has a one-way infinite tape, we need to decide how to represent a two-way infinite tape on a one-way infinite tape. This is easier to do if we allow ourselves to use an alphabet for $O$ other than $\{0,1\}$, chosen with malice aforethought: \[ \{\, \tover{0}{\mathrm{S}},\, \tover{1}{\mathrm{S}},\, \tover{0}{0},\, \tover{0}{1},\, \tover{1}{0},\, \tover{1}{1} \,\} \] We can now represent the tape $\mathbf{a} = \ldots a_{-2} a_{-1} a_0 a_1 a_2 \ldots$ for $T$ by the tape $\mathbf{a}' = \tover{a_0}{\mathrm{S}}\, \tover{a_1}{a_{-1}}\, \tover{a_2}{a_{-2}}\, \ldots$ for $O$. In effect, this trick allows us to split $O$'s tape into two tracks, each of which accommodates half of the tape of $T$. To define $O$, we split each state of $T$ into a pair of states for $O$, one for the lower track and one for the upper track. One must take care to keep various details straight: when $O$ changes a ``cell'' on one track, it should not change the corresponding ``cell'' on the other track; directions are reversed on the lower track; one has to ``turn a corner'' moving past cell $0$; and so on.
\begin{center} \mbox{ \begin{tabular}{c|c|c|c|c|c|c|c} $O$ & $0$ & $\tover{0}{\mathrm{S}}$ & $\tover{0}{0}$ & $\tover{0}{1}$ & $\tover{1}{\mathrm{S}}$ & $\tover{1}{0}$ & $\tover{1}{1}$ \\ \hline $1$ & $\tover{1}{0}L1$ & $\tover{1}{\mathrm{S}}R3$ & $\tover{1}{0}L1$ & $\tover{1}{1}L1$ & $\tover{0}{\mathrm{S}}R2$ & $\tover{0}{0}R2$ & $\tover{0}{1}R2$ \\ $2$ & $\tover{0}{0}R2$ & $\tover{0}{\mathrm{S}}R2$ & $\tover{0}{0}R2$ & $\tover{0}{1}R2$ & $\tover{1}{\mathrm{S}}R3$ & $\tover{1}{0}L1$ & $\tover{1}{1}L1$ \\ $3$ & $\tover{0}{1}R3$ & $\tover{1}{\mathrm{S}}R3$ & $\tover{0}{1}R3$ & $\tover{0}{0}L4$ & $\tover{0}{\mathrm{S}}R2$ & $\tover{1}{1}R3$ & $\tover{1}{0}L4$ \\ $4$ & $\tover{0}{0}L4$ & $\tover{0}{\mathrm{S}}R2$ & $\tover{0}{0}L4$ & $\tover{0}{1}R3$ & $\tover{1}{\mathrm{S}}R3$ & $\tover{1}{0}L4$ & $\tover{1}{1}R3$ \\ \end{tabular} } \end{center} States $1$ and $3$ are the upper- and lower-track versions, respectively, of $T$'s state $1$; states $2$ and $4$ are the upper- and lower-track versions, respectively, of $T$'s state $2$. We leave it to the reader to check that $O$ actually does simulate $T$\dots \end{exmp} \begin{prob} \label{p:eleven4} Trace the (partial) computations of $T$, and their counterparts for $O$, for each of the following input tapes for $T$: \begin{enumerate} \item $\underline{0}$ ({\em i.e.\/} a blank tape) \item $1\underline{0}$ \item $\dots 111\underline{1}111 \dots$ ({\em i.e.\/} every cell marked with $1$) \end{enumerate} \end{prob} \begin{prob} \label{p:eleven5} Explain in detail how, given a Turing machine $N$ with alphabet $\Sigma$ and a two-way infinite tape, one can construct a Turing machine $P$ with a one-way infinite tape that simulates $N$. \end{prob} \begin{prob} \label{p:eleven5a} Explain in detail how, given a Turing machine $P$ with alphabet $\Sigma$ and a one-way infinite tape, one can construct a Turing machine $N$ with a two-way infinite tape that simulates $P$.
\end{prob} Combining the techniques we've used so far, we could simulate any Turing machine with a two-way infinite tape and arbitrary alphabet by a Turing machine with a one-way infinite tape and alphabet $\{0,1\}$. \begin{prob} \label{p:eleven6} Give a precise definition for Turing machines with two tapes. Explain how, given any such machine, one could construct a single-tape machine to simulate it. \end{prob} \begin{prob} \label{p:eleven7} Give a precise definition for Turing machines with two-dimensional tapes. Explain how, given any such machine, one could construct a single-tape machine to simulate it. \end{prob} These results, and others like them, imply that none of the variant types of Turing machines mentioned at the start of this chapter differ essentially in what they can, in principle, compute. In Chapter~\ref{ch:fourteen} we will construct a Turing machine that can simulate {\it any\/} (standard) Turing machine.
%
% Twelfth chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Computable and Non-Computable Functions} \label{ch:twelve} A lot of computational problems in the real world have to do with doing arithmetic, and any notion of computation that can't deal with arithmetic is unlikely to be of great use. \subsection*{Notation and conventions} To keep things as simple as possible, we will stick to computations involving the {\em natural numbers\/}\index{natural numbers}, {\em i.e.\/} the non-negative integers, the set of which is usually denoted by $\mathbb{N} = \{\, 0, 1, 2, \dots \,\}$.\index{$\mathbb{N}$} The set of all $k$-tuples $(n_1,\dots,n_k)$ of natural numbers is denoted by $\mathbb{N}^k$.\index{$\mathbb{N}^k$} For all practical purposes, we may take $\mathbb{N}^1$ to be $\mathbb{N}$ by identifying the $1$-tuple $(n)$ with the natural number $n$.
For $k \ge 1$, $f$ is a {\em $k$-place function\/} \index{$k$-place function} \index{function $k$-place} (from the natural numbers to the natural numbers), often written as $f \colon \mathbb{N}^k \to \mathbb{N}$,\index{$f \colon \mathbb{N}^k \to \mathbb{N}$} if it associates a value, $f(n_1,\dots,n_k)$, to each $k$-tuple $(n_1,n_2,\dots,n_k) \in \mathbb{N}^k$. Strictly speaking, though we will frequently forget to be explicit about it, we will often be working with $k$-place {\em partial functions\/} \index{partial function} \index{function partial} which might not be defined for all the $k$-tuples in $\mathbb{N}^k$. If $f$ is a $k$-place partial function, the {\em domain\/} \index{domain of a function} \index{function domain of} of $f$ is the set $$ \mathrm{dom}(f) = \{\, (n_1, \dots, n_k) \in \mathbb{N}^k \mid \text{$f(n_1, \dots, n_k)$ is defined} \,\} \, . $$ Similarly, the {\em range\/} \index{range of a function} of $f$ is the set $$ \mathrm{ran}(f) = \{\, f(n_1, \dots, n_k) \in \mathbb{N} \mid (n_1, \dots, n_k) \in \mathrm{dom}(f) \,\} \, . $$ In subsequent chapters we will also work with relations on the natural numbers. Recall that a {\em $k$-place relation\/} \index{$k$-place relation} \index{relation $k$-place} on $\mathbb{N}$ is formally a subset $P$ of $\mathbb{N}^k$; $P(n_1,\dots,n_k)$ is {\em true\/} if $(n_1,\dots,n_k) \in P$ and {\em false\/} otherwise. In particular, a $1$-place relation is really just a subset of $\mathbb{N}$. Relations and functions are closely related. All one needs to know about a $k$-place function $f$ can be obtained from the $(k+1)$-place relation $P_f$ given by $$ P_f(n_1,\dots,n_k,n_{k+1}) \iff f(n_1,\dots,n_k) = n_{k+1} \, . 
$$ Similarly, all one needs to know about the $k$-place relation $P$ can be obtained from its {\em characteristic function\/} \index{characteristic function} \index{relation characteristic function}: $$ \chi_P (n_1,\dots,n_k) = \begin{cases} 1 & \text{if $P(n_1,\dots,n_k)$ is true;} \\ 0 & \text{if $P(n_1,\dots,n_k)$ is false.} \end{cases} $$ The basic convention for representing natural numbers on the tape of a standard Turing machine is a slight variation of {\em unary notation\/} \index{unary notation}: $n$ is represented by $1^{n+1}$. (Why would using $1^n$ be a bad idea?) A $k$-tuple $(n_1, n_2, \ldots, n_k) \in \mathbb{N}^k$ will be represented by $1^{n_1 + 1} 0 1^{n_2 + 1} 0 \ldots 0 1^{n_k + 1}$, {\em i.e.\/} with the representations of the individual numbers separated by $0$s. This scheme is inefficient in its use of space --- compared to binary notation, for example --- but it is simple and can be implemented on Turing machines restricted to the alphabet $\{1\}$. \subsection*{Turing computable functions} With suitable conventions for representing the input and output of a function on the natural numbers on the tape of a Turing machine in hand, we can define what it means for a function to be computable by a Turing machine. \begin{defn} \label{D:TC} A $k$-place function $f$ is {\em Turing computable\/}, \index{Turing computable function} \index{function Turing computable} or just {\em computable\/}, \index{computable function} \index{function computable} if there is a Turing machine $M$ such that for any $k$-tuple $(n_1, \ldots, n_k) \in \mathrm{dom}(f)$ the computation of $M$ with input tape $\underline{0} 1^{n_1 + 1} 0 1^{n_2 + 1} \ldots 0 1^{n_k + 1}$ eventually halts with output tape $\underline{0} 1^{f(n_1,\dots,n_k) + 1}$. Such a machine $M$ is said to {\em compute\/} $f$.
\end{defn} Note that for a Turing machine $M$ to compute a function $f$, $M$ need only do the right thing on the right kind of input: what $M$ does in other situations does not matter. In particular, it does not matter what $M$ might do with a $k$-tuple which is not in the domain of $f$. \begin{exmp} \index{identity function} \index{function identity} \index{$i_{\mathbb{N}}$} The identity function $i_{\mathbb{N}} \colon \mathbb{N} \to \mathbb{N}$, {\em i.e.\/} $i_{\mathbb{N}}(n) = n$, is computable. It is computed by $M = \emptyset$, the Turing machine with an empty table that does absolutely nothing on any input. \end{exmp} \begin{exmp} \label{ex:projmach} The projection function $\pi^2_1 : \mathbb{N}^2 \to \mathbb{N}$ given by $\pi^2_1(n,m) = n$ is computed by the Turing machine: \begin{center} \mbox{ \begin{tabular}{c|c|c} $P^2_1$ & $0$ & $1$ \\ \hline $1$ & $0R2$ & \\ $2$ & $0R3$ & $1R2$ \\ $3$ & $0L4$ & $0R3$ \\ $4$ & $0L4$ & $1L5$ \\ $5$ & & $1L5$ \end{tabular} } \end{center} \noindent $P^2_1$ acts as follows: it moves to the right past the first block of $1$s without disturbing it, erases the second block of $1$s, and then returns to the left of the first block and halts. The projection function $\pi^2_2 : \mathbb{N}^2 \to \mathbb{N}$ given by $\pi^2_2(n,m) = m$ is also computable: the Turing machine $P$ of Example \ref{ex:P22} does the job. \end{exmp} \begin{prob} \label{C:c} Find Turing machines that compute the following functions and explain how they work. \begin{enumerate} \item $\mathsc{O}(n) = 0$. \index{$\mathsc{O}$} \item $\mathsc{S}(n) = n + 1$. \index{$\mathsc{S}$} \item $\mathsc{Sum}(n,m) = n + m$. \index{$\mathsc{Sum}$} \item $\mathsc{Pred}(n) = \begin{cases} n - 1 & n \ge 1 \\ 0 & n = 0 \end{cases}$. \index{$\mathsc{Pred}$} \item $\mathsc{Diff}(n,m) = \begin{cases} n - m & n \ge m \\ 0 & n < m \end{cases}$. \index{$\mathsc{Diff}$} \item $\pi^3_2 (p,q,r) = q$.
\item $\pi^k_i (a_1, \dots, a_i, \dots, a_k) = a_i$ \end{enumerate} \end{prob} We will consider methods for building functions computable by Turing machines out of simpler ones later on. \subsection*{A non-computable function} In the meantime, it is worth asking whether or not every function on the natural numbers is computable. No such luck! \begin{prob} \label{C:inc} Show that there is some $1$-place function $f : \mathbb{N} \to \mathbb{N}$ which is not computable by comparing the number of such functions to the number of Turing machines. \end{prob} The argument hinted at above is unsatisfying in that it tells us there is a non-computable function without actually producing an explicit example. We can have some fun on the way to one. \begin{defn}[Busy Beaver Competition] A machine $M$ is an {\em $n$-state entry in the busy beaver competition\/} \index{busy beaver competition} \index{busy beaver competition $n$-state entry} \index{$n$-state entry in busy beaver competition} if: \begin{itemize} \item $M$ has a two-way infinite tape and alphabet $\{1\}$ (see Chapter~\ref{ch:eleven}); \item $M$ has $n+1$ states, but state $n+1$ is used only for halting (so both $M(n+1,0)$ and $M(n+1,1)$ are undefined); \item $M$ eventually halts when given a blank input tape. \end{itemize} $M$'s {\em score\/} \index{score in busy beaver competition} \index{busy beaver competition score in} in the competition is the number of $1$'s on the output tape of its computation from a blank input tape. The greatest possible score of an $n$-state entry in the competition is denoted by $\Sigma(n)$. \end{defn} Note that there are only finitely many possible $n$-state entries in the busy beaver competition because there are only finitely many $(n+1)$-state Turing machines with alphabet $\{1\}$. Since there is at least one $n$-state entry in the busy beaver competition for every $n \ge 0$, it follows that $\Sigma(n)$ is well-defined for each $n \in \mathbb{N}$.
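To make the finiteness claim concrete, here is a rough count. It is only a sketch, assuming that the states of an $n$-state entry are exactly $1$, \dots, $n+1$ and that each table entry is either undefined or of the form $jDt$ with $j \in \{0,1\}$, $D \in \{L,R\}$, and $1 \le t \le n+1$. Each of the $2n$ table entries $M(s,c)$ with $1 \le s \le n$ and $c \in \{0,1\}$ can then be filled in at most $2 \cdot 2 \cdot (n+1) + 1 = 4n+5$ ways, so the number of possible tables is at most $$ (4n+5)^{2n} \, . $$ For instance, this gives at most $9^2 = 81$ tables when $n = 1$, and exactly one table, the empty one, when $n = 0$. Only some of these machines halt when given a blank input tape, so this is merely an upper bound on the number of $n$-state entries.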
\begin{exmp} $M = \emptyset$ is the {\em only\/} $0$-state entry in the busy beaver competition, so $\Sigma(0) = 0$. \end{exmp} \begin{exmp} \label{bb:exs} The machine $P$ given by \begin{center} \mbox{ \begin{tabular}{c|c|c} $P$ & $0$ & $1$ \\ \hline $1$ & $1R2$ & $1L2$ \\ $2$ & $1L1$ & $1L3$ \end{tabular} } \end{center} \noindent is a $2$-state entry in the busy beaver competition with a score of $4$, so $\Sigma(2) \ge 4$. \end{exmp} The function $\Sigma$ grows extremely quickly. It is known that $\Sigma(0) = 0$, $\Sigma(1) = 1$, $\Sigma(2) = 4$, $\Sigma(3) = 6$, and $\Sigma(4) = 13$. The value of $\Sigma(5)$ is still unknown, but must be quite large.\footnote{The best score known to the author by a $5$-state entry in the busy beaver competition is $4098$. One of the two machines achieving this score does so in a computation that takes over $40$ million steps! The other requires only $11$ million or so\dots} \begin{prob} \label{prob:BB} Show that: \begin{enumerate} \item The $2$-state entry given in Example \ref{bb:exs} actually scores $4$. \item $\Sigma(1) = 1$. \item $\Sigma(3) \ge 6$. \item $\Sigma(n) < \Sigma(n+1)$ for every $n \in \mathbb{N}$. \end{enumerate} \end{prob} \begin{prob} \label{prob:mBB} Devise as high-scoring $4$- and $5$-state entries in the busy beaver competition as you can. \end{prob} The serious point of the busy beaver competition is that the function $\Sigma$ is {\em not\/} a Turing computable function. \begin{prop} \label{p:twelve5} $\Sigma$ is not computable by any Turing machine. \end{prop} Anyone interested in learning more about the busy beaver competition should start by reading the paper \cite{TR:ONCF} in which it was first introduced. \subsection*{Building more computable functions} One of the most common methods for assembling functions from simpler ones in many parts of mathematics is composition. It turns out that compositions of computable functions are computable. 
\begin{defn} \index{composition} \index{function composition of} Suppose that $m,k \ge 1$, $g$ is an $m$-place function, and $h_1$, \dots, $h_m$ are $k$-place functions. Then the $k$-place function $f$ is said to be obtained from $g$, $h_1$, \dots, $h_m$ by {\em composition\/}, written as $$ f = g \circ (h_1, \dots, h_m) \, , $$ if for all $(n_1, \ldots, n_k) \in \mathbb N^k$, $$ f(n_1, \ldots, n_k) = g(h_1(n_1, \ldots, n_k), \ldots, h_m(n_1, \ldots, n_k)). $$ \end{defn} \begin{exmp} \label{e:con} The constant function $c^1_1$, where $c^1_1(n) = 1$ for all $n$, can be obtained by composition from the functions $\mathsc{S}$ and $\mathsc{O}$. For any $n \in \mathbb N$, $$ c^1_1(n) = (\mathsc{S} \circ \mathsc{O})(n) = \mathsc{S}(\mathsc{O}(n)) = \mathsc{S}(0) = 0 + 1 = 1 \, . $$ \end{exmp} \begin{prob} \label{p:cnstfns} \index{constant function}\index{function constant} Suppose $k \ge 1$ and $a \in \mathbb{N}$. Use composition to define the constant function $c^k_a$, where $c^k_a (n_1, \ldots, n_k) = a$ for all $(n_1, \ldots, n_k) \in \mathbb N^k$, from functions already known to be computable. \end{prob} \begin{prop} \label{p:compcomp} Suppose that $1 \le k$, $1 \le m$, $g$ is a Turing computable $m$-place function, and $h_1$, \dots, $h_m$ are Turing computable $k$-place functions. Then $g \circ (h_1, \dots, h_m)$ is also Turing computable. \end{prop} Starting with a small set of computable functions, and applying computable ways (such as composition) of building functions from simpler ones, we will build up a useful collection of computable functions. This will also provide a characterization of computable functions which does not mention any type of computing device. The ``small set of computable functions'' that will be the fundamental building blocks is infinite only because it includes all the projection functions. 
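Composition also applies when $g$ has more than one place. As an illustrative sketch (the name $\mathsc{Dbl}$ is introduced here only for the purpose of this example; $\mathsc{Sum}$ is the function of Problem~\ref{C:c}), take $m = 2$, $k = 2$, $g = \mathsc{Sum}$, and $h_1 = h_2 = \pi^2_1$, and let $\mathsc{Dbl} = \mathsc{Sum} \circ (\pi^2_1, \pi^2_1)$. Then, for all $(n_1, n_2) \in \mathbb{N}^2$, $$ \mathsc{Dbl}(n_1,n_2) = \mathsc{Sum}(\pi^2_1(n_1,n_2), \pi^2_1(n_1,n_2)) = \mathsc{Sum}(n_1,n_1) = 2n_1 \, . $$ Granting that $\mathsc{Sum}$ and $\pi^2_1$ are Turing computable, Proposition~\ref{p:compcomp} shows that $\mathsc{Dbl}$ is Turing computable as well.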
\begin{defn} \label{d:initfns} The following are the {\em initial functions\/}: \index{initial function} \index{function initial} \begin{itemize} \item $\mathsc{O}$, the $1$-place function such that $\mathsc{O}(n) = 0$ for all $n \in \mathbb{N}$; \index{$\mathsc{O}$} \item $\mathsc{S}$, the $1$-place function such that $\mathsc{S}(n) = n + 1$ for all $n \in \mathbb{N}$; \index{$\mathsc{S}$} and, \item for each $k \ge 1$ and $1 \le i \le k$, $\pi^k_i$, the $k$-place function such that $\pi^k_i (n_1, \ldots, n_k) = n_i$ for all $(n_1, \ldots, n_k) \in \mathbb{N}^k$. \index{$\pi^k_i$} \end{itemize} $\mathsc{O}$ is often referred to as the {\em zero function\/}, \index{zero function} \index{function zero} $\mathsc{S}$ is the {\em successor function\/}, \index{successor function} \index{function successor} and the functions $\pi^k_i$ are called the {\em projection functions\/}. \index{projection function} \index{function projection} \end{defn} Note that $\pi^1_1$ is just the identity function on $\mathbb{N}$. We have already shown, in Problem~\ref{C:c}, that all the initial functions are computable. It follows from Proposition~\ref{p:compcomp} that every function defined from the initial functions using composition (any number of times) is computable too. Since one can build relatively few functions from the initial functions using only composition\dots \begin{prop} \label{p:complin} Suppose $f$ is a $1$-place function obtained from the initial functions by finitely many applications of composition. Then there is a constant $c \in \mathbb{N}$ such that $f(n) \le n + c$ for all $n \in \mathbb{N}$. \end{prop} \dots in the next chapter we will add other methods of building functions to our repertoire that will allow us to build all computable functions from the initial functions. 
% % Thirteenth chapter of "A Problem Course in Mathematical Logic" % \chapter{Recursive Functions} \label{ch:thirteen} We will add two other methods of building computable functions from computable functions to composition, and show that one can use the three methods to construct all computable functions on $\mathbb{N}$ from the initial functions. \subsection*{Primitive recursion} The second of our methods is simply called recursion in most parts of mathematics and computer science. Historically, the term ``primitive recursion'' has been used to distinguish it from the other recursive method of defining functions that we will consider, namely unbounded minimalization. ... Primitive recursion boils down to defining a function inductively, using different functions to tell us what to do at the base and inductive steps. Together with composition, it suffices to build up just about all familiar arithmetic functions from the initial functions. \begin{defn} \index{primitive recursion}\index{recursion primitive}\index{function primitive recursion} Suppose that $k \ge 1$, $g$ is a $k$-place function, and $h$ is a $(k+2)$-place function. Let $f$ be the $(k+1)$-place function such that \begin{enumerate} \item $f(n_1, \ldots, n_k, 0) = g(n_1, \ldots, n_k)$ and \item $f(n_1, \ldots, n_k, m + 1) = h\left(n_1, \ldots, n_k, m, f(n_1, \ldots, n_k, m)\right)$ \end{enumerate} for every $(n_1, \ldots, n_k) \in \mathbb{N}^k$ and $m \in \mathbb{N}$. Then $f$ is said to be obtained from $g$ and $h$ by {\em primitive recursion\/}. \end{defn} That is, the initial values of $f$ are given by $g$, and the rest are given by $h$ operating on the given input and the preceding value of $f$. For a start, primitive recursion and composition let us define addition and multiplication from the initial functions.
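Unwinding the two clauses of the definition of primitive recursion for small values of the last argument may make the pattern clearer. The following is only an illustrative sketch, taking $k = 1$ and writing $n$ for $n_1$: \begin{align*} f(n,0) &= g(n) \\ f(n,1) &= h(n,0,f(n,0)) = h(n,0,g(n)) \\ f(n,2) &= h(n,1,f(n,1)) = h(n,1,h(n,0,g(n))) \\ f(n,3) &= h(n,2,f(n,2)) = h(n,2,h(n,1,h(n,0,g(n)))) \end{align*} In general, evaluating $f(n,m)$ in this way uses $g$ once and $h$ exactly $m$ times.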
\begin{exmp} $\mathsc{Sum}(n,m) = n + m$ is obtained by primitive recursion from the initial function $\pi^1_1$ and the composition $\mathsc{S} \circ \pi^3_3$ of initial functions as follows: \index{$\mathsc{Sum}$} \begin{itemize} \item $\mathsc{Sum}(n,0) = \pi^1_1(n)$; \item $\mathsc{Sum}(n,m+1) = (\mathsc{S} \circ \pi^3_3)(n,m,\mathsc{Sum}(n,m))$. \end{itemize} To see that this works, one can proceed by induction on $m$: At the base step, $m = 0$, we have $$ \mathsc{Sum}(n,0) = \pi^1_1(n) = n = n + 0 \, . $$ Assume that $m \ge 0$ and $\mathsc{Sum}(n,m) = n + m$. Then \begin{align*} \mathsc{Sum}(n,m+1) &= (\mathsc{S} \circ \pi^3_3)(n,m,\mathsc{Sum}(n,m)) \\ &= \mathsc{S}(\pi^3_3(n,m,\mathsc{Sum}(n,m))) \\ &= \mathsc{S}(\mathsc{Sum}(n,m)) \\ &= \mathsc{Sum}(n,m) + 1 \\ &= n + m + 1 \, , \end{align*} as desired. \end{exmp} As addition is to the successor function, so multiplication is to addition. \begin{exmp} $\mathsc{Mult}(n,m) = nm$ is obtained by primitive recursion from $\mathsc{O}$ and $\mathsc{Sum} \circ (\pi^3_3, \pi^3_1)$: \begin{itemize} \item $\mathsc{Mult}(n,0) = \mathsc{O}(n)$; \item $\mathsc{Mult}(n,m+1) = (\mathsc{Sum} \circ (\pi^3_3, \pi^3_1))(n,m,\mathsc{Mult}(n,m))$. \end{itemize} \index{$\mathsc{Mult}$} We leave it to the reader to check that this works. \end{exmp} \begin{prob} \label{pr:fns} Use composition and primitive recursion to obtain each of the following functions from the initial functions or other functions already obtained from the initial functions. \begin{enumerate} \item $\mathsc{Exp}(n,m) = n^m$ \index{$\mathsc{Exp}$} \item $\mathsc{Pred}(n)$ (defined in Problem \ref{C:c}) \index{$\mathsc{Pred}$} \item $\mathsc{Diff}(n,m)$ (defined in Problem \ref{C:c}) \index{$\mathsc{Diff}$} \item $\mathsc{Fact}(n) = n!$ \index{$\mathsc{Fact}$} \end{enumerate} \end{prob} \begin{prop} \label{p:thirteen5} Suppose $k \ge 1$, $g$ is a Turing computable $k$-place function, and $h$ is a Turing computable $(k+2)$-place function. 
If $f$ is obtained from $g$ and $h$ by primitive recursion, then $f$ is also Turing computable. \end{prop} \subsection*{Primitive recursive functions and relations} The collection of functions which can be obtained from the initial functions by (possibly repeatedly) using composition and primitive recursion is useful enough to have a name. \begin{defn} \index{primitive recursive function}\index{function primitive recursive} A function $f$ is {\em primitive recursive\/} if it can be defined from the initial functions by finitely many applications of the operations of composition and primitive recursion. \end{defn} So we already know that all the initial functions, addition, and multiplication, among others, are primitive recursive. \begin{prob} \label{p:thirteen6} Show that each of the following functions is primitive recursive. \begin{enumerate} \item For any $k \ge 0$ and primitive recursive $(k+1)$-place function $g$, the $(k+1)$-place function $f$ given by \begin{align*} f(n_1, \ldots, n_k,m) &= \prod_{i=0}^m g(n_1, \ldots, n_k, i) \\ &= g(n_1, \ldots, n_k, 0) \cdot \ldots \cdot g(n_1, \ldots, n_k, m) \, . \end{align*} \item For any constant $a \in \mathbb{N}$, $\chi_{\{a\}}(n) = \begin{cases} 0 & n \ne a \\ 1 & n = a \, . \end{cases}$ \item $h(n_1, \ldots, n_k) = \begin{cases} f(n_1, \ldots, n_k) & (n_1, \ldots, n_k) \ne (c_1, \ldots, c_k) \\ a & (n_1, \ldots, n_k) = (c_1, \ldots, c_k) \end{cases}$, if $f$ is a primitive recursive $k$-place function and $a, c_1, \dots, c_k \in \mathbb{N}$ are constants. \end{enumerate} \end{prob} \begin{thm} \label{t:thirteen7} Every primitive recursive function is Turing computable. \end{thm} Be warned, however, that there are computable functions which are not primitive recursive. We can extend the idea of ``primitive recursive'' to relations by using their characteristic functions. \begin{defn} \index{relation primitive recursive} \index{primitive recursive relation} Suppose $k \ge 1$.
A $k$-place relation $P \subseteq \mathbb N^k$ is {\em primitive recursive\/} if its characteristic function $$ \chi_P(n_1,\dots,n_k) = \begin{cases} 1 & (n_1,\dots,n_k) \in P \\ 0 & (n_1,\dots,n_k) \notin P \end{cases} $$ is primitive recursive. \end{defn} \begin{exmp} $P = \{2\} \subset \mathbb N$ is primitive recursive since $\chi_{\{2\}}$ is primitive recursive by Problem \ref{p:thirteen6}. \end{exmp} \begin{prob} \label{r:rfs} Show that the following relations and functions are primitive recursive. \begin{enumerate} \item $\lnot P$, {\em i.e.\/} $\mathbb{N}^k \setminus P$, if $P$ is a primitive recursive $k$-place relation. \index{$\lnot P$} \index{$\mathbb{N}^k \setminus P$} \item $P \lor Q$, {\em i.e.\/} $P \cup Q$, if $P$ and $Q$ are primitive recursive $k$-place relations. \index{$P \lor Q$} \index{$P \cup Q$} \item $P \land Q$, {\em i.e.\/} $P \cap Q$, if $P$ and $Q$ are primitive recursive $k$-place relations. \index{$P \land Q$} \index{$P \cap Q$} \item $\mathsc{Equal}$, where $\mathsc{Equal}(n,m) \iff n = m$. \index{$\mathsc{Equal}$} \item $h(n_1,\dots,n_k,m) = \sum_{i=0}^m g(n_1,\dots,n_k,i)$, for any $k \ge 0$ and primitive recursive $(k+1)$-place function $g$. \item $\mathsc{Div}$, where $\mathsc{Div}(n,m) \iff n \mid m$. \index{$\mathsc{Div}$} \item $\mathsc{IsPrime}$, where $\mathsc{IsPrime}(n) \iff n \text{\ is prime}$. \index{$\mathsc{IsPrime}$} \item $\mathsc{Prime}(k) = p_k$, where $p_0 = 1$ and $p_k$ is the $k$th prime if $k \ge 1$. \index{$\mathsc{Prime}$} \item $\mathsc{Power}(n,m) = k$, where $k \ge 0$ is maximal such that $n^k \mid m$. \index{$\mathsc{Power}$} \item $\mathsc{Length}(n) = \ell$, where $\ell$ is maximal such that $p_\ell \mid n$. \index{$\mathsc{Length}$} \item $\mathsc{Element}(n,i) = n_i$, if $n = p_1^{n_1}\dots p_k^{n_k}$ (and $n_i = 0$ if $i > k$).
\index{$\mathsc{Element}$} \item $\mathsc{Subseq}(n,i,j) = \begin{cases} p_i^{n_i} p_{i+1}^{n_{i+1}} \dots p_j^{n_j} & \text{if\ } 1 \le i \le j \le k \\ 0 & \text{otherwise} \end{cases}$, \index{$\mathsc{Subseq}$} whenever $n = p_1^{n_1}\dots p_k^{n_k}$. \item $\mathsc{Concat}(n,m) = p_1^{n_1} \dots p_k^{n_k} p_{k+1}^{m_1} \dots p_{k+\ell}^{m_\ell}$, if $n = p_1^{n_1}\dots p_k^{n_k}$ and $m = p_1^{m_1} \dots p_\ell^{m_\ell}$. \end{enumerate} \end{prob} Parts of Problem \ref{r:rfs} give us tools for representing finite sequences of integers by single integers, as well as some tools for manipulating these representations. This lets us reduce, in principle, all problems involving primitive recursive functions and relations to problems involving only $1$-place primitive recursive functions and relations. \begin{thm} \label{r:cd} A $k$-place function $g$ is primitive recursive if and only if the $1$-place function $h$ given by $h(n) = g(n_1, \dots, n_k)$ if $n = p_1^{n_1}\dots p_k^{n_k}$ is primitive recursive. \end{thm} \begin{note} It doesn't matter what the function $h$ may do on an $n$ which does not represent a sequence of length $k$. \end{note} \begin{cor} \label{c:thirteen10} A $k$-place relation $P$ is primitive recursive if and only if the $1$-place relation $P'$ is primitive recursive, where $$ (n_1, \dots, n_k) \in P \iff p_1^{n_1}\dots p_k^{n_k} \in P' \, . $$ \end{cor} \subsection*{A computable but not primitive recursive function} While primitive recursion and composition do not quite suffice to build all Turing computable functions from the initial functions, they are powerful enough that specific counterexamples are not all that easy to find.
\begin{exmp}[Ackermann's Function] \index{Ackermann's Function} \index{$\mathsc{A}$} \index{$\alpha$} \label{pr:ack} Define the $2$-place function $\mathsc{A}$ as follows: \begin{itemize} \item $\mathsc{A}(0,\ell) = \mathsc{S}(\ell)$ \item $\mathsc{A}(\mathsc{S}(k),0) = \mathsc{A}(k,1)$ \item $\mathsc{A}(\mathsc{S}(k),\mathsc{S}(\ell)) = \mathsc{A}(k,\mathsc{A}(\mathsc{S}(k),\ell))$ \end{itemize} Given $\mathsc{A}$, define the $1$-place function $\alpha$ by $\alpha(n) = \mathsc{A}(n,n)$. It isn't too hard to show that $\mathsc{A}$, and hence also $\alpha$, are Turing computable. However, though it takes considerable effort to prove it, $\alpha$ grows faster with $n$ than any primitive recursive function. (Try working out the first few values of $\alpha$\dots) \end{exmp} \begin{prob} \label{p:thirteen11} Show that the functions $\mathsc{A}$ and $\alpha$ defined in Example \ref{pr:ack} are Turing computable. \end{prob} If you are very ambitious, you can try to prove the following theorem. \begin{thm} \label{t:thirteen12} Suppose $\alpha$ is the function defined in Example \ref{pr:ack} and $f$ is any primitive recursive function. Then there is an $n \in \mathbb{N}$ such that for all $k > n$, $\alpha(k) > f(k)$. \end{thm} \begin{cor} \label{c:thirteen13} The function $\alpha$ defined in Example \ref{pr:ack} is not primitive recursive. \end{cor} \noindent\dots but if you aren't, you can still try the following exercise. \begin{prob} \label{p:thirteen14} Informally, define a computable function which must be different from every primitive recursive function. \end{prob} \subsection*{Unbounded minimalization} The last of our three methods of building computable functions from computable functions is unbounded minimalization. The functions which can be defined from the initial functions using unbounded minimalization, as well as composition and primitive recursion, turn out to be precisely the Turing computable functions.
Unbounded minimalization is the counterpart for functions of ``brute force'' algorithms that try every possibility until they succeed. (Which, of course, they might not\dots) \begin{defn} \index{unbounded minimalization} \index{minimalization unbounded} \index{function unbounded minimalization of} Suppose $k \ge 1$ and $g$ is a $(k+1)$-place function. Then the {\em unbounded minimalization\/} of $g$ is the $k$-place function $f$ defined by $$ f(n_1, \ldots, n_k) = m \text{ where $m$ is least so that $g(n_1, \ldots, n_k,m) = 0$.} $$ This is often written as $f(n_1, \ldots, n_k) = \mu m [g(n_1, \ldots, n_k, m) = 0]$. \end{defn} \begin{note} If there is no $m$ such that $g(n_1, \ldots, n_k,m) = 0$, then the unbounded minimalization of $g$ is not defined on $(n_1, \ldots, n_k)$. This is one reason we will occasionally need to deal with partial functions. \end{note} If the unbounded minimalization of a computable function is to be computable, we have a problem even if we ask for some default output ($0$, say) to ensure that it is defined for all $k$-tuples. The obvious procedure which tests successive values of $g$ to find the needed $m$ will run forever if there is no such $m$, and the incomputability of the Halting Problem suggests that other procedures won't necessarily succeed either. It follows that it is desirable to be careful, so far as possible, which functions unbounded minimalization is applied to. \begin{defn} \index{regular function} \index{function regular} A $(k+1)$-place function $g$ is said to be {\em regular\/} if for every $(n_1, \ldots, n_k) \in \mathbb N^k$, there is at least one $m \in \mathbb N$ so that $g(n_1, \ldots, n_k, m) = 0$. \end{defn} That is, $g$ is regular precisely if the obvious strategy of computing $g(n_1, \dots, n_k, m)$ for $m = 0$, $1$, \dots in succession until an $m$ is found with $g(n_1, \dots, n_k, m) = 0$ always succeeds.
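Two small illustrations may help here. They are sketches only: the names $g_1$ and $g_2$ are introduced just for this purpose, and $\mathsc{Sum}$, $\mathsc{Mult}$, and $\mathsc{Diff}$ are the functions defined earlier. First, let $g_1(n,m) = \mathsc{Diff}(n, \mathsc{Mult}(m,m))$, so that $g_1(n,m) = 0$ exactly when $m^2 \ge n$. Such an $m$ always exists ($m = n$ will do), so $g_1$ is regular, and $\mu m [g_1(n,m) = 0]$ is the least $m$ with $m^2 \ge n$. For example, the values of $g_1(10,m)$ for $m = 0$, $1$, $2$, $3$, $4$ are $10$, $9$, $6$, $1$, $0$, so $\mu m [g_1(10,m) = 0] = 4$. Second, let $g_2(n,m) = \mathsc{Sum}(\mathsc{Diff}(n,2m), \mathsc{Diff}(2m,n))$, so that $g_2(n,m) = |n - 2m|$ is $0$ exactly when $n = 2m$. No such $m$ exists when $n$ is odd, so $g_2$ is not regular, and $$ \mu m [g_2(n,m) = 0] = \begin{cases} n/2 & \text{if $n$ is even;} \\ \text{undefined} & \text{if $n$ is odd.} \end{cases} $$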
\begin{prop} \label{p:thirteen15} If $g$ is a Turing computable regular $(k+1)$-place function, then the unbounded minimalization of $g$ is also Turing computable. \end{prop} While unbounded minimalization adds something essentially new to our repertoire, it is worth noticing that {\em bounded minimalization\/} \index{bounded minimalization} \index{minimalization bounded} \index{function bounded minimalization of} does not. \begin{prob} \label{p:thirteen16} Suppose $g$ is a $(k+1)$-place primitive recursive regular function such that for some primitive recursive $k$-place function $h$, $$ \mu m [g(n_1, \ldots, n_k, m) = 0] \le h(n_1, \ldots, n_k) $$ for all $(n_1, \ldots, n_k) \in \mathbb{N}^k$. Show that $\mu m [g(n_1, \ldots, n_k, m) = 0]$ is also primitive recursive. \end{prob} \subsection*{Recursive functions and relations} We can finally define an equivalent notion of computability for functions on the natural numbers which makes no mention of any computational device. \begin{defn} \index{recursive function}\index{function recursive} A $k$-place function $f$ is {\em recursive\/} if it can be defined from the initial functions by finitely many applications of composition, primitive recursion, and the unbounded minimalization of regular functions. Similarly, a $k$-place partial function is {\em recursive\/} if it can be defined from the initial functions by finitely many applications of composition, primitive recursion, and the unbounded minimalization of (possibly non-regular) functions. \end{defn} In particular, every primitive recursive function is a recursive function. \begin{thm} \label{RF:TC} Every recursive function is Turing computable. \end{thm} We shall show that every Turing computable function is recursive later on. Similarly to primitive recursive relations we have the following.
\begin{defn} \label{df:rr} \index{recursive relation} \index{relation recursive} \index{Turing computable relation} \index{relation Turing computable} A $k$-place relation $P$ is said to be {\em recursive\/} ({\em Turing computable\/}) if its characteristic function $\chi_P$ is recursive (Turing computable). \end{defn} Since every recursive function is Turing computable, and {\em vice versa\/}, ``recursive'' is just a synonym of ``Turing computable'', for functions and relations alike. Also, similarly to Theorem \ref{r:cd} and Corollary \ref{c:thirteen10} we have the following. \begin{thm} \label{t:thirteen17} A $k$-place function $g$ is recursive if and only if the $1$-place function $h$ given by $h(n) = g(n_1, \dots, n_k)$ if $n = p_1^{n_1}\dots p_k^{n_k}$ is recursive. \end{thm} As before, it doesn't really matter what the function $h$ does on an $n$ which does not represent a sequence of length $k$. \begin{cor} \label{c:thirteen18} A $k$-place relation $P$ is recursive if and only if the $1$-place relation $P'$ is recursive, where $$ (n_1, \dots, n_k) \in P \iff p_1^{n_1}\dots p_k^{n_k} \in P' \, . $$ \end{cor} % % Fourteenth chapter of "A Problem Course in Mathematical Logic" % \chapter{Characterizing Computability} \label{ch:fourteen} By putting together some of the ideas in Chapters \ref{ch:twelve} and \ref{ch:thirteen}, we can use recursive functions to simulate Turing machines. This will let us show that Turing computable functions are recursive, completing the argument that Turing machines and recursive functions are essentially equivalent models of computation. We will also use these techniques to construct a {\em universal Turing machine\/}\index{universal Turing machine}\index{Turing machine universal} (or {\em UTM\/}\index{UTM}): a machine $U$ that, when given as input (a suitable description of) some Turing machine $M$ and an input tape $\mathbf a$ for $M$, simulates the computation of $M$ on input $\mathbf a$.
In effect, a universal Turing machine is a single piece of hardware that lets us treat other Turing machines as software. \subsection*{Turing computable functions are recursive} Our basic strategy is to show that any Turing machine can be simulated by some recursive function. Since recursive functions operate on integers, we will need to encode the tape positions of Turing machines, as well as Turing machines themselves, by integers. For simplicity, we shall stick to Turing machines with alphabet $\{1\}$; we already know from Chapter~\ref{ch:eleven} that such machines can simulate Turing machines with bigger alphabets. \begin{defn} \label{TP:gcode} \index{code tape position} \index{tape position code} Suppose $(s,i,\mathbf{a})$ is a tape position such that all but finitely many cells of $\mathbf{a}$ are blank. Let $n$ be any positive integer such that $a_k = 0$ for all $k > n$. Then the {\em code\/} of $(s,i,\mathbf{a})$ is $$ \ulcorner (s,i,\mathbf{a}) \urcorner = 2^s 3^i 5^{a_0} 7^{a_1} 11^{a_2} \dots p_{n+3}^{a_n} \, . $$ \end{defn} \begin{exmp} \label{TP:gcd} Consider the tape position $(2,1,1001)$. Then $$ \ulcorner (2,1,1001) \urcorner = 2^2 3^1 5^1 7^0 11^0 13^1 = 780 \, . $$ \end{exmp} \begin{prob} \label{p:fourteen6} Find the codes of the following tape positions. \begin{enumerate} \item $(1,0,\mathbf{a})$, where $\mathbf a$ is entirely blank. \item $(4,3,\mathbf{a})$, where $\mathbf a$ is $1011100101$. \end{enumerate} \end{prob} \begin{prob} \label{p:fourteen7} What is the tape position whose code is $10314720$? \end{prob} When dealing with computations, we will also need to encode sequences of tape positions by integers. \begin{defn} \label{TPS:gc} \index{code sequence of tape positions} \index{tape positions code of a sequence} \index{sequence of tape positions code} Suppose $t_1 t_2 \dots t_n$ is a sequence of tape positions.
Then the {\em code\/} of this sequence is $$ \ulcorner t_1 t_2 \dots t_n \urcorner = 2^{\ulcorner t_1 \urcorner} 3^{\ulcorner t_2 \urcorner} \ldots p_n^{\ulcorner t_n \urcorner} \, . $$ \end{defn} \begin{note} Both tape positions and sequences of tape positions have unique codes. \end{note} \begin{prob} \label{p:fourteen8} Pick some (short!) sequence of tape positions and find its code. \end{prob} Having defined how to represent tape positions as integers, we now need to manipulate these representations using recursive functions. The recursive functions and relations in Problems \ref{p:thirteen6} and \ref{r:rfs} provide most of the necessary tools. \begin{prob} \label{p:fourteen8a} Show that both of the following relations are primitive recursive. \begin{enumerate} \item $\mathsc{TapePos}$, where $\mathsc{TapePos}(n)$ $\iff$ $n$ is the code of a tape position. \index{$\mathsc{TapePos}$} \item $\mathsc{TapePosSeq}$, where $\mathsc{TapePosSeq}(n)$ $\iff$ $n$ is the code of a sequence of tape positions. \index{$\mathsc{TapePosSeq}$} \end{enumerate} \end{prob} \begin{prob} \label{p:fourteen9} Show that each of the following is primitive recursive. 
\begin{enumerate} \item The $4$-place function $\mathsc{Entry}$\index{$\mathsc{Entry}$} such that \begin{align*} &\mathsc{Entry}(j,w,t,n) \\ &= \begin{cases} \ulcorner (t,i+w-1,\mathbf{a}') \urcorner & \text{if $n = \ulcorner (s,i,\mathbf{a}) \urcorner$, $j \in \{0,1\}$,} \\ & \text{$w \in \{0,2\}$, $i+w-1 \ge 0$, and $t \ge 1$,} \\ & \text{where $a_k' = a_k$ for $k \ne i$ and $a_i' = j$;} \\ 0 & \text{otherwise.} \end{cases} \end{align*} \item For any Turing machine $M$ with alphabet $\{1\}$, the $1$-place function $\mathsc{Step}_M$\index{$\mathsc{Step}_M$} such that $$ \mathsc{Step}_M(n) = \begin{cases} \ulcorner \mathbf{M}(s,i,\mathbf{a}) \urcorner & \text{if $n = \ulcorner (s,i,\mathbf{a}) \urcorner$ and} \\ & \text{$\mathbf{M}(s,i,\mathbf{a})$ is defined;} \\ 0 & \text{otherwise.} \end{cases} $$ \item For any Turing machine $M$ with alphabet $\{1\}$, the $1$-place relation $\mathsc{Comp}_M$\index{$\mathsc{Comp}_M$}, where $$ \mathsc{Comp}_M(n) \iff \text{$n$ is the code of a computation of $M$.} $$ \end{enumerate} \end{prob} The functions and relations above may be primitive recursive, but the last big step in showing that Turing computable functions are recursive requires unbounded minimalization. \begin{prop} \label{p:fourteen9a} For any Turing machine $M$ with alphabet $\{1\}$, the $1$-place (partial) function $\mathsc{Sim}_M$\index{$\mathsc{Sim}_M$} is recursive, where $$ \mathsc{Sim}_M(n) = \ulcorner (t,j,\mathbf{b}) \urcorner $$ if $n = \ulcorner (1,0,\mathbf{a}) \urcorner$ for some input tape $\mathbf{a}$ and $M$ eventually halts in position $(t,j,\mathbf{b})$ on input $\mathbf{a}$. (Note that $\mathsc{Sim}_M(n)$ may be undefined if $n \ne \ulcorner (1,0,\mathbf{a}) \urcorner$ for an input tape $\mathbf{a}$, or if $M$ does not eventually halt on input $\mathbf{a}$.) 
\end{prop} \begin{lem} \label{l:fourteen9b} The following functions are primitive recursive: \begin{enumerate} \item For any fixed $k \ge 1$, $\mathsc{Code}_k (n_1,\ldots,n_k) = \ulcorner (1,0,01^{n_1}0\ldots 01^{n_k}) \urcorner$. \index{$\mathsc{Code}_k$} \item $\mathsc{Decode}(t) = n$ if $t = \ulcorner (s,i,01^{n+1}) \urcorner$ (and anything you like otherwise). \index{$\mathsc{Decode}$} \end{enumerate} \end{lem} \begin{thm} \label{t:fourteen10} Any $k$-place Turing computable function is recursive. \end{thm} \begin{cor} \label{c:TCiffRec} A function $f : \mathbb{N}^k \to \mathbb{N}$ is Turing computable if and only if it is recursive. \end{cor} Thus Turing machines and recursive functions are essentially equivalent models of computation. \subsection*{A universal Turing machine} One can push the techniques used above a little farther to get a recursive function that can simulate {\em any\/} Turing machine. Since every recursive function can be computed by some Turing machine, this effectively gives us a universal Turing machine. \index{universal Turing machine} \index{Turing machine universal} \begin{prob} \label{p:fourteen11} \index{code Turing machine} \index{Turing machine code} Devise a suitable definition for the code $\ulcorner M \urcorner$ of a Turing machine $M$ with alphabet $\{1\}$. \end{prob} \begin{prob} \label{p:fourteen12} Show, using your definition of $\ulcorner M \urcorner$ from Problem \ref{p:fourteen11}, that the following are primitive recursive.
\begin{enumerate} \item The $2$-place function $\mathsc{Step}$, where \index{$\mathsc{Step}$} $$ \mathsc{Step}(m,n) = \begin{cases} \ulcorner \mathbf{M}(s,i,\mathbf{a}) \urcorner & \text{if $m = \ulcorner M \urcorner$ for some machine $M$,} \\ & \text{$n = \ulcorner (s,i,\mathbf{a}) \urcorner$, \& $\mathbf{M}(s,i,\mathbf{a})$ is defined;} \\ 0 & \text{otherwise.} \end{cases} $$ \item The $2$-place relation $\mathsc{Comp}$, where \index{$\mathsc{Comp}$} $$ \mathsc{Comp}(m,n) \iff m = \ulcorner M \urcorner $$ for some Turing machine $M$ and $n$ is the code of a computation of $M$. \end{enumerate} \end{prob} \begin{prop} \label{p:fourteen12a} The $2$-place (partial) function $\mathsc{Sim}$ is recursive, where, for any Turing machine $M$ with alphabet $\{1\}$ and input tape $\mathbf{a}$ for $M$, \index{$\mathsc{Sim}$} $$ \mathsc{Sim}(\ulcorner M \urcorner,\ulcorner (1,0,\mathbf{a}) \urcorner) = \ulcorner (t,j,\mathbf{b}) \urcorner $$ if $M$ eventually halts in position $(t,j,\mathbf{b})$ on input $\mathbf{a}$. (Note that $\mathsc{Sim}(m,n)$ may be undefined if $m$ is not the code of some Turing machine $M$, or if $n \ne \ulcorner (1,0,\mathbf{a}) \urcorner$ for an input tape $\mathbf{a}$, or if $M$ does not eventually halt on input $\mathbf{a}$.) \end{prop} \begin{cor} \label{c:UTM} There is a Turing machine $U$ which can simulate any Turing machine $M$. \end{cor} \begin{cor} \label{c:URM} There is a recursive function $f$ which can compute any other recursive function. \end{cor} \subsection*{The Halting Problem} An effective method to determine whether or not a given machine will eventually halt on a given input --- short of waiting forever! --- would be nice to have. For example, assuming Church's Thesis is true, such a method would let us identify computer programs which have infinite loops before we attempt to execute them. 
\begin{question}{The Halting Problem} \index{Halting Problem} Given a Turing machine $M$ and an input tape $\mathbf{a}$, is there an effective method to determine whether or not $M$ eventually halts on input $\mathbf{a}$? \end{question} Given that we are using Turing machines to formalize the notion of an effective method, one of the difficulties with solving the Halting Problem is representing a given Turing machine and its input tape as input for another machine. As this is one of the things that was accomplished in the course of constructing a universal Turing machine, we can now formulate a precise version of the Halting Problem and solve it. \begin{question}{The Halting Problem} \index{Halting Problem} Is there a Turing machine $T$ which, for any Turing machine $M$ with alphabet $\{1\}$ and tape $\mathbf{a}$ for $M$, halts on input $$ \underline{0} 1^{\ulcorner M \urcorner + 1} 0 1^{\ulcorner (1,0,\mathbf{a}) \urcorner + 1} $$ with output $\underline{0} 11$ if $M$ halts on input $\mathbf{a}$, and with output $\underline{0} 1$ if $M$ does not halt on input $\mathbf{a}$? \end{question} Note that this precise version of the Halting Problem is equivalent to the informal one above only if Church's Thesis is true. \begin{prob} \label{HP:coder} Show that there is a Turing machine $C$ which, for any Turing machine $M$ with alphabet $\{1\}$, on input $$ \underline{0} 1^{\ulcorner M \urcorner + 1} $$ eventually halts with output $$ \underline{0} 1^{\ulcorner M \urcorner + 1} 0 1^{\ulcorner (1,0,0 1^{\ulcorner M \urcorner + 1}) \urcorner + 1} \, . $$ \end{prob} \begin{thm} \label{HP:no} The answer to (the precise version of) the Halting Problem is ``No.'' \end{thm} \subsection*{Recursively enumerable sets} The following notion is of particular interest in the advanced study of computability.
\begin{defn} \label{df:re} \index{recursively enumerable} \index{r.e.} A subset ({\em i.e.\/} a $1$-place relation) $P$ of $\mathbb{N}$ is {\em recursively enumerable\/}, often abbreviated as {\em r.e.\/}, if there is a $1$-place recursive function $f$ such that $P = \mathrm{im}(f) = \{\, f(n) \mid n \in \mathbb{N} \,\}$. \end{defn} Since the image of any recursive $1$-place function is recursively enumerable by definition, we do not lack for examples. For one, the set $E$ of even natural numbers is recursively enumerable, since it is the image of $f(n) = \mathsc{Mult}(\mathsc{S}(\mathsc{S}(\mathsc{O}(n))),n)$. \begin{prop} \label{p:fourteen13} If $P$ is a $1$-place recursive relation, then $P$ is recursively enumerable. \end{prop} This proposition is not reversible, but it does come close. \begin{prop} \label{p:fourteen14} $P \subseteq \mathbb{N}$ is recursive if and only if both $P$ and $\mathbb{N} \setminus P$ are recursively enumerable. \end{prop} \begin{prob} \label{p:fourteen15} Find an example of a recursively enumerable set which is not recursive. \end{prob} \begin{prob} \label{p:fourteen16} Is $P \subseteq \mathbb N$ primitive recursive if and only if both $P$ and $\mathbb N \setminus P$ are enumerable by primitive recursive functions? \end{prob} \begin{prob} \label{p:fourteen17} Show that $P \subseteq \mathbb N$ is recursively enumerable if and only if there is a $1$-place recursive partial function $g$ such that $P = \text{dom}(g) = \{\, n \mid g(n) \text{\ is defined} \,\}$. \end{prob} \chapter*{Hints for Chapters 10--14}
%
% Hints for Chapter 10 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:ten}} \begin{clue}{ten:tape} This should be easy\dots \end{clue} \begin{clue}{ten:tppos} Ditto. \end{clue} \begin{clue}{ten:TMs} \begin{enumerate} \item Any machine with the given alphabet and a table with three non-empty rows will do.
\item Every entry in the table in the $0$ column must write a $1$ in the scanned cell; similarly, every entry in the $1$ column must write a $0$ in the scanned cell. \item What's the simplest possible table for a given alphabet? \end{enumerate} \end{clue} \begin{clue}{ten:comp} Unwind the definitions step by step in each case. Not all of these are computations\dots \end{clue} \begin{clue}{ten:inps} Examine your solutions to the previous problem and, if necessary, take the computations a little farther. \end{clue} \begin{clue}{ten:mrks} Have the machine run on forever to the right, writing down the desired pattern as it goes no matter what may be on the tape already. \end{clue} \begin{clue}{ten:runs} Consider your solution to Problem \ref{ten:mrks} for one possible approach. It should be easy to find simpler solutions, though. \end{clue} \begin{clue}{ten:combTMs} Consider the tasks $S$ and $T$ are intended to perform. \end{clue} \begin{clue}{ten:mTMs} \begin{enumerate} \item Use four states to write the $1$s, one for each. \item The input has a convenient marker. \item Run back and forth to move one marker $n$ cells {\em from\/} the block of $1$'s while moving another {\em through\/} the block, and then fill in. \item Modify the previous machine by having it delete every other $1$ after writing out $1^{2n}$. \item Run back and forth to move the right block of $1$s cell by cell to the desired position. \item Run back and forth to move the left block of $1$s cell by cell past the other two, and then apply a minor modification of the machine in part 5. \item Variations on the ideas used in part 6 should do the job. \item Run back and forth between the blocks, moving a marker through each. After the race between the markers to the ends of their respective blocks has been decided, erase everything and write down the desired output. 
\end{enumerate} \end{clue}
%
% Hints for Chapter 11 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:eleven}} \begin{clue}{p:eleven1} This ought to be easy. \end{clue} \begin{clue}{p:eleven2} Generalize the technique of Example \ref{sim:LR}, adding two new states to help with each old state that may cause a move in different directions. You do have to be a bit careful not to make a machine that would run off the end of the tape when the original would not. \end{clue} \begin{clue}{p:eleven1a} You only need to change the parts of the definitions involving the symbols $0$ and $1$. \end{clue} \begin{clue}{p:eleven2a} If you have trouble figuring out whether the subroutine of $Z$ simulating state $1$ of $W$ on input $y$, try tracing the partial computations of $W$ and $Z$ on other tapes involving $y$. \end{clue} \begin{clue}{p:eleven3} Generalize the concepts used in Example~\ref{sim:alph}. Note that the simulation must operate with coded versions of $M$'s tape, unless $\Sigma = \{ 1 \}$. The key idea is to use the tape of the simulator in blocks of some fixed size, with the patterns of $0$s and $1$s in each block corresponding to elements of $\Sigma$. \end{clue} \begin{clue}{p:eleven4} This should be straightforward, if somewhat tedious. You do need to be careful in coming up with the appropriate input tapes for $O$. \end{clue} \begin{clue}{p:eleven5} Generalize the technique of Example \ref{sim:2way}, splitting up the tape of the simulator into upper and lower tracks and splitting each state of $N$ into two states in $P$. You will need to be quite careful in describing just how the latter is to be done. \end{clue} \begin{clue}{p:eleven5a} This is mostly pretty easy. The only problem is to devise $N$ so that one can tell from its output whether $P$ halted or crashed, and this is easy to indicate using some extra symbol in $N$'s alphabet.
\end{clue} \begin{clue}{p:eleven6} If you're in doubt, go with one read/write scanner for each tape, and have each entry in the table of a two-tape machine take both scanners into account. Simulating such a machine is really just a variation on the techniques used in Example \ref{sim:2way}. \end{clue} \begin{clue}{p:eleven7} Such a machine should be able to move its scanner to cells up and down from the current one, as well as to the side. (Diagonally too, if you want to!) Simulating such a machine on a single tape machine is a challenge. You might find it easier to first describe how to simulate it on a suitable multiple-tape machine. \end{clue}
%
% Hints for Chapter 12 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:twelve}} \begin{clue}{C:c} \begin{enumerate} \item Delete most of the input. \item Add a one to the far end of the input. \item Add a little to the input, and delete a little more elsewhere. \item Delete a little from the input {\em most\/} of the time. \item Run back and forth between the two blocks in the input, deleting until one side disappears. Clean up appropriately! (This is a relative of Problem~\ref{ten:mTMs}.8.) \item Delete two of the blocks and move the remaining one. \item This is just a souped-up version of the machine immediately preceding\dots \end{enumerate} \end{clue} \begin{clue}{C:inc} There are just as many functions $\mathbb{N} \to \mathbb{N}$ as there are real numbers, but only as many Turing machines as there are natural numbers. \end{clue} \begin{clue}{prob:BB} \begin{enumerate} \item Trace the computation through step-by-step. \item Consider the scores of each of the $1$-state entries in the busy beaver competition. \item Find a $3$-state entry in the busy beaver competition which scores six. \item Show how to turn an $n$-state entry in the busy beaver competition into an $(n+1)$-state entry that scores just one better.
\end{enumerate} \end{clue} \begin{clue}{prob:mBB} You could start by looking at modifications of the $3$-state entry you devised in Problem~\ref{prob:BB}.3, but you will probably want to do some serious fiddling to do better than what Problem~\ref{prob:BB}.4 does from there. \end{clue} \begin{clue}{p:twelve5} Suppose $\Sigma$ was computable by a Turing machine $M$. Modify $M$ to get an $n$-state entry in the busy beaver competition for some $n$ which achieves a score greater than $\Sigma(n)$. The key idea is to add a ``pre-processor'' to $M$ which writes a block with more $1$s than the number of states that $M$ and the pre-processor have between them. \end{clue} \begin{clue}{p:cnstfns} Generalize Example \ref{e:con}. \end{clue} \begin{clue}{p:compcomp} Use machines computing $g$, $h_1$, \dots, $h_m$ as sub-machines of the machine computing the composition. You might also find sub-machines that copy the original input and various stages of the output useful. It is important that each sub-machine get all the data it needs and does not damage the data needed by other sub-machines. \end{clue} \begin{clue}{p:complin} Proceed by induction on the number of applications of composition used to define $f$ from the initial functions. \end{clue}
%
% Hints for Chapter 13 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:thirteen}} \begin{clue}{pr:fns} \begin{enumerate} \item Exponentiation is to multiplication as multiplication is to addition. \item This is straightforward except for taking care of $\mathsc{Pred}(0) = \mathsc{Pred}(1) = 0$. \item $\mathsc{Diff}$ is to $\mathsc{Pred}$ as $\mathsc{Sum}$ is to $\mathsc{S}$. \item This is straightforward if you let $0! = 1$. \end{enumerate} \end{clue} \begin{clue}{p:thirteen5} Machines used to compute $g$ and $h$ are the principal parts of the machine computing $f$, along with parts to copy, move, and/or delete data on the tape between stages in the recursive process.
\end{clue} \begin{clue}{p:thirteen6} \begin{enumerate} \item $f$ is to $g$ as $\mathsc{Fact}$ is to the identity function. \item Use $\mathsc{Diff}$ and a suitable constant function as the basic building blocks. \item This is a slight generalization of the preceding part. \end{enumerate} \end{clue} \begin{clue}{t:thirteen7} Proceed by induction on the number of applications of primitive recursion and composition. \end{clue} \begin{clue}{r:rfs} \begin{enumerate} \item Use a composition including $\mathsc{Diff}$, $\chi_{P}$, and a suitable constant function. \item A suitable composition will do the job; it's just a little harder than it looks. \item A suitable composition will do the job; it's rather more straightforward than the previous part. \item Note that $n = m$ exactly when $n - m = 0 = m - n$. \item Adapt your solution from the first part of Problem~\ref{p:thirteen6}. \item First devise a characteristic function for the relation $$ \mathsc{Product}(n,k,m) \iff nk = m \, , $$ and then sum up. \item Use $\chi_{\mathsc{Div}}$ and sum up. \item Use $\mathsc{IsPrime}$ and some ingenuity. \item Use $\mathsc{Exp}$ and $\mathsc{Div}$ and some more ingenuity. \item A suitable combination of $\mathsc{Prime}$ with other things will do. \item A suitable combination of $\mathsc{Prime}$ and $\mathsc{Power}$ will do. \item Throw the kitchen sink at this one\dots \item Ditto. \end{enumerate} \end{clue} \begin{clue}{r:cd} In each direction, use a composition of functions already known to be primitive recursive to modify the input as necessary. \end{clue} \begin{clue}{c:thirteen10} A straightforward application of Theorem \ref{r:cd}. \end{clue} \begin{clue}{p:thirteen11} This is not unlike, though a little more complicated than, showing that primitive recursion preserves computability. \end{clue} \begin{clue}{t:thirteen12} It's {\em not\/} easy! Look it up\dots \end{clue} \begin{clue}{c:thirteen13} This is a very easy consequence of Theorem \ref{t:thirteen12}. 
\end{clue} \begin{clue}{p:thirteen14} Listing the definitions of all possible primitive recursive functions is a computable task. Now borrow a trick from Cantor's proof that the real numbers are uncountable. (A formal argument to this effect could be made using techniques similar to those used to show that all Turing computable functions are recursive in the next chapter.) \end{clue} \begin{clue}{p:thirteen15} The strategy should be easy. Make sure that at each stage you preserve a copy of the original input for use at later stages. \end{clue} \begin{clue}{p:thirteen16} The primitive recursive function you define only needs to check values of $g(n_1,\dots,n_k,m)$ for $m$ such that $0 \le m \le h(n_1,\dots,n_k)$, but it still needs to pick the least $m$ such that $g(n_1,\dots,n_k,m) = 0$. \end{clue} \begin{clue}{RF:TC} This is very similar to Theorem \ref{t:thirteen7}. \end{clue} \begin{clue}{t:thirteen17} This is virtually identical to Theorem \ref{r:cd}. \end{clue} \begin{clue}{c:thirteen18} This is virtually identical to Corollary \ref{c:thirteen10}. \end{clue} % % Hints for Chapter 14 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:fourteen}} \begin{clue}{p:fourteen6} Emulate Example \ref{TP:gcd} in both parts. \end{clue} \begin{clue}{p:fourteen7} Write out the prime power expansion of the given number and unwind Definition \ref{TP:gcode}. \end{clue} \begin{clue}{p:fourteen8} Find the codes of each of the positions in the sequence you chose and then apply Definition \ref{TPS:gc}. \end{clue} \begin{clue}{p:fourteen8a} \begin{enumerate} \item $\chi_{\mathsc{TapePos}}(n) = 1$ \index{$\mathsc{TapePos}$} exactly when the power of $2$ in the prime power expansion of $n$ is at least $1$ and every other prime appears in the expansion with a power of $0$ or $1$. This can be achieved with a composition of recursive functions from Problems \ref{p:thirteen6} and \ref{r:rfs}. 
\item $\chi_{\mathsc{TapePosSeq}}(n)=1$ exactly when $n$ is the code of a sequence of tape positions, {\em i.e.\/} every power in the prime power expansion of $n$ is the code of a tape position. \index{$\mathsc{TapePosSeq}$} \end{enumerate} \end{clue} \begin{clue}{p:fourteen9} \begin{enumerate} \item If the input is of the correct form, make the necessary changes to the prime power expansion of $n$ using the tools in Problem \ref{r:rfs}. \item Piece $\mathsc{Step}_M$ together by cases using the function $\mathsc{Entry}$ in each case. The piecing-together works a lot like redefining a function at a particular point in Problem \ref{p:thirteen6}. \item If the input is of the correct form, use the function $\mathsc{Step}_M$ to check that the successive elements of the sequence of tape positions are correct. \end{enumerate} \end{clue} \begin{clue}{p:fourteen9a} The key idea is to use unbounded minimalization on $\chi_{\mathsc{Comp}_M}$, with some additions to make sure the computation found (if any) starts with the given input, and then to extract the output from the code of the computation. \end{clue} \begin{clue}{l:fourteen9b} \begin{enumerate} \item To define $\mathsc{Code}_k$\index{$\mathsc{Code}_k$}, consider what $\ulcorner (1,0,01^{n_1}0\ldots 01^{n_k}) \urcorner$ is as a prime power expansion, and arrange a suitable composition to obtain it from $(n_1,\dots,n_k)$. \item To define $\mathsc{Decode}$\index{$\mathsc{Decode}$} you only need to count how many powers of primes other than $3$ in the prime-power expansion of $\ulcorner (s,i,01^{n+1}) \urcorner$ are equal to $1$. \end{enumerate} \end{clue} \begin{clue}{t:fourteen10} Use Proposition \ref{p:fourteen9a} and Lemma \ref{l:fourteen9b}. \end{clue} \begin{clue}{c:TCiffRec} This follows directly from Theorems \ref{RF:TC} and \ref{t:fourteen10}. \end{clue} \begin{clue}{p:fourteen11} Take some creative inspiration from Definitions \ref{TP:gcode} and \ref{TPS:gc}.
For example, if $(s,i) \in \mathrm{dom}(M)$ and $M(s,i) = (j,d,t)$, you could let the code of $M(s,i)$ be $$ \ulcorner M(s,i) \urcorner = 2^s 3^i 5^j 7^{d+1} 11^t \, . $$ \end{clue} \begin{clue}{p:fourteen12} Much of what you need for both parts is just what was needed for Problem \ref{p:fourteen9}, except that $\mathsc{Step}$\index{$\mathsc{Step}$} is probably easier to define than $\mathsc{Step}_M$\index{$\mathsc{Step}_M$} was. (Define it as a composition\dots) The additional ingredients mainly have to do with using $m = \ulcorner M \urcorner$ properly. \end{clue} \begin{clue}{p:fourteen12a} Essentially, this is to Problem \ref{p:fourteen12} as proving Proposition \ref{p:fourteen9a} is to Problem \ref{p:fourteen9}. \end{clue} \begin{clue}{c:UTM} The machine that computes $\mathsc{Sim}$\index{$\mathsc{Sim}$} does the job. \end{clue} \begin{clue}{c:URM} A modification of $\mathsc{Sim}$\index{$\mathsc{Sim}$} does the job. The modifications are needed to handle appropriate input and output. Check Theorem \ref{t:thirteen17} for some ideas on what may be appropriate. \end{clue} \begin{clue}{HP:coder} This can be done directly, but may be easier to think of in terms of recursive functions. \end{clue} \begin{clue}{HP:no} Suppose the answer was yes and such a machine $T$ did exist. Create a machine $U$ as follows. Give $T$ the machine $C$ from Problem \ref{HP:coder} as a pre-processor and alter its behaviour by having it run forever if $M$ halts and halt if $M$ runs forever. What will $U$ do when it gets itself as input? \end{clue} \begin{clue}{p:fourteen13} Use $\chi_P$ to help define a function $f$ such that $\textrm{im}(f) = P$. \end{clue} \begin{clue}{p:fourteen14} One direction is an easy application of Proposition \ref{p:fourteen13}. For the other, given an $n \in \mathbb{N}$, run the functions enumerating $P$ and $\mathbb{N} \setminus P$ concurrently until one or the other outputs $n$.
\end{clue} \begin{clue}{p:fourteen15} Consider the set of natural numbers coding (according to some scheme you must devise) Turing machines together with input tapes on which they halt. \end{clue} \begin{clue}{p:fourteen16} See how far you can adapt your argument for Proposition \ref{p:fourteen14}. \end{clue} \begin{clue}{p:fourteen17} This may well be easier to think of in terms of Turing machines. Run a Turing machine that computes $g$ for a few steps on the first possible input, a few on the second, a few more on the first, a few more on the second, a few on the third, a few more on the first, \dots \end{clue} % % Part IV of "A Problem Course in Mathematical Logic" % \part{Incompleteness} % % Fifteenth chapter of "A Problem Course in Mathematical Logic" % \chapter{Preliminaries} \label{ch:fifteen} It was mentioned in the Introduction that one of the motivations for the development of notions of computability was the following question. \begin{proof}[Entscheidungsproblem] \index{Entscheidungsproblem} Given a reasonable set $\Sigma$ of formulas of a first-order language $\mathcal{L}$ and a formula $\varphi$ of $\mathcal{L}$, is there an effective method for determining whether or not $\Sigma \proves \varphi$? \end{proof} Armed with knowledge of first-order logic on the one hand and of computability on the other, we are in a position to formulate this question precisely and then solve it. To cut to the chase, the answer is usually ``no''. G\"odel's Incompleteness Theorem \index{G\"odel Incompleteness Theorem} \index{Incompleteness Theorem} asserts, roughly, that given any set of axioms in a first-order language which are computable and also powerful enough to prove certain facts about arithmetic, it is possible to formulate statements in the language whose truth is not decided by the axioms. In particular, it turns out that no consistent set of axioms can hope to prove its own consistency. We will tackle the Incompleteness Theorem in three stages. 
First, we will code the formulas and proofs of a first-order language as numbers and show that the functions and relations involved are recursive. This will, in particular, make it possible for us to define a ``computable set of axioms'' precisely. Second, we will show that all recursive functions and relations can be defined by first-order formulas in the presence of a fairly minimal set of axioms about elementary number theory. Finally, by putting recursive functions talking about first-order formulas together with first-order formulas defining recursive functions, we will manufacture a self-referential sentence which asserts its own unprovability. \begin{note} It will be assumed in what follows that you are familiar with the basics of the syntax and semantics of first-order languages, as laid out in Chapters 5--8 of this text. Even if you are already familiar with the material, you may wish to look over Chapters 5--8 to familiarize yourself with the notation, definitions, and conventions used here, or at least keep them handy in case you need to check some such point. \end{note} \subsection*{A language for first-order number theory} To keep things as concrete as possible we will work with and in the following language for first-order number theory, mentioned in Example~\ref{e:lan}. \begin{defn} \label{d:LN} \index{$\mathcal{L}_N$} \index{language for first-order number theory} \index{first-order language for number theory} \index{number theory first-order language for} $\mathcal{L}_N$ is the first-order language with the following symbols: \begin{enumerate} \item Parentheses: $($ and $)$ \item Connectives: $\lnot$ and $\to$ \item Quantifier: $\forall$ \item Equality: $=$ \item Variable symbols: $v_0$, $v_1$, $v_2$, \dots \item Constant symbol: $0$ \item $1$-place function symbol: $S$ \item $2$-place function symbols: $+$, $\cdot$, and $E$.
\end{enumerate} \end{defn} The non-logical symbols of $\mathcal{L}_N$, $0$, $S$, $+$, $\cdot$, and $E$, are intended to name, respectively, the number zero, and the successor, addition, multiplication, and exponentiation functions on the natural numbers. That is, the (standard!) structure this language is intended to discuss is $\mathfrak{N} = (\mathbb{N},0,\mathsc{S},+,\cdot,\mathsc{E})$. \index{$\mathfrak{N}$} \subsection*{Completeness} The notion of completeness used in the Incompleteness Theorem is different from the one used in the Completeness Theorem.\footnote{Which, to confuse the issue, was also first proved by Kurt G\"odel.} ``Completeness'' in the latter sense is a property of a logic: it asserts that whenever $\Gamma \models \sigma$ ({\em i.e.\/} the truth of the sentence $\sigma$ follows from that of the set of sentences $\Gamma$), $\Gamma \proves \sigma$ ({\em i.e.\/} there is a deduction of $\sigma$ from $\Gamma$). The sense of ``completeness'' in the Incompleteness Theorem, defined below, is a property of a set of sentences. \begin{defn} \label{d:cplt} \index{completeness} \index{complete set of sentences} A set of sentences $\Sigma$ of a first-order language $\mathcal{L}$ is said to be {\em complete\/} if for every sentence $\tau$ either $\Sigma \proves \tau$ or $\Sigma \proves \lnot \tau$. \end{defn} That is, a set of sentences, or non-logical axioms, is complete if it suffices to prove or disprove every sentence of the language in question. \begin{prop} \label{p:fifteen1} A consistent set $\Sigma$ of sentences of a first-order language $\mathcal{L}$ is complete if and only if the theory of $\Sigma$, $$ \mathrm{Th}(\Sigma) = \{\, \tau \mid \text{$\tau$ is a sentence of $\mathcal{L}$ and $\Sigma \proves \tau$} \,\} \, , $$ is maximally consistent.
\index{theory of a set of sentences} \index{$\mathrm{Th}(\Sigma)$} \end{prop}
%
% Sixteenth chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Coding First-Order Logic} \label{ch:sixteen} We will encode the symbols, formulas, and deductions of $\mathcal{L}_N$ as natural numbers in such a way that the operations necessary to manipulate these codes are recursive. Although we will do so just for $\mathcal{L}_N$, any countable first-order language can be coded in a similar way. \subsection*{G\"odel coding} The basic approach of the coding scheme we will use was devised by G\"odel in the course of his proof of the Incompleteness Theorem. \begin{defn} \label{df:gcsym} \index{G\"odel code of symbols of $\mathcal{L}_N$} To each symbol $s$ of $\mathcal L_N$ we assign a unique positive integer $\ulcorner s \urcorner$, the {\em G\"odel code\/} of $s$, as follows: \begin{enumerate} \item $\ulcorner ( \urcorner = 1$ and $\ulcorner ) \urcorner = 2$ \item $\ulcorner \lnot \urcorner = 3$ and $\ulcorner \to \urcorner = 4$ \item $\ulcorner \forall \urcorner = 5$ \item $\ulcorner = \urcorner = 6$ \item $\ulcorner v_k \urcorner = k + 12$ \item $\ulcorner 0 \urcorner = 7$ \item $\ulcorner S \urcorner = 8$ \item $\ulcorner + \urcorner = 9$, $\ulcorner \cdot \urcorner = 10$, and $\ulcorner E \urcorner = 11$ \end{enumerate} \end{defn} Note that each positive integer is the G\"odel code of one and only one symbol of $\mathcal{L}_N$. We will also need to code sequences of the symbols of $\mathcal{L}_N$, such as terms and formulas, as numbers, not to mention sequences of sequences of symbols of $\mathcal{L}_N$, such as deductions. \begin{defn} \label{df:gcfor} \index{G\"odel code of sequences} Suppose $s_1 s_2 \dots s_k$ is a sequence of symbols of $\mathcal{L}_N$. Then the {\em G\"odel code\/} of this sequence is $$ \ulcorner s_1 \dots s_k \urcorner = p_1^{\ulcorner s_1 \urcorner} \dots p_k^{\ulcorner s_k \urcorner} \, , $$ where $p_n$ is the $n$th prime number.
Similarly, if $\sigma_1 \sigma_2 \dots \sigma_\ell$ is a sequence of sequences of symbols of $\mathcal{L}_N$, then the {\em G\"odel code\/} of this sequence is $$ \ulcorner \sigma_1 \dots \sigma_\ell \urcorner = p_1^{\ulcorner \sigma_1 \urcorner} \dots p_\ell^{\ulcorner \sigma_\ell \urcorner} \, . $$ \end{defn} \begin{exmp} The code of the formula $\forall v_1\, = \cdot v_1 S 0 v_1$ (the official form of $\forall v_1\, v_1 \cdot S 0 = v_1$), $\ulcorner \forall v_1\, = \cdot v_1 S 0 v_1 \urcorner$, works out to \begin{align*} & 2^{\ulcorner \forall \urcorner} 3^{\ulcorner v_1 \urcorner} 5^{\ulcorner = \urcorner} 7^{\ulcorner \cdot \urcorner} 11^{\ulcorner v_1 \urcorner} 13^{\ulcorner S \urcorner} 17^{\ulcorner 0 \urcorner} 19^{\ulcorner v_1 \urcorner} = 2^5 3^{13} 5^6 7^{10} 11^{13} 13^8 17^7 19^{13} \\ & = {\scriptstyle 109425289274918632559342112641443058962750733001979829025245569500000} \, . \end{align*} This is {\em not\/} the most efficient conceivable coding scheme! \end{exmp} \begin{exmp} \label{ex:formseqcode} The code of the sequence of formulas \begin{center} \begin{tabular}{ll} $=00$ & {\em i.e.\/} $0 = 0$ \\ $(=00 \to =S0S0)$ & {\em i.e.\/} $0 = 0 \to S0 = S0$ \\ $=S0S0$ & {\em i.e.\/} $S0 = S0$ \end{tabular} \end{center} works out to \begin{align*} 2^{\ulcorner =00 \urcorner} & 3^{\ulcorner (=00 \to =S0S0) \urcorner} 5^{\ulcorner =S0S0 \urcorner} \\ & = 2^{2^{\ulcorner = \urcorner} 3^{\ulcorner 0 \urcorner} 5^{\ulcorner 0 \urcorner}} \\ & \;\;\;\;\;\;\; \cdot 3^{2^{\ulcorner ( \urcorner} 3^{\ulcorner = \urcorner} 5^{\ulcorner 0 \urcorner} 7^{\ulcorner 0 \urcorner} 11^{\ulcorner \to \urcorner} 13^{\ulcorner = \urcorner} 17^{\ulcorner S \urcorner} 19^{\ulcorner 0 \urcorner} 23^{\ulcorner S \urcorner} 29^{\ulcorner 0 \urcorner} 31^{\ulcorner ) \urcorner}} \\ & \;\;\;\;\;\;\; \cdot 5^{2^{\ulcorner = \urcorner} 3^{\ulcorner S \urcorner} 5^{\ulcorner 0 \urcorner} 7^{\ulcorner S \urcorner} 11^{\ulcorner 0 \urcorner}} \\ & = 2^{2^6 3^7 5^7} 3^{2^1 3^6 5^7 7^7
11^4 13^6 17^8 19^7 23^8 29^7 31^2} 5^{2^6 3^8 5^7 7^8 11^7} \, , \end{align*} which is large enough not to be worth the bother of working it out explicitly. \end{exmp} \begin{prob} \label{p:formcodes} Pick a short sequence of short formulas of $\mathcal{L}_N$ and find the code of the sequence. \end{prob} A particular integer $n$ may simultaneously be the G\"odel code of a symbol, a sequence of symbols, and a sequence of sequences of symbols of $\mathcal{L}_N$. We shall rely on context to avoid confusion, but, with some more work, one could set things up so that no integer was the code of more than one kind of thing. In any case, we will be most interested in the cases where sequences of symbols are (official) terms or formulas and where sequences of sequences of symbols are sequences of (official) formulas. In these cases things are a little simpler. \begin{prob} \label{p:dualcodes} Is there a natural number $n$ which is simultaneously the code of a symbol of $\mathcal{L}_N$, the code of a formula of $\mathcal{L}_N$, and the code of a sequence of formulas of $\mathcal{L}_N$? If not, how many of these three things can a natural number be? \end{prob} \subsection*{Recursive operations on G\"odel codes} We will need to know that various relations and functions which recognize and manipulate G\"odel codes are recursive, and hence computable. \begin{prob} \label{pr:loth} Show that each of the following relations is primitive recursive. \begin{enumerate} \item $\mathsc{Term}(n) \iff n = \ulcorner t \urcorner$ for some term $t$ of $\mathcal{L}_N$. \index{$\mathsc{Term}$} \item $\mathsc{Formula}(n) \iff n = \ulcorner \varphi \urcorner$ for some formula $\varphi$ of $\mathcal{L}_N$. \index{$\mathsc{Formula}$} \item $\mathsc{Sentence}(n) \iff n = \ulcorner \sigma \urcorner$ for some sentence $\sigma$ of $\mathcal{L}_N$. \index{$\mathsc{Sentence}$} \item $\mathsc{Logical}(n) \iff n = \ulcorner \gamma \urcorner$ for some logical axiom $\gamma$ of $\mathcal{L}_N$. 
\index{$\mathsc{Logical}$} \end{enumerate} \end{prob} Using these relations as building blocks, we will develop relations and functions to handle deductions of $\mathcal{L}_N$. First, though, we need to make ``a computable set of formulas'' precise. \begin{defn} \index{recursive set of formulas} \index{recursively enumerable set of formulas} \index{computable set of formulas} \index{$\ulcorner \Delta \urcorner$} A set $\Delta$ of formulas of $\mathcal{L}_N$ is said to be {\em recursive\/} if the set of G\"odel codes of formulas of $\Delta$, $$ \ulcorner \Delta \urcorner = \{\, \ulcorner \delta \urcorner \mid \delta \in \Delta \,\} \, , $$ is a recursive subset of $\mathbb{N}$ ({\em i.e.\/} a recursive $1$-place relation). Similarly, $\Delta$ is said to be {\em recursively enumerable\/} if $\ulcorner \Delta \urcorner$ is recursively enumerable. \end{defn} \begin{prob} \label{pr:loac} Suppose $\Delta$ is a recursive set of sentences of $\mathcal{L}_N$. Show that each of the following relations is recursive. \begin{enumerate} \item $\mathsc{Premiss}_\Delta(n) \iff n = \ulcorner \beta \urcorner$ for some formula $\beta$ of $\mathcal{L}_N$ which is either a logical axiom or in $\Delta$. \index{$\mathsc{Premiss}_\Delta$} \item $\mathsc{Formulas}(n) \iff n = \ulcorner \varphi_1 \dots \varphi_k \urcorner$ for some sequence $\varphi_1 \dots \varphi_k$ of formulas of $\mathcal{L}_N$. \index{$\mathsc{Formulas}$} \item $\mathsc{Inference}(n,i,j) \iff n = \ulcorner \varphi_1 \dots \varphi_k \urcorner$ for some sequence $\varphi_1 \dots \varphi_k$ of formulas of $\mathcal{L}_N$, $1 \le i,j \le k$, and $\varphi_k$ follows from $\varphi_i$ and $\varphi_j$ by Modus Ponens. \index{$\mathsc{Inference}$} \item $\mathsc{Deduction}_\Delta(n) \iff n = \ulcorner \varphi_1 \dots \varphi_k \urcorner$ for a deduction $\varphi_1 \dots \varphi_k$ from $\Delta$ in $\mathcal{L}_N$. 
\index{$\mathsc{Deduction}_\Delta$} \item $\mathsc{Conclusion}_\Delta(n,m) \iff n = \ulcorner \varphi_1 \dots \varphi_k \urcorner$ for a deduction $\varphi_1 \dots \varphi_k$ from $\Delta$ in $\mathcal{L}_N$ and $m = \ulcorner \varphi_k \urcorner$. \index{$\mathsc{Conclusion}_\Delta$} \end{enumerate} If $\ulcorner \Delta \urcorner$ is primitive recursive, which of these are primitive recursive? \end{prob} It is at this point that the connection between computability and completeness begins to appear. \begin{thm} \label{t:sixteen3} Suppose $\Delta$ is a recursive set of sentences of $\mathcal{L}_N$. Then $\ulcorner \mathrm{Th}(\Delta) \urcorner$ is \begin{enumerate} \item recursively enumerable, and \item recursive if and only if $\Delta$ is complete. \end{enumerate} \end{thm} \begin{note} It follows that if $\Delta$ is not complete, then $\ulcorner \mathrm{Th}(\Delta) \urcorner$ is an example of a recursively enumerable but not recursive set. \end{note}
%
% Seventeenth chapter of "A Problem Course in Mathematical Logic"
%
\chapter{Defining Recursive Functions In Arithmetic} \label{ch:seventeen} The definitions and results in Chapter \ref{ch:sixteen} let us use natural numbers and recursive functions to code and manipulate formulas of $\mathcal{L}_N$. We will also need complementary results that let us use terms and formulas of $\mathcal{L}_N$ to represent and manipulate natural numbers and recursive functions. \subsection*{Axioms for basic arithmetic} We will define a set of non-logical axioms in $\mathcal{L}_N$ which prove enough about the operations of successor, addition, multiplication, and exponentiation to let us define all the recursive functions using formulas of $\mathcal{L}_N$. The non-logical axioms in question essentially guarantee that basic arithmetic works properly. \begin{defn} \label{df:lna} Let $\mathcal{A}$ be the following set of sentences of $\mathcal{L}_N$, written out in official form.
\index{$\mathcal{A}$} \begin{description} \item[N1] $\forall v_0\, (\lnot = Sv_0 0)$ \index{N1} \item[N2] $\forall v_0\, ((\lnot = v_0 0) \to (\lnot \forall v_1\, (\lnot = Sv_1 v_0)))$ \index{N2} \item[N3] $\forall v_0 \forall v_1 \, (= Sv_0 Sv_1 \to = v_0 v_1)$ \index{N3} \item[N4] $\forall v_0\, = + v_0 0 v_0$ \index{N4} \item[N5] $\forall v_0 \forall v_1\, = + v_0 Sv_1 S + v_0 v_1$ \index{N5} \item[N6] $\forall v_0 \, = \cdot v_0 0 0$ \index{N6} \item[N7] $\forall v_0 \forall v_1\, = \cdot v_0 Sv_1 + \cdot v_0 v_1 v_0$ \index{N7} \item[N8] $\forall v_0\, = Ev_0 0 S0$ \index{N8} \item[N9] $\forall v_0 \forall v_1\, = E v_0 Sv_1 \cdot E v_0 v_1 v_0$ \index{N9} \end{description} \end{defn} Translated from the official forms, $\mathcal{A}$ consists of the following axioms about the natural numbers: \begin{description} \item[N1] For all $n$, $n + 1 \ne 0$. \index{N1} \item[N2] For all $n$, if $n \ne 0$ then there is a $k$ such that $k + 1 = n$. \index{N2} \item[N3] For all $n$ and $k$, $n + 1 = k + 1$ implies that $n = k$. \index{N3} \item[N4] For all $n$, $n + 0 = n$. \index{N4} \item[N5] For all $n$ and $k$, $n + (k + 1) = (n + k) + 1$. \index{N5} \item[N6] For all $n$, $n \cdot 0 = 0$. \index{N6} \item[N7] For all $n$ and $k$, $n \cdot (k + 1) = (n \cdot k) + n$. \index{N7} \item[N8] For all $n$, $n^0 = 1$. \index{N8} \item[N9] For all $n$ and $k$, $n^{k+1} = (n^k) \cdot n$. \index{N9} \end{description} Each of the axioms in $\mathcal{A}$ is true of the natural numbers: \begin{prop} \label{p:axioms} $\mathfrak{N} \models \mathcal{A}$, where $\mathfrak{N} = (\mathbb{N},0,\mathsc{S},+,\cdot,\mathsc{E})$ is the structure consisting of the natural numbers with the usual zero and the usual successor, addition, multiplication, and exponentiation operations. \end{prop} However, $\mathcal{A}$ is a long way from being able to prove all the sentences of first-order arithmetic true in $\mathfrak{N}$.
For example, though we won't prove it, it turns out that $\mathcal{A}$ is not enough to ensure that induction works: that for every formula $\varphi$ with at most the variable $x$ free, if $\varphi^x_0$ and $\forall y\, (\varphi^x_y \to \varphi^x_{Sy})$ hold, then so does $\forall x\, \varphi$. On the other hand, neither $\mathcal{L}_N$ nor $\mathcal{A}$ is quite as minimal as it might be. For example, with some (considerable) extra effort one could do without $E$ and define it from $\cdot$ and $+$. \subsection*{Representing functions and relations} For convenience, we will adopt the following conventions. First, we will often abbreviate the term of $\mathcal{L}_N$ consisting of $m$ $S$s followed by $0$ by $S^m0$.\index{$S^m0$} For example, $S^30$ abbreviates $SSS0$. The term $S^m0$ is a convenient name for the natural number $m$ in the language $\mathcal L_N$ since the interpretation of $S^m0$ in $\mathfrak{N}$ is $m$: \begin{lem} \label{l:inter} For every $m \in \mathbb{N}$ and every assignment $s$ for $\mathfrak{N}$, $\mathbf{s}(S^m0) = m$. \end{lem} Second, if $\varphi$ is a formula of $\mathcal{L}_N$ with all of its free variables among $v_1$, \dots, $v_k$, and $m_1$, \dots, $m_k$ are natural numbers, we will write $\varphi(S^{m_1}0, \dots, S^{m_k}0)$ \index{$\varphi(S^{m_1}0, \dots, S^{m_k}0)$} for the sentence $\varphi^{v_1 \dots v_k}_{S^{m_1}0, \dots, S^{m_k}0}$, {\em i.e.\/} $\varphi$ with $S^{m_i}0$ substituted for every free occurrence of $v_i$. Since the term $S^{m_i}0$ involves no variables, it is substitutable for $v_i$ in $\varphi$. \begin{defn} \label{df:rcna} Suppose $\Sigma$ is a set of sentences of $\mathcal{L}_N$.
A $k$-place function $f$ is said to be {\em representable\/} \index{representable, function} in $\mathrm{Th}(\Sigma) = \{\, \tau \mid \Sigma \proves \tau \,\}$ if there is a formula $\varphi$ of $\mathcal{L}_N$ with at most $v_1$, \dots, $v_k$, and $v_{k+1}$ as free variables such that $$ \begin{aligned} f(n_1,\dots,n_k) = m &\iff \varphi(S^{n_1}0,\dots,S^{n_k}0,S^m0) \in \mathrm{Th}(\Sigma) \\ &\iff \Sigma \proves \varphi(S^{n_1}0,\dots,S^{n_k}0,S^m0) \end{aligned} $$ for all $n_1$, \dots, $n_k$, and $m$ in $\mathbb{N}$. The formula $\varphi$ is said to {\em represent\/}\index{represent, function} $f$ in $\mathrm{Th}(\Sigma)$. \end{defn} We will use this definition mainly with $\Sigma = \mathcal{A}$. \begin{exmp} \label{ex:frcf} The constant function $c^1_3$ given by $c^1_3(n) = 3$ is representable in $\mathrm{Th}(\mathcal{A})$; $v_2 = S^30$ is a formula representing it. Note that this formula has no free variable for the input of the $1$-place function, but then the input is irrelevant\dots To see that $v_2 = S^30$ really does represent $c^1_3$ in $\mathrm{Th}(\mathcal{A})$, we need to verify that \begin{align*} c^1_3(n) = m & \iff \mathcal{A} \proves v_2 = S^30 (S^n0,S^m0) \\ & \iff \mathcal{A} \proves S^m0 = S^30 \end{align*} for all $n,m \in \mathbb{N}$. In one direction, suppose that $c^1_3(n) = m$. Then, by the definition of $c^1_3$, we must have $m = 3$. Now \begin{enumerate} \item $\forall x \, x = x \,\to\, S^30 = S^30$ \hfill A4 \item $\forall x \, x = x$ \hfill A8 \item $S^30 = S^30$ \hfill 1,2 MP \end{enumerate} is a deduction of $S^30 = S^30$ from $\mathcal{A}$. Hence if $c^1_3(n) = m$, then $\mathcal{A} \proves S^m0 = S^30$. In the other direction, suppose that $\mathcal{A} \proves S^m0 = S^30$. Since $\mathfrak{N} \models \mathcal{A}$, it follows that $\mathfrak{N} \models S^m0 = S^30$. It follows from Lemma \ref{l:inter} that $m = 3$, so $c^1_3(n) = m$. Hence if $\mathcal{A} \proves S^m0 = S^30$, then $c^1_3(n) = m$.
\end{exmp} \begin{prob} \label{p:reppi32} Show that the projection function $\pi^3_2$ can be represented in $\mathrm{Th}(\mathcal{A})$. \end{prob} \begin{defn} \label{df:rcnab} A $k$-place relation $P \subseteq \mathbb{N}^k$ is said to be {\em representable\/} \index{representable, relation} in $\mathrm{Th}(\Sigma)$ if there is a formula $\psi$ of $\mathcal L_N$ with at most $v_1$, \dots, $v_k$ as free variables such that $$ \begin{aligned} P(n_1,\dots,n_k) &\iff \psi(S^{n_1}0,\dots,S^{n_k}0) \in \mathrm{Th}(\Sigma) \\ &\iff \Sigma \proves \psi(S^{n_1}0,\dots,S^{n_k}0) \end{aligned} $$ for all $n_1$, \dots, $n_k$ in $\mathbb N$. The formula $\psi$ is said to {\em represent\/}\index{represent, relation} $P$ in $\mathrm{Th}(\Sigma)$. \end{defn} We will also use this definition mainly with $\Sigma = \mathcal{A}$. \begin{exmp} \label{ex:frcr} Almost the same formula, $v_1 = S^30$, serves to represent the set --- {\em i.e.\/} $1$-place relation --- $\{ 3 \}$ in $\mathrm{Th}(\mathcal{A})$. Showing that $v_1 = S^30$ really does represent $\{ 3 \}$ in $\mathrm{Th}(\mathcal{A})$ is virtually identical to the corresponding argument in Example \ref{ex:frcf}. \end{exmp} \begin{prob} \label{p:repdif} Explain why $v_2 = SSS0$ does not represent the set $\{ 3 \}$ in $\mathrm{Th}(\mathcal{A})$ and $v_1 = SSS0$ does not represent the constant function $c^1_3$ in $\mathrm{Th}(\mathcal{A})$. \end{prob} \begin{prob} Show that the set of all even numbers is representable in $\mathrm{Th}(\mathcal{A})$. \end{prob} \begin{prob} \label{p:seventeen2} Show that the initial functions are representable in $\mathrm{Th}(\mathcal{A})$: \begin{enumerate} \item The zero function $\mathsc{O}(n) = 0$. \index{$\mathsc{O}$} \item The successor function $\mathsc{S}(n) = n + 1$. \index{$\mathsc{S}$} \item For every positive $k$ and $i \le k$, the projection function $\pi^k_i$.
\index{$\pi^k_i$} \end{enumerate} \end{prob} It turns out that all recursive functions and relations are representable in $\mathrm{Th}(\mathcal{A})$. \begin{prop} \label{p:seventeen3} A $k$-place function $f$ is representable in $\mathrm{Th}(\mathcal{A})$ if and only if the $k+1$-place relation $P_f$ defined by $$ P_f(n_1, \dots, n_k, n_{k+1}) \iff f(n_1, \dots, n_k) = n_{k+1} $$ is representable in $\mathrm{Th}(\mathcal{A})$. Also, a relation $P \subseteq \mathbb N^k$ is representable in $\mathrm{Th}(\mathcal{A})$ if and only if its characteristic function $\chi_P$ is representable in $\mathrm{Th}(\mathcal{A})$. \end{prop} \begin{prop} \label{p:seventeen4} Suppose $g_1$, \dots, $g_m$ are $k$-place functions and $h$ is an $m$-place function, all of them representable in $\mathrm{Th}(\mathcal{A})$. Then $f = h \circ (g_1,\dots,g_m)$ is a $k$-place function representable in $\mathrm{Th}(\mathcal{A})$. \end{prop} \begin{prop} \label{p:seventeen5} Suppose $g$ is a $k+1$-place regular function which is representable in $\mathrm{Th}(\mathcal{A})$. Then the unbounded minimalization of $g$ is a $k$-place function representable in $\mathrm{Th}(\mathcal{A})$. \end{prop} Between them, the above results supply most of what is needed to conclude that all recursive functions and relations on the natural numbers are representable. The exception is showing that functions defined by primitive recursion from representable functions are also representable, which requires some additional effort. The basic problem is that it is not obvious how a formula defining a function can get at previous values of the function. To accomplish this, we will borrow a trick from Chapter \ref{ch:thirteen}. \begin{prob} \label{p:seventeen6} Show that each of the following relations and functions (first defined in Problem \ref{r:rfs}) is representable in $\mathrm{Th}(\mathcal{A})$. 
\begin{enumerate} \item $\mathsc{Div}(n,m) \iff n \mid m$ \index{$\mathsc{Div}$} \item $\mathsc{IsPrime}(n) \iff n \text{ is prime}$ \index{$\mathsc{IsPrime}$} \item $\mathsc{Prime}(k) = p_k$, where $p_0 = 1$ and $p_k$ is the $k$th prime if $k \ge 1$. \index{$\mathsc{Prime}$} \item $\mathsc{Power}(n,m) = k$, where $k \ge 0$ is maximal such that $n^k \mid m$. \index{$\mathsc{Power}$} \item $\mathsc{Length}(n) = \ell$, where $\ell$ is maximal such that $p_\ell \mid n$. \index{$\mathsc{Length}$} \item $\mathsc{Element}(n,i) = n_i$, where $n = p_1^{n_1} \dots p_k^{n_k}$ (and $n_i = 0$ if $i > k$). \index{$\mathsc{Element}$} \end{enumerate} \end{prob} Using the representable functions and relations given above, we can represent a ``history function'' of any representable function\dots \begin{prob} \label{p:seventeen7} Suppose $f$ is a $k+1$-place function representable in $\mathrm{Th}(\mathcal{A})$. Show that \begin{align*} F(n_1,\dots,n_k,m) &= p_1^{f(n_1,\dots,n_k,0)} \dots p_{m+1}^{f(n_1,\dots,n_k,m)} \\ &= \prod_{i=0}^m p_{i+1}^{f(n_1,\dots,n_k,i)} \end{align*} is also representable in $\mathrm{Th}(\mathcal{A})$. \end{prob} \noindent\dots and use it! \begin{prop} \label{p:seventeen8} Suppose $g$ is a $k$-place function and $h$ is a $k+2$-place function, both representable in $\mathrm{Th}(\mathcal{A})$. Then the $k+1$-place function $f$ defined by primitive recursion from $g$ and $h$ is also representable in $\mathrm{Th}(\mathcal{A})$. \end{prop} \begin{thm} \label{th:rfra} Recursive functions are representable in $\mathrm{Th}(\mathcal{A})$. \end{thm} In particular, it follows that there are formulas of $\mathcal{L}_N$ representing each of the functions from Chapter \ref{ch:sixteen} for manipulating the codes of formulas. This will permit us to construct formulas which encode assertions about terms, formulas, and deductions; we will ultimately prove the Incompleteness Theorem by showing there is a formula which codes its own unprovability.
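\begin{exmp} As a small illustration of the ``history function'' of Problem \ref{p:seventeen7}, consider the $1$-place function $f(i) = i + 1$, with no parameters $n_1$, \dots, $n_k$. This particular $f$ is an assumption chosen only to keep the arithmetic short; it is not used elsewhere in the text. For $m = 2$ we get
\begin{align*}
F(2) &= p_1^{f(0)} p_2^{f(1)} p_3^{f(2)} = 2^1 \cdot 3^2 \cdot 5^3 = 2250 \, , \\
\mathsc{Length}(2250) &= 3 \, , \\
\mathsc{Element}(2250,i+1) &= i + 1 = f(i) \quad \text{for $i = 0$, $1$, $2$.}
\end{align*}
So the single number $F(2)$ stores the values $f(0)$, $f(1)$, and $f(2)$, and $\mathsc{Element}$ recovers each of them. This is the sort of access to previous values that a definition by primitive recursion requires.
\end{exmp}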
\subsection*{Representability} We conclude with some more general facts about representability. \begin{prop} \label{p:seventeen1a} Suppose $\Sigma$ is a set of sentences of $\mathcal{L}_N$ and $f$ is a $k$-place function which is representable in $\mathrm{Th}(\Sigma)$. Then $\Sigma$ must be consistent. \end{prop} \begin{prob} \label{p:seventeen1b} If $\Sigma$ is a set of sentences of $\mathcal{L}_N$ and $P$ is a $k$-place relation which is representable in $\mathrm{Th}(\Sigma)$, does $\Sigma$ have to be consistent? \end{prob} \begin{prop} \label{p:seventeen1} Suppose $\Sigma$ and $\Gamma$ are consistent sets of sentences of $\mathcal{L}_N$ and $\Sigma \proves \Gamma$, {\em i.e.\/} $\Sigma \proves \gamma$ for every $\gamma \in \Gamma$. Then every function and relation which is representable in $\mathrm{Th}(\Gamma)$ is representable in $\mathrm{Th}(\Sigma)$. \end{prop} This lets us carry over everything we can do with representability in $\mathrm{Th}(\mathcal{A})$ to any consistent set of axioms in $\mathcal{L}_N$ that is at least as powerful as $\mathcal{A}$. \begin{cor} \label{p:representability} Functions and relations which are representable in $\mathrm{Th}(\mathcal{A})$ are also representable in $\mathrm{Th}(\Sigma)$, for any consistent set of sentences $\Sigma$ such that $\Sigma \proves \mathcal{A}$. \end{cor}
%
% Chapter 18 of "A Problem Course in Mathematical Logic"
%
\chapter{The Incompleteness Theorem} \label{ch:eighteen} The material in Chapter \ref{ch:sixteen} effectively allows us to use recursive functions to manipulate coded formulas of $\mathcal{L}_N$, while the material in Chapter \ref{ch:seventeen} allows us to represent recursive functions using formulas of $\mathcal{L}_N$. Combining these techniques allows us to use formulas of $\mathcal{L}_N$ to refer to and manipulate codes of formulas of $\mathcal{L}_N$. This is the key to proving G\"odel's Incompleteness Theorem and related results.
In particular, we will need to know one further trick about manipulating the codes of formulas recursively, namely that the operation of substituting (the code of) the term $S^k0$ into (the code of) a formula with one free variable is recursive. \begin{prob} \label{pb:rpfn} Show that the function $$ \mathsc{Sub}(n,k) = \begin{cases} \ulcorner \varphi(S^k0) \urcorner & \text{if $n = \ulcorner \varphi \urcorner$ for a formula $\varphi$ of $\mathcal{L}_N$} \\ & \text{with at most $v_1$ free} \\ 0 & \text{otherwise} \end{cases} $$ \index{$\mathsc{Sub}$} is recursive, and hence representable in $\mathrm{Th}(\mathcal{A})$. \end{prob} In order to combine the results from Chapter \ref{ch:sixteen} with those from Chapter \ref{ch:seventeen}, we will also need to know the following. \begin{lem} \label{p:eighteen1} $\mathcal{A}$ is a recursive set of sentences of $\mathcal{L}_N$. \end{lem} \subsection*{The First Incompleteness Theorem} The key result needed to prove the First Incompleteness Theorem (another will follow shortly!) is the following lemma. It asserts, in effect, that for any statement about (the code of) some sentence, there is a sentence $\sigma$ which is true or false exactly when the statement is true or false of (the code of) $\sigma$. This fact will allow us to show that the self-referential sentence needed to prove the Incompleteness Theorem exists. \begin{lem}[Fixed-Point Lemma] \index{Fixed-Point Lemma} \label{l:fpl} Suppose $\varphi$ is a formula of $\mathcal{L}_N$ with only $v_1$ as a free variable. Then there is a sentence $\sigma$ of $\mathcal{L}_N$ such that $$ \mathcal{A} \proves \sigma \fromto \varphi(S^{\ulcorner \sigma \urcorner}0) \, . $$ \end{lem} Note that $\sigma$ must be different from the sentence $\varphi(S^{\ulcorner \sigma \urcorner}0)$: there is no way to find a formula $\varphi$ with one free variable and an integer $k$ such that $\ulcorner \varphi(S^k0) \urcorner = k$.
(Think about how G\"odel codes are defined\dots) With the Fixed-Point Lemma in hand, G\"odel's First Incompleteness Theorem can be put away in fairly short order. \begin{thm}[G\"odel's First Incompleteness Theorem] \index{G\"odel's First Incompleteness Theorem} \index{Incompleteness Theorem, G\"odel's First} \label{t:GIT} Suppose $\Sigma$ is a consistent recursive set of sentences of $\mathcal{L}_N$ such that $\Sigma \proves \mathcal{A}$. Then $\Sigma$ is not complete. \end{thm} That is, any consistent set of sentences which proves at least as much about the natural numbers as $\mathcal{A}$ does can't be both complete and recursive. The First Incompleteness Theorem has many variations, corollaries, and relatives, a few of which will be mentioned below. \cite{RS:GIT} is a good place to learn about more of them. \begin{cor} \label{p:eighteen5} \begin{enumerate} \item Let $\Gamma$ be a complete set of sentences of $\mathcal{L}_N$ such that $\Gamma \cup \mathcal{A}$ is consistent. Then $\Gamma$ is not recursive. \item Let $\Delta$ be a recursive set of sentences such that $\Delta \cup \mathcal{A}$ is consistent. Then $\Delta$ is not complete. \item The theory of $\mathfrak{N}$, \index{theory of $\mathfrak{N}$} \index{$\mathrm{Th}(\mathfrak{N})$} $$ \mathrm{Th}(\mathfrak{N}) =\{\, \sigma \mid \text{$\sigma$ is a sentence of $\mathcal{L}_N$ and $\mathfrak{N} \models \sigma$} \,\} \, , $$ is not recursive. \end{enumerate} \end{cor} There is nothing really special about working in $\mathcal{L}_N$. The proof of G\"odel's Incompleteness Theorem can be executed for any first order language and recursive set of axioms which allow one to code and prove enough facts about arithmetic. In particular, it can be done whenever the language and axioms are powerful enough --- as in Zermelo-Fraenkel set theory, for example --- to define the natural numbers and prove some modest facts about them. 
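\begin{exmp} To see concretely what the function $\mathsc{Sub}$ of Problem \ref{pb:rpfn} does to codes, take $\varphi$ to be $=v_10$, the official form of $v_1 = 0$, and $k = 2$. Both choices are assumptions made only for illustration. Then $\varphi(S^20)$ is $=SS00$, and by Definitions \ref{df:gcsym} and \ref{df:gcfor}
\begin{align*}
\ulcorner \varphi \urcorner &= 2^{\ulcorner = \urcorner} 3^{\ulcorner v_1 \urcorner} 5^{\ulcorner 0 \urcorner} = 2^6 3^{13} 5^7 \, , \\
\mathsc{Sub}(2^6 3^{13} 5^7, 2) &= \ulcorner =SS00 \urcorner = 2^{\ulcorner = \urcorner} 3^{\ulcorner S \urcorner} 5^{\ulcorner S \urcorner} 7^{\ulcorner 0 \urcorner} 11^{\ulcorner 0 \urcorner} = 2^6 3^8 5^8 7^7 11^7 \, .
\end{align*}
Note that the substitution lengthens the sequence, so the symbols after $v_1$ move to later primes.
\end{exmp}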
\subsection*{The Second Incompleteness Theorem} G\"odel also proved a strengthened version of the Incompleteness Theorem which asserts that, in particular, a consistent recursive set of sentences $\Sigma$ of $\mathcal{L}_N$ cannot prove its own consistency. To get at it, we need to express the statement ``$\Sigma$ is consistent'' in $\mathcal{L}_N$. \begin{prob} \label{p:eighteen6} \index{$\mathrm{Con}(\Sigma)$} Suppose $\Sigma$ is a recursive set of sentences of $\mathcal{L}_N$. Find a sentence of $\mathcal{L}_N$, which we'll denote by $\mathrm{Con}(\Sigma)$, such that $\Sigma$ is consistent if and only if $\mathcal{A} \proves \mathrm{Con}(\Sigma)$. \end{prob} \begin{thm}[G\"odel's Second Incompleteness Theorem] \index{G\"odel's Second Incompleteness Theorem} \index{Incompleteness Theorem, G\"odel's Second} \label{t:GSIT} Let $\Sigma$ be a consistent recursive set of sentences of $\mathcal{L}_N$ such that $\Sigma \proves \mathcal{A}$. Then $\Sigma \nproves \mathrm{Con}(\Sigma)$. \end{thm} As with the First Incompleteness Theorem, the Second Incompleteness Theorem holds for any recursive set of sentences in a first-order language which allows one to code and prove enough facts about arithmetic. The perverse consequence of the Second Incompleteness Theorem is that only an inconsistent set of axioms can prove its own consistency. \subsection*{Truth and definability} A close relative of the Incompleteness Theorem is the assertion that truth in $\mathfrak{N} = (\mathbb{N},0,\mathsc{S},+,\cdot,\mathsc{E})$ is not definable in $\mathfrak{N}$. To make sense of this, of course, we need to sort out what ``truth'' and ``definable in $\mathfrak{N}$'' mean here. ``Truth'' means what it usually does in first-order logic: all we mean when we say that a sentence $\sigma$ of $\mathcal{L}_N$ is true in $\mathfrak{N}$ is that $\sigma$ is true when interpreted as a statement about the natural numbers with the usual operations.
That is, $\sigma$ is true in $\mathfrak{N}$ exactly when $\mathfrak{N}$ satisfies $\sigma$, {\em i.e.\/} exactly when $\mathfrak{N} \models \sigma$. ``Definable in $\mathfrak{N}$'' we do have to define\dots \begin{defn} \index{definable relation} \index{relation definable in $\mathfrak{N}$} A $k$-place relation $P$ is {\em definable\/} in $\mathfrak{N}$ if there is a formula $\varphi$ of $\mathcal{L}_N$ with at most $v_1$, \dots, $v_k$ as free variables such that $$ P(n_1,\dots,n_k) \iff \mathfrak{N} \models \varphi [s(v_1|n_1)\dots(v_k|n_k)] $$ for all $n_1$, \dots, $n_k$ in $\mathbb{N}$ and every assignment $s$ for $\mathfrak{N}$. The formula $\varphi$ is said to {\em define\/} $P$ in $\mathfrak{N}$. \end{defn} A definition of ``function definable in $\mathfrak{N}$'' \index{function definable in $\mathfrak{N}$} \index{definable function} could be made in a similar way, of course. Definability is a close relative of representability: \begin{prop} \label{p:eighteen8} Suppose $P$ is a $k$-place relation which is representable in $\mathrm{Th}(\mathcal{A})$. Then $P$ is definable in $\mathfrak{N}$. \end{prop} \begin{prob} \label{p:eighteen9} Is the converse to Proposition \ref{p:eighteen8} true? \end{prob} The question of whether truth in $\mathfrak{N}$ is definable is then the question of whether the set of G\"odel codes of sentences of $\mathcal{L}_N$ true in $\mathfrak{N}$, $$ \ulcorner \mathrm{Th}(\mathfrak{N}) \urcorner = \{\, \ulcorner \sigma \urcorner \mid \text{$\sigma$ is a sentence of $\mathcal{L}_N$ and $\mathfrak N \models \sigma$} \,\} \, , $$ is definable in $\mathfrak{N}$. It isn't: \begin{thm}[Tarski's Undefinability Theorem] \index{Tarski's Undefinability Theorem} \index{Undefinability Theorem, Tarski's} \label{t:TUT} $\ulcorner \mathrm{Th}(\mathfrak{N}) \urcorner$ is \linebreak not definable in $\mathfrak{N}$. \end{thm} \subsection*{The implications} G\"odel's Incompleteness Theorems have some serious consequences.
Since almost all of mathematics can be formalized in first-order logic, the First Incompleteness Theorem implies that there is no effective procedure that will find and prove all theorems. This might be considered as job security for research mathematicians. The Second Incompleteness Theorem, on the other hand, implies that we can never be completely sure that any reasonable set of axioms is actually consistent unless we take a more powerful set of axioms on faith. It follows that one can never be completely sure --- faith aside --- that the theorems proved in mathematics are really true. This might be considered as job security for philosophers of mathematics. We leave the question of who gets job security from Tarski's Undefinability Theorem to you, gentle reader\dots \chapter*{Hints for Chapters 15--18}
%
% Hints for Chapter 15 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:fifteen}} \begin{clue}{p:fifteen1} Compare Definition \ref{d:cplt} with the definition of maximal consistency. \end{clue}
%
% Hints for Chapter 16 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:sixteen}} \begin{clue}{p:formcodes} Do what is done in Example \ref{ex:formseqcode} for some other sequence of formulas. \end{clue} \begin{clue}{p:dualcodes} You need to unwind Definitions \ref{df:gcsym} and \ref{df:gcfor}, keeping in mind that you are dealing with formulas and sequences of formulas, not just arbitrary sequences of symbols of $\mathcal{L}_N$ or sequences of sequences of symbols. \end{clue} \begin{clue}{pr:loth} In each case, use Definitions \ref{df:gcsym} and \ref{df:gcfor}, along with the appropriate definitions from first-order logic and the tools developed in Problems \ref{p:thirteen6} and \ref{r:rfs}.
\begin{enumerate} \item Recall that in $\mathcal{L}_N$, a term is either a variable symbol, {\em i.e.\/} $v_k$ for some $k$, the constant symbol $0$, of the form $St$ for some (shorter) term $t$, or of the form $+t_1t_2$, $\cdot t_1t_2$, or $Et_1t_2$ for some (shorter) terms $t_1$ and $t_2$. $\chi_{\mathsc{Term}}(n)$ needs to check the length of the sequence coded by $n$. If this is of length $1$, it will need to check if the symbol coded is $0$ or $v_k$ for some $k$; otherwise, it needs to check if the sequence coded by $n$ begins with $S$, $+$, $\cdot$, or $E$, and then whether the rest of the sequence consists of one or two valid terms. Primitive recursion is likely to be necessary in the latter case if you can't figure out how to do it using the tools from Problems \ref{p:thirteen6} and \ref{r:rfs}. \item This is similar to showing $\mathsc{Term}(n)$ is primitive recursive. Recall that in $\mathcal L_N$, a formula is of the form either $=t_1t_2$ for some terms $t_1$ and $t_2$, $(\lnot\alpha)$ for some (shorter) formula $\alpha$, $(\beta \to \gamma)$ for some (shorter) formulas $\beta$ and $\gamma$, or $\forall v_i\, \delta$ for some variable symbol $v_i$ and some (shorter) formula $\delta$. $\chi_{\mathsc{Formula}}(n)$ needs to check the first symbol of the sequence coded by $n$ to identify which case ought to apply and then take it from there. \item Recall that a sentence is just a formula with no free variable; that is, every occurrence of a variable is in the scope of a quantifier. \item Each logical axiom is an instance of one of the schemas A1--A8, or is a generalization thereof. \end{enumerate} \end{clue} \begin{clue}{pr:loac} In each case, use Definitions \ref{df:gcsym} and \ref{df:gcfor}, together with the appropriate definitions from first-order logic and the tools developed in Problems \ref{r:rfs} and \ref{pr:loth}.
\begin{enumerate} \item $\ulcorner \Delta \urcorner$ is recursive and $\mathsc{Logical}$ is primitive recursive, so\dots \item All $\chi_{\mathsc{Formulas}}(n)$ has to do is check that every element of the sequence coded by $n$ is the code of a formula, and $\mathsc{Formula}$ is already known to be primitive recursive. \item $\chi_{\mathsc{Inference}}(n,i,j)$ needs to check that $n$ is the code of a sequence of formulas, with the additional property that either $\varphi_i$ is $(\varphi_j \to \varphi_k)$ or $\varphi_j$ is $(\varphi_i \to \varphi_k)$. Part of what goes into $\chi_{\mathsc{Formula}}(n)$ may be handy for checking the additional property. \item Recall that a deduction from $\Delta$ is a sequence of formulas $\varphi_1 \dots \varphi_k$ where each formula is either a premiss or follows from preceding formulas by Modus Ponens. \item $\chi_{\mathsc{Conclusion}}(n,m)$ needs to check that $n$ is the code of a deduction and that $m$ is the code of the last formula in that deduction. \end{enumerate} They're all primitive recursive if $\ulcorner \Delta \urcorner$ is, by the way. \end{clue} \begin{clue}{t:sixteen3} \begin{enumerate} \item Use unbounded minimalization and the relations in Problem \ref{pr:loac} to define a function which, given $n$, returns the $n$th integer which codes an element of $\mathrm{Th}(\Delta)$. \item If $\Delta$ is complete, then for any sentence $\sigma$, either $\ulcorner \sigma \urcorner$ or $\ulcorner \lnot \sigma \urcorner$ must eventually turn up in an enumeration of $\ulcorner \mathrm{Th}(\Delta) \urcorner$. The other direction is really just a matter of unwinding the definitions involved. \end{enumerate} \end{clue}
%
% Hints for Chapter 17 of "A Problem Course in Mathematical Logic"
%
\subsection*{Hints for Chapter~\ref{ch:seventeen}} \begin{clue}{p:seventeen1} Every deduction from $\Gamma$ can be replaced by a deduction from $\Sigma$ with the same conclusion.
\end{clue} \begin{clue}{p:seventeen1a} If $\Sigma$ were inconsistent it would prove entirely too much\dots \end{clue} \begin{clue}{p:seventeen2} \begin{enumerate} \item Adapt Example \ref{ex:frcf}. \item Use the $1$-place function symbol $S$ of $\mathcal{L}_N$. \item There is much less to this part than meets the eye\dots \end{enumerate} \end{clue} \begin{clue}{p:seventeen3} In each case, you need to use the given representing formula to define the one you need. \end{clue} \begin{clue}{p:seventeen4} String together the formulas representing $g_1$, \dots, $g_m$, and $h$ with $\land$s and put some existential quantifiers in front. \end{clue} \begin{clue}{p:seventeen5} First show that $<$ is representable in $\mathrm{Th}(\mathcal{A})$ and then exploit this fact. \end{clue} \begin{clue}{p:seventeen6} \begin{enumerate} \item $n \mid m$ if and only if there is some $k$ such that $n \cdot k = m$. \item $n$ is prime if and only if there is no $\ell$ such that $\ell \mid n$ and $1 < \ell < n$. \item $p_k$ is the first prime with exactly $k-1$ primes less than it. \item Note that $k$ must be minimal such that $n^{k+1} \nmid m$. \item You'll need a couple of the previous parts. \item Ditto. \end{enumerate} \end{clue} \begin{clue}{p:seventeen7} Problem \ref{p:seventeen6} has most of the ingredients needed here. \end{clue} \begin{clue}{p:seventeen8} Problems \ref{p:seventeen6} and \ref{p:seventeen7} have most of the necessary ingredients between them. \end{clue} \begin{clue}{th:rfra} Proceed by induction on the number of applications of composition, primitive recursion, and unbounded minimalization in the recursive definition of $f$, using the previous results in Chapter \ref{ch:seventeen} at the basis and induction steps. \end{clue} % % Hints for Chapter 18 of "A Problem Course in Mathematical Logic" % \subsection*{Hints for Chapter~\ref{ch:eighteen}} \begin{clue}{p:eighteen1} $\mathcal{A}$ is a {\em finite\/} set of sentences. 
\end{clue} \begin{clue}{pb:rpfn} First show that recognizing that a formula has at most $v_1$ as a free variable is recursive. The rest boils down to checking that substituting a term for a free variable is also recursive, which has already had to be done in the solutions to Problem \ref{pr:loth}. \end{clue} \begin{clue}{l:fpl} Let $\psi$ be the formula (with at most $v_1$, $v_2$, and $v_3$ free) which represents the function $f$ of Problem \ref{pb:rpfn} in $\mathrm{Th}(\mathcal{A})$. Then the formula $\forall v_3\, ( \psi^{v_2}_{v_1} \to \varphi^{v_1}_{v_3} )$ has only one variable free, namely $v_1$, and is very close to being the sentence $\sigma$ needed. To obtain $\sigma$ you need to substitute $S^k0$ for $v_1$, for a suitable $k$. \end{clue} \begin{clue}{t:GIT} Try to prove this by contradiction. Observe first that if $\Sigma$ is recursive, then $\ulcorner \mathrm{Th}(\Sigma) \urcorner$ is representable in $\mathrm{Th}(\mathcal{A})$. \end{clue} \begin{clue}{p:eighteen5} \begin{enumerate} \item If $\Gamma$ were recursive, you could get a contradiction to the Incompleteness Theorem. \item If $\Delta$ were complete, it couldn't also be recursive. \item Note that $\mathcal{A} \subset \mathrm{Th}(\mathfrak{N})$. \end{enumerate} \end{clue} \begin{clue}{p:eighteen6} Modify the formula representing the function $\mathsc{Conclusion}_\Sigma$ (defined in Problem \ref{pr:loac}) to get $\mathrm{Con}(\Sigma)$. \end{clue} \begin{clue}{t:GSIT} Try to do a proof by contradiction in three stages. First, find a formula $\varphi$ (with just $v_1$ free) that represents ``$n$ is the code of a sentence which cannot be proven from $\Sigma$'' and use the Fixed-Point Lemma to find a sentence $\tau$ such that $\Sigma \proves \tau \fromto \varphi(S^{\ulcorner \tau \urcorner}0)$. Second, show that if $\Sigma$ is consistent, then $\Sigma \nproves \tau$. Third --- the {\em hard\/} part --- show that $\Sigma \proves \mathrm{Con}(\Sigma) \to \varphi(S^{\ulcorner \tau \urcorner}0)$. 
This leads directly to a contradiction. \end{clue} \begin{clue}{p:eighteen8} Note that $\mathfrak{N} \models \mathcal{A}$. \end{clue} \begin{clue}{p:eighteen9} If the converse were true, $\mathcal{A}$ would run afoul of the (First) Incompleteness Theorem. \end{clue} \begin{clue}{t:TUT} Suppose, by way of contradiction, that $\ulcorner \mathrm{Th}(\mathfrak{N}) \urcorner$ were definable in $\mathfrak{N}$. Now follow the proof of the (First) Incompleteness Theorem as closely as you can. \end{clue} % % Appendices % \part*{Appendices} \appendix % % Appendix to "A Problem Course in Mathematical Logic" % \chapter{A Little Set Theory} \label{ap:sets} \index{set theory} This appendix is meant to provide an informal summary of the notation, definitions, and facts about sets needed in Chapters \ref{ch:one}--\ref{ch:nine}. For a proper introduction to elementary set theory, try \cite{PH:NST} or \cite{JH:OST}. \begin{defn} \label{d:sed} Suppose $X$ and $Y$ are sets. Then \begin{enumerate} \item $a \in X$ means that $a$ is an {\em element\/} of ({\em i.e.\/} a thing in) the set $X$.\index{element} \index{$\in$} \item $X$ is a subset of $Y$, written as $X \subseteq Y$, if $a \in Y$ for every $a \in X$.\index{subset} \index{$\subseteq$} \item The {\em union\/} of $X$ and $Y$ is $X \cup Y = \{\, a \mid a \in X \text{\ or\ } a \in Y \,\}$.\index{union} \index{$\cup$} \item The {\em intersection\/} of $X$ and $Y$ is $X \cap Y = \{\, a \mid a \in X \text{\ and\ } a \in Y \,\}$.\index{intersection} \index{$\cap$} \item The {\em complement of\/} $Y$ {\em relative to\/} $X$ is $X \setminus Y = \{\, a \mid a \in X \text{\ and\ } a \notin Y \,\}$. \index{complement} \index{$\setminus$} \item The {\em cross product\/} of $X$ and $Y$ is $X \times Y = \{\, (a,b) \mid a \in X \text{\ and\ } b \in Y \,\}$. \index{cross product} \index{$\times$} \item The {\em power set\/} of $X$ is $\mathcal{P}(X) = \{\, Z \mid Z \subseteq X \,\}$. 
\index{power set} \index{$\mathcal{P}$} \item $[X]^k = \{\, Z \mid Z \subseteq X \text{\ and\ } |Z| = k \,\}$ is the set of subsets of $X$ of size $k$. \index{$[X]^k$} \end{enumerate} \end{defn} If the sets being dealt with are all subsets of some fixed set $Z$, the complement of $Y$, $\Bar{Y}$\index{$\Bar{Y}$}, is usually taken to mean the complement of $Y$ relative to $Z$. It may sometimes be necessary to take unions, intersections, and cross products of more than two sets. \begin{defn} Suppose $A$ is a set and $\mathbf{X} = \{\, X_a \mid a \in A \,\}$ is a family of sets indexed by $A$. Then \begin{enumerate} \item The union of $\mathbf{X}$ is the set $\bigcup \mathbf{X} = \{\, z \mid \exists a \in A \colon z \in X_a \,\}$.\index{union} \item The intersection of $\mathbf{X}$ is the set $\bigcap \mathbf{X} = \{\, z \mid \forall a \in A \colon z \in X_a \,\}$.\index{intersection} \item The cross product of $\mathbf{X}$ is the set of sequences (indexed by $A$) $\prod \mathbf{X} = \prod_{a \in A} X_a = \{\, (\, z_a \mid a \in A \,) \mid \forall a \in A \colon z_a \in X_a \,\}$.\index{cross product} \index{$\prod$} \end{enumerate} We will denote the cross product of a set $X$ with itself taken $n$ times ({\em i.e.\/} the set of all sequences of length $n$ of elements of $X$) by $X^n$. \index{$X^n$} \end{defn} \begin{defn} If $X$ is any set, a {\em $k$-place relation on $X$\/} is a subset $R \subseteq X^k$. \index{relation, $k$-place} \end{defn} For example, the set $E = \{\, 0, 2, 4, \dots \,\}$ of even natural numbers is a $1$-place relation on $\mathbb{N}$, $D = \{\, (x,y) \in \mathbb{N}^2 \mid x \text{\ divides\ } y \,\}$ is a $2$-place relation on $\mathbb{N}$, and $S = \{\, (a,b,c) \in \mathbb{N}^3 \mid a + b = c \,\}$ is a $3$-place relation on $\mathbb{N}$. 
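To make the definition concrete, here are a few membership checks; the particular tuples are chosen only for illustration. Since $3 \cdot 4 = 12$, we have $(3,12) \in D$, while $(5,12) \notin D$ because $5$ does not divide $12$. Similarly, $(2,3,5) \in S$ because $2 + 3 = 5$, but $(2,3,6) \notin S$ because $2 + 3 \neq 6$. In each case the relation is nothing more than a subset of the appropriate power of $\mathbb{N}$ (identifying $\mathbb{N}^1$ with $\mathbb{N}$): \[ E \subseteq \mathbb{N}^1, \qquad D \subseteq \mathbb{N}^2, \qquad S \subseteq \mathbb{N}^3 . \] 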
$2$-place relations are usually called binary relations.\index{relation, binary} \begin{defn} A set $X$ is {\em finite\/}\index{finite} if there is some $n \in \mathbb{N}$ such that $X$ has $n$ elements, and is {\em infinite\/}\index{infinite} otherwise. $X$ is {\em countable\/}\index{countable} if it is infinite and there is a 1-1 onto function $f : \mathbb{N} \to X$, and {\em uncountable\/}\index{uncountable} if it is infinite but not countable. \end{defn} Various infinite sets occur frequently in mathematics, such as $\mathbb{N}$\index{$\mathbb{N}$} (the natural numbers), $\mathbb{Q}$\index{$\mathbb{Q}$} (the rational numbers), and $\mathbb{R}$\index{$\mathbb{R}$} (the real numbers). Many of these are uncountable, such as $\mathbb{R}$. The basic facts about countable sets needed to do the problems are the following. \begin{prop} \begin{enumerate} \item If $X$ is a countable set and $Y \subseteq X$, then $Y$ is either finite or countable. \item Suppose $\mathbf{X} = \{\, X_n \mid n \in \mathbb{N} \,\}$ is a finite or countable family of sets such that each $X_n$ is either finite or countable. Then $\bigcup \mathbf{X}$ is also finite or countable. \item If $X$ is a non-empty finite or countable set, then $X^n$ is finite or countable for each $n \ge 1$. \item If $X$ is a non-empty finite or countable set, then the set of all finite sequences of elements of $X$, $X^{<\omega} = \bigcup_{n \in \mathbb{N}} X^n$, is countable. \end{enumerate} \end{prop} The properly sceptical reader will note that setting up propositional or first-order logic formally requires that we have some set theory in hand, but formalizing set theory itself requires one to have first-order logic.\footnote{Which came first, the chicken\index{chicken} or the egg\index{egg}? Since, biblically speaking, ``In the beginning was the Word'',\index{Word}\index{John} maybe we ought to plump for alphabetical order. 
Which begs the question: In which alphabet?} % % Appendix to "A Problem Course in Mathematical Logic" % \chapter{The Greek Alphabet} \label{ap:greek} \index{Greek characters} \begin{center} \mbox{ \begin{tabular}{cccl} $\text{A}$ & $\alpha$ & & alpha \\ $\text{B}$ & $\beta$ & & beta \\ $\Gamma$ & $\gamma$ & & gamma \\ $\Delta$ & $\delta$ & & delta \\ $\text{E}$ & $\epsilon$ & $\varepsilon$ & epsilon \\ $\text{Z}$ & $\zeta$ & & zeta \\ $\text{H}$ & $\eta$ & & eta \\ $\Theta$ & $\theta$ & $\vartheta$ & theta \\ $\text{I}$ & $\iota$ & & iota \\ $\text{K}$ & $\kappa$ & & kappa \\ $\Lambda$ & $\lambda$ & & lambda \\ $\text{M}$ & $\mu$ & & mu \\ $\text{N}$ & $\nu$ & & nu \\ $\text{O}$ & $o$ & & omicron \\ $\Xi$ & $\xi$ & & xi \\ $\Pi$ & $\pi$ & $\varpi$ & pi \\ $\text{P}$ & $\rho$ & $\varrho$ & rho \\ $\Sigma$ & $\sigma$ & $\varsigma$ & sigma \\ $\text{T}$ & $\tau$ & & tau \\ $\Upsilon$ & $\upsilon$ & & upsilon \\ $\Phi$ & $\phi$ & $\varphi$ & phi \\ $\text{X}$ & $\chi$ & & chi \\ $\Psi$ & $\psi$ & & psi \\ $\Omega$ & $\omega$ & & omega \end{tabular} } \end{center} % % Appendix to "A Problem Course in Mathematical Logic" % \chapter{Logic Limericks} \label{ap:lim} \index{limericks} \begin{poem}{Deduction Theorem} \index{Deduction Theorem} A Theorem fine is Deduction,\\ For it allows work-reduction:\\ To show ``A implies B'',\\ Assume A and prove B;\\ Quite often a simpler production. \end{poem} \begin{poem}{Generalization Theorem} \index{Generalization Theorem} When in premiss the variable's bound,\\ To get a ``for all'' without wound,\\ Generalization.\\ For civilization\\ Could use some help for reasoning sound. \end{poem} \begin{poem}{Soundness Theorem} \index{Soundness Theorem} It's a critical logical creed:\\ Always check that it's safe to proceed. \\ To tell us deductions \\ Are truthful productions, \\ It's the Soundness of logic we need. \end{poem} \begin{poem}{Completeness Theorem} \index{Completeness Theorem} The Completeness of logics is G\"odel's. 
\\ 'Tis advice for looking for m\"odels: \\ They're always existent \\ For statements consistent, \\ Most helpful for logical lab\"ors. \end{poem} % % Appendix to "A Problem Course in Mathematical Logic" % \chapter{GNU Free Documentation License} \label{app:gnufdl} \begin{center} Version 1.2, November 2002\\ \begin{quotation} {\footnotesize Copyright \copyright 2000,2001,2002 Free Software Foundation, Inc.\\ \noindent 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA\\ \noindent Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. } \end{quotation} \end{center} \setcounter{section}{-1} \section{PREAMBLE} The purpose of this License is to make a manual, textbook, or other functional and useful document ``free'' in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others. This License is a kind of ``copyleft'', which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software. We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference. 
\section{APPLICABILITY AND DEFINITIONS} This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The ``Document'', below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as ``you''. You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law. A ``Modified Version'' of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language. A ``Secondary Section'' is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them. The ``Invariant Sections'' are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none. 
The ``Cover Texts'' are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words. A ``Transparent'' copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not ``Transparent'' is called ``Opaque''. Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only. The ``Title Page'' means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. 
For works in formats which do not have any title page as such, ``Title Page'' means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text. A section ``Entitled XYZ'' means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as ``Acknowledgements'', ``Dedications'', ``Endorsements'', or ``History''.) To ``Preserve the Title'' of such a section when you modify the Document means that it remains a section ``Entitled XYZ'' according to this definition. The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License. \section{VERBATIM COPYING} You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3. You may also lend copies, under the same conditions stated above, and you may publicly display copies. 
\section{COPYING IN QUANTITY} If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects. If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages. If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public. 
It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document. \section{MODIFICATIONS} You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version: \begin{itemize} \item[A.] Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission. \item[B.] List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement. \item[C.] State on the Title page the name of the publisher of the Modified Version, as the publisher. \item[D.] Preserve all the copyright notices of the Document. \item[E.] Add an appropriate copyright notice for your modifications adjacent to the other copyright notices. \item[F.] Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below. \item[G.] Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document's license notice. \item[H.] 
Include an unaltered copy of this License. \item[I.] Preserve the section Entitled ``History'', Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled ``History'' in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence. \item[J.] Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission. \item[K.] For any section Entitled ``Acknowledgements'' or ``Dedications'', Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein. \item[L.] Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles. \item[M.] Delete any section Entitled ``Endorsements''. Such a section may not be included in the Modified Version. \item[N.] Do not retitle any existing section to be Entitled ``Endorsements'' or to conflict in title with any Invariant Section. \item[O.] Preserve any Warranty Disclaimers. \end{itemize} If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. 
To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles. You may add a section Entitled ``Endorsements'', provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard. You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one. The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version. \section{COMBINING DOCUMENTS} You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers. The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. 
If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work. In the combination, you must combine any sections Entitled ``History'' in the various original documents, forming one section Entitled ``History''; likewise combine any sections Entitled ``Acknowledgements'', and any sections Entitled ``Dedications''. You must delete all sections Entitled ``Endorsements.'' \section{COLLECTIONS OF DOCUMENTS} You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects. You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document. \section{AGGREGATION WITH INDEPENDENT WORKS} A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an ``aggregate'' if the copyright resulting from the compilation is not used to limit the legal rights of the compilation's users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document. 
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document's Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate. \section{TRANSLATION} Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warrany Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail. If a section in the Document is Entitled ``Acknowledgements'', ``Dedications'', or ``History'', the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title. \section{TERMINATION} You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 
\section{FUTURE REVISIONS OF THIS LICENSE} The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/. Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License ``or any later version'' applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. \backmatter % % Bibliography for "A Problem Course in Mathematical Logic" % \begin{thebibliography}{99} \bibitem{JB:HML} Jon Barwise (ed.), {\em Handbook of Mathematical Logic\/}, North Holland, Amsterdam, 1977, ISBN 0-7204-2285-X. \bibitem{BE:LPL} J. Barwise and J. Etchemendy, {\em Language, Proof and Logic\/}, Seven Bridges Press, New York, 2000, ISBN 1-889119-08-3. \bibitem{MB:LB} Merrie Bergman, James Moor, and Jack Nelson, {\em The Logic Book\/}, Random House, NY, 1980, ISBN 0-394-32323-8. \bibitem{CK:MT} C.C. Chang and H.J. Keisler, {\em Model Theory\/}, third ed., North Holland, Amsterdam, 1990. \bibitem{DA:CU} Martin Davis, {\em Computability and Unsolvability\/}, McGraw-Hill, New York, 1958; Dover, New York, 1982, ISBN 0-486-61471-9. \bibitem{DA:U} Martin Davis (ed.), {\em The Undecidable; Basic Papers On Undecidable Propositions, Unsolvable Problems And Computable Functions\/}, Raven Press, New York, 1965. \bibitem{HE:MIL} Herbert~B. Enderton, {\em A Mathematical Introduction to Logic\/}, Academic Press, New York, 1972. \bibitem{PH:NST} Paul~R. 
Halmos, {\em Naive Set Theory\/}, Undergraduate Texts in Mathematics, Springer-Verlag, New York, 1974, ISBN 0-387-90092-6.

\bibitem{JvH:FFG} Jean van Heijenoort, {\em From Frege to G\"odel\/}, Harvard University Press, Cambridge, 1967, ISBN 0-674-32449-8.

\bibitem{JH:OST} James~M. Henle, {\em An Outline of Set Theory\/}, Problem Books in Mathematics, Springer-Verlag, New York, 1986, ISBN 0-387-96368-5.

\bibitem{DH:GEB} Douglas~R. Hofstadter, {\em G\"odel, Escher, Bach\/}, Random House, New York, 1979, ISBN 0-394-74502-7.

\bibitem{JM:IML} Jerome Malitz, {\em Introduction to Mathematical Logic\/}, Springer-Verlag, New York, 1979, ISBN 0-387-90346-1.

\bibitem{YM:CML} Yu.I. Manin, {\em A Course in Mathematical Logic\/}, Graduate Texts in Mathematics~53, Springer-Verlag, New York, 1977, ISBN 0-387-90243-0.

\bibitem{RP:ENM} Roger Penrose, {\em The Emperor's New Mind\/}, Oxford University Press, Oxford, 1989.

\bibitem{RP:SOTM} Roger Penrose, {\em Shadows of the Mind\/}, Oxford University Press, Oxford, 1994, ISBN 0-09-958211-2.

\bibitem{TR:ONCF} T. Rado, {\em On non-computable functions\/}, Bell System Tech. J. {\bf 41} (1962), 877--884.

\bibitem{RS:GIT} Raymond~M. Smullyan, {\em G\"odel's Incompleteness Theorems\/}, Oxford University Press, Oxford, 1992, ISBN 0-19-504672-2.
\end{thebibliography}

%
% Index for "A Problem Course in Mathematical Logic"
%

\begin{theindex}

\item $($, 3, 24 \item $)$, 3, 24 \item $=$, 24, 25 \item $\cap$, 89, 133 \item $\cup$, 89, 133 \item $\exists$, 30 \item $\forall$, 24, 25, 30 \item $\fromto$, 5, 30 \item $\in$, 133 \item $\land$, 5, 30, 89 \item $\lnot P$, 89 \item $\lnot$, 3, 24, 25, 89 \item $\lor$, 5, 30, 89 \item $\models$, 10, 35, 37, 38 \item $\nmodels$, 10, 36, 37 \item $\prod$, 133 \item $\proves$, 12, 43 \item $\setminus$, 133 \item $\subseteq$, 133 \item $\times$, 133 \item $\to$, 3, 24, 25 \indexspace \item $\mathcal{A}$, 117 \item A1, 11, 42 \item A2, 11, 42 \item A3, 11, 42 \item A4, 42 \item A5, 42 \item A6, 42 \item A7, 42 \item A8, 43 \item $A_n$, 3 \item $\mathrm{Con}(\Sigma)$, 124 \item $\ulcorner \Delta \urcorner$, 115 \item $\mathrm{dom}(f)$, 81 \item $F$, 7 \item $f \colon \mathbb{N}^k \to \mathbb{N}$, 81 \item $\varphi(S^{m_1}0, \dots, S^{m_k}0)$, 118 \item $\varphi^x_t$, 42 \item $\mathcal{L}$, 24 \item $\mathcal{L}_1$, 26 \item $\mathcal{L}_=$, 26 \item $\mathcal{L}_F$, 26 \item $\mathcal{L}_G$, 53 \item $\mathcal{L}_N$, 26, 112 \item $\mathcal{L}_O$, 26 \item $\mathcal{L}_P$, 3 \item $\mathcal{L}_S$, 26 \item $\mathcal{L}_{NT}$, 25 \item $\mathfrak{M}$, 33 \item $\mathbb{N}$, 81, 134 \item $\mathfrak{N}$, 33, 112 \item N1, 117 \item N2, 117 \item N3, 117 \item N4, 117 \item N5, 117 \item N6, 117 \item N7, 117 \item N8, 117 \item N9, 117 \item $\mathbb{N}^k$, 81 \item $\mathbb{N}^k \setminus P$, 89 \item $\mathcal{P}$, 133 \item $P \cap Q$, 89 \item $P \cup Q$, 89 \item $P \land Q$, 89 \item $P \lor Q$, 89 \item $\pi^k_i$, 85, 120 \item $\mathbb{Q}$, 134 \item $\mathbb{R}$, 134 \item $\mathrm{ran}(f)$, 81 \item $R_n$, 55 \item $\mathcal{S}$, 6 \item $S^m0$, 118 \item $T$, 7 \item $\text{Th}$, 39, 45 \item $\mathrm{Th}(\Sigma)$, 112 \item $\mathrm{Th}(\mathfrak{N})$, 124 \item $v_n$, 24 \item $X^n$, 133 \item $[X]^k$, 133 \item $\Bar{Y}$, 133 \indexspace \item $\mathsc{A}$, 90
\item $\alpha$, 90 \item $\mathsc{Code}_k$, 97, 106 \item $\mathsc{Comp}$, 98 \item $\mathsc{Comp}_M$, 96 \item $\mathsc{Conclusion}_\Delta$, 115 \item $\mathsc{Decode}$, 97, 106 \item $\mathsc{Deduction}_\Delta$, 115 \item $\mathsc{Diff}$, 83, 88 \item $\mathsc{Div}$, 90, 120 \item $\mathsc{Element}$, 90, 120 \item $\mathsc{Entry}$, 96 \item $\mathsc{Equal}$, 89 \item $\mathsc{Exp}$, 88 \item $\mathsc{Fact}$, 88 \item $\mathsc{Formulas}$, 115 \item $\mathsc{Formula}$, 115 \item $i_{\mathbb{N}}$, 82 \item $\mathsc{Inference}$, 115 \item $\mathsc{IsPrime}$, 90, 120 \item $\mathsc{Length}$, 90, 120 \item $\mathsc{Logical}$, 115 \item $\mathsc{Mult}$, 88 \item $\mathsc{O}$, 83, 85, 120 \item $\mathsc{Power}$, 90, 120 \item $\mathsc{Pred}$, 83, 88 \item $\mathsc{Premiss}_\Delta$, 115 \item $\mathsc{Prime}$, 90, 120 \item $\mathsc{SIM}$, 106, 107 \item $\mathsc{Sentence}$, 115 \item $\mathsc{Sim}$, 98 \item $\mathsc{Sim}_M$, 97 \item $\mathsc{Step}$, 97, 106 \item $\mathsc{Step}_M$, 96, 106 \item $\mathsc{Subseq}$, 90 \item $\mathsc{Sub}$, 123 \item $\mathsc{Sum}$, 83, 87 \item $\mathsc{S}$, 83, 85, 120 \item $\mathsc{TapePosSeq}$, 96, 106 \item $\mathsc{TapePos}$, 96, 105 \item $\mathsc{Term}$, 115 \indexspace \item abbreviations, 5, 30 \item Ackermann's Function, 90 \item all, x \item alphabet, 75 \item and, x, 5 \item assignment, 7, 34, 35 \subitem extended, 35 \subitem truth, 7 \item atomic formulas, 3, 27 \item axiom, 11, 28, 39 \subitem for basic arithmetic, 117 \subsubitem N1, 117 \subsubitem N2, 117 \subsubitem N3, 117 \subsubitem N4, 117 \subsubitem N5, 117 \subsubitem N6, 117 \subsubitem N7, 117 \subsubitem N8, 117 \subsubitem N9, 117 \subitem logical, 43 \subitem schema, 11, 42 \subsubitem A1, 11, 42 \subsubitem A2, 11, 42 \subsubitem A3, 11, 42 \subsubitem A4, 42 \subsubitem A5, 42 \subsubitem A6, 42 \subsubitem A7, 42 \subsubitem A8, 43 \indexspace \item blank cell, 67 \item blank tape, 67 \item bound variable, 29 \item bounded minimalization, 92 \item busy
beaver competition, 83 \subitem $n$-state entry, 83 \subitem score in, 83 \indexspace \item cell, 67 \subitem blank, 67 \subitem marked, 67 \subitem scanned, 68 \item characteristic function, 82 \item chicken, 134 \item Church's Thesis, xi \item clique, 55 \item code \subitem G\"odel, 113 \subsubitem of sequences, 113 \subsubitem of symbols of $\mathcal{L}_N$, 113 \subitem of a sequence of tape positions, 96 \subitem of a tape position, 95 \subitem of a Turing machine, 97 \item Compactness Theorem, 16, 51 \subitem applications of, 53 \item complement, 133 \item complete set of sentences, 112 \item completeness, 112 \item Completeness Theorem, 16, 50, 137 \item composition, 85 \item computable \subitem function, 82 \subitem set of formulas, 115 \item computation, 71 \subitem partial, 71 \item connectives, 3, 4, 24 \item consistent, 15, 47 \subitem maximally, 15, 48 \item constant, 24, 25, 31, 33, 35 \item constant function, 85 \item contradiction, 9, 38 \item convention \subitem common symbols, 25 \subitem parentheses, 5, 30 \item countable, 134 \item crash, 70, 78 \item cross product, 133 \indexspace \item decision problem, x \item deduction, 12, 43 \item Deduction Theorem, 13, 44, 137 \item definable \subitem function, 125 \subitem relation, 125 \item domain (of a function), 81 \indexspace \item edge, 54 \item egg, 134 \item element, 133 \item elementary equivalence, 56 \item Entscheidungsproblem, x, 111 \item equality, 24, 25 \item equivalence \subitem elementary, 56 \item existential quantifier, 30 \item extension of a language, 30 \indexspace \item finite, 134 \item first-order \subitem language for number theory, 112 \subitem languages, 23 \subitem logic, x, 23 \item Fixed-Point Lemma, 123 \item for all, 25 \item formula, 3, 27 \subitem atomic, 3, 27 \subitem unique readability, 6, 32 \item free variable, 29 \item function, 24, 31, 33, 35 \subitem $k$-place, 24, 25, 81 \subitem bounded minimalization of, 92 \subitem composition of, 85 \subitem computable, 82 
\subitem constant, 85 \subitem definable in $\mathfrak{N}$, 125 \subitem domain of, 81 \subitem identity, 82 \subitem initial, 85 \subitem partial, 81 \subitem primitive recursion of, 87 \subitem primitive recursive, 88 \subitem projection, 85 \subitem recursive, x, 92 \subitem regular, 92 \subitem successor, 85 \subitem Turing computable, 82 \subitem unbounded minimalization of, 91 \subitem zero, 85 \indexspace \item G\"odel code \subitem of sequences, 113 \subitem of symbols of $\mathcal{L}_N$, 113 \item G\"odel Incompleteness Theorem, 111 \subitem First Incompleteness Theorem, 124 \subitem Second Incompleteness Theorem, 124 \item generalization, 42 \item Generalization Theorem, 45, 137 \subitem On Constants, 45 \item gothic characters, 33 \item graph, 54 \item Greek characters, 3, 28, 135 \indexspace \item halt, 70, 78 \item Halting Problem, 98 \item head, 67 \subitem multiple, 75 \subitem separate, 75 \indexspace \item identity function, 82 \item if \dots then, x, 3, 25 \item if and only if, 5 \item implies, 10, 38 \item Incompleteness Theorem, 111 \subitem G\"odel's First, 124 \subitem G\"odel's Second, 124 \item inconsistent, 15, 47 \item independent set, 55 \item inference rule, 11 \item infinite, 134 \item Infinite Ramsey's Theorem, 55 \item infinitesimal, 57 \item initial function, 85 \item input tape, 71 \item intersection, 133 \item isomorphism of structures, 55 \indexspace \item John, 134 \indexspace \item $k$-place function, 81 \item $k$-place relation, 81 \indexspace \item language, 26, 31 \subitem extension of, 30 \subitem first-order, 23 \subitem first-order number theory, 112 \subitem formal, ix \subitem natural, ix \subitem propositional, 3 \item limericks, 137 \item logic \subitem first-order, x, 23 \subitem mathematical, ix \subitem natural deductive, ix \subitem predicate, 3 \subitem propositional, x, 3 \subitem sentential, 3 \item logical axiom, 43 \indexspace \item machine, 69 \subitem Turing, xi, 67, 69 \item marked cell, 67 \item mathematical logic, ix \item
maximally consistent, 15, 48 \item metalanguage, 31 \item metatheorem, 31 \item minimalization \subitem bounded, 92 \subitem unbounded, 91 \item model, 37 \item Modus Ponens, 11, 43 \item MP, 11, 43 \indexspace \item natural deductive logic, ix \item natural numbers, 81 \item non-standard model, 55, 57 \subitem of the real numbers, 57 \item not, x, 3, 25 \item $n$-state \subitem Turing machine, 69 \subitem entry in busy beaver competition, 83 \item number theory \subitem first-order language for, 112 \indexspace \item or, x, 5 \item output tape, 71 \indexspace \item parentheses, 3, 24 \subitem conventions, 5, 30 \subitem doing without, 4 \item partial \subitem computation, 71 \subitem function, 81 \item position \subitem tape, 68 \item power set, 133 \item predicate, 24, 25 \item predicate logic, 3 \item premiss, 12, 43 \item primitive recursion, 87 \item primitive recursive \subitem function, 88 \subitem relation, 89 \item projection function, 85 \item proof, 12, 43 \item propositional logic, x, 3 \item proves, 12, 43 \item punctuation, 3, 25 \indexspace \item quantifier \subitem existential, 30 \subitem scope of, 30 \subitem universal, 24, 25, 30 \indexspace \item Ramsey number, 55 \item Ramsey's Theorem, 55 \subitem Infinite, 55 \item range of a function, 81 \item r.e., 99 \item recursion \subitem primitive, 87 \item recursive \subitem function, 92 \subitem functions, xi \subitem relation, 93 \subitem set of formulas, 115 \item recursively enumerable, 99 \subitem set of formulas, 115 \item regular function, 92 \item relation, 24, 31, 33 \subitem binary, 25, 134 \subitem characteristic function of, 82 \subitem definable in $\mathfrak{N}$, 125 \subitem $k$-place, 24, 25, 81, 133 \subitem primitive recursive, 89 \subitem recursive, 93 \subitem Turing computable, 93 \item represent (in $\mathrm{Th}(\Sigma)$) \subitem a function, 118 \subitem a relation, 119 \item representable (in $\mathrm{Th}(\Sigma)$) \subitem function, 118 \subitem relation, 119 \item rule of inference, 11, 43
\indexspace \item satisfiable, 9, 37 \item satisfies, 9, 36, 37 \item scanned cell, 68 \item scanner, 67, 75 \item scope of a quantifier, 30 \item score \subitem in busy beaver competition, 83 \item sentence, 29 \item sentential logic, 3 \item sequence of tape positions \subitem code of, 96 \item set theory, 133 \item Soundness Theorem, 15, 47, 137 \item state, 68, 69 \item structure, 33 \item subformula, 6, 29 \item subgraph, 54 \item subset, 133 \item substitutable, 41 \item substitution, 41 \item successor \subitem function, 85 \subitem tape position, 71 \item symbols, 3, 24 \subitem logical, 24 \subitem non-logical, 24 \indexspace \item table \subitem of a Turing machine, 70 \item tape, 67 \subitem blank, 67 \subitem higher-dimensional, 75 \subitem input, 71 \subitem multiple, 75 \subitem output, 71 \subitem tape position, 68 \subsubitem code of, 95 \subsubitem code of a sequence of, 96 \subsubitem successor, 71 \subitem two-way infinite, 75, 78 \item Tarski's Undefinability Theorem, 125 \item tautology, 9, 38 \item term, 26, 31, 35 \item theorem, 31 \item theory, 39, 45 \subitem of $\mathfrak{N}$, 124 \subitem of a set of sentences, 112 \item there is, x \item TM, 69 \item true in a structure, 37 \item truth \subitem assignment, 7 \subitem in a structure, 36, 37 \subitem table, 8, 9 \subitem values, 7 \item Turing computable \subitem function, 82 \subitem relation, 93 \item Turing machine, xi, 67, 69 \subitem code of, 97 \subitem crash, 70 \subitem halt, 70 \subitem $n$-state, 69 \subitem table for, 70 \subitem universal, 95, 97 \item two-way infinite tape, 75, 78 \indexspace \item unary notation, 82 \item unbounded minimalization, 91 \item uncountable, 134 \item Undefinability Theorem, Tarski's, 125 \item union, 133 \item unique readability \subitem of formulas, 6, 32 \subitem of terms, 32 \item Unique Readability Theorem, 6, 32 \item universal \subitem quantifier, 30 \subitem Turing machine, 95, 97 \item universe (of a structure), 33 \item UTM, 95 
\indexspace \item variable, 24, 31, 34, 35 \subitem bound, 29 \subitem free, 29 \item vertex, 54 \indexspace \item witnesses, 48 \item Word, 134 \indexspace \item zero function, 85 \end{theindex} \end{document}