Browsed by
Category: Fundamentals

Beyond Cookbook Mathematics, Part 2

Beyond Cookbook Mathematics, Part 2

The previous article discussed the importance of definitions to mathematical thought. We looked at a definition (of an end-vertex in a graph), and picked it apart by finding multiple ways to look at it. We also directly used the definition in a practical manner to find “weak links” in a network. This time, we’ll look at the structure of mathematical theorems and proofs, and how we can read the proofs to understand the theorems. 

A mathematical theorem (informally) is a statement that takes the following structure:


IF {we have this stuff}, THEN {we get to say this about other stuff}.


Let’s pick this apart a bit even at the abstract level, because even this simple structure is important when we consider using theorems:

IF {we have this stuff}:

That IF is really important. You can’t move to the THEN without the IF part being true. Let’s say a particular theorem (say, the Central Limit theorem), has a set of requirements in the IF part. We have to have all of those parts satisfied before we can invoke the conclusion. I see data scientists just invoking “Central Limit Theorem” like a chant when they are analyzing a dataset. However, in many cases, their data does not fit the IF conditions given. The Central Limit Theorem requires, among other things, that the random variables be independent. No independence, no Central Limit Theorem. Do not move forward. 

Many complaints I hear, especially in statistics, revolve around “you can use statistics to say anything you like.” No. No you cannot. It only appears that way because practitioners are applying theorems blindly without ensuring the hypotheses (the IF bits) are all met. A theorem written and proven is not some magic silver bullet that allows for universal application and use. 

IF -> THEN:

When we write (or read) a proof, we assume the IF part is true, and use those conditions in the IF to logically deduce the conclusions in the THEN. 

(A remark: I just described a particular method of proof—the direct proof. There are other methods by which we may prove a statement that are logically equivalent to a direct proof such as proof by contradiction, or proof by contrapositive. Since the idea here is to understand how all the parts of a theorem and proof work, we’ll stick with direct proofs here. We’re also discussing a particular type of logical statement—a one directional implication. We can have bidirectional statements as well, but these are an extension of understanding this first type, so we’ll start here.)

The first part discussing definitions is essential to understanding the IF part of our theorem and how this part connects to our “then” conclusion. Suppose we take the statement 

If a real-valued function f is differentiable, then f is continuous. 

We can use the statement without understanding. As long as we understand the definitions reasonably well (as in, we know how to tell if a function is differentiable), then we say it’s continuous and move on with our day. This isn’t good enough anymore. Why does differentiability imply continuity? A good proof should illuminate this for us. My suggestion at this point is to find a calculus book that proves this statement and have it open for this next part. I’m going to outline a series of thought-steps to consider as you read a proof.

(1) Write down definitions.

Write down the definition of differentiability, and that of continuity. Study both. Play with examples of differentiable and continuous functions. See if all the differentiable functions you play with are also continuous. Try to play with some weird ones. 

Examples: f(x) = x^{2}. Definitely differentiable. Definitely continuous. (I do suggest actually using the definitions here to really show that f fits these.)

f(x) = x+5. Fits nicely. Try some nonpolynomials now.

f(x) = \sin(x). Still good.

f(x) = e^{x}. Really nice. Love differentiating that guy.

As you’re playing with these, try to move between formally showing these functions fit the definitions and developing a visual picture of what it means for these functions to fit the definitions. (My high school calculus teacher, Andy Kohler, gave a nice intuitive picture of continuity—you shouldn’t have to pick up your pen to draw the function.) 

This step helps you visualize your starting point and your destination. Every single time I develop a theorem, this is the process I go through. This isn’t always short either. I’ve spent weeks on this part—exploring connections between my start and my target to develop an intuition. Often, this leads me to a way to prove the desired theorem. Occasionally I manage to create an example that renders my statement untrue. This isn’t a trivial or unimportant part, so I implore you to temper any building frustration with the speed of this bit. However, since here we’re focused on reading proofs rather than writing them, we’ll likely assume we’re working with a true statement.

(2) Now take a look at the proof. 

A good proof should take you gently by the hand and guide you through the author’s reasoning in nice, comfortable steps of logic. The author shouldn’t require you to fill in gaps or make huge leaps, and especially never require you to just take something on faith. 

The proof contains all the pieces we need for understanding. We just need to know how to read it. At some point in the proof, every single part of the IF should be invoked as a justification for a logical step. Find these parts. 

“Because f is differentiable, [STATEMENT]”

Study this part of the proof. Flag it. Do this for each time an IF condition is invoked. Make sure you can use the definition or condition invoked to make the leap the author does. Here I implore you to not just “pass over”. Really take the time to convince yourself this is true. Go back to the definition. Return to the proof. Again, this study isn’t necessarily quick.1 

Doing this for each line in the proof will help you see what’s going on between IF and THEN.

(3) Break things.

The last piece of advice I’ll give is one that was drilled into me by my graduate analysis professor at UTA, Dr. Barbara Shipman. Breaking things (or trying to) in mathematics is the best way to really cement your understanding. 

Go back to all those points you flagged in the proof where the IF conditions were invoked. Now imagine you don’t have that condition anymore. The proof should fail in some way. Either you end up with a false conclusion (“if I don’t have X anymore, then I definitely cannot have Y”), or you just end up stuck (“if I don’t have X anymore, I can’t say anything about Y”.) This failure point helps you understand why that condition was necessary in the IF. Doing this for each condition in the proof helps you see the interplay among all the conditions, and showing you how and why all those parts were needed to get you to your conclusion. 

None of this is trivial. Please don’t take the informal descriptions of this process to mean that you can breeze through and develop this skill as quickly as you memorized derivatives of functions. This is a skill that is developed over years. The reason I recommend using material you already know to study proofs is that it removes the challenge of learning new material while studying the “why”. There’s a reason mathematicians take calculus classes before real analysis courses (which spend time deeply developing and proving many things from calculus.) You’re familiar with how the subject works and that it works, then you can understand why it works. 

This happens in engineering too. We flew a plane before we understood airflow deeply, but due to our understanding of airflow, we were able to advance flight technology far beyond what I imagine early aviators and inventors even thought possible. There’s a beautiful feedback loop between mathematics, formal proofs, and engineering. Mathematical proofs aren’t just the dry formalisms we use and throw away—they’re the keys to understanding why things work the way they do. Developing the skill to read and understand these arguments is not a waste of time for an engineer; it will help propel engineering forward.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Beyond Cookbook Mathematics, Part 1

Beyond Cookbook Mathematics, Part 1

This post is due to the requests of several independent engineers and programmers. They expressed disappointment at their mathematics education and its failure to impart a deeper understanding of the formulas and algorithms they were taught to use. 

This also reflects my observations of teaching university mathematics over the years. I started as a TA (and frequent substitute lecturer) in 2008, and have taught all levels of calculus, basic statistics, and advanced undergraduate statistics thus far. I’ve certainly noticed even since 2008 a de-emphasis on proofs and “why” in favor of more examples, applications, and formulas in general. In fact, proofs were passed over in lectures in calculus courses meant for general engineers because “there wasn’t enough time”, or it was not considered something engineers needed to know. 

This attitude is actually fairly recent. Many of the older calculus books in my library were written for engineers in an undergraduate program, and these books are quite proof-heavy. Some examples are F. Hildebrand’s Advanced Calculus for Engineers (1949), Tom Apostol’s Calculus (Volumes 1 and 2), and R. Courant’s Differential and Integral Calculus (1938), to name a few. For a more modern text that still has some proof treatment, the 10th edition of Calculus: One and Several Variables, by Salas et al is a good resource. I used this one both as a student at Georgia Tech and an instructor.1. However, when I taught at University of Texas at Arlington as a grad student from 2013-2015 and then Foothill College in early 2018, the texts they chose to use was woefully inadequate to suit a college-level calculus course; proofs were nonexistent, and reasoning was thrown away in favor of contrived examples. The course was designed to steer engineers away from any proofs or thorough reasoning, showing the experience is somewhat widespread.2

I do not want to discuss the reasons for this departure; this isn’t an education site. What is important now is to discuss how to satisfy the desire of an engineer or other highly technical person using various mathematical topics/formulas to develop a deeper understanding of what he or she is doing. It wouldn’t be particularly helpful to simply suggest acquiring books and reading the proofs. 

The best thing an education can give is the ability to teach oneself through developing methods of logical thought and creativity. There are books on formal logic and mathematical proofs circulating, but they can be a bit daunting to start with, as they are quite abstract and discuss mathematical logic and proof theory in general.3What I’ll give here can be perhaps considered a friendly introduction as to how to begin using mathematical proofs to facilitate understanding of material. 

Start with what you know

Most of you who are reading this have likely taken a calculus course or two, and probably some basic linear algebra/matrix theory (especially if you’re in computer science). You’ve been exposed to limits, differentiation, continuity, determinants, and linear maps. You know more math than you think. We’ll use this material to begin learning how to deconstruct mathematical statements and arguments in order to understand how the pieces all fit together. (It’s really not unlike diagramming sentences.)

Pick a topic you know well. That way we’re not trying to introduce new material and learn how to read proofs at the same time. Dust off your old calculus book, or differential equations book. 

Definitions

Definitions are the most important thing in mathematics, and perhaps the most ignored by those using it. “Continuous” has a meaning. “Differentiable” has a meaning. Spending time to really understand the definition of a mathematical term will provide an unshakable foundation. 

Example: Let’s take something visual: the degree of a vertex in a graph. 

You might see this definition of degree of a vertex:

Def. 1: The degree of a vertex v in a graph G is the number of edges connected to v

Here’s a graph from one of my research articles. I realize this is a digraph, but we can still say that every node except the first is an end-vertex.

This is fairly intuitive and straightforward. One way to go about understanding this definition is to find other equivalent ways to express it. For example, we know that if there’s an edge sticking out of a vertex v, there must be something on the other end of the edge. (Graphs don’t allow dangling edges.) Thus, we might reframe this definition as 

Def. 2: The degree of a vertex v is the number of vertices adjacent (connected to) v

If we go one step further and collect all the vertices adjacent to v into a set or bucket, and name that bucket the neighborhood of v, we can write one more equivalent definition of the degree of a vertex.

Def. 3: The degree of a vertex v is the number of vertices in (or cardinality of) the neighborhood of v. (Formally, mathematicians would say that the number of elements in a set is the cardinality of the set.)

Notice what we’ve done here. We’ve taken one simple definition and expressed it three equivalent ways. Each one of these gives us a slightly different facet of what the degree of a vertex is. We can look at it from the perspective of edges or vertices.

Let’s take this definition and use it in another definition.

An end-vertex or pendant vertex in a graph is a vertex of degree 1. 

New definition using our previously defined degree. But what does it really tell us here? Can we visualize this? If a vertex only has degree 1, then we know that only one edge sticks out of that vertex. Equivalently, we can also say that it has only one neighbor vertex adjacent to it. The size of its neighborhood is 1. We now can picture an end-vertex quite nicely. 

Using Definitions

Many applications rely on checking to see if a definition is satisfied. 

  • Is f(x) = \sin(x) continuous?
  • Do I have any end vertices in my network?

Here we are taking a specific example and looking to see if a definition is satisfied, typically because we know that (due to theorems) we get certain properties we either want (or maybe don’t want) if the definition is satisfied. 

For example, network engineers like to have resilient networks. By “resilient”, I mean that they’d like to be able to tolerate a link failure and still be able to send information anywhere on the network. Intuitively, it would be really bad if a particular link failure isolated a node/switch so that no information could reach it. 

Let’s try to frame that in mathematical terms. A network can be drawn as a graph, with circles representing nodes/switches/computers/whatever, and edges between nodes representing the physical links connecting them. We want to design a network so that a single link failure anywhere in the network will not isolate a node. 

We can mathematically represent a link failure by the removal of an edge. So we can take our graph representing our network, and start testing edges by deleting them to see if a node ends up isolated with no edges emanating from it. Or…we could return to a definition from earlier and think about this mathematically. 

If a single link failure isolates a particular node, then that means only one edge sticks out from it. That means that node has degree 1, by our first definition of degree. Thus, we may now conclude that this node also fits the definition of an end-vertex. 

Moving back to the practical space, we conclude: a vertex can only be isolated via a single-link failure if it’s an end-vertex. We can also write the statement the other way:  If a vertex is an end-vertex, then the deletion of its incident edge isolates it. 

Now, thanks to these definitions, and our understanding of them, we can find a way to spot all the end-vertices in a network. Since we have multiple ways of looking at this problem, we can find the way that suits us best.

Using definition 1, we can count the number of links from each node. Any node that has only one link connected to it is an end-vertex, and the failure of that link will isolate that node. 

Using definition 2, we can count the number of adjacent neighbors, especially if we have a forwarding table stored for each node. If any node only has one other node in its table, it’s an end-vertex, and the removal of the link connecting the two nodes would isolate our vertex. 

I used a fairly visual, practical definition and example here. Other definitions in mathematics can get fairly involved; I’ve spent hours simply picking apart a definition to understand it. But the strategy doesn’t change. The first step in developing a deeper understanding of mathematics is to pay attention to definitions–not just what they say, but what they mean. The next article will discuss how we use definitions to write theorems and understand proofs. 

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.