Let $A \in \mathbb{C}^{n\times n}$ be a complex matrix. We let $a_{i,j}$ be the $\left(i,j\right)$-th entry of $A$ for all $i, j \in \left[n\right]$ (where $\left[n\right]$ denotes $\left\{1,2,\ldots,n\right\}$). For each $i \in \left[n\right]$, we define the *$i$-th Gershgorin disk of $A$* to be the closed disk
\begin{align}
D_i\left(A\right) := \left\{z \in \mathbb{C} : \left|z-a_{i,i}\right| \leq \sum_{j \neq i} \left|a_{i,j}\right| \right\}.
\end{align}
(Here, the summation sign $\sum\limits_{j \neq i}$ means a sum over all $j \in \left[n\right]$ satisfying $j \neq i$.)
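For readers who like to experiment, the definition can be transcribed directly into a few lines of Python (the function name `gershgorin_disks` is my own, not from any library); it returns each disk as a (center, radius) pair:

```python
def gershgorin_disks(A):
    """Return the list of (center, radius) pairs describing the
    Gershgorin disks of the square matrix A (a nested list)."""
    n = len(A)
    return [
        (A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
        for i in range(n)
    ]

# Example: [[1, 2], [3, 4]] has disks centered at 1 and 4,
# with radii 2 and 3 respectively.
print(gershgorin_disks([[1, 2], [3, 4]]))  # [(1, 2), (4, 3)]
```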

The following is well-known (see the Wikipedia article for an easy proof):

Theorem 1 (Gershgorin's first theorem). Each eigenvalue of $A$ belongs to at least one of the $n$ disks $D_1\left(A\right), D_2\left(A\right), \ldots, D_n\left(A\right)$.
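As a quick sanity check of Theorem 1 (a hand-rolled $2\times 2$ example of my own, with the eigenvalues obtained from the quadratic formula rather than a linear-algebra library):

```python
import math

# A = [[1, 0.5], [0.5, 3]]; its eigenvalues are the roots of
# lambda^2 - (trace) * lambda + det = 0.
trace, det = 1 + 3, 1 * 3 - 0.5 * 0.5
disc = math.sqrt(trace**2 - 4 * det)
eigenvalues = [(trace - disc) / 2, (trace + disc) / 2]

# Gershgorin disks: centers 1 and 3, both of radius 0.5.
disks = [(1, 0.5), (3, 0.5)]

# Theorem 1: every eigenvalue lies in at least one disk.
for lam in eigenvalues:
    assert any(abs(lam - c) <= r for (c, r) in disks)
print(eigenvalues)  # roughly [0.882, 3.118]
```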

As has already been observed here on MO, this does **not** mean that each of these $n$ disks contains at least one eigenvalue of $A$; there are counterexamples even for $n = 2$.
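One such counterexample (my own choice; many others work) is $A = \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix}$, whose eigenvalues are $\pm\sqrt{2}$; neither lies in the first disk $\left\{z : \left|z\right| \leq 1\right\}$. A minimal check:

```python
import math

# A = [[0, 1], [2, 0]] has characteristic polynomial lambda^2 - 2,
# hence eigenvalues +sqrt(2) and -sqrt(2).
eigenvalues = [math.sqrt(2), -math.sqrt(2)]

# Disk 1: center 0, radius 1.  Disk 2: center 0, radius 2.
in_disk1 = [abs(lam - 0) <= 1 for lam in eigenvalues]
in_disk2 = [abs(lam - 0) <= 2 for lam in eigenvalues]

print(in_disk1)  # [False, False] -- disk 1 contains no eigenvalue
print(in_disk2)  # [True, True]
```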

However, something in this direction does hold when the $n$ disks can be split into two sets that have no overlap:

Theorem 2 (Gershgorin's second theorem). Let $k$ be an integer with $0 \leq k \leq n$. Assume that $D_i\left(A\right) \cap D_j\left(A\right) = \varnothing$ for all $i$ and $j$ with $i \leq k < j$. Then, there are exactly $k$ eigenvalues of $A$ that belong to the first $k$ Gershgorin disks $D_1\left(A\right), D_2\left(A\right), \ldots, D_k\left(A\right)$, and there are exactly $n-k$ eigenvalues of $A$ that belong to the remaining $n-k$ Gershgorin disks $D_{k+1}\left(A\right), D_{k+2}\left(A\right), \ldots, D_n\left(A\right)$. (Here, eigenvalues are counted with their algebraic multiplicities, so there are always $n$ of them in total.)
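To see Theorem 2 in action on a toy case (again a hand-rolled $2\times 2$ example of my own, with $k = 1$): the matrix below has two disjoint disks, and each contains exactly one eigenvalue.

```python
import math

# A = [[1, 0.1], [0.1, 10]]: the disks are the intervals' closed
# neighborhoods [0.9, 1.1] and [9.9, 10.1] in C, which are disjoint.
trace, det = 1 + 10, 1 * 10 - 0.1 * 0.1
disc = math.sqrt(trace**2 - 4 * det)
eigenvalues = [(trace - disc) / 2, (trace + disc) / 2]

disks = [(1, 0.1), (10, 0.1)]
counts = [sum(1 for lam in eigenvalues if abs(lam - c) <= r)
          for (c, r) in disks]
print(counts)  # [1, 1]: exactly one eigenvalue per disk
```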

Unlike Theorem 1, this is not trivial at all. Gershgorin's original proof (Theorem 2 is Satz III in his 1931 paper) uses a not-very-rigorous continuity argument. The idea is nice: We let $B$ be the diagonal $n\times n$-matrix whose diagonal entries are those of $A$. Consider the eigenvalues of the matrix $\left(1-t\right)B + tA$ for each $t \in \left[0,1\right]$. For $t = 0$, these eigenvalues are the $a_{i,i}$, whereas for $t = 1$ they are the eigenvalues of $A$. Now, it is intuitively clear but nontrivial to formalize and prove that the eigenvalues of a matrix depend continuously on the entries of the matrix; thus, when $t$ progresses from $0$ to $1$, the eigenvalues cannot "jump". However, by Theorem 1, these eigenvalues are always in the union $D_1\left(\left(1-t\right)B + tA\right) \cup D_2\left(\left(1-t\right)B + tA\right) \cup \cdots \cup D_n\left(\left(1-t\right)B + tA\right)$, which in turn is easily seen to be a subset of the union $D_1\left(A\right) \cup D_2\left(A\right) \cup \cdots \cup D_n\left(A\right)$. Since the latter union decomposes into the two disjoint closed subsets $D_1\left(A\right) \cup D_2\left(A\right) \cup \cdots \cup D_k\left(A\right)$ and $D_{k+1}\left(A\right) \cup D_{k+2}\left(A\right) \cup \cdots \cup D_n\left(A\right)$, and since we know how many eigenvalues of $\left(1-t\right)B + tA$ lie in which of these two subsets for $t = 0$ (namely, $k$ eigenvalues lie in the former, and $n-k$ eigenvalues in the latter), we thus conclude that the same numbers of eigenvalues are in these subsets for $t = 1$, that is, for the matrix $A$.
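The homotopy underlying this argument can be watched directly in the toy $2\times 2$ case above (my own example, eigenvalues again from the quadratic formula): as $t$ runs from $0$ to $1$, each eigenvalue of $\left(1-t\right)B + tA$ stays inside its own Gershgorin disk of $A$ and never jumps to the other.

```python
import math

def eigs_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the quadratic formula."""
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr**2 - 4 * det)
    return (tr - disc) / 2, (tr + disc) / 2

# A = [[1, 0.1], [0.1, 10]] and B = diag(1, 10), so that
# (1-t)B + tA = [[1, 0.1*t], [0.1*t, 10]].
for step in range(11):
    t = step / 10
    small, large = eigs_2x2(1, 0.1 * t, 0.1 * t, 10)
    # Each eigenvalue remains in "its" disk of A for every t.
    assert abs(small - 1) <= 0.1 and abs(large - 10) <= 0.1

print("each eigenvalue stays in its disk for all t in [0, 1]")
```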

Horn and Johnson, in their *Matrix analysis* (2nd edition 2013) (proof of Theorem 6.1.1), suggest a variant of this argument. They, too, consider the matrix $\left(1-t\right)B + tA$, but instead of looking at specific eigenvalues, they use the argument principle from complex analysis to represent the number of eigenvalues of $\left(1-t\right)B + tA$ that lie within the union $D_1\left(A\right) \cup D_2\left(A\right) \cup \cdots \cup D_k\left(A\right)$ (or, rather, within a rectifiable curve that surrounds this union) as a contour integral of a meromorphic function. The continuity of this integral in $t$, I assume, follows from some basic complex analysis. However, this too leaves something to be desired from a formalistic point of view: Why do we know that we can surround $D_1\left(A\right) \cup D_2\left(A\right) \cup \cdots \cup D_k\left(A\right)$ by a rectifiable curve that does not contain any of the other disks in its interior?

Thus, my...

Question: Is there a proof of Theorem 2 that avoids any analytic technicalities, or perhaps even any analysis whatsoever?

Of course, the very fact that $A$ has $n$ eigenvalues requires analysis (as it depends on the Fundamental Theorem of Algebra). However, we can take it as an assumption. Even better, Theorem 2 could be restated as "no more than $k$ eigenvalues of $A$ can lie within $D_1\left(A\right) \cup D_2\left(A\right) \cup \cdots \cup D_k\left(A\right)$", which looks like a perfectly elementary claim susceptible to attack using (e.g.) the triangle inequality, just as Theorem 1 is commonly proved. It seems natural to start by observing that if you have $k+1$ eigenvalues of $A$, you can find an $A$-invariant subspace $W$ of $\mathbb{C}^n$ such that these $k+1$ eigenvalues are precisely the eigenvalues of the restriction $A\mid_W : W \to W$. Unfortunately, it is not obvious to me what can be said about the Gershgorin disks of $A\mid_W$; this clearly depends on the right choice of basis for $W$. Maybe an echelon basis?
