The Group of Units

Definition 12.1.1 (Unit Group)   The group of units $ U_K$ associated to a number field $ K$ is the group of elements of $ \O _K$ that have an inverse in $ \O _K$.

Theorem 12.1.2 (Dirichlet)   The group $ U_K$ is the product of a finite cyclic group of roots of unity with a free abelian group of rank $ r+s-1$, where $ r$ is the number of real embeddings of $ K$ and $ s$ is the number of complex conjugate pairs of embeddings.
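
As a quick sanity check of the rank formula $ r+s-1$, here is a minimal Python sketch. The function name `unit_rank` and the choice of sample fields are illustrative assumptions, not part of the text; the signatures $ (r,s)$ listed are the standard ones for these fields.

```python
def unit_rank(r: int, s: int) -> int:
    """Rank of the free part of U_K according to Theorem 12.1.2: r + s - 1."""
    if r < 0 or s < 0 or r + 2 * s < 1:
        raise ValueError("need r, s >= 0 and degree r + 2s >= 1")
    return r + s - 1

# Signatures (r, s): r real embeddings, s pairs of complex conjugate embeddings.
examples = {
    "Q": (1, 0),
    "Q(i)": (0, 1),
    "Q(sqrt(2))": (2, 0),
    "Q(2^(1/3))": (1, 1),
}

ranks = {name: unit_rank(r, s) for name, (r, s) in examples.items()}
print(ranks)
```

In particular, the rank is $ 0$ exactly when $ K=\mathbf{Q}$ or $ K$ is imaginary quadratic, in which case $ U_K$ consists only of roots of unity.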

We prove the theorem by defining a map $ \varphi :U_K\to \mathbf{R}^{r+s}$, and showing that the kernel of $ \varphi $ is finite and the image of $ \varphi $ is a lattice in a hyperplane in $ \mathbf{R}^{r+s}$. The trickiest part of the proof is showing that the image of $ \varphi $ spans a hyperplane, and we do this by a clever application of Blichfeldt's lemma (that if $ S$ is closed, bounded, symmetric, etc., and has volume at least $ 2^n\cdot
\Vol (V/L)$, then $ S\cap L$ contains a nonzero element).

Remark 12.1.3   Theorem 12.1.2 is due to Dirichlet who lived 1805-1859. Thomas Hirst described Dirichlet as follows:
He is a rather tall, lanky-looking man, with moustache and beard about to turn grey with a somewhat harsh voice and rather deaf. He was unwashed, with his cup of coffee and cigar. One of his failings is forgetting time, he pulls his watch out, finds it past three, and runs out without even finishing the sentence.
Koch wrote that:
... important parts of mathematics were influenced by Dirichlet. His proofs characteristically started with surprisingly simple observations, followed by extremely sharp analysis of the remaining problem.
I think Koch's observation nicely describes the proof we will give of Theorem 12.1.2.

The following proposition explains how to think about units in terms of the norm.

Proposition 12.1.4   An element $ a\in\O _K$ is a unit if and only if $ \Norm _{K/\mathbf{Q}}(a)=\pm 1$.

Proof. Write $ \Norm =\Norm _{K/\mathbf{Q}}$. If $ a$ is a unit, then $ a^{-1}$ is also a unit, and $ 1=\Norm (a)\Norm (a^{-1})$. Since both $ \Norm (a)$ and $ \Norm (a^{-1})$ are integers, it follows that $ \Norm (a)=\pm 1$. Conversely, if $ a\in\O _K$ and $ \Norm (a)=\pm 1$, then the equation $ aa^{-1}=1=\pm \Norm (a)$ implies that $ a^{-1} = \pm \Norm (a)/a$. But $ \Norm (a)$ is the product of the images of $ a$ in $ \mathbf{C}$ by all embeddings of $ K$ into $ \mathbf{C}$, so $ \Norm (a)/a$ is also a product of images of $ a$ in $ \mathbf{C}$, hence a product of algebraic integers, hence an algebraic integer. Thus $ a^{-1}\in\O _K$, which proves that $ a$ is a unit. $ \qedsymbol$
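
To make Proposition 12.1.4 concrete, the following sketch works in $ \mathbf{Z}[\sqrt{d}]$, representing $ a+b\sqrt{d}$ as the pair `(a, b)`. It assumes $ d$ is squarefree with $ d\equiv 2,3 \pmod 4$, so that $ \mathbf{Z}[\sqrt{d}]$ is the full ring of integers; the helper names `norm`, `multiply`, and `inverse_if_unit` are hypothetical and chosen only for this illustration.

```python
def norm(a: int, b: int, d: int) -> int:
    """Norm of a + b*sqrt(d) from Q(sqrt(d)) to Q, namely a^2 - d*b^2."""
    return a * a - d * b * b

def multiply(p, q, d):
    """Product of two elements of Z[sqrt(d)] given as pairs (a, b)."""
    (a, b), (x, y) = p, q
    return (a * x + d * b * y, a * y + b * x)

def inverse_if_unit(a: int, b: int, d: int):
    """Return the inverse of a + b*sqrt(d) as a pair, or None if it is not a unit."""
    n = norm(a, b, d)
    if n not in (1, -1):
        return None
    # (a + b*sqrt(d)) * (a - b*sqrt(d)) = n = +-1, so the inverse is
    # (a - b*sqrt(d)) / n, and dividing by +-1 is the same as multiplying by it.
    return (a * n, -b * n)

# 1 + sqrt(2) has norm -1, so it is a unit; 2 + sqrt(2) has norm 2, so it is not.
inv = inverse_if_unit(1, 1, 2)
product = multiply((1, 1), inv, 2)
print(norm(1, 1, 2), inv, product, inverse_if_unit(2, 1, 2))
```

The inverse is built from the conjugate, which mirrors the step in the proof where $ \Norm (a)/a$ is written as a product of conjugates of $ a$.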

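Let $ r$ be the number of real embeddings and $ s$ the number of pairs of complex conjugate embeddings of $ K$ into $ \mathbf{C}$, so $ n=[K:\mathbf{Q}]=r+2s$. Order the embeddings so that $ \sigma_1,\ldots,\sigma_r$ are real and $ \sigma_{r+1},\ldots,\sigma_{r+s}$ contain one embedding from each complex conjugate pair. Define a map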

$\displaystyle \varphi :U_K \to \mathbf{R}^{r+s}
$

by

$\displaystyle \varphi (a) = (\log\vert\sigma_1(a)\vert,\ldots, \log\vert\sigma_{r+s}(a)\vert).
$

Lemma 12.1.5   The image of $ \varphi $ lies in the hyperplane

$\displaystyle H = \{(x_1,\ldots, x_{r+s})\in\mathbf{R}^{r+s} : x_1+ \cdots + x_r + 2x_{r+1} + \cdots + 2x_{r+s} = 0\}.$ (12.1)

Proof. If $ a\in U_K$, then by Proposition 12.1.4,

$\displaystyle \left(\prod_{i=1}^{r} \vert\sigma_i(a)\vert\right)
\cdot \left( \prod_{i=r+1}^{r+s} \vert\sigma_i(a)\vert^2 \right) = 1.$

Taking logs of both sides proves the lemma. $ \qedsymbol$
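
As a numerical illustration of Lemma 12.1.5, the sketch below evaluates $ \varphi $ on the unit $ 1+\sqrt{2}$ of $ \mathbf{Q}(\sqrt{2})$, where $ r=2$ and $ s=0$, so the hyperplane is $ x_1+x_2=0$. The function name `log_embedding_real_quadratic` is a hypothetical helper, and the check is only up to floating-point error.

```python
import math

def log_embedding_real_quadratic(a: int, b: int, d: int):
    """phi(a + b*sqrt(d)) for a real quadratic field: logs of both real embeddings."""
    root = math.sqrt(d)
    sigma_1 = a + b * root   # embedding sending sqrt(d) to +sqrt(d)
    sigma_2 = a - b * root   # embedding sending sqrt(d) to -sqrt(d)
    return (math.log(abs(sigma_1)), math.log(abs(sigma_2)))

phi = log_embedding_real_quadratic(1, 1, 2)
coordinate_sum = sum(phi)
print(phi, coordinate_sum)
```

The two coordinates are negatives of each other, so the point lies on the line $ x_1+x_2=0$ in $ \mathbf{R}^2$.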

Lemma 12.1.6   The kernel of $ \varphi $ is finite.

Proof. We have

$\displaystyle \Ker (\varphi )$ $\displaystyle \subset \{a\in\O _K : \vert\sigma_i(a)\vert = 1$    for all $\displaystyle i=1,\ldots,r+2s\}$    
  $\displaystyle \subset \sigma(\O _K) \cap X,$    

where $ X$ is the bounded subset of $ \mathbf{R}^{r+2s}$ of elements all of whose coordinates have absolute value at most $ 1$. Since $ \sigma(\O _K)$ is a lattice (see Proposition 5.2.4), the intersection $ \sigma(\O _K)\cap X$ is finite, so $ \Ker (\varphi )$ is finite. $ \qedsymbol$

Lemma 12.1.7   The kernel of $ \varphi $ is a finite cyclic group.

Proof. It is a general fact that any finite subgroup of the multiplicative group of a field is cyclic. [Homework.] $ \qedsymbol$
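
For a small example of Lemmas 12.1.6 and 12.1.7, take $ K=\mathbf{Q}(i)$ with $ \O _K=\mathbf{Z}[i]$. Every element of the kernel has absolute value $ 1$, so its coordinates lie in $ \{-1,0,1\}$ and a finite search finds all of them. This is a sketch for this one field only; `gauss_mul` is a hypothetical helper name.

```python
def gauss_mul(p, q):
    """Product of Gaussian integers given as pairs (a, b) meaning a + b*i."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

# Lattice points a + b*i with |a + b*i| = 1: the finite search from Lemma 12.1.6.
kernel = sorted(
    (a, b) for a in (-1, 0, 1) for b in (-1, 0, 1) if a * a + b * b == 1
)

# Successive powers i, i^2, i^3, i^4 of the element i = (0, 1).
powers_of_i = []
x = (1, 0)
for _ in range(4):
    x = gauss_mul(x, (0, 1))
    powers_of_i.append(x)

print(kernel, powers_of_i)
```

The powers of $ i$ exhaust the kernel, which is therefore cyclic of order $ 4$.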

To prove Theorem 12.1.2, it suffices to prove that Im$ (\varphi )$ is a lattice in the hyperplane $ H$ from (12.1), which we view as a vector space of dimension $ r+s-1$.

Define an embedding

$\displaystyle \sigma : K\hookrightarrow \mathbf{R}^n$ (12.2)

given by $ \sigma(x) = (\sigma_1(x),\ldots,\sigma_{r+s}(x))$, where we view $ \mathbf{C}\cong \mathbf{R}\times \mathbf{R}$ via $ a+b i\mapsto (a,b)$. Note that this is exactly the same as the embedding

$\displaystyle x\mapsto \big($ $\displaystyle \sigma_1(x), \sigma_2(x),\ldots, \sigma_r(x),$    
     Re$\displaystyle (\sigma_{r+1}(x)), \ldots,$   Re$\displaystyle (\sigma_{r+s}(x)),$   Im$\displaystyle (\sigma_{r+1}(x)), \ldots,$   Im$\displaystyle (\sigma_{r+s}(x))\big),$    

from before, except that we have re-ordered the last $ s$ imaginary components to be next to their corresponding real parts.

Lemma 12.1.8   The image of $ \varphi $ is discrete in $ \mathbf{R}^{r+s}$.

Proof. Suppose $ X$ is any bounded subset of $ \mathbf{R}^{r+s}$. Then for any $ u\in
Y=\varphi ^{-1}(X)$ the coordinates of $ \sigma(u)$ are bounded in terms of $ X$ (since $ \log$ is an increasing function). Thus $ \sigma(Y)$ is a bounded subset of $ \mathbf{R}^n$. Since $ \sigma(Y)\subset \sigma(\O _K)$, and $ \sigma(\O _K)$ is a lattice in $ \mathbf{R}^n$, it follows that $ \sigma(Y)$ is finite. Since $ \sigma$ is injective, $ Y$ is finite, and $ \varphi $ has finite kernel, so $ \varphi (U_K)\cap X$ is finite, which implies that $ \varphi (U_K)$ is discrete. $ \qedsymbol$

To finish the proof of Theorem 12.1.2, we will show that the image of $ \varphi $ spans $ H$. Let $ W$ be the $ \mathbf{R}$-span of the image $ \varphi (U_K)$, and note that $ W$ is a subspace of $ H$. We will show that $ W=H$ indirectly by showing that if $ z\not \in H^{\perp}$, where $ \perp$ is with respect to the dot product on $ \mathbf{R}^{r+s}$, then $ z\not \in W^{\perp}$. This will show that $ W^{\perp}\subset H^{\perp}$, hence that $ H\subset W$, as required.

Thus suppose $ z=(z_1,\ldots,z_{r+s})\not\in H^{\perp}$. Define a function $ f:K^*\to \mathbf{R}$ by

$\displaystyle f(x) = z_1\log\vert\sigma_1(x)\vert + \cdots + z_{r+s}\log\vert\sigma_{r+s}(x)\vert.$ (12.3)

To show that $ z\not\in W^{\perp}$ we show that there exists some $ u\in
U_K$ with $ f(u)\neq 0$.

Let

$\displaystyle A=\sqrt{\vert d_K\vert} \cdot \left( \frac{2}{\pi}\right)^s \in \mathbf{R}_{>0}.
$

Choose any positive real numbers $ c_1,\ldots, c_{r+s} \in \mathbf{R}_{>0}$ such that

$\displaystyle c_1\cdots c_r\cdot (c_{r+1}\cdots c_{r+s})^2 = A.
$

Let

$\displaystyle S$ $\displaystyle = \{(x_1,\ldots,x_n) \in \mathbf{R}^n :$    
  $\displaystyle \qquad\qquad \vert x_i\vert\leq c_i$ for $\displaystyle 1\leq i \leq r,$    
  $\displaystyle \qquad\qquad \vert x_i^2 + x_{i+s}^2\vert \leq c_i^2$    for $\displaystyle r<i\leq r+s\} \subset \mathbf{R}^n.$    

Then $ S$ is closed, bounded, convex, symmetric with respect to the origin, and of dimension $ r+2s$, since $ S$ is a product of $ r$ intervals and $ s$ discs, each of which has these properties. Viewing $ S$ as a product of intervals and discs, we see that the volume of $ S$ is

$\displaystyle \Vol (S) = \prod_{i=1}^r (2c_i) \cdot \prod_{i=r+1}^{r+s} (\pi c_i^2)
= 2^r\cdot \pi^s \cdot A.
$

Recall Blichfeldt's lemma: if $ L$ is a lattice in $ V=\mathbf{R}^n$ and $ S$ is closed, bounded, convex, symmetric with respect to the origin, and has volume at least $ 2^n\cdot \Vol (V/L)$, then $ S\cap L$ contains a nonzero element. To apply this lemma, we take $ L=\sigma(\O _K)\subset \mathbf{R}^n$, where $ \sigma$ is as in (12.2). We showed, when proving finiteness of the class group, that $ \Vol (\mathbf{R}^n/L) = 2^{-s}\sqrt{\vert d_K\vert}$. To check the hypothesis of Blichfeldt's lemma, note that

$\displaystyle \Vol (S) = 2^{r+s} \sqrt{\vert d_K\vert} = 2^n 2^{-s} \sqrt{\vert d_K\vert} = 2^n \Vol (\mathbf{R}^n/L).
$
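
The volume identity above can be checked numerically. The sketch below computes $ A$, $ \Vol (S)=2^r\pi^s A$, and $ 2^n\Vol (\mathbf{R}^n/L)$ from $ r$, $ s$, and $ \vert d_K\vert$; the function name `blichfeldt_data` is hypothetical, and the two sample discriminants ($ 8$ for $ \mathbf{Q}(\sqrt{2})$ and $ -4$ for $ \mathbf{Q}(i)$) are standard values supplied here as an assumption rather than taken from the text.

```python
import math

def blichfeldt_data(r: int, s: int, abs_disc: int):
    """Return (A, Vol(S), 2^n * Vol(R^n / L)) for signature (r, s) and |d_K|."""
    n = r + 2 * s
    A = math.sqrt(abs_disc) * (2 / math.pi) ** s
    vol_S = 2 ** r * math.pi ** s * A            # r intervals and s discs
    covolume = 2 ** (-s) * math.sqrt(abs_disc)   # Vol(R^n / sigma(O_K))
    return A, vol_S, 2 ** n * covolume

# Q(sqrt(2)): r = 2, s = 0, |d_K| = 8.   Q(i): r = 0, s = 1, |d_K| = 4.
real_quadratic = blichfeldt_data(2, 0, 8)
gaussian = blichfeldt_data(0, 1, 4)
print(real_quadratic, gaussian)
```

In both cases the last two entries agree, so $ S$ has exactly the minimal volume that the lemma requires.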

Thus there exists a nonzero element $ a\in S\cap \sigma(\O _K)$, i.e., a nonzero $ a\in\O _K$ such that $ \vert\sigma_i(a)\vert\leq c_i$ for $ 1\leq i\leq r+s$. We then have

$\displaystyle \vert\Norm _{K/\mathbf{Q}}(a)\vert$ $\displaystyle = \left\vert\prod_{i=1}^{r+2s} \sigma_i(a)\right\vert$    
  $\displaystyle = \prod_{i=1}^r \vert\sigma_i(a)\vert\cdot \prod_{i=r+1}^{r+s}\vert\sigma_i(a)\vert^2$    
  $\displaystyle \leq c_1\cdots c_r\cdot (c_{r+1}\cdots c_{r+s})^2 = A.$    

Since $ a\in\O _K$ is nonzero, we also have

$\displaystyle \vert\Norm _{K/\mathbf{Q}}(a)\vert\geq 1.
$

Moreover, if for some $ i\leq r$ we had $ \vert\sigma_i(a)\vert< \frac{c_i}{A}$, then

$\displaystyle 1\leq \vert\Norm _{K/\mathbf{Q}}(a)\vert < c_1\cdots \frac{c_i}{A}\cdots c_r \cdot (c_{r+1}\cdots c_{r+s})^2 = \frac{A}{A} = 1,
$

a contradiction, so $ \vert\sigma_i(a)\vert\geq \frac{c_i}{A}$ for $ i=1,\ldots,r$. Likewise, $ \vert\sigma_i(a)\vert^2 \geq \frac{c_i^2}{A}$, for $ i=r+1,\ldots, r+s$. Rewriting this we have

$\displaystyle \frac{c_i}{\vert\sigma_i(a)\vert}\leq A$    for $\displaystyle i\leq r$    and $\displaystyle \quad
\left(\frac{c_i}{\vert\sigma_i(a)\vert}\right)^2\leq A$    for $\displaystyle i=r+1,\ldots, r+s.
$

Our strategy is to use an appropriately chosen $ a$ to construct a unit $ u\in U_K$ such that $ f(u)\neq 0$. First, let $ b_1,\ldots, b_m$ be representative generators for the finitely many nonzero principal ideals of $ \O _K$ of norm at most $ A$. Since $ \vert\Norm _{K/\mathbf{Q}}(a)\vert\leq A$, we have $ (a)=(b_j)$, for some $ j$, so there is a unit $ u\in \O _K$ such that $ a=u b_j$.

Let

$\displaystyle s=s(c_1,\ldots, c_{r+s}) = z_1\log(c_1)+\cdots
+z_{r+s}\log(c_{r+s}),$

and recall the function $ f:K^*\to \mathbf{R}$ defined in (12.3) above. We first show that

$\displaystyle \vert f(u) - s\vert \leq B = \vert f(b_j)\vert + \log(A)\cdot\left(\sum_{i=1}^{r}\vert z_i\vert + \frac{1}{2}\cdot \sum_{i=r+1}^{r+s}\vert z_i\vert\right).$ (12.4)

We have

$\displaystyle \vert f(u) - s\vert$ $\displaystyle = \vert f(a) - f(b_j) - s\vert$    
  $\displaystyle \leq \vert f(b_j)\vert + \vert s - f(a)\vert$    
  $\displaystyle =\vert f(b_j)\vert + \vert z_1(\log(c_1) - \log(\vert\sigma_1(a)\vert)) + \cdots + z_{r+s}(\log(c_{r+s}) - \log(\vert\sigma_{r+s}(a)\vert))\vert$    
  $\displaystyle =\vert f(b_j)\vert + \vert z_1\cdot \log(c_1/\vert\sigma_1(a)\vert) + \cdots + \frac{z_{r+s}}{2}\cdot \log((c_{r+s}/\vert\sigma_{r+s}(a)\vert)^2)\vert$    
  $\displaystyle \leq \vert f(b_j)\vert + \log(A)\cdot\left(\sum_{i=1}^{r}\vert z_i\vert + \frac{1}{2}\cdot \sum_{i=r+1}^{r+s}\vert z_i\vert\right).$    

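The amazing thing about (12.4) is that the bound $ B$ on the right hand side does not depend on the $ c_i$ (replacing $ \vert f(b_j)\vert$ by the maximum of $ \vert f(b_1)\vert,\ldots,\vert f(b_m)\vert$ if necessary, since the index $ j$ may vary with $ a$). Suppose we can choose positive real numbers $ c_i$ such that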

$\displaystyle c_1\cdots c_r\cdot (c_{r+1}\cdots c_{r+s})^2 = A$

and $ s=s(c_1,\ldots, c_{r+s})$ is such that $ \vert s\vert>B$. Then $ \vert f(u)-s\vert\leq B$ would imply that $ \vert f(u)\vert>0$, which is exactly what we aimed to prove. It is possible to choose such $ c_i$, by proceeding as follows. If $ r+s=1$, then we are trying to prove that $ \varphi (U_K)$ is a lattice in $ \mathbf{R}^0=\mathbf{R}^{r+s-1}$, which is automatically true, so assume $ r+s>1$. Then there are at least two distinct $ c_i$. Write $ w_i=1$ for $ i\leq r$ and $ w_i=2$ for $ i>r$, so that $ H^{\perp}$ is spanned by $ (w_1,\ldots,w_{r+s})$. Since $ z\not\in H^{\perp}$, there are indices $ j\neq k$ with $ z_j/w_j\neq z_k/w_k$. Set $ c_i=1$ for $ i\neq j,k$, and let $ c_k$ be determined by $ c_j$ through the condition

$\displaystyle c_j^{w_j}\cdot c_k^{w_k} = A.
$

Then $ s = w_j\cdot(z_j/w_j - z_k/w_k)\cdot\log(c_j) + (z_k/w_k)\cdot\log(A)$, so $ \vert s\vert\to\infty$ as $ c_j\to\infty$, and we take $ c_j$ large enough that $ \vert s\vert>B$.

Since it is possible to choose the $ c_i$ as needed, it is possible to find a unit $ u$ such that $ f(u)\neq 0$. We conclude that $ z\not\in W^{\perp}$, so $ W^{\perp}\subset H^{\perp}$, whence $ H\subset W$, which finishes the proof of Theorem 12.1.2.
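
The choice of the $ c_i$ can be made explicit. The sketch below is one possible recipe, not necessarily the one intended in the text: it uses weights $ 1$ for real and $ 2$ for complex places, picks two indices at which $ z$ is not proportional to the weight vector (possible exactly when $ z\not\in H^{\perp}$), sets every other $ c_i$ to $ 1$, and then pushes one $ c_j$ up until $ \vert s\vert>B$. The name `choose_c` and the parameter `s_pairs` (the number of complex pairs, renamed to avoid a clash with the quantity $ s(c_1,\ldots,c_{r+s})$) are assumptions made for this illustration.

```python
import math

def choose_c(z, r, s_pairs, A, B):
    """Return c_1..c_{r+s} > 0 with prod c_i^{w_i} = A and |sum z_i log c_i| > B."""
    m = r + s_pairs
    w = [1] * r + [2] * s_pairs  # weights: 1 for real places, 2 for complex places
    # Find j != k with z_j / w_j != z_k / w_k (cross-multiplied to avoid division).
    pairs = [(j, k) for j in range(m) for k in range(m)
             if j != k and z[j] * w[k] != z[k] * w[j]]
    if not pairs:
        raise ValueError("z is a multiple of the weight vector, so z lies in H-perp")
    j, k = pairs[0]
    # With c_i = 1 for i != j, k, the sum is slope * log(c_j) + const.
    slope = z[j] - z[k] * w[j] / w[k]
    const = z[k] / w[k] * math.log(A)
    t = (B + abs(const) + 1) / abs(slope)   # t = log(c_j); forces |sum| >= B + 1
    log_c = [0.0] * m
    log_c[j] = t
    log_c[k] = (math.log(A) - w[j] * t) / w[k]  # restores the product condition
    return [math.exp(x) for x in log_c]

# Example for Q(sqrt(2)): r = 2, no complex places, A = sqrt(8), z = (1, 0), B = 5.
A = math.sqrt(8)
c = choose_c([1, 0], 2, 0, A, 5.0)
s_value = 1 * math.log(c[0]) + 0 * math.log(c[1])
print(c, s_value)
```

Because `t` grows with `B`, the resulting $ c_j$ can be very large; the example only illustrates that a valid choice exists, which is all the proof needs.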

William Stein 2004-05-06