The Group of Units

Definition 12.1.1 (Unit Group) The group of units $U_K$ associated to a number field $K$ is the group of elements of $\O _K$ that have an inverse in $\O _K$.

Theorem 12.1.2 (Dirichlet) The group $U_K$ is the product of a finite cyclic group of roots of unity with a free abelian group of rank $r+s-1$, where $r$ is the number of real embeddings of $K$ and $s$ is the number of complex conjugate pairs of embeddings.

We prove the theorem by defining a map $\varphi :U_K\to \mathbf{R}^{r+s}$, and showing that the kernel of $\varphi$ is finite and the image of $\varphi$ is a lattice in a hyperplane in $\mathbf{R}^{r+s}$. The trickiest part of the proof is showing that the image of $\varphi$ spans a hyperplane, and we do this by a clever application of Blichfeldt's lemma (that if $S$ is closed, bounded, symmetric, etc., and has volume at least $2^n\cdot \Vol (V/L)$, then $S\cap L$ contains a nonzero element).

Remark 12.1.3 Theorem 12.1.2 is due to Dirichlet, who lived 1805-1859. Thomas Hirst described Dirichlet as follows:

He is a rather tall, lanky-looking man, with moustache and beard about to turn grey with a somewhat harsh voice and rather deaf. He was unwashed, with his cup of coffee and cigar. One of his failings is forgetting time, he pulls his watch out, finds it past three, and runs out without even finishing the sentence.

Koch wrote that:

... important parts of mathematics were influenced by Dirichlet. His proofs characteristically started with surprisingly simple observations, followed by extremely sharp analysis of the remaining problem.

I think Koch's observation nicely describes the proof we will give of Theorem 12.1.2.

The following proposition explains how to think about units in terms of the norm.

Proposition 12.1.4 An element $a\in\O _K$ is a unit if and only if $\Norm _{K/\mathbf{Q}}(a)=\pm 1$ .

Proof. Write $\Norm =\Norm _{K/\mathbf{Q}}$. If $a$ is a unit, then $a^{-1}$ is also a unit, and $1=\Norm (a)\Norm (a^{-1})$. Since both $\Norm (a)$ and $\Norm (a^{-1})$ are integers, it follows that $\Norm (a)=\pm 1$. Conversely, if $a\in\O _K$ and $\Norm (a)=\pm 1$, then the equation $aa^{-1}=1=\pm \Norm (a)$ implies that $a^{-1} = \pm \Norm (a)/a$. But $\Norm (a)$ is the product of the images of $a$ in $\mathbf{C}$ by all embeddings of $K$ into $\mathbf{C}$, so $\Norm (a)/a$ is also a product of images of $a$ in $\mathbf{C}$, hence a product of algebraic integers, hence an algebraic integer. Thus $a^{-1}\in\O _K$, which proves that $a$ is a unit. $\qedsymbol$
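As a quick sanity check of Proposition 12.1.4, here is a minimal sketch for $K=\mathbf{Q}(\sqrt{2})$. It assumes the standard facts that $\O _K=\mathbf{Z}[\sqrt{2}]$ and $\Norm (a+b\sqrt{2})=a^2-2b^2$; the helper names `norm`, `inverse`, and `is_unit` are ours and purely illustrative.

```python
from fractions import Fraction

def norm(a, b):
    """Norm of a + b*sqrt(2) from Q(sqrt(2)) down to Q."""
    return a * a - 2 * b * b

def inverse(a, b):
    """Inverse of a nonzero a + b*sqrt(2), as a pair of rational coordinates.

    Uses 1/(a + b*sqrt(2)) = (a - b*sqrt(2)) / Norm(a + b*sqrt(2)).
    """
    n = norm(a, b)
    return Fraction(a, n), Fraction(-b, n)

def is_unit(a, b):
    """a + b*sqrt(2) is a unit of Z[sqrt(2)] iff its inverse has integer coordinates."""
    if a == 0 and b == 0:
        return False
    x, y = inverse(a, b)
    return x.denominator == 1 and y.denominator == 1

# Compare "has an inverse in Z[sqrt(2)]" with "norm is +-1" on a small box of elements.
for a in range(-10, 11):
    for b in range(-10, 11):
        if (a, b) != (0, 0):
            assert is_unit(a, b) == (norm(a, b) in (1, -1))
```

For instance, $1+\sqrt{2}$ has norm $-1$ and inverse $-1+\sqrt{2}$, while $2+\sqrt{2}$ has norm $2$ and is not a unit.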

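Let $r$ be the number of real embeddings and $s$ the number of pairs of complex conjugate embeddings of $K$ into $\mathbf{C}$, so $n=[K:\mathbf{Q}]=r+2s$. Order the embeddings so that $\sigma_1,\ldots,\sigma_r$ are real and $\sigma_{r+1},\ldots,\sigma_{r+s}$ are representatives of the complex conjugate pairs. Define a map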

$\displaystyle \varphi :U_K \to \mathbf{R}^{r+s}$

$\displaystyle \varphi (a) = (\log\vert\sigma_1(a)\vert,\ldots, \log\vert\sigma_{r+s}(a)\vert).$

Lemma 12.1.5 The image of $\varphi$ lies in the hyperplane

$\displaystyle H = \{(x_1,\ldots, x_{r+s})\in\mathbf{R}^{r+s} : x_1+ \cdots + x_r + 2x_{r+1} + \cdots + 2x_{r+s} = 0\}.$

(12.1)

Proof. If $a\in U_K$ , then by Proposition 12.1.4,

$\displaystyle \left(\prod_{i=1}^{r} \vert\sigma_i(a)\vert\right) \cdot \left( \prod_{i=r+1}^{r+s} \vert\sigma_i(a)\vert^2 \right) = \vert\Norm _{K/\mathbf{Q}}(a)\vert = 1.$

Taking logs of both sides proves the lemma. $\qedsymbol$
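To illustrate Lemma 12.1.5 numerically, the sketch below computes $\varphi$ for a few units of $\mathbf{Q}(\sqrt{2})$, where $r=2$ and $s=0$, so the hyperplane is simply $x_1+x_2=0$. It assumes the two real embeddings send $\sqrt{2}$ to $\pm\sqrt{2}$; the function names are illustrative, and the check uses floating point, so it holds only up to rounding.

```python
import math

SQRT2 = math.sqrt(2)

def embeddings(a, b):
    """The two real embeddings of a + b*sqrt(2): sqrt(2) -> +sqrt(2) and sqrt(2) -> -sqrt(2)."""
    return (a + b * SQRT2, a - b * SQRT2)

def phi(a, b):
    """Log embedding of a nonzero element a + b*sqrt(2); here r = 2 and s = 0."""
    return tuple(math.log(abs(x)) for x in embeddings(a, b))

# 1 + sqrt(2), its square 3 + 2*sqrt(2), and -1 are all units of Z[sqrt(2)].
for unit in [(1, 1), (3, 2), (-1, 0)]:
    x1, x2 = phi(*unit)
    # The image lies in the hyperplane x1 + x2 = 0, up to rounding error.
    assert abs(x1 + x2) < 1e-9
```

Note that $\varphi(-1)=(0,0)$, so $-1$ lies in the kernel of $\varphi$, and that $\varphi$ of the square of $1+\sqrt{2}$ is twice $\varphi(1+\sqrt{2})$, since $\varphi$ is a homomorphism.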

Lemma 12.1.6 The kernel of $\varphi$ is finite.

Proof. We have

$\displaystyle \Ker (\varphi )$	$\displaystyle \subset \{a\in\O _K : \vert\sigma_i(a)\vert = 1$ for all $\displaystyle i=1,\ldots,r+2s\}$
	$\displaystyle \subset \sigma(\O _K) \cap X,$

where $X$ is the bounded subset of $\mathbf{R}^{r+2s}$ of elements all of whose coordinates have absolute value at most $1$. Since $\sigma(\O _K)$ is a lattice (see Proposition 5.2.4), the intersection $\sigma(\O _K)\cap X$ is finite, so $\Ker (\varphi )$ is finite. $\qedsymbol$

Lemma 12.1.7 The kernel of $\varphi$ is a finite cyclic group.

Proof. It is a general fact that any finite subgroup of the multiplicative group of a field is cyclic. [Homework.] $\qedsymbol$
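A small example of Lemmas 12.1.6 and 12.1.7: for $K=\mathbf{Q}(i)$ we have $r=0$ and $s=1$, so $\varphi$ maps into $\mathbf{R}^1$ and every unit lies in its kernel. The sketch below assumes $\O _K=\mathbf{Z}[i]$ and $\Norm (a+bi)=a^2+b^2$, lists the units using Proposition 12.1.4, and checks that $i$ generates them. Gaussian integers are stored as integer pairs, and all names are illustrative.

```python
def mul(x, y):
    """Multiply Gaussian integers given as (real, imag) pairs."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

def gaussian_units():
    """All a + bi in Z[i] with a^2 + b^2 = 1; this forces |a|, |b| <= 1."""
    return {(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1) if a * a + b * b == 1}

def powers(g, count):
    """The set {g^0, g^1, ..., g^(count-1)}."""
    result, current = set(), (1, 0)
    for _ in range(count):
        result.add(current)
        current = mul(current, g)
    return result

units = gaussian_units()
# i = (0, 1) generates every unit, so the unit group is cyclic of order 4.
assert powers((0, 1), 4) == units
# -1 = (-1, 0) has order 2 and does not generate the group.
assert powers((-1, 0), 4) != units
```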

To prove Theorem 12.1.2, it suffices to prove that $\mathrm{Im}(\varphi )$ is a lattice in the hyperplane $H$ from (12.1), which we view as a vector space of dimension $r+s-1$.

Define an embedding

$\displaystyle \sigma : K\hookrightarrow \mathbf{R}^n$

(12.2)

given by $\sigma(x) = (\sigma_1(x),\ldots,\sigma_{r+s}(x))$ , where we view $\mathbf{C}\cong \mathbf{R}\times \mathbf{R}$ via $a+b i\mapsto (a,b)$ . Note that this is exactly the same as the embedding

$\displaystyle x\mapsto \big($	$\displaystyle \sigma_1(x), \sigma_2(x),\ldots, \sigma_r(x),$
	Re $\displaystyle (\sigma_{r+1}(x)), \ldots,$ Re $\displaystyle (\sigma_{r+s}(x)),$ Im $\displaystyle (\sigma_{r+1}(x)), \ldots,$ Im $\displaystyle (\sigma_{r+s}(x))\big),$

from before, except that we have re-ordered the last $s$ imaginary components to be next to their corresponding real parts.

Lemma 12.1.8 The image of $\varphi$ is discrete in $\mathbf{R}^{r+s}$ .

Proof. Suppose $X$ is any bounded subset of $\mathbf{R}^{r+s}$. Then for any $u\in Y=\varphi ^{-1}(X)$ the coordinates of $\sigma(u)$ are bounded in terms of $X$ (since $\log$ is an increasing function). Thus $\sigma(Y)$ is a bounded subset of $\mathbf{R}^n$. Since $\sigma(Y)\subset \sigma(\O _K)$, and $\sigma(\O _K)$ is a lattice in $\mathbf{R}^n$, it follows that $\sigma(Y)$ is finite. Since $\sigma$ is injective, $Y$ is finite, so $\varphi (U_K)\cap X=\varphi (Y)$ is finite, which implies that $\varphi (U_K)$ is discrete. $\qedsymbol$

To finish the proof of Theorem 12.1.2, we will show that the image of $\varphi$ spans $H$. Let $W$ be the $\mathbf{R}$-span of the image $\varphi (U_K)$, and note that $W$ is a subspace of $H$. We will show that $W=H$ indirectly by showing that if $v\not \in H^{\perp}$, where $\perp$ is with respect to the dot product on $\mathbf{R}^{r+s}$, then $v\not \in W^{\perp}$. This will show that $W^{\perp}\subset H^{\perp}$, hence that $H\subset W$, as required.

Thus suppose $z=(z_1,\ldots,z_{r+s})\not\in H^{\perp}$ . Define a function $f:K^*\to \mathbf{R}$ by

$\displaystyle f(x) = z_1\log\vert\sigma_1(x)\vert + \cdots + z_{r+s}\log\vert\sigma_{r+s}(x)\vert.$

(12.3)

To show that $z\not\in W^{\perp}$ we show that there exists some $u\in U_K$ with $f(u)\neq 0$ .

Let

$\displaystyle A=\sqrt{\vert d_K\vert} \cdot \left( \frac{2}{\pi}\right)^s \in \mathbf{R}_{>0}.$

Choose any positive real numbers $c_1,\ldots, c_{r+s} \in \mathbf{R}_{>0}$ such that

$\displaystyle c_1\cdots c_r\cdot (c_{r+1}\cdots c_{r+s})^2 = A.$

Let

$\displaystyle S$	$\displaystyle = \{(x_1,\ldots,x_n) \in \mathbf{R}^n :$
	$\displaystyle \qquad\qquad \vert x_i\vert\leq c_i$ for $\displaystyle 1\leq i \leq r,$
	$\displaystyle \qquad\qquad \vert x_i^2 + x_{i+s}^2\vert \leq c_i^2$ for $\displaystyle r<i\leq r+s\} \subset \mathbf{R}^n.$

Then $S$ is closed, bounded, convex, symmetric with respect to the origin, and of dimension $r+2s$, since $S$ is a product of $r$ intervals and $s$ discs, each of which has these properties. Viewing $S$ as a product of intervals and discs, we see that the volume of $S$ is

$\displaystyle \Vol (S) = \prod_{i=1}^r (2c_i) \cdot \prod_{i=r+1}^{r+s} (\pi c_i^2) = 2^r\cdot \pi^s \cdot A.$

Recall Blichfeldt's lemma: if $L$ is a lattice in $V=\mathbf{R}^n$ and $S$ is closed, bounded, etc., and has volume at least $2^n\cdot \Vol (V/L)$, then $S\cap L$ contains a nonzero element. To apply this lemma, we take $L=\sigma(\O _K)\subset \mathbf{R}^n$, where $\sigma$ is as in (12.2). We showed, when proving finiteness of the class group, that $\Vol (\mathbf{R}^n/L) = 2^{-s}\sqrt{\vert d_K\vert}$. To check the hypothesis of Blichfeldt's lemma, note that

$\displaystyle \Vol (S) = 2^{r+s} \sqrt{\vert d_K\vert} = 2^n 2^{-s} \sqrt{\vert d_K\vert} = 2^n \Vol (\mathbf{R}^n/L).$
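The equality above depends only on $r$, $s$, and $d_K$, not on the individual $c_i$. The following sketch checks it in floating point for three fields. The triples $(r,s,d_K)$ used for $\mathbf{Q}(\sqrt{2})$, $\mathbf{Q}(i)$, and $\mathbf{Q}(\sqrt[3]{2})$ are standard values assumed here rather than taken from the text, and the function name is illustrative.

```python
import math

def minkowski_constants(r, s, disc):
    """Return (Vol(S), 2^n * Vol(R^n / L)) for a field with r real embeddings,
    s conjugate pairs of complex embeddings, and discriminant disc."""
    n = r + 2 * s
    A = math.sqrt(abs(disc)) * (2 / math.pi) ** s
    vol_S = 2 ** r * math.pi ** s * A            # product of r intervals and s discs
    covolume = 2 ** (-s) * math.sqrt(abs(disc))  # Vol(R^n / sigma(O_K))
    return vol_S, 2 ** n * covolume

# (r, s, d_K) for Q(sqrt(2)), Q(i), and Q(2^(1/3)).
for r, s, disc in [(2, 0, 8), (0, 1, -4), (1, 1, -108)]:
    vol_S, bound = minkowski_constants(r, s, disc)
    # Vol(S) equals the bound, so the hypothesis "volume at least" is met with equality.
    assert math.isclose(vol_S, bound)
```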

Thus there exists a nonzero element $a\in S\cap \sigma(\O _K)$ , i.e., a nonzero $a\in\O _K$ such that $\vert\sigma_i(a)\vert\leq c_i$ for $1\leq i\leq r+s$ . We then have

$\displaystyle \vert\Norm _{K/\mathbf{Q}}(a)\vert$	$\displaystyle = \left\vert\prod_{i=1}^{r+2s} \sigma_i(a)\right\vert$
	$\displaystyle = \prod_{i=1}^r \vert\sigma_i(a)\vert\cdot \prod_{i=r+1}^{r+s}\vert\sigma_i(a)\vert^2$
	$\displaystyle \leq c_1\cdots c_r\cdot (c_{r+1}\cdots c_{r+s})^2 = A.$

Since $a\in\O _K$ is nonzero, we also have

$\displaystyle \vert\Norm _{K/\mathbf{Q}}(a)\vert\geq 1.$

Moreover, if for some $i\leq r$ we have $\vert\sigma_i(a)\vert< \frac{c_i}{A}$, then

$\displaystyle 1\leq \vert\Norm _{K/\mathbf{Q}}(a)\vert < c_1\cdots \frac{c_i}{A}\cdots c_r \cdot (c_{r+1}\cdots c_{r+s})^2 = \frac{A}{A} = 1,$

a contradiction, so $\vert\sigma_i(a)\vert\geq \frac{c_i}{A}$ for $i=1,\ldots,r$ . Likewise, $\vert\sigma_i(a)\vert^2 \geq \frac{c_i^2}{A}$ , for $i=r+1,\ldots, r+s$ . Rewriting this we have

$\displaystyle \frac{c_i}{\vert\sigma_i(a)\vert}\leq A$ for $\displaystyle i\leq r$ and $\displaystyle \quad \left(\frac{c_i}{\vert\sigma_i(a)\vert}\right)^2\leq A$ for $\displaystyle i=r+1,\ldots, r+s.$

Our strategy is to use an appropriately chosen $a$ to construct a unit $u\in U_K$ such that $f(u)\neq 0$. First, let $b_1,\ldots, b_m$ be representative generators for the finitely many nonzero principal ideals of $\O _K$ of norm at most $A$. Since $\vert\Norm _{K/\mathbf{Q}}(a)\vert\leq A$, we have $(a)=(b_j)$ for some $j$, so there is a unit $u\in \O _K$ such that $a=ub_j$.

Let

$\displaystyle s=s(c_1,\ldots, c_{r+s}) = z_1\log(c_1)+\cdots +z_{r+s}\log(c_{r+s}),$

and recall $f:K^*\to \mathbf{R}$ defined in (12.3) above. We first show that

$\displaystyle \vert f(u) - s\vert \leq B = \vert f(b_j)\vert + \log(A)\cdot\left(\sum_{i=1}^{r}\vert z_i\vert + \frac{1}{2}\cdot \sum_{i=r+1}^{r+s}\vert z_i\vert\right).$

(12.4)

We have

$\displaystyle \vert f(u) - s\vert$	$\displaystyle = \vert f(a) - f(b_j) - s\vert$
	$\displaystyle \leq \vert f(b_j)\vert + \vert s - f(a)\vert$
	$\displaystyle =\vert f(b_j)\vert + \vert z_1(\log(c_1) - \log(\vert\sigma_1(a)\vert)) + \cdots + z_{r+s}(\log(c_{r+s}) - \log(\vert\sigma_{r+s}(a)\vert))\vert$
	$\displaystyle =\vert f(b_j)\vert + \vert z_1\cdot \log(c_1/\vert\sigma_1(a)\vert) + \cdots + \frac{z_{r+s}}{2}\cdot \log((c_{r+s}/\vert\sigma_{r+s}(a)\vert)^2)\vert$
	$\displaystyle \leq \vert f(b_j)\vert + \log(A)\cdot\left(\sum_{i=1}^{r}\vert z_i\vert + \frac{1}{2}\cdot \sum_{i=r+1}^{r+s}\vert z_i\vert\right).$

The amazing thing about (12.4) is that the bound on the right hand side does not depend on the $c_i$. Suppose we can choose positive real numbers $c_i$ such that

$\displaystyle c_1\cdots c_r\cdot (c_{r+1}\cdots c_{r+s})^2 = A$

and $s=s(c_1,\ldots, c_{r+s})$ is such that $\vert s\vert>B$. Then $\vert f(u)-s\vert\leq B$ would imply that $\vert f(u)\vert>0$, which is exactly what we aimed to prove. It is possible to choose such $c_i$ by proceeding as follows. If