The following lemma deals with trace and norm in a cyclic extension.
Let F/K be a cyclic extension with Galois group G, and let the automorphism c generate G. For u in F, the trace of u is 0 iff u = v-c(v) for some v in F, and similarly, the norm of u is 1 iff u = v/c(v) for some nonzero v in F.
There are several equivalent definitions of trace and norm. We will use the one that is the most straightforward. The trace of u is the sum of its conjugates, the images of u under the automorphisms in G, and the norm of u is the product of those conjugates.
One direction is easy, and holds whether G is cyclic or not. The trace of v-c(v) is the sum of the automorphic images of v, minus the sum of the automorphic images of c(v). Yet the latter is merely a permutation of the former. The conjugates of v are subtracted from themselves, and the result is 0. A similar argument shows the norm of v/c(v) is 1.
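The easy direction can be checked by hand in a small case. The sketch below is only an illustration under an assumed model, not part of the proof: F = Q(i) over K = Q, with c taken to be complex conjugation, so G has order 2. The pair representation and the helper names (sub, mul, div, samples) are hypothetical choices made for this example.

```python
from fractions import Fraction

# Hypothetical model: the pair (a, b) stands for a + b*i in F = Q(i), K = Q.

def c(x):
    """Complex conjugation, the generator of G = {identity, c}."""
    return (x[0], -x[1])

def sub(x, y):
    return (x[0] - y[0], x[1] - y[1])

def mul(x, y):
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def div(x, y):
    d = y[0] * y[0] + y[1] * y[1]      # y * c(y), a nonzero rational when y != 0
    num = mul(x, c(y))
    return (num[0] / d, num[1] / d)

def trace(x):
    """Sum of the conjugates x and c(x)."""
    return (x[0] + c(x)[0], x[1] + c(x)[1])

def norm(x):
    """Product of the conjugates x and c(x)."""
    return mul(x, c(x))

# Every nonzero a + b*i with small integer coordinates.
samples = [(Fraction(a), Fraction(b))
           for a in range(-3, 4) for b in range(-3, 4) if (a, b) != (0, 0)]

for v in samples:
    assert trace(sub(v, c(v))) == (0, 0)   # trace of v - c(v) is 0
    assert norm(div(v, c(v))) == (1, 0)    # norm of v / c(v) is 1
```

Exact rationals are used instead of floating point so that the equalities hold exactly rather than approximately.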
Conversely, assume the trace of u is 0. Before we can find v, such that u = v-c(v), we need a lemma.
What happens if the trace of z is 0 for every z in F? This is possible if F/K is purely inseparable, but we said the extension was cyclic, and that means F/K is Galois by definition. The dimension of F over K equals the size of the Galois group; call this number n. There are n distinct K-automorphisms of F, and we're going to use this fact below.
The automorphisms of F are c^0, c^1, c^2, …, c^{n-1}, and these automorphisms are linearly independent. We cannot take coefficients from F, not all zero, apply them to these n automorphisms, take the sum, and produce a map that carries all of F to 0. Yet trace() would be just such a map if it vanished everywhere. All the coefficients are 1, and the trace function is z+c(z)+c^2(z)+c^3(z)+…+c^{n-1}(z). This cannot send every z to 0, therefore there is some z in F with trace(z) ≠ 0.
Let w = z/trace(z). Since the trace of z lies in K, it is unaffected by any of the group automorphisms. Hence the trace of w is the trace of z over the trace of z, or 1.
Set v equal to the following sum, in which the exponent on c runs from 0 up to n-2.
v = uw + (u+c(u))c(w) + (u+c(u)+c^2(u))c^2(w) + (u+c(u)+c^2(u)+c^3(u))c^3(w) + … + (u+c(u)+…+c^{n-2}(u))c^{n-2}(w)   (n-1 terms)
Consider c(v). What does c do to the above sum? Each term becomes the term that follows, almost. For instance, watch what happens to the second term, as it almost becomes the third.
(u+c(u))c(w) → (c(u)+c^2(u))c^2(w)
All we're missing is uc^2(w).
Let's try to get from c(v) back to v.
1. Add in uc(w) + uc^2(w) + uc^3(w) + … + uc^{n-2}(w).
2. Bring the first term back in, uw. This naturally fits into the series (1) above.
3. Subtract the nth term, which we don't want:
(c(u)+c^2(u)+c^3(u)+…+c^{n-1}(u)) c^{n-1}(w)
Put this all together and the difference, v-c(v), is the following:
(w + c(w) + c^2(w) + c^3(w) + … + c^{n-2}(w)) u -
(c(u) + c^2(u) + c^3(u) + … + c^{n-1}(u)) c^{n-1}(w)
Remember that trace(u) = 0 and trace(w) = 1. Rewrite the above as follows.
(trace(w) - c^{n-1}(w)) u - (trace(u) - u) c^{n-1}(w) =
(1 - c^{n-1}(w)) u - (0 - u) c^{n-1}(w) =
u
We have produced a v such that u = v-c(v).
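The construction can be tried on a concrete cyclic extension. The sketch below is an illustration under assumptions that are not in the text: it takes K = GF(5) and F = GF(125), built as GF(5)[t] modulo t^3 + t + 1, with c the Frobenius map a → a^5, so n = 3. The helper names (add, sub, mul, power, additive_v) are hypothetical. It finds a w of trace 1, builds v from the sum with n-1 terms, and checks v - c(v) = u for every u of trace 0.

```python
from itertools import product

P, N = 5, 3                      # assumed model: K = GF(5), F = GF(5^3), n = 3
ZERO, ONE = (0, 0, 0), (1, 0, 0)
# (a0, a1, a2) stands for a0 + a1*t + a2*t^2, where t^3 + t + 1 = 0.
ELEMENTS = list(product(range(P), repeat=N))

def add(a, b):
    return tuple((x + y) % P for x, y in zip(a, b))

def sub(a, b):
    return tuple((x - y) % P for x, y in zip(a, b))

def mul(a, b):
    prod = [0] * 5
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] += x * y
    for k in (4, 3):             # reduce with t^3 = -t - 1
        prod[k - 2] -= prod[k]
        prod[k - 3] -= prod[k]
    return tuple(x % P for x in prod[:3])

def power(a, e):
    result = ONE
    for _ in range(e):
        result = mul(result, a)
    return result

def c(a, times=1):
    """Frobenius map a -> a^5, applied `times` times; it generates G."""
    for _ in range(times):
        a = power(a, P)
    return a

def trace(a):
    total = ZERO
    for i in range(N):
        total = add(total, c(a, i))
    return total

# Pick z outside K with nonzero trace, then scale so that trace(w) = 1.
z = next(e for e in ELEMENTS if e[1:] != (0, 0) and trace(e) != ZERO)
w = mul(z, power(trace(z), P**N - 2))   # a^(5^3 - 2) inverts a nonzero a

def additive_v(u):
    """v = uw + (u + c(u))c(w) + ..., a sum of n-1 terms."""
    v, partial = ZERO, ZERO
    for i in range(N - 1):
        partial = add(partial, c(u, i))          # u + c(u) + ... + c^i(u)
        v = add(v, mul(partial, c(w, i)))
    return v

zero_trace = [u for u in ELEMENTS if trace(u) == ZERO]
for u in zero_trace:
    v = additive_v(u)
    assert sub(v, c(v)) == u                     # u = v - c(v)
```

Characteristic 5 is chosen rather than 2 so that subtraction differs from addition and the signs in the argument are actually exercised.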
Next assume norm(u) = 1. Set v to the following sum, containing n terms. Note, the value of w is not the same as it was above. I'll describe how to select w below.
v = uw + uc(u)c(w) + uc(u)c^2(u)c^2(w) + uc(u)c^2(u)c^3(u)c^3(w) + … + |u|c^{n-1}(w)
Here |u| denotes the norm of u, the product uc(u)c^2(u)…c^{n-1}(u).
Since norm(u) = 1, u is nonzero. Similarly for uc(u), uc(u)c^2(u), and so on. The "coefficients" on the functions c^0(), c^1(), c^2(), c^3(), etc, are all nonzero. We have a nontrivial linear combination of distinct K-automorphisms of F. By linear independence, the resulting function cannot map all of F to 0. Select any w whose image under this function is nonzero. That image is v, so v is nonzero.
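Start with v, apply c, and multiply by u. Once again the first term becomes the second, the second becomes the third, and so on, while the last term wraps around to uc(|u|)c^n(w) = uc(|u|)w, because c^n is the identity. The difference between this expression and v is uc(|u|)w-uw, which is 0 because the norm of u is 1. Therefore v = uc(v), or u = v/c(v).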
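The multiplicative construction can also be checked exhaustively in a small field. The sketch below rests on assumptions that are not in the text: K = GF(5), F = GF(125) built as GF(5)[t] modulo t^3 + t + 1, and c the Frobenius map a → a^5, so n = 3. The helper names (multiplicative_v, find_v) are hypothetical. For each u of norm 1 it tries values of w until the sum v is nonzero, then confirms v = uc(v).

```python
from itertools import product

P, N = 5, 3                      # assumed model: K = GF(5), F = GF(5^3), n = 3
ZERO, ONE = (0, 0, 0), (1, 0, 0)
# (a0, a1, a2) stands for a0 + a1*t + a2*t^2, where t^3 + t + 1 = 0.
ELEMENTS = list(product(range(P), repeat=N))

def add(a, b):
    return tuple((x + y) % P for x, y in zip(a, b))

def mul(a, b):
    prod = [0] * 5
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] += x * y
    for k in (4, 3):             # reduce with t^3 = -t - 1
        prod[k - 2] -= prod[k]
        prod[k - 3] -= prod[k]
    return tuple(x % P for x in prod[:3])

def c(a, times=1):
    """Frobenius map a -> a^5, applied `times` times; it generates G."""
    for _ in range(times):
        fifth = ONE
        for _ in range(P):
            fifth = mul(fifth, a)
        a = fifth
    return a

def norm(a):
    """Product of the conjugates a, c(a), ..., c^(n-1)(a)."""
    result = ONE
    for i in range(N):
        result = mul(result, c(a, i))
    return result

def multiplicative_v(u, w):
    """v = uw + uc(u)c(w) + ... + |u|c^(n-1)(w), a sum of n terms."""
    v, coeff = ZERO, ONE
    for i in range(N):
        coeff = mul(coeff, c(u, i))              # u c(u) ... c^i(u)
        v = add(v, mul(coeff, c(w, i)))
    return v

def find_v(u):
    """Try each w in turn until the sum v is nonzero."""
    for w in ELEMENTS:
        v = multiplicative_v(u, w)
        if v != ZERO:
            return v
    raise ValueError("no suitable w; linear independence rules this out")

norm_one = [u for u in ELEMENTS if norm(u) == ONE]
for u in norm_one:
    v = find_v(u)
    assert mul(u, c(v)) == v                     # equivalent to u = v / c(v)
```

The search over w is a brute-force stand-in for the existence argument; linear independence guarantees that it succeeds, but says nothing about which w works first.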
You may be wondering, "How did anybody ever think of this?" I have no idea. Hilbert is way out of my league.