Let S be well ordered and let I be a proper initial segment. Let z be the least element of S that is not in I. If x < z then x is in I, else z would not be minimal. If x > z and x is in I, then z, lying below x, would also be in I by the definition of an initial segment, which is a contradiction. Thus I is exactly the set of elements below z: every proper initial segment consists of the elements below z for some z in S.
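As a finite illustration only (every finite linearly ordered set is well ordered), the claim can be checked by brute force. This is a sketch, not part of the proof: the helper names is_initial_segment and bounding_element are hypothetical, and S is assumed to be a list of distinct comparable values.

```python
def is_initial_segment(S, I):
    """True if I is a subset of S closed downward: y in I and x < y imply x in I."""
    members = set(I)
    return members <= set(S) and all(
        x in members for y in members for x in S if x < y
    )


def bounding_element(S, I):
    """Return the least element z of S not in I (assumes I is a proper subset of S)."""
    members = set(I)
    return min(x for x in S if x not in members)


S = [2, 3, 5, 7, 11]
# The proper initial segments of a finite chain are its proper prefixes.
for k in range(len(S)):
    I = S[:k]
    assert is_initial_segment(S, I)
    z = bounding_element(S, I)
    # I is exactly the set of elements below z.
    assert set(I) == {x for x in S if x < z}
```

The empty segment is included (k = 0); its bounding element is the least element of S.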
Conversely, assume S is linearly ordered, and every proper initial segment consists of the elements below some z in S. Let T be any nonempty subset of S and let I be the elements that are less than every element of T. Then I is an initial segment, and it is proper because it contains nothing from T. Let z be the bound for I, so that I is the set of elements below z. If x is in T then x is not in I, hence x is at least as large as z. If everything in T exceeds z then z belongs to I, which is impossible, so z is actually a member of T. Now z is the least element of T, and since T was an arbitrary nonempty subset of S, S is well ordered. Within a linearly ordered set, well ordered is equivalent to bounded initial segments, where bounded means the segment is the set of elements below some z.
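The construction in this paragraph can be traced on a finite chain. The sketch below is illustrative only: the function name least_via_segment is hypothetical, and the code calls min on S to locate the bound z, so it follows the steps of the argument rather than proving anything.

```python
def least_via_segment(S, T):
    """Find the least element of a nonempty subset T of S via the segment I."""
    T = set(T)
    # I: the elements of S strictly below every element of T.
    I = {x for x in S if all(x < t for t in T)}
    # z: the element whose predecessors are exactly I; on a finite chain
    # this is the least element of S outside I.
    z = min(x for x in S if x not in I)
    assert I == {x for x in S if x < z}
    return z


S = [1, 2, 3, 4, 5]
T = [3, 5]
z = least_via_segment(S, T)
# z lies in T and nothing in T is smaller.
assert z in T and all(z <= t for t in T)
```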
If T is a proper initial segment of the well ordered set S, there is no isomorphism between T and S. We demonstrated this for ordinals; use the same proof.
Let S and T be arbitrary well ordered sets. We would like to map one into the other, via an isomorphism. In fact we will map the smaller onto an initial segment of the larger.
For any x in S and y in T, define f(x,y) as true if the initial segment below x is isomorphic to the initial segment below y. Verify that f is a valid relation, described by a formula on S cross T. If this relation admits two values of y for a given x, then by transitivity, an initial segment of T is isomorphic to a smaller initial segment of T. We showed (above) that this cannot happen. Thus f is a function, defined on part of S, and we write f(x) = y. By the same argument with S and T exchanged, two values of x cannot share the same y, so f is in fact a 1-1 function.
Suppose f(x1) = y1 and f(x2) = y2, with x1 < x2, and y1 > y2. An isomorphism maps the initial segment below y1 back to the segment below x1, which lies inside the segment below x2, and then forward to a proper initial segment of the segment below y2. Since y2 < y1, this makes the segment below y1 isomorphic to a proper initial segment of itself, which is impossible. Therefore f respects order, and is an isomorphism from its domain onto its range.
By definition, f(x) = y means there is an isomorphism mapping the elements below x to the elements below y. We can restrict this isomorphism to shorter initial segments. Thus f is well defined on elements below x and y. In fact f establishes an initial segment in S (the domain) and an initial segment in T (the range).
Suppose the domain and range are proper initial segments, hence they are bounded by some x in S and some y in T. Now f is an isomorphism between these initial segments, and that means f(x) = y. We can't have x and y as upper bounds, because x is in the domain of f and y is in the range. This contradiction means either the domain is all of S or the range is all of T. One of the sets is mapped onto an initial segment of the other.
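For finite chains the comparison map can be computed directly. The sketch below assumes that two finite initial segments are isomorphic exactly when they have the same number of elements, and uses that count in place of an actual isomorphism test; the function name compare is hypothetical.

```python
def compare(S, T):
    """Pair x in S with y in T when the segments below them are isomorphic.

    For finite chains this is tested by comparing the sizes of the segments.
    """
    S, T = sorted(S), sorted(T)
    f = {}
    for x in S:
        below_x = sum(1 for s in S if s < x)
        for y in T:
            below_y = sum(1 for t in T if t < y)
            if below_x == below_y:
                f[x] = y
    return f


# S is the smaller set, so the domain of f is all of S
# and the range is an initial segment of T.
f = compare([10, 20, 30], ["a", "b", "c", "d", "e"])
assert f == {10: "a", 20: "b", 30: "c"}
```

When S is the larger set, the roles reverse: the range is all of T and the domain is an initial segment of S.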