The cofactor in position i,j is the subdeterminant obtained by deleting row i and column j, times (-1)^(i+j). Take the sum of the entries in a row, or in a column, times their corresponding cofactors, and you get the determinant. That's the definition of det3.
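As an illustration, here is a minimal Python sketch of a determinant computed by cofactor expansion along the first row; the names det and minor are placeholders introduced here, not anything defined earlier in this series.

    def minor(m, i, j):
        # Submatrix of m with row i and column j deleted (0-based indices).
        return [row[:j] + row[j+1:] for k, row in enumerate(m) if k != i]

    def det(m):
        # Cofactor expansion along the first row: each entry times
        # (-1)^(i+j) times the corresponding subdeterminant, summed.
        if len(m) == 1:
            return m[0][0]
        return sum(m[0][j] * (-1) ** j * det(minor(m, 0, j)) for j in range(len(m)))

    print(det([[2, 0, 1], [1, 3, -1], [0, 4, 5]]))   # 42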
The adjoint of a matrix is the transpose of its matrix of cofactors. Consider the product of M and the adjoint of M. When the ith row of M is multiplied by the ith column of the adjoint, entries are multiplied by their corresponding cofactors, and the result is det(M). When the ith row is multiplied by the jth column, for i ≠ j, we are effectively taking the determinant of a matrix with a repeated row, which is always 0. Thus M times adjoint(M) is the identity matrix scaled by the determinant.
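Continuing the sketch above (reusing det and minor), the following hypothetical code builds the cofactor matrix and the adjoint, and checks that M times adjoint(M) is det(M) times the identity.

    def cofactor_matrix(m):
        n = len(m)
        return [[(-1) ** (i + j) * det(minor(m, i, j)) for j in range(n)] for i in range(n)]

    def adjoint(m):
        # Transpose of the matrix of cofactors.
        c = cofactor_matrix(m)
        return [[c[j][i] for j in range(len(m))] for i in range(len(m))]

    def matmul(a, b):
        n = len(a)
        return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

    m = [[2, 0, 1], [1, 3, -1], [0, 4, 5]]
    print(matmul(m, adjoint(m)))   # 42 down the diagonal, 0 everywhere else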
Now suppose M is the coefficient matrix of a system of n linear equations in the n unknowns x1 through xn, with c the column of constants. Consider the first equation, represented by the top row of M and the first element of c. Multiply it through by the cofactor of M obtained when the first row and first column are deleted. Then move to the second equation, represented by the second row, and multiply it through by the cofactor obtained when the second row and first column are deleted. Do this for all n equations, always using cofactors from the first column.
Now add up all n equations. The coefficient on x1 is M1,1 times its cofactor, plus M2,1 times its cofactor, and so on, up to Mn,1 times its cofactor. This is the cofactor expansion of M down the first column, which is simply the determinant of M.
The coefficient on x2 is the sum of the entries in the second column of M times the cofactors from the first column of M. This is the determinant of M with the second column copied onto the first column. That matrix has a duplicate column, so the coefficient drops to 0. The same holds for all the remaining variables.
The constant term is the sum of the constants times the cofactors from the first column. This is the determinant of M with the constants copied onto the first column. The summed equation thus reads det(M) times x1 equals the determinant of M with c in column 1, so x1 is the determinant of M with c in column 1, divided by the determinant of M.
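A quick numeric check of these three facts, again reusing the sketched helpers (det, minor, cofactor_matrix) with 0-based column indices:

    m = [[2, 0, 1], [1, 3, -1], [0, 4, 5]]
    c = [3, 2, 7]
    C = cofactor_matrix(m)

    # Coefficient on x1: first column of M times the cofactors of the first column -> det(M).
    print(sum(m[i][0] * C[i][0] for i in range(3)), det(m))   # 42 42

    # Coefficient on x2: second column of M times the cofactors of the first column -> 0.
    print(sum(m[i][1] * C[i][0] for i in range(3)))           # 0

    # Constant term: c times the cofactors of the first column
    # -> the determinant of M with c copied onto the first column.
    m1 = [[c[i]] + m[i][1:] for i in range(3)]
    print(sum(c[i] * C[i][0] for i in range(3)), det(m1))     # 44 44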
Repeat the above procedure, multiplying each equation by the corresponding cofactor from the second column. Simplify the coefficients and x2 becomes the determinant of M with c in column 2, divided by the determinant of M. This can be repeated for all n variables. This only works when M is nonsingular, i.e. when the determinant of M is nonzero.
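Putting it together, here is a small Cramer's rule sketch that solves for all n variables exactly, using the det helper above and Python fractions; it assumes M is nonsingular.

    from fractions import Fraction

    def cramer(m, c):
        n = len(m)
        d = det(m)
        # x_k is the determinant of m with c copied onto column k, divided by det(m).
        return [Fraction(det([row[:k] + [c[i]] + row[k+1:] for i, row in enumerate(m)]), d)
                for k in range(n)]

    print(cramer([[2, 0, 1], [1, 3, -1], [0, 4, 5]], [3, 2, 7]))
    # [Fraction(22, 21), Fraction(13, 21), Fraction(19, 21)]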
This procedure was developed by Cramer (biography), and is called "Cramer's rule". It is not a practical algorithm for large matrices, though it is often used for matrices up to dimension 3 or 4. It also helps us prove some important theorems. For instance, it places a bound on the solution. If M and c contain integers, the solution can be expressed using rational numbers with det(M) as a common denominator, so no denominator need exceed the absolute value of det(M). Similarly, the inverse of M is an integer matrix divided by det(M).
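The bound is visible in the sketch: every numerator produced by cramer is an integer determinant over det(M), and the inverse built from the adjoint is an integer matrix over det(M). For example (hypothetical helpers as above):

    from fractions import Fraction

    m = [[2, 0, 1], [1, 3, -1], [0, 4, 5]]           # det(m) = 42
    inv = [[Fraction(entry, det(m)) for entry in row] for row in adjoint(m)]
    print(matmul(inv, m))                            # the identity matrix, as fractions
    # Each entry of inv is an integer (an adjoint entry) over 42, and the Cramer
    # solution above has denominators dividing 42 (they reduce to 21).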