Monday, October 18, 2010

Decomposing Affine Transforms

There's no 'perfect' way to decompose a 4x4 affine transform matrix into its completely exact, initial component matrices (i.e., rotation, scale, translation) in all cases. However there are some pretty useful techniques when you know that you're transform only has particular types of component transforms inside it and if you're willing to accept the fact that you won't get an exact version of those transforms back after decomposition. Specifically, the scale and rotation components of the transform will become 'mixed' this is mainly due to the isomorphism between the effects of certain scale operations and rotation operations.

For example: For a scale in 3D space, if any two components are negative and the third is positive, a rotation can be found to exactly mimic the behaviour of the 'flipping' portion of that scale operation. That is, the rotation would mimic the scale exactly if the scale was still a unit scale (e.g., scale along x,y and z was 1, -1, -1). In fact there's a general rule where the sign (+/-) of the scale only matters if the total number of signed components is odd (in which case a rotation cannot accommodate the 'flipping' aspect of the scale).

The following are some required methods for properly analyzing and decomposing a 4x4 affine transform matrix:

This first method gets the determinant of a 4x4 matrix, this is required for detecting whether there's a odd negative scale (i.e., the determinant is less than zero) and also for detecting whether a matrix is invertible (i.e., the determinant is approximately equal to zero).

/**
 * @brief  Get the determinant of this matrix.
 * @return The determinant.
 */
template <typename T> T Matrix4x4<T>::Determinant() const {
#define MINOR(m, r0, r1, r2, c0, c1, c2) \
((m).rows[r0][c0] * ((m).rows[r1][c1] * (m).rows[r2][c2] - (m).rows[r2][c1] * (m).rows[r1][c2]) - \
(m).rows[r0][c1] * ((m).rows[r1][c0] * (m).rows[r2][c2] - (m).rows[r2][c0] * (m).rows[r1][c2]) + \
(m).rows[r0][c2] * ((m).rows[r1][c0] * (m).rows[r2][c1] - (m).rows[r2][c0] * (m).rows[r1][c1]))

    return this->rows[0][0] * MINOR(*this, 1, 2, 3, 1, 2, 3) -
           this->rows[0][1] * MINOR(*this, 1, 2, 3, 0, 2, 3) +
           this->rows[0][2] * MINOR(*this, 1, 2, 3, 0, 1, 3) -
           this->rows[0][3] * MINOR(*this, 1, 2, 3, 0, 1, 2);

    #undef MINOR
}

The next method can determine whether the 4x4 matrix is an appropriate, invertible affine transform or not. If you can afford the extra computation then this ensures that we can even begin to decompose the matrix.
Note that a right-handed 4x4 matrix is said to be affine if it meets the following set of necessary and sufficient conditions:

a) The matrix follows the form



b) There exists an inverse and it is the following matrix:


/**
 * @brief Determine whether this matrix represents an affine transform or not.
 * @return true if this matrix is an affine transform, false if not.
 */
template <typename T> bool Matrix4x4<T>::IsAffine() const {
    // First make sure the bottom row meets the condition that it is (0, 0, 0, 1)
    if (!Vector4D<T>::Equals(this->GetRow(3), Vector4D<T>(0.0, 0.0, 0.0, 1.0))) {
        return false;
    }

    // Get the inverse of this matrix:
    // Make sure the matrix is invertible to begin with...
    if (fabs(this->Determinant()) <= get_error_epsilon<T>()) {
        return false;
    }

    // Calculate the inverse and seperate the inverse translation component 
    // and the top 3x3 part of the inverse matrix
    Matrix4x4<T> inv4x4Matrix = Matrix4x4<T>::Inverse(*this);
    Vector3D<T> inv4x4Translation(inv4x4Matrix.rows[0][3],
                                        inv4x4Matrix.rows[1][3],
                                        inv4x4Matrix.rows[2][3]);
    Matrix3x3<T> inv4x4Top3x3 = Matrix4x4<T>::ToMatrix3x3(inv4x4Matrix);

    // Grab just the top 3x3 matrix
    Matrix3x3<T> top3x3Matrix     = Matrix4x4<T>::ToMatrix3x3(*this);
    Matrix3x3<T> invTop3x3Matrix  = Matrix3x3<T>::Inverse(top3x3Matrix);
    Vector3D<T> inv3x3Translation = -(invTop3x3Matrix * this->GetTranslation());

    // Make sure we adhere to the conditions of a 4x4 invertible affine transform matrix
    if (!Matrix3x3<T>::Equals(inv4x4Top3x3, invTop3x3Matrix)) {
        return false;
    }
    if (!Vector3D<T>::Equals(inv4x4Translation, inv3x3Translation)) {
        return false;
    }

    return true;
}

Under the assumption that we're dealing with an invertible homogeneous 4x4 affine transform matrix, (with both of the above functions considered) we can now safely decompose our matrix into its scale, rotation and translation components:

/**
 * @brief Decomposes the given matrix 'm' into its translation, rotation and scale components.
 * @param m The matrix to decompose.
 * @param translation [in,out] The resulting translation component of m.
 * @param rotation [in,out] The resulting rotation component of m.
 * @param scale [in,out] The resulting scale component of m.
 */
template <typename T> 
void Matrix4x4<T>::Decompose(const Matrix4x4<T>& m, 
                                   Vector3D<T>& translation, 
                                   Matrix4x4<T>& rotation, 
                                   Vector3D<T>& scale) {
    // Copy the matrix first - we'll use this to break down each component
    Matrix4x4<T> mCopy(m);

    // Start by extracting the translation (and/or any projection) from the given matrix
    translation = mCopy.GetTranslation();
    for (int i = 0; i < 3; i++) {
        mCopy.rows[i][3] = mCopy.rows[3][i] = 0.0;
    }
    mCopy.rows[3][3] = 1.0;

    // Extract the rotation component - this is done using polar decompostion, where
    // we successively average the matrix with its inverse transpose until there is
    // no/a very small difference between successive averages
    T norm;
    int count = 0;
    rotation = mCopy;
    do {
        Matrix4x4<T> nextRotation;
        Matrix4x4<T> currInvTranspose = 
          Matrix4x4<T>::Inverse(Matrix4x4<T>::Transpose(rotation));
        
        // Go through every component in the matrices and find the next matrix
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 4; j++) {
                nextRotation.rows[i][j] = static_cast<T>(0.5 * 
                  (rotation.rows[i][j] + currInvTranspose.rows[i][j]));
            }
        }

        norm = 0.0;
        for (int i = 0; i < 3; i++) {
            float n = static_cast<float>(
                         fabs(rotation.rows[i][0] - nextRotation.rows[i][0]) +
                         fabs(rotation.rows[i][1] - nextRotation.rows[i][1]) +
                         fabs(rotation.rows[i][2] - nextRotation.rows[i][2]));
            norm = std::max<T>(norm, n);
        }
        rotation = nextRotation;
    } while (count < 100 && norm > blackbox::bbmath::get_error_epsilon<T>());

    // The scale is simply the removal of the rotation from the non-translated matrix
    Matrix4x4<T> scaleMatrix = Matrix4x4<T>::Inverse(rotation) * mCopy;
    scale = Vector3D<T>(scaleMatrix.rows[0][0],
                        scaleMatrix.rows[1][1],
                        scaleMatrix.rows[2][2]);

    // Calculate the normalized rotation matrix and take its determinant to determine whether
    // it had a negative scale or not...
    Vector3D<T> row1(mCopy.rows[0][0], mCopy.rows[0][1], mCopy.rows[0][2]);
    Vector3D<T> row2(mCopy.rows[1][0], mCopy.rows[1][1], mCopy.rows[1][2]);
    Vector3D<T> row3(mCopy.rows[2][0], mCopy.rows[2][1], mCopy.rows[2][2]);
    row1.Normalize();
    row2.Normalize();
    row3.Normalize();
    Matrix3x3<T> nRotation(row1, row2, row3);

    // Special consideration: if there's a single negative scale 
    // (all other combinations of negative scales will
    // be part of the rotation matrix), the determinant of the 
    // normalized rotation matrix will be < 0. 
    // If this is the case we apply an arbitrary negative to one 
    // of the component of the scale.
    T determinant = nRotation.Determinant();
    if (determinant < 0.0) {
        scale.SetX(scale.GetX() * -1.0);
    }
}

Note the last few lines of the Decompose function. If we find it to be the case that the 'normalized' rotation matrix has a determinant that is less than zero, then we know that there's at least one component of the scale that's negative; however, we cannot know which one. Fortunately that previous sentence is somewhat meaningless anyway, since the original matrix we were given could have been composed of tons of previous affine transforms and the meaning of its 'original' scale becomes a bit of a misnomer. Ultimately what this means is that as long as we know there is a negative scale, we can assign it to any single component of the decomposed scale and still be correct as long as the other decomposed elements are considered along with it.

No comments: