This feature begins by creating a CHMM, which is built from four matrices, \(A, B, C, D\), taken from the original HMM matrix \(H\). \(A\) contains the first 75 percent of the rows of \(H\), \(B\) the last 75 percent, \(C\) the middle 75 percent, and \(D\) the entire original matrix. These are then merged to create the new CHMM \(Z\). From there, the Bigrams feature is calculated as a flattened 20 x 20 matrix \(B\) (here the bigram matrix, not the submatrix \(B\) above), in which \(B[i, j] = \sum_{a = 1}^{L-1} Z_{a, i} \times Z_{a+1, j}\) and \(L\) is the number of rows in \(Z\). Local Average Group (LAG) is then calculated by splitting the CHMM into 20 groups along the length of the protein sequence and summing each column within each group, giving a 1 x 20 vector per group and a vector of length 400 (20 x 20) across all groups. The Bigrams and LAG features are then fused.
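The steps above can be sketched in base R on a toy matrix. This is an illustrative sketch, not the package's implementation. The random matrix `H`, the helper names `bigrams_vec` and `lag_vec_fn`, the use of `floor()` for the 75 percent cut, merging by `rbind()`, the near-equal group sizes, the row-major flattening, and the LAG-then-Bigrams order of the fusion vector are all assumptions made for the example.

```r
# Toy stand-in for a parsed L x 20 HMM matrix H (hypothetical data).
set.seed(1)
H <- matrix(runif(40 * 20), nrow = 40, ncol = 20)

# Assumption: "75 percent" means floor(0.75 * L) consecutive rows.
n75 <- floor(0.75 * nrow(H))
start_mid <- floor((nrow(H) - n75) / 2) + 1
A <- H[1:n75, , drop = FALSE]                             # first 75 percent
B <- H[(nrow(H) - n75 + 1):nrow(H), , drop = FALSE]       # last 75 percent
C <- H[start_mid:(start_mid + n75 - 1), , drop = FALSE]   # middle 75 percent
D <- H                                                    # entire matrix

# Assumption: the four blocks are merged by stacking them row-wise.
Z <- rbind(A, B, C, D)

# Bigrams: entry [i, j] is sum over a of Z[a, i] * Z[a + 1, j].
bigrams_vec <- function(Z) {
  L <- nrow(Z)
  Bg <- crossprod(Z[1:(L - 1), , drop = FALSE], Z[2:L, , drop = FALSE])
  as.vector(t(Bg))  # flatten row by row (assumed ordering)
}

# LAG: split rows into 20 consecutive groups and sum each column per group.
lag_vec_fn <- function(Z, n_groups = 20) {
  n <- nrow(Z)
  grp <- ceiling(seq_len(n) * n_groups / n)  # near-equal group sizes (assumed)
  sums <- rowsum(Z, group = grp)             # n_groups x 20 column sums
  as.vector(t(sums))
}

bg_vec  <- bigrams_vec(Z)
lag_vec <- lag_vec_fn(Z)
fusion  <- c(lag_vec, bg_vec)  # concatenation order is an assumption
```

On this toy input, `bg_vec` and `lag_vec` each have length 400 and `fusion` has length 800, matching the lengths listed under Value.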

Usage

chmm(hmm)

Arguments

hmm

The name of a profile hidden Markov model file.

Value

A fusion vector of length 800.

A LAG vector of length 400.

A Bigrams vector of length 400.

References

An, J., Zhou, Y., Zhao, Y., & Yan, Z. (2019). An Efficient Feature Extraction Technique Based on Local Coding PSSM and Multifeatures Fusion for Predicting Protein-Protein Interactions. Evolutionary Bioinformatics, 15, 117693431987992.

Examples

h <- chmm(system.file("extdata", "1DLHA2-7", package = "protHMM"))