SDWT

A new sequence similarity analysis method based on stationary discrete wavelet transform

Need to install class classification of R packages and matlab
Usage a. Read file function: Reads sequence data according to the parameter familylist and belongs to the family Function prototype : seqsAndLabel = readFile("familylist") input: familylist, This parameter is the input file name, which contains the family file name. Each file contains the corresponding gene sequence. output: seqsAndLabel, When the training set file is read in the classification, the returned output variable includes two columns of vectors, which are two types of different data. The first column is a training set sequence, and the second column is a training set sequence tag When reading the test set file in the classification, the returned output variable is just the test set sequence. When the cluster reads the file, the returned output variable contains two columns of vectors, which are two different types of data. The first column is the sequence set, and the second column is the number of corresponding sequences in each family. b. Feature Constructor: Constructs SDWT features based on the input sequence Function prototype: feature = Feature(K,dat,decomLevel,char,alphabet) input: K: the parameter of k-mer dat: the vector of sequences decomLevel: decomposition level of wavelet transform char: the number of character classes contained in the sequence alphabet: characters included in the sequence output:
featrue: the matrix of feature

c. Training Feature Constructor: Construct SDWT features and training data features based on the input sequence Function prototype: traindata = TrainFeature(K,dat,decomLevel,char,alphabet) input: K: the parameter of k-mer dat: the vector of train sequences decomLevel: decomposition level of wavelet transform char: the number of character classes contained in the sequence alphabet: characters included in the sequence output: trainfeature: the matrix of train feature means: average of k-mer SD: Standard deviation of k-mer.

d. Predictive Feature Constructors: Construct SDWT features and training data features based on the input sequence Function prototype: testfeature = TestFeature(K,dat,decomLevel,center,sde,char,alphabet) input: K: the parameter of k-mer dat: the vector of train sequences decomLevel: decomposition level of wavelet transform char: the number of character classes contained in the sequence alphabet: characters included in the sequence output:
feature: the matrix of test feature

Usage: Classification example:

source("SDWTfunction.R") seqsAndLabel = readFile("traindata/file.lst") testseqs = testReadFile("testdata/file.lst") traindata = TrainFeature(3,seqsAndLabel$seqs,3,4,DNAalphabet) testfeature = TestFeature(3,testseqs,3,traindata$means,traindata$SD,4,DNAalphabet) library(class) predictResult = knn(traindata$trainseqfeature,testfeature,seqsAndLabel$seqLabel,k=1)

Clustering example:

source("SDWTfunction.R") seqsAndLabel = readFile("cluster/file.lst") feature = Feature(3,seqsAndLabel$seqs,3,4,DNAalphabet) KmeansResult = kmeans(feature,100)

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
classification		classification
cluster		cluster
README.md		README.md
Rkmer.so		Rkmer.so
SDWTfunction.r		SDWTfunction.r
SDWTgetFeature.m		SDWTgetFeature.m
testSDWTgetFeature.m		testSDWTgetFeature.m
trainSDWTgetFeature.m		trainSDWTgetFeature.m

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SDWT

About

Releases

Packages

Languages

wjMY/SDWT

Folders and files

Latest commit

History

Repository files navigation

SDWT

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages