Objective

This proposal introduces hierarchical motif vectors for the numerical analysis of sequence motifs and develops a novel framework for the alignment and functional classification of proteins. Hierarchical motif vectors will be computed using multi-scale decompositions of property sequences, which are obtained by converting amino acid sequences into numeric sequences of various amino acid properties. These vectors will capture the variations of amino acid properties in the vicinity of each residue in a given protein sequence. We will develop alignment algorithms that align amino acid sequences by matching their hierarchical motif vectors. We will also use unsupervised statistical learning algorithms to identify hierarchical motif vectors specific to functional protein groups, notably antigen-binding proteins, transcription factors, growth factors, and glycosylation proteins. We will then apply these methods to protein classification, using the overlap scores from the hierarchical motif vector-based sequence alignment as well as the presence and extent of hierarchical motif vectors specific to the protein group under consideration.

We will validate all methods developed in this project against existing sequence alignment, motif detection, and protein classification algorithms in the literature.

Among the innovations of the project is the use of hierarchical motif vectors to characterize local physico-chemical variations along an amino acid sequence. Because motifs are embedded in a vector space, they can be analyzed with general machine learning methods. Next, sequence alignment can be tuned to different amino acid properties at various scales, which improves the potential of alignment-based protein similarity for functional classification.
Furthermore, group-specific hierarchical motif vectors will be identified as those that occur exclusively among the members of a protein group, increasing their likelihood of bearing functional specificity.

Fields of science
natural sciences > biological sciences > biochemistry > biomolecules > proteins
natural sciences > chemical sciences > organic chemistry > amines
natural sciences > computer and information sciences > artificial intelligence > machine learning
natural sciences > mathematics > applied mathematics > numerical analysis

Programme(s)
FP7-PEOPLE - Specific programme "People" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)

Topic(s)
PEOPLE-2007-4-3.IRG - Marie Curie Action: "International Reintegration Grants"

Call for proposal
FP7-PEOPLE-IRG-2008

Funding Scheme
MC-IRG - International Re-integration Grants (IRG)

Coordinator
IZMIR INSTITUTE OF TECHNOLOGY
EU contribution: € 75 000,00
Address: GULBAHCE URLA, 35430 İzmir, Türkiye
Region: Ege, İzmir, İzmir
Activity type: Higher or Secondary Education Establishments
Administrative Contact: Nazife Sahin (Ms.)
Total cost: No data
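
The Objective describes converting an amino acid sequence into a numeric property sequence and then decomposing it at several scales to obtain one vector per residue. The project record does not say which property scales or which decomposition are used, so the following is only a minimal sketch under stated assumptions: it uses the Kyte-Doolittle hydropathy index as the example property and a simple cascade of centered moving averages as a stand-in for the multi-scale decomposition. The function names (`property_sequence`, `smooth`, `motif_vectors`) and the window sizes are hypothetical.

```python
# Kyte-Doolittle hydropathy index, chosen here as one example amino acid property.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def property_sequence(seq, scale=KYTE_DOOLITTLE):
    """Convert an amino acid string into a numeric property sequence."""
    return [scale[aa] for aa in seq.upper()]


def smooth(values, half_width):
    """Centered moving average; the window is truncated at the sequence ends."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def motif_vectors(seq, half_widths=(1, 2, 4)):
    """Return one vector per residue (assumed decomposition, not the project's).

    Each vector holds the detail removed at each successive smoothing level,
    ordered from fine to coarse, followed by the coarsest smoothed value.
    """
    current = property_sequence(seq)
    details = []
    for hw in half_widths:
        smoothed = smooth(current, hw)
        # Detail at this level: what the smoothing step removed.
        details.append([c - s for c, s in zip(current, smoothed)])
        current = smoothed
    return [
        tuple(level[i] for level in details) + (current[i],)
        for i in range(len(current))
    ]


vectors = motif_vectors("MKTAYIAKQR")
print(len(vectors), len(vectors[0]))
```

With this construction the components of each vector sum back to the residue's original property value, so no information is lost by the decomposition. Vectors from two sequences could then be compared position by position, for example with a Euclidean distance, as one possible basis for an alignment score.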