Datasets in Bioinformatics

From DrugPedia: A Wikipedia for Drug discovery

Revision as of 09:15, 3 September 2008 by Ravi
A number of datasets are being created and used in the field of bioinformatics. A dataset contains the vital information on which a prediction server depends for its function. Some of the datasets created or used by the Bioinformatics Centre, Institute of Microbiology, Chandigarh, are as follows:

Datasets for evaluation of beta turn prediction method

The dataset has 426 non-homologous protein chains. No two protein chains in this dataset share more than 25% sequence identity. The structures of these proteins were determined by X-ray crystallography at a resolution of 2.0 Å or better. Each chain contains at least one beta turn.

Complete Dataset

  • Amino acid sequences of the 426 protein chains in FASTA format
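A FASTA file such as the one listed above can be read with a few lines of Python. The following is a minimal sketch, not part of the original dataset distribution: the function name `read_fasta` and the two example records are made up for illustration, and the real file name is not assumed here.

```python
from io import StringIO

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)

# Two made-up records standing in for the real dataset file.
example = StringIO(">chainA\nMKTAYIAK\nQRQIS\n>chainB\nGSHMLE\n")
records = list(read_fasta(example))
print(len(records), "chains read")
```

Applied to the complete dataset, counting the records is a quick way to confirm that all 426 chains were read.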


ProPred-I

The Promiscuous MHC Class-I Binding Peptide Prediction Server

ProPred-I is an online service for identifying MHC Class-I binding regions in antigens. It implements matrices for 47 MHC Class-I alleles, together with proteasomal and immunoproteasomal models. The main aim of the server is to help users identify promiscuous regions, that is, regions predicted to bind several MHC alleles.
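The idea of matrix-based scanning for promiscuous regions can be sketched as follows. This is only an illustration under stated assumptions, not the ProPred-I implementation: the additive scoring, the 9-residue window, the threshold, the allele names `allele_X` and `allele_Y`, and all matrix values are hypothetical.

```python
def score_peptide(peptide, matrix):
    """Sum position-specific scores; residues missing from a column score 0."""
    return sum(col.get(aa, 0.0) for col, aa in zip(matrix, peptide))

def predicted_binders(antigen, matrix, threshold):
    """Return start positions of windows scoring at or above the threshold."""
    length = len(matrix)
    return {
        start
        for start in range(len(antigen) - length + 1)
        if score_peptide(antigen[start:start + length], matrix) >= threshold
    }

def promiscuous_starts(antigen, matrices, threshold, min_alleles=2):
    """Return start positions predicted to bind at least min_alleles alleles."""
    counts = {}
    for matrix in matrices.values():
        for start in predicted_binders(antigen, matrix, threshold):
            counts[start] = counts.get(start, 0) + 1
    return sorted(s for s, n in counts.items() if n >= min_alleles)

def toy_matrix(anchors, length=9):
    """Build a toy matrix (one dict per position) from a sparse anchor table."""
    return [anchors.get(i, {}) for i in range(length)]

# Hypothetical matrices: only positions 2 and 9 (indices 1 and 8) carry weight.
matrices = {
    "allele_X": toy_matrix({1: {"L": 2.0}, 8: {"V": 2.0}}),
    "allele_Y": toy_matrix({1: {"L": 2.0, "M": 2.0}, 8: {"V": 2.0, "L": 2.0}}),
}
antigen = "ALAAAAAAVGMAAAAAAL"  # made-up sequence
print(promiscuous_starts(antigen, matrices, threshold=4.0))
```

In this toy input the window starting at position 0 passes the threshold for both hypothetical alleles, while the window starting at position 9 passes for only one, so only the first is reported as promiscuous.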

Dataset

The two datasets used in developing this server are:

HLA-A*0201

H2-Kb


Matrix Optimization Technique for Predicting MHC binding Core

The X-ray crystal structure of the MHC class II molecule has revealed an open peptide-binding groove. A peptide bound in this groove may extend out of it on either side. Knowing which residues are actually involved in binding is very useful for understanding MHC-peptide interactions. Here, the Matrix Optimization Technique (MOT) is used to predict the MHC binding core. Using binders from MHCPEP together with non-binder data, MOT achieved a classification accuracy of 97 to 99% for the HLA-DR1, HLA-DR2 and HLA-DR5 alleles. This is reported as the highest accuracy of any method. The prediction method used in this server is based on MOT and relies on the idea that binders have unique patterns that can be readily distinguished from those of non-binders.
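Once a scoring matrix is available, locating the binding core inside a longer peptide amounts to scoring every window of core length and keeping the best one. The sketch below shows only this scanning step with a fixed matrix; it does not perform the matrix optimization itself. The function names, the additive scoring, and the matrix values are hypothetical and do not represent a real HLA-DR matrix.

```python
def window_score(window, matrix):
    """Sum position-specific scores; residues missing from a column score 0."""
    return sum(col.get(aa, 0.0) for col, aa in zip(matrix, window))

def predict_core(peptide, matrix):
    """Return (start, core, score) for the best-scoring window of core length."""
    core_len = len(matrix)
    if len(peptide) < core_len:
        raise ValueError("peptide is shorter than the core length")
    candidates = [
        (window_score(peptide[s:s + core_len], matrix), s)
        for s in range(len(peptide) - core_len + 1)
    ]
    # max() keeps the first maximum, so ties resolve to the leftmost window.
    best_score, best_start = max(candidates, key=lambda c: c[0])
    return best_start, peptide[best_start:best_start + core_len], best_score

# Toy 9-column matrix (hypothetical values, one dict per core position).
toy_matrix = [
    {"Y": 3.0, "F": 3.0}, {}, {}, {"L": 1.0}, {},
    {"A": 1.0}, {}, {}, {"L": 1.0},
]
peptide = "GSPKYVKQNTLKLAT"  # example input string
start, core, score = predict_core(peptide, toy_matrix)
print(start, core, score)
```

The residues outside the returned window correspond to the parts of the peptide that extend beyond the groove.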

Dataset

The "Binder" datasets used in this study are:

HLA-DR1

HLA-DR2

HLA-DR5

The "Non-binder" datasets used in this study are:

HLA-DR1

HLA-DR2

HLA-DR