Annotation: Sequences + Information, Manual and Automatic.

Page 1:

Annotation

Sequences → Sequences + information

Manual

Automatic

Page 2:

Reliability of a protein sequence

In decreasing order:

- Known protein
- Known mRNA (translated?)
- Known gene (alternative splicing?)
- Gene predicted by homology with known proteins
- Gene predicted by other methods
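
The ranking above can be treated as an ordered scale. The sketch below is only an illustration: the `SequenceEvidence` enum, its member names, and the toy entries are hypothetical and do not come from any database schema.

```python
from enum import IntEnum


class SequenceEvidence(IntEnum):
    """Hypothetical evidence levels; a lower value means a more reliable sequence."""
    KNOWN_PROTEIN = 1
    KNOWN_MRNA = 2
    KNOWN_GENE = 3
    PREDICTED_BY_HOMOLOGY = 4
    PREDICTED_OTHER = 5


def rank_by_reliability(entries):
    """Sort (name, evidence) pairs from most to least reliable."""
    return sorted(entries, key=lambda item: item[1])


# Toy entries, invented for illustration only.
entries = [
    ("seqA", SequenceEvidence.PREDICTED_OTHER),
    ("seqB", SequenceEvidence.KNOWN_PROTEIN),
    ("seqC", SequenceEvidence.KNOWN_GENE),
]
ranked_names = [name for name, _ in rank_by_reliability(entries)]
print(ranked_names)
```

Using an integer enum keeps the comparison explicit, so filtering on "at least as reliable as a known gene" becomes a plain `<=` test.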

Page 3:

Reliability of a function

In decreasing order:

- Information obtained from the literature by manual methods
- By similarity to proteins of known function
- By other bioinformatics methods

Page 4:

UniProt

5,750,000 proteins

- TrEMBL: 5,400,000, annotated automatically
- Swiss-Prot: 350,000, annotated manually

Page 7:

Fields of a protein database

Protein:

- ID
- ACCESSION
- CREATION DATE
- MODIFICATION DATE
- NAME
- SYNONYMS
- GENE NAME
- SPECIES
- FUNCTION
- SEQUENCE
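
As a rough illustration, the fields listed above could be held in a single record type. This is a minimal sketch: the `ProteinRecord` class, its attribute names, and every value in the example are placeholders invented here, not a real database entry or schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProteinRecord:
    """Hypothetical container mirroring the fields listed on the slide."""
    entry_id: str
    accession: str
    created: str
    modified: str
    name: str
    gene_name: str
    species: str
    function: str
    sequence: str
    synonyms: List[str] = field(default_factory=list)


# Placeholder values only; none of them refer to a real entry.
record = ProteinRecord(
    entry_id="EXAMPLE_ID",
    accession="EXAMPLE_ACCESSION",
    created="01-Jan-2001",
    modified="01-Jan-2001",
    name="Example protein",
    gene_name="exampleGene",
    species="Example species",
    function="Unknown",
    sequence="MKT",
    synonyms=["Example synonym"],
)
print(record.name, len(record.sequence))
```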

Page 8:

Sequence

MDVPCPWYSLLIPLFVFIFLLIHHCFFTTSKKQNMLLLPSPRKLPIIGNLHQLGSLPHRSLHKLSQKYGPVMLLHFGSKPVIVASSVDAARDIMKTHDVVRDIMKTHDVV

- Length: 107 AA
- Weight: 11023 Da
- CRC number: integrity check number, e.g. 0890DD39E1473584
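
To show how an integrity check number of this shape can be derived from a sequence, here is a sketch of a table-driven 64-bit CRC printed as 16 hexadecimal digits. The polynomial (x^64 + x^4 + x^3 + x + 1, in reflected form) and the zero initial value are assumptions made for this sketch, so its output is not guaranteed to match the value a database reports. The helper names and the toy sequence are hypothetical.

```python
def build_crc64_table(poly=0xD800000000000000):
    """Precompute the 256-entry lookup table for a reflected 64-bit CRC."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ poly
            else:
                crc >>= 1
        table.append(crc)
    return table


_CRC64_TABLE = build_crc64_table()


def crc64_hex(sequence):
    """Return the CRC of an ASCII sequence as 16 uppercase hex digits."""
    crc = 0  # assumed initial value
    for byte in sequence.encode("ascii"):
        crc = _CRC64_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return f"{crc:016X}"


toy_sequence = "MKTAYIAKQR"  # invented toy sequence, not a real protein
print(len(toy_sequence), crc64_hex(toy_sequence))
```

The point of such a number is change detection: any edit to the sequence, even a single residue, yields a different checksum.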

Page 9:

Description

- Protein name: Annexin A5
- Synonyms: Annexin V, Lipocortin V, Endonexin II
- EC number: EC 6.3.5.5
- Contains: cleavage products contained in the protein
- Includes: functional domains contained in the protein

Page 10:

Gene Name

- Primary: Phospholipase A2
- Synonym: cPLA2
- Locus name: ordered name on the genome, e.g. XAC2464
- ORF name: temporary name on the genome, e.g. ACAM_3000_MVA_081

Page 11:

Organism

- Common name: Eggplant
- Scientific: Solanum melongena
- Synonym: Aubergine
- Classification: Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons; asterids; lamiids; Solanales; Solanaceae; Solanum
- Taxonomy: NCBI_TaxID=9606

Page 12:

Date

Format: DD-MMM-YYYY, e.g. 01-Jan-2001

- Integrated: creation date
- Entry modified: last modification to the description
- Sequence modified: last modification to the sequence
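
A date in the DD-MMM-YYYY form can be converted to a date object with a small lookup table. This is a minimal sketch that assumes three-letter English month abbreviations; `parse_entry_date` and `MONTHS` are hypothetical names. A fixed table is used instead of `datetime.strptime` with `%b` so that the result does not depend on the system locale.

```python
from datetime import date

# Three-letter English month abbreviations (an assumption of this sketch).
MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def parse_entry_date(text):
    """Parse a DD-MMM-YYYY string into a datetime.date."""
    day, month, year = text.strip().split("-")
    return date(int(year), MONTHS[month.upper()], int(day))


created = parse_entry_date("01-Jan-2001")
print(created.isoformat())
```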

Page 13:

References

- Position: nucleotide/protein sequence, genomic DNA/RNA, X-ray crystallography, variant
- Comment: Strain, Tissue, Plasmid, Species, Transposon, e.g. STRAIN=Liver
- Type: Journal Article, Thesis, Submission, Unpublished
- Author, Title, Year, Journal title, Volume: bibliographic data

Page 14:

Keywords

E.g.: 3D-structure; Alternative splicing; Alzheimer disease; Amyloid; Apoptosis; Cell adhesion; Coated pits; Copper; Direct protein sequencing; Disease mutation; Endocytosis; Glycoprotein; Heparin-binding; Iron; Metal-binding; Notch signaling pathway; Phosphorylation; Polymorphism; Protease inhibitor; Proteoglycan; Serine protease inhibitor; Signal; Transmembrane; Zinc.

Iron

Bind

Page 15:

Organelle

E.g.:

- Hydrogenosome
- Mitochondrion
- Nucleomorph
- Plasmid
- Plastid

Page 16:

Comments

- FUNCTION: Binds to actin and affects the structure of the cytoskeleton. At high concentrations, profilin prevents the polymerization of actin, whereas it enhances it at low concentrations. By binding to PIP2, it inhibits the formation of IP3 and DG.
- ALLERGEN: Causes an allergic reaction in human. Minor allergen of bovine dander.
- ALTERNATIVE PRODUCTS: Event=Alternative initiation; Comment=2 isoforms, Alpha and Beta, are produced by alternative initiation;
- BIOPHYSICOCHEMICAL PROPERTIES: Kinetic parameters: KM=98 uM for ATP; KM=688 uM for pyridoxal; Vmax=1.604 mmol/min/mg enzyme; pH dependence: Optimum pH is 6.0. Active from pH 4.5 to 10.5;
- CATALYTIC ACTIVITY: ATP + L-glutamate + NH(3) = ADP + phosphate + L-glutamine.
- COFACTOR: Pyridoxal phosphate.
- DEVELOPMENTAL STAGE: Expressed early during conidial (dormant spores) differentiation.
- DISEASE: Defects in PHKA1 are linked to X-linked muscle glycogenosis [MIM:311870]. It is a disease characterized by slowly progressive, predominantly distal muscle weakness and atrophy.
- DOMAIN: Contains a coiled-coil domain essential for vesicular transport and a dispensable C-terminal region.
- PATHWAY: Porphyrin biosynthesis by the C5 pathway; second step.
- PHARMACEUTICAL: Available under the name Proleukin (Chiron). Used in patients with renal cell carcinoma or metastatic melanoma.
- POLYMORPHISM: The allelic form of the enzyme with Gln-191 (Allozyme A) hydrolyzes paraoxon with a low turnover number and the one with Arg-191 (Allozyme B) with a high turnover number.
- PTM: N-glycosylated and probably also O-glycosylated.
- RNA EDITING: Modified_positions=393, 431, 452, 495.
- TISSUE SPECIFICITY: Shoots, roots, and cotyledon from dehydrating seedlings.
- TOXIC DOSE: PD(50) is 1.72 mg/kg by injection in blowfly larvae.
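
Each comment above follows a "TOPIC: free text" pattern, which makes it easy to split programmatically. The sketch below assumes one comment per string and an upper-case topic made of letters and spaces; `parse_comment` and `TOPIC_RE` are hypothetical names, not part of any library.

```python
import re

# An upper-case topic (letters and spaces), then ": ", then the free text.
TOPIC_RE = re.compile(r"^([A-Z][A-Z ]*[A-Z]): (.*)$")


def parse_comment(line):
    """Split 'TOPIC: text' into (topic, text); topic is None if absent."""
    stripped = line.strip()
    match = TOPIC_RE.match(stripped)
    if match is None:
        return None, stripped
    return match.group(1), match.group(2)


topic, text = parse_comment("PTM: N-glycosylated and probably also O-glycosylated.")
print(topic, "|", text)
```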

Page 17:

Comments

ALLERGEN Information relevant to allergenic proteins

ALTERNATIVE PRODUCTS Description of the existence of related protein sequence(s) produced by alternative splicing of the same gene or by the use of alternative initiation codons; see 3.20.15

BIOPHYSICOCHEMICAL PROPERTIES Description of the information relevant to biophysical and physicochemical data and information on pH dependence, temperature dependence, kinetic parameters, redox potentials, and maximal absorption; see 3.20.8

BIOTECHNOLOGY Description of the use of a specific protein in a biotechnological process

CATALYTIC ACTIVITY Description of the reaction(s) catalyzed by an enzyme

COFACTOR Description of any non-protein substance required by an enzyme for its catalytic activity

DEVELOPMENTAL STAGE Description of the developmentally-specific expression of mRNA or protein

DISEASE Description of the disease(s) associated with a deficiency of a protein

DOMAIN Description of the domain structure of a protein

ENZYME REGULATION Description of an enzyme regulatory mechanism

FUNCTION General description of the function(s) of a protein

INDUCTION Description of the compound(s) or condition(s) that regulate gene expression

INTERACTION Conveys information relevant to binary protein-protein interaction; see 3.20.12

MASS SPECTROMETRY Reports the exact molecular weight of a protein or part of a protein as determined by mass spectrometric methods; see 3.20.23

PATHWAY Description of the metabolic pathway(s) with which a protein is associated

PHARMACEUTICAL Description of the use of a protein as a pharmaceutical drug

POLYMORPHISM Description of polymorphism(s)

RNA EDITING Description of any type of RNA editing that leads to one or more amino acid changes

SIMILARITY Description of the similarities (sequence or structural) of a protein with other proteins

SUBCELLULAR LOCATION Description of the subcellular location of the mature protein

SUBUNIT Description of the quaternary structure of a protein and any kind of interactions with other proteins or protein complexes, except for receptor-ligand interactions, which are described in the topic FUNCTION.

TISSUE SPECIFICITY Description of the tissue-specific expression of mRNA or protein

TOXIC DOSE Description of the lethal dose (LD), paralytic dose (PD) or effective dose of a protein

Page 18:

Features

- Feature type
- Start-End range: e.g. 23-61
- Description: depends on the feature type

Feature types:

- INIT_MET - Initiator methionine.
- SIGNAL - Extent of a signal sequence (prepeptide).
- PROPEP - Extent of a propeptide.
- TRANSIT - Extent of a transit peptide (mitochondrion, chloroplast, thylakoid).
- CHAIN - Extent of a polypeptide chain in the mature protein.
- PEPTIDE - Extent of a released active peptide.
- TOPO_DOM - Topological domain.
- TRANSMEM - Extent of a transmembrane region.
- DOMAIN - Specific combination of secondary structures.
- NON_TER - The residue at an extremity of the sequence is not the terminal residue.
- REPEAT - Extent of an internal sequence repetition.
- CA_BIND - Extent of a calcium-binding region.
- ZN_FING - Extent of a zinc finger region.
- DNA_BIND - Extent of a DNA-binding region.
- NP_BIND - Extent of a nucleotide phosphate-binding region.
- REGION - Extent of a region of interest in the sequence (e.g. hydrophobic).
- COILED - Extent of a coiled-coil region.
- MOTIF - Short (up to 20 amino acids) sequence motif of biological interest.
- COMPBIAS - Extent of a compositionally biased region.
- ACT_SITE - Amino acid(s) involved in the activity of an enzyme.
- METAL - Binding site for a metal ion.
- BINDING - Binding site for any chemical group (co-enzyme, prosthetic group, etc.).
- SITE - Any interesting single amino-acid site, not defined.
- SE_CYS - Selenocysteine.
- MOD_RES - Posttranslational modification of a residue.
- LIPID - Lipid binding.
- CARBOHYD - Glycosylation site.
- DISULFID - Disulfide bond.
- CROSSLNK - Posttranslationally formed amino acid bonds.
- VARSPLIC - Description of sequence variants produced by alternative splicing.
- VARIANT - Authors report that sequence variants exist.
- MUTAGEN - Site which has been experimentally altered.
- UNSURE - Uncertainties in the sequence.
- CONFLICT - Different sources report differing sequences.
- NON_CONS - Non-consecutive residues.
- HELIX - Secondary structure.
- STRAND - Secondary structure.
- TURN - Secondary structure.
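
A feature is essentially a type plus a start-end range over the sequence. The sketch below shows how a range such as 23-61 maps onto a Python slice, assuming positions are 1-based and inclusive at both ends. The `Feature` class, the helper names, and the toy sequence are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Hypothetical feature record; positions are 1-based and inclusive."""
    kind: str
    start: int
    end: int
    description: str = ""


def parse_range(text):
    """Turn a 'start-end' string such as '23-61' into two integers."""
    start, end = text.split("-")
    return int(start), int(end)


def feature_subsequence(sequence, feature):
    """Return the residues covered by the feature."""
    # Convert the 1-based inclusive range to a 0-based half-open slice.
    return sequence[feature.start - 1:feature.end]


start, end = parse_range("23-61")
feature = Feature("TRANSMEM", start, end, "toy example")
toy_sequence = "A" * 22 + "L" * 39 + "K" * 10  # invented, not a real protein
segment = feature_subsequence(toy_sequence, feature)
print(len(segment))
```

The off-by-one conversion is the part most worth getting right: a range of 23-61 covers 39 residues, not 38.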

Page 19:

Identifiers

Entry name

- Swiss-Prot: abbreviation + species, e.g. B2MG_HUMAN
- TrEMBL: accession number + species, e.g. O95417_MOUSE

Accession number, e.g. Q1AAA9

- Primary: the one to use
- Secondary: list of previous numbers that are no longer in use
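
The two entry-name styles can be told apart by checking whether the part before the underscore looks like an accession number. This is a sketch under assumptions: the regular expression is the accession pattern as commonly documented for UniProt, quoted from memory and not confirmed by the slides, and `split_entry_name` is a hypothetical helper.

```python
import re

# Assumed accession pattern (6 or 10 characters); not taken from the slides.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def split_entry_name(entry_name):
    """Split 'PREFIX_SPECIES' and guess which naming style the prefix uses."""
    prefix, species = entry_name.split("_", 1)
    if ACCESSION_RE.fullmatch(prefix):
        style = "TrEMBL-style"
    else:
        style = "Swiss-Prot-style"
    return prefix, species, style


print(split_entry_name("B2MG_HUMAN"))
print(split_entry_name("O95417_MOUSE"))
```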

Page 20:

Cross references

70 databases

E.g. EMBL; AJ297977; CAC17465.1; -; Genomic_DNA.

- Database: name of the database
- Id: primary identifier in the database
- Other Ids: additional information
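
A cross-reference line of this shape can be split into its parts with a few string operations. This is a minimal sketch assuming semicolon-separated fields and a single trailing period; `parse_cross_reference` and the dictionary keys are hypothetical names chosen for illustration.

```python
def parse_cross_reference(line):
    """Split 'Database; Id; other; fields.' into a small dictionary."""
    # Drop the trailing period, then split on the semicolon separators.
    body = line.strip().rstrip(".")
    fields = [part.strip() for part in body.split(";")]
    database, primary_id, *others = fields
    return {"database": database, "id": primary_id, "other": others}


xref = parse_cross_reference("EMBL; AJ297977; CAC17465.1; -; Genomic_DNA.")
print(xref)
```

Only the final period is stripped, so version suffixes inside a field such as `CAC17465.1` are preserved.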