©2003/04 Alessandro Bogliolo Bioinformatica Corso di Laurea Specialistica in Biotecnologie Anno...
©2003/04 Alessandro Bogliolo
Istituto di Scienze e Tecnologie dell'Informazione (STI)
Bioinformatics
Master's Degree Course (Laurea Specialistica) in Biotechnology
Academic Year 2003/04
Alessandro Bogliolo
STI - University of Urbino
61029 Urbino - Italy
Bioinformatics
When Biology Meets Informatics
Outline
1. Computer systems vs Biological systems
2. Computational biology: using computers in biology
   1. Tools
   2. Applications
   3. Example: comparing genetic sequences
3. Bio-computing: using biology to inspire computers
   1. Models
   2. Applications
   3. Examples: cellular automata, DNA computing
4. The big picture: closing the loop
5. Research interests and open issues
Computer systems vs Biological systems
1. Determinism
   • Computer systems are deterministic in nature; biological systems are not
2. Size
   • Atoms (10^-10 m), molecules (10^-9 m), transistors (10^-7 m), cells (10^-6 m)
   • Genome (10^9 bp), chip (10^8 transistors)
3. Speed
   • Mean time between molecular collisions in liquids (10^-12 s)
   • Propagation time through a gate (10^-11 s)
   • CPU clock cycle (10^-10 s), RAM access (10^-8 s, 10^9 B/s), HD access (10^-3 s, 10^7 B/s)
   • Processing time of a neuron (10^-5 s)
   • DNA hybridization time (10^2 s), PCR cycle time (10^2 s)
4. Memory
   • RAM bits (10^9), HD bits (10^12), neurons (10^11), DNA base pairs (10^9)
5. Parallelism
   • CPU (10^0 to 10^2 units), RAM/HD (10^2 bit), human brain (10^11)
6. Observability and controllability
   • Computer systems are much more observable and controllable than biological systems
Computer systems in biology: computation tools
Noise filtering
• Experimental data are always affected by noise that may hide useful information
• Noise can be selectively removed based on its time-domain or frequency-domain statistics
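As an illustration of time-domain noise filtering, the sketch below smooths a sampled signal with a centered moving average. The function name `moving_average` and the `window` parameter are assumptions made for this example; the slides do not prescribe a specific filter.

```python
def moving_average(samples, window=3):
    """Smooth a signal with a centered moving average (a simple low-pass filter)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        # Near the borders the window is truncated to the available samples.
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        neighbourhood = samples[lo:hi]
        smoothed.append(sum(neighbourhood) / len(neighbourhood))
    return smoothed
```

A single noisy spike in an otherwise flat signal is spread over its neighbours and reduced in amplitude; a wider window removes more noise but also blurs genuine fast changes.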
Decoding
• Experimental data need to be interpreted according to rules or experience
• Software tools may automatically decode data to extract information
String comparison, database management, clustering and classification
• Most analyses are based on comparison
• Most data can be represented as strings: sequences of symbols taken from a finite alphabet (e.g., {A,C,G,T})
  …ACCTGCCTTTCAG…
• String comparison is a key problem in biology
• A metric is required (e.g., edit distance, which counts the edit operations needed to turn the Query into the Entry):
  Query …ACTGCATTTCCAG…
        …ACCTGCATTTCCAG… (after inserting C)
        …ACCTGCCTTTCCAG… (after replacing A with C)
        …ACCTGCCTTTCAG… (after deleting C)
  Entry …ACCTGCCTTTCAG…
Pattern recognition and image processing
• Most data analysis techniques are traditionally based on graphical representations to be interpreted by an expert scientist
• Software tools that automate the analysis of graphical data mimic the human capability of recognizing application-specific shapes and patterns
Statistical analysis
• Statistical analysis is required to capture the nondeterministic behavior of most natural phenomena and experimental procedures
• The interpretation of a set of data is always subject to uncertainty
• Statistical analysis tools extract the statistical properties of a data set and estimate the confidence level of measured or estimated parameters
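A minimal sketch of estimating a parameter together with its confidence: the mean of repeated measurements and an approximate 95% confidence interval. The function name and the normal-approximation factor `z=1.96` are assumptions of this example; for small samples a Student's t quantile would be more appropriate.

```python
import math
import statistics

def mean_confidence_interval(values, z=1.96):
    """Return (mean, lower, upper) using a normal approximation of the sampling error."""
    mean = statistics.mean(values)
    # Standard error of the mean: sample standard deviation / sqrt(sample size).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - z * sem, mean + z * sem
```

The interval shrinks as more measurements are collected, which is how repeated experiments increase confidence in an estimated parameter.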
Simulation, modeling, analysis
[Figure: hypothesis validation. Labels: biological system, model, modeling, experiment, simulation, results, comparison.]
[Figure: indirect measure. Labels: biological system, model, characterization, experiment, simulation, results, fitting.]
Computer systems in biology: tools and applications
Tools: noise filtering, decoding, string comparison, simulation, database management, pattern recognition, image processing, clustering/classification, modeling, analysis, statistical analysis
Application areas: neurophysiology, proteomics, human kinetics, genomics
Application tasks shown in the diagram: pattern analysis, signal acquisition, code discovery, microscope analysis, neuronal simulation, structural analysis, DB search, 2D E.P., MS-MS, microarray analysis, reward matrix, protein folding, signal analysis, movement analysis, biometrics, sequencing, base calling, alignment, edit costs, phylogenesis
String comparison: edit distance
Query: ACGTCC
Entry: AGACAC
[Figure: edit matrix with the Query characters (A C G T C C) and the Entry characters (A G A C A C) along its two axes. The highlighted path uses four matches (cost 0 each) and one insert, one replace and one delete (cost 1 each); its cumulative cost is 0, 1, 1, 2, 2, 3, 3.]
• Edit operations (match, mismatch, insert, delete) performed on the Query string to match the Entry string can be represented as 1-step moves on an N×M matrix
• A sequence of edit operations is a path from the first entry (1,1) to the last entry (N,M) of the matrix
• Each move is associated with a cost
String comparison: edit distance
[Figure: edit matrix for Query ACGTCC and Entry AGACAC, filled with the partial edit distances; the highlighted path ends in the last entry with a cost of 3.]
• Entry (i,j) represents the edit distance between the first i characters of the Query and the first j characters of the Entry
• Entry (N,M) represents the global edit distance
• Entry (i,j) can be incrementally computed from entries (i-1,j), (i-1,j-1) and (i,j-1)
• The algorithmic complexity of edit distance computation is O(N×M)
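The incremental computation can be sketched as a small dynamic-programming routine. This is one common formulation, not necessarily the one used on the slides: it assumes unit costs for insert, delete and replace, and it adds an extra row and column for the empty prefix, so the matrix has (N+1)×(M+1) entries instead of starting at (1,1).

```python
def edit_distance(query, entry):
    """Edit distance with unit costs, filled in one matrix entry at a time."""
    n, m = len(query), len(entry)
    # dist[i][j] is the edit distance between query[:i] and entry[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete all i characters of the query prefix
    for j in range(1, m + 1):
        dist[0][j] = j  # insert all j characters of the entry prefix
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = 0 if query[i - 1] == entry[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                 # vertical move
                dist[i][j - 1] + 1,                 # horizontal move
                dist[i - 1][j - 1] + substitution,  # diagonal move: match or replace
            )
    return dist[n][m]
```

The two nested loops visit each entry once, which is where the O(N×M) complexity comes from. For the slide's example, `edit_distance("ACGTCC", "AGACAC")` evaluates to 3.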
Bio-inspired computer systems: biological models
Species evolution
Starting from an initial population, each generation evolves according to three main mechanisms:
• Natural selection: each individual survives and procreates based on its fitness
• Inheritance: the genome of each child is a crossover of its parents' genomes
• Mutation: random mutations may occur during each individual's lifetime
The population statistically evolves in a direction that increases the average fitness
Reinforcement learning
• The firing rate of each neuron depends on the weighted average of the firing rates of its input neurons
• Weights represent the strength of the synaptic link between two neurons
• The weight of the synaptic link between two neurons is dynamically adjusted based on their activity
  – The weight is increased if the activities of the two neurons are positively correlated
  – The weight is reduced if the activities of the two neurons are negatively correlated
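The two rules above can be sketched as follows. The update shown is a Hebbian-style rule chosen for illustration; the function names, the `learning_rate` parameter and the convention that activities are expressed as deviations from their mean are all assumptions of this example.

```python
def firing_rate(weights, input_rates):
    """Firing rate as the weighted average of the input firing rates."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * r for w, r in zip(weights, input_rates)) / total

def update_weight(weight, pre_activity, post_activity, learning_rate=0.1):
    """Strengthen the link when the two activities have the same sign, weaken it otherwise.

    Activities are assumed to be deviations from their mean, so their product
    is positive for correlated activity and negative for anti-correlated activity.
    """
    return weight + learning_rate * pre_activity * post_activity
```

With activities of the same sign the weight grows, with opposite signs it shrinks, which mirrors the two cases listed on the slide.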
DNA hybridization
• Double-stranded DNA is composed of two complementary strands of DNA combined according to Watson-Crick base-pairing:
  – A-T
  – C-G
• Complementary bases form hydrogen bonds, while non-complementary bases do not
• Hybridization between single-stranded DNA segments is highly selective:
  – The bond between complementary strands is much stronger than between non-complementary ones
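Base pairing can be modelled on strings with a lookup table. The sketch below counts how many positions of two strands form Watson-Crick pairs when the second strand is aligned antiparallel to the first, as happens in double-stranded DNA. The function names are assumptions of this example, and the pair count is only a crude stand-in for bond strength.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Strand that pairs perfectly with `strand` (read in the opposite direction)."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def paired_bases(strand_a, strand_b):
    """Count Watson-Crick pairs when strand_b is aligned antiparallel to strand_a."""
    return sum(
        1
        for a, b in zip(strand_a, reversed(strand_b))
        if COMPLEMENT[a] == b
    )
```

A fully complementary pair of strands pairs at every position, while a single mismatching base lowers the count by one.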
Human kinetics
• Actuators
  – Muscles (synergist and antagonist)
• Low-level sensors
  – Golgi tendon organ
  – Muscle spindle
• High-level sensors
  – Sense organs
• Low-level and high-level feedback
• Individual peculiarities
Cellular tissues
Tissues are composed of cells that:
• Are all of the same nature
• Perform the same elementary task
• Share the same genome
• Are locally connected to their neighbors
Bio-inspired computer systems: models and applications
Models: species evolution, reinforcement learning, DNA hybridization, human kinetics, cellular tissues
Applications:
• Memory systems: associative memories
• Artificial intelligence: genetic algorithms, neural networks
• Parallel systems: cellular automata, DNA computation
• Virtual reality: motion tracking
• Robotics: anthropomorphic robots, artificial vision, adaptive control
Virtual reality: motion tracking
• The most natural interfaces of virtual reality are based on sensors that track the movements of the user
• Motion tracking is mainly used in this context to reproduce human movements in the virtual environment and to associate virtual actions with them
• To some extent, most input devices of PCs (keyboard, mouse, ...) can be viewed as motion tracking systems
Genetic algorithms
• Find heuristic solutions to hard-to-solve problems
• Represent each solution as an individual’s genome
• Define a fitness function to evaluate the quality of a solution
• Start from a random population
• Evolve the population by means of selection, crossover and mutation
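The steps above can be sketched as a small genetic algorithm over bit-string genomes. Everything specific here is an assumption for illustration: the function name `evolve`, the truncation selection that keeps the fitter half, the single-point crossover, the parameter defaults and the fixed `seed` used to make runs repeatable.

```python
import random

def evolve(fitness, genome_length, population_size=20, generations=50,
           mutation_rate=0.05, seed=0):
    """Evolve bit-string genomes (genome_length >= 2) and return the fittest one."""
    rng = random.Random(seed)
    # Start from a random population.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: the fitter half survives unchanged and becomes the parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            # Crossover: the child takes a prefix from one parent, the rest from the other.
            cut = rng.randint(1, genome_length - 1)
            child = mother[:cut] + father[cut:]
            # Mutation: each bit flips with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

For example, `evolve(sum, 12)` searches for a 12-bit genome with as many ones as possible. Because the fittest individuals always survive, the best fitness in the population never decreases from one generation to the next.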
Neural networks
• The output of each neuron is a threshold function of the weighted average of its inputs
• A network of interconnected neurons has primary inputs and primary outputs
• The realized I/O function depends on the weights
• Weights are automatically adjusted during a training period
• In this way the NN learns its functionality from experience
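A single threshold neuron and its training loop can be sketched as below. This uses the classic perceptron learning rule as one possible training scheme, with a weighted sum plus a bias term instead of a normalized average; the function names and the `epochs` and `learning_rate` parameters are assumptions of this example.

```python
def neuron_output(weights, bias, inputs):
    """Threshold function applied to the weighted sum of the inputs."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0

def train(samples, epochs=20, learning_rate=0.1):
    """Adjust weights from (inputs, target) examples with the perceptron rule."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # The error is non-zero only when the neuron gives the wrong answer.
            error = target - neuron_output(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias
```

Trained on the four input/output pairs of a logical AND, the neuron ends up reproducing the AND function without it ever being programmed explicitly. A single neuron of this kind can only learn linearly separable functions; networks of neurons are needed for more complex ones.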
DNA computation
• Parallel exploration of large design spaces
• Use DNA strands to encode solutions
• Generate DNA strands representing all possible solutions
• Select DNA strands representing good solutions (or best solutions) according to a specific goal
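The generate-and-select scheme can be mimicked in software. In the sketch below each candidate solution is a bit vector encoded as a strand, all strands are generated, and a predicate plays the role of the selection step. The bit-to-base encoding (0 as A, 1 as C) and the function names are hypothetical, chosen only for this example; in a real DNA computer the strands would be generated and filtered chemically and in parallel rather than by a sequential loop.

```python
from itertools import product

# Hypothetical encoding of one bit per base.
BASE_FOR_BIT = {0: "A", 1: "C"}
BIT_FOR_BASE = {"A": 0, "C": 1}

def all_strands(n_bits):
    """Generate one strand for every possible n-bit solution."""
    return ["".join(BASE_FOR_BIT[bit] for bit in bits)
            for bits in product((0, 1), repeat=n_bits)]

def select(strands, is_good):
    """Keep only the strands whose decoded solution satisfies the goal."""
    return [strand for strand in strands
            if is_good([BIT_FOR_BASE[base] for base in strand])]
```

For instance, selecting the 3-bit solutions with exactly two bits set keeps the strands ACC, CAC and CCA out of the eight generated ones.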
Parallel systems: cellular automata
• Large arrays (or matrices) of locally-connected elementary units
• Assign to each unit elementary independent tasks
• Make as many units as possible work in parallel
• Use parallelism to reduce computation time
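A one-dimensional cellular automaton illustrates the idea: every cell performs the same elementary task using only its two neighbours, so all cells could be updated at once. The local rule used here (XOR of the two neighbours, known as rule 90) and the zero boundary cells are assumptions of this example; the loop below simulates sequentially what parallel hardware would do in a single step.

```python
def step(cells):
    """One synchronous update: each cell becomes the XOR of its two neighbours."""
    # Cells outside the array are treated as 0.
    padded = [0] + list(cells) + [0]
    return [padded[i - 1] ^ padded[i + 1] for i in range(1, len(padded) - 1)]
```

Starting from a single active cell, `step([0, 0, 1, 0, 0])` gives `[0, 1, 0, 1, 0]`. Since each new value depends only on the previous state of the neighbours, the computation time per step does not grow with the size of the array when one processing unit is available per cell.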
Associative memories
Content-addressable memories
• Retrieve a memory cell based on its content rather than on its address or position
• Neural networks may work as associative memories since they can learn how to retrieve data independently of their position
• DNA computers may work as associative memories if DNA strands are used to represent data. In this case, a complementary strand can be used to retrieve (by means of hybridization) the elements with the desired content.
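Retrieval by hybridization can be sketched with strings: the memory is a list of strands, and a probe retrieves every strand that contains the segment complementary to it. The function names and the sample strands are hypothetical, and substring matching is only a rough model of where a probe would bind.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def probe_for(content):
    """Probe strand complementary to the desired content (antiparallel)."""
    return "".join(COMPLEMENT[base] for base in reversed(content))

def retrieve(memory, probe):
    """Return the stored strands that contain the segment the probe binds to."""
    # The reverse complement of the probe is the content it hybridizes with.
    wanted = probe_for(probe)
    return [strand for strand in memory if wanted in strand]
```

The lookup never uses a position or an address: a strand is found because of what it contains, which is the defining property of a content-addressable memory.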
Adaptive control
• Both genetic algorithms and neural networks may be used to implement automatic control/driving systems that learn from experience or self-adapt to the environment
Artificial vision
• Neural networks are often used in artificial vision applications for their inherent clustering capabilities
• They can be trained to recognize:
– The same object from different points of view
– The same handwritten character
– Objects with the same shape
– …
Robotics
• Most robots mimic (part of) the human body and its movements
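Cellular automata (Edit distance)
• Edit distance computation on a sequential computer requires NxM steps (one for each matrix entry)
• At each step the partial edit distance is incrementally computed as:
$$
ED(i,j) = \min \begin{cases}
ED(i-1,j) + C(\text{delete}) \\
ED(i-1,j-1) + C(\text{replace}) \cdot [\text{Query}(i) \neq \text{Entry}(j)] \\
ED(i,j-1) + C(\text{insert})
\end{cases}
$$
where the bracket is 1 if the two symbols differ and 0 otherwise.
[Figure: 6x6 matrix of partial edit distances for Query: ACGTCC (columns) and Entry: AGACAC (rows); the highlighted alignment path carries the values 0, 1, 1, 2, 2, 3, 3, so the edit distance is 3]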
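The recurrence above can be sketched as a sequential dynamic-programming routine. This is a minimal sketch, assuming unit costs for insert, delete, and replace (the slide leaves the cost function C unspecified); the function name and the border initialization are illustrative choices.

```python
def edit_distance_matrix(query, entry, c_insert=1, c_delete=1, c_replace=1):
    """Fill the (N+1) x (M+1) table of partial edit distances, one cell per step."""
    n, m = len(query), len(entry)
    ed = [[0] * (m + 1) for _ in range(n + 1)]
    # Borders: aligning a prefix against the empty string.
    for i in range(1, n + 1):
        ed[i][0] = ed[i - 1][0] + c_delete
    for j in range(1, m + 1):
        ed[0][j] = ed[0][j - 1] + c_insert
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = query[i - 1] != entry[j - 1]
            ed[i][j] = min(
                ed[i - 1][j] + c_delete,
                ed[i - 1][j - 1] + (c_replace if mismatch else 0),
                ed[i][j - 1] + c_insert,
            )
    return ed


matrix = edit_distance_matrix("ACGTCC", "AGACAC")
distance = matrix[-1][-1]  # 3 with unit costs
```

The two nested loops make the NxM step count explicit: every interior cell is visited exactly once.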
Cellular automata (Edit distance)
• This functionality can be implemented by an elementary computational unit (i.e., a cell) locally connected to similar cells
[Figure: the edit-distance matrix for Query: ACGTCC and Entry: AGACAC, with cell (i,j) connected to its neighbors (i-1,j), (i-1,j-1), and (i,j-1)]
Cellular automata (Edit distance)
• Each cell can compute as soon as its inputs are ready
• A matrix of NxM elementary cells can compute edit distance in N+M steps (rather than in NxM steps)
[Figure: the edit-distance matrix for Query: ACGTCC and Entry: AGACAC]
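The parallel schedule can be imitated in software by sweeping the matrix one anti-diagonal at a time: all cells with the same i + j depend only on earlier anti-diagonals, so a cell array could update them simultaneously. The sketch below assumes unit costs and non-empty strings, and it only counts the wavefront steps; it does not model real hardware. For an NxM matrix it takes N+M-1 steps, in line with the N+M order stated above.

```python
def wavefront_edit_distance(query, entry):
    """Return (edit distance, number of wavefront steps) with unit costs."""
    n, m = len(query), len(entry)
    ed = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        ed[i][0] = i
    for j in range(m + 1):
        ed[0][j] = j
    steps = 0
    # d = i + j indexes one anti-diagonal of interior cells.
    for d in range(2, n + m + 1):
        # Every cell in this inner loop is independent of the others,
        # so a matrix of cells could compute them all in one step.
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            cost = 0 if query[i - 1] == entry[j - 1] else 1
            ed[i][j] = min(
                ed[i - 1][j] + 1,
                ed[i - 1][j - 1] + cost,
                ed[i][j - 1] + 1,
            )
        steps += 1
    return ed[n][m], steps


result = wavefront_edit_distance("ACGTCC", "AGACAC")  # (3, 11)
```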
DNA computing (Traveling salesman)
[Figure: a graph with vertices a to g, shown four times: the plain graph and three highlighted paths starting at b (b-c-d, b-c-f-e-d, and b-c-f-g-a-e-d). Only the last path visits every vertex.]
• A Hamiltonian path is a path that visits every vertex in a graph exactly once
• Finding a Hamiltonian path is NP-hard (the search space grows exponentially with the number of vertices)
• A computer algorithm traverses the decision tree sequentially
[Figure: decision tree rooted at b, with branches b-c-d, b-c-f-e-d, and b-c-f-g-a-e-d]
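A sequential traversal of the decision tree can be sketched as a depth-first search with backtracking. The edge set below is an assumption: it contains only the directed edges implied by the three paths shown in the figure, and the actual graph on the slide may have more. The function name is hypothetical.

```python
def hamiltonian_paths(graph, path):
    """Yield every extension of `path` that visits each vertex exactly once."""
    if len(path) == len(graph):
        yield list(path)
        return
    for nxt in graph[path[-1]]:
        if nxt not in path:
            path.append(nxt)          # descend one level in the decision tree
            yield from hamiltonian_paths(graph, path)
            path.pop()                # backtrack


# Directed edges inferred from the paths b-c-d, b-c-f-e-d, b-c-f-g-a-e-d.
GRAPH = {
    "a": ["e"],
    "b": ["c"],
    "c": ["d", "f"],
    "d": [],
    "e": ["d"],
    "f": ["e", "g"],
    "g": ["a"],
}

found = list(hamiltonian_paths(GRAPH, ["b"]))
```

Starting from b, the search first tries b-c-d and b-c-f-e-d, which end before all vertices are visited, and then reaches b-c-f-g-a-e-d.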
DNA computing (Traveling salesman)
• Representation:
– each vertex is represented by a sequence of (say) 20 bases
– each edge is represented by a sequence of 20 bases: the last 10 of the source + the first 10 of the destination
• Generate all possible solutions by mixing together:
– many copies of all edges
– many copies of node complements
• Use lab techniques to isolate and read solutions representing Hamiltonian paths
[Figure: example encoding on the graph with vertices a to g]
vertex b: accgttacgtcggtaactgc
vertex c: cacagttggtagtcagtcgg
edge b-c: cggtaactgccacagttggt
complement of b: tggcaatgcagccattgacg
complement of c: gtgtcaaccatcagtcagcc
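The encoding rule can be checked with a few lines of code. The sketch below uses the two vertex sequences from the slide; the function names are hypothetical, and the complement is taken base by base in the order written on the slide (strand orientation is ignored).

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def complement(strand):
    """Base-wise Watson-Crick complement, written in the same order."""
    return "".join(COMPLEMENT[base] for base in strand)


def edge_strand(source, destination, half=10):
    """Edge = last `half` bases of the source + first `half` of the destination."""
    return source[-half:] + destination[:half]


VERTEX = {
    "b": "accgttacgtcggtaactgc",
    "c": "cacagttggtagtcagtcgg",
}

edge_bc = edge_strand(VERTEX["b"], VERTEX["c"])
complement_b = complement(VERTEX["b"])
complement_c = complement(VERTEX["c"])
```

The complement of c pairs with the second half of the b-c edge, and its free half can then pair with any edge leaving c, which is how longer paths assemble.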
The big picture: closing the loop
[Figure: a loop between computing systems and biological systems. Computational biology applies computation tools to biological applications; bio-inspired computing turns biological models into bio-inspired computer systems.]
Bio-inspired DNA comparison
[Figure: the loop diagram (computation tools, biological applications, biological models, bio-inspired computer systems) annotated with genomics, string comparison, cellular tissues, and cellular automata]
• Edit distance computation performed by the Bio-wall: a giant reconfigurable computational tissue
DNA-based Gene-Bank
[Figure: the loop diagram annotated with genomics, DB-search, DNA hybridization, and DNA-based associative memories]
• Create a bank of gene-specific DNA fragments with associated labels representing gene name and properties
• Use DNA encoding to represent labels
• Search the database using as a query a marked DNA strand complementary to the DNA template under analysis
• Select gene fragments hybridized to the marked query
• Decode labels associated with the selected strands
• If gene-specific strands are immobilized at known positions on a matrix, positions could be used in place of labels to represent properties
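The search procedure can be mimicked in software as a content-addressed lookup. This is only an in-silico analogy under strong assumptions: hybridization is idealized as exact base-wise complementarity over the whole fragment, and the bank contents, labels such as `gene_X`, and function names are all hypothetical.

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def complement(strand):
    """Base-wise complement, written in the same order."""
    return "".join(COMPLEMENT[base] for base in strand)


def search_bank(bank, template):
    """Return the labels of fragments that would bind to the marked query."""
    query = complement(template)  # marked strand complementary to the template
    return [
        label
        for label, fragment in bank.items()
        # A fragment binds if its complement occurs somewhere in the query.
        if complement(fragment) in query
    ]


# Hypothetical bank: label -> gene-specific fragment.
BANK = {
    "gene_X": "accgttacgt",
    "gene_Y": "cacagttggt",
    "gene_Z": "ttttgggg",
}

hits = search_bank(BANK, "ttaccgttacgtaa")
```

Here `hits` contains only `gene_X`, the one fragment that occurs in the template. The wet-lab version performs all comparisons at once, whereas this loop checks the fragments one by one.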
Research interests
• Genomics
– DNA chip: microfabricated sensors of DNA hybridization
– Base calling: improving accuracy of current techniques
– String comparison: exploit string compression and parallelism
• Proteomics
– 2-D analysis: align and compare 2-D images
– MS-MS decoding: de-novo decoding of spectra
• Neurophysiology
– Neuronal simulation: validate biological models
– NeuroChip: create a stable interface between cultured neurons and electronics
• Movement analysis
– Motion tracking
People involved in BioInfo topics
• Group:
– Andrea Acquaviva (motion tracking)
– Valerio Freschi (base calling – string comparison)
– Emanuele Lattanzi (neuronal simulation)
– Matteo Canella, Filippo Miglioli (cellular automata)
• Internal cooperations
– Prof. Cuppini (Neurophysiology)
– Prof. Magnani (Genomics)
– Prof. Cappiello (Proteomics)
• Main external cooperations
– UniBO – STM (DNA BioSensors)
– UniGE – IRST (NeuroChip)
– UniFE – EPFL (cellular automata)