Introduction to Bioinformatics
Shared by Nguyễn Xuân Vũ | 18/03/2024
2
Questions & Help
Amir Mitchell – lecturer.
Itay Mayros, Einat Hazkani-Covo, and Shira Mintz – Teaching assistants
Emails:
[email protected], [email protected], [email protected], [email protected]
Course site
3
Course Layout
Eleven lessons – eleven weeks.
Lecture, exercise, discussion.
Presentations and exercises.
Books and additional material.
Missing lessons or exercises.
Consultation hour.
Personal gene/protein.
4
Final grade
Final exam (80%):
Multiple choice questions
Open questions
No online part
Home assignment (20%)
5
Bioinformatics
Buzzword …
Nanotechnology, Biotechnology …
Bioinformatics: the branch of computer science that focuses on sub-domains of biology, chiefly research on genes and proteins. Researchers in this field use powerful computers and specialized computational methods to process the large body of complex data generated by genetics. These tools made it possible to sequence the human genome.
Lexicon-encyclobio
6
Two separate approaches
Computer science - inventing tools, developing algorithms.
Biology - Utilizing tools for biological research.
Purely bioinformatics (comparing exon/intron structure in human and mouse).
“Fairly” bioinformatics (Locating the active site of an enzyme by identifying conserved residues in the protein sequence).
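As a rough illustration of the "conserved residues" idea, the sketch below scans an already-aligned set of protein sequences and reports the columns where every sequence carries the same residue. The function name `conserved_columns` and the toy alignment are made up for this example; real analyses use graded conservation scores rather than strict identity.

```python
def conserved_columns(aligned_seqs):
    """Return 0-based column indices where all sequences share one residue.

    Assumes the sequences are already aligned (equal length, '-' for gaps).
    Columns containing a gap are never reported as conserved.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {s[i] for s in aligned_seqs}  # distinct symbols in column i
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Toy, invented alignment (not real protein data).
toy_alignment = [
    "MKHSLA-G",
    "MRHSIAEG",
    "MKHTLAQG",
]
print(conserved_columns(toy_alignment))  # [0, 2, 5, 7]
```

Fully conserved positions are only candidates for functional sites; confirming an active site still requires structural or experimental evidence.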
7
Research outline
Databases (public, local)
Retrieve data
Analysis
Results
Literature
Lab (wet biology)
8
Databases & Tools
Free shared databases (on-line, bioinfo unit)
Internet based tools (PC)
GCG package tools (unix)
9
GCG
Commercial DNA and protein sequence analysis package.
Written by Wisconsin Genetics Computing Group.
Includes more than 130 separate tools.
10
GCG
GCG works in unix environment (OS)
Same principles apply to all GCG programs
On-line help
11
Divided work
1. Access (unix and web)
2. Advanced analysis, user databases, web site
12
Lesson 1 – Introduction, Unix environment
Administration
Introduction to Bioinformatics.
NCBI
Working in Unix environment
13
Lesson 2 – databases and text based searching:
Databases: organization and entries.
Database problems.
Principles of database searching.
Unix and GCG.
14
Lesson 3 – pairwise alignment
Comparing two sequences.
Scoring: good and bad alignments.
Comparison methods.
Comparison programs.
Unix.
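To make "good and bad alignments" concrete, here is a minimal scoring sketch. It assumes a simple scheme chosen only for illustration (match +1, mismatch -1, linear gap penalty -2); actual programs typically use substitution matrices and separate gap-opening and gap-extension penalties.

```python
def score_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Score two aligned sequences column by column.

    The default scores are illustrative assumptions, not values taken
    from any particular alignment program.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            score += gap        # a gap in either sequence
        elif x == y:
            score += match      # identical residues
        else:
            score += mismatch   # substitution
    return score


good = score_alignment("GATTACA", "GATTACA")
bad = score_alignment("GATTACA", "G-TCACA")
print(good, bad)  # 7 2
```

A higher score marks the better alignment under the chosen scheme, so changing the scheme can change which alignment wins.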
15
Lesson 4 – Sequence based searching
DNA or protein sequences as search queries.
Problems with sequence search.
Methods for searching (fasta, blast).
16
Lesson 5 – Multiple sequence alignment
Comparing multiple sequences.
Uses of multiple alignment.
Methods for multiple alignment, efficiency and limitations.
Profiles and consensus sequences.
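A consensus sequence can be sketched as the most frequent residue in each column of a multiple alignment. The helper below is a simplified illustration (the name `consensus` and the toy input are assumptions for this example); a profile would instead keep the full per-column residue frequencies.

```python
from collections import Counter


def consensus(aligned_seqs, gap="-"):
    """Return the majority residue per column of an alignment.

    Gaps are ignored when counting; a column made only of gaps yields a gap.
    Ties are broken in favour of the residue seen first in the column.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for i in range(length):
        counts = Counter(s[i] for s in aligned_seqs if s[i] != gap)
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)


print(consensus(["ACGT", "ACGA", "ACCT"]))  # ACGT
```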
17
Lesson 6 – Phylogenies
Introduction to phylogeny.
Methods for constructing evolutionary trees.
Statistical analysis of constructed trees.
18
Lesson 7 – Protein families, secondary databases
Dividing proteins into families.
Patterns.
Different approaches: motifs, fingerprints.
Different databases.
Consurf.
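As an example of the pattern approach, the sketch below searches a protein sequence for the N-glycosylation site pattern N-{P}-[ST]-{P} (PROSITE entry PS00001), translated by hand into a regular expression. The function name `find_motif` and the test sequence are invented for illustration; a pattern match only suggests a possible site.

```python
import re

# N-{P}-[ST]-{P}: Asn, any residue except Pro, Ser or Thr, any residue except Pro.
# The lookahead lets overlapping occurrences be reported as well.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(protein_seq):
    """Return (0-based start, matched text) for every pattern occurrence."""
    return [(m.start(), m.group(1)) for m in N_GLYCOSYLATION.finditer(protein_seq)]


# Invented sequence, not a real protein.
print(find_motif("MNGSAANPSA"))  # [(1, 'NGSA')]
```

The second asparagine in the toy sequence is skipped because it is followed by proline, which the pattern excludes.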
19
Lesson 8 – DNA sequence analysis
Gene structure.
Gene finding.
Predicting gene features.
Consurf.
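The simplest form of gene finding is a scan for open reading frames (ORFs). The sketch below is a deliberately naive version, assuming the forward strand only, the standard start codon ATG, and the three standard stop codons; the name `find_orfs` and the `min_codons` threshold are illustrative. Real gene finders also handle the reverse strand, introns, and statistical signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) pairs of forward-strand ORFs, end exclusive.

    An ORF runs from the first ATG in a reading frame to the next in-frame
    stop codon. min_codons counts the codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ATG
    return orfs


# Invented sequence with one short ORF: ATG AAA TTT TAG
print(find_orfs("CCATGAAATTTTAGGG"))  # [(2, 14)]
```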
20
Lesson 9 – Genomes
Genome features.
Prokaryotic and Eukaryotic genomes.
Genome viewers
Model organisms
21
Lesson 10 – Various tools
Making things easy: useful tools for lab work.
Lesson 11 – Summary
Overview, Q&A before the exam.
22
Last comments
Introduction only.
Finding sites: Links and google.
Biology background.
Unix accounts.
Terminology
23
Milestones in bioinformatics
* 1953 – Watson and Crick describe the double-helix structure of DNA.
24
Milestones in bioinformatics
25
Today …
Over 1500 fully sequenced genomes from all domains of life.
Numerous databases.
Numerous tools.
26
Today …
Archaea (16)
Eukarya (20)
Bacteria (139)
Viruses (1500)
27
Examples
Human, mouse, rat, zebrafish, Drosophila, yeast, Anopheles, tomato, rice, wheat.
E. coli (4 strains), M. tuberculosis, M. leprae.
Mitochondria, chloroplast, plasmids.
28
Public interest:
Human Genome Project
2000 - Working draft of the genome, the work of 20 groups worldwide (http://www.ncbi.nlm.nih.gov).
2003 - Obtain a complete, high-quality genomic sequence.
Determine the sequence of the 3 billion bases.
Identify all of the estimated 30,000 genes in human DNA.
29
Human Genome Project
Initial analysis
15 Feb, 2001
30
NCBI – at a glance
The biggest and most comprehensive site!
Includes numerous tools and databases!
31
NCBI - overview
NCBI
PubMed
Books
OMIM
Nucleotides
Proteins
Genomes
Taxonomy
Structure
Domains
Expression profiles
* Cross references between the databases
32
NCBI
PubMed: Citations, abstracts, full articles.
Books: Online books, full text from books (Cell, Introduction to Genetic Analysis).
33
NCBI
OMIM: Online Mendelian Inheritance in Man. A comprehensive database of human genes and genetic disorders.
Entries include textual information and, most importantly, references to literature and sequences.
34
NCBI
GEO: Gene Expression Omnibus.
Results from high-throughput experiments: mRNA, DNA, and protein arrays.
35
NCBI
Genomes, Nucleotides, Proteins: Sequence databases, divided into sections and sub-sections.
Domains: Protein domains, both conserved sequence domains and 3D domains.
36
NCBI
Structure: 3D structures of proteins (~20,000 entries).
Taxonomy: Taxonomy of all organisms found in NCBI.
37
NCBI - Interconnectivity
NCBI
PubMed
Books
OMIM
Nucleotides
Proteins
Genomes
Taxonomy
Structure
Domains
Expression profiles
* Cross references between the databases