Zhu, Lu: Context-specific subcellular localization prediction: Leveraging protein interaction networks and scientific texts. 2018
Contents
- Table of contents
- List of figures
- List of tables
- 1 Introduction
- 1.1 Understanding protein SCLs
- 1.2 The importance of the context-specific subcellular distribution of proteins
- 1.3 Computational prediction of protein SCL
- 1.4 The aim of this work
- 1.5 Structure of this work
- 2 Background
- 2.1 SCL
- 2.1.1 Cell and cellular compartmentalization
- 2.1.2 Protein subcellular localization
- 2.1.3 Protein translocation
- 2.1.4 MLP
- 2.1.5 Protein mislocalization
- 2.2 PPI
- 2.3 Basic concepts in graph theory
- 2.4 Gene co-expression network analysis
- 2.5 Bayesian inference and Gibbs sampling
- 2.6 Markov random field
- 2.7 Multi-label dataset and classification
- 2.8 Text mining data curation
- 3 Overview of protein subcellular localization prediction
- 3.1 Access to the protein SCL data
- 3.2 Computational prediction method
- 3.2.1 Sequence feature based methods
- 3.2.2 PPIN-based approaches
- 3.2.3 Limitation of existing methods
- 3.3 Spatial adjacency of SCCs
- 3.4 Direct neighbors and indirect neighbors
- 3.5 MRF for protein function prediction
- 3.6 From mono-SCL prediction to multi-SCL prediction
- 3.7 From generic SCL prediction to context-specific SCL prediction
- 3.8 Significance of tissue specificity in human biology
- 3.8.1 Tissue-specific SCL of proteins
- 3.8.2 Bring computational approaches to the study of tissue-specific SCL of proteins
- 3.9 Summary
- 4 Generic SCL prediction
- 4.1 The Bayesian Collective MRF Model
- 4.1.1 The weighted Markov random field model
- 4.1.2 Gibbs sampler and likelihood estimation
- 4.1.3 Parameter learning
- 4.1.4 Collective MRFs
- 4.1.5 Computational complexity
- 4.1.6 Implementation
- 4.2 Experimental setup
- 4.3 Results
- 4.3.1 Likelihood and prediction performance
- 4.3.2 Effects of different potentials
- 4.3.3 A collective process improves the performance
- 4.3.4 Transductive learning from imbalanced MLDs
- 4.3.5 Comparison with existing methods
- 4.4 Summary
- 5 Tissue-specific SCL prediction
- 5.1 Methods
- 5.1.1 BCMRFs for predicting tissue-specific SCLs
- 5.1.2 Implementation
- 5.1.3 Data resources
- 5.1.4 Performance measures
- 5.2 Results
- 5.2.1 Statistics of the tissue-specific physical PPINs
- 5.2.2 Statistics of the tissue-specific SCLs
- 5.2.3 The impact of the noisy tissue-specific functional associations on tissue-specific SCL prediction
- 5.2.4 Genome-wide tissue-specific SCLs prediction
- 5.2.5 Predictions for novel tissue-specific protein candidate validated by text mining
- 5.3 Summary
- 6 Tissue-specific SCL Data Curation using Text mining
- 6.1 Methods
- 6.1.1 A. Retrieving relevant abstracts
- 6.1.2 B. Text preprocessing
- 6.1.3 C. NER
- 6.1.4 D. Term normalization
- 6.1.5 E. Extraction and scoring of tissue-protein-SCL associations
- 6.1.6 Experimental design and evaluation
- 6.2 Results
- 6.2.1 Dictionary-based tagger
- 6.2.2 Evaluation against manually curated corpus - Tissue
- 6.2.3 Evaluation against experimental dataset - Cell lines
- 6.2.4 Creation of TS-SCL database
- 6.2.5 TS-SCL database web interface
- 6.2.6 Generality of the approach
- 6.2.7 Limitation and future direction
- 6.3 Summary
- 7 Tissue-specific subcellular distribution of the human AGO2 protein
- 7.1 Tissue-specific PPI networks of the human AGO2 protein
- 7.2 Characterization of the tissue-specific networks
- 7.2.1 Roles in RNA silencing event
- 7.2.2 Roles in mRNA splice and translation
- 7.2.3 Roles in tumorigenesis
- 7.3 Analysis of the prediction results
- 7.4 Summary
- 8 Conclusion and discussion
- References
- Notations
