Sundermann, Linda K.: Lineage-Based Subclonal Reconstruction of Cancer Samples. 2019
Contents
- List of Abbreviations
- Notation Tables
- 1 Introduction
- 2 Background
- 2.1 Probabilistic Models and Optimization
- 2.2 Biological and Technical Background
- 2.2.1 Cancer and Genetic Mutations
- 2.2.2 Next-Generation Sequencing Techniques
- 2.2.3 Detecting Somatic Mutations
- 2.3 Subclonal Reconstruction of Cancer Samples
- 3 A New Lineage-Based Subclonal Reconstruction Model
- 3.1 The Likelihood Function
- 3.2 Model Components and Rules
- 3.2.1 Inferred Lineage Frequencies
- 3.2.2 Inferred Lineage Relationships
- 3.2.3 Copy Number Aberration Assignment
- 3.2.4 Simple Somatic Mutation Assignment
- 3.3 Optimization with Mixed Integer Linear Programming
- 3.3.1 Objective Function and Basic Mixed Integer Linear Program
- 3.3.2 Variables and Constraints for Lineage Frequencies
- 3.3.3 Variables and Constraints for Lineage Relationships
- 3.3.4 Variables and Constraints for Copy Number Aberrations
- 3.3.5 Variables and Constraints for Simple Somatic Mutations
- 3.3.6 Reducing the Number of Variables and Constraints
- 3.4 Optimization Complexity
- 3.5 Determining the Number of Lineages
- 4 Dealing with Ambiguity
- 4.1 Defining Ambiguity
- 4.2 Handling Ambiguity
- 4.2.1 Finding Present Ancestor-Descendant Relationships Necessary because of Likelihood Influence
- 4.2.2 Updating Lineage Relationships
- 4.2.3 Unphasing Simple Somatic Mutations
- 4.2.4 Identifying Absent Ancestor-Descendant Relationships Necessary because of Crossing Rule and Mutation Assignment
- 4.2.5 Identifying Present Ancestor-Descendant Relationships Necessary because of Sum Rule
- 4.2.6 Identifying Absent Ancestor-Descendant Relationships Necessary because of Sum Rule
- 4.3 Lineage-Based versus Population-Based Subclonal Reconstruction
- 5 Analyzing Onctopus' Performance
- 5.1 Implementation
- 5.2 Data Simulation
- 5.3 Evaluation Metrics
- 5.4 Optimality, Run Time and Memory Usage
- 5.5 Clustering Simple Somatic Mutations
- 5.5.1 Clustering Algorithms and Cluster Numbers
- 5.5.2 Building Subclonal Reconstructions with Clustered Simple Somatic Mutations
- 5.6 Fixing Copy Number Aberrations
- 5.7 Fixing Lineage Frequencies
- 5.7.1 Performance with Correct Lineage Frequencies
- 5.7.2 Inference of Lineage Frequencies Depending on the Number of Simple Somatic Mutations
- 5.7.3 Performance with Inferred Lineage Frequencies
- 5.8 Approximating Variant Allele Frequencies in Mixed Integer Linear Program
- 6 Results and Evaluation
- 7 Conclusion and Outlook
- Bibliography
- A Onctopus Software
- B Data Simulation
- B.1 Data Simulation
- B.2 Simulated Datasets for Analyzing Optimality, Run Time and Memory Usage
- B.3 Simulated Datasets for Simple Somatic Mutation Clustering Analysis
- B.3.1 Clustering Algorithms and Cluster Numbers
- B.3.2 Building Subclonal Reconstructions with Clustered Simple Somatic Mutations
- B.4 Simulated Datasets for Fixing Copy Number Aberration Analysis
- B.5 Simulated Datasets for Fixing Lineage Frequencies Analysis
- B.5.1 Simulated Datasets for Inference of Lineage Frequencies Depending on the Number of Simple Somatic Mutations
- B.5.2 Simulated Datasets for Analysis of Performance with Inferred Lineage Frequencies
- B.6 Simulated Datasets for Analysis of Approximating Variant Allele Frequencies in Mixed Integer Linear Program
- B.7 Simulated Datasets for Comparison between Onctopus, PhyloWGS and Canopy
