Schirmer, Stefanie: Comparing forests. 2011
Inhalt
- Introduction
- Motivation: the RNA structure alignment problem
- Introduction: RNA, a molecular contortionist
- Overview of this work
- Foundations and Definitions
- Basic Definitions
- Structure levels of RNA
- RNA secondary structure representations
- Basepair representation
- Vienna dot-bracket representation
- Tree representation
- Mountain representation
- Dome representation
- Circle representation
- Graph representation
- Other representations
- Discussion
- Distance measures for structure comparison
- Methods for RNA structure comparison
- Basepair distances
- Naive base pair metric
- Hausdorff distance
- Prohorov metric
- Mountain metric
- Discussion of the basepair metrics
- Encoding as a string and alignment
- Konings and Hogeweg - string encoding from mountain representation
- Shapiros encoding - string encoding from coarse grained tree representation
- Tree representation distance - from tree to string by counting children
- Discussion of string encoding methods
- Sequences with structure information, arc annotated sequences
- Evans/LAPCS problem
- Zhang's model
- Jiang's model - A general edit distance between RNA structures
- Alignment hierarchy / Alignments of RNA structures
- Discussion of Arc annotated sequence methods
- SCFGS/HMMs
- Stochastic context-free grammars for tRNA modeling
- Small subunit ribosomal RNA modeling using stochastic context-free grammars
- Tree and forest comparison
- Tree edit distances
- Tree alignment distances
- Tree edit and tree alignment
- Discussion of the tree and forest comparison models
- Distances between graphs
- Multiple layer secondary structure comparison
- A graph theoretical approach to predict common RNA secondary structure motifs including pseudoknots of unaligned sequences
- Discussion of graph based methods
- The forest alignment model
- Computing the forest alignment
- The search space of alignments for general forests
- Scoring
- Global alignment of general forests
- Defining the ranges of recursive computation: Relevant Closed Subforests
- Stepping down in the recursion: Aligning a pair of CSFs
- Sketching a control structure
- Tabulation
- Index mapping
- Calculation order
- Dynamic programming algorithm for global forest alignment
- Global alignment of RNA forests
- Local alignment of general forests
- Relevant CSFs for local alignment
- Dynamic Programming algorithm for local alignment of general forests
- Local alignment of RNA forests
- Small in large alignment of forests
- Discussion: Role of relevant CSF pairs; edit operations for RNA
- Pairwise vs. multiple forest alignment
- Time and space complexity
- Gaps and affine gap cost alignment
- Motivation and model
- The scattered alignment problem
- Gaps
- Affine gap costs on sequences
- Affine gap costs on forests
- The search space of forest alignments with affine gap cost model
- Recursive enumeration of the search space of forest alignments with affine gap costs
- Tabulation
- Scoring
- Relevant CSF Pairs
- Aligning two CSFs
- Prototyping in Haskell
- Developing a dynamic programming algorithm
- Global forest alignment with affine gap costs
- Global forest alignment of RNA with affine gap costs
- Local forest alignment with affine gap costs
- Multiple forest alignment with affine gap costs
- Time and space complexity
- Anchoring by shape abstraction
- Motivation - shape anchoring
- Shape abstraction
- Acquiring anchors for the input forest
- The anchoring
- The anchored forest alignment problem
- Time and space complexity
- Discussion
- Top down and bottom up alignment
- Case correction for pair nodes
- Constructing well-formed RNA forest alignments
- The recursive subproblems of the edit operations
- The recursive subproblems of the general forest edit operations
- The recursive subproblems of the RNA forest alignment
- The recursive subproblems of the pair replacement operation
- The recursive subproblems of the pair deletion operations
- The recursive subproblems of the pair insertion operations
- A new well-formed RNA forest alignment algorithm
- Discussion and Complexity
- Updated recurrences for aligning RNA forests
- Updated abstract recurrences and adjustment of the scoring functions
- Discussion of the new P-node edit operations
- RNAforester 2.0
- Applications
- Algorithms for RNA 2D structure prediction
- Problem: from a family of sequences to a structural consensus
- Plan ACstar: Improving structural alignments in the twilight zone
- RNAforecast as an alternative to the Sankoff algorithm
- Performance of RNAforecast with RNAforester 2.0
- Program evaluation
- Solving the scattered alignment problem with hand tuned scores
- General measurements - on Rfam
- Consistency checks
- Speedup measurements
- Score decline with the anchoring
- Benefits regarding the biological model
- Conclusion
