Stöver, Ben Christoph: Software components for increased data reuse and reproducibility in phylogenetics and phylogenomics. 2018
Contents
- 1 General Introduction
- 1.1 Fostering data reuse and reproducibility by providing software that simplifies annotating phylogenetic data with necessary metadata
- 1.1.1 Metadata annotation is important to make data accessible and reusable
- 1.1.2 Metadata annotation is important to increase reproducibility
- 1.1.3 Phylogenetic data formats
- 1.1.4 Linking metadata using ontologies
- 1.1.5 Public databases
- 1.1.6 Software components developed in this thesis provide missing functionality to foster data reuse and increase reproducibility
- 1.2 Providing tools to compare, combine and present results from alternative analyses
- 1.3 How the developed software components combine
- Part I – Fundamental software libraries to process, display and edit phylogenetic data and metadata
- 2 JPhyloIO: A Java library for event-based reading and writing of different phylogenetic file formats through a common interface
- 2.1 Introduction
- 2.2 Design and implementation
- 2.2.1 Event streams for reading documents
- 2.2.2 Data adapters for writing documents
- 2.2.3 Supported formats
- 2.2.4 Generalization over different metadata concepts
- 2.2.5 Ways to extend JPhyloIO
- 2.3 Discussion
- 2.3.1 Comparison with other libraries
- 2.3.2 Event-based processing versus predefined library data structures
- 2.3.3 Current usage
- 2.3.4 Future development
- 2.4 Conclusion
- 2.5 Data Accessibility
- 2.6 Declarations
- 3 LibrAlign: A flexible Java GUI library for displaying and editing multiple sequence alignments and attached raw data and metadata
- 3.1 Background
- 3.2 Implementation
- 3.2.1 GUI component architecture
- 3.2.2 TIC and the abstraction over Swing and SWT
- 3.2.3 Data model
- 3.2.4 I/O and interaction with JPhyloIO
- 3.3 Results and discussion
- 3.3.1 Alignment GUI components and editing capabilities
- 3.3.2 Data model
- 3.3.3 I/O and metadata access
- 3.3.4 Comparison to other software
- 3.3.5 Current usage
- 3.3.6 Future perspectives
- 3.4 Conclusion
- 3.5 Availability and requirements
- 3.6 Declarations
- Part II – Applications to model, visualize, edit and compare phylogenetic data and metadata
- 4 Sample data processing in an additive and reproducible taxonomic workflow by using character data persistently linked to preserved individual specimens
- 4.1 Introduction
- 4.2 Conceptual foundations of integrated sample data processing
- 4.2.1 Organismic samples, their associations and data
- 4.2.2 Processing sample metadata
- 4.2.3 Linking specimen-based character data to sample metadatasets
- 4.2.4 Taxon assignment of samples and their data
- 4.2.5 Aggregating specimen-based character data at the taxon level
- 4.3 Workflow implementation using the EDIT Platform
- 4.3.1 Extending the EDIT Platform to handle the variety of sample data
- 4.3.2 Basic functionalities of the EDIT Platform, scalability and use cases
- 4.4 Steps of the integrated sample data workflow
- 4.4.1 Scope of the workflow
- 4.4.2 Establishing a reproducible connection between sampled individuals and all types of samples derived from them
- 4.4.2.1 Searching, retrieving and importing of sample metadata
- 4.4.2.2 Editing metadatasets
- 4.4.2.3 Building and editing specimen derivative hierarchies
- 4.4.2.4 Versioning, synchronizing and exchanging metadatasets
- 4.4.3 Stably linking character datasets to the sample derivative hierarchy
- 4.4.4 Recording and storing specimen-based morphological and molecular character data
- 4.4.5 Taxon assignment of sample metadata and character datasets
- 4.4.5.1 Adding sample data to a classification
- 4.4.5.2 Aggregating specimen-based character data at the taxon level
- 4.4.5.3 Publishing sample metadata and character data with the CDM Data Portal
- 4.4.6 Data exchange via standard exchange formats and enabling persistent, specimen-linked storage in research collections
- 4.5 Perspectives
- 4.6 Acknowledgements
- 4.7 Funding
- 5 The molecular components of the Taxonomic Editor
- 5.1 Introduction
- 5.2 Implementation
- 5.3 Results and discussion
- 5.4 Conclusion
- 5.5 Availability and requirements
- 5.6 Acknowledgements
- 6 A new version of the alignment editor PhyDE based on the recently developed functionality of JPhyloIO and LibrAlign
- 6.1 Introduction
- 6.2 Implementation
- 6.3 Results and discussion
- 6.3.1 User interface
- 6.3.2 Supported formats
- 6.3.3 Comparison to other software
- 6.3.4 Future development
- 6.4 Conclusion
- 6.5 Availability and requirements
- 6.6 Declarations
- 7 AlignmentComparator: Comparing alternative multiple sequence alignments of the same dataset
- 7.1 Introduction
- 7.2 Algorithms
- 7.2.1 Profile alignment approach
- 7.2.2 Average position approach
- 7.2.2.1 Calculating the unaligned positions
- 7.2.2.2 Performing the initial superalignment
- 7.2.2.3 Improving the superalignment
- 7.2.2.4 Space and time complexity
- 7.2.2.5 Example
- 7.2.3 Maximum sequence pair match approach
- 7.3 Implementation
- 7.4 Results and discussion
- 7.4.1 Features and user interface
- 7.4.2 Supported formats and storage of comparison results
- 7.4.3 Differences between the comparison algorithms
- 7.4.4 Comparison to other software
- 7.4.5 Future development
- 7.5 Conclusion
- 7.6 Availability and requirements
- 7.7 Declarations
- 8 TreeGraph 2: Combining and visualizing evidence from different phylogenetic analyses
- 8.1 Background
- 8.2 Implementation
- 8.3 Results and discussion
- 8.3.1 Importing data
- 8.3.1.1 Mapping statistical support onto congruent nodes
- 8.3.1.2 Finding conflicting nodes and mapping contradictory support
- 8.3.2 Editing and formatting capabilities
- 8.3.2.1 Editing of node/branch data
- 8.3.2.2 Editing operations
- 8.3.2.3 Searching, replacing and translating tree leaf names
- 8.3.2.4 Formatting document elements
- 8.3.2.5 Automatically setting line width, text height, and color
- 8.3.3 Different view modes
- 8.3.4 Exporting to graphic formats and printing
- 8.3.5 Help
- 8.3.6 Comparison to previous software
- 8.4 Conclusions
- 8.5 Availability and requirements
- 8.6 Declarations
- 9 New features of the tree editor TreeGraph 2 to handle rich metadata and compare phylogenies
- 9.1 Introduction
- 9.2 Implementation
- 9.3 Results and discussion
- 9.3.1 Interactively comparing trees
- 9.3.2 Handling data for ancestral state reconstruction
- 9.3.3 Extended I/O functionality
- 9.3.4 New ways to calculate metadata
- 9.3.5 Additional new features and improvements
- 9.3.6 Ongoing extension of the metadata model
- 9.3.7 Comparison to other tree editors
- 9.3.7.1 Interactive tree comparison
- 9.3.7.2 Handling ancestral state reconstruction data
- 9.3.7.3 Extended metadata model
- 9.3.8 Future development
- 9.4 Conclusion
- 9.5 Availability and Requirements
- 9.6 Declarations
- Part III – General purpose software libraries and the bioinfweb portal
- 10 The bioinfweb portal
- 11 bioinfweb.commons: Shared bioinfweb components made available in a library
- 11.1 Introduction
- 11.2 Modules and provided functionality
- 11.3 Conclusion
- 11.4 Availability and Requirements
- 11.5 Declarations
- 12 Toolkit Independent Components: Creating GUI components for both Swing and SWT
- Appended chapters
- 13 General discussion and outlook
- 13.1 Increasing data reuse and reproducibility
- 13.1.1 Developed functionality
- 13.1.2 Conserving relevant metadata throughout the whole taxonomic and phylogenetic workflow
- 13.1.3 Externally implemented GUI components for metadata attached by externally defined ontologies
- 13.2 Comparing phylogenetic data
- 14 List of abbreviations
- 15 References
- 16 Acknowledgements
- 17 Appendix
