Large-scale -omics data solutions for local and cloud-based database storage, efficient access to public repositories, quality control, commercial and academic licensing information, analysis pipeline structuring, and more!
I have over 10 years of experience in large-scale data access, curation, quality control, and analysis, including single-cell and population NGS, proteomics, phenotypic screening analysis, and database design and implementation.
- Analysis of multiple NGS data modalities in R (Bioconductor) and Python, including population and single-cell RNA-seq for variant calling, functional genomics, and basic transcriptomics, along with data and pipeline QC and integration of results
- Cloud computing and database storage with GCP, Azure, and AWS
- Data architecture, indexing, and search design and implementation with Solr, Hadoop, and REST
- PostgreSQL and SQLite relational database design, management, data curation and validation, and ETL using Aquadata, pgAdmin, and Pipeline Pilot
- Project tracking, issue management, and code publication with GitHub, Trello, Jira, and Basecamp
- Natural language processing in Python of PubMed publications to identify candidate datasets for analysis projects
Publications
- Stathias, V., Turner, J., Koleti, A., Vidović, D., Cooper, D., Fazel-Najafabadi, M., . . . Schürer, S. C. (2019). LINCS Data Portal 2.0: next generation access point for perturbation-response signatures. Nucleic Acids Research. doi:10.1093/nar/gkz1023
- Cooper, D. J., & Schürer, S. (2019). Improving the Utility of the Tox21 Dataset by Deep Metadata Annotations and Constructing Reusable Benchmarked Chemical Reference Signatures. Molecules, 24(8), 1604. Retrieved from https://www.mdpi.com/1420-3049/24/8/1604
- Stathias, V., Koleti, A., Vidović, D., Cooper, D. J., Jagodnik, K. M., Terryn, R., . . . Schürer, S. C. (2018). Sustainable data and metadata management at the BD2K-LINCS Data Coordination and Integration Center. Scientific Data, 5, 180117.
- Keenan, A. B., Jenkins, S., Jagodnik, K., . . . Cooper, D. J., . . . Pillai, A. (2018). The Library of Integrated Network-Based Cellular Signatures NIH Program: System-Level Cataloging of Human Cells Response to Perturbations. Cell Systems, 6(1), 13-24.
- Koleti, A., Terryn, R., . . . Cooper, D. J., . . . Schürer, S. C. (2018). Data Portal for the Library of Integrated Network-based Cellular Signatures (LINCS) program: integrated access to diverse large-scale cellular perturbation response data. Nucleic Acids Research, 46(D1), D558-D566.
- Cooper, D. J., Zunino, G., Bixby, J. L., & Lemmon, V. P. (2017). Phenotypic screening with primary neurons to identify drug targets for regeneration and degeneration. Molecular and Cellular Neuroscience, 80, 161-169.
Contact me for more information.