Job Description
The Cancer Science Institute of Singapore (CSI) is seeking a collaborative and highly skilled Genomic Data Scientist/Bioinformatician to join our Genomics and Data Analytics Core (GeDaC). Our facility is powered by a multi-disciplinary team of cloud engineering experts who are establishing a robust, petabyte-scale infrastructure on AWS for the automated processing and analysis of large-scale cancer genomics datasets.
This role is specifically designed for a scientist who excels at building state-of-the-art bioinformatics workflows for RNA-sequencing analyses. You will leverage our production-grade "AI Factory" to move beyond standard pipelines, exploring the complexities of the transcriptome at massive scale to advance biological discovery and precision medicine.
Key Responsibilities
Working collaboratively with software engineering experts, principal investigators, and research teams, the successful candidate will:
- Large-Scale Data Operations: Collaborate with a strong engineering team to design and optimize cloud-based components capable of processing and analyzing genomic datasets spanning thousands of patients.
- Intricate RNA Analysis: Lead advanced transcriptomic investigations, with a particular focus on alternative splicing, transcript discovery, RNA editing, and RNA modifications.
- Platform & API Development: Help design biologically sensible APIs for genome analytics for use by research teams.
- Scientific Collaboration: Provide expert bioinformatics consultation to CSI/NUS investigators, communicating complex results clearly to diverse audiences.
- Strategic Initiatives: Participate in state-of-the-art machine learning, AI, and big data projects driven by institutional and national initiatives.
Requirements
Education & Experience
- Master's or Ph.D. in Genetics, Genomics, Bioinformatics, Computational Biology, or Data Science.
- Proven track record of processing and analyzing high-throughput sequencing (NGS) data at scale, with an emphasis on large-scale RNA-seq or single-cell cohorts.
Technical Skills
- Programming Proficiency: Strong proficiency in Python and/or R.
- Computational Infrastructure: Familiarity with High-Performance Computing (HPC) and/or cloud computing environments (AWS preferred).
- Reproducibility: Understanding of version control systems (Git) and reproducible research practices.
Professional Attributes
- Strong problem-solving abilities and the analytical thinking required to manage multi-thousand-sample datasets.
- Commitment to producing high-quality, reproducible work within collaborative, multidisciplinary teams.
Preferred Qualifications
- Specialized RNA Expertise: Demonstrated experience in analyzing splicing, transcript discovery, RNA editing, or RNA modifications.
- Workflow Mastery: Familiarity with workflow management systems such as Nextflow, Snakemake, or WDL.
- Data Management: Understanding of database systems and data management practices.
- Scientific Impact: Authorship on peer-reviewed scientific publications showcasing large-scale genomic analysis.