Job Description

Job Title:  Research Associate (Data Engineering)
University-Level Unit:  College of Design and Engineering
Faculty/Department-Level Unit:  Civil and Environmental Engineering
Employee Category:  Research Staff
Location:  Kent Ridge Campus
Posting Start Date:  12/06/2026

We are seeking a skilled and motivated Research Associate to join our environmental informatics team. In this role, the candidate will build and maintain the data infrastructure that underpins our environmental monitoring and early-warning systems. The work will involve integrating diverse, high-volume data streams (including rainfall records, temperature sensor readings, radar imagery, and computer vision outputs) to deliver a unified, queryable, and secure data platform that drives research, operational decision-making, and stakeholder dashboards.


Key Responsibilities
1.  Environmental Data Platform
•    Design, build, and maintain a unified database to ingest and store diverse environmental data streams: rain gauge records, gridded temperature data, rainfall radar (e.g. OPERA, NEXRAD), satellite imagery, and computer vision model outputs.
•    Define and enforce common data schemas and ontologies across heterogeneous source formats (NetCDF, HDF5, GeoTIFF, CSV, JSON, REST/API feeds).
•    Implement scalable ingestion pipelines supporting real-time streaming and batch historical loads.
•    Ensure data traceability with robust metadata, provenance tracking, and versioning.


2.  Data Processing & Quality Assurance
•    Develop and maintain automated pipelines for data cleaning, outlier detection, and quality flagging.
•    Implement missing-data imputation methods appropriate to environmental time-series and spatial fields (e.g. interpolation, climatological fill, ML-based gap-filling).
•    Apply noise-removal algorithms (e.g. signal filtering, radar clutter suppression, spike detection) across sensor and remote-sensing data types.
•    Document processing logic and maintain reproducible workflow configurations. 


3.  Visualisation & Dashboards
•    Design and develop interactive dashboards for operational and research users, displaying spatial maps, time-series plots, and aggregated statistics.
•    Integrate visualisation tools (e.g. Grafana, Superset, Plotly Dash, or custom web front-ends) with the data backend.
•    Collaborate with domain scientists to translate monitoring requirements into effective visual analytics.
•    Ensure dashboards remain performant and responsive under live data load. 


4.  Data Security & Governance
•    Implement and maintain role-based access control (RBAC) for all data assets.
•    Enforce data encryption at rest and in transit; manage secrets and credentials securely.
•    Support compliance with relevant data governance policies and institutional data-sharing agreements.
•    Maintain audit trails and access logs; respond to security reviews and risk assessments.


5.  Infrastructure & Operations
•    Manage cloud or on-premises database services (e.g. PostgreSQL/PostGIS, TimescaleDB, InfluxDB, or equivalent); tune for time-series and geospatial query performance.
•    Maintain CI/CD pipelines for data pipeline code; apply version control best practices.
•    Monitor pipeline health, set up alerting for failures, and respond to incidents.
•    Contribute to infrastructure-as-code practices (Docker, Kubernetes, Terraform, or equivalent).

Job Requirements

 

•    Possess at least a Master’s degree in Computer Science, Data Engineering, Environmental Informatics, or a closely related field.
•    3–6 years of professional experience in data engineering, with demonstrable work on time-series or geospatial data.
•    Proficiency in Python (pandas, NumPy, xarray, or similar) and SQL; experience with at least one workflow orchestration tool (Airflow, Prefect, Luigi, etc.).
•    Hands-on experience with geospatial or scientific data formats: NetCDF, HDF5, GeoTIFF, GeoJSON, or similar.
•    Working knowledge of relational and time-series databases, with practical experience in data modelling.
•    Familiarity with cloud platforms (AWS, GCP, or Azure) and containerisation (Docker).
•    Solid understanding of data security principles: encryption, RBAC, secrets management. 
•    Open to a fixed-term contract.


Preferred Experience
•    Experience with environmental, meteorological, or hydrological datasets (radar QPE, NWP outputs, IoT sensor networks).
•    Familiarity with PostGIS or other spatially enabled database extensions.
•    Exposure to machine learning pipelines or MLOps practices for model output ingestion.
•    Experience building dashboards with Grafana, Apache Superset, Plotly Dash, or equivalent.
•    Contributions to open-source scientific data tooling (e.g. xarray, GDAL ecosystem).


Core Competencies
•    Python, SQL, and Bash (preferred)
•    Time-series & geospatial DBs
•    Data pipeline orchestration
•    Cloud & containerisation
•    Dashboard & BI tools
•    Data security & RBAC

Req ID:  33302