Job Description
Job Title:  Research Assistant (Language Modeling), School of Computing
Posting Start Date:  26/05/2025

The School of Computing at the National University of Singapore (NUS) is seeking a motivated and capable Research Assistant to contribute to research projects in the area of efficient language modeling. In this role, you will be responsible for researching, designing, and implementing techniques to improve the efficiency of large-scale language models for real-world applications. You will work at the intersection of natural language processing (NLP), deep learning, and systems optimization, driving advances in inference speed, memory usage, and energy efficiency without compromising model performance.


Responsibilities

  • Research and implement advanced methods for efficient language modeling, including but not limited to model compression, pruning, quantization, distillation, and sparsity techniques.
  • Design, train, and evaluate transformer-based models (e.g., BERT, GPT, LLaMA) optimized for efficiency on various hardware platforms (CPUs, GPUs, edge devices).
  • Analyze trade-offs between accuracy, latency, and resource consumption to meet deployment requirements for production or user-facing applications.
  • Collaborate with cross-functional teams to integrate efficient language models into products and pipelines.
  • Stay up to date with the latest literature and advances in efficient deep learning, presenting findings internally and externally as needed.
  • Publish research results in top-tier conferences and journals (optional, depending on the project).
  • Contribute to open-source projects and/or the development of internal tooling for model efficiency.

Job Requirements

  • MS/PhD in Computer Science, Electrical Engineering, Mathematics, or a related field; or equivalent industry experience.
  • Strong background in natural language processing, machine learning, and deep learning.
  • Proven experience with model efficiency techniques such as quantization, pruning, distillation, and weight sharing.
  • Hands-on experience training and optimizing large language models using PyTorch, TensorFlow, JAX, or similar frameworks.
  • Familiarity with hardware-aware model design and deployment for edge, cloud, or mobile environments.
  • Strong programming skills in Python; proficiency with C++ or CUDA is a plus.
  • Excellent analytical, problem-solving, and communication skills.
  • (Optional) Track record of publications in relevant conferences (NeurIPS, ICML, ACL, EMNLP, etc.).
  • Experience with distributed training and large-scale model deployment.
  • Understanding of system-level optimizations (e.g., ONNX, TensorRT, TVM).
  • Experience with low-resource or multilingual NLP.
  • Contributions to relevant open-source projects.

More Information

Location: Kent Ridge Campus

Organization: School of Computing

Department: Department of Computer Science

Employee Referral Eligible: No

Job Requisition ID: 29029