Introduction

Hi there👋! My name is Joy Naomi Olusanya. I am a linguist and NLP research engineer. I hold a Bachelor’s degree in Linguistics from Obafemi Awolowo University, Nigeria. My research focuses on developing more efficient and robust NLP models for low-resource African languages. I work on dataset curation, evaluation pipelines, and multilingual NLP systems, with a focus on underrepresented and African languages.

My work spans automatic speech recognition (ASR), text-to-speech (TTS), machine translation (MT), code-switching, and code-mixing for low-resource African languages. I am particularly interested in computational linguistics, multilingual NLP, LLM evaluation, and culturally aware AI systems. More broadly, I am passionate about leveraging AI to preserve languages and cultures, expand digital inclusion, and advance responsible AI governance through impactful, community-grounded research.

I am also the Founder and CEO of the Center for Low-Resource Languages and Cultures (CLRLC), an initiative dedicated to advancing research, building language resources, and strengthening communities around underrepresented languages worldwide.

Beyond research and engineering, I actively contribute to AI and NLP communities across Africa and beyond through teaching, mentorship, and community-driven initiatives. I served as Workshop Chair for the CLRLC-LLMs Workshop at NeurIPS 2025, organised by the CLRLC community. I currently serve as Training Manager and NLP Researcher at Tonative, where I design curricula, oversee tutors, and teach. I also mentor beginners in AI, particularly students transitioning from linguistics into NLP and language technology research.

🎓
Seeking Graduate Positions

Actively looking for Master's or PhD opportunities in NLP, computational linguistics, or AI for low-resource languages.

🤝
Open to Research Collaboration

Always happy to collaborate on projects related to African language technology, multilingual NLP, AI safety, or dataset creation.

Education: B.A. Linguistics — Obafemi Awolowo University, Nigeria (2021–2025)
Languages: Yorùbá (Native) · English (Fluent) · Nigerian Pidgin (Fluent) · German (Basic)

News

2026
Jan
Appointed Training Manager at Tonative Data Academy
2025
Nov
Received the NeurIPS 2025 Scholar Award, which funded accommodation for attending NeurIPS 2025 in Mexico City.
Nov
Received the WiML @ NeurIPS 2025 Travel Funding Award — Women in Machine Learning, Mexico City
Sep
Chaired the CLRLC-LLMs Workshop at NeurIPS 2025 — Lead organiser
Aug
Received the Deep Learning Indaba Travel Grant and attended the Indaba in Kigali, Rwanda 🌍
Aug
Reviewed submissions for the Deep Learning Indaba Ideathon 2025
Aug
Joined Tonative as NLP Researcher & Yorùbá Language Lead · Joined YorLect as NLP/ML Engineer Lead

Research Interests

My research focuses on computational linguistics and natural language processing, with an emphasis on multilingual and low-resource language technologies. I am interested in developing and evaluating large language models (LLMs) for underrepresented languages, particularly for machine translation, language understanding, and cross-lingual transfer in low-resource settings. I am also interested in evaluation methodologies for NLP and LLM systems in multilingual contexts, where standard benchmarks often fail to capture linguistic diversity and real-world performance. This work extends to the preservation of language and culture in AI systems, so that computational models reflect and sustain the cultural knowledge embedded in language.

  • Machine Translation & Multilingual NLP
  • Automatic Speech Recognition (ASR) & Tone Processing for African Languages
  • Large Language Models (LLMs) for Low-resource African Languages
  • Cultural & Figurative Language Understanding
  • AI Safety, Evaluation & Responsible AI for African Languages
  • Dataset Creation, Language Documentation & Linguistic Annotation

Publications

Projects

LLM Evaluation · Nigerian Pidgin · Sentiment Analysis

Benchmarks GPT-4o-mini on Nigerian Pidgin English (Naija) sentiment classification using zero-shot prompting across 2,000 stratified samples. Achieves 63.2% accuracy and a macro F1 of 0.55.

View Project →
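For readers less familiar with the two metrics reported for this project, here is a minimal sketch of how accuracy and macro F1 can be computed from gold and predicted labels. It is not the project's evaluation code: the function name `accuracy_and_macro_f1` and the toy labels are assumptions made purely for illustration.

```python
from collections import Counter


def accuracy_and_macro_f1(gold, pred):
    """Return (accuracy, macro F1) for two equal-length lists of labels."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")

    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # class p was predicted but wrong
            fn[g] += 1   # class g was missed

    f1_scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)

    accuracy = sum(tp.values()) / len(gold)
    macro_f1 = sum(f1_scores) / len(labels)  # unweighted mean over classes
    return accuracy, macro_f1


# Toy labels for illustration only; these are not the project's data.
gold = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
pred = ["positive", "negative", "positive", "positive", "neutral", "neutral"]
accuracy, macro_f1 = accuracy_and_macro_f1(gold, pred)
print(f"accuracy={accuracy:.3f} macro_f1={macro_f1:.3f}")
```

Because macro F1 averages per-class F1 with equal weight, it can fall below accuracy when a model does noticeably worse on some classes than on others.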
NMT · Fine-tuning · MarianMT · BLEU

Fine-tuned Helsinki-NLP's MarianMT (opus-mt-tc-big-en-pt) on the Pirá dataset using Hugging Face Transformers. Implemented the full training pipeline, including preprocessing, tokenisation, Seq2Seq training, and BLEU evaluation, with qualitative sample comparisons between baseline and fine-tuned outputs.

View Project →
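To illustrate the kind of baseline versus fine-tuned BLEU comparison described for this project, here is a simplified, standard-library-only sketch of corpus-level BLEU. It is an assumption-laden illustration, not the project's evaluation code: it uses whitespace tokenisation, a single reference per sentence, and no smoothing, and the function names and toy sentences are hypothetical. In practice an established BLEU implementation would normally be used so that scores are comparable across papers.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus BLEU (0-100): one reference each, whitespace tokens, no smoothing."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipped counts: an n-gram is credited at most as often as it occurs in the reference.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Toy Portuguese sentences for illustration only; these are not Pirá data or real model outputs.
references = ["o gato está sentado no tapete hoje", "ela gosta de ler livros antigos"]
baseline = ["o gato está no tapete hoje", "ela gosta de ler livro velho"]
fine_tuned = ["o gato está sentado no tapete", "ela gosta de ler livros antigos"]

baseline_bleu = corpus_bleu(baseline, references)
fine_tuned_bleu = corpus_bleu(fine_tuned, references)
print(f"baseline BLEU={baseline_bleu:.1f} fine-tuned BLEU={fine_tuned_bleu:.1f}")
```

A single corpus-level number hides which sentences improved, which is why pairing the score with side-by-side sample outputs is useful.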

Experience

  • Leadership

  • Founder & CEO 2026 – Present
    Center for Low-Resource Languages and Cultures (CLRLC) · Nigeria
    • Leading initiatives focused on low-resource language research, preservation, and AI development.
    • Coordinating research, community, and educational programs for multilingual and low-resource languages.
    • Building collaborations that promote linguistic diversity and inclusive NLP innovation across Africa and beyond.
  • Executive Secretary & Founder Support Assistant (Internship) 2026 – Present
    Female Founders Tribe · Remote
    • Support founders with meeting coordination, reminders, and logistics for internal and external meetings.
    • Draft and manage emails, follow-ups, and internal/external communication.
    • Track tasks, deadlines, and administrative requests while supporting file management and record-keeping.
  • Research Experience

  • NLP Researcher & Yorùbá Language Lead Aug 2025 – Present
    Tonative · Nigeria (Remote)
    • Developed multilingual NLP datasets for low-resource African languages, ensuring linguistic diversity.
    • Applied NLLB-200 to machine translation from English into Kikuyu, Kinyarwanda, and Yorùbá, using the XNLI dataset for evaluation and a research publication.
    • Benchmarked models using OpenAI healthcare datasets to evaluate performance and data quality.
    • Directed data annotation and validation workflows to produce high-quality Yorùbá datasets.
  • NLP / ML Engineer Lead Aug 2025 – Present
    YorLect · Nigeria (Remote)
    • Contributing to research initiatives focused on improving NLP model performance for Yorùbá and its dialects.
    • Supporting model training, evaluation, and deployment workflows.
  • AI Developer Jan 2024 – Mar 2025
    Brilla AI Project, Kwame AI · Ghana (Remote)
    • Validated and curated diverse speech datasets to improve transcription accuracy and model reliability.
    • Built reproducible preprocessing and evaluation pipelines for scalable experimentation.
  • Industry Experience

  • AI / LLM Developer Apr 2025
    Atom Group (Contract) · Nigeria (Remote)
    • Developed a commercial AI agent with Instagram scraping pipelines for social media intelligence.
    • Built modular automation systems for large-scale data collection and API integration.
  • ML / NLP Engineer Jun 2024 – Dec 2024
    Rendo AI Nigeria · Nigeria (Remote)
    • Designed Python-based web scraping pipelines using Requests, Selenium, and BeautifulSoup.
    • Automated large-scale data acquisition workflows, improving efficiency and dataset completeness.

Community Service

  • Founder & CEO, Center for Low-Resource Languages and Cultures (CLRLC)
  • Workshop Chair, CLRLC-LLMs Workshop @ NeurIPS 2025 — Mexico City
  • Reviewer, Deep Learning Indaba Ideathon 2025
  • Training Manager, Tonative Data Academy (Jan 2026 – Present)
  • AI / Machine Learning Lead, Women in Data Science Nigeria (WiDS) (Oct 2024 – Nov 2025)
  • Deep Learning / AI Co-Lead, Data Science Nigeria (Jan 2025 – Nov 2025)
  • AI Mentor, LevelUp Techies Bootcamp (Sep–Nov 2024)
  • Open Contributor, Cohere & Cohere Labs — contributed to 500+ translation validations for multilingual AI datasets, supporting low-resource and African language coverage. View Certificate ↗

Skills & Tools

Python · PyTorch · TensorFlow · JAX · Scikit-learn · Transformers · spaCy · NLTK · LangChain · BLEU · ROUGE · BERTScore · COMET · Docker · MLflow · AWS · GCP · FastAPI · Streamlit · OpenAI API · Praat · ELAN · FLEx · Audacity · SQL · Julia

Teaching

I actively teach and develop educational materials on AI, NLP, deep learning, and linguistics. Below are lecture slides I have delivered to learners at various skill levels.

Blog

Writing on African NLP, language technology, AI research, and the human side of building in tech. Published on Medium ↗