Career Highlights
Current Role: Member of Technical Staff @ Cohere
Building and improving Compass, an enterprise-grade retrieval platform that powers RAG pipelines. Key contributions include:
- Designed the end-to-end evaluation framework with retry mechanisms, checkpointing, and LLM-based evaluation for scalable IR experiments
- Shipped multiple integrations: web2compass for scraping websites into Compass, MCP tooling for North, and code2compass for ingesting codebases into Compass for semantic code search
- Improved Compass Parser efficiency for vision-language models, achieving 50% memory reduction in image-to-markdown tasks
- Led development of Compass SDK V2 with async support and built the Compass Asset Service for scalable ingestion pipelines
- Implemented a Postgres-based job queue to replace Celery, improving reliability for long-running jobs
- Contributed to open source by adding Cohere support to Pydantic-AI
Key Amazon/AWS Achievements
- Founding engineer on Amazon Managed Workflows for Apache Airflow (MWAA), leading design and launch at re:Invent 2020; scaled to 15+ global regions
- Led modernization of the Distributed Job Scheduler backend using Elasticsearch, powering large-scale scheduling across Amazon
- Built platform services for the Amazon Mobile Shopping App serving millions of retail customers
- Championed open source with the amazon-mwaa-docker-images repository serving thousands of MWAA environments
Systems & Low-Level Programming Background
- C++ expertise: Lead developer on the LLVM-based Alusus Programming Language compiler; imaging modules built on reverse engineering of the AutoCAD DWG/DXF formats
- High-performance systems: Financial workflow orchestration platforms, real-time alerting systems with Elasticsearch
- Infrastructure at scale: Google App Engine migrations, NetworkedBlogs optimization for hundreds of thousands of users
Selected Projects
- Compass @ Cohere - Enterprise search and retrieval platform
- microtorch - Lightweight deep learning framework
- neuroscout - LLM-powered NeurIPS paper analysis tool
- image-search - Natural language image search using CLIP
- pat-cli - ML-powered log clustering tool
- amazon-mwaa-docker-images - Docker infrastructure for Apache Airflow
- Alusus Programming Language - LLVM-based programming language compiler
- Alkitab - Modern Quranic text processing toolkit
Skills & Technologies
Machine Learning: PyTorch, CLIP, BERT, LLM evaluation, LLM-as-a-judge, synthetic data generation, information retrieval, multi-modal models, autograd systems
Cloud & Infrastructure: AWS (ECS, Lambda, Step Functions, DynamoDB, RDS, CloudWatch), Elasticsearch, Docker, PostgreSQL
Programming Languages: Python, C++, JavaScript/TypeScript
Publications
Academic Publications
- Alexa Visual Item Selection (AVIS) Dataset (Amazon Computer Vision Conference 2023)
- Visual Item Selection with Voice Assistants (ACM Web Conference 2023)
- Advanced Composition in Virtual Camera Control (International Symposium on Smart Graphics 2011)
- Advanced Composition in Virtual Camera Control (MPhil Thesis, Newcastle University, 2011)
Technical Articles
- Installing Jupyter on a Cloud Machine (Dec 2019)
- Useful Shell Tools/Tips for Increasing Productivity (Dec 2019)
- Full Text Search – Part 3: Building Inverted Index (June 2016)
- Full Text Search – Part 2: Token Processing (May 2016)
- Full Text Search – Part 1: Tokenization (May 2016)
- Easy Caching with C# and PostSharp (Dec 2013)
- Creating Strongly Typed Custom Collections in C# (Sep 2002)
- Why is Psychology not Taught in Schools? (Nov 2019)
- Focus: Reduction of Distractions (Jan 2018)
Reading List
2025
- Good Energy: The Surprising Connection Between Metabolism and Limitless Health by Casey Means and Calley Means
- The Whole-Brain Child: Revolutionary Strategies to Nurture Your Child’s Developing Mind by Daniel J. Siegel and Tina Payne Bryson
- The Expectant Father: The Ultimate Guide for Dads-to-Be by Armin A. Brott
- The Anxious Generation: How the Great Rewiring of Childhood Caused an Epidemic of Mental Illness by Jonathan Haidt
- The Coddling of the American Mind: How Good Intentions and Bad Ideas Are Setting up a Generation for Failure by Jonathan Haidt and Greg Lukianoff
All-Time Keepers
Programming Books
- A Tour of C++ by Bjarne Stroustrup
- Python Essential Reference by David Beazley
Computer Systems
- Designing Data-Intensive Applications by Martin Kleppmann
- Operating Systems: Three Easy Pieces by Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau
Biographies
- The Prince of Mathematics: Carl Friedrich Gauss by M.B.W. Tent
- The Strangest Man: The Hidden Life of Paul Dirac, Quantum Genius by Graham Farmelo
Psychology
- Without Conscience by Robert Hare
Miscellaneous
- David and Goliath by Malcolm Gladwell
- Flow: The Psychology of Optimal Experience by Mihaly Csikszentmihalyi
- Why We Sleep by Matthew Walker