Projects
Most of the projects listed here are open source, giving me the opportunity to explore and share knowledge with a community of people interested in similar topics. I have explored topics and ideas around data publishing and orchestration, data science on the web, networking, interfaces, browser and IDE extensions, systems programming, and more.
JavaScript · Data Science · ML on the Web
Open-source JavaScript library providing high-performance, Pandas-like data structures for manipulating and processing structured data in the browser and Node.js.
Co-created the project and contributed to the core dataframe data structure and its operations, as well as the general architecture and design of the library.
An interactive JavaScript notebook environment - think Jupyter but for the browser.
I implemented the core notebook data model, cell execution logic, and more.
A TypeScript port of scikit-learn for JS environments (browser, Node.js), powered by TensorFlow.js.
Contributed to project planning, the initial kickstart, and model saving implementation.
Machine learning and data science library for JavaScript and TypeScript, powered by TensorFlow.js.
Contributed feature engineering utilities and preprocessing pipelines — a building block for training and deploying ML models entirely in-browser or on Node.
Open Data · Publishing Systems
The world's leading open-source data portal platform — used by governments and organisations globally to publish and share datasets.
Contributed new features, the Python 2 to 3 migration, bug fixes, documentation, and a variety of extensions that improve data publishing workflows.
A modern JavaScript/React framework built on Next.js for rapidly building rich, feature-complete open data portals and publishing systems. Natively supports CKAN, GitHub, Frictionless Data Packages, and more.
Helped with the initial project kickstart and idea consolidation, and contributed to integrating the framework into various data portals.
Systems · Infrastructure · Machine Learning
A Postgres extension written in Rust that hooks into the executor and replication pipeline to track updates on a chosen table column and mirror those values into Redis in real time. A deep dive into Postgres internals — shared memory, hooks, and the extension API.
A minimal VM-based job runner that uses Firecracker micro-VMs to securely isolate and execute arbitrary user scripts. Each job gets its own ephemeral Linux VM with a mounted filesystem. Covers networking, Linux kernel images, root filesystems, and the Firecracker API end-to-end.
Lightweight Go runtime for deploying and serving scikit-learn models exported as ONNX. Implements the ONNX graph execution engine from scratch in Go. Built for production-grade ML inference without a Python runtime dependency.
Interfaces · Developer Tools