Thomas Kalnik's Technical Blog


Latest Posts

GraphRAG Assisted Ideation with a YouTube Knowledge Graph

March 23, 2025

I've been building a system that helps creators generate better video ideas by combining Retrieval-Augmented Generation (RAG) on PostgreSQL's pgvector extension with a Neo4j knowledge graph.


Scaling the Summit: Distributed Inference with Meta-Llama-3.1-405B using vLLM

October 13, 2024

This post details the technical approach, configuration, and key insights from deploying one of the largest language models currently available using distributed inference techniques.


Fine-Tuning Llama 3.1 8B with Direct Preference Optimization: A Distributed Training Approach

September 27, 2024

As part of our deep learning research initiatives, I recently fine-tuned the Meta Llama 3.1 8B model with Direct Preference Optimization (DPO) in a distributed training setup.


Building a Multi-Cloud AI Image Generation Service with Flux

August 23, 2024

In this post, I'll share my experience designing and implementing a production image generation system across multiple cloud platforms, with a focus on the technical concepts that could be valuable for similar projects.


Fine-tuning SDXL for Specialized Thumbnail Generation: A Technical Deep Dive

June 4, 2024

I recently undertook a project to fine-tune Stability AI's SDXL model for creating custom thumbnails in a specific visual style.

© 2025 Thomas Kalnik. All rights reserved.