Karthik Kumar

Building Reliable Systems at Scale

Product-minded Software Engineer specializing in Observability and Developer Productivity. I'm passionate about the performance and reliability of Distributed Systems. I have over 14 years of experience leading complex projects and reliably delivering results.

Intro

I care about improving developer productivity through better tooling and Observability. Currently, I'm exploring how principles from debugging Distributed Systems can be applied to AI agents and systems, bringing production-level reliability practices to emerging AI infrastructure.

I'm currently at Netflix building Observability infrastructure, helping engineers understand and debug complex systems. Previously, I worked at Lightstep, Rally Health, Opower, Qualcomm, and IBM. I hold Master's and Bachelor's degrees in Computer Science from Georgia Tech and Virginia Tech.


Core Expertise

  • Distributed Tracing and Observability Infrastructure
  • System reliability and performance optimization
  • Design and implementation of Distributed Systems

Values

  • Communicate with clarity and transparency - Structure ideas simply, share context openly, and ensure understanding across teams and individuals
  • Embrace challenges to grow - Seek complex problems that stretch my capabilities and accelerate learning
  • Build trust through action - Demonstrate accountability, show empathy, and bring passion to my work