June 08, 2026
AI

On-Policy Distillation is Reinforcement Learning in Disguise

A mathematical deep dive into why knowledge distillation, done right, is secretly policy gradient training

May 21, 2026
AI

From REINFORCE to GRPO

[WIP] This blog is an aggregate of various papers, blogs, and videos I have been going ...

June 23, 2025
Baduk

Go Proverbs

A while ago, I was watching Dwyrin's proverb series on YouTube, and to be honest, it was quite fun to watch. Since then, I have always wanted to d...

March 22, 2025
GATE

My GATE Journey

After completing my Bachelor's, I suddenly decided, without any clear reason, to prepare for GATE instead of going abroad. I had always dreamed of ...

March 22, 2025
Dev

A New Recipe for Jekyll Comments

I've always wanted to implement a custom comment box in my Jekyll blog without relying on third-party services like Disqus or the GitHub API. Since...

February 29, 2024
AI

Replicating AlphaGo

Go, also known as Baduk in Korea, has a long history spanning over 2500 years and is loved by many in East Asia. In this game, players take turns p...

September 30, 2023
AI

Google Summer of Code '23 - ML4SCI

I'm thrilled to share that I've been selected for Google Summer of Code (GSoC) at ML4SCI. I'll be working on developing equivariant neural networks...
