Rimika Writes

Latest

6 Step Optimization of GeMMs in CUDA

I aim to take a naive implementation of single-precision (FP32) General Matrix Multiplication (GeMM) and optimize it so its computations can be parallelized effectively on GPUs with CUDA C/C++.

Low-Precision Arithmetic in ML Systems

CUDA 4: Profiling CUDA Kernels

CUDA 3: Your Checklist for Optimizing CUDA Kernels

CUDA 1: GPU v/s CPU

CUDA 0: From OS to GPUs

7 Step Optimization of Parallel Reduction with CUDA

Processing Life...

Maximizing Grace Hopper Conference 2025

What to expect and how to prepare for the biggest tech conference for women!

3 Habits of Successful People — According to Neuroscience

The Key to Discipline is Loving Yourself

5 Mindset Shifts that will Guarantee Personal Growth

How I Self-Introspect

Decoding GHC 2023: My Journey, Career Impact, and Insider Tips

Transforming your Writing – A Self-Taught Approach

Deep Dives & Debugs

6 Step Optimization of GeMMs in CUDA

I aim to take a naive implementation of single-precision (FP32) General Matrix Multiplication (GeMM) and optimize it so its computations can be parallelized effectively on GPUs with CUDA C/C++.

Low-Precision Arithmetic in ML Systems

CUDA 4: Profiling CUDA Kernels

CUDA 3: Your Checklist for Optimizing CUDA Kernels

CUDA 1: GPU v/s CPU

CUDA 0: From OS to GPUs

7 Step Optimization of Parallel Reduction with CUDA

Industry Reflections

A Checklist for Your Next SWE Interview

How to best prepare for your next software engineering interview: a month before, a week before, and a day before!

Transform Your Networking Skills: 5 Steps to Building Powerful Connections for Recruitment Season

Ultimate Timeline for Landing a Summer SWE Internship

Why Backpropagation Falls Short of Its True Purpose

AI-Powered Neurotechnology–The Future of Big Tech

Staying Ahead of the Curve as a Computer Science Student

Decoding GHC 2023: My Journey, Career Impact, and Insider Tips

Book Reviews

Show Your Work - Austin Kleon