Pranav Bhandari
PhD Candidate at Emory University
Storage, Cache, Workload Characterization

About

I am a PhD candidate in the Computer Science and Informatics program at Emory University, graduating this spring (May 2024). My thesis, "Optimizing Block Storage using Multi-tier Caches", focuses on measuring and analyzing the performance of storage systems, modeling performance using machine learning techniques, and developing novel techniques to improve the efficiency of workload analysis. I am advised by Dr. Avani Wildani. This is my Resume. I am actively applying to Software Engineer and Research Scientist roles. Feel free to reach out.

Research

Multi-Tier Caching

The availability of cache devices with diverse cost-performance profiles has improved the prospects of multi-tier caches. Workload sizes continue to grow, making DRAM-only caches too small to yield acceptable hit rates. Furthermore, multi-tier caches have been shown to improve cost-efficiency. We want to configure multi-tier caches based on device and workload properties while efficiently evaluating the large configuration space. We are also interested in optimizing multi-tier caches through better cache admission policies.

Sampling for Block Storage Traces

Random spatial sampling is used to reduce the overhead of trace collection and analysis. We extend random spatial sampling to work with multiblock storage requests. Compared to plain random spatial sampling, the extended approach generates samples with lower error in features such as mean read/write request size, mean read/write interarrival time, and write ratio.
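For readers unfamiliar with the baseline, here is a minimal sketch of plain random spatial sampling applied block by block to multiblock requests. It is not the extension described above and not the Cydonia API. The request tuple layout, the function names, the MD5-based hash, and the modulus are all assumptions made for illustration.

```python
import hashlib

def is_sampled(block_addr, rate, modulus=1 << 24):
    """Keep a block when hash(addr) mod P falls below the threshold rate * P."""
    digest = hashlib.md5(str(block_addr).encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus < rate * modulus

def sample_trace(requests, rate):
    """Sample a trace of hypothetical (timestamp, op, start_block, num_blocks) tuples.

    Every block of a request is tested independently, and each run of
    contiguous sampled blocks is emitted as its own, smaller request.
    """
    sampled = []
    for ts, op, start, count in requests:
        run_start = None
        # Iterate one step past the last block so a trailing run is flushed.
        for addr in range(start, start + count + 1):
            keep = addr < start + count and is_sampled(addr, rate)
            if keep and run_start is None:
                run_start = addr
            elif not keep and run_start is not None:
                sampled.append((ts, op, run_start, addr - run_start))
                run_start = None
    return sampled

def write_ratio(requests):
    """Fraction of requests that are writes (op == 'w'); 0.0 for an empty trace."""
    if not requests:
        return 0.0
    return sum(1 for _, op, _, _ in requests if op == "w") / len(requests)
```

Because a request can be split into several smaller requests, features such as mean request size and write ratio computed on the sample can drift from the full trace, which is the kind of error the paragraph above refers to.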

Projects

CacheLib

Developed a block storage system stressor using the C++ cache engine CacheLib to cache the data and libaio to transfer data to and from the backing store. Replayed diverse production workloads across different types of storage servers, adding up to more than 10 years of compute time, to derive insights about sizing multiple cache tiers, selecting storage devices, and improving cost efficiency.

GitHub PDF

Cydonia

A Python library to analyze and sample block storage traces. Implemented random spatial sampling and augmented it to improve performance for block storage traces.

GitHub

PyMimircache

PyMimircache is an open source cache simulation framework developed by Junchen Yang as part of the Emory SimBioSys Lab. I implemented miniature simulations of a workload based on the FAST '15 paper "Efficient MRC Construction with SHARDS" by Carl A. Waldspurger, Nohhyun Park, Alexander Garthwaite, and Irfan Ahmad of CloudPhysics, Inc.
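The idea behind a miniature simulation can be sketched as follows: sample keys spatially at rate R, then simulate a cache scaled down to R times the original size on the sampled trace, and use its hit ratio as an estimate for the full-size cache. This is a rough illustration using LRU, not the PyMimircache implementation. The function names, the MD5-based hash, and the rounding of the scaled cache size are assumptions.

```python
import hashlib
from collections import OrderedDict

def is_sampled(key, rate, modulus=1 << 24):
    """Spatial hash sampling: keep a key when hash(key) mod P < rate * P."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus < rate * modulus

def lru_hit_ratio(keys, cache_size):
    """Replay keys through an LRU cache holding cache_size entries."""
    cache = OrderedDict()
    hits = total = 0
    for key in keys:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        elif cache_size > 0:
            cache[key] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / total if total else 0.0

def mini_sim_hit_ratio(keys, cache_size, rate):
    """Estimate the hit ratio of a cache_size cache from a scaled-down simulation."""
    sampled = [k for k in keys if is_sampled(k, rate)]
    scaled_size = max(1, round(cache_size * rate))  # assumed rounding rule
    return lru_hit_ratio(sampled, scaled_size)
```

Running `mini_sim_hit_ratio` for several cache sizes gives an approximate miss ratio curve at a fraction of the cost of full simulations, at the price of some sampling error.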

GitHub

I/O Workload Classification using CNN

Used a Convolutional Neural Network (CNN) to classify block traces. Each trace was converted into access plot images, with block addresses on the y-axis and time on the x-axis, and these images were then classified by the CNN.
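To make the trace-to-image step concrete, here is a minimal sketch that rasterizes a trace into a binary access plot. The image resolution, the binary pixel values, and the (timestamp, block_addr) input layout are assumptions for illustration. The original project may have used a different encoding.

```python
def access_plot(accesses, width=64, height=64):
    """Rasterize (timestamp, block_addr) pairs into a height x width grid of 0/1.

    Time maps to the x-axis (columns) and block address to the y-axis (rows),
    with the lowest address on the bottom row.
    """
    grid = [[0] * width for _ in range(height)]
    if not accesses:
        return grid
    times = [t for t, _ in accesses]
    addrs = [a for _, a in accesses]
    t_min, t_span = min(times), max(times) - min(times)
    a_min, a_span = min(addrs), max(addrs) - min(addrs)
    for t, a in accesses:
        col = int((t - t_min) / t_span * (width - 1)) if t_span else 0
        row = int((a - a_min) / a_span * (height - 1)) if a_span else 0
        grid[height - 1 - row][col] = 1  # row 0 is the top of the image
    return grid
```

A sequential scan shows up as a diagonal line in such a grid, while random access fills it with scattered points, which is the kind of visual pattern a CNN can pick up.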

PDF

MT Cache Prediction

We trained a random forest regression model on performance data from block trace replay to predict the optimal split of budget between DRAM and SSD for multi-tier caching.

PDF

KubeCacheFS

Developed an in-memory FUSE filesystem to cache data from Kubernetes volumes. It allows users to customize the replacement policy and selectively cache directories.

GitHub PDF

Using Smart Agents to Improve Connectivity in a Segmented Multi-Radio Wireless Network using SDNs

Because they are inexpensive compared to other network models, wireless mesh networks (WMNs) are well suited to connecting underprivileged areas to the global network. We analyze how to improve the connectivity of WMNs using a limited number of multi-radio nodes, which are expensive.

PDF

Trace Analysis Website

Created a website to extract features from file system traces and display them. Trace files were stored in S3, processed in chunks using Lambda functions, and the metadata was stored in DynamoDB. The trace features were visualized using d3.js.

Publications

Guiding Simulations of Multi-Tier Storage Caches Using Knee Detection

Tyler Estro, Mário Antunes, Pranav Bhandari, Anshul Gandhi, Geoff Kuenning, Yifei Liu, Carl Waldspurger, Avani Wildani and Erez Zadok

31st International Symposium on the Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2023)

Desperately Seeking ... Optimal Multi-Tier Cache Configurations

Tyler Estro, Pranav Bhandari, Avani Wildani, Erez Zadok

12th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage '20)

PDF

Shuffled Frog Leaping Algorithm for 0/1 Knapsack Problem on the GPU

Pranav Bhandari, Rahul Chandrashekhar, Peter Yoon

CSC'15 - The 2015 International Conference on Scientific Computing, Las Vegas, Nevada

PDF

Bio

Aug 2017 - Research Assistant
Department of Computer Science, Emory University
May 2020 - Aug 2020 Research Intern
IBM
Jan 2018 - May 2019 Teaching Assistant
CS170 - Introduction to Computer Science (Spring 2018)
CS323 - Algorithms (Fall 2018, Spring 2019)
Department of Computer Science, Emory University
Jan 2016 - Jan 2017 Software Engineering Intern
CivicLift
Oct 2015 - Feb 2016 Student Apprentice
Independent Software
May 2013 - May 2014 Research Assistant
Department of Computer Science, Trinity College

Education

Doctor of Philosophy in Computer Science and Informatics
2017 - 2020 Emory University
Master of Computer Science and Informatics
2013 - 2017 Trinity College
Bachelor of Science in Computer Science and Mathematics

Contact

400 Dowman Drive
Department of Computer Science
Emory University
Atlanta, GA 30322