publications
publications by categories in reversed chronological order. generated by jekyll-scholar.
2026
- To Adapt or Not to Adapt, That is the Ski QuestionYuvaraj Chesetti, and Prashant PandeyProc. ACM Manag. Data, May 2026
Adaptive filters, unlike traditional filters, update their representation upon detecting a false positive to avoid repeating the same error in the future. Adaptive filters require an auxiliary structure, typically much larger than the main filter and often residing on slow storage, to facilitate adaptation. Each adaptation requires a disk I/O, creating overhead that must be amortized across repeated false positives. On highly skewed or adversarial workloads, this overhead is well amortized over repeated false positives. However, on uniform random work- loads, adaptive filters experience noticeable performance degradation as false positives do not repeat often, making adaptivity an unnecessary overhead. In this paper, we address this fundamental limitation in adaptive filters by establishing a connection to the online ski rental problem, a classic online decision problem where one must repeatedly choose between renting or buying skis without knowing future usage. We demonstrate that adapting a false positive is equivalent to ”buying”, while accepting the false positive without adaptation corresponds to ”renting”. Using the online model, we theoretically analyze the cost of adaptivity and develop an adaptivity strategy to determine when to adapt a false positive to avoid unnecessary overheads. We introduce the SkiQF, an online adaptive filter that provides robust false positive rate guarantees through adaptivity while maintaining consistent, high performance across diverse query distributions. The SkiQF makes no assumptions about the query distribution and is 2-competitive, i.e., its total I/O never exceeds twice the optimal for any workload. In a database system using a filter to avoid unnecessary disk accesses, the SkiQF achieves up to 1.78\texttimeshigher throughput compared to the AdaptiveQF. Compared to traditional filters, the SkiQF achieves up to 1.45\texttimeshigher throughput on skewed workloads and up to 10\texttimeshigher throughput on adversarial workloads, demonstrating robustness and resilience. We also introduce the HybridSkiQF, a variant that detects the underlying query distribution and chooses an adaptivity strategy to minimize I/O. Additionally, it dynamically adjusts its strategy in response to distribution drift. When it detects repeating false positives, the HybridSkiQF switches to immediate adaptation, like a standard adaptive filter, avoiding the overhead of the SkiQF’s delayed adaptivity when immediate adaptation is optimal. The HybridSkiQF achieves up to 1.28\texttimes and 1.79\texttimeshigher throughput than the SkiQF and traditional filter respectively.
- Aeris Filter: A Strongly and Monotonically Adaptive Range FilterYuvaraj Chesetti, Navid Eslami, Huanchen Zhang, Niv Dayan, and Prashant PandeyProc. ACM Manag. Data, Apr 2026
Range filters are probabilistic data structures used to efficiently perform range emptiness queries, with applications in databases, big data analytics, and key-value stores. Modern range filters are compact and can guarantee a bounded false positive rate irrespective of the spatial skew in queries. However, existing range filters are still susceptible to temporal skew: in skewed workloads where a few queries are repeated disproportionately more often, the false positive rate of a range filter may be unbounded. We introduce the Aeris filter, an adaptive expandable range filter that guarantees a robust false positive rate irrespective of spatial or temporal skew. The Aeris filter achieves this by dynamically resolving and adapting to false positives. More specifically, the Aeris filter is monotonic adaptive, i.e., it never forgets a previously encountered false positive. The Aeris filter introduces a novel encoding scheme to implement adaptivity in a range filter with no additional space or operational overhead. Furthermore, the Aeris filter deamortizes the I/O cost to expand monotonic adaptive filters by utilizing on-disk adaptivity structures, resulting in fewer system disruptions. Experimental results demonstrate that the Aeris filter achieves up to a 10\texttimes reduction in false positive rates on skewed query distributions compared to other non-adaptive range filters. When integrated into a database, the Aeris filter delivers 1.5-8\texttimes higher throughput for adversarial workloads, and is able to deliver this high throughput using a cache of smaller size. The Aeris filter also reduces expansion overhead by up to 3\texttimes compared to the Memento filter, a spatially-robust expandable range filter. These improvements ensure scalable, efficient, and adaptive range query handling in dynamic environments.
2025
- Zombie Hashing: Reanimating Tombstones in GraveyardYuvaraj Chesetti, Benwei Shi, Jeff M. Phillips, and Prashant PandeyProc. ACM Manag. Data, Apr 2025
- Evaluating Learned Indexes for External Memory JoinsYuvaraj Chesetti, and Prashant PandeyIn 2025 Proceedings of the Conference on Applied and Computational Discrete Algorithms (ACDA), Apr 2025