Research Projects

Our research spans inference optimization, computer vision, and quantitative finance.

Completed Research

PyTorch native INT8 quantization API

Published

An INT8 tensor subclass for PyTorch that cuts memory use by up to 4× (relative to FP32) and ships with optimized CUDA/Triton kernels. Merged into TorchAO.

Quantization, PyTorch, TorchAO
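
As a rough illustration of where the 4× figure comes from, assuming the baseline is 32-bit floating-point storage: an FP32 value takes 4 bytes and an INT8 value takes 1. The sketch below is plain Python and shows symmetric per-tensor quantization only. It is not the TorchAO implementation, and the names quantize_int8 and dequantize_int8 are hypothetical, chosen for this example.

```python
from array import array


def quantize_int8(values):
    """Symmetric per-tensor quantization: x is approximated by scale * q,
    with q an integer in [-127, 127]. Hypothetical helper, not a TorchAO API."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the signed 8-bit range.
    q = array("b", (max(-127, min(127, round(v / scale))) for v in values))
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate floating-point values from the INT8 codes."""
    return [scale * x for x in q]


# Four example weights stored as 32-bit floats (4 bytes each).
weights = array("f", [0.5, -1.0, 0.25, 0.75])
q, scale = quantize_int8(weights)

fp32_bytes = weights.itemsize * len(weights)
int8_bytes = q.itemsize * len(q)
ratio = fp32_bytes / int8_bytes  # storage ratio between the two formats

# Worst-case reconstruction error is about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, dequantize_int8(q, scale)))
print(f"{fp32_bytes} bytes -> {int8_bytes} bytes, ratio {ratio}, max error {max_error:.5f}")
```

The single scale factor adds a small fixed overhead per tensor, which is one reason the reduction is described as "up to" 4× rather than exactly 4×.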

LLM Serving Frameworks Comparison

Coming Soon

Benchmarking vLLM, SGLang, and Hugging Face TGI across agentic-workflow, long-context, and high-throughput scenarios.

Benchmarking, vLLM, SGLang

Current Research

nano-vLLM

In Progress

Minimalist implementation of vLLM for educational purposes and rapid prototyping of inference optimizations.

Inference, vLLM, LLM Serving

ViT Robustness via Occlusion

In Progress

Improving Vision Transformer (ViT) robustness using occlusion generators for better generalization.

Computer Vision, ViT, Robustness

Sparse Frame Selector

In Progress

Building sparse frame selection methods for faster video reasoning and efficient temporal understanding.

Computer Vision, Video, Efficiency

Prediction Market Analysis

In Progress

Using LLMs for prediction market analysis and probabilistic forecasting.

Quant, LLM, Markets

Event-Based Trading Agents

In Progress

Developing LLM-powered trading agents that react to market events and news.

Quant, Agents, Trading

Interested in collaborating on research? Contact us at daniel@aerlabs.tech or shubham@aerlabs.tech