Research Projects
Our v0 batch research initiatives focus on advancing AI inference optimization, quantization techniques, and efficient serving infrastructure.
Our First Research Cohort
The v0 batch represents our inaugural research cohort, bringing together talented researchers to tackle fundamental challenges in AI systems. Our focus areas include quantization techniques for efficient inference, comparative analysis of serving frameworks, and optimization strategies for production deployment. Each project combines rigorous experimentation with practical implementation, contributing to both academic understanding and open-source tooling.
Research Areas
- Quantization & Compression: INT8 quantization techniques, memory optimization, and hardware-aware model compression
- Serving Infrastructure: Comparative analysis of vLLM, SGLang, HuggingFace TGI across diverse workload patterns
- Production Optimization: Kernel development, profiling methodologies, and real-world deployment strategies
Active Projects
PyTorch native INT8 quantization API for TorchAO
A quantized tensor subclass enabling INT8 inference for neural networks through seamless PyTorch integration. It supports dynamic activation quantization (INT8×INT8) and weight-only quantization (FP16/BF16×INT8) with optimized kernels for CPU and CUDA, reducing memory footprint by up to 4× while maintaining model accuracy. Custom CUDA/Triton kernel development and comprehensive benchmarking against Hugging Face and vLLM baselines are in progress.
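The 4× memory reduction comes from storing each weight as one INT8 byte (plus a per-row FP32 scale) instead of four FP32 bytes. A minimal pure-Python sketch of symmetric per-row quantization and dequantization; this illustrates the arithmetic only and is not the TorchAO tensor-subclass API:

```python
# Sketch: symmetric per-row INT8 weight quantization, the scheme
# behind the up-to-4x memory reduction described above.
# Illustrative only; not the TorchAO API.

def quantize_int8(row):
    """Map a row of FP32 weights to INT8 values plus one FP32 scale."""
    # Scale so the largest-magnitude weight maps to +/-127;
    # the `or 1.0` guards against an all-zero row.
    scale = max(abs(w) for w in row) / 127.0 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in row]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate FP32 weights from the INT8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
print(q)       # [50, -127, 0, 100]
print(approx)  # each value within scale/2 of the original
```

Each element now costs 1 byte instead of 4; the single scale per row is the only extra overhead, and the worst-case rounding error per weight is half the scale.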
Comparative Analysis of LLM Serving Frameworks
A comprehensive benchmarking study comparing vLLM, SGLang, and HuggingFace TGI across diverse workload patterns, including agentic workflows, long-context processing, and high-throughput scenarios. The research investigates how architectural differences, such as SGLang's RadixAttention versus vLLM's PagedAttention, impact performance metrics (TTFT, TPOT, throughput) under varying conditions.
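The latency metrics named above can be made concrete. A small sketch computing TTFT (time to first token) and TPOT (time per output token) from per-token arrival timestamps for a single streamed request; the timestamps are hypothetical and the calculation is framework-agnostic, not tied to any one serving API:

```python
# Sketch: deriving TTFT and TPOT for one streamed request from
# per-token arrival timestamps (seconds since the request was sent).
# Timestamps here are hypothetical example data.

def ttft(token_times):
    """Time to first token: arrival time of the first output token."""
    return token_times[0]

def tpot(token_times):
    """Time per output token: mean inter-token gap after the first token."""
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    return sum(gaps) / len(gaps)

# First token after 120 ms, then one token every 30 ms.
times = [0.120, 0.150, 0.180, 0.210, 0.240]
print(f"TTFT: {ttft(times) * 1000:.0f} ms")  # 120 ms
print(f"TPOT: {tpot(times) * 1000:.0f} ms")  # 30 ms
```

TTFT is dominated by prefill (and so by prompt length and KV-cache reuse, where RadixAttention-style prefix caching helps), while TPOT reflects steady-state decode throughput, which is where memory-efficient KV-cache management such as PagedAttention matters most.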
Interested in collaborating on research? Contact us at daniel@aerlabs.tech or shubham@aerlabs.tech.