Research
Our work across inference optimization, computer vision, and quantitative finance.
-
PyTorch native INT8 quantization API Published
INT8 tensor subclass for PyTorch enabling up to 4x memory reduction with optimized CUDA/Triton kernels. Merged into TorchAO.
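The 4x figure comes from storing weights as one byte each instead of four (float32), plus a per-tensor scale. A minimal NumPy sketch of symmetric per-tensor INT8 quantization (an illustration of the general technique, not the TorchAO implementation):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor INT8 quantization: one float scale plus
    int8 values, ~4x smaller than the float32 original."""
    max_abs = float(np.abs(w).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
assert q.nbytes * 4 == w.nbytes                 # 4x memory reduction
# round-to-nearest error is at most half a quantization step
assert np.max(np.abs(dequantize_int8(q, scale) - w)) <= scale / 2 + 1e-6
```

In practice the kernels fuse the dequantize into the matmul so the weights stay INT8 end to end.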
-
nano-vLLM Published
Educational LLM inference engine built from scratch. Covers PagedAttention, continuous batching, chunked prefill, and request scheduling, with detailed C++ implementations.
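The core idea behind PagedAttention is to map each sequence's logical KV-cache positions onto fixed-size physical blocks drawn from a shared pool, so memory is allocated on demand rather than reserved up front. A minimal Python sketch of that block-table bookkeeping (names and sizes are illustrative, not the engine's actual API):

```python
BLOCK_SIZE = 16  # tokens per KV-cache block (illustrative choice)

class BlockAllocator:
    """Hands out physical block ids from a fixed shared pool."""
    def __init__(self, num_blocks: int):
        self.free = list(range(num_blocks))

    def alloc(self) -> int:
        return self.free.pop()

class Sequence:
    """Maps logical token positions to physical cache blocks."""
    def __init__(self, allocator: BlockAllocator):
        self.allocator = allocator
        self.block_table = []  # logical block index -> physical block id
        self.num_tokens = 0

    def append_token(self) -> None:
        # allocate a new block only when the current one is full (or absent)
        if self.num_tokens % BLOCK_SIZE == 0:
            self.block_table.append(self.allocator.alloc())
        self.num_tokens += 1

alloc = BlockAllocator(num_blocks=8)
seq = Sequence(alloc)
for _ in range(33):  # 33 tokens need ceil(33/16) = 3 blocks
    seq.append_token()
assert len(seq.block_table) == 3
```

Because blocks are allocated lazily, sequences of very different lengths can share the same pool, which is what makes continuous batching practical.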
-
ViT Robustness via Occlusion In progress
Improving Vision Transformer (ViT) robustness using occlusion generators for better generalization.
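The simplest form of occlusion augmentation masks out a random patch during training so the model cannot rely on any single region. A minimal cutout-style sketch (a generic illustration, not the project's actual generator):

```python
import numpy as np

def occlude(img: np.ndarray, size: int, rng: np.random.Generator) -> np.ndarray:
    """Zero out a random size x size square of the image (cutout-style)."""
    h, w = img.shape[:2]
    y = rng.integers(0, h - size + 1)
    x = rng.integers(0, w - size + 1)
    out = img.copy()
    out[y:y + size, x:x + size] = 0.0
    return out

rng = np.random.default_rng(0)
img = np.ones((32, 32), dtype=np.float32)
aug = occlude(img, size=8, rng=rng)
assert aug.sum() == 32 * 32 - 8 * 8  # exactly one 8x8 patch removed
assert img.sum() == 32 * 32          # original untouched
```

Learned or adversarial occlusion generators extend this idea by choosing where to mask rather than sampling uniformly.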
-
Sparse Frame Selector In progress
Building sparse frame selection methods for faster video reasoning and efficient temporal understanding.
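The core of sparse frame selection is scoring frames for relevance and keeping only the top-k, while preserving temporal order so the downstream model still sees a coherent sequence. A minimal sketch (the scores here stand in for whatever relevance model is used):

```python
def select_frames(scores: list[float], k: int) -> list[int]:
    """Pick the k highest-scoring frame indices, returned in temporal order."""
    topk = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(topk)  # re-sort so frames stay chronological

# e.g. keep 3 of 6 frames: indices 1, 3, 5 score highest
scores = [0.1, 0.9, 0.2, 0.8, 0.05, 0.7]
assert select_frames(scores, 3) == [1, 3, 5]
```

Processing 3 frames instead of 6 roughly halves the attention cost over the video, which is where the speedup comes from.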
-
Prediction Market Analysis In progress
Using LLMs for prediction market analysis and probabilistic forecasting.
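Evaluating probabilistic forecasts against market outcomes typically uses a proper scoring rule such as the Brier score, which rewards calibrated probabilities rather than confident guesses. A minimal sketch:

```python
def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared error between forecast probabilities and binary
    outcomes; lower is better (0 = perfect, 0.25 = always saying 0.5)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# a sharp, well-calibrated forecaster beats the uninformative 0.5 baseline
a = brier_score([0.9, 0.2, 0.8], [1, 0, 1])
b = brier_score([0.5, 0.5, 0.5], [1, 0, 1])
assert a < b
assert abs(b - 0.25) < 1e-9
```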
-
Event-Based Trading Agents In progress
Developing LLM-powered trading agents that react to market events and news.
Interested in collaborating on research? Contact us at daniel@aerlabs.tech or shubham@aerlabs.tech