Featured
Pages: 168, Paperback, Independently published
AI Inference Optimization Engineering: Quantization, Speculative Decoding, and Hardware-Specific LLM Deployment
NEURAL PROCESSING UNITS: THE COMPLETE GUIDE TO AI ACCELERATION HARDWARE: TOPS Performance,...
LLM Inference Engineering: Quantization, KV-Cache Optimization, and High-Throughput Serving: A Production Engineer's...
LOCAL LLM DEPLOYMENT: Training, Fine-Tuning, & Offline Inference: The Complete Developer’s Guide...