Description
Elevate your AI system's performance with this definitive guide to maximizing efficiency across every layer of your AI infrastructure. In today's era of ever-larger generative models, AI Systems Performance Engineering gives engineers, researchers, and developers a hands-on set of actionable optimization strategies. Learn to co-optimize hardware, software, and algorithms to build resilient, scalable, and cost-effective AI systems that excel in both training and inference. Authored by Chris Fregly, a performance-focused engineering and product leader, this resource transforms complex AI systems into streamlined, high-impact solutions.
Inside, you’ll discover step-by-step methodologies for fine-tuning CUDA GPU kernels, PyTorch-based algorithms, and multinode training and inference systems. You’ll also master the art of scaling GPU clusters for high-performance distributed model training jobs and inference servers. The book ends with a checklist of more than 175 proven, ready-to-use optimizations.
- Codesign and optimize hardware, software, and algorithms to achieve maximum throughput and cost savings
- Implement cutting-edge inference strategies that reduce latency and boost throughput in real-world settings
- Utilize industry-leading scalability tools and frameworks
- Profile, diagnose, and eliminate performance bottlenecks across complex AI pipelines
- Integrate full stack optimization techniques for robust, reliable AI system performance
From the Preface
In the vibrant streets of San Francisco, where innovation is as common as autonomous vehicle traffic on US Route 101, we find ourselves surrounded by an amazing world of artificial intelligence. Rapid advancements in AI are redefining every aspect of our daily lives. Over the last few decades, we’ve seen recommendation engines (2000s), AI assistants (2010s), and fully autonomous vehicles (2020s). The 2030s are going to be even more exciting, as AI is progressing extremely quickly and with massive societal influence.
My personal journey into the fast-moving AI systems performance engineering field was driven by a curiosity to understand the delicate balance and codesign between cutting-edge hardware, highly optimized software, and clever algorithms that power such complex systems and impactful use cases. This curiosity inspired me to dive deep into the realm of “full-stack” AI performance engineering. I wanted to understand how components like processors, memory architectures, network interconnects, operating systems, and software frameworks all work together in harmony. The complexity of these interactions presented the challenges—and opportunities—that fueled my desire to explore this unique combination of technologies.
This book is a realization of my explorations throughout the years as a hands-on ML and AI performance engineer. I created this book for engineers, researchers, practitioners, and enthusiasts who are eager to understand the underpinnings of AI systems performance at all levels. Readers might be building AI applications, optimizing neural network training strategies, or designing and managing scalable inference servers; they may also simply be fascinated by the mechanics of modern AI systems. Overall, this book provides insights that bridge theory and practice across multiple disciplines.
The reader of this book likely has a foundational understanding of neural networks and a basic familiarity with Python and ML. However, even without these fundamentals, a curious reader can follow the multidimensional codesign performance narrative rooted in first principles across hardware, software, and algorithms. I promise there is something in this book for every type of reader—and every reader will learn a few new things in these pages.
Throughout the chapters, we examine the evolution of hardware architectures, dive into the nuances of software optimization, and explore real-world case studies that highlight the patterns and best practices of building both high-performance and cost-efficient AI systems. Each section is designed to build upon the last, covering everything from foundational concepts to advanced applications.
Review
“AI systems are layered and fast-moving. Chris breaks the complexity down into a reference that will set the standard for years.”
–Chris Lattner, CEO at Modular
“CUDA kernels, distributed training, compilers, disaggregated inference—finally in one place. An encyclopedia of ML systems.”
–Mark Saroufim, PyTorch at Meta (and Founder of GPU MODE Community)
“Squeezing the most performance out of your AI system is what separates the good from the great. This is the missing manual.”
–Sebastian Raschka, ML/AI Researcher
“An essential guide to modern ML systems—grounded in vLLM and distributed systems—with deep insight into inference optimization and open source.”
–Michael Goin, vLLM Maintainer and Principal Engineer at Red Hat
“A definitive field guide that connects silicon to application, giving AI engineers the full-stack wisdom to turn raw compute into high-performance models.”
–Harsh Banwait, Director of Product at CoreWeave
Book details
- Author : Chris Fregly
- Publisher : O’Reilly Media
- Publication date : December 16, 2025
- Edition : 1st
- Print length : 1058 pages
- Language : English
- Format : Paperback