GPU and reference


HF LinkedIn post

Here's the mix of AWS instances we currently run our serverless Inference API on.

For context, the Inference API is the infra service that powers the widgets on Hugging Face Hub model pages. PRO users and Enterprise orgs can also use it programmatically.

64 g4dn.2xlarge
48 g5.12xlarge
48 g5.2xlarge
10 p4de.24xlarge
42 r6id.2xlarge
9 r7i.2xlarge
6 m6a.2xlarge (control plane and monitoring)
Total = 229 instances
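
As a quick sanity check on the list, here is a minimal Python sketch that tallies the fleet. The instance counts are copied from the post. The GPUs-per-instance figures are an assumption taken from public AWS instance specs (1 T4 on g4dn.2xlarge, 1 A10G on g5.2xlarge, 4 A10G on g5.12xlarge, 8 A100 on p4de.24xlarge), not something the post states. The r6id, r7i and m6a types are treated as CPU-only.

```python
# Instance counts, copied from the list in the post.
fleet = {
    "g4dn.2xlarge": 64,
    "g5.12xlarge": 48,
    "g5.2xlarge": 48,
    "p4de.24xlarge": 10,
    "r6id.2xlarge": 42,
    "r7i.2xlarge": 9,
    "m6a.2xlarge": 6,  # control plane and monitoring
}

# GPUs per instance: assumption from public AWS specs, not from the post.
gpus_per_instance = {
    "g4dn.2xlarge": 1,   # NVIDIA T4
    "g5.12xlarge": 4,    # NVIDIA A10G
    "g5.2xlarge": 1,     # NVIDIA A10G
    "p4de.24xlarge": 8,  # NVIDIA A100 80GB
}

# Sum over all instance types, then over GPU-backed types only.
total_instances = sum(fleet.values())
gpu_instances = sum(n for name, n in fleet.items() if name in gpus_per_instance)
total_gpus = sum(n * gpus_per_instance.get(name, 0) for name, n in fleet.items())

print(f"instances listed: {total_instances}")
print(f"GPU instances:    {gpu_instances}")
print(f"GPUs (assumed):   {total_gpus}")
```

The listed counts add up to 227 instances, two fewer than the stated total of 229, so either one count or the total in the post is slightly off. Under the per-instance assumptions above, the 170 GPU instances would carry 384 GPUs.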

