Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) G7e instances, which deliver cost-effective performance for generative AI inference workloads and the highest performance for graphics workloads.
G7e instances are accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs and are well suited to a broad range of GPU-enabled workloads, including spatial computing and scientific computing. G7e instances deliver up to 2.3 times the inference performance of G6e instances.
Improvements over the previous generation:
- NVIDIA RTX PRO 6000 Blackwell GPUs: NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs offer two times the GPU memory and 1.85 times the GPU memory bandwidth compared to G6e instances. By using the higher GPU memory offered by G7e instances, you can run medium-sized models of up to 70B parameters with FP8 precision on a single GPU.
- NVIDIA GPUDirect P2P: For models that are too large to fit into the memory of a single GPU, you can split the model or computations across multiple GPUs. G7e instances reduce the latency of your multi-GPU workloads with support for NVIDIA GPUDirect P2P, which enables direct communication between GPUs over the PCIe interconnect. These instances offer the lowest peer-to-peer latency for GPUs on the same PCIe switch. Additionally, G7e instances offer up to four times the inter-GPU bandwidth compared to the L40S GPUs featured in G6e instances, boosting the performance of multi-GPU workloads. These improvements mean you can run inference for larger models across multiple GPUs, with up to 768 GB of GPU memory in a single node.
- Networking: G7e instances offer four times the networking bandwidth compared to G6e instances, which means you can use the instance for small-scale multi-node workloads. Additionally, multi-GPU G7e instances support NVIDIA GPUDirect Remote Direct Memory Access (RDMA) with Elastic Fabric Adapter (EFA), which reduces the latency of remote GPU-to-GPU communication for multi-node workloads. These instance sizes also support NVIDIA GPUDirect Storage with Amazon FSx for Lustre, which increases throughput by up to 1.2 Tbps to the instances compared to G6e instances, which means you can quickly load your models.
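To see why a 70B-parameter model at FP8 fits on one 96 GB GPU, here is a rough back-of-the-envelope sketch in Python. It counts model weights only, at one byte per parameter for FP8. The `headroom` fraction reserved for KV cache, activations, and runtime overhead is an illustrative assumption, not an AWS or NVIDIA figure, and the function names are hypothetical.

```python
import math

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp8": 1, "fp16": 2, "fp32": 4}

# Per-GPU memory stated for G7e instances (96 GB per GPU).
GPU_MEMORY_GB = 96


def weight_memory_gb(params_billion, precision="fp8"):
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    return params_billion * BYTES_PER_PARAM[precision]


def min_gpus_for_weights(params_billion, precision="fp8", headroom=0.2):
    """Smallest GPU count whose usable memory holds the weights.

    `headroom` is an assumed fraction of each GPU kept free for KV cache,
    activations, and runtime overhead; tune it for your workload.
    """
    usable_gb = GPU_MEMORY_GB * (1 - headroom)
    return math.ceil(weight_memory_gb(params_billion, precision) / usable_gb)


print(weight_memory_gb(70, "fp8"))       # 70 GB of weights
print(min_gpus_for_weights(70, "fp8"))   # 1
print(min_gpus_for_weights(70, "fp16"))  # 2
```

Real memory use depends on batch size, context length, and the serving framework, so treat the result as a lower bound rather than a sizing guarantee.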
EC2 G7e specifications
G7e instances feature up to 8 NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs with up to 768 GB of total GPU memory (96 GB of memory per GPU) and Intel Emerald Rapids processors. They also support up to 192 vCPUs, up to 1,600 Gbps of network bandwidth, up to 2,048 GiB of system memory, and up to 15.2 TB of local NVMe SSD storage.
Here are the specifications:
| Instance name | GPUs | GPU memory (GB) | vCPUs | Memory (GiB) | Storage (TB) | EBS bandwidth (Gbps) | Network bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| g7e.2xlarge | 1 | 96 | 8 | 64 | 1.9 x 1 | Up to 5 | 50 |
| g7e.4xlarge | 1 | 96 | 16 | 128 | 1.9 x 1 | 8 | 50 |
| g7e.8xlarge | 1 | 96 | 32 | 256 | 1.9 x 1 | 16 | 100 |
| g7e.12xlarge | 2 | 192 | 48 | 512 | 3.8 x 1 | 25 | 400 |
| g7e.24xlarge | 4 | 384 | 96 | 1024 | 3.8 x 2 | 50 | 800 |
| g7e.48xlarge | 8 | 768 | 192 | 2048 | 3.8 x 4 | 100 | 1600 |
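If you want to choose a size programmatically, the table above can be encoded as plain data. The following is a minimal sketch: the `G7eSize` type and the `smallest_size_for` helper are hypothetical names introduced for illustration, and the values are copied from the table rather than fetched from any AWS API.

```python
from typing import NamedTuple, Optional


class G7eSize(NamedTuple):
    name: str
    gpus: int
    gpu_memory_gb: int
    vcpus: int
    memory_gib: int
    network_gbps: int


# Values transcribed from the specifications table, ordered smallest to largest.
G7E_SIZES = [
    G7eSize("g7e.2xlarge", 1, 96, 8, 64, 50),
    G7eSize("g7e.4xlarge", 1, 96, 16, 128, 50),
    G7eSize("g7e.8xlarge", 1, 96, 32, 256, 100),
    G7eSize("g7e.12xlarge", 2, 192, 48, 512, 400),
    G7eSize("g7e.24xlarge", 4, 384, 96, 1024, 800),
    G7eSize("g7e.48xlarge", 8, 768, 192, 2048, 1600),
]


def smallest_size_for(gpu_memory_gb: float, min_vcpus: int = 0) -> Optional[G7eSize]:
    """Return the first (smallest) size meeting both requirements, or None."""
    for size in G7E_SIZES:
        if size.gpu_memory_gb >= gpu_memory_gb and size.vcpus >= min_vcpus:
            return size
    return None


# Example: a workload needing about 150 GB of total GPU memory.
choice = smallest_size_for(150)
print(choice.name if choice else "no single G7e node is large enough")
```

A linear scan is enough here because the list is short and already sorted by size. Pricing is deliberately left out, since it is not part of the table.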
To get started with G7e instances, you can use the AWS Deep Learning AMIs (DLAMI) for your machine learning (ML) workloads. To run instances, you can use the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDKs. For a managed experience, you can use G7e instances with Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service (Amazon EKS). Support for Amazon SageMaker AI is also coming soon.
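As one illustration of the SDK route, the sketch below uses the AWS SDK for Python (boto3) and its EC2 `run_instances` call. The AMI ID is a placeholder, not a real DLAMI ID, and the helper names, the tag value, and the default Region are assumptions for the example. Look up a current DLAMI ID for your Region before launching anything.

```python
# Placeholder only: replace with a real Deep Learning AMI ID for your Region.
PLACEHOLDER_AMI_ID = "ami-0123456789abcdef0"


def build_run_instances_request(ami_id, instance_type="g7e.2xlarge", count=1):
    """Build keyword arguments for the EC2 RunInstances API call."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        "TagSpecifications": [
            {
                "ResourceType": "instance",
                # Hypothetical tag value, used only to label the example instance.
                "Tags": [{"Key": "Name", "Value": "g7e-inference-example"}],
            }
        ],
    }


def launch(request, region="us-east-1"):
    """Send the request to EC2. Needs boto3 and AWS credentials; incurs cost."""
    import boto3  # imported here so building a request works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.run_instances(**request)


request = build_run_instances_request(PLACEHOLDER_AMI_ID)
print(request["InstanceType"], request["MinCount"])
```

Calling `launch(request)` starts a billable instance, so the example stops at building and printing the request. Separating the two steps also lets you review or log the parameters before anything is created.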
Now available
Amazon EC2 G7e instances are available today in the US East (N. Virginia) and US East (Ohio) AWS Regions. For Regional availability and a future roadmap, search for the instance type in the CloudFormation resources tab of AWS Capabilities by Region.
The instances can be purchased as On-Demand Instances, Savings Plan, and Spot Instances. G7e instances are also available as Dedicated Instances and on Dedicated Hosts. To learn more, visit the Amazon EC2 Pricing page.
Give G7e instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 G7e instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.
— Channy

