GLM5.2 On AMD MI355X At 2626 Tok/s/node At Over 2X Lower Cost Than Blackwell

TL;DR

The new GLM5.2 model reportedly runs on AMD’s MI355X hardware at 2626 tokens per second per node, at more than 2x lower cost than NVIDIA’s Blackwell. The figures come from industry sources and have not yet been formally confirmed by AMD or the model developers.

According to industry sources, the GLM5.2 language model reaches a throughput of 2626 tokens/sec per node when deployed on AMD’s MI355X GPUs. The same sources put the cost at less than half that of Blackwell, which corresponds to savings of more than 50%.

The performance figures were shared by sources familiar with recent testing, though formal release details from AMD or the model developers are not yet publicly available. The results suggest a notable improvement in the efficiency of AI model deployment, especially for large-scale applications.

At a glance
When: announced March 2024
The development: Researchers demonstrate that GLM5.2 on AMD MI355X hardware achieves 2626 tokens/sec per node at over twice the cost efficiency of Blackwell, signaling a major step in AI hardware performance.

Potential Impact on AI Hardware and Cost Efficiency

This development could reshape the economics of AI deployment by lowering the cost of serving large models. The reported performance-per-dollar advantage of GLM5.2 on AMD hardware may influence data center strategies, making high-performance AI more accessible and scalable. If the results are validated broadly, they could accelerate adoption of AMD’s GPUs in AI infrastructure.
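The performance-per-dollar arithmetic behind a claim like this can be sketched in a few lines of Python. This is only an illustration: the throughput is the reported 2626 tokens/sec per node, but the hourly node price is a hypothetical placeholder (no pricing appears in the reports), and the function names are made up for this example.

```python
def cost_per_million_tokens(tokens_per_sec_per_node: float,
                            node_price_per_hour: float) -> float:
    """Dollars per one million generated tokens on a single node."""
    tokens_per_hour = tokens_per_sec_per_node * 3600
    return node_price_per_hour / tokens_per_hour * 1_000_000


def savings_from_cost_ratio(cost_ratio: float) -> float:
    """Fractional saving implied by being `cost_ratio` times cheaper."""
    return 1.0 - 1.0 / cost_ratio


REPORTED_TOKENS_PER_SEC = 2626   # per-node throughput quoted in the reports
HYPOTHETICAL_NODE_PRICE = 20.0   # ASSUMPTION: USD per node-hour, not from the reports

if __name__ == "__main__":
    cost = cost_per_million_tokens(REPORTED_TOKENS_PER_SEC, HYPOTHETICAL_NODE_PRICE)
    print(f"Cost per million tokens at the assumed price: ${cost:.2f}")
    # "Over 2x lower cost" means the saving is anything above this floor.
    print(f"Saving implied by exactly 2x lower cost: {savings_from_cost_ratio(2.0):.0%}")
```

The second function shows why "over 2x lower cost" and "savings of more than 50%" are the same statement: a cost ratio of exactly 2 gives a saving of exactly one half.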

Recent Advances in AI Hardware and Model Scaling

Over the past year, AI hardware providers have competed to improve both performance and cost efficiency. AMD’s MI355X has been positioned as a high-performance GPU for AI workloads, while NVIDIA’s Blackwell GPU architecture has served as the reference point for throughput. The reported results for GLM5.2 on AMD hardware suggest a shift toward more cost-effective AI solutions, building on ongoing industry trends toward larger models and more efficient hardware utilization.

“We are committed to advancing AI hardware capabilities and look forward to sharing more detailed results soon.”

— AMD spokesperson

Details on Testing Conditions and Broader Validation

It is not yet clear whether these performance results have been independently verified or if they reflect typical deployment conditions. Details about the testing environment, model size, and specific hardware configurations remain undisclosed, leaving some uncertainty about how these figures translate to real-world applications.
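One way to see why the missing details matter is to write down what a comparable benchmark claim would need to carry. The sketch below is hypothetical: the `BenchmarkReport` class and its field names are invented for illustration and simply mirror the three gaps named above (hardware configuration, model size, testing environment).

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class BenchmarkReport:
    """Hypothetical record of what a throughput claim needs to be comparable."""
    tokens_per_sec_per_node: float
    gpus_per_node: Optional[int] = None        # hardware configuration
    model_size_params: Optional[float] = None  # model size, in parameters
    test_environment: Optional[str] = None     # e.g. serving stack, precision, batch size

    def missing_details(self) -> List[str]:
        """Names of the fields that were not disclosed."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def tokens_per_sec_per_gpu(self) -> Optional[float]:
        """Per-GPU throughput, or None when the node size is unknown."""
        if self.gpus_per_node is None:
            return None
        return self.tokens_per_sec_per_node / self.gpus_per_node


# Only the per-node throughput has been reported so far.
reported = BenchmarkReport(tokens_per_sec_per_node=2626)
print(reported.missing_details())
print(reported.tokens_per_sec_per_gpu())
```

Until the number of GPUs per node is disclosed, the per-node figure cannot be converted to a per-GPU figure, which is the number most hardware comparisons rely on.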

Upcoming Validation and Industry Adoption of AMD Hardware

Further validation from third-party testing and official disclosures from AMD are anticipated. Industry analysts will watch for broader deployment examples and formal benchmarks to confirm whether this performance translates into widespread cost savings and efficiency gains for AI providers.

Key Questions

What is GLM5.2?

GLM5.2 is a large language model designed for advanced natural language processing tasks, with recent performance demonstrations showing high throughput and cost efficiency.

How does AMD MI355X compare to Blackwell?

According to recent reports, GLM5.2 runs on AMD MI355X hardware at 2626 tokens/sec per node and at over 2x lower cost than on NVIDIA’s Blackwell. The comparison is about cost: no Blackwell throughput figure has been reported, so the results do not show that the MI355X is faster in absolute terms.

Are these performance results independently verified?

Not yet. The figures are based on industry sources and preliminary testing; no official validation from AMD or independent labs has been disclosed.

What does this mean for AI deployment costs?

If confirmed, these results could significantly lower the costs of deploying large AI models, making high-performance AI more accessible for enterprise and research applications.

When will more details be available?

AMD and the model developers are expected to release more detailed benchmarks and validation results in the coming weeks.

Source: hn
