
Monday, March 11, 2024

New top story on Hacker News: Who uses Google TPUs for inference in production?

Who uses Google TPUs for inference in production?
19 points by arthurdelerue | 2 comments on Hacker News.
I am really puzzled by TPUs. I've been reading everywhere that TPUs are powerful and a great alternative to NVIDIA. I have been playing with TPUs for a couple of months now, and to be honest I don't understand how people can use them in production for inference:

- almost no resources online showing how to run modern generative models like Mistral, Yi 34B, etc. on TPUs
- poor compatibility between JAX and PyTorch
- very hard to understand the memory consumption of the TPU chips (no nvidia-smi equivalent; see the sketch after this list)
- rotating IP addresses on TPU VMs
- almost impossible to get my hands on a TPU v5

Is it only me? Or did I miss something? I totally understand that TPUs can be useful for training, though.
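On the "no nvidia-smi equivalent" point, one partial workaround is to query per-device memory statistics directly from JAX. The following is a minimal sketch, assuming a TPU VM with a recent JAX release where Device.memory_stats() is available on the TPU backend; the exact field names can vary between versions.

```python
# Sketch: print per-chip HBM usage on a TPU VM using JAX.
# Assumes Device.memory_stats() is supported by the installed JAX/TPU runtime.
import jax

for device in jax.local_devices():
    stats = device.memory_stats() or {}  # may be None on unsupported backends
    in_use = stats.get("bytes_in_use", 0)
    peak = stats.get("peak_bytes_in_use", 0)
    limit = stats.get("bytes_limit", 0)
    print(
        f"{device.device_kind} #{device.id}: "
        f"{in_use / 2**30:.2f} GiB in use, "
        f"peak {peak / 2**30:.2f} GiB, "
        f"limit {limit / 2**30:.2f} GiB"
    )
```

This only reports what the JAX runtime sees on the local host, so it is not a drop-in replacement for nvidia-smi, but it gives a rough per-chip view of HBM consumption during inference.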
