Distributing your deep learning training is no longer a question of if, but when. State-of-the-art ML models like BERT have hundreds of millions of parameters, and training such large networks can take days, if not weeks, on a single machine.