A year ago, a few of us started working on Cortex, an open-source platform for building machine learning APIs. At the outset, we assumed all of our users—and all of the companies actually applying ML in production, for that matter—would be large companies with mature data science teams.
We were wrong.
Over the last year, we’ve seen students, solo engineers, and small teams ship models to production. Just as surprisingly, these users frequently deploy large, state-of-the-art deep learning models for use in real-world applications.
A team of two, for example, recently spun up a 500 GPU inference cluster to support their application’s 10,000 concurrent users.
Not long ago, deployments at this scale happened only at companies with large budgets and lots of data. Now, any team can do it. Many factors drove this shift, but one in particular, transfer learning, stands out.
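To make the idea concrete, here is a minimal sketch of transfer learning using only NumPy. Everything in it is hypothetical: a fixed random matrix stands in for a pretrained, frozen feature extractor, and only a small task-specific head is trained on a toy dataset. In practice the frozen backbone would be a large network trained on a large corpus, but the structure is the same: reuse learned features, train only a small new layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained feature extractor. In real transfer learning,
# these weights come from a model trained on a large dataset; here they are
# just a fixed random projection, and they stay frozen throughout.
W_pretrained = rng.normal(size=(16, 8)) * 0.1

def extract_features(x):
    # Frozen backbone: no gradient updates ever touch W_pretrained.
    return np.tanh(x @ W_pretrained)

# Small task-specific dataset (toy, hypothetical).
X = rng.normal(size=(200, 16))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Only the new head is trained, so there are few parameters to fit
# and little labeled data is needed.
w_head = np.zeros(8)
b_head = 0.0
lr = 0.5

feats = extract_features(X)  # computed once; the backbone never changes
for _ in range(500):
    logits = feats @ w_head + b_head
    probs = 1.0 / (1.0 + np.exp(-logits))
    grad = probs - y  # gradient of the logistic loss w.r.t. logits
    w_head -= lr * feats.T @ grad / len(y)
    b_head -= lr * grad.mean()

preds = (1.0 / (1.0 + np.exp(-(feats @ w_head + b_head)))) > 0.5
accuracy = (preds == y.astype(bool)).mean()
```

Because the expensive part (the backbone) is reused rather than retrained, a small team can adapt a large model to its own task with modest data and compute, which is exactly what makes the deployments described above feasible.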