Inference costs too much. We’re fixing that! We've partnered with @sfcompute to launch the Large Scale Inference Batch API: up to 80% cost savings, 15+ leading models, and real-time GPU spot pricing. Learn how in our launch video: youtu.be/PZIqVuvaJgE
Learn more about how we're cutting token costs and reducing total inference spend in the blog post: modular.com/blog/sf-comput…