Run, tune, and scale generative AI in the cloud with OctoAI | OctoAI

https://octo.ai/ • Jan 16, 2024 15:27

Excerpt

OctoAI empowers app builders with a collection of end-to-end GenAI solutions to launch and scale AI applications. Get started today and receive $10 of free credit in your account.

Content

OCTOAI

Run, tune, and scale generative AI in the cloud

OctoAI delivers production-grade GenAI solutions running on the most efficient compute, empowering builders to launch the next generation of AI applications.

Make GenAI work for you

Run

“Our top priority was getting to market quickly. OctoAI simplified achieving our goals while providing the highest level of speed and reliability.”

Brian Carlson, Founder & CEO @ Storytime AI

Tune

“OctoAI’s integration has been instrumental for our customers to fine-tune their image generation while accelerating CALA's time to market.”

Dylan Pyle, Co-Founder & CTO @ CALA

Scale

“We've increased our image generation speeds by 5x with OctoAI’s low latency inferences, resulting in more usage and growth for our platform!”

Angus Russell, Founder @ NightCafe

Tap into the AI expertise builders need to succeed

OctoAI emerged from deep expertise in AI systems: hardware enablement, model acceleration, and machine learning compilation and infrastructure. Leave the complexities of scaling ML to us and focus your resources on developing an app that meets the moment.

Reliability

Our strong cloud partnerships provide ample compute capacity, while autoscaling and aggressive SLAs keep your app supported as your usage grows.

Scalability

OctoAI scales effortlessly with your app and user base, allowing you to provide the best possible user experience.

Expert Support

Ensure technical and business success by working hand-in-hand with an experienced team of customer engineers and account managers at every step.

Generate breathtaking imagery in your app

OctoAI’s Image Generation is the most performant and customizable solution for Stable Diffusion and Stable Diffusion XL. Create, store, and orchestrate model assets at scale to deliver highly differentiated end-user experiences.

Generate, classify, and summarize text with the utmost control

OctoAI is the fastest and most flexible place to leverage the best open source large language models: Code Llama, Llama 2 Chat, Mistral, and Mixtral. Build with the Llama 2 variant that best delivers for your users and business, controlling development from end to end.

OctoAI's Text Gen solution can help build a chatbot that references documents and helps with daily business needs

Run your choice of OSS, fine-tuned, or custom models performantly at scale

Save the significant engineering resources otherwise spent building deployment pipelines, and tap into OctoAI’s sophisticated ML infrastructure and efficient, scalable compute. Effortlessly bring custom models or models from popular hubs like Hugging Face.

Bring your custom models to OctoAI, showing input boxes of AI models: Bark, InstrucXL, Yolov7, Instruct BLIP, and RoBERTa

Our ML experts deliver the fastest, cheapest foundational models

The OctoAI team includes recognized leaders in ML systems, ML compilation, and hardware intrinsics who have founded widely adopted open source ML projects including Apache TVM and XGBoost. Our accelerated models are in production at hyperscalers like Microsoft where they process billions of images a month in services like Xbox.

curl -X POST 'https://your-sd-endpoint.octoai.cloud/predict' \
  -H 'content-type: application/json' \
  -H 'Authorization: Bearer {apiKey}' \
  --data '{"prompt":"an oil painting of an octopus playing chess", "width":512, "height":512, "guidance_scale":7.5, "num_images_per_prompt":1, "num_inference_steps":30, "seed":0, "negative_prompt":"frog", "solver":"DPMSolverMultistep"}' \
  > test_curl_json.out

SDXL (accelerated): <3 second image generation, over 2x faster than base model
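Hand-quoting a JSON body inside a shell string is easy to get wrong. As a minimal sketch, the same image request can be assembled in Python with the standard library so that `json.dumps` handles the quoting. The endpoint URL is the placeholder from the curl command above. The `build_image_request` helper and the `OCTOAI_API_KEY` environment variable name are hypothetical, the step count of 30 is an assumed value, and the request is only constructed here, not sent.

```python
import json
import os
import urllib.request

# Placeholder endpoint taken from the curl example; replace with your own.
ENDPOINT = "https://your-sd-endpoint.octoai.cloud/predict"


def build_image_request(prompt, api_key, **overrides):
    """Build (but do not send) a POST request for the image endpoint."""
    body = {
        "prompt": prompt,
        "width": 512,
        "height": 512,
        "guidance_scale": 7.5,
        "num_images_per_prompt": 1,
        "num_inference_steps": 30,  # assumed value for illustration
        "seed": 0,
        "negative_prompt": "frog",
        "solver": "DPMSolverMultistep",
    }
    body.update(overrides)  # caller-supplied values replace the defaults
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # OCTOAI_API_KEY is a hypothetical variable name used for illustration.
    request = build_image_request(
        "an oil painting of an octopus playing chess",
        os.environ.get("OCTOAI_API_KEY", "YOUR_API_KEY"),
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing keyword arguments such as `seed=7` overrides the matching default, which keeps one-off experiments from requiring edits to the JSON by hand.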

curl "https://my-llama-2-70b-chat-demo.octoai.run/chat/completions" \
  -H "accept: text/event-stream" \
  -H "authorization: Bearer $YOUR_TOKEN" \
  -H "content-type: application/json" \
  -d '{
    "model": "llama-2-70b-chat",
    "messages": [
      {"role": "system", "content": "Below is an instruction that describes a task. Write a response that appropriately completes the request."},
      {"role": "user", "content": "write a poem about an octopus who lives in the sea"}
    ],
    "stream": true,
    "max_tokens": 850
  }'

Llama 2 70B (accelerated): 2x performance gains on multi-GPU
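Because the chat request above sets `"stream": true` and accepts `text/event-stream`, the reply arrives as a sequence of server-sent events rather than a single JSON document. The sketch below shows one way to pull text out of such a stream. It assumes each event is a `data:` line carrying a JSON chunk shaped like OpenAI-style streaming responses (`choices[0].delta.content`) and that the stream ends with `data: [DONE]`. Neither detail is confirmed by this page, and `iter_stream_text` is a hypothetical helper name.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of server-sent-event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines, comments, other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        # Assumed chunk shape: {"choices": [{"delta": {"content": "..."}}]}
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    # Made-up sample events, used only to exercise the parser offline.
    sample = [
        'data: {"choices": [{"delta": {"content": "An octopus"}}]}',
        "",
        'data: {"choices": [{"delta": {"content": " in the sea"}}]}',
        "data: [DONE]",
    ]
    print("".join(iter_stream_text(sample)))
```

The function takes any iterable of decoded lines, so it can be fed from a list in tests or from a streaming HTTP response that is read line by line.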

Read about our work

OctoAI is now GA: The only SOC 2 certified, production-grade GenAI platform that can run, tune, and scale generative AI for your needs

We launched the public beta of OctoAI earlier this year and have been actively working with early customers to refine and enhance the product experience. Since then, we have added a number of new models (including SDXL and the Llama 2 “herd”), introduced new features (like SDXL style presets and LangChain integration), and completed the SOC 2 Type II certification. In the last few months, OctoAI has served millions of inferences in industries ranging from art to entertainment to fashion design.
