Absortio

Email → Summary → Bookmark → Email

Together AI – The AI Acceleration Cloud - Fast Inference, Fine-Tuning & Training

https://www.together.ai/ Jan 30, 2025 10:23

Excerpt

Run and fine-tune generative AI models with easy-to-use APIs and highly scalable infrastructure. Train & deploy models at scale on our AI Acceleration Cloud and scalable GPU clusters. Optimize performance and cost.

Content

The AI Acceleration Cloud

Train, fine-tune, and run inference on AI models blazing fast, at low cost, and at production scale.


End-to-end platform for the full generative AI lifecycle

Leverage pre-trained models, fine-tune them for your needs, or build custom models from scratch. Whatever your generative AI needs, Together AI offers a seamless continuum of AI compute solutions to support your entire journey.

  • Inference

    The fastest way to launch AI models (a request sketch follows this list):

    • ✔ Serverless or dedicated endpoints

    • ✔ Deploy in enterprise VPC

    • ✔ SOC 2 and HIPAA compliant

  • Fine-Tuning

    Tailored customization for your tasks

    • ✔ Complete model ownership

    • ✔ Fully tune or adapt models

    • ✔ Easy-to-use APIs

    • Full Fine-Tuning

    • LoRA Fine-Tuning

  • GPU Clusters

    Full control for massive AI workloads

    • ✔ Accelerate large model training

    • ✔ GB200, H200, and H100 GPUs

    • ✔ Pricing from $1.75 / hour
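
As a sketch of how the serverless inference endpoints above are typically called, the request below uses Together's OpenAI-compatible chat completions API. The model name and prompt are illustrative assumptions rather than content from the page.

# Minimal chat completions request against a serverless endpoint
# (model name and prompt are illustrative assumptions).
curl https://api.together.xyz/v1/chat/completions \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Llama-3-8b-chat-hf",
        "messages": [
          {"role": "user", "content": "Summarize what a serverless endpoint is."}
        ]
      }'

The same request shape would be expected to work against a dedicated endpoint, with only the model name changing.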

[Homepage performance stats: speed relative to vLLM for Llama-3 8B at full precision; cost relative to GPT-4o]

Control your IP.
Own your AI.

Fine-tune open-source models like Llama on your data and run them on Together Cloud or in a hyperscaler VPC. With no vendor lock-in, your AI remains fully under your control.

together files upload acme_corp_customer_support.jsonl
  
{
  "filename" : "acme_corp_customer_support.json",
  "id": "file-aab9997e-bca8-4b7e-a720-e820e682a10a",
  "object": "file"
}
  
  
together finetune create --training-file file-aab9997e-bca8-4b7e-a720-e820e682a10a \
  --model togethercomputer/RedPajama-INCITE-7B-Chat

together finetune create --training-file $FILE_ID \
  --model $MODEL_NAME \
  --wandb-api-key $WANDB_API_KEY \
  --n-epochs 10 \
  --n-checkpoints 5 \
  --batch-size 8 \
  --learning-rate 0.0003
{
    "training_file": "file-aab9997-bca8-4b7e-a720-e820e682a10a",
    "model_output_name": "username/togethercomputer/llama-2-13b-chat",
    "model_output_path": "s3://together/finetune/63e2b89da6382c4d75d5ef22/username/togethercomputer/llama-2-13b-chat",
    "Suffix": "Llama-2-13b 1",
    "model": "togethercomputer/llama-2-13b-chat",
    "n_epochs": 4,
    "batch_size": 128,
    "learning_rate": 1e-06,
    "checkpoint_steps": 2,
    "created_at": 1687982945,
    "updated_at": 1687982945,
    "status": "pending",
    "id": "ft-5bf8990b-841d-4d63-a8a3-5248d73e045f",
    "epochs_completed": 3,
    "events": [
        {
            "object": "fine-tune-event",
            "created_at": 1687982945,
            "message": "Fine tune request created",
            "type": "JOB_PENDING",
        }
    ],
    "queue_depth": 0,
    "wandb_project_name": "Llama-2-13b Fine-tuned 1"
}
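
Once a job like the one above is created, its status can be polled until it completes. The call below is a minimal sketch that assumes Together's REST fine-tunes endpoint; the job ID is taken from the example response above.

# Poll the fine-tuning job until "status" reaches "completed"
# (endpoint path is an assumption based on Together's public REST API).
curl https://api.together.xyz/v1/fine-tunes/ft-5bf8990b-841d-4d63-a8a3-5248d73e045f \
  -H "Authorization: Bearer $TOGETHER_API_KEY"

When the job finishes, the trained weights referenced by "model_output_path" in the response can be downloaded or deployed for inference.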

Pika creates the next gen text-to-video models on Together GPU Clusters

Nexusflow uses Together GPU Clusters to build cybersecurity models

Arcee builds domain adaptive language models with Together Custom Models