Inference Models

Note: This section covers the Inference API (OpenAI compatible), which provides serverless endpoints for running generative AI models without managing infrastructure. Find guides, supported models, and usage examples below.

Inference API (Inference Models)

Welcome to Nebula Block's Inference API, the easiest way to integrate powerful generative AI models such as Meta Llama and Stable Diffusion into your applications. The Inference API is OpenAI compatible and available at https://inference.nebulablock.com/v1.

With Nebula Block, endpoints are pre-configured for you—simply sign up, log in, and start using them right away! You can explore the capabilities of our models live on our platform UI, or integrate them into your projects using the provided code samples in curl, Python, or JavaScript.
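Because the API is OpenAI compatible, any OpenAI-style HTTP client can talk to it. The sketch below builds a chat-completion request with Python's standard library; the model id and API key are placeholders, not confirmed values.

```python
import json
import urllib.request

BASE_URL = "https://inference.nebulablock.com/v1"

def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Return a ready-to-send POST request for the OpenAI-style
    /chat/completions route (an assumption based on OpenAI compatibility)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder key and model id — substitute your own values.
req = build_chat_request("YOUR_API_KEY", "meta-llama/Llama-3.1-8B-Instruct", "Hello!")
print(req.full_url)  # → https://inference.nebulablock.com/v1/chat/completions
# To actually send it (requires a valid key and credit balance):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape works through the official OpenAI SDKs by pointing their base URL at the endpoint above.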

Key Features

  • OpenAI Compatible: Use OpenAI SDKs and tools with Nebula Block's Inference API.

  • No Setup Required: Endpoints are pre-configured—get started instantly after signing in.

  • Live Testing: Try models interactively in the platform's UI.

  • Code Samples: Quick-start guides for curl, Python, and JavaScript.

  • Support for Popular Models: Choose from a range of popular models such as Meta's Llama 3.1, Stable Diffusion XL, and more.

  • Scalable Performance: Built on Nebula Block's GPU-accelerated infrastructure.

  • Customizable Configurations: Tailor endpoints to meet your workload requirements.

Prerequisites

  • Nebula Block Account: Ensure you have an account on our platform.

  • Credit Balance: Ensure you have at least $0.01 credit balance in your account.

Pricing and Billing

  • Pay-As-You-Go: Charges are based on token usage.

  • Check the pricing page for details.
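To illustrate how token-based, pay-as-you-go billing works, here is a rough cost estimate. The per-token rates below are invented placeholders — actual rates are on the pricing page.

```python
# Placeholder rates for illustration only — NOT Nebula Block's real pricing.
price_per_m_input = 0.05    # USD per 1M input tokens (assumed)
price_per_m_output = 0.10   # USD per 1M output tokens (assumed)

# Example usage from a single request.
input_tokens = 1_200
output_tokens = 300

# Cost = tokens consumed, scaled by the per-million-token rate.
cost = (input_tokens / 1_000_000) * price_per_m_input \
     + (output_tokens / 1_000_000) * price_per_m_output
print(f"${cost:.6f}")  # → $0.000090
```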

Types of Inference

  • Text Generation: see Text Generation.

  • Image Generation: see Image Generation.

  • Embedding Generation: see Embedding Generation.
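Under the OpenAI-compatible convention, each inference type would typically map to its own route under the same base URL. The paths and model id below are assumptions based on the standard OpenAI API shape, not confirmed Nebula Block routes — see the linked guides for specifics.

```python
import json
import urllib.request

BASE_URL = "https://inference.nebulablock.com/v1"

# Assumed OpenAI-style routes for each inference type.
ENDPOINTS = {
    "text": "/chat/completions",
    "image": "/images/generations",
    "embedding": "/embeddings",
}

def build_request(api_key: str, kind: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for the given inference type."""
    return urllib.request.Request(
        BASE_URL + ENDPOINTS[kind],
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder key and model id — substitute your own values.
req = build_request("YOUR_API_KEY", "embedding",
                    {"model": "your-embedding-model", "input": "Hello"})
print(req.full_url)  # → https://inference.nebulablock.com/v1/embeddings
```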
