Inference API (Serverless Endpoints)
[!NOTE] This section covers the Inference API (OpenAI compatible), which provides serverless endpoints for running generative AI models without managing infrastructure. Find guides, supported models, and usage examples below.
Welcome to Nebula Block's Inference API, the easiest way to integrate powerful generative AI models like Meta Llama and Stable Diffusion into your applications. The Inference API is OpenAI compatible and available at https://inference.nebulablock.com/v1.
With Nebula Block, endpoints are pre-configured for you—simply sign up, log in, and start using them right away. You can explore the capabilities of our models live on our platform UI, or integrate them into your projects using the provided code samples in curl, Python, or JavaScript.
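As a sketch of what a call looks like, the snippet below sends a chat completion request to the base URL above using only the Python standard library. The model ID is illustrative — check the supported-models list for exact names — and the API key is assumed to live in a `NEBULA_API_KEY` environment variable.

```python
import json
import os
import urllib.request

# OpenAI-compatible base URL from the docs.
BASE_URL = "https://inference.nebulablock.com/v1"

def chat(api_key: str, model: str, prompt: str) -> dict:
    """Send a chat completion request and return the parsed JSON response.

    The request/response shape follows the OpenAI chat completions API.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Model ID is illustrative -- see the supported-models list for exact names.
    out = chat(
        os.environ["NEBULA_API_KEY"],
        "meta-llama/Llama-3.1-8B-Instruct",
        "Say hello in one sentence.",
    )
    print(out["choices"][0]["message"]["content"])
```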
OpenAI Compatible: Use OpenAI SDKs and tools with Nebula Block's Inference API.
Serverless Endpoints: No setup required—get started instantly after signing in.
Live Use: Test models interactively on our platform's UI.
Code Samples: Quick-start guides for curl, Python, and JavaScript.
Support for Popular Models: Choose from a range of new models like Meta's Llama 3.1, Stable Diffusion XL, and more.
Scalable Performance: Built on Nebula Block's GPU-accelerated infrastructure.
Customizable Configurations: Tailor endpoints to meet your workload requirements.
Nebula Block Account: Ensure you have an account on our platform.
Credit Balance: Ensure you have at least $0.01 credit balance in your account.
Pay-As-You-Go: Charges are based on token usage.
Check the pricing page for details.
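Since billing is per token, it is worth inspecting the `usage` object that OpenAI-compatible completion responses include. A minimal sketch of reading it (the token counts below are invented for demonstration):

```python
# Illustrative fragment of a response in the OpenAI chat-completion shape;
# the numbers are made up for demonstration.
response = {
    "usage": {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46},
}

usage = response["usage"]
print(
    f'{usage["prompt_tokens"]} prompt + {usage["completion_tokens"]} completion '
    f'= {usage["total_tokens"]} tokens billed'
)
```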
To get started with Text Generation, see the Text Generation guide.
To get started with Image Generation, see the Image Generation guide.
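As a sketch of what an image request might look like, the snippet below builds one assuming the endpoint follows the OpenAI images API convention (`POST /v1/images/generations`); the exact route, parameters, and model ID may differ, so check the Image Generation guide.

```python
import json
import urllib.request

BASE_URL = "https://inference.nebulablock.com/v1"

# Hypothetical model ID -- check the supported-models list for exact names.
payload = {
    "model": "stabilityai/stable-diffusion-xl-base-1.0",
    "prompt": "A watercolor painting of a lighthouse at dawn",
    "n": 1,
    "size": "1024x1024",
}

def build_request(api_key: str) -> urllib.request.Request:
    """Build the POST request; send it with urllib.request.urlopen()."""
    return urllib.request.Request(
        f"{BASE_URL}/images/generations",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```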
To get started with Embedding Generation, see the Embedding Generation guide.
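Likewise, a sketch of an embeddings call, assuming the endpoint follows the OpenAI embeddings API (`POST /v1/embeddings`); the model ID is left as a placeholder since exact names are listed in the Embedding Generation guide.

```python
import json
import urllib.request

BASE_URL = "https://inference.nebulablock.com/v1"

def embed(api_key: str, model: str, texts: list) -> list:
    """Return one embedding vector per input string.

    Assumes the OpenAI embeddings request/response shape.
    """
    req = urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=json.dumps({"model": model, "input": texts}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        data = json.load(resp)
    return [item["embedding"] for item in data["data"]]

# Usage (placeholder model name -- see the Embedding Generation guide):
#   vectors = embed(api_key, "your-embedding-model", ["hello", "world"])
```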