Generate Text (Vision)
Return the generated text based on the given text and image inputs.
HTTP Request
POST {API_URL}/chat/completions
where API_URL = https://inference.nebulablock.com/v1.

The request body has the following parameters:
messages (array): An array of message objects. Each object should have:
- role (string): The role of the message sender (e.g., "user").
- content (list): A list of inputs, where each input is either text or an image. Each input is represented as a dict with the following key-value pairs:
  - type (string): The type of input ("text" or "image_url").
  - image_url (dict): If type is "image_url", a dict with the key-value pair:
    - url (string): The URL of the image.
  - text (string): If type is "text", the text input.
  Note that an input dict contains only one of url or text, depending on its type.
model (string): The model to use for generating the response.
max_tokens (integer or null): The maximum number of tokens to generate. If null, the model's default is used.
temperature (float): Sampling temperature. Higher values make the output more random; lower values make it more focused and deterministic.
top_p (float): Nucleus sampling probability. The model considers only the tokens making up the top_p probability mass, so higher values produce more diverse outputs and lower values more repetitive ones.
stream (boolean): Whether to stream the response in chunks.
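To make the nesting of messages concrete, here is a sketch of a single message whose content list combines a text input and an image input; the prompt text and image URL are illustrative placeholders.

```python
# One message object mixing both input types. The prompt text and
# URL below are placeholders, not values from the API documentation.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this picture."},
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/cat.png"},
        },
    ],
}
```

Note that each input dict carries only the key that matches its type: the text input has no image_url key, and the image input has no text key.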
Response Attributes
id (string)
A unique identifier for the completion request.
created (integer)
A Unix timestamp representing when the response was generated.
model (string)
The specific AI model used to generate the response.
object (string)
The type of response object ("chat.completion.chunk" for a streamed chunk or "chat.completion" for a non-streamed completion).
system_fingerprint (string)
A unique identifier for the system that generated the response, if available.
choices (array)
An array of completion objects. Each object has the following fields:
- finish_reason (string): The reason the completion finished.
- index (integer): The index of this completion object.
- message (dict): Contains the generated output:
  - content (string): The generated text for this completion object.
  - role (string): The role of the AI (e.g., "assistant").
  - tool_calls (array): Information about the tools used in generating the completion, if available.
  - function_calls (array): Information about the functions used in generating the completion, if available.
usage (dict)
A dictionary containing information about the inference request, in key-value pairs:
- completion_tokens (integer): The number of tokens generated in the completion.
- prompt_tokens (integer): The number of tokens in the prompt.
- total_tokens (integer): The total number of tokens (prompt and completion combined).
- completion_tokens_details (null): Additional details about the completion tokens, if available.
- prompt_tokens_details (null): Additional details about the prompt tokens, if available.
service_tier (string)
The service tier used for the completion request.
prompt_logprobs (array)
An array containing the log probabilities of the tokens in the prompt, if available.
Example
Request
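A request can be sketched with only the Python standard library as below. The model id, prompt, image URL, and NEBULA_API_KEY environment variable are assumptions for illustration; substitute your own values.

```python
import json
import os
import urllib.request

API_URL = "https://inference.nebulablock.com/v1"

# Hypothetical model id, prompt, and image URL -- replace with your own.
payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
    "model": "example/vision-model",
    "max_tokens": None,
    "temperature": 1,
    "top_p": 0.9,
    "stream": False,
}


def generate(api_key: str) -> dict:
    """POST the payload to chat/completions and return the parsed JSON body."""
    req = urllib.request.Request(
        f"{API_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Only call the API when a key is configured (environment variable name
# is an assumption for this sketch).
if os.environ.get("NEBULA_API_KEY"):
    result = generate(os.environ["NEBULA_API_KEY"])
    print(result["choices"][0]["message"]["content"])
```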
Response
A successful generation response (non-streaming) will contain a chat.completion object.
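As a sketch, the parsed response body has roughly the following shape. Every value below is an illustrative placeholder rather than real model output, and the lines underneath show where the generated text and token accounting live.

```python
# Illustrative placeholder values only -- not real API output.
resp = {
    "id": "chat-123abc",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "example/vision-model",
    "system_fingerprint": None,
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The image shows a cat sitting on a sofa.",
                "tool_calls": [],
                "function_calls": [],
            },
        }
    ],
    "usage": {
        "prompt_tokens": 42,
        "completion_tokens": 11,
        "total_tokens": 53,
        "completion_tokens_details": None,
        "prompt_tokens_details": None,
    },
    "service_tier": None,
    "prompt_logprobs": None,
}

# The generated text and total token count:
text = resp["choices"][0]["message"]["content"]
total = resp["usage"]["total_tokens"]
```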
As is the case with text generation, setting stream to False returns the entire generated completion in a single chat.completion object; with stream set to True, the response instead arrives as a sequence of chat.completion.chunk objects:
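When stream is true, the body is typically delivered as server-sent events, one chat.completion.chunk JSON object per "data:" line, ending with a "[DONE]" sentinel. The sketch below collects the streamed text from any iterable of decoded lines; it assumes OpenAI-style chunks where each choice carries a delta dict, which may not match every server's exact shape.

```python
import json


def iter_stream_content(lines):
    """Yield text deltas from an SSE stream of chat.completion.chunk events.

    `lines` is any iterable of decoded text lines, e.g. the body of a
    request made with "stream": true. Minimal sketch: keep-alive lines
    are skipped, and the delta shape is assumed OpenAI-compatible.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments and keep-alives
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # sentinel that closes the stream
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]
```

For example, joining the deltas of a finished stream reassembles the full completion text.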