Chat completion API similar to OpenAI's API. This API mimics the OpenAI ChatCompletion API; see https://platform.openai.com/docs/api-reference/chat/create for the full specification.

NOTE: The following features are not currently supported:
- function_call (users should implement this themselves)
- logit_bias (to be supported by the vLLM engine)
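Because the endpoint mirrors OpenAI's schema, it can be called with any plain HTTP client. Below is a minimal sketch in Python; the base URL, API key, and the /chat/completions path are assumptions following OpenAI's convention (this page does not specify the deployment's host, route, or auth scheme), and only the request body shape comes from the spec below.

# Minimal sketch of a non-streaming call to this endpoint.
# BASE_URL, API_KEY, and the /chat/completions path are hypothetical
# placeholders; substitute your deployment's values.
import requests

BASE_URL = "https://api.example.com/v1"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"                 # hypothetical credential

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "cotomi-fast-v1.0",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
        "temperature": 1,
    },
    timeout=60,
)
resp.raise_for_status()
# The assistant's reply lives in the first choice's message.
print(resp.json()["choices"][0]["message"]["content"])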
messages (required): string or array of message objects (Messages). A list of messages comprising the conversation so far.

model (required): string (Model). ID of the model to use.

frequency_penalty: number (Frequency Penalty), default 0. Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

logit_bias: object (Logit Bias), default null. Modify the likelihood of specified tokens appearing in the completion.

max_tokens: integer (Max Tokens), default null. The maximum number of tokens that can be generated in the chat completion.

n: integer (N), default 1. How many chat completion choices to generate for each input message.

presence_penalty: number (Presence Penalty), default 0. Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

stop: array of strings or string (Stop), default null. Up to 4 sequences where the API will stop generating further tokens.

stream: boolean (Stream), default false. If set, partial message deltas will be sent, like in ChatGPT (see the streaming sketch after this table).

temperature: number (Temperature), default 1. What sampling temperature to use, between 0 and 2.

top_p: number (Top P), default 1. An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass.

user: string (User), default null. A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.
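When stream is true, OpenAI-compatible servers typically emit server-sent events, one "data: {...}" line per chunk, terminated by "data: [DONE]". This page only says partial deltas are sent, so that framing is an assumption; the sketch below consumes such a stream under it, with the same hypothetical BASE_URL and API_KEY as above.

# Hedged sketch of consuming a streaming response. Assumes OpenAI-style
# SSE framing ("data: {...}" lines ending with "data: [DONE]"), which
# this page implies but does not spell out.
import json
import requests

BASE_URL = "https://api.example.com/v1"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"                 # hypothetical credential

with requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "cotomi-fast-v1.0",
        "messages": [{"role": "user", "content": "Hello!"}],
        "stream": True,
    },
    stream=True,  # keep the connection open and iterate as chunks arrive
    timeout=60,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # Each chunk carries a partial message delta for choice 0.
        delta = chunk["choices"][0].get("delta", {})
        print(delta.get("content", ""), end="", flush=True)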
{- "messages": "[{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},{\"role\": \"user\", \"content\": \"Hello!\"}]",
- "model": "cotomi-fast-v1.0",
- "frequency_penalty": 0,
- "logit_bias": null,
- "max_tokens": null,
- "n": 1,
- "presence_penalty": 0,
- "stop": null,
- "stream": false,
- "temperature": 1,
- "top_p": 1,
- "user": null
}
{- "id": "string",
- "choices": [
- {
- "finish_reason": "string",
- "index": 0,
- "message": {
- "content": "string",
- "tool_calls": [
- {
- "id": "string",
- "type": "string",
- "function": {
- "name": "string",
- "arguments": "string"
}
}
], - "role": "string"
}, - "logprobs": {
- "content": [
- {
- "token": "string",
- "logprob": 0,
- "bytes": [ ],
- "top_logprobs": [
- {
- "token": "string",
- "logprob": 0,
- "bytes": [ ]
}
]
}
]
}
}
], - "created": 0,
- "model": "string",
- "system_fingerprint": "string",
- "object": "string",
- "usage": {
- "completion_tokens": 0,
- "prompt_tokens": 0,
- "total_tokens": 0
}
}
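A short sketch of pulling the useful fields out of a non-streaming response body with the shape above; resp_json stands for the parsed dict (for example, resp.json() from the first sketch). Since function_call is not executed server-side (see the NOTE above), any handling of tool_calls here is purely client-side and illustrative.

# Sketch of extracting the reply, optional tool calls, and token usage
# from a parsed response body. resp_json is assumed to be a dict with
# the response shape documented above.
def summarize_completion(resp_json: dict) -> None:
    choice = resp_json["choices"][0]
    message = choice["message"]
    print("finish_reason:", choice["finish_reason"])
    print("assistant said:", message["content"])
    # tool_calls appears in the schema, but function calling is not run
    # server-side; interpret and execute these yourself if you use them.
    for call in message.get("tool_calls") or []:
        print("tool call:", call["function"]["name"], call["function"]["arguments"])
    usage = resp_json["usage"]
    print(f"tokens: {usage['prompt_tokens']} prompt + "
          f"{usage['completion_tokens']} completion = {usage['total_tokens']}")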