Providers
Here is the list of supported language providers: OpenAI, Azure OpenAI, Anthropic, Cohere, AWS Bedrock, OctoML, and Ollama.
OpenAI
OpenAI is one of the most popular LLM providers and largely kick-started the current GenAI movement.
Below is an example OpenAI provider config:
routers:
  language:
    - id: default
      models:
        - id: openai
          openai:
            base_url: https://api.openai.com/v1
            chat_endpoint: /chat/completions
            model: gpt-3.5-turbo
            api_key: <YOUR API KEY>
            default_params:
              temperature: 0.8
              top_p: 1
              max_tokens: 100
              n: 1
              frequency_penalty: 0
              presence_penalty: 0
              seed: 42
Here is a list of all supported provider model params:
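A router is not limited to a single model: because models is a list, several entries can be registered under one router. The sketch below pairs two OpenAI models in the default router; it assumes the router's default strategy prefers models in the order they are listed, so the second entry acts as a fallback. The ids, the gpt-4 model name, and the omission of default_params are illustrative assumptions, not values from this page.
routers:
  language:
    - id: default
      models:
        - id: openai-primary
          openai:
            base_url: https://api.openai.com/v1
            chat_endpoint: /chat/completions
            model: gpt-3.5-turbo
            api_key: <YOUR API KEY>
            # default_params omitted for brevity (assumed optional)
        - id: openai-fallback
          openai:
            base_url: https://api.openai.com/v1
            chat_endpoint: /chat/completions
            model: gpt-4
            api_key: <YOUR API KEY>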
Azure OpenAI
Azure OpenAI is an Azure-hosted version of OpenAI models.
Below is an example Azure OpenAI provider config:
routers:
  language:
    - id: default
      models:
        - id: azureopenai
          azureopenai:
            base_url: <YOUR AZURE ENDPOINT>
            chat_endpoint: /chat/completions
            api_version: "2023-05-15"
            model: gpt-3.5-turbo
            api_key: <YOUR API KEY>
            default_params:
              temperature: 0.8
              top_p: 1
              max_tokens: 100
              n: 1
              frequency_penalty: 0
              presence_penalty: 0
              seed: 42
Here is a list of all supported provider model params:
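The language section is itself a list, so a single config file can define several routers, each with its own id, models, and defaults. Below is a minimal sketch of that layout, assuming router ids are free-form strings of your choosing; the creative id and the temperature values are illustrative only.
routers:
  language:
    - id: default
      models:
        - id: openai
          openai:
            base_url: https://api.openai.com/v1
            chat_endpoint: /chat/completions
            model: gpt-3.5-turbo
            api_key: <YOUR API KEY>
            default_params:
              temperature: 0.2
    - id: creative
      models:
        - id: openai
          openai:
            base_url: https://api.openai.com/v1
            chat_endpoint: /chat/completions
            model: gpt-3.5-turbo
            api_key: <YOUR API KEY>
            default_params:
              temperature: 1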
Anthropic
The Anthropic provider does not yet support the streaming chat API.
Anthropic is the company behind the Claude family of models.
Below is an example Anthropic provider config:
routers:
  language:
    - id: default
      models:
        - id: anthropic
          anthropic:
            base_url: https://api.anthropic.com/v1
            api_version: "2023-06-01"
            chat_endpoint: /messages
            model: claude-instant-1.2
            api_key: <YOUR API KEY>
            default_params:
              system: You are a helpful assistant.
              temperature: 1
              max_tokens: 250
Here is a list of all supported provider model params:
Cohere
Cohere is another popular LLM provider that offers great low-latency models.
Here is an example Cohere configuration:
routers:
  language:
    - id: default
      models:
        - id: cohere
          cohere:
            base_url: https://api.cohere.ai/v1
            chat_endpoint: /chat
            model: command-light
            api_key: <YOUR API KEY>
            default_params:
              temperature: 0.3
              p: 0.75
Here is a list of all supported provider model params:
AWS Bedrock
The AWS Bedrock provider does not yet support the streaming chat API.
Below is an example AWS Bedrock provider config:
routers:
  language:
    - id: default
      models:
        - id: bedrock
          bedrock:
            base_url: <YOUR AWS ENDPOINT>
            chat_endpoint: /model
            model: amazon.titan-text-express-v1
            api_key: <YOUR API KEY>
            access_key: <YOUR ACCESS KEY>
            secret_key: <YOUR SECRET KEY>
            aws_region: <YOUR REGION>
            default_params:
              temperature: 0
              top_p: 1
              max_tokens: 512
              stop_sequences: []
Here is a list of all supported provider model params:
OctoML
The OctoML provider does not yet support the streaming chat API.
See the OctoML documentation for the default_params and model names available for OctoML. Specify override values in the config.yaml
file.
Below is an example OctoML provider config:
routers:
  language:
    - id: default
      models:
        - id: octoml
          octoml:
            base_url: https://text.octoai.run/v1
            chat_endpoint: /chat/completions
            model: mistral-7b-instruct-fp16
            api_key: <YOUR API KEY>
            default_params:
              temperature: 1
              top_p: 1
              max_tokens: 100
Here is a list of all supported provider model params:
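Because override values go in config.yaml, a config only needs to list the parameters you actually want to change. The minimal sketch below overrides just the temperature; it assumes any default_params left unspecified keep their default values, which is an assumption rather than something stated on this page.
routers:
  language:
    - id: default
      models:
        - id: octoml
          octoml:
            base_url: https://text.octoai.run/v1
            chat_endpoint: /chat/completions
            model: mistral-7b-instruct-fp16
            api_key: <YOUR API KEY>
            default_params:
              temperature: 0.5  # only this value is overridden; other params assumed to keep their defaults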
Ollama
The Ollama provider does not yet support the streaming chat API.
Ollama is a great way to serve open-source LLMs locally and beyond.
Here is an example Ollama configuration:
routers:
  language:
    - id: default
      models:
        - id: ollama
          ollama:
            base_url: http://localhost:11434
            chat_endpoint: /api/chat
            model: llama3
            default_params:
              temperature: 0.8
              top_p: 0.9
              num_ctx: 2048
              top_k: 40
Here is a list of all supported provider model params:
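Since Ollama serves models on your own machine, one pattern worth sketching is a router that prefers the local model and keeps a hosted provider behind it. This is a hedged sketch, assuming the router's default strategy tries models in listed order and that default_params may be omitted; the model ids are illustrative.
routers:
  language:
    - id: default
      models:
        - id: local-llama
          ollama:
            base_url: http://localhost:11434
            chat_endpoint: /api/chat
            model: llama3
        - id: hosted-fallback
          openai:
            base_url: https://api.openai.com/v1
            chat_endpoint: /chat/completions
            model: gpt-3.5-turbo
            api_key: <YOUR API KEY>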