Quick Start

Drop-in OpenAI-API compatible endpoint

Chat Completions

```python
import requests
import json

# Replace with your Arli AI API key
ARLIAI_API_KEY = "YOUR_API_KEY"

url = "https://api.arliai.com/v1/chat/completions"

payload = json.dumps({
    "model": "Meta-Llama-3.1-8B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi! How can I help you today?"}
    ],
    "repetition_penalty": 1.1,
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "max_tokens": 1024,
    "stream": True
})
headers = {
    'Content-Type': 'application/json',
    'Authorization': f"Bearer {ARLIAI_API_KEY}"
}

response = requests.request("POST", url, headers=headers, data=payload)
```
NOTE: Some models might not accept system prompts.
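
Because the endpoint is OpenAI-API compatible, you can also call it with the official `openai` Python client by overriding `base_url`. The sketch below is an illustration under that assumption, not Arli AI-specific documentation; in particular, passing the non-OpenAI sampling parameters (`top_k`, `repetition_penalty`) through `extra_body` is an assumption based on the raw request above.

```python
# Minimal sketch using the official openai Python client (assumes openai >= 1.x).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_ARLIAI_API_KEY",          # your Arli AI key, not an OpenAI key
    base_url="https://api.arliai.com/v1",   # point the client at the Arli AI endpoint
)

completion = client.chat.completions.create(
    model="Meta-Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    temperature=0.7,
    top_p=0.9,
    max_tokens=1024,
    # Non-OpenAI sampling parameters; whether the server accepts them via
    # extra_body is an assumption here.
    extra_body={"top_k": 40, "repetition_penalty": 1.1},
)

print(completion.choices[0].message.content)
```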

Completions

```python
import requests
import json

# Replace with your Arli AI API key
ARLIAI_API_KEY = "YOUR_API_KEY"

url = "https://api.arliai.com/v1/completions"

payload = json.dumps({
    "model": "Meta-Llama-3.1-8B-Instruct",
    "prompt": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are an assistant AI.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHello there!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
    "repetition_penalty": 1.1,
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "max_tokens": 1024,
    "stream": True
})
headers = {
    'Content-Type': 'application/json',
    'Authorization': f"Bearer {ARLIAI_API_KEY}"
}

response = requests.request("POST", url, headers=headers, data=payload)
```
NOTE: Make sure to use the suggested prompt format for each model when using completions. The example shown uses the Llama 3 Instruct format.
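
Both examples set `"stream": True`, so the server replies with a stream of server-sent events rather than a single JSON body. Below is a minimal sketch of consuming that stream with `requests`; the chunk schema (OpenAI-style `choices[0]["text"]` for completions, `choices[0]["delta"]["content"]` for chat completions, terminated by a `[DONE]` marker) is assumed from the endpoint's OpenAI compatibility.

```python
import requests
import json

ARLIAI_API_KEY = "YOUR_API_KEY"  # placeholder

url = "https://api.arliai.com/v1/completions"
payload = {
    "model": "Meta-Llama-3.1-8B-Instruct",
    "prompt": "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello there!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
    "max_tokens": 256,
    "stream": True,
}
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {ARLIAI_API_KEY}",
}

# stream=True tells requests not to buffer the whole response in memory.
with requests.post(url, headers=headers, json=payload, stream=True) as response:
    for raw_line in response.iter_lines():
        if not raw_line:
            continue
        line = raw_line.decode("utf-8")
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":  # OpenAI-style end-of-stream marker (assumed)
            break
        chunk = json.loads(data)
        # /v1/completions chunks carry incremental text in choices[0]["text"];
        # /v1/chat/completions chunks carry it in choices[0]["delta"]["content"].
        print(chunk["choices"][0].get("text", ""), end="", flush=True)
```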

AnythingLLM

[Screenshot: AnythingLLM setup]

SillyTavern

- Text Completion

[Screenshot: SillyTavern Text Completion setup]

- Chat Completion

[Screenshot: SillyTavern Chat Completion setup]

RisuAI

[Screenshot: RisuAI setup]