Open WebUI 👋
Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with a built-in inference engine for RAG, making it a powerful AI deployment solution.
Passionate about open-source AI? Join our team →
Tip
Looking for an Enterprise Plan? – Speak with Our Sales Team Today!
Get enhanced capabilities, including custom theming and branding, Service Level Agreement (SLA) support, Long-Term Support (LTS) versions, and more!
For more information, be sure to check out our Open WebUI Documentation.
Key Features of Open WebUI ⭐
| Feature | Description |
|---|---|
| 🚀 Effortless Setup | Install seamlessly using Docker or Kubernetes (kubectl, kustomize or helm) for a hassle-free experience with support for both :ollama and :cuda tagged images. |
| 🤝 Ollama/OpenAI API Integration | Effortlessly integrate OpenAI-compatible APIs for versatile conversations alongside Ollama models. Customize the OpenAI API URL to link with LMStudio, GroqCloud, Mistral, OpenRouter, and more. |
| 🛡️ Granular Permissions and User Groups | By allowing administrators to create detailed user roles and permissions, we ensure a secure user environment. This granularity not only enhances security but also allows for customized user experiences, fostering a sense of ownership and responsibility amongst users. |
| 📱 Responsive Design | Enjoy a seamless experience across Desktop PC, Laptop, and Mobile devices. |
| 📱 Progressive Web App (PWA) for Mobile | Enjoy a native app-like experience on your mobile device with our PWA, providing offline access on localhost and a seamless user interface. |
| ✒️🔢 Full Markdown and LaTeX Support | Elevate your LLM experience with comprehensive Markdown and LaTeX capabilities for enriched interaction. |
| 🎤📹 Hands-Free Voice/Video Call | Experience seamless communication with integrated hands-free voice and video call features, allowing for a more dynamic and interactive chat environment. |
| 🛠️ Model Builder | Easily create Ollama models via the Web UI. Create and add custom characters/agents, customize chat elements, and import models effortlessly through Open WebUI Community integration. |
| 🐍 Native Python Function Calling Tool | Enhance your LLMs with built-in code editor support in the tools workspace. Bring Your Own Function (BYOF) by simply adding your pure Python functions, enabling seamless integration with LLMs. |
| 📚 Local RAG Integration | Dive into the future of chat interactions with groundbreaking Retrieval Augmented Generation (RAG) support. This feature seamlessly integrates document interactions into your chat experience. You can load documents directly into the chat or add files to your document library, effortlessly accessing them using the # command before a query. |
| 🔍 Web Search for RAG | Perform web searches using providers like SearXNG, Google PSE, Brave Search, serpstack, serper, Serply, DuckDuckGo, TavilySearch, SearchApi and Bing and inject the results directly into your chat experience. |
| 🌐 Web Browsing Capability | Seamlessly integrate websites into your chat experience using the # command followed by a URL. This feature allows you to incorporate web content directly into your conversations, enhancing the richness and depth of your interactions. |
| 🎨 Image Generation Integration | Seamlessly incorporate image generation capabilities using options such as AUTOMATIC1111 API or ComfyUI (local), and OpenAI's DALL-E (external), enriching your chat experience with dynamic visual content. |
| ⚙️ Many Models Conversations | Effortlessly engage with various models simultaneously, harnessing their unique strengths for optimal responses. Enhance your experience by leveraging a diverse set of models in parallel. |
| 🔐 Role-Based Access Control (RBAC) | Ensure secure access with restricted permissions; only authorized individuals can access your Ollama, and exclusive model creation/pulling rights are reserved for administrators. |
| 🌐🌍 Multilingual Support | Experience Open WebUI in your preferred language with our internationalization (i18n) support. Join us in expanding our supported languages! We're actively seeking contributors! |
| 🧩 Pipelines, Open WebUI Plugin Support | Seamlessly integrate custom logic and Python libraries into Open WebUI using Pipelines Plugin Framework. Launch your Pipelines instance, set the OpenAI URL to the Pipelines URL, and explore endless possibilities. Examples include Function Calling, User Rate Limiting to control access, Usage Monitoring with tools like Langfuse, Live Translation with LibreTranslate for multilingual support, Toxic Message Filtering and much more. |
| 🌟 Continuous Updates | We are committed to improving Open WebUI with regular updates, fixes, and new features. |
Want to learn more about Open WebUI's features? Check out our Open WebUI documentation for a comprehensive overview!
Sponsors 🙌
Emerald
| Logo | Description |
|---|---|
| n8n | Does your interface have a backend yet? Try n8n |
| Tailscale | Connect self-hosted AI to any device with Tailscale |
We are incredibly grateful for the generous support of our sponsors. Their contributions help us to maintain and improve our project, ensuring we can continue to deliver quality work to our community. Thank you!
How to Install 🚀
Installation via Python pip 🐍
Open WebUI can be installed using pip, the Python package installer. Before proceeding, ensure you're using Python 3.11 to avoid compatibility issues.
- Install Open WebUI: Open your terminal and run the following command to install Open WebUI:
pip install open-webui
- Running Open WebUI: After installation, you can start Open WebUI by executing:
open-webui serve
This will start the Open WebUI server, which you can access at http://localhost:8080
Quick Start with Docker 🐳
Note
Please note that for certain Docker environments, additional configurations might be needed. If you encounter any connection issues, our detailed guide on Open WebUI Documentation is ready to assist you.
Warning
When using Docker to install Open WebUI, make sure to include `-v open-webui:/app/backend/data` in your Docker command. This step is crucial, as it ensures your database is properly mounted and prevents any loss of data.
Tip
If you wish to utilize Open WebUI with Ollama included or CUDA acceleration, we recommend utilizing our official images tagged with either `:cuda` or `:ollama`. To enable CUDA, you must install the Nvidia CUDA container toolkit on your Linux/WSL system.
Installation with Default Configuration
| Command | Description |
|---|---|
| `docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main` | If Ollama is on your computer |
| `docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=https://example.com -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main` | If Ollama is on a Different Server |
| `docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda` | To run Open WebUI with Nvidia GPU support |
Installation for OpenAI API Usage Only
| Command | Description |
|---|---|
| `docker run -d -p 3000:8080 -e OPENAI_API_KEY=your_secret_key -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main` | If you're only using OpenAI API |
Installing Open WebUI with Bundled Ollama Support
This installation method uses a single container image that bundles Open WebUI with Ollama, allowing for a streamlined setup via a single command. Choose the appropriate command based on your hardware setup:
| Command | Description |
|---|---|
| `docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama` | With GPU Support |
| `docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama` | For CPU Only |
Both commands facilitate a built-in, hassle-free installation of both Open WebUI and Ollama, ensuring that you can get everything up and running swiftly.
After installation, you can access Open WebUI at http://localhost:3000. Enjoy! 😄
Other Installation Methods
We offer various installation alternatives, including non-Docker native installation methods, Docker Compose, Kustomize, and Helm. Visit our Open WebUI Documentation or join our Discord community for comprehensive guidance.
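If you prefer Docker Compose, the following is a minimal sketch that mirrors the default docker run command above (assuming Ollama runs on the host and is reachable via host.docker.internal; the official compose file in the repository may differ):

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - open-webui:/app/backend/data
    restart: always

volumes:
  open-webui:

Start it with docker compose up -d.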
Look at the Local Development Guide for instructions on setting up a local development environment.
Troubleshooting
Encountering connection issues? Our Open WebUI Documentation has got you covered. For further assistance and to join our vibrant community, visit the Open WebUI Discord.
Open WebUI: Server Connection Error
If you're experiencing connection issues, it's often because the WebUI Docker container cannot reach the Ollama server at 127.0.0.1:11434 (host.docker.internal:11434) from inside the container. Use the `--network=host` flag in your Docker command to resolve this. Note that the port changes from 3000 to 8080, so the URL becomes http://localhost:8080.
Example Docker Command:
docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Keeping Your Docker Installation Up-to-Date
In case you want to update your local Docker installation to the latest version, you can do it with Watchtower:
docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui
In the last part of the command, replace open-webui with your container name if it is different.
Check our Updating Guide available in our Open WebUI Documentation.
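Alternatively, a manual update simply pulls the newer image and recreates the container; your chats and settings persist in the open-webui volume:

docker pull ghcr.io/open-webui/open-webui:main
docker stop open-webui && docker rm open-webui
# re-run the same docker run command you used during installation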
Using the Dev Branch 🌙
Warning
The `:dev` branch contains the latest unstable features and changes. Use it at your own risk, as it may have bugs or incomplete features.
If you want to try out the latest bleeding-edge features and are okay with occasional instability, you can use the :dev tag like this:
docker run -d -p 3000:8080 -v open-webui:/app/backend/data --name open-webui --add-host=host.docker.internal:host-gateway --restart always ghcr.io/open-webui/open-webui:dev
Offline Mode
If you are running Open WebUI in an offline environment, you can set the HF_HUB_OFFLINE environment variable to 1 to prevent attempts to download models from the internet.
export HF_HUB_OFFLINE=1
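If you run Open WebUI in Docker, the same variable can be passed to the container with -e, for example:

docker run -d -p 3000:8080 -e HF_HUB_OFFLINE=1 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main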
What's Next? 🌟
Discover upcoming features on our roadmap in the Open WebUI Documentation.
License 📜
This project is licensed under the Open WebUI License, a revised BSD-3-Clause license. You receive all the same rights as the classic BSD-3 license: you can use, modify, and distribute the software, including in proprietary and commercial products, with minimal restrictions. The only additional requirement is to preserve the "Open WebUI" branding, as detailed in the LICENSE file. For full terms, see the LICENSE document. 📄
Support 💬
If you have any questions, suggestions, or need assistance, please open an issue or join our Open WebUI Discord community to connect with us! 🤝
Created by Timothy Jaeryang Baek - Let's make Open WebUI even more amazing together! 💪
Ollama
Get up and running with large language models.
macOS
Windows
Linux
curl -fsSL https://ollama.com/install.sh | sh
Docker
The official Ollama Docker image ollama/ollama is available on Docker Hub.
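For example, to start the container with a persistent model volume and the default API port exposed, then run a model inside it (CPU-only shown; see the Ollama Docker documentation for GPU flags):

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run llama3.2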
Libraries
| Library | Description |
|---|---|
| ollama-python | Python library for Ollama |
| ollama-js | JavaScript library for Ollama |
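As a quick illustration of the Python library (a minimal sketch, assuming `pip install ollama` and a locally running Ollama server; see the library's documentation for the full API):

from ollama import chat

# send a single chat turn to a locally pulled model
response = chat(model="llama3.2", messages=[
    {"role": "user", "content": "Why is the sky blue?"},
])
print(response["message"]["content"])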
Community
| Platform | Link |
|---|---|
| Discord | Discord community |
| Reddit | Reddit community |
Quickstart
To run and chat with Gemma 3:
ollama run gemma3
Model library
Ollama supports a list of models available on ollama.com/library
Here are some example models that can be downloaded:
| Model | Parameters | Size | Download |
|---|---|---|---|
| Gemma 3 | 1B | 815MB | ollama run gemma3:1b |
| Gemma 3 | 4B | 3.3GB | ollama run gemma3 |
| Gemma 3 | 12B | 8.1GB | ollama run gemma3:12b |
| Gemma 3 | 27B | 17GB | ollama run gemma3:27b |
| QwQ | 32B | 20GB | ollama run qwq |
| DeepSeek-R1 | 7B | 4.7GB | ollama run deepseek-r1 |
| DeepSeek-R1 | 671B | 404GB | ollama run deepseek-r1:671b |
| Llama 4 | 109B | 67GB | ollama run llama4:scout |
| Llama 4 | 400B | 245GB | ollama run llama4:maverick |
| Llama 3.3 | 70B | 43GB | ollama run llama3.3 |
| Llama 3.2 | 3B | 2.0GB | ollama run llama3.2 |
| Llama 3.2 | 1B | 1.3GB | ollama run llama3.2:1b |
| Llama 3.2 Vision | 11B | 7.9GB | ollama run llama3.2-vision |
| Llama 3.2 Vision | 90B | 55GB | ollama run llama3.2-vision:90b |
| Llama 3.1 | 8B | 4.7GB | ollama run llama3.1 |
| Llama 3.1 | 405B | 231GB | ollama run llama3.1:405b |
| Phi 4 | 14B | 9.1GB | ollama run phi4 |
| Phi 4 Mini | 3.8B | 2.5GB | ollama run phi4-mini |
| Mistral | 7B | 4.1GB | ollama run mistral |
| Moondream 2 | 1.4B | 829MB | ollama run moondream |
| Neural Chat | 7B | 4.1GB | ollama run neural-chat |
| Starling | 7B | 4.1GB | ollama run starling-lm |
| Code Llama | 7B | 3.8GB | ollama run codellama |
| Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
| LLaVA | 7B | 4.5GB | ollama run llava |
| Granite-3.3 | 8B | 4.9GB | ollama run granite3.3 |
Note
You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
Customize a model
Import from GGUF
Ollama supports importing GGUF models in the Modelfile:
- Create a file named `Modelfile`, with a `FROM` instruction pointing to the local file path of the model you want to import.
FROM ./vicuna-33b.Q4_0.gguf
- Create the model in Ollama
ollama create example -f Modelfile
- Run the model
ollama run example
Import from Safetensors
See the guide on importing models for more information.
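As a rough sketch of what the import guide describes, a Modelfile can point its FROM instruction at a directory of Safetensors weights (the path below is illustrative):

FROM /path/to/safetensors/directory

Then create the model as usual with ollama create example -f Modelfile.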
Customize a prompt
Models from the Ollama library can be customized with a prompt. For example, to customize the llama3.2 model:
ollama pull llama3.2
Create a Modelfile:
FROM llama3.2
# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
Next, create and run the model:
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
For more information on working with a Modelfile, see the Modelfile documentation.
CLI Reference
Create a model
ollama create is used to create a model from a Modelfile.
ollama create mymodel -f ./Modelfile
Pull a model
ollama pull llama3.2
This command can also be used to update a local model. Only the diff will be pulled.
Remove a model
ollama rm llama3.2
Copy a model
ollama cp llama3.2 my-model
Multiline input
For multiline input, you can wrap text with """:
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
Multimodal models
ollama run llava "What's in this image? /Users/jmorgan/Desktop/smile.png"
Output: The image features a yellow smiley face, which is likely the central focus of the picture.
Pass the prompt as an argument
ollama run llama3.2 "Summarize this file: $(cat README.md)"
Output: Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
Show model information
ollama show llama3.2
List models on your computer
ollama list
List which models are currently loaded
ollama ps
Stop a model which is currently running
ollama stop llama3.2
Start Ollama
ollama serve is used when you want to start ollama without running the desktop application.
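For example, in one terminal:

ollama serve

Then, in a separate shell, run a model as usual (e.g. ollama run llama3.2).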
REST API
Ollama has a REST API for running and managing models.
Generate a response
curl http://localhost:11434/api/generate -d '{
"model": "llama3.2",
"prompt":"Why is the sky blue?"
}'
Chat with a model
curl http://localhost:11434/api/chat -d '{
"model": "llama3.2",
"messages": [
{ "role": "user", "content": "why is the sky blue?" }
]
}'
See the API documentation for all endpoints.
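For example, assuming the default port, you can list the locally available models via the tags endpoint:

curl http://localhost:11434/api/tags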
Community Integrations
Web & Desktop
| Integration | Description |
|---|---|
| Open WebUI | Open WebUI integration |
| SwiftChat (macOS with ReactNative) | SwiftChat integration |
| Enchanted (macOS native) | Enchanted integration |
| Hollama | Hollama integration |
| Lollms-Webui | Lollms-Webui integration |
| LibreChat | LibreChat integration |
| Bionic GPT | Bionic GPT integration |
| HTML UI | HTML UI integration |
| Saddle | Saddle integration |
| TagSpaces | TagSpaces integration |
| Chatbot UI | Chatbot UI integration |
| Chatbot UI v2 | Chatbot UI v2 integration |
| Typescript UI | Typescript UI integration |
| Minimalistic React UI for Ollama Models | Minimalistic React UI integration |
| Ollamac | Ollamac integration |
| big-AGI | big-AGI integration |
| Cheshire Cat assistant framework | Cheshire Cat integration |
| Amica | Amica integration |
| chatd | chatd integration |
| Ollama-SwiftUI | Ollama-SwiftUI integration |
| Dify.AI | Dify.AI integration |
| MindMac | MindMac integration |
| NextJS Web Interface for Ollama | NextJS Web Interface integration |
| Msty | Msty integration |
| Chatbox | Chatbox integration |
| WinForm Ollama Copilot | WinForm Ollama Copilot integration |
| NextChat | NextChat integration |
| Alpaca WebUI | Alpaca WebUI integration |
| OllamaGUI | OllamaGUI integration |
| OpenAOE | OpenAOE integration |
| Odin Runes | Odin Runes integration |
| LLM-X | LLM-X integration |
| AnythingLLM (Docker + MacOs/Windows/Linux native app) | AnythingLLM integration |
| Ollama Basic Chat: Uses HyperDiv Reactive UI | Ollama Basic Chat integration |
| Ollama-chats RPG | Ollama-chats RPG integration |
| IntelliBar | IntelliBar integration |
| Jirapt | Jirapt integration |
| ojira | ojira integration |
| QA-Pilot | QA-Pilot integration |
| ChatOllama | ChatOllama integration |
| CRAG Ollama Chat | CRAG Ollama Chat integration |
| RAGFlow | RAGFlow integration |
| StreamDeploy | StreamDeploy integration |
| chat | chat integration |
| Lobe Chat | Lobe Chat integration |
| Ollama RAG Chatbot | Ollama RAG Chatbot integration |
| BrainSoup | BrainSoup integration |
| macai | macai integration |
| RWKV-Runner | RWKV-Runner integration |
| Ollama Grid Search | Ollama Grid Search integration |
| Olpaka | Olpaka integration |
| Casibase | Casibase integration |
| OllamaSpring | OllamaSpring integration |
| LLocal.in | LLocal.in integration |
| Shinkai Desktop | Shinkai Desktop integration |
| AiLama | AiLama integration |
| Ollama with Google Mesop | Ollama with Google Mesop integration |
| R2R | R2R integration |
| Ollama-Kis | Ollama-Kis integration |
| OpenGPA | OpenGPA integration |
| Painting Droid | Painting Droid integration |
| Kerlig AI | Kerlig AI integration |
| AI Studio | AI Studio integration |
| Sidellama | Sidellama integration |
| LLMStack | LLMStack integration |
| BoltAI for Mac | BoltAI for Mac integration |
| Harbor | Harbor integration |
| PyGPT | PyGPT integration |
| Alpaca | Alpaca integration |
| AutoGPT | AutoGPT integration |
| Go-CREW | Go-CREW integration |
| PartCAD | PartCAD integration |
| Ollama4j Web UI | Ollama4j Web UI integration |
| PyOllaMx | PyOllaMx integration |
| Cline | Cline integration |
| Cherry Studio | Cherry Studio integration |
| ConfiChat | ConfiChat integration |
| Archyve | Archyve integration |
| crewAI with Mesop | crewAI with Mesop integration |
| Tkinter-based client | Tkinter-based client integration |
| LLMChat | LLMChat integration |
| Local Multimodal AI Chat | Local Multimodal AI Chat integration |
| ARGO | ARGO integration |
| OrionChat | OrionChat integration |
| G1 | G1 integration |
| Web management | Web management integration |
| Promptery | Promptery integration |
| Ollama App | Ollama App integration |
| chat-ollama | chat-ollama integration |
| SpaceLlama | SpaceLlama integration |
| YouLama | YouLama integration |
| DualMind | DualMind integration |
| ollamarama-matrix | ollamarama-matrix integration |
| ollama-chat-app | ollama-chat-app integration |
| Perfect Memory AI | Perfect Memory AI integration |
| Hexabot | Hexabot integration |
| Reddit Rate | Reddit Rate integration |
| OpenTalkGpt | OpenTalkGpt integration |
| VT | VT integration |
| Nosia | Nosia integration |
| Witsy | Witsy integration |
| Abbey | Abbey integration |
| Minima | Minima integration |
| aidful-ollama-model-delete | aidful-ollama-model-delete integration |
| Perplexica | Perplexica integration |
| Ollama Chat WebUI for Docker | Ollama Chat WebUI for Docker integration |
| AI Toolkit for Visual Studio Code | AI Toolkit for Visual Studio Code integration |
| MinimalNextOllamaChat | MinimalNextOllamaChat integration |
| Chipper | Chipper integration |
| ChibiChat | ChibiChat integration |
| LocalLLM | LocalLLM integration |
| Ollamazing | Ollamazing integration |
| OpenDeepResearcher-via-searxng | OpenDeepResearcher-via-searxng integration |
| AntSK | AntSK integration |
| MaxKB | MaxKB integration |
| yla | yla integration |
| LangBot | LangBot integration |
| 1Panel | 1Panel integration |
| AstrBot | AstrBot integration |
| Reins | Reins integration |
| Flufy | Flufy integration |
| Ellama | Ellama integration |
| screenpipe | screenpipe integration |
| Ollamb | Ollamb integration |
| Writeopia | Writeopia integration |
| AppFlowy | AppFlowy integration |
| Lumina | Lumina integration |
| Tiny Notepad | Tiny Notepad integration |
| macLlama (macOS native) | macLlama integration |
| GPTranslate | GPTranslate integration |
| ollama launcher | ollama launcher integration |
| ai-hub | ai-hub integration |
| Mayan EDMS | Mayan EDMS integration |
Cloud
| Cloud | Link |
|---|---|
| Google Cloud | Google Cloud integration |
| Fly.io | Fly.io integration |
| Koyeb | Koyeb integration |
Terminal
| Terminal | Link |
|---|---|
| oterm | oterm integration |
| Ellama Emacs client | Ellama Emacs client integration |
| Emacs client | Emacs client integration |
| neollama | neollama integration |
| gen.nvim | gen.nvim integration |
| ollama.nvim | ollama.nvim integration |
| ollero.nvim | ollero.nvim integration |
| ollama-chat.nvim | ollama-chat.nvim integration |
| ogpt.nvim | ogpt.nvim integration |
| gptel Emacs client | gptel Emacs client integration |
| Oatmeal | Oatmeal integration |
| cmdh | cmdh integration |
| ooo | ooo integration |
| shell-pilot | shell-pilot integration |
| tenere | tenere integration |
| llm-ollama | llm-ollama integration |
| typechat-cli | typechat-cli integration |
| ShellOracle | ShellOracle integration |
| tlm | tlm integration |
| podman-ollama | podman-ollama integration |
| gollama | gollama integration |
| ParLlama | ParLlama integration |
| Ollama eBook Summary | Ollama eBook Summary integration |
| Ollama Mixture of Experts (MOE) in 50 lines of code | Ollama Mixture of Experts integration |
| vim-intelligence-bridge | vim-intelligence-bridge integration |
| x-cmd ollama | x-cmd ollama integration |
| bb7 | bb7 integration |
| SwollamaCLI | SwollamaCLI integration |
| aichat | aichat integration |
| PowershAI | PowershAI integration |
| DeepShell | DeepShell integration |
| orbiton | orbiton integration |
| orca-cli | orca-cli integration |
| GGUF-to-Ollama | GGUF-to-Ollama integration |
| AWS-Strands-With-Ollama | AWS-Strands-With-Ollama integration |
| ollama-multirun | ollama-multirun integration |
| ollama-bash-toolshed | ollama-bash-toolshed integration |
Apple Vision Pro
| Integration | Link |
|---|---|
| SwiftChat | SwiftChat integration |
| Enchanted | Enchanted integration |
Database
| Integration | Link |
|---|---|
| pgai | pgai integration |
| MindsDB | MindsDB integration |
| chromem-go | chromem-go integration |
| Kangaroo | Kangaroo integration |
Package managers
| Package Manager | Link |
|---|---|
| Pacman | Pacman integration |
| Gentoo | Gentoo integration |
| Homebrew | Homebrew integration |
| Helm Chart | Helm Chart integration |
| Guix channel | Guix channel integration |
| Nix package | Nix package integration |
| Flox | Flox integration |
Libraries
| Library | Link |
|---|---|
| LangChain | LangChain integration |
| LangChain.js | LangChain.js integration |
| Firebase Genkit | Firebase Genkit integration |
| crewAI | crewAI integration |
| Yacana | Yacana integration |
| Spring AI | Spring AI integration |
| LangChainGo | LangChainGo integration |
| LangChain4j | LangChain4j integration |
| LangChainRust | LangChainRust integration |
| LangChain for .NET | LangChain for .NET integration |
| LLPhant | LLPhant integration |
| LlamaIndex | LlamaIndex integration |
| LlamaIndexTS | LlamaIndexTS integration |
| LiteLLM | LiteLLM integration |
| OllamaFarm for Go | OllamaFarm for Go integration |
| OllamaSharp for .NET | OllamaSharp for .NET integration |
| Ollama for Ruby | Ollama for Ruby integration |
| Ollama-rs for Rust | Ollama-rs for Rust integration |
| Ollama-hpp for C++ | Ollama-hpp for C++ integration |
| Ollama4j for Java | Ollama4j for Java integration |
| ModelFusion Typescript Library | ModelFusion Typescript Library integration |
| OllamaKit for Swift | OllamaKit for Swift integration |
| Ollama for Dart | Ollama for Dart integration |
| Ollama for Laravel | Ollama for Laravel integration |
| LangChainDart | LangChainDart integration |
| Semantic Kernel - Python | Semantic Kernel - Python integration |
| Haystack | Haystack integration |
| Elixir LangChain | Elixir LangChain integration |
| Ollama for R - rollama | Ollama for R - rollama integration |
| Ollama for R - ollama-r | Ollama for R - ollama-r integration |
| Ollama-ex for Elixir | Ollama-ex for Elixir integration |
| Ollama Connector for SAP ABAP | Ollama Connector for SAP ABAP integration |
| Testcontainers | Testcontainers integration |
| Portkey | Portkey integration |
| PromptingTools.jl | PromptingTools.jl integration |
| LlamaScript | LlamaScript integration |
| llm-axe | llm-axe integration |
| Gollm | Gollm integration |
| Gollama for Golang | Gollama for Golang integration |
| Ollamaclient for Golang | Ollamaclient for Golang integration |
| High-level function abstraction in Go | High-level function abstraction in Go integration |
| Ollama PHP | Ollama PHP integration |
| Agents-Flex for Java | Agents-Flex for Java integration |
| Parakeet | Parakeet integration |
| Haverscript | Haverscript integration |
| Ollama for Swift | Ollama for Swift integration |
| Swollama for Swift | Swollama for Swift integration |
| GoLamify | GoLamify integration |
| Ollama for Haskell | Ollama for Haskell integration |
| multi-llm-ts | multi-llm-ts integration |
| LlmTornado | LlmTornado integration |
| Ollama for Zig | Ollama for Zig integration |
| Abso | Abso integration |
| Nichey | Nichey integration |
| Ollama for D | Ollama for D integration |
| OllamaPlusPlus | OllamaPlusPlus integration |
Mobile
| Integration | Link |
|---|---|
| SwiftChat | SwiftChat integration |
| Enchanted | Enchanted integration |
| Maid | Maid integration |
| Ollama App | Ollama App integration |
| ConfiChat | ConfiChat integration |
| Ollama Android Chat | Ollama Android Chat integration |
| Reins | Reins integration |
Extensions & Plugins
| Integration | Link |
|---|---|
| Raycast extension | Raycast extension integration |
| Discollama | Discollama integration |
| Continue | Continue integration |
| Vibe | Vibe integration |
| Obsidian Ollama plugin | Obsidian Ollama plugin integration |
| Logseq Ollama plugin | Logseq Ollama plugin integration |
| NotesOllama | NotesOllama integration |
| Dagger Chatbot | Dagger Chatbot integration |
| Discord AI Bot | Discord AI Bot integration |
| Ollama Telegram Bot | Ollama Telegram Bot integration |
| Hass Ollama Conversation | Hass Ollama Conversation integration |
| Rivet plugin | Rivet plugin integration |
| Obsidian BMO Chatbot plugin | Obsidian BMO Chatbot plugin integration |
| Cliobot | Cliobot integration |
| Copilot for Obsidian plugin | Copilot for Obsidian plugin integration |
| Obsidian Local GPT plugin | Obsidian Local GPT plugin integration |
| Open Interpreter | Open Interpreter integration |
| Llama Coder | Llama Coder integration |
| Ollama Copilot | Ollama Copilot integration |
| twinny | twinny integration |
| Wingman-AI | Wingman-AI integration |
| Page Assist | Page Assist integration |
| Plasmoid Ollama Control | Plasmoid Ollama Control integration |
| AI Telegram Bot | AI Telegram Bot integration |
| AI ST Completion | AI ST Completion integration |
| Discord-Ollama Chat Bot | Discord-Ollama Chat Bot integration |
| ChatGPTBox: All in one browser extension | ChatGPTBox integration |
| Discord AI chat/moderation bot | Discord AI chat/moderation bot integration |
| Headless Ollama | Headless Ollama integration |
| Terraform AWS Ollama & Open WebUI | Terraform AWS Ollama & Open WebUI integration |
| node-red-contrib-ollama | node-red-contrib-ollama integration |
| Local AI Helper | Local AI Helper integration |
| vnc-lm | vnc-lm integration |
| LSP-AI | LSP-AI integration |
| QodeAssist | QodeAssist integration |
| Obsidian Quiz Generator plugin | Obsidian Quiz Generator plugin integration |
| AI Summmary Helper plugin | AI Summmary Helper plugin integration |
| TextCraft | TextCraft integration |
| Alfred Ollama | Alfred Ollama integration |
| TextLLaMA | TextLLaMA integration |
| Simple-Discord-AI | Simple-Discord-AI integration |
| LLM Telegram Bot | LLM Telegram Bot integration |
| mcp-llm | mcp-llm integration |
| SimpleOllamaUnity | SimpleOllamaUnity integration |
| UnityCodeLama | UnityCodeLama integration |
| NativeMind | NativeMind integration |
| GMAI - Gradle Managed AI | GMAI integration |
Supported backends
| Backend | Link |
|---|---|
| llama.cpp | llama.cpp integration |
Observability
| Tool | Link |
|---|---|
| Opik | Opik integration |
| Lunary | Lunary integration |
| OpenLIT | OpenLIT integration |
| HoneyHive | HoneyHive integration |
| Langfuse | Langfuse integration |
| MLflow Tracing | MLflow Tracing integration |
OpenedAI Speech
Notice: This software is mostly obsolete and will no longer be updated.
Some Alternatives:
- https://speaches.ai/
- https://github.com/remsky/Kokoro-FastAPI
- https://github.com/astramind-ai/Auralis
- https://lightning.ai/docs/litserve/home?code_sample=speech
An OpenAI API compatible text to speech server.
- Compatible with the OpenAI audio/speech API
- Serves the /v1/audio/speech endpoint
- Not affiliated with OpenAI in any way, does not require an OpenAI API Key
- A free, private, text-to-speech server with custom voice cloning
Full Compatibility:
| Feature | Description |
|---|---|
| `tts-1` | `alloy`, `echo`, `fable`, `onyx`, `nova`, and `shimmer` (configurable) |
| `tts-1-hd` | `alloy`, `echo`, `fable`, `onyx`, `nova`, and `shimmer` (configurable, uses OpenAI samples by default) |
| `response_format` | `mp3`, `opus`, `aac`, `flac`, `wav` and `pcm` |
| `speed` | 0.25-4.0 (and more) |
Details:
| Detail | Description |
|---|---|
| Model `tts-1` via piper tts | Very fast, runs on CPU |
| Model `tts-1-hd` via coqui-ai/TTS xtts_v2 voice cloning | Fast, but requires around 4GB GPU VRAM |
| Custom cloned voices | Can be used for tts-1-hd |
| 🌐 Multilingual support | With XTTS voices, the language is automatically detected if not set |
| Custom fine-tuned XTTS model support | See: Custom fine-tuned XTTS model support |
| Configurable generation parameters | See: Generation parameters |
| Streamed output | While generating |
| Occasionally, certain words or symbols may sound incorrect | Can be fixed with regex via pre_process_map.yaml |
| Tested with python | 3.9-3.11, piper does not install on python 3.12 yet |
If you find a better voice match for tts-1 or tts-1-hd, please let me know so I can update the defaults.
Recent Changes
| Version | Date | Changes |
|---|---|---|
| 0.18.2 | 2024-08-16 | Fix docker building for amd64, refactor github actions again, free up more disk space |
| 0.18.1 | 2024-08-15 | Refactor github actions |
| 0.18.0 | 2024-08-15 | Allow folders of wav samples in xtts. Samples will be combined, allowing for mixed voices and collections of small samples. Still limited to 30 seconds total. Fix missing yaml requirement in -min image. Fix fr_FR-tom-medium and other 44khz piper voices (detect non-default sample rates). Minor updates |
| 0.17.2 | 2024-07-01 | Fix -min image (re: langdetect) |
| 0.17.1 | 2024-07-01 | Fix ROCm (add langdetect to requirements-rocm.txt). Fix zh-cn for xtts |
| 0.17.0 | 2024-07-01 | Automatic language detection |
| 0.16.0 | 2024-06-29 | Multi-client safe version. Audio generation is synchronized in a single process. The estimated 'realtime' factor of XTTS on a GPU is roughly 1/3, this means that multiple streams simultaneously, or speed over 2, may experience audio underrun (delays or pauses in playback). This makes multiple clients possible and safe, but in practice 2 or 3 simultaneous streams is the maximum without audio underrun |
| 0.15.1 | 2024-06-27 | Remove deepspeed from requirements.txt, it's too complex for typical users. A more detailed deepspeed install document will be required |
| 0.15.0 | 2024-06-26 | Switch to coqui-tts (updated fork), updated simpler dependencies, torch 2.3, etc. Resolve cuda threading issues |
| 0.14.1 | 2024-06-26 | Make deepspeed possible (--use-deepspeed), but not enabled in pre-built docker images (too large). Requires the cuda-toolkit installed, see the Dockerfile comment for details |
| 0.14.0 | 2024-06-26 | Added response_format: wav and pcm support. Output streaming (while generating) for tts-1 and tts-1-hd. Enhanced generation parameters for xtts models (temperature, top_p, etc.). Idle unload timer (optional) - doesn't work perfectly yet. Improved error handling |
| 0.13.0 | 2024-06-25 | Added Custom fine-tuned XTTS model support. Initial prebuilt arm64 image support (Apple M-series, Raspberry Pi - MPS is not supported in XTTS/torch). Initial attempt at AMD GPU (ROCm 5.7) support. Parler-tts support removed. Move the *.default.yaml to the root folder. Run the docker as a service by default (restart: unless-stopped). Added audio_reader.py for streaming text input and reading long texts |
| 0.12.3 | 2024-06-17 | Additional logging details for BadRequests (400) |
| 0.12.2 | 2024-06-16 | Fix :min image requirements (numpy<2?) |
| 0.12.0 | 2024-06-16 | Improved error handling and logging. Restore the original alloy tts-1-hd voice by default, use alloy-alt for the old voice |
| 0.11.0 | 2024-05-29 | 🌐 Multilingual support (16 languages) with XTTS. Remove high Unicode filtering from the default config/pre_process_map.yaml. Update Docker build & app startup. Fix: "Plan failed with a cudnnException". Remove piper cuda support |
| 0.10.1 | 2024-05-05 | Remove runtime: nvidia from docker-compose.yml, this assumes nvidia/cuda compatible runtime is available by default |
| 0.10.0 | 2024-04-27 | Pre-built & tested docker images, smaller docker images (8GB or 860MB). Better upgrades: reorganize config files under config/, voice models under voices/. Default listen host to 0.0.0.0 |
| 0.9.0 | 2024-04-23 | Fix bug with yaml and loading UTF-8. New sample text-to-speech application say.py. Smaller docker base image. Add beta parler-tts support (you can describe very basic features of the speaker voice) |
| 0.7.3 | 2024-03-20 | Allow different xtts versions per voice in voice_to_speaker.yaml, ex. xtts_v2.0.2. Quality: Fix xtts sample rate (24000 vs. 22050 for piper) and pops |
Installation instructions
Create a speech.env environment file
Copy the sample.env to speech.env (customize if needed)
cp sample.env speech.env
Defaults
TTS_HOME=voices
HF_HOME=voices
#PRELOAD_MODEL=xtts
#PRELOAD_MODEL=xtts_v2.0.2
#EXTRA_ARGS=--log-level DEBUG --unload-timer 300
#USE_ROCM=1
Option A: Manual installation
# install curl and ffmpeg
sudo apt install curl ffmpeg
# Create & activate a new virtual environment (optional but recommended)
python -m venv .venv
source .venv/bin/activate
# Install the Python requirements
# - use requirements-rocm.txt for AMD GPU (ROCm support)
# - use requirements-min.txt for piper only (CPU only)
pip install -U -r requirements.txt
# run the server
bash startup.sh
On first run, the voice models will be downloaded automatically. This might take a while depending on your network connection.
Option B: Docker Image (recommended)
Nvidia GPU (cuda)
docker compose up
AMD GPU (ROCm support)
docker compose -f docker-compose.rocm.yml up
ARM64 (Apple M-series, Raspberry Pi)
XTTS only has CPU support here and will be very slow; you can use the Nvidia image for XTTS with CPU (slow), or use the piper-only image (recommended).
CPU only, No GPU (piper only)
For a minimal docker image with only piper support (<1GB vs. 8GB).
docker compose -f docker-compose.min.yml up
Server Options
usage: speech.py [-h] [--xtts_device XTTS_DEVICE] [--preload PRELOAD] [--unload-timer UNLOAD_TIMER] [--use-deepspeed] [--no-cache-speaker] [-P PORT] [-H HOST]
[-L {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
OpenedAI Speech API Server
options:
-h, --help show this help message and exit
--xtts_device XTTS_DEVICE
Set the device for the xtts model. The special value of 'none' will use piper for all models. (default: cuda)
--preload PRELOAD Preload a model (Ex. 'xtts' or 'xtts_v2.0.2'). By default it's loaded on first use. (default: None)
--unload-timer UNLOAD_TIMER
Idle unload timer for the XTTS model in seconds, Ex. 900 for 15 minutes (default: None)
--use-deepspeed Use deepspeed with xtts (this option is unsupported) (default: False)
--no-cache-speaker Don't use the speaker wav embeddings cache (default: False)
-P PORT, --port PORT Server tcp port (default: 8000)
-H HOST, --host HOST Host to listen on, Ex. 0.0.0.0 (default: 0.0.0.0)
-L {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Set the log level (default: INFO)
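For example, a CPU-only run on the default port (a sketch assuming the manual installation above; per the help text, the special device value 'none' makes piper handle all voices):

python speech.py --xtts_device none -P 8000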
Sample Usage
You can use it like this:
curl http://localhost:8000/v1/audio/speech -H "Content-Type: application/json" -d '{
"model": "tts-1",
"input": "The quick brown fox jumped over the lazy dog.",
"voice": "alloy",
"response_format": "mp3",
"speed": 1.0
}' > speech.mp3
Or just like this:
curl -s http://localhost:8000/v1/audio/speech -H "Content-Type: application/json" -d '{
"input": "The quick brown fox jumped over the lazy dog."}' > speech.mp3
Or like this example from the OpenAI Text to speech guide:
import openai
client = openai.OpenAI(
# This part is not needed if you set these environment variables before import openai
# export OPENAI_API_KEY=sk-11111111111
# export OPENAI_BASE_URL=http://localhost:8000/v1
api_key = "sk-111111111",
base_url = "http://localhost:8000/v1",
)
with client.audio.speech.with_streaming_response.create(
model="tts-1",
voice="alloy",
input="Today is a wonderful day to build something people love!"
) as response:
response.stream_to_file("speech.mp3")
Also see the say.py sample application for an example of how to use the openai-python API.
# play the audio, requires 'pip install playsound'
python say.py -t "The quick brown fox jumped over the lazy dog." -p
# save to a file in flac format
python say.py -t "The quick brown fox jumped over the lazy dog." -m tts-1-hd -v onyx -f flac -o fox.flac
You can also try the included audio_reader.py for listening to longer text and streamed input.
Example usage:
python audio_reader.py -s 2 < LICENSE # read the software license - fast
OpenAI API Documentation and Guide
| Documentation | Link |
|---|---|
| OpenAI Text to speech guide | OpenAI Text to speech guide |
| OpenAI API Reference | OpenAI API Reference |
Custom Voices Howto
Piper
- Select the piper voice and model from the piper samples
- Update `config/voice_to_speaker.yaml` with a new section for the voice, for example:
...
tts-1:
ryan:
model: voices/en_US-ryan-high.onnx
speaker: # default speaker
- New models will be downloaded as needed, or you can download them in advance with `download_voices_tts-1.sh`. For example:
bash download_voices_tts-1.sh en_US-ryan-high
Coqui XTTS v2
Coqui XTTS v2 voice cloning can work with as little as 6 seconds of clear audio. To create a custom voice clone, you must prepare a WAV file sample of the voice.
Guidelines for preparing good sample files for Coqui XTTS v2
| Guideline | Description |
|---|---|
| Mono (single channel) 22050 Hz WAV file | |
| 6-30 seconds long | Longer isn't always better (I've had some good results with as little as 4 seconds) |
| Low noise | No hiss or hum |
| No partial words, breathing, laughing, music or backgrounds sounds | |
| An even speaking pace with a variety of words is best | Like in interviews or audiobooks |
| Audio longer than 30 seconds will be silently truncated |
You can use FFmpeg to prepare your audio files, here are some examples:
# convert a multi-channel audio file to mono, set sample rate to 22050 hz, trim to 6 seconds, and output as WAV file.
ffmpeg -i input.mp3 -ac 1 -ar 22050 -t 6 -y me.wav
# use a simple noise filter to clean up audio, and select a start time for sampling.
ffmpeg -i input.wav -af "highpass=f=200, lowpass=f=3000" -ac 1 -ar 22050 -ss 00:13:26.2 -t 6 -y me.wav
# A more complex noise reduction setup, including volume adjustment
ffmpeg -i input.mkv -af "highpass=f=200, lowpass=f=3000, volume=5, afftdn=nf=25" -ac 1 -ar 22050 -ss 00:13:26.2 -t 6 -y me.wav
Once your WAV file is prepared, save it in the /voices/ directory and update the config/voice_to_speaker.yaml file with the new file name.
For example:
...
tts-1-hd:
me:
model: xtts
speaker: voices/me.wav # this could be you
You can also use a sub folder for multiple audio samples to combine small samples or to mix different samples together.
For example:
...
tts-1-hd:
mixed:
model: xtts
speaker: voices/mixed
Where the voices/mixed/ folder contains multiple wav files. The total audio length is still limited to 30 seconds.
Multilingual
Multilingual cloning support was added in version 0.11.0 and is available only with the XTTS v2 model. To use multilingual voices with piper simply download a language specific voice.
Coqui XTTSv2 has support for multiple languages: English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Polish (pl), Turkish (tr), Russian (ru), Dutch (nl), Czech (cs), Arabic (ar), Chinese (zh-cn), Hungarian (hu), Korean (ko), Japanese (ja), and Hindi (hi). When not set, an attempt will be made to automatically detect the language, falling back to English (en).
Unfortunately, the OpenAI API does not support a language parameter, but you can create your own custom speaker voice and set the language for it.
- Create the WAV file for your speaker, as in Custom Voices Howto
- Add the voice to `config/voice_to_speaker.yaml` and include the correct Coqui `language` code for the speaker. For example:
xunjiang:
model: xtts
speaker: voices/xunjiang.wav
language: zh-cn
- Don't strip high-Unicode characters in your `config/pre_process_map.yaml`! These filter lines were added to the config file by default before version 0.11.0; if your config still contains them, remove them:

Remove:

- - '[\U0001F600-\U0001F64F\U0001F300-\U0001F5FF\U0001F680-\U0001F6FF\U0001F700-\U0001F77F\U0001F780-\U0001F7FF\U0001F800-\U0001F8FF\U0001F900-\U0001F9FF\U0001FA00-\U0001FA6F\U0001FA70-\U0001FAFF\U00002702-\U000027B0\U000024C2-\U0001F251]+'
- ''
- Your new multi-lingual speaker voice is ready to use!
Custom Fine-Tuned Model Support
Adding a custom xtts model is simple. Here is an example of how to add a custom fine-tuned 'halo' XTTS model.
- Save the model folder under `voices/` (all 4 files are required, including the `vocab.json` from the model)
openedai-speech$ ls voices/halo/
config.json vocab.json model.pth sample.wav
- Add the custom voice entry under the `tts-1-hd` section of `config/voice_to_speaker.yaml`:
tts-1-hd:
...
halo:
model: halo # This name is required to be unique
speaker: voices/halo/sample.wav # voice sample is required
model_path: voices/halo
- The model will be loaded when you access the voice for the first time (`--preload` doesn't work with custom models yet)
Generation Parameters
The generation of XTTSv2 voices can be fine-tuned with the following options (defaults shown below):
tts-1-hd:
alloy:
model: xtts
speaker: voices/alloy.wav
enable_text_splitting: True
length_penalty: 1.0
repetition_penalty: 10
speed: 1.0
temperature: 0.75
top_k: 50
top_p: 0.85


