Mirror of https://github.com/MindWorkAI/AI-Studio.git (synced 2025-08-18 21:02:56 +00:00)
Added vLLM support (#524)
Co-authored-by: Thorsten Sommer <SommerEngineering@users.noreply.github.com>
Parent: fe2baa8c00
Commit: b75d90b13f
@@ -47,6 +47,7 @@ Other News:

Features we have recently released:

+- v0.9.50: Added support for self-hosted LLMs using [vLLM](https://blog.vllm.ai/2023/06/20/vllm.html).
- v0.9.46: Released our plugin system, a German language plugin, early support for enterprise environments, and configuration plugins. Additionally, we added the Pandoc integration for future data processing and file generation.
- v0.9.45: Added chat templates to AI Studio, allowing you to create and use a library of system prompts for your chats.
- v0.9.44: Added PDF import to the text summarizer, translation, and legal check assistants, allowing you to import PDF files and use them as input for the assistants.
@@ -57,7 +58,6 @@ Features we have recently released:
- v0.9.26+: Added RAG for external data sources using our [ERI interface](https://mindworkai.org/#eri---external-retrieval-interface) as a preview feature.
- v0.9.25: Added [xAI](https://x.ai/) as a new provider. xAI provides their Grok models for generating content.
- v0.9.23: Added support for OpenAI `o` models (`o1`, `o1-mini`, `o3`, etc.); added also an [ERI](https://github.com/MindWorkAI/ERI) server coding assistant as a preview feature behind the RAG feature flag. Your own ERI server can be used to gain access to, e.g., your enterprise data from within AI Studio.
-- v0.9.22: Added options for preview features; added embedding provider configuration for RAG (preview) and writer mode (experimental preview).

## What is AI Studio?

@@ -71,7 +71,7 @@ MindWork AI Studio is a free desktop app for macOS, Windows, and Linux. It provi
**Key advantages:**
- **Free of charge**: The app is free to use, both for personal and commercial purposes.
- **Independence**: You are not tied to any single provider. Instead, you can choose the providers that best suit your needs. Right now, we support:
-- [OpenAI](https://openai.com/) (GPT4o, GPT4.1, o1, o3, o4, etc.)
+- [OpenAI](https://openai.com/) (GPT5, GPT4.1, o1, o3, o4, etc.)
- [Mistral](https://mistral.ai/)
- [Anthropic](https://www.anthropic.com/) (Claude)
- [Google Gemini](https://gemini.google.com)
@@ -79,7 +79,7 @@ MindWork AI Studio is a free desktop app for macOS, Windows, and Linux. It provi
- [DeepSeek](https://www.deepseek.com/en)
- [Alibaba Cloud](https://www.alibabacloud.com) (Qwen)
- [Hugging Face](https://huggingface.co/) using their [inference providers](https://huggingface.co/docs/inference-providers/index) such as Cerebras, Nebius, Sambanova, Novita, Hyperbolic, Together AI, Fireworks, Hugging Face
-- Self-hosted models using [llama.cpp](https://github.com/ggerganov/llama.cpp), [ollama](https://github.com/ollama/ollama), [LM Studio](https://lmstudio.ai/)
+- Self-hosted models using [llama.cpp](https://github.com/ggerganov/llama.cpp), [ollama](https://github.com/ollama/ollama), [LM Studio](https://lmstudio.ai/), and [vLLM](https://github.com/vllm-project/vllm)
- [Groq](https://groq.com/)
- [Fireworks](https://fireworks.ai/)
- For scientists and employees of research institutions, we also support [Helmholtz](https://helmholtz.cloud/services/?serviceID=d7d5c597-a2f6-4bd1-b71e-4d6499d98570) and [GWDG](https://gwdg.de/services/application-services/ai-services/) AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities.
@@ -4573,9 +4573,6 @@ UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T1702902297"] = "Introduction"
-- Vision
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T1892426825"] = "Vision"

--- You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities.
-UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2217921237"] = "You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities."
-
-- Let's get started
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2331588413"] = "Let's get started"

@@ -4585,6 +4582,9 @@ UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2348849647"] = "Last Changelog"
-- Choose the provider and model best suited for your current task.
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2588488920"] = "Choose the provider and model best suited for your current task."

+-- You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using vLLM, llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities.
+UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2900280782"] = "You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using vLLM, llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities."
+
-- Quick Start Guide
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T3002014720"] = "Quick Start Guide"

@@ -164,7 +164,7 @@ public partial class ProviderDialog : MSGComponentBase, ISecretId
//
// We cannot load the API key for self-hosted providers:
//
-if (this.DataLLMProvider is LLMProviders.SELF_HOSTED && this.DataHost is not Host.OLLAMA)
+if (this.DataLLMProvider is LLMProviders.SELF_HOSTED && this.DataHost is not Host.OLLAMA && this.DataHost is not Host.VLLM)
{
await this.ReloadModels();
await base.OnInitializedAsync();
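Note on the change above: vLLM's OpenAI-compatible server can be started with an API key (for example, `vllm serve <model> --api-key <secret>`), and clients then have to send that key as a bearer token. That is presumably why the dialog now loads a stored key for vLLM hosts in addition to Ollama. A minimal client-side sketch follows; `SelfHostedHost` and `loadApiKeyAsync` are stand-ins, since the app's actual `ISecretId`-based secret storage is not part of this diff.

```csharp
// Hedged sketch only: SelfHostedHost stands in for the app's Host enum, and
// loadApiKeyAsync for whatever ISecretId-based lookup the dialog really uses.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public enum SelfHostedHost { LM_STUDIO, LLAMACPP, OLLAMA, VLLM }

public static class SelfHostedKeySketch
{
    public static async Task<HttpClient> CreateClientAsync(SelfHostedHost host, Func<Task<string?>> loadApiKeyAsync)
    {
        var client = new HttpClient();

        // Mirrors the dialog logic above: only Ollama and vLLM hosts get a stored key attached.
        if (host is SelfHostedHost.OLLAMA or SelfHostedHost.VLLM)
        {
            var apiKey = await loadApiKeyAsync();
            if (!string.IsNullOrWhiteSpace(apiKey))
                client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
        }

        return client;
    }
}
```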
@@ -31,7 +31,7 @@ public partial class Home : MSGComponentBase
{
this.itemsAdvantages = [
new(this.T("Free of charge"), this.T("The app is free to use, both for personal and commercial purposes.")),
-new(this.T("Independence"), this.T("You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities.")),
+new(this.T("Independence"), this.T("You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using vLLM, llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities.")),
new(this.T("Assistants"), this.T("You just want to quickly translate a text? AI Studio has so-called assistants for such and other tasks. No prompting is necessary when working with these assistants.")),
new(this.T("Unrestricted usage"), this.T("Unlike services like ChatGPT, which impose limits after intensive use, MindWork AI Studio offers unlimited usage through the providers API.")),
new(this.T("Cost-effective"), this.T("You only pay for what you use, which can be cheaper than monthly subscription services like ChatGPT Plus, especially if used infrequently. But beware, here be dragons: For extremely intensive usage, the API costs can be significantly higher. Unfortunately, providers currently do not offer a way to display current costs in the app. Therefore, check your account with the respective provider to see how your costs are developing. When available, use prepaid and set a cost limit.")),
@@ -4575,9 +4575,6 @@ UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T1702902297"] = "Einführung"
-- Vision
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T1892426825"] = "Vision"

--- You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities.
-UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2217921237"] = "Sie sind nicht an einen einzelnen Anbieter gebunden. Stattdessen können Sie den Anbieter wählen, der am besten zu ihren Bedürfnissen passt. Aktuell unterstützen wir OpenAI (GPT4o, o1 usw.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face sowie selbst gehostete Modelle mit llama.cpp, ollama, LM Studio, Groq oder Fireworks. Für Wissenschaftler und Beschäftigte von Forschungseinrichtungen unterstützen wir außerdem die KI-Dienste von Helmholtz und GWDG. Diese sind über föderierte Logins wie eduGAIN für alle 18 Helmholtz-Zentren, die Max-Planck-Gesellschaft, die meisten deutschen sowie viele internationale Universitäten verfügbar."
-
-- Let's get started
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2331588413"] = "Los geht's"

@@ -4587,6 +4584,9 @@ UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2348849647"] = "Letztes Änderungsproto
-- Choose the provider and model best suited for your current task.
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2588488920"] = "Wählen Sie den Anbieter und das Modell aus, die am besten zu ihrer aktuellen Aufgabe passen."

+-- You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using vLLM, llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities.
+UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2900280782"] = "Sie sind an keinen einzelnen Anbieter gebunden. Stattdessen können Sie den Anbieter wählen, der am besten zu ihren Bedürfnissen passt. Derzeit unterstützen wir OpenAI (GPT5, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face und selbst gehostete Modelle mit vLLM, llama.cpp, ollama, LM Studio, Groq oder Fireworks. Für Wissenschaftler und Mitarbeiter von Forschungseinrichtungen unterstützen wir auch die KI-Dienste von Helmholtz und GWDG. Diese sind über föderierte Anmeldungen wie eduGAIN für alle 18 Helmholtz-Zentren, die Max-Planck-Gesellschaft, die meisten deutschen und viele internationale Universitäten verfügbar."
+
-- Quick Start Guide
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T3002014720"] = "Schnellstart-Anleitung"

@@ -4575,9 +4575,6 @@ UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T1702902297"] = "Introduction"
-- Vision
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T1892426825"] = "Vision"

--- You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities.
-UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2217921237"] = "You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities."
-
-- Let's get started
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2331588413"] = "Let's get started"

@@ -4587,6 +4584,9 @@ UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2348849647"] = "Last Changelog"
-- Choose the provider and model best suited for your current task.
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2588488920"] = "Choose the provider and model best suited for your current task."

+-- You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT4o, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using vLLM, llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities.
+UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T2900280782"] = "You are not tied to any single provider. Instead, you might choose the provider that best suits your needs. Right now, we support OpenAI (GPT5, o1, etc.), Mistral, Anthropic (Claude), Google Gemini, xAI (Grok), DeepSeek, Alibaba Cloud (Qwen), Hugging Face, and self-hosted models using vLLM, llama.cpp, ollama, LM Studio, Groq, or Fireworks. For scientists and employees of research institutions, we also support Helmholtz and GWDG AI services. These are available through federated logins like eduGAIN to all 18 Helmholtz Centers, the Max Planck Society, most German, and many international universities."
+
-- Quick Start Guide
UI_TEXT_CONTENT["AISTUDIO::PAGES::HOME::T3002014720"] = "Quick Start Guide"

@@ -285,7 +285,7 @@ public static class LLMProvidersExtensions
LLMProviders.GWDG => true,
LLMProviders.HUGGINGFACE => true,

-LLMProviders.SELF_HOSTED => host is Host.OLLAMA,
+LLMProviders.SELF_HOSTED => host is (Host.OLLAMA or Host.VLLM),

_ => false,
};
@@ -322,6 +322,7 @@ public static class LLMProvidersExtensions

case Host.OLLAMA:
case Host.LM_STUDIO:
+case Host.VLLM:
return true;
}
}
@@ -6,11 +6,8 @@ namespace AIStudio.Provider.SelfHosted;
/// <param name="Model">Which model to use for chat completion.</param>
/// <param name="Messages">The chat messages.</param>
/// <param name="Stream">Whether to stream the chat completion.</param>
-/// <param name="MaxTokens">The maximum number of tokens to generate.</param>
public readonly record struct ChatRequest(
string Model,
IList<Message> Messages,
-bool Stream,
-int MaxTokens
+bool Stream
);
@@ -7,4 +7,5 @@ public enum Host
LM_STUDIO,
LLAMACPP,
OLLAMA,
+VLLM,
}
@@ -9,6 +9,7 @@ public static class HostExtensions
Host.LM_STUDIO => "LM Studio",
Host.LLAMACPP => "llama.cpp",
Host.OLLAMA => "ollama",
+Host.VLLM => "vLLM",

_ => "Unknown",
};
@@ -29,6 +30,7 @@ public static class HostExtensions
{
case Host.LM_STUDIO:
case Host.OLLAMA:
+case Host.VLLM:
return true;

default:
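Pieced together, the self-hosted host handling after this commit reads roughly as follows. This is a consolidated sketch for readability only: the extension-method name `Name` and the leading `NONE` member are assumptions, since the diff shows just the switch arms and the tail of the enum.

```csharp
namespace AIStudio.Provider.SelfHosted;

public enum Host
{
    NONE,        // assumed; members before LM_STUDIO are not visible in this diff
    LM_STUDIO,
    LLAMACPP,
    OLLAMA,
    VLLM,
}

public static class HostExtensions
{
    // Human-readable host name, now including vLLM (method name assumed).
    public static string Name(this Host host) => host switch
    {
        Host.LM_STUDIO => "LM Studio",
        Host.LLAMACPP => "llama.cpp",
        Host.OLLAMA => "ollama",
        Host.VLLM => "vLLM",
        _ => "Unknown",
    };
}
```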
@@ -58,8 +58,7 @@ public sealed class ProviderSelfHosted(ILogger logger, Host host, string hostnam
}).ToList()],

// Right now, we only support streaming completions:
-Stream = true,
-MaxTokens = -1,
+Stream = true
}, JSON_SERIALIZER_OPTIONS);

async Task<HttpRequestMessage> RequestBuilder()
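Since vLLM, like the other hosts in this code path, exposes an OpenAI-compatible chat API, the payload built above amounts to a streaming `POST` to `/v1/chat/completions`; after this change it is always streamed and no longer carries a `max_tokens` field. A rough sketch of such a request is shown below; the `Message` shape, the endpoint path handling, and the serializer options are assumptions, as none of them appear in this diff.

```csharp
// Hedged sketch: Message and the JSON options are assumed (not shown in this diff).
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Text.Json;

public readonly record struct Message(string Role, string Content);

public readonly record struct ChatRequest(string Model, IList<Message> Messages, bool Stream);

public static class VllmChatSketch
{
    private static readonly JsonSerializerOptions JSON_OPTIONS = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase, // yields "model", "messages", "stream"
    };

    public static HttpRequestMessage BuildStreamingRequest(string baseUrl, string model, string prompt)
    {
        // After this commit the request is always streamed and no longer sets MaxTokens.
        var payload = JsonSerializer.Serialize(new ChatRequest(
            Model: model,
            Messages: [new Message("user", prompt)],
            Stream: true
        ), JSON_OPTIONS);

        // vLLM's OpenAI-compatible server serves chat completions under /v1/chat/completions.
        return new HttpRequestMessage(HttpMethod.Post, $"{baseUrl}/v1/chat/completions")
        {
            Content = new StringContent(payload, Encoding.UTF8, "application/json"),
        };
    }
}
```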
@@ -101,6 +100,7 @@ public sealed class ProviderSelfHosted(ILogger logger, Host host, string hostnam

case Host.LM_STUDIO:
case Host.OLLAMA:
+case Host.VLLM:
return await this.LoadModels(["embed"], [], token, apiKeyProvisional);
}

@@ -127,6 +127,7 @@ public sealed class ProviderSelfHosted(ILogger logger, Host host, string hostnam
{
case Host.LM_STUDIO:
case Host.OLLAMA:
+case Host.VLLM:
return await this.LoadModels([], ["embed"], token, apiKeyProvisional);
}

@@ -2,5 +2,6 @@
- Added an option for chat templates to predefine a user input.
- Added the ability to create chat templates from existing chats.
- Added an enterprise IT configuration option to prevent manual addition of LLM providers in managed environments.
+- Added support for self-hosted LLMs using [vLLM](https://blog.vllm.ai/2023/06/20/vllm.html).
- Improved the display of enterprise configurations on the about page; configuration details are only shown when needed.
- Improved hot reloading on Unix-like systems when entire plugins were added or removed.