Model Clients#

AutoGen provides a suite of built-in model clients for using the ChatCompletion API. All model clients implement the ChatCompletionClient protocol class.

Currently, we support the following built-in model clients:

OpenAIChatCompletionClient: for OpenAI models and models with OpenAI API compatibility (e.g., Gemini).
AzureOpenAIChatCompletionClient: for Azure OpenAI models.
AzureAIChatCompletionClient: for GitHub models and models hosted on Azure.
OllamaChatCompletionClient (Experimental): for local models hosted on Ollama.
AnthropicChatCompletionClient (Experimental): for models hosted on Anthropic.
SKChatCompletionAdapter: adapter for Semantic Kernel AI connectors.

For more information on how to use these model clients, please refer to the documentation of each client.
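Because every client follows one shared protocol, calling code can be written against the interface rather than a concrete class. The sketch below illustrates that idea with the standard library only. SupportsCreate, EchoClient, UpperClient, and ask are hypothetical names invented for this illustration; the real ChatCompletionClient interface is larger and uses typed message objects rather than plain strings.

```python
import asyncio
from typing import Protocol, Sequence


class SupportsCreate(Protocol):
    """Hypothetical, cut-down interface: anything with an async create()."""

    async def create(self, messages: Sequence[str]) -> str: ...


class EchoClient:
    """Stand-in client that needs no network access or API key."""

    async def create(self, messages: Sequence[str]) -> str:
        return f"echo: {messages[-1]}"


class UpperClient:
    """A second stand-in with different behavior but the same method shape."""

    async def create(self, messages: Sequence[str]) -> str:
        return messages[-1].upper()


async def ask(client: SupportsCreate, prompt: str) -> str:
    # The caller depends only on the shared interface, not on a concrete class.
    return await client.create([prompt])


echo_reply = asyncio.run(ask(EchoClient(), "hello"))
upper_reply = asyncio.run(ask(UpperClient(), "hello"))
print(echo_reply)
print(upper_reply)
```

Swapping one client for another then only changes the object passed to ask, not the calling code.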

Log Model Calls#

AutoGen uses the standard Python logging module to log events such as model calls and responses. The logger name is autogen_core.EVENT_LOGGER_NAME, and the event type is LLMCall.

import logging

from autogen_core import EVENT_LOGGER_NAME

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger(EVENT_LOGGER_NAME)
logger.addHandler(logging.StreamHandler())
logger.setLevel(logging.INFO)
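To process the logged events in code instead of printing them, a custom handler can be attached to the same logger. The following is a minimal sketch using only the standard library. The string "autogen_core.events" is an assumed stand-in for the value of EVENT_LOGGER_NAME, ListHandler is a hypothetical helper, and the log calls simulate events rather than coming from a real model call.

```python
import logging

# Assumption: stand-in for the value of autogen_core.EVENT_LOGGER_NAME.
EVENT_LOGGER_NAME = "autogen_core.events"


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps every record it receives in a list."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger(EVENT_LOGGER_NAME)
logger.setLevel(logging.INFO)
handler = ListHandler()
logger.addHandler(handler)

# Simulated events; with AutoGen installed, the library would emit these.
logger.info("simulated model call")
logger.debug("below INFO, so the logger drops it")

captured = [record.getMessage() for record in handler.records]
print(captured)
```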

Call Model Client#

To call a model client, you can use the create() method. This example uses the OpenAIChatCompletionClient to call an OpenAI model.

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4", temperature=0.3
)  # assuming OPENAI_API_KEY is set in the environment.

result = await model_client.create([UserMessage(content="What is the capital of France?", source="user")])
print(result)

finish_reason='stop' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=15, completion_tokens=8) cached=False logprobs=None thought=None
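The snippet above uses a bare await, which works in a notebook or another environment that already runs an event loop. In a plain Python script, the same call has to live inside a coroutine that is started with asyncio.run. Here is a minimal sketch of that wrapping; fake_create is a hypothetical stub standing in for model_client.create so the example runs without an API key.

```python
import asyncio


async def fake_create(prompt: str) -> str:
    """Hypothetical stand-in for `await model_client.create(...)`."""
    await asyncio.sleep(0)  # yield to the event loop once, then continue
    return f"(stub answer to: {prompt})"


async def main() -> str:
    # Inside a coroutine, `await` works as in the notebook-style snippet.
    return await fake_create("What is the capital of France?")


# asyncio.run creates the event loop, runs main() to completion, and closes the loop.
answer = asyncio.run(main())
print(answer)
```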

Streaming Tokens#

You can use the create_stream() method to create a chat completion request with streaming token chunks.

from autogen_core.models import CreateResult, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(model="gpt-4o")  # assuming OPENAI_API_KEY is set in the environment.

messages = [
    UserMessage(content="Write a very short story about a dragon.", source="user"),
]

# Create a stream.
stream = model_client.create_stream(messages=messages)

# Iterate over the stream and print the responses.
print("Streamed responses:")
async for chunk in stream:  # type: ignore
    if isinstance(chunk, str):
        # The chunk is a string.
        print(chunk, flush=True, end="")
    else:
        # The final chunk is a CreateResult object.
        assert isinstance(chunk, CreateResult) and isinstance(chunk.content, str)
        # The last response is a CreateResult object with the complete message.
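        print("\n\n------------\n")
        print("The complete response:", flush=True)
        print(chunk.content, flush=True)
        # Also print the token usage reported in the final result.
        print("\n\n------------\n")
        print("The token usage was:", flush=True)
        print(chunk.usage, flush=True)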

Streamed responses:
In the heart of an ancient forest, beneath the shadow of snow-capped peaks, a dragon named Elara lived secretly for centuries. Elara was unlike any dragon from the old tales; her scales shimmered with a deep emerald hue, each scale engraved with symbols of lost wisdom. The villagers in the nearby valley spoke of mysterious lights dancing across the night sky, but none dared venture close enough to solve the enigma.

One cold winter's eve, a young girl named Lira, brimming with curiosity and armed with the innocence of youth, wandered into Elara’s domain. Instead of fire and fury, she found warmth and a gentle gaze. The dragon shared stories of a world long forgotten and in return, Lira gifted her simple stories of human life, rich in laughter and scent of earth.

From that night on, the villagers noticed subtle changes—the crops grew taller, and the air seemed sweeter. Elara had infused the valley with ancient magic, a guardian of balance, watching quietly as her new friend thrived under the stars. And so, Lira and Elara’s bond marked the beginning of a timeless friendship that spun tales of hope whispered through the leaves of the ever-verdant forest.

------------

The complete response:
In the heart of an ancient forest, beneath the shadow of snow-capped peaks, a dragon named Elara lived secretly for centuries. Elara was unlike any dragon from the old tales; her scales shimmered with a deep emerald hue, each scale engraved with symbols of lost wisdom. The villagers in the nearby valley spoke of mysterious lights dancing across the night sky, but none dared venture close enough to solve the enigma.

One cold winter's eve, a young girl named Lira, brimming with curiosity and armed with the innocence of youth, wandered into Elara’s domain. Instead of fire and fury, she found warmth and a gentle gaze. The dragon shared stories of a world long forgotten and in return, Lira gifted her simple stories of human life, rich in laughter and scent of earth.

From that night on, the villagers noticed subtle changes—the crops grew taller, and the air seemed sweeter. Elara had infused the valley with ancient magic, a guardian of balance, watching quietly as her new friend thrived under the stars. And so, Lira and Elara’s bond marked the beginning of a timeless friendship that spun tales of hope whispered through the leaves of the ever-verdant forest.

------------

The token usage was:
RequestUsage(prompt_tokens=0, completion_tokens=0)

Note

The last response in the streaming response is always the final response of the type CreateResult.

Note

By default, the usage reported in the streamed result contains only zero values, as in the output above. To enable usage reporting, see create_stream() for more details.
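The "string chunks first, final result last" shape of the stream can be tried offline with an async generator. The sketch below is only an illustration of that consumption pattern: FinalResult and fake_stream are hypothetical stand-ins for CreateResult and create_stream, and no model is called.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, Union


@dataclass
class FinalResult:
    """Hypothetical stand-in for CreateResult, holding the complete text."""

    content: str


async def fake_stream(text: str) -> AsyncGenerator[Union[str, FinalResult], None]:
    # Yield the text word by word, then one final object with the full text.
    for word in text.split(" "):
        yield word + " "
    yield FinalResult(content=text)


async def consume(text: str) -> tuple[list[str], FinalResult]:
    chunks: list[str] = []
    final = None
    async for item in fake_stream(text):
        if isinstance(item, str):
            chunks.append(item)  # partial output, e.g. for live display
        else:
            final = item  # the last item carries the complete message
    assert final is not None
    return chunks, final


chunks, final = asyncio.run(consume("a tiny dragon story"))
print(chunks)
print(final.content)
```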

Structured Output#

Structured output can be enabled by setting the response_format field in OpenAIChatCompletionClient and AzureOpenAIChatCompletionClient to a Pydantic BaseModel class.

Note

Structured output is only available for models that support it. It also requires the model client to support structured output. Currently, OpenAIChatCompletionClient and AzureOpenAIChatCompletionClient support structured output.

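from typing import Literal

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient
from pydantic import BaseModel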


# The response format for the agent as a Pydantic base model.
class AgentResponse(BaseModel):
    thoughts: str
    response: Literal["happy", "sad", "neutral"]


# Create an agent that uses the OpenAI GPT-4o model with the custom response format.
model_client = OpenAIChatCompletionClient(
    model="gpt-4o",
    response_format=AgentResponse,  # type: ignore
)

# Send a message list to the model and await the response.
messages = [
    UserMessage(content="I am happy.", source="user"),
]
response = await model_client.create(messages=messages)
assert isinstance(response.content, str)
parsed_response = AgentResponse.model_validate_json(response.content)
print(parsed_response.thoughts)
print(parsed_response.response)

# Close the connection to the model client.
await model_client.close()

I'm glad to hear that you're feeling happy! It's such a great emotion that can brighten your whole day. Is there anything in particular that's bringing you joy today? 😊
happy

You can also use the extra_create_args parameter in the create() method to set the response_format field, so that structured output can be configured for each request.

Caching Model Responses#

autogen_ext implements ChatCompletionCache, which can wrap any ChatCompletionClient. Using this wrapper avoids incurring token usage when the underlying client is queried with the same prompt multiple times.

ChatCompletionCache uses a CacheStore protocol. We have implemented some useful variants of CacheStore, including DiskCacheStore and RedisStore.

Here's an example of using diskcache for local caching:

# pip install -U "autogen-ext[openai, diskcache]"

import asyncio
import tempfile

from autogen_core.models import UserMessage
from autogen_ext.cache_store.diskcache import DiskCacheStore
from autogen_ext.models.cache import CHAT_CACHE_VALUE_TYPE, ChatCompletionCache
from autogen_ext.models.openai import OpenAIChatCompletionClient
from diskcache import Cache


async def main() -> None:
    with tempfile.TemporaryDirectory() as tmpdirname:
        # Initialize the original client
        openai_model_client = OpenAIChatCompletionClient(model="gpt-4o")

        # Then initialize the CacheStore, in this case with diskcache.Cache.
        # You can also use redis like:
        # from autogen_ext.cache_store.redis import RedisStore
        # import redis
        # redis_instance = redis.Redis()
        # cache_store = RedisStore[CHAT_CACHE_VALUE_TYPE](redis_instance)
        cache_store = DiskCacheStore[CHAT_CACHE_VALUE_TYPE](Cache(tmpdirname))
        cache_client = ChatCompletionCache(openai_model_client, cache_store)

        response = await cache_client.create([UserMessage(content="Hello, how are you?", source="user")])
        print(response)  # Should print response from OpenAI
        response = await cache_client.create([UserMessage(content="Hello, how are you?", source="user")])
        print(response)  # Should print cached response

        await openai_model_client.close()
        await cache_client.close()


asyncio.run(main())

True

Inspecting cache_client.total_usage() (or openai_model_client.total_usage()) before and after a cached response should yield identical counts.

Note that the caching is sensitive to the exact arguments provided to cache_client.create or cache_client.create_stream, so changing the tools or json_output arguments might lead to a cache miss.
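The sensitivity to exact arguments follows from how a cache key is typically derived. The following sketch shows one plausible scheme: hash a canonical serialization of all arguments, then look the hash up in a store. It is not the actual key derivation used by ChatCompletionCache; make_key, DictStore, cached_create, and expensive_model_call are hypothetical names, and the "model call" is a local counter.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional


def make_key(messages: List[str], **kwargs: Any) -> str:
    # Serialize every argument deterministically, then hash the result.
    payload = json.dumps({"messages": messages, "kwargs": kwargs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class DictStore:
    """Hypothetical in-memory store with the get/set shape of a cache store."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


calls = 0


def expensive_model_call(messages: List[str]) -> str:
    """Stand-in for a real model request; counts how often it is invoked."""
    global calls
    calls += 1
    return f"reply #{calls}"


def cached_create(store: DictStore, messages: List[str], **kwargs: Any) -> str:
    key = make_key(messages, **kwargs)
    hit = store.get(key)
    if hit is not None:
        return hit  # cache hit: no model call, no token usage
    value = expensive_model_call(messages)
    store.set(key, value)
    return value


store = DictStore()
first = cached_create(store, ["Hello, how are you?"], json_output=False)
second = cached_create(store, ["Hello, how are you?"], json_output=False)
third = cached_create(store, ["Hello, how are you?"], json_output=True)
print(first, second, third, calls)
```

The second call reuses the stored reply, while the third differs only in json_output and therefore produces a new key and a new model call.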

Build an Agent with a Model Client#

Let's create a simple AI agent that can respond to messages using the ChatCompletion API.

from dataclasses import dataclass

from autogen_core import MessageContext, RoutedAgent, SingleThreadedAgentRuntime, message_handler
from autogen_core.models import ChatCompletionClient, SystemMessage, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


@dataclass
class Message:
    content: str


class SimpleAgent(RoutedAgent):
    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("A simple agent")
        self._system_messages = [SystemMessage(content="You are a helpful AI assistant.")]
        self._model_client = model_client

    @message_handler
    async def handle_user_message(self, message: Message, ctx: MessageContext) -> Message:
        # Prepare input to the chat completion model.
        user_message = UserMessage(content=message.content, source="user")
        response = await self._model_client.create(
            self._system_messages + [user_message], cancellation_token=ctx.cancellation_token
        )
        # Return with the model's response.
        assert isinstance(response.content, str)
        return Message(content=response.content)

The SimpleAgent class is a subclass of the autogen_core.RoutedAgent class, which automatically routes messages to the appropriate handlers. It has a single handler, handle_user_message, which handles messages from the user. The handler uses the ChatCompletionClient to generate a response to the message, then returns the response to the user, following the direct communication model.

Note

The cancellation_token of the type autogen_core.CancellationToken is used to cancel asynchronous operations. It is linked to the async calls inside the message handlers and can be used by the caller to cancel the handlers.

# Create the runtime and register the agent.
from autogen_core import AgentId

model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    # api_key="sk-...", # Optional if you have an OPENAI_API_KEY set in the environment.
)

runtime = SingleThreadedAgentRuntime()
await SimpleAgent.register(
    runtime,
    "simple_agent",
    lambda: SimpleAgent(model_client=model_client),
)
# Start the runtime processing messages.
runtime.start()
# Send a message to the agent and get the response.
message = Message("Hello, what are some fun things to do in Seattle?")
response = await runtime.send_message(message, AgentId("simple_agent", "default"))
print(response.content)
# Stop the runtime processing messages.
await runtime.stop()
await model_client.close()

Seattle is a vibrant city with a wide range of activities and attractions. Here are some fun things to do in Seattle:

**Space Needle**: Visit this iconic observation tower for stunning views of the city and surrounding mountains.

**Pike Place Market**: Explore this historic market where you can see the famous fish toss, buy local produce, and find unique crafts and eateries.

**Museum of Pop Culture (MoPOP)**: Dive into the world of contemporary culture, music, and science fiction at this interactive museum.

**Chihuly Garden and Glass**: Marvel at the beautiful glass art installations by artist Dale Chihuly, located right next to the Space Needle.

**Seattle Aquarium**: Discover the diverse marine life of the Pacific Northwest at this engaging aquarium.

**Seattle Art Museum**: Explore a vast collection of art from around the world, including contemporary and indigenous art.

**Kerry Park**: For one of the best views of the Seattle skyline, head to this small park on Queen Anne Hill.

**Ballard Locks**: Watch boats pass through the locks and observe the salmon ladder to see salmon migrating.

**Ferry to Bainbridge Island**: Take a scenic ferry ride across Puget Sound to enjoy charming shops, restaurants, and beautiful natural scenery.

**Olympic Sculpture Park**: Stroll through this outdoor park with large-scale sculptures and stunning views of the waterfront and mountains.

**Underground Tour**: Discover Seattle's history on this quirky tour of the city's underground passageways in Pioneer Square.

**Seattle Waterfront**: Enjoy the shops, restaurants, and attractions along the waterfront, including the Seattle Great Wheel and the aquarium.

**Discovery Park**: Explore the largest green space in Seattle, featuring trails, beaches, and views of Puget Sound.

**Food Tours**: Try out Seattle’s diverse culinary scene, including fresh seafood, international cuisines, and coffee culture (don’t miss the original Starbucks!).

**Attend a Sports Game**: Catch a Seahawks (NFL), Mariners (MLB), or Sounders (MLS) game for a lively local experience.

Whether you're interested in culture, nature, food, or history, Seattle has something for everyone to enjoy!

The SimpleAgent above always responds with a fresh context that contains only the system message and the latest user message. We can use model context classes from autogen_core.model_context to make the agent "remember" previous conversations. See the Model Context page for more details.
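To make the idea of "remembering" concrete, here is a minimal standard-library sketch of a bounded message history that is prepended with the system message on every request. BufferedHistory and its methods are hypothetical names for this illustration, not the classes in autogen_core.model_context, whose API may differ.

```python
from collections import deque
from typing import Deque, List, Tuple

Message = Tuple[str, str]  # (role, content)


class BufferedHistory:
    """Hypothetical context that keeps only the most recent messages."""

    def __init__(self, buffer_size: int) -> None:
        # A deque with maxlen silently discards the oldest entry when full.
        self._messages: Deque[Message] = deque(maxlen=buffer_size)

    def add(self, role: str, content: str) -> None:
        self._messages.append((role, content))

    def build_prompt(self, system: str) -> List[Message]:
        # The system message always comes first, followed by the kept history.
        return [("system", system), *self._messages]


history = BufferedHistory(buffer_size=2)
history.add("user", "My name is Ada.")
history.add("assistant", "Nice to meet you, Ada.")
history.add("user", "What is my name?")

prompt = history.build_prompt("You are a helpful AI assistant.")
for role, content in prompt:
    print(f"{role}: {content}")
```

With a buffer size of 2, the first user message has already been dropped, which shows the trade-off between context length and how much the agent can recall.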

API Keys From Environment Variables#

In the examples above, we show that you can provide the API key through the api_key argument. Importantly, the OpenAI and Azure OpenAI clients use the openai package, which automatically reads an API key from the environment variable if one is not provided.

For OpenAI, you can set the OPENAI_API_KEY environment variable.
For Azure OpenAI, you can set the AZURE_OPENAI_API_KEY environment variable.

In addition, for Gemini (Beta), you can set the GEMINI_API_KEY environment variable.

This is a good practice, as it keeps sensitive API keys out of your code.
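When a key is needed explicitly, for example to pass it through api_key, it can be read from the environment with an early, clear failure if it is missing. This is a minimal sketch: load_api_key is a hypothetical helper, and EXAMPLE_API_KEY with its placeholder value is set inside the script only so the example runs on its own. In real use, the variable would be set in the shell or a secrets manager, never in code.

```python
import os


def load_api_key(var_name: str) -> str:
    """Hypothetical helper: return the key or fail with a clear message."""
    key = os.environ.get(var_name, "")
    if not key:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return key


# Demo only: a made-up variable and placeholder value, not a real credential.
os.environ["EXAMPLE_API_KEY"] = "placeholder-value"
api_key = load_api_key("EXAMPLE_API_KEY")
print("key loaded, length:", len(api_key))

# A missing variable fails early instead of causing an obscure error later.
os.environ.pop("EXAMPLE_MISSING_KEY", None)
try:
    load_api_key("EXAMPLE_MISSING_KEY")
    missing_raised = False
except RuntimeError:
    missing_raised = True
print("missing key raised:", missing_raised)
```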