autogen_ext.models.azure#

class AzureAIChatCompletionClient(**kwargs: Unpack)[Source]#

Bases: ChatCompletionClient

Chat completion client for models hosted on Azure AI Foundry or GitHub Models. See here for more information.

Parameters:
  • endpoint (str) – The endpoint to use. Required.

  • credential (Union[AzureKeyCredential, AsyncTokenCredential]) – The credential to use. Required.

  • model_info (ModelInfo) – The model family and capabilities of the model. Required.

  • model (str) – The name of the model. Required if the model is hosted on GitHub Models.

  • frequency_penalty – (optional, float)

  • presence_penalty – (optional, float)

  • temperature – (optional, float)

  • top_p – (optional, float)

  • max_tokens – (optional, int)

  • response_format – (optional, Literal["text", "json_object"])

  • stop – (optional, List[str])

  • tools – (optional, List[ChatCompletionsToolDefinition])

  • tool_choice – (optional, Union[str, ChatCompletionsToolChoicePreset, ChatCompletionsNamedToolChoice])

  • seed – (optional, int)

  • model_extras – (optional, Dict[str, Any])

To use this client, you must install the azure extra:

pip install "autogen-ext[azure]"

The following code snippet shows how to use the client with GitHub Models:

import asyncio
import os
from azure.core.credentials import AzureKeyCredential
from autogen_ext.models.azure import AzureAIChatCompletionClient
from autogen_core.models import UserMessage


async def main():
    client = AzureAIChatCompletionClient(
        model="Phi-4",
        endpoint="https://models.github.ai/inference",
        # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.
        # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens
        credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
        model_info={
            "json_output": False,
            "function_calling": False,
            "vision": False,
            "family": "unknown",
            "structured_output": False,
        },
    )

    result = await client.create([UserMessage(content="What is the capital of France?", source="user")])
    print(result)

    # Close the client.
    await client.close()


if __name__ == "__main__":
    asyncio.run(main())
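
The credential parameter also accepts an AsyncTokenCredential for Microsoft Entra ID authentication, for example DefaultAzureCredential from azure.identity.aio. The following is a minimal sketch for an Azure AI Foundry endpoint; the endpoint URL and deployment name are placeholders you must replace with your own, and the azure-identity package is assumed to be installed:

import asyncio

from autogen_core.models import UserMessage
from autogen_ext.models.azure import AzureAIChatCompletionClient
from azure.identity.aio import DefaultAzureCredential  # assumes azure-identity is installed


async def main():
    # DefaultAzureCredential is an AsyncTokenCredential; the async context
    # manager closes it when done.
    async with DefaultAzureCredential() as credential:
        client = AzureAIChatCompletionClient(
            model="my-deployment",  # placeholder deployment name
            endpoint="https://<your-resource>.services.ai.azure.com/models",  # placeholder endpoint
            credential=credential,
            model_info={
                "json_output": False,
                "function_calling": False,
                "vision": False,
                "family": "unknown",
                "structured_output": False,
            },
        )

        result = await client.create([UserMessage(content="What is the capital of France?", source="user")])
        print(result)

        # Close the client.
        await client.close()


if __name__ == "__main__":
    asyncio.run(main())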

For streaming, you can use the create_stream method:

import asyncio
import os

from autogen_core.models import UserMessage
from autogen_ext.models.azure import AzureAIChatCompletionClient
from azure.core.credentials import AzureKeyCredential


async def main():
    client = AzureAIChatCompletionClient(
        model="Phi-4",
        endpoint="https://models.github.ai/inference",
        # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.
        # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens
        credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
        model_info={
            "json_output": False,
            "function_calling": False,
            "vision": False,
            "family": "unknown",
            "structured_output": False,
        },
    )

    # Create a stream.
    stream = client.create_stream([UserMessage(content="Write a poem about the ocean", source="user")])
    async for chunk in stream:
        print(chunk, end="", flush=True)
    print()

    # Close the client.
    await client.close()


if __name__ == "__main__":
    asyncio.run(main())

add_usage(usage: RequestUsage) → None[Source]#
async create(messages: Sequence[Annotated[SystemMessage | UserMessage | AssistantMessage | FunctionExecutionResultMessage, FieldInfo(annotation=NoneType, required=True, discriminator='type')]], *, tools: Sequence[Tool | ToolSchema] = [], tool_choice: Tool | Literal['auto', 'required', 'none'] = 'auto', json_output: bool | type[BaseModel] | None = None, extra_create_args: Mapping[str, Any] = {}, cancellation_token: CancellationToken | None = None) → CreateResult[Source]#

Creates a single response from the model.

Parameters:
  • messages (Sequence[LLMMessage]) – The messages to send to the model.

  • tools (Sequence[Tool | ToolSchema], optional) – The tools to use with the model. Defaults to [].

  • tool_choice (Tool | Literal["auto", "required", "none"], optional) – A single Tool object to force the model to use, “auto” to let the model choose any available tool, “required” to force tool usage, or “none” to disable tool usage. Defaults to “auto”.

  • json_output (Optional[bool | type[BaseModel]], optional) – Whether to use JSON mode, structured output, or neither. Defaults to None. If set to a Pydantic BaseModel type, it will be used as the output type for structured output. If set to a boolean, it will be used to determine whether to use JSON mode or not. If set to True, make sure to instruct the model to produce JSON output in the instruction or prompt.

  • extra_create_args (Mapping[str, Any], optional) – Extra arguments to pass to the underlying client. Defaults to {}.

  • cancellation_token (Optional[CancellationToken], optional) – A token for cancellation. Defaults to None.

Returns:

CreateResult – The result of the model call.
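
As a sketch of the json_output parameter with structured output: passing a Pydantic model type requests output conforming to its schema. This assumes the chosen model actually supports structured output (with structured_output set to True in model_info); the model name below is illustrative, not a recommendation:

import asyncio
import os

from autogen_core.models import UserMessage
from autogen_ext.models.azure import AzureAIChatCompletionClient
from azure.core.credentials import AzureKeyCredential
from pydantic import BaseModel


class CapitalAnswer(BaseModel):
    city: str
    country: str


async def main():
    # Assumes a model with structured output support; set the model_info
    # flags to match the real capabilities of your model.
    client = AzureAIChatCompletionClient(
        model="gpt-4o",  # illustrative model name
        endpoint="https://models.github.ai/inference",
        credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
        model_info={
            "json_output": True,
            "function_calling": False,
            "vision": False,
            "family": "unknown",
            "structured_output": True,
        },
    )

    result = await client.create(
        [UserMessage(content="What is the capital of France?", source="user")],
        json_output=CapitalAnswer,
    )
    # Assuming the model returned a JSON string matching the schema,
    # parse it back into the Pydantic model.
    answer = CapitalAnswer.model_validate_json(result.content)
    print(answer)

    await client.close()


if __name__ == "__main__":
    asyncio.run(main())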

async create_stream(messages: Sequence[Annotated[SystemMessage | UserMessage | AssistantMessage | FunctionExecutionResultMessage, FieldInfo(annotation=NoneType, required=True, discriminator='type')]], *, tools: Sequence[Tool | ToolSchema] = [], tool_choice: Tool | Literal['auto', 'required', 'none'] = 'auto', json_output: bool | type[BaseModel] | None = None, extra_create_args: Mapping[str, Any] = {}, cancellation_token: CancellationToken | None = None) → AsyncGenerator[str | CreateResult, None][Source]#

Creates a stream of string chunks from the model ending with a CreateResult.

Parameters:
  • messages (Sequence[LLMMessage]) – The messages to send to the model.

  • tools (Sequence[Tool | ToolSchema], optional) – The tools to use with the model. Defaults to [].

  • tool_choice (Tool | Literal["auto", "required", "none"], optional) – A single Tool object to force the model to use, “auto” to let the model choose any available tool, “required” to force tool usage, or “none” to disable tool usage. Defaults to “auto”.

  • json_output (Optional[bool | type[BaseModel]], optional) – Whether to use JSON mode, structured output, or neither. Defaults to None. If set to a Pydantic BaseModel type, it will be used as the output type for structured output. If set to a boolean, it will be used to determine whether to use JSON mode or not. If set to True, make sure to instruct the model to produce JSON output in the instruction or prompt.

  • extra_create_args (Mapping[str, Any], optional) – Extra arguments to pass to the underlying client. Defaults to {}.

  • cancellation_token (Optional[CancellationToken], optional) – A token for cancellation. Defaults to None.

Returns:

AsyncGenerator[Union[str, CreateResult], None] – A generator that yields string chunks and ends with a CreateResult.
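
Both create and create_stream accept Tool objects via the tools parameter, for example FunctionTool from autogen_core.tools. A minimal sketch, assuming a model with function calling support (and function_calling set to True in model_info); the model name is illustrative:

import asyncio
import os

from autogen_core.models import UserMessage
from autogen_core.tools import FunctionTool
from autogen_ext.models.azure import AzureAIChatCompletionClient
from azure.core.credentials import AzureKeyCredential


async def get_weather(city: str) -> str:
    """Return a canned weather report for the given city."""
    return f"The weather in {city} is sunny."


async def main():
    # Assumes a model with function calling support; set the model_info
    # flags to match the real capabilities of your model.
    client = AzureAIChatCompletionClient(
        model="gpt-4o",  # illustrative model name
        endpoint="https://models.github.ai/inference",
        credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
        model_info={
            "json_output": False,
            "function_calling": True,
            "vision": False,
            "family": "unknown",
            "structured_output": False,
        },
    )

    weather_tool = FunctionTool(get_weather, description="Get the weather for a city.")
    result = await client.create(
        [UserMessage(content="What is the weather in Paris?", source="user")],
        tools=[weather_tool],
        tool_choice="auto",
    )
    # When the model decides to call the tool, result.content holds
    # FunctionCall objects (tool name plus JSON-encoded arguments);
    # otherwise it is a plain string.
    print(result.content)

    await client.close()


if __name__ == "__main__":
    asyncio.run(main())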

async close() → None[Source]#
actual_usage() → RequestUsage[Source]#
total_usage() → RequestUsage[Source]#
count_tokens(messages: Sequence[Annotated[SystemMessage | UserMessage | AssistantMessage | FunctionExecutionResultMessage, FieldInfo(annotation=NoneType, required=True, discriminator='type')]], *, tools: Sequence[Tool | ToolSchema] = []) → int[Source]#
remaining_tokens(messages: Sequence[Annotated[SystemMessage | UserMessage | AssistantMessage | FunctionExecutionResultMessage, FieldInfo(annotation=NoneType, required=True, discriminator='type')]], *, tools: Sequence[Tool | ToolSchema] = []) → int[Source]#
property model_info: ModelInfo#
property capabilities: ModelInfo#
class AzureAIChatCompletionClientConfig[Source]#

Bases: dict

endpoint: str#
credential: AzureKeyCredential | AsyncTokenCredential#
model_info: ModelInfo#
frequency_penalty: float | None#
presence_penalty: float | None#
temperature: float | None#
top_p: float | None#
max_tokens: int | None#
response_format: Literal['text', 'json_object'] | None#
stop: List[str] | None#
tools: List[ChatCompletionsToolDefinition] | None#
tool_choice: str | ChatCompletionsToolChoicePreset | ChatCompletionsNamedToolChoice | None#
seed: int | None#
model: str | None#
model_extras: Dict[str, Any] | None#
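
These keys map directly to the keyword arguments of AzureAIChatCompletionClient. A short sketch passing some of the optional sampling parameters; the values are illustrative, not recommendations:

import os

from autogen_ext.models.azure import AzureAIChatCompletionClient
from azure.core.credentials import AzureKeyCredential

# Optional config keys become constructor keyword arguments;
# the sampling values below are illustrative only.
client = AzureAIChatCompletionClient(
    model="Phi-4",
    endpoint="https://models.github.ai/inference",
    credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
    model_info={
        "json_output": False,
        "function_calling": False,
        "vision": False,
        "family": "unknown",
        "structured_output": False,
    },
    temperature=0.2,
    top_p=0.9,
    max_tokens=512,
    seed=42,
)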