autogen_ext.tools.graphrag#

class GlobalSearchTool(token_encoder: Encoding, model: ChatModel, data_config: GlobalDataConfig, context_config: GlobalContextConfig = _default_context_config, mapreduce_config: MapReduceConfig = _default_mapreduce_config)[source]#

Bases: BaseTool[GlobalSearchToolArgs, GlobalSearchToolReturn]

Enables running GraphRAG global search queries as an AutoGen tool.

This tool lets you perform semantic search over a document corpus using the GraphRAG framework. The search combines graph-based document relationships with semantic embeddings to find relevant information.

Note

This tool requires the graphrag extra for the autogen-ext package.

To install:

pip install -U "autogen-agentchat" "autogen-ext[graphrag]"

Before using this tool, you must complete the GraphRAG setup and indexing process:

  1. Follow the GraphRAG documentation to initialize your project and settings

  2. Configure and tune your prompts for your specific use case

  3. Run the indexing process to generate the required data files

  4. Make sure you have the settings.yaml file produced by the setup process

For detailed instructions on these preparation steps, see the [GraphRAG documentation](https://msdocs.de/graphrag/).
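The steps above must have produced a settings.yaml under the project root before from_settings will work. A minimal pre-flight check using only the standard library (check_graphrag_setup is a hypothetical helper, not part of autogen-ext):

```python
from pathlib import Path


def check_graphrag_setup(root_dir):
    """Return the settings.yaml path, failing early if indexing was never run."""
    settings = Path(root_dir) / "settings.yaml"
    if not settings.exists():
        raise FileNotFoundError(
            f"{settings} not found - complete the GraphRAG init/indexing steps first"
        )
    return settings
```

Calling this before constructing the tool turns a confusing downstream configuration error into an actionable message.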

Example usage with AssistantAgent

import asyncio
from pathlib import Path
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.ui import Console
from autogen_ext.tools.graphrag import GlobalSearchTool
from autogen_agentchat.agents import AssistantAgent


async def main():
    # Initialize the OpenAI client
    openai_client = OpenAIChatCompletionClient(
        model="gpt-4o-mini",
        api_key="<api-key>",
    )

    # Set up global search tool
    global_tool = GlobalSearchTool.from_settings(root_dir=Path("./"), config_filepath=Path("./settings.yaml"))

    # Create assistant agent with the global search tool
    assistant_agent = AssistantAgent(
        name="search_assistant",
        tools=[global_tool],
        model_client=openai_client,
        system_message=(
            "You are a tool selector AI assistant using the GraphRAG framework. "
            "Your primary task is to determine the appropriate search tool to call based on the user's query. "
            "For broader, abstract questions requiring a comprehensive understanding of the dataset, call the 'global_search' function."
        ),
    )

    # Run a sample query
    query = "What is the overall sentiment of the community reports?"
    await Console(assistant_agent.run_stream(task=query))


if __name__ == "__main__":
    asyncio.run(main())
async run(args: GlobalSearchToolArgs, cancellation_token: CancellationToken) GlobalSearchToolReturn[source]#
classmethod from_settings(root_dir: str | Path, config_filepath: str | Path | None = None) GlobalSearchTool[source]#

Create a GlobalSearchTool instance from a GraphRAG settings file.

Parameters:
  • root_dir – Path to the GraphRAG root directory

  • config_filepath – Path to the GraphRAG settings file (optional)

Returns:

An initialized GlobalSearchTool instance

class LocalSearchTool(token_encoder: Encoding, model: ChatModel, embedder: EmbeddingModel, data_config: LocalDataConfig, context_config: LocalContextConfig = _default_context_config, search_config: SearchConfig = _default_search_config)[source]#

Bases: BaseTool[LocalSearchToolArgs, LocalSearchToolReturn]

Enables running GraphRAG local search queries as an AutoGen tool.

This tool lets you perform semantic search over a document corpus using the GraphRAG framework. The search combines local document context with semantic embeddings to find relevant information.

Note

This tool requires the graphrag extra for the autogen-ext package. To install:

pip install -U "autogen-agentchat" "autogen-ext[graphrag]"

Before using this tool, you must complete the GraphRAG setup and indexing process:

  1. Follow the GraphRAG documentation to initialize your project and settings

  2. Configure and tune your prompts for your specific use case

  3. Run the indexing process to generate the required data files

  4. Make sure you have the settings.yaml file produced by the setup process

For detailed instructions on these preparation steps, see the [GraphRAG documentation](https://msdocs.de/graphrag/).

Example usage with AssistantAgent

import asyncio
from pathlib import Path
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.ui import Console
from autogen_ext.tools.graphrag import LocalSearchTool
from autogen_agentchat.agents import AssistantAgent


async def main():
    # Initialize the OpenAI client
    openai_client = OpenAIChatCompletionClient(
        model="gpt-4o-mini",
        api_key="<api-key>",
    )

    # Set up local search tool
    local_tool = LocalSearchTool.from_settings(root_dir=Path("./"), config_filepath=Path("./settings.yaml"))

    # Create assistant agent with the local search tool
    assistant_agent = AssistantAgent(
        name="search_assistant",
        tools=[local_tool],
        model_client=openai_client,
        system_message=(
            "You are a tool selector AI assistant using the GraphRAG framework. "
            "Your primary task is to determine the appropriate search tool to call based on the user's query. "
            "For specific, detailed information about particular entities or relationships, call the 'local_search' function."
        ),
    )

    # Run a sample query
    query = "What does the station-master say about Dr. Becher?"
    await Console(assistant_agent.run_stream(task=query))


if __name__ == "__main__":
    asyncio.run(main())
Parameters:
  • token_encoder (tiktoken.Encoding) – The tokenizer used for text encoding

  • model – The chat model to use for search (a GraphRAG ChatModel)

  • embedder – The text embedding model to use (a GraphRAG EmbeddingModel)

  • data_config (DataConfig) – Configuration for data source locations and settings

  • context_config (LocalContextConfig, optional) – Configuration for context building. Defaults to a default configuration.

  • search_config (SearchConfig, optional) – Configuration for search operations. Defaults to a default configuration.

async run(args: LocalSearchToolArgs, cancellation_token: CancellationToken) LocalSearchToolReturn[source]#
classmethod from_settings(root_dir: Path, config_filepath: Path | None = None) LocalSearchTool[source]#

Create a LocalSearchTool instance from a GraphRAG settings file.

Parameters:
  • root_dir – Path to the GraphRAG root directory

  • config_filepath – Path to the GraphRAG settings file (optional)

Returns:

An initialized LocalSearchTool instance

pydantic model GlobalDataConfig[source]#

Bases: DataConfig

Show JSON schema
{
   "title": "GlobalDataConfig",
   "type": "object",
   "properties": {
      "input_dir": {
         "title": "Input Dir",
         "type": "string"
      },
      "entity_table": {
         "default": "entities",
         "title": "Entity Table",
         "type": "string"
      },
      "entity_embedding_table": {
         "default": "entities",
         "title": "Entity Embedding Table",
         "type": "string"
      },
      "community_table": {
         "default": "communities",
         "title": "Community Table",
         "type": "string"
      },
      "community_level": {
         "default": 2,
         "title": "Community Level",
         "type": "integer"
      },
      "community_report_table": {
         "default": "community_reports",
         "title": "Community Report Table",
         "type": "string"
      }
   },
   "required": [
      "input_dir"
   ]
}

Fields:
  • community_report_table (str)

field community_report_table: str = 'community_reports'#
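The defaults above are table names, not file paths. Assuming the common GraphRAG output layout where each table is stored as `<name>.parquet` under `input_dir` (an assumption about the indexing output, not stated on this page), the files the tool would read can be sketched as:

```python
from pathlib import Path

# Default table names taken from the GlobalDataConfig schema above.
GLOBAL_TABLES = {
    "entity_table": "entities",
    "entity_embedding_table": "entities",
    "community_table": "communities",
    "community_report_table": "community_reports",
}


def expected_parquet_paths(input_dir):
    """Map each config field to the parquet file the tool is expected to read."""
    base = Path(input_dir)
    return {field: base / f"{name}.parquet" for field, name in GLOBAL_TABLES.items()}
```

This also shows why entity_table and entity_embedding_table may point at the same underlying file.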
pydantic model LocalDataConfig[source]#

Bases: DataConfig

Show JSON schema
{
   "title": "LocalDataConfig",
   "type": "object",
   "properties": {
      "input_dir": {
         "title": "Input Dir",
         "type": "string"
      },
      "entity_table": {
         "default": "entities",
         "title": "Entity Table",
         "type": "string"
      },
      "entity_embedding_table": {
         "default": "entities",
         "title": "Entity Embedding Table",
         "type": "string"
      },
      "community_table": {
         "default": "communities",
         "title": "Community Table",
         "type": "string"
      },
      "community_level": {
         "default": 2,
         "title": "Community Level",
         "type": "integer"
      },
      "relationship_table": {
         "default": "relationships",
         "title": "Relationship Table",
         "type": "string"
      },
      "text_unit_table": {
         "default": "text_units",
         "title": "Text Unit Table",
         "type": "string"
      }
   },
   "required": [
      "input_dir"
   ]
}

Fields:
  • relationship_table (str)

  • text_unit_table (str)

field relationship_table: str = 'relationships'#
field text_unit_table: str = 'text_units'#
pydantic model GlobalContextConfig[source]#

Bases: ContextConfig

Show JSON schema
{
   "title": "GlobalContextConfig",
   "type": "object",
   "properties": {
      "max_data_tokens": {
         "default": 12000,
         "title": "Max Data Tokens",
         "type": "integer"
      },
      "use_community_summary": {
         "default": false,
         "title": "Use Community Summary",
         "type": "boolean"
      },
      "shuffle_data": {
         "default": true,
         "title": "Shuffle Data",
         "type": "boolean"
      },
      "include_community_rank": {
         "default": true,
         "title": "Include Community Rank",
         "type": "boolean"
      },
      "min_community_rank": {
         "default": 0,
         "title": "Min Community Rank",
         "type": "integer"
      },
      "community_rank_name": {
         "default": "rank",
         "title": "Community Rank Name",
         "type": "string"
      },
      "include_community_weight": {
         "default": true,
         "title": "Include Community Weight",
         "type": "boolean"
      },
      "community_weight_name": {
         "default": "occurrence weight",
         "title": "Community Weight Name",
         "type": "string"
      },
      "normalize_community_weight": {
         "default": true,
         "title": "Normalize Community Weight",
         "type": "boolean"
      }
   }
}

Fields:
  • community_rank_name (str)

  • community_weight_name (str)

  • include_community_rank (bool)

  • include_community_weight (bool)

  • max_data_tokens (int)

  • min_community_rank (int)

  • normalize_community_weight (bool)

  • shuffle_data (bool)

  • use_community_summary (bool)

field use_community_summary: bool = False#
field shuffle_data: bool = True#
field include_community_rank: bool = True#
field min_community_rank: int = 0#
field community_rank_name: str = 'rank'#
field include_community_weight: bool = True#
field community_weight_name: str = 'occurrence weight'#
field normalize_community_weight: bool = True#
field max_data_tokens: int = 12000#
pydantic model GlobalSearchToolArgs[source]#

Bases: BaseModel

Show JSON schema
{
   "title": "GlobalSearchToolArgs",
   "type": "object",
   "properties": {
      "query": {
         "description": "The user query to perform global search on.",
         "title": "Query",
         "type": "string"
      }
   },
   "required": [
      "query"
   ]
}

Fields:
  • query (str)

field query: str [Required]#

The user query to perform global search on.

pydantic model GlobalSearchToolReturn[source]#

Bases: BaseModel

Show JSON schema
{
   "title": "GlobalSearchToolReturn",
   "type": "object",
   "properties": {
      "answer": {
         "title": "Answer",
         "type": "string"
      }
   },
   "required": [
      "answer"
   ]
}

Fields:
  • answer (str)

field answer: str [Required]#
pydantic model LocalContextConfig[source]#

Bases: ContextConfig

Show JSON schema
{
   "title": "LocalContextConfig",
   "type": "object",
   "properties": {
      "max_data_tokens": {
         "default": 8000,
         "title": "Max Data Tokens",
         "type": "integer"
      },
      "text_unit_prop": {
         "default": 0.5,
         "title": "Text Unit Prop",
         "type": "number"
      },
      "community_prop": {
         "default": 0.25,
         "title": "Community Prop",
         "type": "number"
      },
      "include_entity_rank": {
         "default": true,
         "title": "Include Entity Rank",
         "type": "boolean"
      },
      "rank_description": {
         "default": "number of relationships",
         "title": "Rank Description",
         "type": "string"
      },
      "include_relationship_weight": {
         "default": true,
         "title": "Include Relationship Weight",
         "type": "boolean"
      },
      "relationship_ranking_attribute": {
         "default": "rank",
         "title": "Relationship Ranking Attribute",
         "type": "string"
      }
   }
}

Fields:
  • community_prop (float)

  • include_entity_rank (bool)

  • include_relationship_weight (bool)

  • rank_description (str)

  • relationship_ranking_attribute (str)

  • text_unit_prop (float)

field text_unit_prop: float = 0.5#
field community_prop: float = 0.25#
field include_entity_rank: bool = True#
field rank_description: str = 'number of relationships'#
field include_relationship_weight: bool = True#
field relationship_ranking_attribute: str = 'rank'#
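text_unit_prop and community_prop are fractions of the max_data_tokens budget (0.5 and 0.25 by default), with the remainder available for other context such as entities and relationships. A rough sketch of the arithmetic implied by the defaults (the exact allocation inside GraphRAG may differ):

```python
def context_budget(max_data_tokens=8000, text_unit_prop=0.5, community_prop=0.25):
    """Split the local-search context window by the configured proportions."""
    return {
        "text_units": int(max_data_tokens * text_unit_prop),
        "community_reports": int(max_data_tokens * community_prop),
        "remainder": int(max_data_tokens * (1 - text_unit_prop - community_prop)),
    }
```

With the defaults this reserves 4000 tokens for text units, 2000 for community reports, and leaves 2000 for everything else.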
pydantic model LocalSearchToolArgs[source]#

Bases: BaseModel

Show JSON schema
{
   "title": "LocalSearchToolArgs",
   "type": "object",
   "properties": {
      "query": {
         "description": "The user query to perform local search on.",
         "title": "Query",
         "type": "string"
      }
   },
   "required": [
      "query"
   ]
}

Fields:
  • query (str)

field query: str [Required]#

The user query to perform local search on.

pydantic model LocalSearchToolReturn[source]#

Bases: BaseModel

Show JSON schema
{
   "title": "LocalSearchToolReturn",
   "type": "object",
   "properties": {
      "answer": {
         "description": "The answer to the user query.",
         "title": "Answer",
         "type": "string"
      }
   },
   "required": [
      "answer"
   ]
}

Fields:
  • answer (str)

field answer: str [Required]#

The answer to the user query.

pydantic model MapReduceConfig[source]#

Bases: BaseModel

Show JSON schema
{
   "title": "MapReduceConfig",
   "type": "object",
   "properties": {
      "map_max_tokens": {
         "default": 1000,
         "title": "Map Max Tokens",
         "type": "integer"
      },
      "map_temperature": {
         "default": 0.0,
         "title": "Map Temperature",
         "type": "number"
      },
      "reduce_max_tokens": {
         "default": 2000,
         "title": "Reduce Max Tokens",
         "type": "integer"
      },
      "reduce_temperature": {
         "default": 0.0,
         "title": "Reduce Temperature",
         "type": "number"
      },
      "allow_general_knowledge": {
         "default": false,
         "title": "Allow General Knowledge",
         "type": "boolean"
      },
      "json_mode": {
         "default": false,
         "title": "Json Mode",
         "type": "boolean"
      },
      "response_type": {
         "default": "multiple paragraphs",
         "title": "Response Type",
         "type": "string"
      }
   }
}

Fields:
  • allow_general_knowledge (bool)

  • json_mode (bool)

  • map_max_tokens (int)

  • map_temperature (float)

  • reduce_max_tokens (int)

  • reduce_temperature (float)

  • response_type (str)

field map_max_tokens: int = 1000#
field map_temperature: float = 0.0#
field reduce_max_tokens: int = 2000#
field reduce_temperature: float = 0.0#
field allow_general_knowledge: bool = False#
field json_mode: bool = False#
field response_type: str = 'multiple paragraphs'#
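MapReduceConfig governs the two LLM stages of global search: a map stage that answers the query against batches of community reports (budgeted by map_max_tokens), and a reduce stage that synthesizes the partial answers into one response (budgeted by reduce_max_tokens). A toy sketch of that control flow, illustrative only and not the actual GraphRAG implementation:

```python
def map_reduce(chunks, map_fn, reduce_fn):
    """Apply map_fn to each context chunk, then combine with reduce_fn."""
    partial_answers = [map_fn(chunk) for chunk in chunks]  # map stage: one call per chunk
    return reduce_fn(partial_answers)                      # reduce stage: final synthesis
```

In the real tool both stages are LLM calls; allow_general_knowledge controls whether the reduce stage may add facts beyond the indexed corpus, and response_type shapes the final answer format.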
pydantic model SearchConfig[source]#

Bases: BaseModel

Show JSON schema
{
   "title": "SearchConfig",
   "type": "object",
   "properties": {
      "max_tokens": {
         "default": 1500,
         "title": "Max Tokens",
         "type": "integer"
      },
      "temperature": {
         "default": 0.0,
         "title": "Temperature",
         "type": "number"
      },
      "response_type": {
         "default": "multiple paragraphs",
         "title": "Response Type",
         "type": "string"
      }
   }
}

Fields:
  • max_tokens (int)

  • response_type (str)

  • temperature (float)

field max_tokens: int = 1500#
field temperature: float = 0.0#
field response_type: str = 'multiple paragraphs'#