StatGPT

StatGPT is an AI-driven, open-source Talk-To-Your-Data platform specializing in official statistical data and serving the international community of statisticians and data users. It is designed to help users and statistical organizations query, transform, analyze, visualize, and interpret statistical data and publications through a natural language interface. StatGPT is built on top of the DIAL platform, which enables enterprise-grade governance and security.

By leveraging an agentic architecture, StatGPT can respond to complex queries that require multiple steps and involve multiple tools:

  • Data Query tool: Builds and executes SDMX queries based on natural language input.
  • Publications RAG tool: Retrieves information from analytical publications (usually in PDF format) using the RAG approach.
  • Glossary tool: Retrieves information from a statistical organization's glossary of terms and definitions.
  • Web Search tool: Retrieves information from whitelisted websites using the RAG approach.
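
To make the Data Query tool's output more concrete, the sketch below shows what an SDMX query can look like once a natural language request has been resolved into structured parameters. It follows the SDMX REST data query pattern (`/data/{flow}/{key}` with `startPeriod` and `endPeriod`). The function name `build_sdmx_data_url`, the base URL, the dataflow ID, and the dimension codes are all hypothetical and chosen for illustration; this is not StatGPT's actual implementation.

```python
from urllib.parse import urlencode


def build_sdmx_data_url(base_url, flow, dimensions, start_period=None, end_period=None):
    """Build an SDMX REST data query URL (illustrative helper, not StatGPT code).

    dimensions: one list of codes per dimension, in the order defined by the
    data structure definition; an empty list acts as a wildcard.
    """
    # Codes within one dimension are joined with "+", dimensions with ".".
    key = ".".join("+".join(codes) for codes in dimensions)
    url = f"{base_url.rstrip('/')}/data/{flow}/{key}"

    # Optional time range filter.
    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical request: annual data for two countries, any value of the last dimension.
example_url = build_sdmx_data_url(
    "https://example.org/sdmx/",
    "EXAMPLE_FLOW",
    [["A"], ["DE", "FR"], []],
    start_period="2020",
    end_period="2023",
)
print(example_url)
```

In a setup like this, the language model would only need to choose the dataflow and the codes for each dimension; assembling the URL itself stays deterministic.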

The system is designed to be user-friendly: users interact with it in natural language. StatGPT can handle a wide range of queries, from simple data requests to more complex inquiries that require analysis and visualization.

The admin part of the platform allows statistical organizations to onboard their data sources and publications, configure the application, and manage the knowledge base. The platform is designed to be flexible and scalable, allowing organizations to customize the application to meet their specific needs.

StatGPT offers several interfaces for end users and for integration with other systems:

  • Standalone chat interface: A web-based chat interface for users to interact with the system.
  • API: A RESTful API for programmatic integration with other systems.
  • Chat Overlay: A chatbot interface that can be embedded into other applications, including statistical data dissemination portals.
  • MS Excel Add-in: An add-in that lets users interact with the system directly from a familiar spreadsheet interface.