Configurations

Configurations to set

All available configurations are listed below with their descriptions, default values, and possible values. Set them using environment variables.

Application run configurations

Host

Host where your application will be running.

DATU_HOST=0.0.0.0

Port

Port on which your application listens.

DATU_PORT=8000

OpenAI key

OpenAI LLM API key.

DATU_OPENAI_API_KEY=
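As an illustration of how these three variables could be consumed, here is a minimal Python sketch. It is not Datu's actual loader: the function name `load_run_config` is hypothetical, and the fallback values are simply the ones shown on this page.

```python
import os


def load_run_config(env=None):
    """Read the run settings from an environment mapping.

    Hypothetical helper for illustration only; falls back to the
    values documented on this page when a variable is unset.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("DATU_HOST", "0.0.0.0"),
        # Environment variables are strings, so the port is converted.
        "port": int(env.get("DATU_PORT", "8000")),
        "openai_api_key": env.get("DATU_OPENAI_API_KEY", ""),
    }


# Passing an explicit mapping makes the behaviour easy to check
# without touching the real process environment.
print(load_run_config({"DATU_PORT": "9000"}))
```

Accepting the mapping as a parameter keeps the sketch testable; calling it with no argument reads from `os.environ`.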

Schema configurations

Schema refresh threshold days

Number of days after which the cached schema is refreshed.

DATU_SCHEMA_REFRESH_THRESHOLD_DAYS=2
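One way to picture the threshold is as an age check on the cached schema. The sketch below is an assumption about the behaviour, not Datu's implementation: `schema_is_stale` is a hypothetical name, and treating an age of exactly the threshold as stale is a choice made here for illustration.

```python
from datetime import datetime, timedelta


def schema_is_stale(cached_at, now, threshold_days=2):
    """Return True when the cached schema is at least `threshold_days` old.

    Hypothetical helper; the inclusive comparison is an assumption.
    """
    return now - cached_at >= timedelta(days=threshold_days)


now = datetime(2024, 1, 10)
print(schema_is_stale(datetime(2024, 1, 7), now))  # cached three days ago
print(schema_is_stale(datetime(2024, 1, 9), now))  # cached one day ago
```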

Schema cache file

File in which the schema cache is stored.

DATU_SCHEMA_CACHE_FILE=schema_cache.json

Schema categorical detection

Whether to detect categorical columns.

DATU_SCHEMA_CATEGORICAL_DETECTION=true

Schema sample limit

Maximum number of samples used to determine categorical values.

DATU_SAMPLE_LIMIT=1000

Schema categorical threshold

Threshold for a column's values to be considered categorical.

DATU_SCHEMA_CATEGORICAL_THRESHOLD=10
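The sample limit and the categorical threshold plausibly work together. The sketch below shows one possible reading, and it rests on assumptions this page does not confirm: that the threshold is the maximum number of distinct values, that the sample is the first `sample_limit` values, and that nulls are ignored. `is_categorical` is a hypothetical name, not a Datu function.

```python
from itertools import islice


def is_categorical(values, sample_limit=1000, threshold=10):
    """Guess whether a column is categorical from a bounded sample.

    Assumption: a column is categorical when the sample contains at
    least one and at most `threshold` distinct non-null values.
    """
    # Look at no more than `sample_limit` values from the column.
    sample = islice(values, sample_limit)
    distinct = {value for value in sample if value is not None}
    return 0 < len(distinct) <= threshold


print(is_categorical(["red", "green", "red", None]))
print(is_categorical(range(50)))
```

Under this reading, a smaller sample limit makes detection cheaper but can misclassify a high-cardinality column whose first values happen to repeat.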

SchemaRAG Engine

Extracts the relevant part of the schema based on the user query. Set to True to enable it.

DATU_ENABLE_SCHEMA_RAG=False
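The boolean flags on this page are written both as `true` and as `False`. How Datu itself parses them is not stated here; the sketch below shows a tolerant, case-insensitive parser as one possibility. `parse_bool` and the set of accepted spellings are assumptions for illustration.

```python
def parse_bool(raw, default=False):
    """Interpret an environment-variable string as a boolean.

    Hypothetical helper: accepts common spellings regardless of case
    and rejects anything else instead of guessing.
    """
    if raw is None:
        return default
    text = raw.strip().lower()
    if text in {"1", "true", "yes", "on"}:
        return True
    if text in {"0", "false", "no", "off"}:
        return False
    raise ValueError(f"not a boolean value: {raw!r}")


print(parse_bool("true"), parse_bool("False"))
```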

DBT profiles

The dbt profiles.yml file that describes your data sources.

DATU_DBT_PROFILES=

Enable MCP Connectivity

Set the following environment variable to enable MCP connectivity in Datu:

DATU_ENABLE_MCP=true

MCP Servers Configuration

Define your MCP servers in a JSON config file (e.g. mcp_config.json):

{
  "mcpServers": {
    "sql_generator": {
      "command": "python",
      "args": ["-m", "datu.mcp.tools.sql_generator"],
      "env": { "PYTHONPATH": "." }
    }
  }
}
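To sanity-check a config file of this shape before starting the application, something like the following Python sketch could be used. It is not part of Datu: `load_mcp_servers` is a hypothetical helper, and the rule that every server needs a `command` while `args` and `env` are optional is an assumption based on the example above.

```python
import json


def load_mcp_servers(text):
    """Parse MCP config JSON and return the server definitions by name.

    Hypothetical validator: requires a "command" for each server and
    fills in empty "args" and "env" when they are missing.
    """
    config = json.loads(text)
    servers = config.get("mcpServers", {})
    for name, spec in servers.items():
        if "command" not in spec:
            raise ValueError(f"MCP server {name!r} has no 'command'")
        spec.setdefault("args", [])
        spec.setdefault("env", {})
    return servers


example = """
{
  "mcpServers": {
    "sql_generator": {
      "command": "python",
      "args": ["-m", "datu.mcp.tools.sql_generator"],
      "env": { "PYTHONPATH": "." }
    }
  }
}
"""
for name, spec in load_mcp_servers(example).items():
    print(name, spec["command"], spec["args"])
```

To check a real file, read it first, for example with `pathlib.Path("mcp_config.json").read_text()`, and pass the result to the function.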