Configuration

Aspected is configured through JSON or YAML configuration files, environment variables, or both. On startup, the server loads configuration from the following sources in order; later sources override earlier ones:

1. config.yaml / config.yml / config.json (in the working directory)
2. {APP_RUN_MODE}.yaml / {APP_RUN_MODE}.yml / {APP_RUN_MODE}.json (APP_RUN_MODE defaults to development)
3. local.yaml / local.yml / local.json
4. Environment variables prefixed with ASPECTED_
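
As an illustration of the override order, the sketch below shows two of these files side by side. They are written as two YAML documents separated by --- purely for display; on disk they are separate files. The values are hypothetical, and the sketch assumes overrides apply per key rather than replacing a whole file, which the list above implies but does not state outright.

```yaml
# config.yaml (loaded first)
server:
  port: 8080
log:
  level: info
---
# local.yaml (loaded later, so its values take precedence)
server:
  port: 9090   # hypothetical override for a local setup
```

Under that assumption the server would listen on port 9090 while log.level stays at info. Setting ASPECTED_SERVER_PORT would in turn take precedence over both files, since environment variables are loaded last.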

You can also pass a specific config file with the --config / -c flag:

docker run xillio/aspected:latest ./aspected --config /path/to/config.json

Finally, HNSW parameters are configured per index when the index is created. See HNSW Parameters.

Configuration Reference

Below is the full configuration structure with default values:

config.yml

# Logging settings
log:
  level: info             # Minimum log level: trace, debug, info, warn, err, critical, off
  file:
    enabled: false        # Enable logging to a rotating file
    path: ./logs/app.log  # Path to the log file
    level: null           # Override log level for the file sink (defaults to log.level)
    sizeThresholdMb: 5    # Max size (MB) of a single log file before rotation
    maxFiles: 3           # Number of rotated log files to keep

# HTTP server settings
server:
  host: 0.0.0.0           # Address to bind the HTTP server to
  port: 8080              # Port to listen on
  corsEnabled: false      # Enable CORS headers on responses
  staticApiToken: ""      # Static API token for authentication (empty = no auth)
  https:
    enabled: false        # Enable HTTPS
    certificatePath: ""   # Path to the TLS certificate file (PEM)
    privateKeyPath: ""    # Path to the TLS private key file (PEM)

# Index storage settings
index:
  storagePath: ./data     # Directory where index data is persisted to disk
  migrate:
    enabled: true         # Enable automatic index migration
    onlyInternal: false   # Only migrate indexes created through the server

# Llama / text resolver settings (GGUF embedding models)
llama:
  modelsPath: ./models    # Directory containing GGUF model files
  useGpu: true            # Offload model layers to the GPU (if available)
  gpuLayers: -1           # Number of layers to offload to GPU (-1 = all layers)
  threads: -1             # CPU threads for inference (-1 = default, 4 threads)
  concurrentContexts: 4   # Number of concurrent inference contexts per model
  maxBatchSize: 2048      # Max tokens per batch (capped to model's context window)

Logging (`log`)

Key	Version	Type	Default	Description
`log.level`	`≥0.1.0`	string	`"info"`	Minimum log level: `trace`, `debug`, `info`, `warn`, `err`, `critical`, `off`.
`log.file.enabled`	`≥0.1.0`	boolean	`false`	Enable logging to a rotating file.
`log.file.path`	`≥0.1.0`	string	`"./logs/app.log"`	Path to the log file.
`log.file.level`	`≥0.1.0`	string	(same as `log.level`)	Override log level for the file sink.
`log.file.sizeThresholdMb`	`≥0.1.0`	integer	`5`	Maximum size (MB) of a single log file before rotation.
`log.file.maxFiles`	`≥0.1.0`	integer	`3`	Number of rotated log files to keep.
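
A minimal sketch of a file-logging setup built from the keys above. The specific values are illustrative assumptions, not recommendations: the console stays at info while the file sink captures debug output.

```yaml
log:
  level: info               # level for everything except the file sink
  file:
    enabled: true           # turn on the rotating file sink
    path: ./logs/app.log    # default path from the reference
    level: debug            # file sink is more verbose than log.level
    sizeThresholdMb: 20     # hypothetical: rotate after 20 MB
    maxFiles: 5             # hypothetical: keep 5 rotated files
```

With these values the log directory should stay on the order of 100 MB (5 files of up to 20 MB each). Whether the active file counts toward maxFiles is not specified here.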

Server (`server`)

Key	Version	Type	Default	Description
`server.host`	`≥0.1.0`	string	`"0.0.0.0"`	Address to bind the HTTP server to.
`server.port`	`≥0.1.0`	integer	`8080`	Port to listen on.
`server.corsEnabled`	`≥0.2.0`	boolean	`false`	Enable CORS headers on responses.
`server.staticApiToken`	`≥0.1.0`	string	—	Static API token for request authentication. An empty value disables authentication.
`server.https.enabled`	`≥0.1.0`	boolean	`false`	Enable HTTPS / TLS on the server.
`server.https.certificatePath`	`≥0.1.0`	string	—	Path to the TLS certificate file (PEM).
`server.https.privateKeyPath`	`≥0.1.0`	string	—	Path to the TLS private key file (PEM).
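
A sketch of a server section with both token authentication and HTTPS switched on. The port, token, and file paths are placeholders chosen for illustration; only the key names come from the table above.

```yaml
server:
  host: 0.0.0.0
  port: 8443                                      # hypothetical port choice
  staticApiToken: "replace-with-a-secret-token"   # placeholder; non-empty enables auth
  https:
    enabled: true
    certificatePath: /etc/aspected/tls/cert.pem   # hypothetical path to a PEM certificate
    privateKeyPath: /etc/aspected/tls/key.pem     # hypothetical path to a PEM private key
```

Because the token is a secret, it may be preferable to leave it out of the file and supply it through ASPECTED_SERVER_STATIC_API_TOKEN instead.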

Index Storage (`index`)

Key	Version	Type	Default	Description
`index.storagePath`	`≥0.1.0`	string	`"./data"`	Directory where index data is persisted to disk.
`index.migrate.enabled`	`≥0.2.0`	boolean	`true`	Enable automatic index migration.
`index.migrate.onlyInternal`	`≥0.2.0`	boolean	`false`	Only migrate indexes created through the server.

Llama / Text Resolver (`llama`)

These settings control how the text resolver loads and runs GGUF embedding models.

Key	Version	Type	Default	Description
`llama.modelsPath`	`≥0.1.0`	string	`"./models"`	Directory containing GGUF model files.
`llama.useGpu`	`≥0.1.0`	boolean	`true`	Offload model layers to the GPU (if available).
`llama.gpuLayers`	`≥0.1.0`	integer	`-1`	Number of layers to offload to the GPU. `-1` means all layers.
`llama.threads`	`≥0.1.0`	integer	`-1`	Number of CPU threads for inference. `-1` uses the default (4 threads).
`llama.concurrentContexts`	`≥0.1.0`	integer	`4`	Number of concurrent inference contexts per loaded model.
`llama.maxBatchSize`	`≥0.1.0`	integer	`2048`	Maximum number of tokens processed in a single batch. Capped to the model's context window size.
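
For a machine without a GPU, a configuration along the following lines might be a starting point. The thread, context, and batch values are assumptions for illustration and would need tuning against the actual hardware and model.

```yaml
llama:
  modelsPath: ./models
  useGpu: false           # keep all layers on the CPU
  threads: 8              # hypothetical: explicit count instead of the default of 4
  concurrentContexts: 2   # hypothetical: fewer parallel contexts to reduce memory use
  maxBatchSize: 512       # hypothetical: still capped to the model's context window
```

gpuLayers is left out on the assumption that it has no effect when useGpu is false; the table above does not say so explicitly.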

Environment Variables

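Every configuration key can be set via an environment variable. The variable name is the prefix ASPECTED_ followed by the key path in upper snake case: nested keys are joined with underscores (_), and camelCase names are split into words (staticApiToken becomes STATIC_API_TOKEN). For example: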

Config key	Environment variable
`server.port`	`ASPECTED_SERVER_PORT`
`server.staticApiToken`	`ASPECTED_SERVER_STATIC_API_TOKEN`
`server.https.enabled`	`ASPECTED_SERVER_HTTPS_ENABLED`
`log.level`	`ASPECTED_LOG_LEVEL`
`log.file.enabled`	`ASPECTED_LOG_FILE_ENABLED`
`llama.modelsPath`	`ASPECTED_LLAMA_MODELS_PATH`
`llama.concurrentContexts`	`ASPECTED_LLAMA_CONCURRENT_CONTEXTS`
`index.storagePath`	`ASPECTED_INDEX_STORAGE_PATH`

Docker Example

docker run -p 9090:9090 \
  -e ASPECTED_SERVER_PORT=9090 \
  -e ASPECTED_LOG_LEVEL=debug \
  -e ASPECTED_LLAMA_MODELS_PATH=/models \
  -v "$(pwd)/models:/models" \
  -v "$(pwd)/data:/app/data" \
  xillio/aspected:latest
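
The same setup could be written as a Docker Compose file, sketched below. The service name aspected is a hypothetical choice; the image, port, variables, and mounts mirror the docker run command above.

```yaml
# compose.yaml (hypothetical file; equivalent to the docker run command above)
services:
  aspected:
    image: xillio/aspected:latest
    ports:
      - "9090:9090"
    environment:
      ASPECTED_SERVER_PORT: "9090"
      ASPECTED_LOG_LEVEL: debug
      ASPECTED_LLAMA_MODELS_PATH: /models
    volumes:
      - ./models:/models     # GGUF model files
      - ./data:/app/data     # persisted index data
```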

Version Information

To print the server version and exit:

docker run xillio/aspected:latest ./aspected --version
