Getting Started

This guide will have you running Aspected in under five minutes using Docker.

1. Prerequisites

  • Docker installed on your machine.

2. Run the Docker Image

Start the container, exposing the default port 8080:

docker run -p 8080:8080 aspected:latest

You should see the Aspected ASCII banner followed by startup logs:

    _                        _           _
   / \   ___ _ __   ___  ___| |_ ___  __| |
  / _ \ / __| '_ \ / _ \/ __| __/ _ \/ _` |
 / ___ \\__ \ |_) |  __/ (__| ||  __/ (_| |
/_/   \_\___/ .__/ \___|\___|\__\___|\__,_|
            |_|

Version: ...

3. Verify the Server is Running

In another terminal, list the available indexes (there will be none yet):

curl http://localhost:8080/indexes

Expected response:

{
  "total": 0,
  "page": 1,
  "limit": 100,
  "data": [],
  "status": "ok"
}

The server is up and ready to accept requests.
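
For scripted setups, the manual check above can be automated. The following is a minimal sketch, not part of Aspected: the function name aspected_status_ok is hypothetical, and it matches the "status": "ok" field with grep instead of parsing the JSON, so treat it as a rough check only.

```shell
# Hypothetical helper: succeeds if the JSON on stdin contains "status": "ok".
# This is a text match, not a real JSON parse.
aspected_status_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Usage (assumes the container from the previous step is listening on port 8080):
#   curl -s http://localhost:8080/indexes | aspected_status_ok && echo "server ready"
```

The function returns a non-zero exit code when the field is absent or has another value, so it can gate later steps in a script.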

Persistent Storage

By default, Aspected stores index data in a ./data directory inside the container (mounted as /app/data in the command below), so indexes are lost when the container is removed. To persist indexes across container restarts, mount a host directory:

docker run -p 8080:8080 -v "$(pwd)/data:/app/data" aspected:latest
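
If the host directory does not exist yet, it can be created before the container starts, which keeps its ownership with your user. A minimal sketch, assuming the data lives in ./data under the current directory; the helper name ensure_data_dir is hypothetical:

```shell
# Hypothetical helper: create the host directory for the bind mount if needed,
# then fail if the current user cannot write to it.
ensure_data_dir() {
  dir="$1"
  mkdir -p "$dir" && [ -w "$dir" ]
}

ensure_data_dir "$(pwd)/data" && echo "data directory ready"
```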

Using the Text Resolver (Embedding Models)

If you plan to use the text resolver for semantic text embeddings, you need to make GGUF model files available to the server. See the Text Resolver page for details.

A quick way to get started is to download the models with the script provided in the repository and mount the resulting directory:

# Download models (run from the repository root)
cd utility && bash download-huggingface-models.sh && cd ..

# Run with models mounted
docker run -p 8080:8080 \
  -v "$(pwd)/models:/app/models" \
  -v "$(pwd)/data:/app/data" \
  aspected:latest
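
Before mounting, it can help to confirm that the download actually produced model files. The sketch below assumes the models land in ./models and carry a .gguf extension; neither detail is confirmed here, and the helper name count_gguf_models is hypothetical.

```shell
# Hypothetical helper: print how many *.gguf files exist under a directory.
# Prints 0 if the directory is missing or contains no GGUF files.
count_gguf_models() {
  find "$1" -type f -name '*.gguf' 2>/dev/null | wc -l | tr -d ' '
}

echo "GGUF files found: $(count_gguf_models "$(pwd)/models")"
```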

GPU-Accelerated Embeddings (Vulkan)

If you are running on Linux, you can enable Vulkan-based GPU acceleration for significantly faster text embeddings. Pass the GPU devices into the container with --device:

docker run -p 8080:8080 \
  -v "$(pwd)/models:/app/models" \
  -v "$(pwd)/data:/app/data" \
  --device /dev/kfd \
  --device /dev/dri \
  aspected:latest

Note

The Docker image ships with Vulkan drivers for Intel, AMD, and NVIDIA GPUs and should be compatible with most devices. However, you may encounter issues when running inside a virtual machine, as GPU passthrough and driver support can vary.

Verifying GPU Detection

To confirm that the GPU has been detected, start the container with trace logging enabled:

docker run -p 8080:8080 \
  -v "$(pwd)/models:/app/models" \
  -v "$(pwd)/data:/app/data" \
  --device /dev/kfd \
  --device /dev/dri \
  -e ASPECTED_LOG_LEVEL=trace \
  aspected:latest

If llama successfully detects the GPU, you should see output similar to:

T 2026-03-19T14:27:35.475Z llama: ggml_vulkan: Found 1 Vulkan devices:
T 2026-03-19T14:27:35.475Z llama: ggml_vulkan: 0 = AMD Radeon Graphics (RADV RENOIR) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | ...
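
Trace output is verbose, so it can be easier to filter for the detection line. A minimal sketch, assuming the log format shown above; the helper name vulkan_device_count is hypothetical:

```shell
# Hypothetical helper: read log text on stdin and print the device count from
# the first "ggml_vulkan: Found N Vulkan devices" line. Prints nothing if no
# such line is present.
vulkan_device_count() {
  sed -n 's/.*ggml_vulkan: Found \([0-9][0-9]*\) Vulkan devices.*/\1/p' | head -n 1
}

# Demonstration using the sample log line from this page:
printf '%s\n' 'T 2026-03-19T14:27:35.475Z llama: ggml_vulkan: Found 1 Vulkan devices:' \
  | vulkan_device_count   # prints 1
```

Against a running container, the same filter could be fed from docker logs, for example `docker logs <container> 2>&1 | vulkan_device_count`, where `<container>` is whatever name or ID Docker assigned.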

Note

Vulkan acceleration requires that the host has a compatible GPU and the appropriate kernel drivers installed. The /dev/kfd and /dev/dri device nodes must be present on the host system.
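
The device nodes can be checked on the host before starting the container. The following is a minimal sketch; the helper name check_device_nodes is hypothetical, and it only tests that the paths exist, not that the drivers behind them work.

```shell
# Hypothetical helper: report which of the given device nodes exist.
# Returns non-zero if at least one is missing.
check_device_nodes() {
  missing=0
  for node in "$@"; do
    if [ -e "$node" ]; then
      echo "found:   $node"
    else
      echo "missing: $node"
      missing=1
    fi
  done
  return "$missing"
}

check_device_nodes /dev/kfd /dev/dri \
  || echo "One or more GPU device nodes are missing on this host."
```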

Custom Configuration

You can supply a configuration file to override defaults (logging level, server port, model paths, etc.):

docker run -p 8080:8080 \
  -v "$(pwd)/config.json:/app/config.json" \
  -v "$(pwd)/data:/app/data" \
  aspected:latest

See Configuration for all available options.

You can also configure the server using environment variables. Variable names start with the ASPECTED_ prefix and use underscores (_) to separate nested keys. For example:

docker run -p 9090:9090 \
  -e ASPECTED_SERVER_PORT=9090 \
  -e ASPECTED_LOG_LEVEL=debug \
  aspected:latest
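
The naming rule can be sketched as a small shell function. This is an illustration only: the dotted key paths (server.port, log.level) are an assumed way of writing nested keys, inferred from the two variable names above, and the helper name config_key_to_env is hypothetical. Check the Configuration page for the actual key names.

```shell
# Hypothetical helper: turn a dotted key path into the matching variable name
# by replacing dots with underscores, upper-casing, and adding the prefix.
config_key_to_env() {
  printf 'ASPECTED_%s\n' "$(printf '%s' "$1" | tr '.' '_' | tr '[:lower:]' '[:upper:]')"
}

config_key_to_env server.port   # prints ASPECTED_SERVER_PORT
config_key_to_env log.level     # prints ASPECTED_LOG_LEVEL
```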

Next Steps

  • Follow the hands-on Tutorial to create, train, and search an index.
  • Learn about Resolvers to understand how data is converted into vectors.