2.3.0
2026-04-16
xAI prompt cache + Batch API — 5–20× cost reductions
-
New
Phase 1 — Byte-stable system prefix: the dynamic context (GROK_NOTES.md, todos, memory, edit history) is now a separate `system` message *after* the stable instructions, so the prefix is identical across turns and users. Typical sessions go from ~0% prompt-cache hits to 70–90% from turn 2 onward.
-
New
Phase 2 — Sticky conversation routing: new `X-Xai-Conv-Id` header on chat completions + `conv_id` body field on the Responses API pin cache routing to the same shard for a conversation's lifetime. Conv IDs are minted on session create, persisted with sessions, and inherited by sub-agents.
-
New
Phase 3 — Cache-hit observability: every response emits `[cache] prompt=X cached=Y ratio=Z%` in logs; `cached_tokens`, `reasoning_tokens`, `billable_prompt_tokens`, and `cache_hit_rate` are surfaced on SSE `token_usage` frames, headless `--format json` summaries, sub-agent `usage` events, and the `/usage` dashboard.
-
New
Phase 4a — `previous_response_id` chaining for the Responses API: subsequent turns send only new items + the previous response ID; xAI rebuilds history server-side (30-day retention). Transparent full-resend fallback on stale-ID errors. Persisted in `SessionData::lastResponseId` / `lastSubmittedCount`.
-
New
Phase 4b — Reasoning content preservation: chat-completions reasoning models stream `reasoning_content` through a dedicated callback; it's stored per-message and echoed byte-identically on replay so reasoning stays inside the cached prefix.
-
New
Phase 5 — Batch API: new `BatchClient`, SQLite migration v7 (`batches` table), background `BatchPoller` reconciles state every 30s. Full REST surface under `/api/batches/*` (JWT-guarded, owner-scoped) plus five agent tools: `batch_submit`, `get_batch`, `list_batches`, `get_batch_results`, `cancel_batch`.
-
Update
Net effect on bills: prompt caching alone cuts interactive costs by 5–10×; the Batch API adds a flat 50% discount that stacks with caching, for a ~20× effective discount on offline/bulk workloads. `grok-4.20-*` cached-input pricing is ~10× cheaper than fresh input.
-
Update
Zero breaking changes. Existing sessions inherit cached prompts on the first reply after upgrade. `use_history:false` resets chain state cleanly; hitting `MAX_TOOL_TURNS` drops the chain so the next run is a fresh request.
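The savings quoted for this release can be checked with simple arithmetic on input tokens. The following is a minimal sketch, not avacli code: it assumes the Phase 3 `[cache]` log line carries integer token counts, takes the entry's approximate figure of cached input costing one tenth of fresh input, and ignores output tokens. `parse_cache_line` and `input_cost_multiplier` are hypothetical helper names.

```python
import re

# Assumed shape of the Phase 3 log line; the real formatting may differ.
CACHE_LINE = re.compile(r"\[cache\] prompt=(\d+) cached=(\d+) ratio=([\d.]+)%")


def parse_cache_line(line):
    """Return (prompt_tokens, cached_tokens) from a `[cache]` log line, or None."""
    match = CACHE_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def input_cost_multiplier(prompt_tokens, cached_tokens,
                          cached_price_ratio=0.1, batch_discount=0.0):
    """Input cost relative to an uncached, non-batch request (1.0 means no saving)."""
    fresh_tokens = prompt_tokens - cached_tokens
    relative = (fresh_tokens + cached_tokens * cached_price_ratio) / prompt_tokens
    return relative * (1.0 - batch_discount)


# Hypothetical log line with a 90% hit rate.
prompt, cached = parse_cache_line("[cache] prompt=10000 cached=9000 ratio=90.0%")
interactive = input_cost_multiplier(prompt, cached)
batched = input_cost_multiplier(prompt, cached, batch_discount=0.5)
print(f"interactive: {1 / interactive:.1f}x cheaper")
print(f"batched:     {1 / batched:.1f}x cheaper")
```

Under these assumptions a 90% hit rate works out to about 5.3× on input tokens, or about 10.5× with the batch discount; the 10× and 20× figures correspond to hit rates approaching 100%.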
2.2.0
2026-04-16
Platform capabilities — long-running services, per-app SQLite, scoped sub-agents
-
New
Phase 1 — Long-running services: new `type:"process"` service kind for Python / Node / native daemons (Discord bots, workers, etc). Cross-platform `ProcessSupervisor` (POSIX `fork+setpgid`, Windows `CreateProcessW+Job Object`); per-service workdir under `~/.avacli/services/<id>/`; vault-backed env at spawn time.
-
New
Phase 2 — Auto dep install + restart policies + SSE logs: first start auto-bootstraps per-service `venv` / `npm ci`, caches by `sha256` of `requirements.txt` / `package.json`. Restart policies (`always` / `on_failure` / `never`) with exponential backoff and per-hour ceiling. New SSE endpoint `/api/services/:id/logs/stream` and `tail_service_logs` tool.
-
New
Phase 3 — Internal SQLite for agent-generated apps: each app gets its own `~/.avacli/app_data/<slug>.db` with WAL + FK and a `sqlite3_set_authorizer` sandbox (deny ATTACH/DETACH). Auto-injected `_sdk.js` exposes `window.avacli.db` / `window.avacli.main` / `window.avacli.ai` to every app. 32-byte agent tokens; whitelisted main-DB views (articles_public, apps_directory, my_app).
-
New
Phase 4 — Sub-agents with scoped writes: `spawn_subagent` delegates bounded work to child xAI agents on their own threads. `ScopedToolExecutor` enforces `allowed_paths` (glob) and `allowed_tools`. `LeaseManager` implements write leases via `INSERT OR FAIL INTO agent_leases` so siblings can't collide. Default-deny: `subagents.max_depth=0` disables spawn until raised in Settings.
-
New
New tools: `app_token`, `app_db_execute`, `app_db_set_cap`, `tail_service_logs`, `spawn_subagent`, `wait_subagent`, `cancel_subagent`, `list_subagents`. New APIs: `/api/services/:id/status`, `/api/services/:id/logs/stream`, `/api/apps/:slug/db/{query,execute,schema,export}`, `/api/apps/:slug/main/query`, `/api/tasks/*`.
-
New
Settings → Platform capabilities: new section with `services_workdir_root` (text + reset), `apps_db_size_cap_mb` (16–16384, default 256), `subagents_max_depth` slider (0 = Disabled, max 16). All values read fresh per call — no restart needed.
-
Update
Schema migrations v4/v5/v6: `services.pid/started_at/restart_count/last_exit_code`, `apps.agent_token/db_enabled/ai_enabled/db_size_cap_mb`, `app_usage`, `agent_tasks`, `agent_leases`. Existing installs upgrade transparently on next start.
-
Update
Debian package now declares `Recommends: python3 (>= 3.8), python3-venv, nodejs (>= 18), npm`. Core `avacli` remains dep-light (libcurl + libssl + sqlite3); runtimes are only needed when hosting process services.
-
Update
xAI only, by design: no LLMClient indirection. Sub-agents share `XAIClient` with the root chat — only the model string varies. Per-user filesystem under `~/.avacli/`.
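Two of the SQLite mechanisms named in this release can be illustrated in a few lines: write leases taken with `INSERT OR FAIL`, and an authorizer that denies ATTACH/DETACH. This is a sketch using Python's `sqlite3` module rather than the C API that avacli uses. The `agent_leases` columns are assumed (the changelog does not list them), and `try_acquire_lease` and `deny_attach` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per leased path, so the primary key enforces exclusivity.
conn.execute("CREATE TABLE agent_leases (path TEXT PRIMARY KEY, agent_id TEXT NOT NULL)")


def try_acquire_lease(db, path, agent_id):
    """Return True if agent_id now holds the write lease on path."""
    try:
        db.execute(
            "INSERT OR FAIL INTO agent_leases (path, agent_id) VALUES (?, ?)",
            (path, agent_id),
        )
        db.commit()
        return True
    except sqlite3.IntegrityError:
        # The primary-key conflict means a sibling already holds the lease.
        db.rollback()
        return False


def deny_attach(action, arg1, arg2, db_name, source):
    """Authorizer callback: refuse ATTACH/DETACH and allow everything else."""
    if action in (sqlite3.SQLITE_ATTACH, sqlite3.SQLITE_DETACH):
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK


first = try_acquire_lease(conn, "src/main.cpp", "agent-a")
second = try_acquire_lease(conn, "src/main.cpp", "agent-b")

conn.set_authorizer(deny_attach)
try:
    conn.execute("ATTACH DATABASE ':memory:' AS other")
    attach_blocked = False
except sqlite3.DatabaseError:
    attach_blocked = True

print(first, second, attach_blocked)
```

The second acquire fails because the row for that path already exists, which is what keeps sibling agents from writing to the same file. The ATTACH statement is rejected before it runs, so a sandboxed app connection cannot reach other database files.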
2.1.0
2026-04-09
Detached UI — disk-based frontend with theming
-
New
The embedded web frontend has been extracted into a standalone `ui/` directory served from disk. Any edit to HTML / CSS / JS is reflected on the next browser refresh — zero recompile, zero restart.
-
New
New `avacli serve` flags: `--ui-dir <path>` (serve from any directory), `--ui-embedded` (force compiled-in UI), `--ui-theme <name>` (load a theme overlay), `--ui-init` (extract the built-in UI to `~/.avacli/ui/` as a starting point).
-
New
Theme system: `ui/css/variables.css` defines all design tokens. Drop a CSS file in `ui/themes/` to override them. Ships with `default`, `light`, and `cyberpunk` themes. Active theme persists in `settings.json`.
-
Update
New `UIFileServer` class handles MIME detection, path-traversal protection, and SPA fallback. Embedded fallback preserved — single-binary deployment still works if no disk `ui/` is present.
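The three responsibilities listed for `UIFileServer` (MIME detection, path-traversal protection, SPA fallback) can be sketched as follows. This is an illustration in Python, not the C++ class itself; `resolve_ui_path` and `content_type` are hypothetical names, and using `index.html` as the fallback document is an assumption.

```python
import mimetypes
import tempfile
from pathlib import Path


def resolve_ui_path(ui_root, request_path):
    """Map a URL path to a file under ui_root, or None if it escapes the root."""
    root = Path(ui_root).resolve()
    candidate = (root / request_path.lstrip("/")).resolve()
    if candidate != root and root not in candidate.parents:
        return None  # e.g. "/../secret.txt" resolves outside the root
    if candidate.is_file():
        return candidate
    # SPA fallback: unknown routes are answered with the app shell.
    return root / "index.html"


def content_type(path):
    """Guess a MIME type from the file extension."""
    guessed, _ = mimetypes.guess_type(path.name)
    return guessed or "application/octet-stream"


with tempfile.TemporaryDirectory() as tmp:
    ui = Path(tmp) / "ui"
    (ui / "css").mkdir(parents=True)
    (ui / "index.html").write_text("<!doctype html>")
    (ui / "css" / "app.css").write_text("body{}")
    (Path(tmp) / "secret.txt").write_text("outside the UI root")

    asset = resolve_ui_path(ui, "/css/app.css")      # a real file
    route = resolve_ui_path(ui, "/settings/theme")   # a client-side route
    escape = resolve_ui_path(ui, "/../secret.txt")   # a traversal attempt
    print(asset.name, content_type(asset), route.name, escape)
```

Resolving the path before comparing it with the root is the important step: a check on the raw string would miss `..` segments and symlinks.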
2.0.0
2026-04-07
avacli open source — packages.avalynn.ai
-
New
Linux packages and tarballs are now the open-source **avacli** build (MIT): APT package `avacli`, binary `avacli`, config under `~/.avacli/`.
-
Update
Single binary with embedded WebIDE, tool forge, Vultr fleet UI, vault, and Grok tools; connect xAI from Settings in the web UI or `avacli --set-api-key`.
-
Update
Legacy `avalynnai` platform-relay client packages remain in the repo for older installs; new installs should use `avacli`.
1.3.2
2026-03-27
Reverse proxy tunnel & full rebrand
-
New
Reverse proxy tunnel — access any node's full web UI from the platform without opening ports. A new "Direct" tab in node-chat loads the node's interface in an iframe, proxied securely through the platform.
-
New
TunnelClient (C++) — 4 concurrent long-polling threads handle proxied HTTP requests, forwarding them to the local HTTP server and returning responses (including binary content via base64).
-
New
Bridge tunnel infrastructure — in-memory request queues with promise-based resolution for /tunnel/proxy, /tunnel/poll, /tunnel/response, and /tunnel/status endpoints.
-
New
Platform PHP proxy — node-proxy.php with <base> tag injection for HTML, plus tunnel/poll.php, tunnel/respond.php, and tunnel/status.php passthrough endpoints.
-
Update
Full rebrand: renamed project from textgrok-agent to avalynnai. All internal variables, functions, log categories, provider strings, and JSON keys updated across bridge, app.js, and API layer.
-
New
Tunnel status indicator — green dot in the Direct tab shows when the node's tunnel connection is active.
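The tunnel entries above describe a request queue in which each proxied request waits on a promise until the node posts a response. A single-process sketch of that pattern, using `concurrent.futures.Future` as the promise, might look like this. The class name, method names, and dictionary keys are hypothetical and do not reflect the bridge's actual API.

```python
import base64
import itertools
import queue
from concurrent.futures import Future


class TunnelQueue:
    """In-memory request queue; each proxied request is resolved by a later respond()."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = queue.Queue()
        self._waiting = {}

    def proxy(self, method, path):
        """Platform side: enqueue a request and return a Future for its response."""
        request_id = next(self._ids)
        future = Future()
        self._waiting[request_id] = future
        self._pending.put({"id": request_id, "method": method, "path": path})
        return future

    def poll(self, timeout=25.0):
        """Node side: long-poll for the next request; None when the wait times out."""
        try:
            return self._pending.get(timeout=timeout)
        except queue.Empty:
            return None

    def respond(self, request_id, status, body):
        """Node side: deliver a response; binary bodies travel as base64 text."""
        future = self._waiting.pop(request_id)
        encoded = base64.b64encode(body).decode("ascii")
        future.set_result({"status": status, "body_b64": encoded})


tunnel = TunnelQueue()
pending = tunnel.proxy("GET", "/logo.png")
job = tunnel.poll(timeout=0.1)
tunnel.respond(job["id"], 200, b"\x89PNG\r\n")
reply = pending.result(timeout=1)
print(reply["status"], base64.b64decode(reply["body_b64"]))
```

In the release itself the polling side is the C++ `TunnelClient` with four long-polling threads; here a single caller plays both roles to keep the example short.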
1.3.1
2026-03-26
Real-time streaming relay
-
New
Real-time SSE streaming relay — content, thinking, tool calls, and media events stream live from nodes to the platform web UI via new stream-event and stream endpoints.
-
New
Rich response rendering in node-chat — collapsible reasoning blocks, tool execution display with success/fail indicators, inline media grid, and token usage footer.
-
New
Node capability badges displayed in the node-chat sidebar (chat, file_ops, nodejs, python, web_search, etc.).
-
Fix
messages.php now accepts role from POST body (user/assistant) instead of hardcoding assistant — fixes message sync from client nodes.
-
Update
Session list distinguishes relay-synced sessions from web-initiated ones with visual indicators.
-
Update
Upload button in node-chat now wired to context_media in the relay payload.
1.3.0
2026-03-26
Node-chat sync, 28 tools, media generation
-
New
Node-chat session and message sync — client nodes sync relay conversations back to the platform via sessions.php and messages.php endpoints.
-
New
28 agent tools including web_search, x_search, generate_image, edit_image, generate_video, and persistent memory (add_memory, search_memory, forget_memory).
-
New
Multi-node relay with relay_to_node tool — send tasks to other AvalynnAI instances linked to your platform account.
-
New
Reasoning model support — grok-4.20-*-reasoning models stream internal chain-of-thought as separate thinking_delta events.
-
Update
Headless / CI mode: --non-interactive with --format stream-json outputs NDJSON events for piping into other tools.
-
Update
Session memory, edit history, and pinned notes persist across runs under ~/.avalynnai/sessions/.
1.2.3
2026-03-25
Image/video generation settings, model sync, file uploads
-
New
Image generation settings: aspect ratio (1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3), resolution, and image count options for grok-imagine-image models.
-
New
Video generation settings: duration (5/10/15s), aspect ratio, and resolution for grok-imagine-video models.
-
New
Model selector now shows Chat, Image Generation, and Video Generation categories with proper model names.
-
New
Upload button: attach images or videos to chat messages on both client and platform node-chat.
-
Fix
Model changes in the toolbar now auto-save immediately (no longer requires clicking Save Settings).
-
Fix
Model sync from platform: heartbeat response now includes node-chat settings; model changes on the platform propagate to the client automatically.
-
Update
Disconnect button now shows a confirmation dialog before disconnecting and clearing the API key.
1.2.2
2026-03-25
WebIDE, logging, xAI-only models, settings sync
-
New
Logs page: terminal-style system log with color-coded levels/categories, filters, auto-scroll, and 3-second polling. Captures chat I/O, API calls, node events, relay, settings changes, and errors.
-
New
WebIDE: Files page reimagined as a full code editor with lazy directory loading, file tabs, line numbers, Tab key support, Ctrl+S save, and revert.
-
Fix
Files: replaced recursive 5-level tree scan with lazy single-level listing — directory browsing is now instant.
-
Fix
Fixed HTTP 401 "getBalance failed" error — balance endpoint now supports agent key auth (X-Agent-Key header) when no access token is available.
-
Update
Settings: removed duplicate Disconnect button from bottom of Settings page.
-
New
Platform node-chat: model selector now shows only xAI models (chat, image gen, video gen) grouped by type. Added latest grok-4.20 0309 models.
-
New
Settings sync: model selection, context toggle, notes, and Ctrl+Enter preference sync between platform node-chat and client via new settings API.
1.2.1
2026-03-24
API key auth only, login removed
-
Fix
Fix "verify response parse error" when connecting via web UI — Apache .htaccess now resolves .php extensions for extensionless API paths.
-
Update
Removed login/logout subcommands — authentication is now exclusively via platform API key (--set-platform-key or the web UI Settings page).
-
Update
Removed email/password login and register routes from the embedded HTTP server.
-
Update
Docs updated: web UI Settings page noted as the primary way to configure keys, billing, model, and workspace.
1.2.0
2026-03-23
Platform key sharing and cluster sync
-
New
Platform Settings page: store your xAI API key with AES-256-GCM encryption. Toggle sharing so all connected cluster clients auto-sync it.
-
New
Client: --share-key flag pushes a locally-set xAI key to the platform for cluster-wide sharing.
-
New
Client: background loop fetches platform user settings every 60s; when a shared xAI key is set, chat routes directly to api.x.ai (zero platform billing).
-
Update
Agent API (chat, images, videos) resolves the user's xAI key first and skips token billing when one is present.
-
Update
Platform: access gate removed — all logged-in users can view plans, servers, and tokens immediately.
-
New
Platform: checkout modal preview on Plans page (Stripe integration coming soon).
1.1.4
2026-03-23
Platform Clients page CSP fix; chat billing toggle
-
Fix
Website: /platform/clients.php loads its script from /platform/assets/clients.js so Generate key works under strict Content-Security-Policy (script-src 'self').
-
New
Client: Settings → Chat billing — use Avalynn platform tokens or your own xAI API key (direct to api.x.ai); store key in ~/.avalynnai/settings.json; clear button.
-
New
Client: Toolbar and sidebar show Platform vs xAI for the active chat route; Usage page splits session totals and labels history rows by billing source.
1.1.3
2026-03-23
Toolbar popovers, SSE flush, permission-safe files
-
Fix
History/Notes buttons: inline handlers no longer call the browser-native HTMLElement.togglePopover (renamed to avaOpenToolbarPopover, which uses classList).
-
Fix
Chat History: final SSE chunk is decoded before the stream closes so the done event (and session id) is not dropped.
-
Fix
Files tree skips entries that throw permission denied or other stat errors instead of failing the whole workspace listing.
-
Fix
Session list API tolerates permission issues and uses non-throwing size/mtime where supported.
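The Chat History fix in this release concerns a common streaming pitfall: if the last chunk is not flushed and parsed when the stream ends, a final `done` event without a trailing blank line is lost. The sketch below shows the pattern in Python, not the UI's JavaScript. It assumes LF-only line endings, and the `session_id` payload is a made-up example.

```python
import codecs


def parse_frame(frame):
    """Split one SSE frame into (event, data)."""
    event, data = "message", []
    for line in frame.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
    return event, "\n".join(data)


def parse_sse(chunks):
    """Yield (event, data) pairs from an iterable of byte chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        while "\n\n" in buffer:
            frame, buffer = buffer.split("\n\n", 1)
            yield parse_frame(frame)
    # Flush the decoder, then handle a last frame that arrived
    # without its terminating blank line.
    buffer += decoder.decode(b"", final=True)
    if buffer.strip():
        yield parse_frame(buffer)


chunks = [
    b"event: delta\ndata: hel",
    b'lo\n\nevent: done\ndata: {"session_id": "abc"}',
]
events = list(parse_sse(chunks))
print(events)
```

Without the final flush step the parser would yield only the `delta` event, and the session id carried by `done` would never reach the History list.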
1.1.2
2026-03-23
Embedded UI: history, notes, files, model picker
-
Fix
Model dropdown no longer throws ReferenceError (inline handlers use setModelFromSelect instead of a non-global S).
-
Fix
Chat sessions persist: empty client session now gets a server-assigned id, saved under ~/.avalynnai/sessions, returned on stream done so History lists real threads.
-
Fix
Files tree skips broken symlinks (no more whole-tree failure on unreadable file_size).
-
Fix
Notes popover stays open after add/toggle/delete; corrupt ava_notes localStorage no longer breaks the app.
-
Update
History popover shows clearer empty and error states.
1.1.1
2026-03-23
Settings: correct API key help link
-
Fix
Embedded settings UI now links to https://avalynn.ai/platform/clients.php (generate platform API keys) instead of a non-existent /settings URL.
1.1.0
2026-03-23
Platform-linked nodes and relay
-
New
Stable device identity: clients should persist and resend the same node_id on reconnect so the platform updates one row instead of creating duplicates.
-
New
Agent APIs: verify-key, register, heartbeat, deregister, and relay send/poll/respond work with platform user API keys (avl_u_*) as well as server keys.
-
Update
Dashboard Clients page for generating keys and viewing linked devices; Download page shows a living changelog.
-
Fix
Platform: correct heartbeat parameter binding; key revoke removes dependent node rows reliably; relay resolves users from user_api_keys when no node row exists yet.
-
New
Native client: /api/auth/connect calls verify-key then register with a stable node_id in ~/.avalynnai/auth.json; heartbeat + relay poll loop; inbound relay runs the agent and POSTs the result to respond.
-
New
Native client: relay_to_node tool (queue on platform, poll status until completed). Platform user keys use X-Agent-Key on /platform/api/agent/chat; chat picks up new credentials after linking without restart.
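The stable device identity in this release depends on the client minting a `node_id` once and reusing it on every reconnect. A minimal sketch of that load-or-create step follows. The `node_id` JSON key, the UUID format, and the helper name `load_or_create_node_id` are assumptions, since the changelog only states that the id lives in `auth.json`.

```python
import json
import tempfile
import uuid
from pathlib import Path


def load_or_create_node_id(auth_path):
    """Return the node_id stored in auth_path, minting and saving one on first use."""
    path = Path(auth_path)
    data = {}
    if path.exists():
        try:
            data = json.loads(path.read_text())
        except json.JSONDecodeError:
            data = {}  # treat a corrupt file as empty rather than failing
    if not isinstance(data, dict):
        data = {}
    if not data.get("node_id"):
        data["node_id"] = uuid.uuid4().hex
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(data, indent=2))
    return data["node_id"]


with tempfile.TemporaryDirectory() as tmp:
    auth_file = Path(tmp) / "auth.json"
    first_id = load_or_create_node_id(auth_file)
    second_id = load_or_create_node_id(auth_file)  # simulates a reconnect
    print(first_id == second_id)
```

Because the second call reads the saved value instead of generating a new one, the platform sees the same id on reconnect and can update one row rather than insert a duplicate.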
1.0.0
2026-03-01
Initial Linux distribution
-
New
APT repository (packages.avalynn.ai) and Linux x86_64 / ARM64 tarballs.
-
New
avalynnai login, serve, chat, and --local-mode with XAI_API_KEY.