All Versions
22
Latest Version
Avg Release Cycle
16 days
Latest Release
5 days ago

Changelog History
Page 1

  • v0.25.0 Changes

    June 13, 2026

    What's Changed

    • 📜 chore(deps): bump github.com/apple/swift-argument-parser from 1.8.1 to 1.8.2 by @dependabot[bot] in #109
    • ➕ Add structured native app-control actions to BasicAction by @NakaokaRei in #111

    Full Changelog : 0.24.0...0.25.0

  • v0.24.0 Changes

    June 07, 2026

    What's Changed

    Full Changelog : 0.23.0...0.24.0

  • v0.23.0 Changes

    June 06, 2026

    Highlights

    💥 Breaking: deprecated sync APIs removed. The synchronous variants previously marked @available(*, deprecated, ...) have been deleted in favor of the async versions. If you were still on the sync paths, migrate by awaiting the async equivalents:

    • SwiftAutoGUI.sendKeyShortcut(_:)await SwiftAutoGUI.sendKeyShortcut(_:)
    • SwiftAutoGUI.keyDown(_:) / keyUp(_:)await versions
    • 🚚 SwiftAutoGUI.moveMouse(dx:dy:)await SwiftAutoGUI.moveMouse(dx:dy:)
    • 🚚 SwiftAutoGUI.move(to:)await SwiftAutoGUI.move(to:duration: 0)
    • SwiftAutoGUI.doubleClick(button:) / doubleClick(at:button:)await versions
    • SwiftAutoGUI.tripleClick(button:) / tripleClick(at:button:)await versions

    Metal-based template matching. Image recognition now runs on Metal instead of OpenCV — faster on Apple Silicon, fewer false positives, and the opencv-spm dependency is gone.

    🔄 Changes

    • 💥 breaking: remove deprecated sync APIs (#101)
    • feat: replace OpenCV template matching with Metal (#100)
    • 🔌 chore(plugin): bump to 0.23.0 (#102)
    • 📄 docs: add Homebrew install instructions for sagui CLI
    • 📄 docs: add star history chart
    • 📜 chore(deps): bump swift-argument-parser to 1.8.1 (#98)
    • 📄 docs: add AGENTS.md reference to CLAUDE.md (#99)
  • v0.22.0 Changes

    May 16, 2026

    Highlights

    🔌 Claude Code Marketplace plugin — SwiftAutoGUI can now be installed as a Claude Code plugin without cloning the repo:

    /plugin marketplace add NakaokaRei/SwiftAutoGUI
    /plugin install swift-auto-gui@swift-auto-gui
    

    🍎 This installs the bundled macos-control skill, which lets Claude drive macOS GUI apps via the sagui CLI — mouse, keyboard, screenshots, image recognition. The skill walks you through installing the sagui binary on first use (Swift 6.2+ toolchain required).

    🔄 Changes

    • 📦 feat: package skill as Claude Code Marketplace plugin (#95)
    • 🔌 chore(plugin): bump to 0.22.0 (#96)

    Closes #94.

  • v0.21.0 Changes

    May 08, 2026

    What's Changed

    • feat: add AXUIElement-based actions for semantic GUI automation (#91)
    • feat: wire AX actions into AI backends and sagui CLI (#92)
    • 🔌 chore(deps): bump github.com/apple/swift-docc-plugin from 1.4.6 to 1.5.0 (#89)

    Highlights

    This release adds an accessibility-based automation layer alongside the existing CGEvent path (closes #86):

    • Library API — new SwiftAutoGUI.pressButton(label:app:), setTextField(label:value:app:), selectMenuItem(path:app:), raiseWindow(title:app:), isEnabled(...), getValue(...). Plus low-level primitives in AXAction and search in AXSearch. App scoping via AXAppScope (.frontmost, .bundleID, .pid). Label matching defaults to case-insensitive contains; exact: true opts into strict equality. pressButton automatically falls back to a CGEvent click at the element's frame center when AX press is unsupported (e.g. Electron apps); opt out with axOnly: true.
    • AI backendsAction and BasicAction carry the new cases, so Foundation Models and OpenAI (Responses + Vision) can emit semantic AX actions. The vision prompt now nudges the model to prefer AX over coordinates when the accessibility tree provides labels.
    • sagui ax CLI — new subcommands: press, set, menu, tree, find. Useful for terminal-driven AX automation and for debugging which labels an app exposes.

      sagui ax tree --bundle-id com.apple.calculator sagui ax press --label"5"--bundle-id com.apple.calculator sagui ax press --label"Add"--bundle-id com.apple.calculator sagui ax menu File"New"--bundle-id com.apple.TextEdit sagui axset--role AXTextArea --value"hello"--bundle-id com.apple.TextEdit

    A demo script scripts/demo-ax.sh drives Calculator and TextEdit end-to-end.

    Full Changelog : 0.20.0...0.21.0

  • v0.20.0 Changes

    April 12, 2026

    What's Changed

    • feat: migrate OpenAI backends from Chat Completions to Responses API (#85)
    • 🍎 feat: add Claude Code skill for macOS GUI automation (#83)
    • feat: add keyboard input source context to AI agent (#81)

    Full Changelog : 0.19.0...0.20.0

  • v0.19.0 Changes

    April 02, 2026

    What's New

    👍 Screen Context Support

    The Agent now receives screen context — accessibility tree and window information — alongside screenshots, enabling more accurate reasoning about on-screen elements.

    letagent=Agent(backend:backend)letresult=tryawaitagent.run(goal:"Click the Submit button")// Agent now sees both the screenshot AND the accessibility tree
    

    0️⃣ Screen context is enabled by default. Disable it from the CLI with:

    sagui agent"your goal"--api-key sk-... --no-screen-context
    

    Key additions:

    • Screen context with accessibility tree and active window info sent to the vision backend
    • --no-screen-context CLI flag to opt out
    • Screen Context tab in the Sample app with a toggle to enable/disable

    Other Changes

    • ⚡️ Updated project description to reflect full feature set
    • Cleaned up and reorganized README
    • ➕ Added contributor attribution

    Full Changelog : 0.18.0...0.19.0

  • v0.18.0 Changes

    April 01, 2026

    What's New

    AI Agent (Autonomous Loop)

    A new Agent that autonomously observes the screen, reasons about what it sees, and executes actions in a loop until a goal is achieved. This follows the ReAct (Observe → Think → Act) pattern using a vision-capable LLM.

    letbackend=OpenAIVisionBackend(apiKey:"sk-...",model:"gpt-4o")letagent=Agent(backend:backend,maxIterations:15)letresult=tryawaitagent.run(goal:"Open Safari and search for Swift"){stepinprint("Step: \(step.reasoning)")}
    

    Also available from the CLI:

    sagui agent"Open Safari and search for Swift"--api-key sk-...
    

    Key additions:

    • Agent — Orchestrates the observe-think-act loop with configurable max iterations and delay
    • VisionActionGenerating protocol — For backends that accept screenshots and conversation history
    • OpenAIVisionBackend — Vision backend using OpenAI Chat Completions API with structured outputs
    • sagui agent CLI subcommand — Run the agent from the terminal
    • Agent demo tab in the Sample app with step history and stop support

    ✂ Removed MacPaw/OpenAI SDK Dependency

    ✅ Both OpenAIBackend and OpenAIVisionBackend now use direct URLSession HTTP calls to the OpenAI API, removing the MacPaw/OpenAI SDK dependency entirely. This improves compatibility with the latest OpenAI models and reduces the dependency footprint.

    Other Changes

    • BasicAction is now public with null-tolerant Codable decoding
    • 📜 Robust JSON response parsing for API responses with trailing bytes
    • 🚀 Bumped actions/deploy-pages from 4 to 5

    Full Changelog : 0.17.0...0.18.0

  • v0.17.0 Changes

    March 22, 2026

    What's New

    sagui CLI Tool

    A new command-line tool sagui for GUI automation directly from the terminal, built with swift-argument-parser.

    swift run sagui key shortcutcommandc# Keyboard shortcutswift run sagui mouse move --x 100 --y 200# Mouse movementswift run sagui screen screenshot --output capture.png# Screenshot
    

    Three subcommands:

    • key — Send keyboard events and shortcuts
    • mouse — Move, click, drag, and scroll
    • screen — Take screenshots

    CI Improvements

    • 🏗 Parallelized CI jobs (build, test, docs, sample app) for faster feedback

    📚 Documentation

    • ➕ Added DocC comments to sagui CLI commands
    • ⚡️ Updated CLAUDE.md with sagui CLI, AI action generation, and current architecture

    ⚡️ Dependency Updates

    • ⬆️ Bumped opencv-spm from 4.11.0 to 4.13.0
    • ⬆️ Bumped swift-docc-plugin from 1.3.0 to 1.4.6
    • ➕ Added Dependabot for automated dependency updates
    • ⚡️ Updated GitHub Actions (checkout v6, upload-artifact v7, github-script v8, upload-pages-artifact v4)

    Full Changelog : 0.16.0...0.17.0

  • v0.16.0 Changes

    February 07, 2026

    What's New

    👍 JIS Keyboard Layout Support

    SwiftAutoGUI now supports JIS (Japanese Industrial Standard) keyboard layout in addition to US. The physical keyboard type is auto-detected, so symbols like @, [, :, _ are mapped to the correct keys automatically.

    🆕 New Features

    • KeyboardLayout enum.us and .jis with per-layout character-to-key mapping tables
    • Physical keyboard auto-detection — Uses KBGetLayoutType to detect hardware layout regardless of IME setting
    • CharacterKeyMapping — Pairs key and shift requirement together for accurate per-layout symbol input
    • SwiftAutoGUI.currentLayout — Read the detected layout or manually override it
    • JIS-specific keys.jisYen (¥), .jisUnderscore (ろ), .jisEisu (英数), .jisKana (かな)
    • Key.from(character:layout:) — Layout-explicit key lookup usable from any concurrency context

    Usage

    // Auto-detected — just works on both US and JIS keyboardsawaitSwiftAutoGUI.write("Hello @world [test]")// Manually overrideSwiftAutoGUI.currentLayout=.jis
    
    // Reset to auto-detectionSwiftAutoGUI.resetLayoutToAutoDetect()// Explicit layout lookupletkey=Key.from(character:"@",layout:.jis)// .leftBracket
    

    Sample App

    • Added KB Layout demo tab with layout switching, US/JIS mapping comparison table, and symbol typing test

    Backward Compatibility

    • Existing code continues to work without changes
    • US layout environments behave identically to before via auto-detection
    • Key.from(character:) now uses currentLayout internally (@MainActor)

    Full Changelog : 0.15.0...0.16.0