Changelog History
-
v0.25.0 Changes
June 13, 2026
What's Changed
- 📜 chore(deps): bump github.com/apple/swift-argument-parser from 1.8.1 to 1.8.2 by @dependabot[bot] in #109
- ➕ Add structured native app-control actions to BasicAction by @NakaokaRei in #111
Full Changelog : 0.24.0...0.25.0
-
v0.24.0 Changes
June 07, 2026
What's Changed
- 📇 Rename image recognition target by @NakaokaRei in #103
- Organize SwiftAutoGUI sources by responsibility by @NakaokaRei in #104
- ✅ Make input-generating tests opt-in locally by @NakaokaRei in #105
- ➕ Add command plugin for Metal library generation by @NakaokaRei in #106
- ✂ Remove obsolete Metal build script by @NakaokaRei in #107
- 🚀 Automate plugin releases by @NakaokaRei in #108
Full Changelog : 0.23.0...0.24.0
-
v0.23.0 Changes
June 06, 2026
Highlights
💥 Breaking: deprecated sync APIs removed. The synchronous variants previously marked @available(*, deprecated, ...) have been deleted in favor of the async versions. If you were still on the sync paths, migrate by awaiting the async equivalents:
- SwiftAutoGUI.sendKeyShortcut(_:) → await SwiftAutoGUI.sendKeyShortcut(_:)
- SwiftAutoGUI.keyDown(_:) / keyUp(_:) → await versions
- SwiftAutoGUI.moveMouse(dx:dy:) → await SwiftAutoGUI.moveMouse(dx:dy:)
- SwiftAutoGUI.move(to:) → await SwiftAutoGUI.move(to:duration: 0)
- SwiftAutoGUI.doubleClick(button:) / doubleClick(at:button:) → await versions
- SwiftAutoGUI.tripleClick(button:) / tripleClick(at:button:) → await versions
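For call sites that are still synchronous, the migration above usually means either marking the caller async or hopping into a Task. The following is a minimal sketch of both options. It uses a hypothetical stand-in type (AutomationStub) instead of the real library so that it compiles on its own; the function names and return values are assumptions for illustration only.

```swift
// Hypothetical stand-in for an async-only automation API; this is not SwiftAutoGUI itself.
enum AutomationStub {
    static func moveMouse(dx: Double, dy: Double) async -> String {
        "moved by (\(dx), \(dy))"
    }
}

// Option 1: make the caller async and await the call directly.
func nudgeCursor() async -> String {
    await AutomationStub.moveMouse(dx: 10, dy: 0)
}

// Option 2: from a synchronous context, start a Task and await inside it.
// The result is delivered through a completion handler (an assumption of this sketch).
func nudgeCursorFromSyncCode(completion: @escaping @Sendable (String) -> Void) {
    Task {
        let summary = await AutomationStub.moveMouse(dx: 10, dy: 0)
        completion(summary)
    }
}
```

Option 1 is usually the simpler change when the surrounding code can become async; option 2 keeps the synchronous signature at the cost of receiving the result later.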
Metal-based template matching. Image recognition now runs on Metal instead of OpenCV: faster on Apple Silicon, fewer false positives, and the opencv-spm dependency is gone.
🔄 Changes
- 💥 breaking: remove deprecated sync APIs (#101)
- feat: replace OpenCV template matching with Metal (#100)
- 🔌 chore(plugin): bump to 0.23.0 (#102)
- 📄 docs: add Homebrew install instructions for sagui CLI
- 📄 docs: add star history chart
- 📜 chore(deps): bump swift-argument-parser to 1.8.1 (#98)
- 📄 docs: add AGENTS.md reference to CLAUDE.md (#99)
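For readers unfamiliar with template matching, the idea behind the Metal change in this release can be illustrated on the CPU. This is a minimal sketch that scores every candidate position by the sum of squared differences over grayscale grids. It is not the library's Metal kernel; the function name, the scoring metric, and the data layout are assumptions made for illustration.

```swift
/// Finds the top-left offset where `template` best matches `image`,
/// scoring each position by the sum of squared differences (lower is better).
/// Both inputs are row-major grayscale grids.
func bestMatch(image: [[Int]], template: [[Int]]) -> (x: Int, y: Int, score: Int)? {
    guard let imageWidth = image.first?.count,
          let templateWidth = template.first?.count,
          templateWidth > 0,
          template.count <= image.count,
          templateWidth <= imageWidth else { return nil }

    var best: (x: Int, y: Int, score: Int)? = nil
    for y in 0...(image.count - template.count) {
        for x in 0...(imageWidth - templateWidth) {
            // Accumulate the squared difference over the whole template window.
            var score = 0
            for ty in 0..<template.count {
                for tx in 0..<templateWidth {
                    let diff = image[y + ty][x + tx] - template[ty][tx]
                    score += diff * diff
                }
            }
            // Keep the earliest position with the lowest score.
            if let current = best, current.score <= score { continue }
            best = (x: x, y: y, score: score)
        }
    }
    return best
}
```

Each candidate position is scored independently of the others, which is what makes this kind of search a natural fit for a GPU.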
-
v0.22.0 Changes
May 16, 2026
Highlights
🔌 Claude Code Marketplace plugin: SwiftAutoGUI can now be installed as a Claude Code plugin without cloning the repo:
    /plugin marketplace add NakaokaRei/SwiftAutoGUI
    /plugin install swift-auto-gui@swift-auto-gui
This installs the bundled macos-control skill, which lets Claude drive macOS GUI apps via the sagui CLI (mouse, keyboard, screenshots, image recognition). The skill walks you through installing the sagui binary on first use (Swift 6.2+ toolchain required).
🔄 Changes
Closes #94.
-
v0.21.0 Changes
May 08, 2026
What's Changed
- feat: add AXUIElement-based actions for semantic GUI automation (#91)
- feat: wire AX actions into AI backends and sagui CLI (#92)
- 🔌 chore(deps): bump github.com/apple/swift-docc-plugin from 1.4.6 to 1.5.0 (#89)
Highlights
This release adds an accessibility-based automation layer alongside the existing CGEvent path (closes #86):
- Library API: new SwiftAutoGUI.pressButton(label:app:), setTextField(label:value:app:), selectMenuItem(path:app:), raiseWindow(title:app:), isEnabled(...), getValue(...). Plus low-level primitives in AXAction and search in AXSearch. App scoping via AXAppScope (.frontmost, .bundleID, .pid). Label matching defaults to case-insensitive contains; exact: true opts into strict equality. pressButton automatically falls back to a CGEvent click at the element's frame center when AX press is unsupported (e.g. Electron apps); opt out with axOnly: true.
- AI backends: Action and BasicAction carry the new cases, so Foundation Models and OpenAI (Responses + Vision) can emit semantic AX actions. The vision prompt now nudges the model to prefer AX over coordinates when the accessibility tree provides labels.
- sagui ax CLI: new subcommands press, set, menu, tree, find. Useful for terminal-driven AX automation and for debugging which labels an app exposes.
    sagui ax tree --bundle-id com.apple.calculator
    sagui ax press --label "5" --bundle-id com.apple.calculator
    sagui ax press --label "Add" --bundle-id com.apple.calculator
    sagui ax menu File "New" --bundle-id com.apple.TextEdit
    sagui ax set --role AXTextArea --value "hello" --bundle-id com.apple.TextEdit
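The label-matching rule and the press fallback described in the Library API item can be sketched without any accessibility calls. The types and function names below are hypothetical and only model the decisions; in particular, what the library does when axOnly is true and AX press is unsupported is not stated in these notes, so the `.unsupported` case is an assumption.

```swift
import Foundation

/// Hypothetical matcher mirroring the rule described above:
/// case-insensitive "contains" by default, strict equality when `exact` is true.
func labelMatches(_ candidate: String, query: String, exact: Bool = false) -> Bool {
    if exact {
        return candidate == query
    }
    return candidate.range(of: query, options: .caseInsensitive) != nil
}

/// Hypothetical element frame, used only to show where a fallback click would land.
struct ElementFrame {
    var x: Double, y: Double, width: Double, height: Double

    /// Center of the frame in the same coordinate space as the frame itself.
    var center: (x: Double, y: Double) {
        (x: x + width / 2, y: y + height / 2)
    }
}

enum PressStrategy: Equatable {
    case accessibilityPress
    case clickAt(x: Double, y: Double)
    case unsupported   // assumption: the outcome when axOnly blocks the fallback
}

/// Prefers the AX press; otherwise clicks the frame center unless the caller asked for AX only.
func pressStrategy(supportsAXPress: Bool, frame: ElementFrame, axOnly: Bool = false) -> PressStrategy {
    if supportsAXPress { return .accessibilityPress }
    if axOnly { return .unsupported }
    let center = frame.center
    return .clickAt(x: center.x, y: center.y)
}
```

With these definitions, a query of "add" matches the label "Add to Cart" by default but not when exact is true.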
A demo script scripts/demo-ax.sh drives Calculator and TextEdit end-to-end.
Full Changelog : 0.20.0...0.21.0
-
v0.20.0 Changes
April 12, 2026
What's Changed
- feat: migrate OpenAI backends from Chat Completions to Responses API (#85)
- 🍎 feat: add Claude Code skill for macOS GUI automation (#83)
- feat: add keyboard input source context to AI agent (#81)
Full Changelog : 0.19.0...0.20.0
-
v0.19.0 Changes
April 02, 2026
What's New
👍 Screen Context Support
The Agent now receives screen context — accessibility tree and window information — alongside screenshots, enabling more accurate reasoning about on-screen elements.
    let agent = Agent(backend: backend)
    let result = try await agent.run(goal: "Click the Submit button")
    // Agent now sees both the screenshot AND the accessibility tree
Screen context is enabled by default. Disable it from the CLI with:
    sagui agent "your goal" --api-key sk-... --no-screen-context
Key additions:
- Screen context with accessibility tree and active window info sent to the vision backend
- --no-screen-context CLI flag to opt out
- Screen Context tab in the Sample app with a toggle to enable/disable
Other Changes
- ⚡️ Updated project description to reflect full feature set
- Cleaned up and reorganized README
- ➕ Added contributor attribution
Full Changelog : 0.18.0...0.19.0
-
v0.18.0 Changes
April 01, 2026
What's New
AI Agent (Autonomous Loop)
A new Agent that autonomously observes the screen, reasons about what it sees, and executes actions in a loop until a goal is achieved. This follows the ReAct (Observe → Think → Act) pattern using a vision-capable LLM.
    let backend = OpenAIVisionBackend(apiKey: "sk-...", model: "gpt-4o")
    let agent = Agent(backend: backend, maxIterations: 15)
    let result = try await agent.run(goal: "Open Safari and search for Swift") { step in
        print("Step: \(step.reasoning)")
    }
Also available from the CLI:
    sagui agent "Open Safari and search for Swift" --api-key sk-...
Key additions:
- Agent: Orchestrates the observe-think-act loop with configurable max iterations and delay
- VisionActionGenerating protocol: For backends that accept screenshots and conversation history
- OpenAIVisionBackend: Vision backend using OpenAI Chat Completions API with structured outputs
- sagui agent CLI subcommand: Run the agent from the terminal
- Agent demo tab in the Sample app with step history and stop support
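The observe-think-act loop with an iteration cap can be sketched in a few lines. This is a simplified, synchronous illustration with hypothetical names (AgentDecision, DecisionBackend, runLoop); the library's Agent is async and works with screenshots, so treat this only as a model of the control flow.

```swift
/// One step a hypothetical backend can return.
enum AgentDecision {
    case act(String)    // an action to execute, described as text in this sketch
    case done(String)   // goal reached, with a short summary
}

/// Hypothetical backend: receives the goal, the latest observation, and the history so far.
protocol DecisionBackend {
    func decide(goal: String, observation: String, history: [String]) -> AgentDecision
}

struct LoopResult {
    var completed: Bool
    var steps: [String]
}

/// Observes, thinks, and acts until the backend reports completion
/// or the iteration budget runs out.
func runLoop(goal: String,
             backend: DecisionBackend,
             maxIterations: Int,
             observe: () -> String,
             execute: (String) -> Void) -> LoopResult {
    var history: [String] = []
    for _ in 0..<max(0, maxIterations) {
        let observation = observe()                   // Observe
        let decision = backend.decide(goal: goal,     // Think
                                      observation: observation,
                                      history: history)
        switch decision {
        case .act(let action):
            execute(action)                           // Act
            history.append(action)
        case .done:
            return LoopResult(completed: true, steps: history)
        }
    }
    // Budget exhausted before the backend reported completion.
    return LoopResult(completed: false, steps: history)
}
```

The iteration cap is what keeps a backend that never reports completion from running indefinitely.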
✂ Removed MacPaw/OpenAI SDK Dependency
Both OpenAIBackend and OpenAIVisionBackend now use direct URLSession HTTP calls to the OpenAI API, removing the MacPaw/OpenAI SDK dependency entirely. This improves compatibility with the latest OpenAI models and reduces the dependency footprint.
Other Changes
- BasicAction is now public with null-tolerant Codable decoding
- 📜 Robust JSON response parsing for API responses with trailing bytes
- 🚀 Bumped actions/deploy-pages from 4 to 5
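The two decoding items in the list above can be illustrated with Foundation alone. The sketch below uses a hypothetical payload type (SketchAction) whose fields are not taken from BasicAction, and the trailing-byte handling shown is one possible approach rather than the library's actual implementation.

```swift
import Foundation

/// Hypothetical action payload; the field names are illustrative only.
struct SketchAction: Decodable {
    var type: String
    var x: Double?
    var text: String?

    enum CodingKeys: String, CodingKey { case type, x, text }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        type = try container.decode(String.self, forKey: .type)
        // decodeIfPresent yields nil for both a missing key and an explicit JSON null.
        x = try container.decodeIfPresent(Double.self, forKey: .x)
        text = try container.decodeIfPresent(String.self, forKey: .text)
    }
}

/// Decodes a JSON object while ignoring any bytes after its last closing brace.
/// Assumption: the payload is a single top-level object.
func decodeIgnoringTrailingBytes(_ data: Data) throws -> SketchAction {
    let closingBrace = UInt8(ascii: "}")
    guard let end = data.lastIndex(of: closingBrace) else {
        // No brace found: let the decoder report the error on the raw data.
        return try JSONDecoder().decode(SketchAction.self, from: data)
    }
    return try JSONDecoder().decode(SketchAction.self, from: data.prefix(through: end))
}
```

Tolerating explicit nulls matters when a model fills unused fields with null instead of omitting them.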
Full Changelog : 0.17.0...0.18.0
-
v0.17.0 Changes
March 22, 2026
What's New
sagui CLI Tool
A new command-line tool sagui for GUI automation directly from the terminal, built with swift-argument-parser.
    swift run sagui key shortcut command c                    # Keyboard shortcut
    swift run sagui mouse move --x 100 --y 200                # Mouse movement
    swift run sagui screen screenshot --output capture.png    # Screenshot
Three subcommands:
- key: Send keyboard events and shortcuts
- mouse: Move, click, drag, and scroll
- screen: Take screenshots
CI Improvements
- 🏗 Parallelized CI jobs (build, test, docs, sample app) for faster feedback
📚 Documentation
- ➕ Added DocC comments to sagui CLI commands
- ⚡️ Updated CLAUDE.md with sagui CLI, AI action generation, and current architecture
⚡️ Dependency Updates
- ⬆️ Bumped opencv-spm from 4.11.0 to 4.13.0
- ⬆️ Bumped swift-docc-plugin from 1.3.0 to 1.4.6
- ➕ Added Dependabot for automated dependency updates
- ⚡️ Updated GitHub Actions (checkout v6, upload-artifact v7, github-script v8, upload-pages-artifact v4)
Full Changelog : 0.16.0...0.17.0
-
v0.16.0 Changes
February 07, 2026
What's New
👍 JIS Keyboard Layout Support
SwiftAutoGUI now supports JIS (Japanese Industrial Standard) keyboard layout in addition to US. The physical keyboard type is auto-detected, so symbols like @, [, :, _ are mapped to the correct keys automatically.
🆕 New Features
- KeyboardLayout enum: .us and .jis with per-layout character-to-key mapping tables
- Physical keyboard auto-detection: Uses KBGetLayoutType to detect hardware layout regardless of IME setting
- CharacterKeyMapping: Pairs key and shift requirement together for accurate per-layout symbol input
- SwiftAutoGUI.currentLayout: Read the detected layout or manually override it
- JIS-specific keys: .jisYen (¥), .jisUnderscore (ろ), .jisEisu (英数), .jisKana (かな)
- Key.from(character:layout:): Layout-explicit key lookup usable from any concurrency context
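The idea of pairing a key with its shift requirement per layout can be sketched with plain lookup tables. The types below (SketchLayout, SketchMapping) are hypothetical and the library's own KeyboardLayout and CharacterKeyMapping may be shaped differently; keys are named by their position on a US keyboard, which is an assumption of this sketch.

```swift
/// Hypothetical mapping: which physical key to press and whether Shift is held.
struct SketchMapping: Equatable {
    var key: String        // physical key, named by its US-legend position
    var needsShift: Bool
}

/// Hypothetical layout type with a small per-layout table for symbols
/// whose physical position differs between US and JIS keyboards.
enum SketchLayout {
    case us, jis

    var table: [Character: SketchMapping] {
        switch self {
        case .us:
            return [
                "@": SketchMapping(key: "2", needsShift: true),
                "[": SketchMapping(key: "leftBracket", needsShift: false),
                ":": SketchMapping(key: "semicolon", needsShift: true),
            ]
        case .jis:
            return [
                "@": SketchMapping(key: "leftBracket", needsShift: false),
                "[": SketchMapping(key: "rightBracket", needsShift: false),
                ":": SketchMapping(key: "quote", needsShift: false),
            ]
        }
    }
}

/// Looks up how to type `character` on the given layout, if the table covers it.
func mapping(for character: Character, layout: SketchLayout) -> SketchMapping? {
    layout.table[character]
}
```

Storing the shift flag next to the key is what lets the same character resolve to Shift+2 on a US keyboard and to an unshifted key on a JIS keyboard.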
Usage
    // Auto-detected: just works on both US and JIS keyboards
    await SwiftAutoGUI.write("Hello @world [test]")
    // Manually override
    SwiftAutoGUI.currentLayout = .jis
    // Reset to auto-detection
    SwiftAutoGUI.resetLayoutToAutoDetect()
    // Explicit layout lookup
    let key = Key.from(character: "@", layout: .jis) // .leftBracket
Sample App
- Added KB Layout demo tab with layout switching, US/JIS mapping comparison table, and symbol typing test
Backward Compatibility
- Existing code continues to work without changes
- US layout environments behave identically to before via auto-detection
- Key.from(character:) now uses currentLayout internally (@MainActor)
Full Changelog : 0.15.0...0.16.0