- Device labels: lowercase, no whitespace, only a-z 0-9 - _ (enforced at config time)
- Session IDs removed: device label is the sole identifier
- Routes changed: /sessions/:id → /devices/:label
- Removed commands: click, type, find-window, wait-for-window, label, old version, server-version
- Renamed: status → version (compares relay/remote.py/client commits)
- Unified screenshot: takes 'screen' or a window label as argument
- Windows listed with human-readable labels (same format as device labels)
- Single instance enforcement via PID lock file
- Removed input.rs (click/type functionality)
- All docs and code in English
- Protocol: Hello.label is now required (String, not Option<String>)
- Client auto-migrates invalid labels on startup
helios-remote
AI-first remote control tool — a relay server + Windows client written in Rust. Lets an AI agent (or any HTTP client) take full control of a remote Windows machine via a lightweight WebSocket relay.
Quick Connect
Run this in PowerShell (as Admin):
irm https://raw.githubusercontent.com/agent-helios/helios-remote/master/scripts/install.ps1 | iex
Windows Defender notice: Defender may flag the executable as unknown. Temporarily disable real-time protection before running:
Set-MpPreference -DisableRealtimeMonitoring $true
Re-enable it after the client has started:
Set-MpPreference -DisableRealtimeMonitoring $false
Architecture
helios-remote/
├── crates/
│ ├── common/ # Shared protocol types, WebSocket message definitions
│ ├── server/ # Relay server (REST API + WebSocket hub)
│ └── client/ # Windows client
├── remote.py # CLI wrapper for the REST API
├── Cargo.toml # Workspace root
└── README.md
How It Works
AI Agent
│ REST API (X-Api-Key)
▼
helios-server ──WebSocket── helios-client (Windows)
│ │
POST /devices/:label/screenshot │ Captures screen → base64 PNG
POST /devices/:label/exec │ Runs command in persistent shell
- The Windows client connects to the relay server via WebSocket and sends a Hello with its device label.
- The AI agent calls the REST API using the device label to issue commands.
- The relay server forwards commands to the correct client and streams back responses.
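The screenshot step in the diagram returns the image as a base64-encoded PNG. A minimal sketch of decoding such a payload on the agent side follows. The name decode_png is hypothetical, and because the shape of the JSON response is not documented here, the function takes the base64 string directly rather than assuming a field name.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png(b64_payload: str) -> bytes:
    """Decode a base64 string and check that the result looks like a PNG.

    Hypothetical helper: how the payload is extracted from the server
    response is left to the caller, since the response schema is not
    specified in this README.
    """
    raw = base64.b64decode(b64_payload, validate=True)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("payload is not a PNG image")
    return raw
```

The returned bytes can be written to a .png file in binary mode or handed to an image library.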
Device Labels
Device labels are the sole identifier for connected clients. Labels must be:
- Lowercase only
- No whitespace
- Only a-z, 0-9, -, and _ as characters
Labels are set during first-time client setup. Examples: moritz_pc, work-desktop, gaming-rig
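The label rules above can be expressed as a single regular expression. This is a sketch of the rule as stated, not the client's actual validation code. The name is_valid_label is hypothetical, and rejecting the empty string is an assumption, since no minimum or maximum length is given.

```python
import re

# Lowercase letters, digits, hyphen, and underscore only.
# Assumption: at least one character is required.
LABEL_PATTERN = re.compile(r"[a-z0-9_-]+")


def is_valid_label(label: str) -> bool:
    """Return True if the whole string consists of allowed characters."""
    return LABEL_PATTERN.fullmatch(label) is not None
```

Under these rules moritz_pc and work-desktop are accepted, while "Work PC" is rejected for both its uppercase letters and its space.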
Single Instance
Only one helios-remote client can run per device. The client uses a PID-based lock file to enforce this.
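For readers unfamiliar with the technique, here is a minimal sketch of a PID-based lock file. It illustrates the general idea only: the real client is written in Rust and runs on Windows, and its lock file location and stale-lock handling are not described here. The function names are hypothetical, and the liveness check relies on POSIX signal semantics, so this sketch should not be run on Windows, where os.kill behaves differently.

```python
import os


def pid_is_running(pid: int) -> bool:
    """POSIX-only probe: signal 0 tests for existence without signalling."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire_lock(path: str) -> bool:
    """Return True if this process now holds the lock, False otherwise."""
    try:
        with open(path) as lock_file:
            owner = int(lock_file.read().strip())
        if pid_is_running(owner):
            return False  # another live instance owns the lock
        os.remove(path)  # stale lock left behind by a dead process
    except FileNotFoundError:
        pass  # no lock file yet
    except ValueError:
        os.remove(path)  # unreadable content, treat as stale
    try:
        # O_EXCL makes creation fail if another process won the race.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(str(os.getpid()))
    return True


def release_lock(path: str) -> None:
    """Remove the lock file; a missing file is not an error."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

Storing the PID rather than an empty marker file lets a new instance tell a live owner apart from a lock left behind by a crash.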
Server
REST API
All endpoints (except /version and /ws) require the X-Api-Key header.
| Method | Path | Description |
|---|---|---|
| GET | /devices | List all connected devices |
| POST | /devices/:label/screenshot | Full screen screenshot (base64 PNG) |
| POST | /devices/:label/exec | Execute a shell command |
| GET | /devices/:label/windows | List visible windows (with labels) |
| POST | /devices/:label/windows/minimize-all | Minimize all windows |
| POST | /devices/:label/windows/:window_id/screenshot | Screenshot a specific window |
| POST | /devices/:label/windows/:window_id/focus | Focus a window |
| POST | /devices/:label/windows/:window_id/maximize | Maximize and focus a window |
| POST | /devices/:label/prompt | Show a MessageBox (blocks until OK) |
| POST | /devices/:label/run | Launch a program (fire-and-forget) |
| GET | /devices/:label/clipboard | Get clipboard contents |
| POST | /devices/:label/clipboard | Set clipboard contents |
| GET | /devices/:label/version | Get client version/commit |
| POST | /devices/:label/upload | Upload a file to the client |
| GET | /devices/:label/download?path=... | Download a file from the client |
| GET | /devices/:label/logs | Fetch client log tail |
| GET | /version | Server version/commit (no auth) |
WebSocket
Clients connect to ws://host:3000/ws. The first message must be a Hello with the device label.
Running the Server
HELIOS_API_KEY=your-secret-key HELIOS_BIND=0.0.0.0:3000 cargo run -p helios-server
| Variable | Default | Description |
|---|---|---|
| HELIOS_API_KEY | dev-secret | API key for REST endpoints |
| HELIOS_BIND | 0.0.0.0:3000 | Listen address |
| RUST_LOG | helios_server=debug | Log level |
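A launcher script might want to know which values the server will end up with before starting it. The sketch below mirrors the defaults from the table above. It is not the server's own configuration code (which is Rust), the function names are hypothetical, and it assumes a variable counts as set whenever it is present in the environment.

```python
import os

# Defaults copied from the table above.
DEFAULTS = {
    "HELIOS_API_KEY": "dev-secret",
    "HELIOS_BIND": "0.0.0.0:3000",
    "RUST_LOG": "helios_server=debug",
}


def resolve_server_env(env=None) -> dict:
    """Return the effective settings: environment values, else defaults."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


def uses_default_key(settings: dict) -> bool:
    """True if the API key is still the documented development default."""
    return settings["HELIOS_API_KEY"] == DEFAULTS["HELIOS_API_KEY"]
```

Checking uses_default_key before exposing the server beyond localhost is a cheap way to catch a forgotten HELIOS_API_KEY.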
Example API Usage
# List devices
curl -H "X-Api-Key: your-secret-key" http://localhost:3000/devices
# Take a full-screen screenshot
curl -s -X POST -H "X-Api-Key: your-secret-key" \
http://localhost:3000/devices/moritz_pc/screenshot
# Run a command
curl -s -X POST -H "X-Api-Key: your-secret-key" \
-H "Content-Type: application/json" \
-d '{"command": "whoami"}' \
http://localhost:3000/devices/moritz_pc/exec
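The same calls can be made from Python with only the standard library. This is a sketch under assumptions: build_request and send are hypothetical helper names, and because the response schema is not documented here, send returns the raw response bytes instead of parsing specific fields.

```python
import json
import urllib.request


def build_request(base_url, api_key, method, path, body=None):
    """Build an authenticated request for the relay's REST API."""
    headers = {"X-Api-Key": api_key}
    data = None
    if body is not None:
        # JSON bodies need an explicit content type, as in the curl example.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request, timeout=30.0) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

For example, send(build_request("http://localhost:3000", "your-secret-key", "POST", "/devices/moritz_pc/exec", {"command": "whoami"})) corresponds to the last curl call above.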
remote.py CLI
The remote.py script provides a CLI wrapper around the REST API.
Commands
python remote.py devices # list connected devices
python remote.py screenshot <device> screen # full-screen screenshot → /tmp/helios-remote-screenshot.png
python remote.py screenshot <device> google_chrome # screenshot a specific window by label
python remote.py exec <device> <command...> # run shell command (PowerShell)
python remote.py exec <device> --timeout 600 <command...> # with custom timeout (seconds)
python remote.py windows <device> # list visible windows (with labels)
python remote.py focus <device> <window_label> # focus a window
python remote.py maximize <device> <window_label> # maximize and focus a window
python remote.py minimize-all <device> # minimize all windows
python remote.py prompt <device> "Please click Save" # ask user to do something manually
python remote.py prompt <device> "message" --title "Title" # with custom dialog title
python remote.py run <device> <program> [args...] # launch program (fire-and-forget)
python remote.py clipboard-get <device> # get clipboard text
python remote.py clipboard-set <device> <text> # set clipboard text
python remote.py upload <device> <local> <remote> # upload file
python remote.py download <device> <remote> <local> # download file
python remote.py version <device> # compare relay/remote.py/client commits
python remote.py logs <device> # fetch last 100 lines of client log
python remote.py logs <device> --lines 200 # custom line count
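An agent that drives the CLI from another Python program can assemble the argument list instead of formatting a shell string. The sketch below follows the exec syntax shown above. The helper names are hypothetical, remote.py is assumed to be in the working directory, and nothing is assumed about what the script prints or which exit codes it uses.

```python
import subprocess
import sys


def exec_argv(device, command, timeout=None, script="remote.py"):
    """Build the argument list for `remote.py exec <device> ...`."""
    argv = [sys.executable, script, "exec", device]
    if timeout is not None:
        # The flag precedes the command, matching the documented usage.
        argv += ["--timeout", str(timeout)]
    return argv + list(command)


def run_exec(device, command, timeout=None):
    """Run the CLI and return the CompletedProcess with captured output."""
    return subprocess.run(
        exec_argv(device, command, timeout),
        capture_output=True,
        text=True,
    )
```

Passing the command as a list avoids local shell quoting problems before the text reaches the remote PowerShell session.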
Window Labels
Windows are identified by human-readable labels (same format as device labels: lowercase, no whitespace). Use the windows command to list them:
$ python remote.py windows moritz_pc
Label Title
----------------------------------------------------------------------
google_chrome Google Chrome
discord Discord
visual_studio_code Visual Studio Code
Then use the label in screenshot, focus, or maximize:
python remote.py screenshot moritz_pc google_chrome
python remote.py focus moritz_pc discord
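The listing above suggests that labels are derived from window titles by lowercasing them and replacing spaces with underscores. The sketch below reproduces that mapping for the three titles shown. It is a guess at the rule, not the client's actual algorithm: title_to_label is a hypothetical name, and how the client handles duplicate titles or titles with no allowed characters is not covered here.

```python
import re


def title_to_label(title: str) -> str:
    """Derive a label in the device-label character set from a window title.

    Assumption: every run of characters outside a-z, 0-9, - and _
    collapses into a single underscore.
    """
    lowered = title.lower()
    collapsed = re.sub(r"[^a-z0-9_-]+", "_", lowered)
    return collapsed.strip("_")
```

With this rule "Visual Studio Code" becomes visual_studio_code, matching the sample output.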
Development
# Build everything
cargo build
# Run tests
cargo test
# Run server in dev mode
RUST_LOG=debug cargo run -p helios-server
License
MIT