docs: simplify README, remove REST API examples and dev section, polish SKILL.md
parent
3c7f970d4f
commit
ba3b365f4e
2 changed files with 71 additions and 168 deletions
177
README.md
|
|
@ -4,7 +4,7 @@
|
||||||
<img src="assets/logo.png" width="150" alt="helios-remote logo" />
|
<img src="assets/logo.png" width="150" alt="helios-remote logo" />
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
**AI-first remote control tool** — a relay server + Windows client written in Rust. Lets an AI agent (or any HTTP client) take full control of a remote Windows machine via a lightweight WebSocket relay.
|
**AI-first remote control tool** — a relay server + Windows client written in Rust. Lets an AI agent take full control of a remote Windows machine via a lightweight WebSocket relay.
|
||||||
|
|
||||||
## Quick Connect
|
## Quick Connect
|
||||||
|
|
||||||
|
|
@ -26,79 +26,50 @@ irm https://raw.githubusercontent.com/agent-helios/helios-remote/master/scripts/
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Architecture
|
## How It Works
|
||||||
|
|
||||||
```
|
|
||||||
helios-remote/
|
|
||||||
├── crates/
|
|
||||||
│ ├── common/ # Shared protocol types, WebSocket message definitions
|
|
||||||
│ ├── server/ # Relay server (REST API + WebSocket hub)
|
|
||||||
│ └── client/ # Windows client
|
|
||||||
├── remote.py # CLI wrapper for the REST API
|
|
||||||
├── Cargo.toml # Workspace root
|
|
||||||
└── README.md
|
|
||||||
```
|
|
||||||
|
|
||||||
### How It Works
|
|
||||||
|
|
||||||
```
|
```
|
||||||
AI Agent
|
AI Agent
|
||||||
│ REST API (X-Api-Key)
|
│
|
||||||
▼
|
▼ remote.py CLI
|
||||||
helios-server ──WebSocket── helios-client (Windows)
|
helios-server ──WebSocket── helios-client (Windows)
|
||||||
│ │
|
|
||||||
POST /devices/:label/screenshot │ Captures screen → base64 PNG
|
|
||||||
POST /devices/:label/exec │ Runs command in persistent shell
|
|
||||||
```
|
```
|
||||||
|
|
||||||
1. The **Windows client** connects to the relay server via WebSocket and sends a `Hello` with its device label.
|
1. The **Windows client** connects to the relay server via WebSocket and registers with its device label.
|
||||||
2. The **AI agent** calls the REST API using the device label to issue commands.
|
2. The **AI agent** uses `remote.py` to issue commands — screenshots, shell commands, window management, file transfers.
|
||||||
3. The relay server forwards commands to the correct client and streams back responses.
|
3. The relay server forwards everything to the correct client and streams back responses.
|
||||||
|
|
||||||
### Device Labels
|
Device labels are the sole identifier. Only one client instance can run per device.
|
||||||
|
|
||||||
Device labels are the **sole identifier** for connected clients. Labels must be:
|
---
|
||||||
- **Lowercase** only
|
|
||||||
- **No whitespace**
|
|
||||||
- Only `a-z`, `0-9`, `-`, `_` as characters
|
|
||||||
|
|
||||||
Labels are set during first-time client setup. Examples: `moritz_pc`, `work-desktop`, `gaming-rig`
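
The label rules listed above (lowercase, no whitespace, only `a-z`, `0-9`, `-`, `_`) can be checked with a single regular expression. The following is a minimal sketch, not the project's actual validation code: the name `is_valid_label` is hypothetical, and rejecting the empty string is an assumption the rules do not state explicitly.

```python
import re

# Character set taken from the rules above: a-z, 0-9, '-', '_'.
# The '+' also rejects the empty string (an assumption, see lead-in).
LABEL_RE = re.compile(r"[a-z0-9_-]+")


def is_valid_label(label: str) -> bool:
    """Return True if `label` uses only the allowed characters (hypothetical helper)."""
    return LABEL_RE.fullmatch(label) is not None


if __name__ == "__main__":
    for candidate in ["moritz_pc", "work-desktop", "Gaming Rig"]:
        print(candidate, is_valid_label(candidate))
```

`fullmatch` is used instead of `match` so that a trailing newline or any other stray character makes the label invalid.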
|
## remote.py CLI
|
||||||
|
|
||||||
### Single Instance
|
```bash
|
||||||
|
python remote.py devices # list connected devices
|
||||||
|
python remote.py screenshot <device> screen # full-screen screenshot → /tmp/helios-remote-screenshot.png
|
||||||
|
python remote.py screenshot <device> <window_label> # screenshot a specific window
|
||||||
|
python remote.py exec <device> <command...> # run shell command (PowerShell)
|
||||||
|
python remote.py exec <device> --timeout 600 <command...> # with custom timeout (seconds)
|
||||||
|
python remote.py windows <device> # list visible windows
|
||||||
|
python remote.py focus <device> <window_label> # focus a window
|
||||||
|
python remote.py maximize <device> <window_label> # maximize and focus a window
|
||||||
|
python remote.py minimize-all <device> # minimize all windows
|
||||||
|
python remote.py prompt <device> "Please click Save" # show MessageBox, blocks until user confirms
|
||||||
|
python remote.py prompt <device> "message" --title "Title" # with custom dialog title
|
||||||
|
python remote.py run <device> <program> [args...] # launch program (fire-and-forget)
|
||||||
|
python remote.py clipboard-get <device> # get clipboard text
|
||||||
|
python remote.py clipboard-set <device> <text> # set clipboard text
|
||||||
|
python remote.py upload <device> <local> <remote> # upload file to device
|
||||||
|
python remote.py download <device> <remote> <local> # download file from device
|
||||||
|
python remote.py version <device> # compare relay/remote.py/client commits
|
||||||
|
python remote.py logs <device> # fetch last 100 lines of client log
|
||||||
|
python remote.py logs <device> --lines 200 # custom line count
|
||||||
|
```
|
||||||
|
|
||||||
Only one helios-remote client can run per device. The client uses a PID-based lock file to enforce this.
|
---
|
||||||
|
|
||||||
## Server
|
## Server Setup
|
||||||
|
|
||||||
### REST API
|
|
||||||
|
|
||||||
All endpoints (except `/version` and `/ws`) require the `X-Api-Key` header.
|
|
||||||
|
|
||||||
| Method | Path | Description |
|
|
||||||
|---|---|---|
|
|
||||||
| `GET` | `/devices` | List all connected devices |
|
|
||||||
| `POST` | `/devices/:label/screenshot` | Full screen screenshot (base64 PNG) |
|
|
||||||
| `POST` | `/devices/:label/exec` | Execute a shell command |
|
|
||||||
| `GET` | `/devices/:label/windows` | List visible windows (with labels) |
|
|
||||||
| `POST` | `/devices/:label/windows/minimize-all` | Minimize all windows |
|
|
||||||
| `POST` | `/devices/:label/windows/:window_id/screenshot` | Screenshot a specific window |
|
|
||||||
| `POST` | `/devices/:label/windows/:window_id/focus` | Focus a window |
|
|
||||||
| `POST` | `/devices/:label/windows/:window_id/maximize` | Maximize and focus a window |
|
|
||||||
| `POST` | `/devices/:label/prompt` | Show a MessageBox (blocks until OK) |
|
|
||||||
| `POST` | `/devices/:label/run` | Launch a program (fire-and-forget) |
|
|
||||||
| `GET` | `/devices/:label/clipboard` | Get clipboard contents |
|
|
||||||
| `POST` | `/devices/:label/clipboard` | Set clipboard contents |
|
|
||||||
| `GET` | `/devices/:label/version` | Get client version/commit |
|
|
||||||
| `POST` | `/devices/:label/upload` | Upload a file to the client |
|
|
||||||
| `GET` | `/devices/:label/download?path=...` | Download a file from the client |
|
|
||||||
| `GET` | `/devices/:label/logs` | Fetch client log tail |
|
|
||||||
| `GET` | `/version` | Server version/commit (no auth) |
|
|
||||||
|
|
||||||
### WebSocket
|
|
||||||
|
|
||||||
Clients connect to `ws://host:3000/ws`. The first message must be a `Hello` with the device label.
|
|
||||||
|
|
||||||
### Running the Server
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
HELIOS_API_KEY=your-secret-key HELIOS_BIND=0.0.0.0:3000 cargo run -p helios-server
|
HELIOS_API_KEY=your-secret-key HELIOS_BIND=0.0.0.0:3000 cargo run -p helios-server
|
||||||
|
|
@ -106,87 +77,11 @@ HELIOS_API_KEY=your-secret-key HELIOS_BIND=0.0.0.0:3000 cargo run -p helios-serv
|
||||||
|
|
||||||
| Variable | Default | Description |
|
| Variable | Default | Description |
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| `HELIOS_API_KEY` | `dev-secret` | API key for REST endpoints |
|
| `HELIOS_API_KEY` | `dev-secret` | API key |
|
||||||
| `HELIOS_BIND` | `0.0.0.0:3000` | Listen address |
|
| `HELIOS_BIND` | `0.0.0.0:3000` | Listen address |
|
||||||
| `RUST_LOG` | `helios_server=debug` | Log level |
|
| `RUST_LOG` | `helios_server=debug` | Log level |
|
||||||
|
|
||||||
### Example API Usage
|
---
|
||||||
|
|
||||||
```bash
|
|
||||||
# List devices
|
|
||||||
curl -H "X-Api-Key: your-secret-key" http://localhost:3000/devices
|
|
||||||
|
|
||||||
# Take a full-screen screenshot
|
|
||||||
curl -s -X POST -H "X-Api-Key: your-secret-key" \
|
|
||||||
http://localhost:3000/devices/moritz_pc/screenshot
|
|
||||||
|
|
||||||
# Run a command
|
|
||||||
curl -s -X POST -H "X-Api-Key: your-secret-key" \
|
|
||||||
-H "Content-Type: application/json" \
|
|
||||||
-d '{"command": "whoami"}' \
|
|
||||||
http://localhost:3000/devices/moritz_pc/exec
|
|
||||||
```
|
|
||||||
|
|
||||||
## remote.py CLI
|
|
||||||
|
|
||||||
The `remote.py` script provides a CLI wrapper around the REST API.
|
|
||||||
|
|
||||||
### Commands
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python remote.py devices # list connected devices
|
|
||||||
python remote.py screenshot <device> screen # full-screen screenshot → /tmp/helios-remote-screenshot.png
|
|
||||||
python remote.py screenshot <device> google_chrome # screenshot a specific window by label
|
|
||||||
python remote.py exec <device> <command...> # run shell command (PowerShell)
|
|
||||||
python remote.py exec <device> --timeout 600 <command...> # with custom timeout (seconds)
|
|
||||||
python remote.py windows <device> # list visible windows (with labels)
|
|
||||||
python remote.py focus <device> <window_label> # focus a window
|
|
||||||
python remote.py maximize <device> <window_label> # maximize and focus a window
|
|
||||||
python remote.py minimize-all <device> # minimize all windows
|
|
||||||
python remote.py prompt <device> "Please click Save" # ask user to do something manually
|
|
||||||
python remote.py prompt <device> "message" --title "Title" # with custom dialog title
|
|
||||||
python remote.py run <device> <program> [args...] # launch program (fire-and-forget)
|
|
||||||
python remote.py clipboard-get <device> # get clipboard text
|
|
||||||
python remote.py clipboard-set <device> <text> # set clipboard text
|
|
||||||
python remote.py upload <device> <local> <remote> # upload file
|
|
||||||
python remote.py download <device> <remote> <local> # download file
|
|
||||||
python remote.py version <device> # compare relay/remote.py/client commits
|
|
||||||
python remote.py logs <device> # fetch last 100 lines of client log
|
|
||||||
python remote.py logs <device> --lines 200 # custom line count
|
|
||||||
```
|
|
||||||
|
|
||||||
### Window Labels
|
|
||||||
|
|
||||||
Windows are identified by human-readable labels (same format as device labels: lowercase, no whitespace). Use `windows` to list them:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
$ python remote.py windows moritz_pc
|
|
||||||
Label Title
|
|
||||||
----------------------------------------------------------------------
|
|
||||||
google_chrome Google Chrome
|
|
||||||
discord Discord
|
|
||||||
visual_studio_code Visual Studio Code
|
|
||||||
```
|
|
||||||
|
|
||||||
Then use the label in `screenshot`, `focus`, or `maximize`:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python remote.py screenshot moritz_pc google_chrome
|
|
||||||
python remote.py focus moritz_pc discord
|
|
||||||
```
|
|
||||||
|
|
||||||
## Development
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Build everything
|
|
||||||
cargo build
|
|
||||||
|
|
||||||
# Run tests
|
|
||||||
cargo test
|
|
||||||
|
|
||||||
# Run server in dev mode
|
|
||||||
RUST_LOG=debug cargo run -p helios-server
|
|
||||||
```
|
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
|
||||||
|
|
|
||||||
62
SKILL.md
|
|
@ -30,63 +30,71 @@ When Moritz asks to do something on a connected PC:
|
||||||
```bash
|
```bash
|
||||||
SKILL_DIR=/home/moritz/.openclaw/workspace/skills/helios-remote
|
SKILL_DIR=/home/moritz/.openclaw/workspace/skills/helios-remote
|
||||||
|
|
||||||
# Devices
|
# List connected devices
|
||||||
python $SKILL_DIR/remote.py devices
|
python $SKILL_DIR/remote.py devices
|
||||||
|
|
||||||
# Screenshot → /tmp/helios-remote-screenshot.png
|
# Screenshot → /tmp/helios-remote-screenshot.png
|
||||||
# ALWAYS prefer window screenshots (saves bandwidth)!
|
# ALWAYS prefer window screenshots (saves bandwidth)!
|
||||||
python $SKILL_DIR/remote.py screenshot moritz_pc google_chrome # window by label
|
python $SKILL_DIR/remote.py screenshot moritz-pc chrome # window by label
|
||||||
python $SKILL_DIR/remote.py screenshot moritz_pc screen # full screen only when no window known
|
python $SKILL_DIR/remote.py screenshot moritz-pc screen # full screen only when no window known
|
||||||
|
|
||||||
|
# List visible windows (use labels for screenshot/focus/maximize)
|
||||||
|
python $SKILL_DIR/remote.py windows moritz-pc
|
||||||
|
|
||||||
|
# Window labels come from the process name (e.g. chrome, discord, pycharm64)
|
||||||
|
# Duplicates get a number suffix: chrome, chrome2, chrome3
|
||||||
|
# Use `windows` to discover labels before targeting a specific window
|
||||||
|
|
||||||
|
# Focus / maximize a window
|
||||||
|
python $SKILL_DIR/remote.py focus moritz-pc discord
|
||||||
|
python $SKILL_DIR/remote.py maximize moritz-pc chrome
|
||||||
|
|
||||||
|
# Minimize all windows
|
||||||
|
python $SKILL_DIR/remote.py minimize-all moritz-pc
|
||||||
|
|
||||||
# Shell command (PowerShell, no wrapper needed)
|
# Shell command (PowerShell, no wrapper needed)
|
||||||
python $SKILL_DIR/remote.py exec moritz_pc "Get-Process"
|
python $SKILL_DIR/remote.py exec moritz-pc "Get-Process"
|
||||||
python $SKILL_DIR/remote.py exec moritz_pc "hostname"
|
python $SKILL_DIR/remote.py exec moritz-pc "hostname"
|
||||||
# With longer timeout for downloads etc. (default: 30s)
|
# With longer timeout for downloads etc. (default: 30s)
|
||||||
python $SKILL_DIR/remote.py exec moritz_pc --timeout 600 "Invoke-WebRequest -Uri https://... -OutFile C:\file.zip"
|
python $SKILL_DIR/remote.py exec moritz-pc --timeout 600 "Invoke-WebRequest -Uri https://... -OutFile C:\file.zip"
|
||||||
|
|
||||||
# Windows (visible only, shown with human-readable labels)
|
|
||||||
python $SKILL_DIR/remote.py windows moritz_pc
|
|
||||||
python $SKILL_DIR/remote.py focus moritz_pc discord
|
|
||||||
python $SKILL_DIR/remote.py maximize moritz_pc google_chrome
|
|
||||||
python $SKILL_DIR/remote.py minimize-all moritz_pc
|
|
||||||
|
|
||||||
# Launch program (fire-and-forget)
|
# Launch program (fire-and-forget)
|
||||||
python $SKILL_DIR/remote.py run moritz_pc notepad.exe
|
python $SKILL_DIR/remote.py run moritz-pc notepad.exe
|
||||||
|
|
||||||
# Ask user to do something (shows MessageBox, blocks until OK)
|
# Ask user to do something (shows MessageBox, blocks until OK)
|
||||||
python $SKILL_DIR/remote.py prompt moritz_pc "Please click Save, then OK"
|
python $SKILL_DIR/remote.py prompt moritz-pc "Please click Save, then OK"
|
||||||
python $SKILL_DIR/remote.py prompt moritz_pc "UAC dialog coming - please confirm" --title "Action required"
|
python $SKILL_DIR/remote.py prompt moritz-pc "UAC dialog coming - please confirm" --title "Action required"
|
||||||
|
|
||||||
# Clipboard
|
# Clipboard
|
||||||
python $SKILL_DIR/remote.py clipboard-get moritz_pc
|
python $SKILL_DIR/remote.py clipboard-get moritz-pc
|
||||||
python $SKILL_DIR/remote.py clipboard-set moritz_pc "Text for clipboard"
|
python $SKILL_DIR/remote.py clipboard-set moritz-pc "Text for clipboard"
|
||||||
|
|
||||||
# File transfer
|
# File transfer
|
||||||
python $SKILL_DIR/remote.py upload moritz_pc /tmp/local.txt "C:\Users\Moritz\Desktop\remote.txt"
|
python $SKILL_DIR/remote.py upload moritz-pc /tmp/local.txt "C:\Users\Moritz\Desktop\remote.txt"
|
||||||
python $SKILL_DIR/remote.py download moritz_pc "C:\Users\Moritz\file.txt" /tmp/downloaded.txt
|
python $SKILL_DIR/remote.py download moritz-pc "C:\Users\Moritz\file.txt" /tmp/downloaded.txt
|
||||||
|
|
||||||
# Version: compare relay + remote.py + client commits (are they in sync?)
|
# Version: compare relay + remote.py + client commits (are they in sync?)
|
||||||
python $SKILL_DIR/remote.py version moritz_pc
|
python $SKILL_DIR/remote.py version moritz-pc
|
||||||
|
|
||||||
# Client log (last 100 lines, --lines for more)
|
# Client log (last 100 lines, --lines for more)
|
||||||
python $SKILL_DIR/remote.py logs moritz_pc
|
python $SKILL_DIR/remote.py logs moritz-pc
|
||||||
python $SKILL_DIR/remote.py logs moritz_pc --lines 200
|
python $SKILL_DIR/remote.py logs moritz-pc --lines 200
|
||||||
```
|
```
|
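
The window-label comments in the block above say that labels come from the process name and that duplicates get a number suffix (`chrome`, `chrome2`, `chrome3`). The sketch below illustrates one way such a scheme could work. It is not taken from the client source: the function name `assign_window_labels` is hypothetical, and lowercasing the name and stripping a trailing `.exe` are assumptions.

```python
from collections import Counter


def assign_window_labels(process_names):
    """Map process names to labels, numbering duplicates from 2 (hypothetical helper)."""
    seen = Counter()
    labels = []
    for name in process_names:
        base = name.lower()
        # Assumption: the label drops the Windows executable extension.
        if base.endswith(".exe"):
            base = base[: -len(".exe")]
        seen[base] += 1
        # The first window keeps the bare name; later ones get 2, 3, ...
        labels.append(base if seen[base] == 1 else f"{base}{seen[base]}")
    return labels


if __name__ == "__main__":
    print(assign_window_labels(["chrome.exe", "discord.exe", "chrome.exe"]))
```

Because the suffix depends on enumeration order, a label such as `chrome2` may not refer to the same window across calls, which is one reason to run `windows` before targeting a specific window.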
||||||
|
|
||||||
## Typical Workflow: UI Task
|
## Typical Workflow: UI Task
|
||||||
|
|
||||||
1. `screenshot <device> screen` → look at the screen
|
1. `windows <device>` → find the window label
|
||||||
2. `windows <device>` → find the window label
|
2. `screenshot <device> <window_label>` → look at it
|
||||||
3. `focus <device> <window_label>` → bring it to front
|
3. `focus <device> <window_label>` → bring it to front if needed
|
||||||
4. `exec` → perform the action
|
4. `exec` → perform the action
|
||||||
5. `screenshot <device> <window_label>` → verify result
|
5. `screenshot <device> <window_label>` → verify result
|
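
The five workflow steps above can be sketched as a small Python helper that builds the corresponding `remote.py` invocations. This is only an illustration: the helper name `ui_task_commands` and its parameters are hypothetical, it prints the commands rather than running them, and it assumes the window label is already known, whereas in practice it would be read from the output of step 1.

```python
import shlex


def ui_task_commands(device, window_label, action, remote_py="remote.py"):
    """Return the argv lists for the five workflow steps (hypothetical helper)."""
    base = ["python", remote_py]
    return [
        base + ["windows", device],                   # 1. find the window label
        base + ["screenshot", device, window_label],  # 2. look at the window
        base + ["focus", device, window_label],       # 3. bring it to front if needed
        base + ["exec", device, action],              # 4. perform the action
        base + ["screenshot", device, window_label],  # 5. verify the result
    ]


if __name__ == "__main__":
    # Print shell-quoted commands instead of executing them.
    for argv in ui_task_commands("moritz-pc", "chrome", "Get-Process"):
        print(shlex.join(argv))
```

Each argv list could be passed to `subprocess.run` to execute the step; keeping the builder free of side effects makes the sequence easy to inspect first.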
||||||
|
|
||||||
## ⚠️ Prompt Rule (important!)
|
## ⚠️ Prompt Rule
|
||||||
|
|
||||||
**Never interact with UI blindly.** When you need the user to click something:
|
**Never interact with UI blindly.** When you need the user to click something:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
python $SKILL_DIR/remote.py prompt moritz_pc "Please click [Save], then press OK"
|
python $SKILL_DIR/remote.py prompt moritz-pc "Please click [Save], then press OK"
|
||||||
```
|
```
|
||||||
|
|
||||||
This blocks until the user confirms. Use it whenever manual interaction is needed.
|
This blocks until the user confirms. Use it whenever manual interaction is needed.
|
||||||
|
|
|
||||||