# helios-remote

<p align="center">
  <img src="assets/logo.png" width="150" alt="helios-remote logo" />
</p>

**AI-first remote control tool**: a relay server and a Windows client, both written in Rust. It lets an AI agent (or any HTTP client) take full control of a remote Windows machine through a lightweight WebSocket relay.

## Quick Connect

Run this in PowerShell **as Administrator**:

```powershell
irm https://raw.githubusercontent.com/agent-helios/helios-remote/master/scripts/install.ps1 | iex
```

> **Windows Defender notice:** Defender may flag the executable as unknown.
> If it does, temporarily disable real-time protection before running the installer:
> ```powershell
> Set-MpPreference -DisableRealtimeMonitoring $true
> ```
> Re-enable it after the client has started:
> ```powershell
> Set-MpPreference -DisableRealtimeMonitoring $false
> ```

---

## Architecture

```
helios-remote/
├── crates/
│   ├── common/     # Shared protocol types, WebSocket message definitions
│   ├── server/     # Relay server (REST API + WebSocket hub)
│   └── client/     # Windows client
├── remote.py       # CLI wrapper for the REST API
├── Cargo.toml      # Workspace root
└── README.md
```

### How It Works

```
AI Agent
   │  REST API (X-Api-Key)
   ▼
helios-server  ──WebSocket──  helios-client (Windows)
   │                               │
POST /devices/:label/screenshot    │  Captures screen → base64 PNG
POST /devices/:label/exec          │  Runs command in persistent shell
```

1. The **Windows client** connects to the relay server via WebSocket and sends a `Hello` with its device label.
2. The **AI agent** calls the REST API using the device label to issue commands.
3. The relay server forwards commands to the correct client and streams back responses.
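
The diagram above notes that screenshots come back as a base64-encoded PNG. As a minimal sketch of the agent side, the helper below decodes such a string and checks the PNG signature before returning the bytes. The name `decode_screenshot` is hypothetical, and because the shape of the HTTP response is not documented here, the helper takes the bare base64 string rather than a full response.

```python
import base64

# The 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(b64_png: str) -> bytes:
    """Decode a base64 PNG string and verify it looks like a PNG.

    Hypothetical helper: the response envelope around the base64
    string is an unknown here, so only the string itself is handled.
    """
    data = base64.b64decode(b64_png, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded payload is not a PNG image")
    return data
```

The returned bytes can be written straight to a `.png` file opened in binary mode.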

### Device Labels

Device labels are the **sole identifier** for connected clients. A label must:
- Be **lowercase** only
- Contain **no whitespace**
- Use only the characters `a-z`, `0-9`, `-`, and `_`

Labels are set during first-time client setup. Examples: `moritz_pc`, `work-desktop`, `gaming-rig`
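
The label rules can be expressed as a single regular expression. The following is a sketch of a check an agent could run before calling the API. It is not the client's actual validation code, `is_valid_label` is a hypothetical name, and rejecting the empty string is an assumption (no minimum or maximum length is stated above).

```python
import re

# Mirrors the documented rules: lowercase a-z, digits, '-' and '_' only.
# Assumption: at least one character is required.
LABEL_RE = re.compile(r"[a-z0-9_-]+")


def is_valid_label(label: str) -> bool:
    """Return True if the whole string consists of allowed characters."""
    return LABEL_RE.fullmatch(label) is not None
```

For example, `is_valid_label("work-desktop")` is true, while `is_valid_label("Work Desktop")` is false because of the uppercase letters and the space.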

### Single Instance

Only one helios-remote client can run per device. The client uses a PID-based lock file to enforce this.
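
To illustrate the general idea of a PID-based lock file, here is a small Python sketch. The real client is written in Rust, and its lock file location and liveness check are not described in this README, so everything below is an assumption: the function name `try_acquire_lock`, the file format (a single decimal PID), and the `is_running` callback that the caller supplies to decide whether a PID belongs to a live process.

```python
import os
from pathlib import Path
from typing import Callable


def try_acquire_lock(lock_path: Path, is_running: Callable[[int], bool]) -> bool:
    """Return True if this process now holds the lock.

    Returns False if the lock file names another process that
    `is_running` reports as alive.
    """
    try:
        existing = int(lock_path.read_text().strip())
    except (FileNotFoundError, ValueError):
        existing = None  # no lock file, or contents are not a PID

    if existing is not None and existing != os.getpid() and is_running(existing):
        return False  # another instance is alive

    # Missing or stale lock: claim it by writing our own PID.
    lock_path.write_text(str(os.getpid()))
    return True
```

The check and the write are not atomic in this sketch, so two processes starting at the same instant could both succeed. A production implementation would need an atomic create or an OS-level file lock.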

## Server

### REST API

All endpoints (except `/version` and `/ws`) require the `X-Api-Key` header.

| Method | Path | Description |
|---|---|---|
| `GET` | `/devices` | List all connected devices |
| `POST` | `/devices/:label/screenshot` | Full screen screenshot (base64 PNG) |
| `POST` | `/devices/:label/exec` | Execute a shell command |
| `GET` | `/devices/:label/windows` | List visible windows (with labels) |
| `POST` | `/devices/:label/windows/minimize-all` | Minimize all windows |
| `POST` | `/devices/:label/windows/:window_id/screenshot` | Screenshot a specific window |
| `POST` | `/devices/:label/windows/:window_id/focus` | Focus a window |
| `POST` | `/devices/:label/windows/:window_id/maximize` | Maximize and focus a window |
| `POST` | `/devices/:label/prompt` | Show a MessageBox (blocks until the user clicks OK) |
| `POST` | `/devices/:label/run` | Launch a program (fire-and-forget) |
| `GET` | `/devices/:label/clipboard` | Get clipboard contents |
| `POST` | `/devices/:label/clipboard` | Set clipboard contents |
| `GET` | `/devices/:label/version` | Get client version/commit |
| `POST` | `/devices/:label/upload` | Upload a file to the client |
| `GET` | `/devices/:label/download?path=...` | Download a file from the client |
| `GET` | `/devices/:label/logs` | Fetch client log tail |
| `GET` | `/version` | Server version/commit (no auth) |
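
The download endpoint takes the remote path as a query parameter, and Windows paths contain characters (`:`, `\`, spaces) that must be percent-encoded. The sketch below builds such a URL with the standard library. `download_url` is a hypothetical helper name, and the base URL is whatever address your relay server listens on.

```python
from urllib.parse import quote, urlencode


def download_url(base_url: str, label: str, remote_path: str) -> str:
    """Build the download URL for a file on the given device."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode({"path": remote_path}, quote_via=quote)
    return f"{base_url.rstrip('/')}/devices/{quote(label, safe='')}/download?{query}"
```

With `base_url="http://localhost:3000"`, `label="moritz_pc"`, and the path `C:\Users\me\notes.txt`, the query string becomes `path=C%3A%5CUsers%5Cme%5Cnotes.txt`.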

### WebSocket

Clients connect to `ws://host:3000/ws`. The first message must be a `Hello` with the device label.

### Running the Server

```bash
HELIOS_API_KEY=your-secret-key HELIOS_BIND=0.0.0.0:3000 cargo run -p helios-server
```

| Variable | Default | Description |
|---|---|---|
| `HELIOS_API_KEY` | `dev-secret` | API key for REST endpoints |
| `HELIOS_BIND` | `0.0.0.0:3000` | Listen address |
| `RUST_LOG` | `helios_server=debug` | Log level |
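
To make the defaults in the table concrete, the following Python sketch resolves the two server settings the same way the table describes: use the environment variable if set, otherwise fall back to the default. It only illustrates the lookup order and is not the server's Rust code. `ServerConfig` and `load_config` are hypothetical names.

```python
import os
from typing import Mapping, NamedTuple


class ServerConfig(NamedTuple):
    api_key: str
    bind: str


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Resolve settings from the environment, using the documented defaults."""
    return ServerConfig(
        api_key=env.get("HELIOS_API_KEY", "dev-secret"),
        bind=env.get("HELIOS_BIND", "0.0.0.0:3000"),
    )
```

Since the default key is published in this README, a check such as `load_config().api_key == "dev-secret"` is an easy way to detect a server that was started without its own key.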

### Example API Usage

```bash
# List devices
curl -H "X-Api-Key: your-secret-key" http://localhost:3000/devices

# Take a full-screen screenshot
curl -s -X POST -H "X-Api-Key: your-secret-key" \
  http://localhost:3000/devices/moritz_pc/screenshot

# Run a command
curl -s -X POST -H "X-Api-Key: your-secret-key" \
  -H "Content-Type: application/json" \
  -d '{"command": "whoami"}' \
  http://localhost:3000/devices/moritz_pc/exec
```
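
The same calls can be made from Python using only the standard library. The sketch below builds a request with the `X-Api-Key` header and an optional JSON body, matching the curl examples above. `build_request` is a hypothetical helper, not part of `remote.py`, and the response format is not documented here, so parsing the reply is left to the caller.

```python
import json
from typing import Optional
from urllib.request import Request


def build_request(base_url: str, api_key: str, method: str, path: str,
                  body: Optional[dict] = None) -> Request:
    """Build an authenticated request for the relay server's REST API."""
    headers = {"X-Api-Key": api_key}
    data = None
    if body is not None:
        # JSON-encode the body, as in the curl `exec` example.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = f"{base_url.rstrip('/')}{path}"
    return Request(url, data=data, headers=headers, method=method)
```

Pass the result to `urllib.request.urlopen` to send it, for example `build_request("http://localhost:3000", "your-secret-key", "POST", "/devices/moritz_pc/exec", {"command": "whoami"})`.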

## remote.py CLI

The `remote.py` script provides a CLI wrapper around the REST API.

### Commands

```bash
python remote.py devices                                    # list connected devices
python remote.py screenshot <device> screen                 # full-screen screenshot → /tmp/helios-remote-screenshot.png
python remote.py screenshot <device> google_chrome          # screenshot a specific window by label
python remote.py exec <device> <command...>                 # run shell command (PowerShell)
python remote.py exec <device> --timeout 600 <command...>   # with custom timeout (seconds)
python remote.py windows <device>                           # list visible windows (with labels)
python remote.py focus <device> <window_label>              # focus a window
python remote.py maximize <device> <window_label>           # maximize and focus a window
python remote.py minimize-all <device>                      # minimize all windows
python remote.py prompt <device> "Please click Save"        # ask user to do something manually
python remote.py prompt <device> "message" --title "Title"  # with custom dialog title
python remote.py run <device> <program> [args...]           # launch program (fire-and-forget)
python remote.py clipboard-get <device>                     # get clipboard text
python remote.py clipboard-set <device> <text>              # set clipboard text
python remote.py upload <device> <local> <remote>           # upload file
python remote.py download <device> <remote> <local>         # download file
python remote.py version <device>                           # compare relay/remote.py/client commits
python remote.py logs <device>                              # fetch last 100 lines of client log
python remote.py logs <device> --lines 200                  # custom line count
```

### Window Labels

Windows are identified by human-readable labels (same format as device labels: lowercase, no whitespace). Use `windows` to list them:

```bash
$ python remote.py windows moritz_pc
Label                           Title
----------------------------------------------------------------------
google_chrome                   Google Chrome
discord                         Discord
visual_studio_code              Visual Studio Code
```

Then use the label in `screenshot`, `focus`, or `maximize`:

```bash
python remote.py screenshot moritz_pc google_chrome
python remote.py focus moritz_pc discord
```
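
If a script needs the window labels programmatically, the table printed by `windows` can be parsed. The sketch below assumes the layout shown in the sample output above (a header line, a separator line, then one `label  title` row per window). The real output may differ, and `parse_windows_table` is a hypothetical helper name.

```python
from typing import Dict


def parse_windows_table(output: str) -> Dict[str, str]:
    """Map window label -> window title from the `windows` command output."""
    windows = {}
    for line in output.splitlines()[2:]:  # skip header and separator lines
        parts = line.split(None, 1)       # labels contain no whitespace
        if not parts:
            continue                      # blank line
        label = parts[0]
        title = parts[1].strip() if len(parts) > 1 else ""
        windows[label] = title
    return windows
```

Splitting on the first run of whitespace works because labels never contain whitespace, while titles may.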

## Development

```bash
# Build everything
cargo build

# Run tests
cargo test

# Run server in dev mode
RUST_LOG=debug cargo run -p helios-server
```

## License

MIT