Setup & Installation

Prerequisites

| Requirement | Version | Notes |
|---|---|---|
| Python | 3.10+ | Tested on 3.11 |
| Node.js | 18+ | Required for the Discord bot and for running the frontend in development mode |
| OBS Studio | 28+ | obs-websocket 5.x built in |
| Virtual Audio Cable | any | e.g. VB-Audio CABLE; optional but recommended |

1. Clone & Install

TERMINAL

git clone https://github.com/emqnuele/projectBEA.git
cd projectBEA

Create a virtual environment (recommended):

TERMINAL

python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate

Install Python dependencies:

TERMINAL

pip install -r requirements.txt
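
If the install misbehaves, it can help to confirm that the active interpreter meets the 3.10+ prerequisite and that the virtual environment is really active. The following is a minimal sketch using only the standard library; the helper names are illustrative and not part of ProjectBEA.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version listed under Prerequisites


def meets_minimum(version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def in_virtualenv():
    """Return True inside a venv, where sys.prefix differs from the base prefix."""
    return sys.prefix != sys.base_prefix


print("Python version OK:", meets_minimum(sys.version_info))
print("Virtual environment active:", in_virtualenv())
```

Run it with the same `python` you used for `pip install`, so that both checks describe the interpreter that received the dependencies.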

2. Environment Variables

Create a .env file in the project root:

TERMINAL

# LLM providers — add the ones you plan to use
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIzaSy...
GROQ_API_KEY=gsk_...
GLM_API_KEY=...

# TTS — only if using Orpheus
ORPHEUS_API_KEY=...
ORPHEUS_ENDPOINT=https://model-xxxxxxxx.api.baseten.co/environments/production/predict

# Discord — only if using the Discord skill
DISCORD_TOKEN=...

Security note: For secret fields (*_key, orpheus_endpoint), environment variables always take priority over config.json. If an env var is set and non-empty, the config.json value is silently ignored, even if it is also non-empty. A config.json value is used only as a fallback when the env var is not set (or is empty).
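
The precedence rule in the security note can be sketched as follows. This is an illustration of the described behavior, not the project's actual loader: the function name `resolve_secret` and its signature are hypothetical, and treating an empty env var the same as an unset one is an assumption drawn from the "set and non-empty" wording.

```python
import os


def resolve_secret(env_name, config_value, environ=None):
    """Prefer a non-empty environment variable; otherwise fall back to config."""
    environ = os.environ if environ is None else environ
    env_value = environ.get(env_name, "")
    if env_value:
        # The env var wins even when config.json also holds a value.
        return env_value
    # Assumption: an empty config value means "no secret configured".
    return config_value or None


# Example: the env var shadows the value from config.json.
print(resolve_secret("OPENAI_API_KEY", "from-config", {"OPENAI_API_KEY": "from-env"}))
```

Passing `environ` explicitly keeps the helper easy to try out without touching your real environment.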

3. OBS Studio Setup

1. Open OBS Studio.
2. Go to Tools → WebSocket Server Settings.
3. Enable the WebSocket server (default port: 4455).
4. Set a password and note it down; you'll need it in config.json.
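
After enabling the server, a quick way to confirm that something is listening on the WebSocket port is a plain TCP connection test. The sketch below uses only the standard library and does not authenticate or speak the obs-websocket protocol; it only shows whether the port accepts connections. The function name is illustrative.

```python
import socket


def obs_port_open(host="localhost", port=4455, timeout=2.0):
    """Return True if a TCP connection to the given host and port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


print("OBS WebSocket port reachable:", obs_port_open())
```

A `False` result usually means OBS is not running, the server is disabled, or a different port is configured.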

Recommended OBS Sources

| Source Name | Type | Purpose |
|---|---|---|
| `BeaPNG` | Image Source | Avatar image (idle/talking). Use a media source named `BeaVid` instead if `obs_source_type` is `"media"`. |
| `AIText` | Text (GDI+) | Animated speech bubble |

Set obs_avatar_source and obs_text_source in config.json to match your source names.

4. Avatar Images / Videos

Populate data/pngs/ with avatar assets organized by mood. Each mood folder contains two files: an idle and a talking state.

TERMINAL

data/pngs/
├── normal/
│   ├── idle.mp4       (or .png, .gif)
│   └── talking.mp4
├── angry/
│   ├── idle.mp4
│   └── talking.mp4
└── bored/  cry/  ew/  love/  shock/   (same structure)

The obs_source_type config key controls whether OBS uses an image source (image) or a media source (media).

Then map the files in config.json under the avatar_map key:

TERMINAL

"avatar_map": {
  "normal": { "idle": "data/pngs/normal/idle.mp4", "talking": "data/pngs/normal/talking.mp4" },
  "angry":  { "idle": "data/pngs/angry/idle.mp4",  "talking": "data/pngs/angry/talking.mp4"  }
}
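
Writing the avatar_map by hand gets tedious with seven moods. As a sketch, the map could be generated from the folder layout described above; the helper name `build_avatar_map` is hypothetical, and the code assumes each mood folder holds exactly one `idle.*` and one `talking.*` file.

```python
from pathlib import Path

STATES = ("idle", "talking")


def build_avatar_map(root="data/pngs"):
    """Map each mood folder to its idle/talking files, whatever the extension."""
    avatar_map = {}
    for mood_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        entry = {}
        for state in STATES:
            matches = sorted(mood_dir.glob(f"{state}.*"))
            if matches:
                # as_posix() keeps forward slashes, matching the JSON above.
                entry[state] = matches[0].as_posix()
        if len(entry) == len(STATES):  # skip folders missing either state
            avatar_map[mood_dir.name] = entry
    return avatar_map
```

Calling `json.dumps({"avatar_map": build_avatar_map()}, indent=2)` from the project root would then print a fragment you can paste into config.json.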

5. Audio Device Setup

ProjectBEA outputs audio to a specific device ID. To list available devices:

TERMINAL

python -c "import sounddevice; print(sounddevice.query_devices())"

Find the ID of your virtual cable (e.g. CABLE Input on Windows) and set audio_device_id in config.json.
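
On machines with many audio devices, scanning the printed list by eye is error-prone. The sketch below searches the device list by name instead. It relies on the entries returned by `sounddevice.query_devices()` being dict-like with `name` and `max_output_channels` keys, and on a device's ID being its position in that list; the helper names are illustrative.

```python
def find_output_device(devices, name_fragment):
    """Return the index of the first output-capable device whose name matches."""
    fragment = name_fragment.lower()
    for index, device in enumerate(devices):
        if device["max_output_channels"] > 0 and fragment in device["name"].lower():
            return index
    return None


def find_with_sounddevice(name_fragment="CABLE Input"):
    """Query real hardware; requires the sounddevice package."""
    import sounddevice  # imported lazily so the helper above has no dependencies

    return find_output_device(sounddevice.query_devices(), name_fragment)
```

Only the first match is returned. Some systems list the same physical device once per host API, so check the full list if the first hit does not produce sound.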

6. Discord Bot Setup (optional)

Install Node.js dependencies for the bot:

TERMINAL

cd src/modules/skills/discord/bot
npm install

Set your Discord token either as DISCORD_TOKEN in .env or in config.json under skills.discord.token.

In config.json, also set:

- skills.discord.enabled: true
- skills.discord.target_channel: the name of the voice channel where Bea should listen and speak

Discord Skill Details →

7. Kokoro TTS Setup (optional)

Kokoro runs entirely locally — no API key required.

The engine automatically downloads the model files on first launch if they are missing:

- kokoro-v0_19.onnx (~95 MB)
- voices.bin (~30 MB)

No manual steps are needed: set tts_provider to kokoro in config.json and start the engine. The download happens once, and the files are cached in the project root.

To store the files elsewhere, set kokoro_model and kokoro_voices_file in config.json to the new paths.
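
To see whether the first launch will trigger a download, you can check for the two files yourself. This is a small sketch that assumes the default location (the project root); the function name is illustrative and the check is independent of the engine's own download logic.

```python
from pathlib import Path

KOKORO_FILES = ("kokoro-v0_19.onnx", "voices.bin")  # file names listed above


def missing_kokoro_files(directory="."):
    """Return the Kokoro files that are not yet present in `directory`."""
    base = Path(directory)
    return [name for name in KOKORO_FILES if not (base / name).is_file()]


print("Files still to be downloaded:", missing_kokoro_files())
```

An empty list means both files are already cached and no download should be needed.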

8. Orpheus TTS Setup (optional)

Orpheus is a high-quality expressive voice API hosted on Baseten. It requires a manual deployment step before use:

1. Create an account at baseten.co.
2. From the Baseten model library, find and deploy the Orpheus TTS model to your workspace.
3. Wait for the deployment to become active (a few minutes).
4. Copy the Endpoint URL shown in your deployment dashboard (format: https://model-xxxxxxxx.api.baseten.co/environments/production/predict).
5. Copy your API key from the Baseten account settings.
6. Add both to your .env:

TERMINAL

ORPHEUS_API_KEY=your-baseten-api-key
ORPHEUS_ENDPOINT=https://model-xxxxxxxx.api.baseten.co/environments/production/predict

Security note: ORPHEUS_ENDPOINT is treated as a secret — it is read from the environment variable and is never saved to config.json, even if set via the web dashboard.

Then in config.json set tts_provider to orpheus and orpheus_voice to one of: zoe, tara, leo, leah.
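
Because Baseten bills per inference, it may be worth validating the Orpheus settings before the first request. The sketch below checks the two environment variables and the voice name against the values listed in this section; the function name and the message wording are hypothetical.

```python
import os

ORPHEUS_VOICES = {"zoe", "tara", "leo", "leah"}  # voices listed above
REQUIRED_ENV = ("ORPHEUS_API_KEY", "ORPHEUS_ENDPOINT")


def orpheus_problems(environ, voice):
    """Return a list of human-readable problems; empty means the setup looks fine."""
    problems = [f"{name} is not set" for name in REQUIRED_ENV if not environ.get(name)]
    if voice not in ORPHEUS_VOICES:
        problems.append(f"unknown orpheus_voice: {voice!r}")
    return problems


# Example: check the current environment with the voice you plan to use.
print(orpheus_problems(os.environ, "tara"))
```

This only inspects local settings. It does not contact Baseten, so it cannot tell whether the deployment is active.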

Note: Baseten bills per inference. Orpheus is the most expensive TTS option — use EdgeTTS or Kokoro for testing.

Running the Frontend in Development Mode

TERMINAL

cd src/web/frontend
npm install
npm run dev

The Vite dev server starts at http://localhost:5173. The frontend makes direct API calls to http://localhost:8000 — no proxy is configured in vite.config.js. If you change the backend port, update the API base URL in the frontend source accordingly.

Running the Engine

CLI mode (interactive terminal)

TERMINAL

python main.py

Type messages at the You > prompt. Type exit to quit.

Web Dashboard mode

TERMINAL

python main.py --web

Starts the FastAPI server at http://localhost:8000. The React frontend is served from the same port at /.

CLI argument overrides

Any config value can be overridden at launch without editing config.json:

TERMINAL

python main.py \
  --llm-provider gemini \
  --gemini-model gemini-2.0-flash \
  --tts-provider kokoro \
  --device-id 22 \
  --web
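
Conceptually, the overrides above layer command-line values on top of whatever config.json provides. The following argparse sketch illustrates that idea only and is not ProjectBEA's actual parser: the flag names are taken from the command above, while the mapping to config keys (including `llm_provider` and `gemini_model`) is an assumption.

```python
import argparse


def parse_overrides(argv):
    """Parse the launch flags shown above; unspecified options stay None."""
    parser = argparse.ArgumentParser(description="Sketch of launch-time overrides")
    parser.add_argument("--llm-provider")
    parser.add_argument("--gemini-model")
    parser.add_argument("--tts-provider")
    parser.add_argument("--device-id", type=int)
    parser.add_argument("--web", action="store_true")
    return parser.parse_args(argv)


# Hypothetical mapping from argparse destinations to config.json keys.
FLAG_TO_CONFIG = {
    "llm_provider": "llm_provider",
    "gemini_model": "gemini_model",
    "tts_provider": "tts_provider",
    "device_id": "audio_device_id",
}


def apply_overrides(config, args):
    """Return a copy of `config` where every flag that was given wins."""
    merged = dict(config)
    for dest, key in FLAG_TO_CONFIG.items():
        value = getattr(args, dest)
        if value is not None:
            merged[key] = value
    return merged
```

Flags that are omitted leave the corresponding config.json value untouched, which is why each option defaults to `None` rather than to a concrete value.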

Full CLI & Config Reference →

Troubleshooting

| Problem | Solution |
|---|---|
| `OBS not connected` warning on start | OBS is not running or the WebSocket credentials are wrong. The engine continues without it. |
| `No audio device` error | Run the sounddevice query from section 5 and update `audio_device_id` |
| Discord bot fails with `node_modules not found` | Run `npm install` in `src/modules/skills/discord/bot/` |
| Memory skill disabled on start | `OPENAI_API_KEY` is not set; ChromaDB embedding requires it |
| `GEMINI_API_KEY is missing` | Set the key in `.env` or pass `--gemini-key` at launch |
| Skills silently start disabled despite `"enabled": true` in `config.json` | Expected: all non-memory skills are force-disabled at every cold start. Enable them at runtime via the web dashboard or `POST /skills/{name}/toggle`. |
| OBS avatar source not updating after config migration | If your `config.json` still contains the old key `obs_image_source`, it is silently renamed to `obs_avatar_source` by `load_from_file()`. Delete the old key from your `config.json` and re-save to avoid ambiguity. |
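
The runtime toggle mentioned in the table can also be scripted. The sketch below builds the POST request with the standard library. The endpoint path and the default port come from this guide; whether the endpoint expects a request body or returns a particular payload is not stated here, so the request is sent without one. The function name is illustrative.

```python
import urllib.request


def build_toggle_request(skill_name, base_url="http://localhost:8000"):
    """Build a POST request for /skills/{name}/toggle without sending it."""
    url = f"{base_url.rstrip('/')}/skills/{skill_name}/toggle"
    return urllib.request.Request(url, method="POST")


request = build_toggle_request("discord")
print(request.get_method(), request.full_url)
```

With the engine running in web mode, `urllib.request.urlopen(request)` would send the request.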