— AI Agents & Image Generation

Generate Images with OpenClaw

Set up a complete image generation pipeline using OpenClaw connected to fal.ai or OpenAI — send a prompt from Telegram, get an image back.

Table of Contents

01. What You Need
02. Step 1: Set Up Your VPS Server
03. Step 2: Install OpenClaw
04. Step 3: Create and Connect a Telegram Bot
05. Step 4: Give Your Agent an Identity and Rules
06. Step 5: Prepare Your Image Generation API Keys
07. Step 6: Install the Image Generation Skill
08. Step 7: Activate the Skill in Your Agent
09. Switching Between Models
10. Cost Comparison
11. Example Prompts
12. Wrapping Up

OpenClaw is an AI agent framework that runs on a VPS server and lets you automate multi-step workflows, including generating images directly through a chat interface. In this guide, you will learn how to set up a complete image generation pipeline using OpenClaw connected to either fal.ai or the OpenAI image API. An install script handles the skill configuration, so you do not have to edit config files by hand.

By the end of this guide, your OpenClaw agent will be able to accept a simple text prompt from Telegram and return a generated image, saved both to your VPS server and your chosen image platform.

Tip: If you already have OpenClaw installed, skip ahead to the "Prepare Your Image Generation API Keys" section.

WHAT YOU NEED

Prerequisites

▸A Contabo VPS server (or any Linux VPS with OpenClaw installed)
▸A Telegram account to create a bot
▸An API key from fal.ai and/or OpenAI
▸Windows PowerShell or macOS Terminal to connect via SSH

STEP 1

Set Up Your VPS Server

OpenClaw runs on a Linux server. The most affordable option recommended here is Contabo, which offers VPS plans starting at around €4/month. For most use cases, the plan with 8 GB RAM, 4 CPU cores, and 75 GB NVMe storage is more than enough to run OpenClaw and a full image generation setup.

Creating Your Contabo VPS

When creating your server on Contabo:

▸Select the European Union as your server location. Since Contabo is a German company, the EU location comes at no extra charge. Other locations may add a small fee, but latency is generally not a concern for OpenClaw workflows.
▸Set a strong root password during setup. This is critical — you will not be able to recover it if lost. Save it securely before proceeding.
▸Click through the sign-up form, fill in your personal details, and complete payment.

Connecting to Your VPS

Within a few minutes of purchase, Contabo will send a confirmation email. This email contains the IP address and username for your server. The password is the one you set during signup — it will not be included in the email.

To connect to the VPS, use SSH from either macOS Terminal or Windows PowerShell:

ssh root@YOUR_SERVER_IP

On first connection, your terminal will ask you to confirm the server fingerprint. Type yes and press Enter, then enter your root password when prompted. Note that password characters are not displayed as you type, in PowerShell or in macOS Terminal, so the prompt appears to stay empty. A practical workaround is to type your password elsewhere, copy it, then paste it into the terminal (in PowerShell, right-click to paste).

Tip: Right-clicking in Windows PowerShell automatically pastes clipboard content.
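
Once you are logged in, it can be worth confirming that the server matches the plan you ordered. The following is a minimal sketch, assuming a standard Linux VPS with GNU coreutils; the variable names are illustrative. It reads the CPU core count, total memory, and root disk size, which should land near 4 cores, 8 GB, and 75 GB if you chose the plan described earlier.

```shell
# Number of CPU cores available to the server.
cores=$(nproc)

# Total memory in MB, read from the kernel's memory summary.
mem_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)

# Size of the root filesystem in GB (GNU df; digits only).
disk_gb=$(df -BG --output=size / | tail -n 1 | tr -dc '0-9')

echo "CPU cores:      $cores"
echo "RAM (MB):       $mem_mb"
echo "Root disk (GB): $disk_gb"
```

The reported memory is usually slightly below the advertised figure because the kernel reserves part of it.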

STEP 2

Install OpenClaw

If you selected the OpenClaw image when provisioning your Contabo server, the setup script will launch automatically the first time you connect. If it does not start automatically, you can install OpenClaw manually by running the curl install command from the official documentation at openclaw.ai.

During the setup wizard:

▸Choose Quick Start.
▸Select your preferred AI language model. If you have a paid ChatGPT subscription, you can use OpenAI Codex (which includes access to the latest models like o4) at no extra API cost. If you prefer a flat monthly cost, MiniMax is a solid budget alternative at around $20–40/month for near-unlimited usage.
▸For the messaging interface, select Telegram. It is the simplest and lightest option.
▸Skip the web search integration for now unless you want your agent to have internet access. Brave Search is the cheapest option when you do need it.
▸Enable Session Memory under hooks. This is highly recommended. Without it, your agent will lose context when the context window fills up and OpenClaw triggers an auto-compaction.

Tip: Session Memory ensures your agent retains context across compactions, preventing it from forgetting everything mid-task.

STEP 3

Create and Connect a Telegram Bot

OpenClaw communicates through a messaging platform. Here is how to set up a Telegram bot:

▸Open Telegram and search for BotFather (look for the verified checkmark).
▸Send the command /newbot to create a new bot.
▸Give your bot a display name (e.g. Komputer Mechanic Image) and a username. The username must end in bot.
▸BotFather will return an HTTP API token. Copy this token and paste it into the OpenClaw setup prompt.

Once the token is submitted, BotFather also provides a direct link to your new bot. Click it, press Start, and the bot will give you a pairing code. Run the following command on your server (outside the terminal user interface) to link it:

openclaw gateway pair TELEGRAM_PAIRING_CODE

When you see Approved: Telegram send [your ID], the connection is established. Go to Telegram, send hi to your bot, and you should get a response from the agent.
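
If the bot never responds, one thing to rule out is a mistyped token. The sketch below is not part of OpenClaw; it calls the Telegram Bot API's getMe method directly with curl. TELEGRAM_BOT_TOKEN is an assumed variable name that you would export yourself, and telegram_getme_url is a helper name made up for this example.

```shell
# Build the Bot API URL for the getMe method from a bot token.
telegram_getme_url() {
  printf 'https://api.telegram.org/bot%s/getMe\n' "$1"
}

# ASSUMPTION: you exported TELEGRAM_BOT_TOKEN yourself before running this.
if [ -n "${TELEGRAM_BOT_TOKEN:-}" ]; then
  # A valid token returns JSON containing "ok":true and the bot's username.
  curl -s "$(telegram_getme_url "$TELEGRAM_BOT_TOKEN")"
  echo
fi
```

A response containing "ok":false with error code 401 means Telegram did not recognise the token, in which case copy it again from BotFather.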

Verifying Your Setup

Before diving into image generation, run a quick sanity check using the OpenClaw terminal user interface:

openclaw tui

Send a message like hi. If the agent responds, your LLM and gateway are correctly configured. Exit the TUI with /exit and continue the rest of the work from Telegram.

STEP 4

Give Your Agent an Identity and Rules

A fresh OpenClaw agent has no identity or working context. Before generating any images, send two prompts from Telegram to set this up.

Prompt 1 — Introduce Yourself and Name Your Agent

Send this from Telegram, replacing [Your Name] with your name and keeping Bill or choosing your own agent name:

My name is [Your Name]. I am a content creator and I will be working with you to generate images for my projects. You will refer to me as [Your Name]. Your name is Bill.

The agent will write this to its user.md and identity.md files so it remembers both your identity and its own across sessions, even after a context compaction.

Prompt 2 — Set Your Working Rules

Once the agent has acknowledged its identity, send the following five rules. These define exactly how your agent should handle every image generation request going forward. Copy and paste the entire block below into Telegram as a single message:

If I give you a subject or idea but no explicit prompt, do NOT generate immediately. Instead:
1. Draft the prompt you intend to use
2. Show it to me clearly labelled as "Proposed Prompt:"
3. Wait for my approval before generating

Example:
Larry: "Generate an image of a futuristic city"
Bill: "Proposed Prompt: a sprawling futuristic megacity at night, neon-lit skyscrapers, flying vehicles, rain-soaked streets reflecting light, cyberpunk aesthetic, cinematic wide shot, photorealistic — shall I go ahead Larry?"
[waits for approval before generating]

### Rule 3 — Prompt Revisions
If I want to tweak a proposed prompt, update it and show me the revised version again before generating.

### Rule 4 — No Unsolicited Variations
Do not generate alternate versions unless I explicitly ask for them.

### Rule 5 — One Generation at a Time
Generate one image per request unless I specify a quantity.

The agent will confirm it has understood and stored them. From this point on, it will follow these conventions for every image request.

Tip: These rules are written to the agent's memory files, so you only need to set them once. They persist across sessions and after context compactions.

STEP 5

Prepare Your Image Generation API Keys

This setup supports two image generation backends. You can install one or both.

Option A: fal.ai

fal.ai gives you access to a wide range of state-of-the-art image generation models under a single API key, including Flux 2 Pro, FLUX.1 [dev], and several other high-end models. Some models support image editing in addition to generation. Pricing is very affordable — around $0.04 per image for Flux 2 Pro, meaning $5 in credits can produce well over 100 images.

To set up:

▸Go to fal.ai and create an account.
▸Add credits (even $5 is plenty to start).
▸Navigate to API Keys in the sidebar, click Add Key, give it a name, and copy the key immediately. You will not be able to view it again after closing that window.

Option B: OpenAI

OpenAI's image generation API (currently gpt-image-1) produces high-quality results but is more expensive per image than fal.ai. It is a good choice if you already have OpenAI API credits or want the best possible image quality.

To set up:

▸Go to platform.openai.com and log in.
▸Navigate to Billing to check or top up your credits.
▸Click API Keys, create a new key scoped to your project, and copy it securely.
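
Before pasting an OpenAI key into the installer, you may want to confirm that it is accepted. This is a minimal sketch, not something the install script requires: it sends one authenticated request to the OpenAI models endpoint and reports the HTTP status. OPENAI_API_KEY is the conventional variable name and is assumed to be exported by you; describe_status is a helper invented for this example. It does not cover fal.ai keys.

```shell
# Map an HTTP status code from the API to a short verdict.
describe_status() {
  case "$1" in
    200) echo "key accepted" ;;
    401) echo "key rejected" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# ASSUMPTION: you exported OPENAI_API_KEY yourself before running this.
if [ -n "${OPENAI_API_KEY:-}" ]; then
  # Discard the response body and keep only the HTTP status code.
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    https://api.openai.com/v1/models)
  describe_status "$status"
fi
```

A key with restricted permissions may return a status other than 200 here even though it can still generate images, so treat anything other than 401 as inconclusive.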

STEP 6

Install the Image Generation Skill

The image generation skill is a pre-built SKILL.md package available from the Komputer Mechanic GitHub repository. An automated bash script handles the entire installation — you just select your model and paste in your API key.

SSH into your VPS and run this command:

bash <(curl -s https://raw.githubusercontent.com/komputermechanic/openclaw-image-generation/main/install-image-generation.sh)

The script will:

▸Ask whether you are doing a fresh install, switching models, updating your API key, or uninstalling.
▸Present the available model options: OpenAI (gpt-image-1) or fal.ai (with a selection of models including Flux 2 Pro, FLUX Nano, FLUX Banana 2 Pro, and others).
▸Prompt you to paste in your API key.
▸Update your openclaw.json configuration automatically.
▸Restart the gateway to apply the changes.

For a first install, select option 1 (fresh install). Choose your backend — for example, to start with OpenAI, press 1 and select gpt-image-1. Paste your API key and press Enter. The script will confirm once the skill is active and the gateway has restarted.

Tip: You can re-run the install script at any time to switch models or update your API key. Just run the same command again and choose option 2 (switch model).
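
Because the script edits openclaw.json for you, it may be sensible to keep a copy of the file before each run. The sketch below is one way to do that and is not part of the installer. The path ~/.openclaw/openclaw.json is an assumption about where your config lives, OPENCLAW_CONFIG is a made-up override variable, and backup_config is a helper written for this example.

```shell
# Copy a config file to a timestamped backup beside it and print the backup path.
backup_config() {
  src=$1
  [ -f "$src" ] || { echo "no config at $src" >&2; return 1; }
  dest="$src.bak.$(date +%Y%m%d-%H%M%S)"
  cp -p "$src" "$dest" && echo "$dest"
}

# ASSUMPTION: default config location; set OPENCLAW_CONFIG if yours differs.
CONFIG="${OPENCLAW_CONFIG:-$HOME/.openclaw/openclaw.json}"
if [ -f "$CONFIG" ]; then
  backup_config "$CONFIG"
fi
```

If a model switch leaves the gateway in a bad state, you can copy the printed backup file over openclaw.json and restart.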

STEP 7

Activate the Skill in Your Agent

After the skill is installed, you need to tell your agent about it. The install script generates a ready-to-use prompt for you. It will look something like this:

We have a new image generation skill called OpenAI Image. The skill is stored at [path]. Read the skill to confirm you have the file location, confirm you understand it, and when done, run a smoke test to generate an image using this prompt: [test prompt].

Paste this into your Telegram agent. It will read the SKILL.md file, confirm it understands the workflow, and immediately run a test image to verify everything works end to end.

Generated images are saved to a folder on your VPS server. The agent will also return the image URL or file reference in the chat response.
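
To check from the server side that the smoke test produced a file, you could list the newest images in the output folder. This is a sketch under assumptions: the guide does not state the folder path, so IMAGE_DIR is a placeholder variable you would set to the path the agent reports, and latest_images is a helper written for this example. It relies on GNU find, which is standard on Linux servers.

```shell
# Print the newest image files in a directory, most recent first.
# Usage: latest_images DIR [COUNT]   (COUNT defaults to 5)
latest_images() {
  dir=$1
  count=${2:-5}
  find "$dir" -maxdepth 1 -type f \
    \( -name '*.png' -o -name '*.jpg' -o -name '*.webp' \) \
    -printf '%T@ %p\n' | sort -rn | head -n "$count" | cut -d' ' -f2-
}

# ASSUMPTION: IMAGE_DIR is set by you to the folder the agent saves into.
if [ -n "${IMAGE_DIR:-}" ]; then
  latest_images "$IMAGE_DIR"
fi
```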

SWITCHING MODELS

Switching Between Models

You can switch between OpenAI and fal.ai at any time by re-running the install script and selecting option 2 (switch model). For example, to switch from OpenAI to Flux 2 Pro on fal.ai:

▸Run the install script again.
▸Select option 2 (switch model).
▸Select fal.ai (option 1), then choose Flux 2 Pro (option 5 in the model list).
▸Paste your fal.ai API key.
▸The script updates the config and restarts the gateway automatically.

Then tell your agent in Telegram: "I just switched your image generation skill to Flux 2 Pro. The updated skill is stored at [path]. Run a smoke test." The agent will re-read the skill and confirm with a new test image.

Viewing Your Generated Images on fal.ai

When using fal.ai, generated images are stored in two places: your VPS server and the fal.ai platform itself. You can view your full image history by logging into fal.ai, going to Generate, and clicking Recent History. This is useful for reviewing, comparing, or downloading previously generated images.

COST COMPARISON

Cost Comparison

▸OpenAI gpt-image-1 — Higher quality, higher cost per image. Good choice if quality is the priority or you already have API credits.
▸fal.ai Flux 2 Pro — Approximately $0.04 per image. High quality with support for image editing as well as generation. Excellent value for high-volume use.
▸fal.ai lighter models — FLUX Nano and others for even lower cost when speed and volume matter more than maximum quality.

Tip: Start with $5 on fal.ai — that is well over 100 Flux 2 Pro images. Add more credits as needed.
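
The budget arithmetic is simple to reproduce for other prices. The following sketch uses the approximate $0.04 figure quoted above and works in cents so that shell integer arithmetic is enough; images_for_budget is a helper name chosen for this example, and actual per-image prices depend on the model and settings.

```shell
# Print how many whole images a budget buys. Both arguments are in cents.
images_for_budget() {
  budget_cents=$1
  price_cents=$2
  echo $(( budget_cents / price_cents ))
}

# $5.00 in credits at roughly $0.04 per image.
images_for_budget 500 4   # prints 125
```

Swap in the per-image price of another model to compare, for example images_for_budget 500 8 for a model that costs about twice as much.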

EXAMPLE PROMPTS

Example Prompts to Get You Started

Astronaut Portrait

Generate: Close-up portrait of a space explorer inside a reflective helmet visor, the visor reflecting a spectacular colorful nebula in vivid magenta teal gold and electric purple, soft warm light illuminating the explorer's face through the glass showing calm focused eyes and subtle freckles, the suit is pristine white with small scratches showing wear, tiny water droplets floating inside the helmet catching prismatic rainbow light, the background through the visor shows swirling cosmic clouds of pink blue and orange gas with distant bright stars, lens flare from a nearby sun casting warm amber streaks across the visor, extreme macro detail on the visor reflections and suit fabric texture, intimate emotional portrait, rich vibrant color palette, fine art portraiture, 8k ultra detailed.

Portrait Photography

Generate: Half body photograph of a beautiful woman with long wavy dark brown hair, green eyes, flawless skin, wearing an elegant black off-shoulder dress, warm golden sunset light on her face and hair, dark background, shot on Canon EOS R5 85mm lens, shallow depth of field, studio lighting, with a subtle painterly post-processing effect, ultra realistic skin detail, 8k.

Lifestyle / Power Couple

Generate: Professional photograph of a handsome tall man in a fitted dark suit sitting relaxed on a luxurious velvet emerald green couch, a beautiful woman in an elegant black dress with long flowing dark hair sitting gracefully on the armrest beside him leaning slightly toward him, both looking at the camera with calm confident expressions, warm golden ambient lighting from the side, stylish modern living room with soft blurred background, rich warm tones, high end lifestyle magazine aesthetic, realistic skin and fabric detail, 8k.

Notice the level of detail in each prompt — lighting conditions, camera specs, texture descriptions, color palette, and mood cues. The more specific your prompt, the closer the output will be to your vision. These are good templates to adapt for your own projects.

Tip: If you are not sure how to phrase a prompt, use Rule 2: just describe the scene to your agent and let it draft the prompt for you. Approve or tweak it, then generate.

WRAPPING UP

You're All Set

With this setup, you now have a fully automated image generation workflow running through your OpenClaw agent. Send a prompt from Telegram, get an image back — no manual API calls, no browser-based tools. The entire flow is controlled by your agent using the installed skill.

A few things to keep in mind going forward:

▸The skill will be updated over time to include new models as they are released. Re-run the install script to pull in the latest version.
▸If you want to integrate additional models not currently in the list, you can ask your agent to help you extend the skill file manually.
▸The same framework can be extended to support other automation workflows beyond image generation.

If you found this guide useful, check out the full video walkthrough linked above. Subscribe to the Komputer Mechanic YouTube channel for more OpenClaw tutorials. More tutorials at komputermechanic.com/tutorials.

If you run into any trouble, feel free to reach out — I'm happy to help you work through the setup.