Pomelo Flow App
Tech Stack
- ElectronJS
- React
- TypeScript
- Node.js
- NestJS
- MySQL
- Jest
Introduction
Pomelo Flow is an advanced desktop automation platform engineered to bridge the gap between simple macro recorders and enterprise-grade RPA (Robotic Process Automation) solutions. It empowers users to construct sophisticated automation logic through a visual, node-based interface, enabling the orchestration of tasks across native Windows applications, web browsers, and background system processes.
Built with performance and scalability in mind, Pomelo Flow uses Electron as a modern cross-platform foundation and calls low-level Win32 APIs through a Foreign Function Interface (FFI) to reach system-level capabilities that standard web technologies cannot.

Comprehensive Feature Set
1. Professional Script Editor
The heart of Pomelo Flow is its powerful visual editor, built on top of ReactFlow, designed for creating complex logic without writing code.
- Drag-and-Drop Interface: Intuitive canvas for designing workflows.
- Rich Node Library: More than 40 built-in node types covering triggers, actions, logic, and data processing.
- Smart Connection Validation: Prevents invalid logic flows in real-time.
- Minimap & Viewport Controls: Easy navigation for large, complex scripts.
- History Management: Robust Undo/Redo capabilities powered by zundo.
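
As a rough illustration of the connection validation mentioned above, the sketch below rejects a candidate edge before it is added to the graph. The `FlowNode` and `Connection` shapes, the two port kinds, and the three rules (no self-connections, matching port kinds, no duplicates) are assumptions made for this example, not Pomelo Flow's actual rule set.

```typescript
// Hypothetical shapes; the editor's real node and edge types are not shown in this document.
type PortKind = 'flow' | 'data';

interface FlowNode {
  id: string;
  inputs: Record<string, PortKind>;  // port name -> kind
  outputs: Record<string, PortKind>; // port name -> kind
}

interface Connection {
  source: string;
  sourcePort: string;
  target: string;
  targetPort: string;
}

// Returns null when the connection is allowed, otherwise a human-readable reason.
function validateConnection(
  nodes: Map<string, FlowNode>,
  existing: Connection[],
  candidate: Connection,
): string | null {
  const source = nodes.get(candidate.source);
  const target = nodes.get(candidate.target);
  if (!source || !target) return 'unknown node';
  if (source.id === target.id) return 'a node cannot connect to itself';

  const outKind = source.outputs[candidate.sourcePort];
  const inKind = target.inputs[candidate.targetPort];
  if (outKind === undefined || inKind === undefined) return 'unknown port';
  if (outKind !== inKind) return `cannot connect ${outKind} output to ${inKind} input`;

  const duplicate = existing.some(
    (c) =>
      c.source === candidate.source &&
      c.sourcePort === candidate.sourcePort &&
      c.target === candidate.target &&
      c.targetPort === candidate.targetPort,
  );
  return duplicate ? 'connection already exists' : null;
}
```

A check of this kind would typically be called from the canvas's connection callback, so an invalid edge is refused while the user is still dragging it.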

2. Advanced Automation Capabilities
Desktop Interaction
Directly control the Windows desktop environment with native precision.
- Input Simulation: Click, Double Click, Type, Key Press (with modifiers), Drag & Drop, Scroll.
- Visual Recognition: Find Image action using Computer Vision (pixelmatch, jimp) to locate UI elements dynamically.
- System Triggers: Execute scripts via global Hotkeys or Auto-start configurations.
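
To make the Find Image idea concrete, here is a minimal brute-force template search over raw RGBA buffers. It does not use the pixelmatch or jimp APIs; the `RgbaImage` shape, the function names, and the tolerance defaults are all assumptions for illustration, and a production implementation would likely be considerably faster.

```typescript
// Raw RGBA image: 4 bytes per pixel, row-major. Hypothetical shape for this sketch.
interface RgbaImage {
  width: number;
  height: number;
  data: Uint8Array;
}

// Counts template pixels that differ from the screenshot by more than `tolerance`
// in any of R, G, B when the template's top-left corner sits at (x, y).
// Stops early once the count exceeds `maxMismatch`.
function mismatchAt(
  shot: RgbaImage, tpl: RgbaImage, x: number, y: number,
  tolerance: number, maxMismatch: number,
): number {
  let mismatched = 0;
  for (let ty = 0; ty < tpl.height; ty++) {
    for (let tx = 0; tx < tpl.width; tx++) {
      const s = ((y + ty) * shot.width + (x + tx)) * 4;
      const t = (ty * tpl.width + tx) * 4;
      for (let c = 0; c < 3; c++) { // alpha channel is ignored
        if (Math.abs(shot.data[s + c] - tpl.data[t + c]) > tolerance) {
          mismatched++;
          break;
        }
      }
      if (mismatched > maxMismatch) return mismatched;
    }
  }
  return mismatched;
}

// Returns the top-left corner of the first acceptable match in scan order, or null.
function findTemplate(
  shot: RgbaImage, tpl: RgbaImage,
  tolerance = 16, maxMismatchRatio = 0.02,
): { x: number; y: number } | null {
  const maxMismatch = Math.floor(tpl.width * tpl.height * maxMismatchRatio);
  for (let y = 0; y + tpl.height <= shot.height; y++) {
    for (let x = 0; x + tpl.width <= shot.width; x++) {
      if (mismatchAt(shot, tpl, x, y, tolerance, maxMismatch) <= maxMismatch) {
        return { x, y };
      }
    }
  }
  return null;
}
```

The returned coordinates could then feed a Click action, offset by half the template size to hit the element's center.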
Window Management Mastery
Deep integration with user32.dll allows for total control over application windows.
- Window Intelligence: Real-time enumeration of all open windows with PID, Handle, and Title details.
- Manipulation: Programmatically Focus, Move, Resize, Minimize, Maximize, or Close any window.
- Live Previews: Render live thumbnails of running applications directly in the dashboard.

Web Automation (Chrome)
Dedicated module for controlling Google Chrome instances, enabling advanced web scraping and interaction.
- Profile Management: Launch specific Google Chrome profiles to maintain separate sessions/cookies.
- Browser Actions: Goto URL, Find Element, Click Element, Type in Input, Scroll Page.
- DOM Integration: Attach to running instances to automate already-open tabs.
Logic & Data Processing
Unlike simple macro tools, Pomelo Flow supports full programming constructs.
- Control Flow: If/Else conditions, Loop (For/While) iterations, Break statements, and Parallel execution paths.
- Variable System: Global Set/Get Variable nodes to pass data between execution steps.
- Data Operations:
  - String: Concat, Replace, Length, Contains, Template Construction.
  - Math: Basic Arithmetic, Random Number Generation, Rounding.
  - Logic: Compare values (Equals, Greater Than, etc.).
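
The variable system and Template Construction operation can be pictured with a small sketch. The `{{name}}` placeholder syntax, the `Variables` type, and the `renderTemplate` function are assumptions chosen for this example; the document does not specify the syntax Pomelo Flow actually uses.

```typescript
// Hypothetical variable store: values set by Set Variable nodes, read by later nodes.
type Variables = Map<string, string | number | boolean>;

// Replaces {{name}} placeholders with variable values.
// Placeholders that name an unknown variable are left untouched.
function renderTemplate(template: string, vars: Variables): string {
  return template.replace(
    /\{\{\s*([A-Za-z_]\w*)\s*\}\}/g,
    (match: string, name: string) => (vars.has(name) ? String(vars.get(name)) : match),
  );
}

// Usage: a Set Variable node writes values, a String Template node reads them.
const exampleVars: Variables = new Map<string, string | number | boolean>([
  ['user', 'Ada'],
  ['count', 3],
]);
const rendered = renderTemplate('Hello {{user}}, you have {{ count }} new items', exampleVars);
```

Leaving unknown placeholders in place, rather than replacing them with an empty string, makes a misspelled variable name visible in the output.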
3. Dedicated Managers
Template Manager
A centralized hub for managing visual assets used in image recognition.
- Screen Capture: Built-in snipping tool to capture and save UI regions.
- Organization: Folder-based structure to categorize templates by application or workflow.

Google Profile Manager
A dedicated utility to manage browser contexts.
- Profile Isolation: Create and manage distinct Chrome User Data directories.
- Quick Launch: Open specific profiles directly from the dashboard for manual or automated use.
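
Profile isolation in Chrome is usually achieved through command-line switches. The sketch below builds an argument list from a profile description; `ChromeProfile` and `buildChromeArgs` are hypothetical names, and the paths in the usage note are placeholders. The switches themselves (`--user-data-dir`, `--profile-directory`, `--remote-debugging-port`, `--no-first-run`) are standard Chrome flags.

```typescript
// Hypothetical description of one isolated browser context.
interface ChromeProfile {
  userDataDir: string;          // root "User Data" directory for this context
  profileDirectory?: string;    // e.g. "Default" or "Profile 1" inside userDataDir
  remoteDebuggingPort?: number; // lets an automation client attach to the instance later
}

// Builds the command-line arguments for launching Chrome with the given profile.
function buildChromeArgs(profile: ChromeProfile, startUrl?: string): string[] {
  const args = [`--user-data-dir=${profile.userDataDir}`];
  if (profile.profileDirectory) {
    args.push(`--profile-directory=${profile.profileDirectory}`);
  }
  if (profile.remoteDebuggingPort !== undefined) {
    args.push(`--remote-debugging-port=${profile.remoteDebuggingPort}`);
  }
  args.push('--no-first-run');
  if (startUrl) args.push(startUrl);
  return args;
}
```

The resulting array could be passed to Node's `child_process.spawn` together with the path to the Chrome executable, for example `buildChromeArgs({ userDataDir: 'C:\\profiles\\work', remoteDebuggingPort: 9222 })`.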
Technical Architecture
Core Technology Stack
- Runtime: Electron 28+ (Node.js & Chromium)
- Frontend: React 18, TypeScript 5.3, TailwindCSS 3.4
- State Management: Zustand (with strict TypeScript inference)
- Database: better-sqlite3 for high-performance, synchronous local data storage.
Native System Integration (FFI)
To go beyond what Electron's built-in APIs expose, Pomelo Flow uses koffi (a Foreign Function Interface library) to bind directly to native Windows system libraries such as user32.dll.
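// System-level binding example (Windows only)
import koffi from 'koffi';

const user32 = koffi.load('user32.dll');
// HWND is a pointer-sized handle, so it is declared as 'void *' to match the Win32 signature.
const FindWindowW = user32.func('__stdcall', 'FindWindowW', 'void *', ['str16', 'str16']);
// Win32 BOOL is a 32-bit int, not a 1-byte bool.
const SetForegroundWindow = user32.func('__stdcall', 'SetForegroundWindow', 'int', ['void *']);

export function focusWindow(title: string): boolean {
  // A null class name matches windows of any class.
  const handle = FindWindowW(null, title);
  if (!handle) return false; // no window with this exact title
  return SetForegroundWindow(handle) !== 0;
}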
The Execution Engine
A custom-built, recursive execution engine (execution-service.ts) handles the interpretation of the node graph.
- Non-blocking: Uses async/await patterns to ensure the UI remains responsive during long workflows.
- Context Aware: Maintains a runtime context for variables and execution state.
- Error Recovery: Implements retry logic and error propagation for robust automation.
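
The three properties above can be sketched in a few dozen lines. This is not the contents of execution-service.ts; `GraphNode`, `ExecContext`, `runWithRetry`, and `executeFrom` are hypothetical names, and the model is simplified so that each node returns the id of its single successor.

```typescript
// Runtime context shared by every node in one run (assumed shape).
interface ExecContext {
  variables: Map<string, unknown>;
  log: string[];
}

interface GraphNode {
  id: string;
  retries?: number; // extra attempts after the first failure
  // Performs the node's work and returns the next node id, or null to stop.
  run: (ctx: ExecContext) => Promise<string | null>;
}

// Runs one node, retrying on failure; rethrows once all attempts are used up.
async function runWithRetry(node: GraphNode, ctx: ExecContext): Promise<string | null> {
  const attempts = (node.retries ?? 0) + 1;
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await node.run(ctx);
    } catch (err) {
      lastError = err;
      ctx.log.push(`${node.id}: attempt ${attempt} failed`);
    }
  }
  throw new Error(`node ${node.id} failed after ${attempts} attempt(s): ${String(lastError)}`);
}

// Recursively walks the graph from nodeId until a node returns null.
async function executeFrom(
  graph: Map<string, GraphNode>,
  nodeId: string | null,
  ctx: ExecContext,
): Promise<void> {
  if (nodeId === null) return;
  const node = graph.get(nodeId);
  if (!node) throw new Error(`unknown node: ${nodeId}`);
  const next = await runWithRetry(node, ctx);
  await executeFrom(graph, next, ctx);
}
```

Because every step is awaited, a node that waits on I/O or a timer does not block the process while it is pending, and an error that survives its retries propagates up through the recursion to the caller.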
Challenges & Solutions
1. Synchronizing Node.js with Win32 Message Loops
Challenge: Windows APIs often require interaction from the main thread's message loop, which can conflict with Node.js's event loop.
Solution: Kept each native call made through ffi-napi/koffi short and atomic, and offloaded heavy processing to worker threads where necessary to prevent the application from freezing.
2. Complex Graph Traversal
Challenge: Users can create infinite loops or complex branching logic that is hard to debug.
Solution: Implemented a strict execution limit and a "step-by-step" debug mode (logging every node entry/exit) to help users trace execution paths.
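
A step limit combined with an entry/exit trace might look like the sketch below. The `TraceEntry` shape, the `traverseWithLimit` name, and the linear successor map are assumptions for illustration; the real engine handles branching and its actual limit is not stated here.

```typescript
interface TraceEntry {
  step: number;
  nodeId: string;
  phase: 'enter' | 'exit';
}

// `next` maps each node id to its successor (null ends the run).
// Throws once more than `maxSteps` nodes have been visited, which is how a
// cycle such as a -> b -> a gets cut off instead of running forever.
function traverseWithLimit(
  next: Map<string, string | null>,
  start: string,
  maxSteps: number,
): TraceEntry[] {
  const trace: TraceEntry[] = [];
  let current: string | null = start;
  let step = 0;
  while (current !== null) {
    step++;
    if (step > maxSteps) {
      throw new Error(`execution limit of ${maxSteps} steps exceeded at node ${current}`);
    }
    trace.push({ step, nodeId: current, phase: 'enter' });
    // A real engine would execute the node's action here.
    const successor: string | null = next.get(current) ?? null;
    trace.push({ step, nodeId: current, phase: 'exit' });
    current = successor;
  }
  return trace;
}
```

The trace gives the debug view an ordered record of which nodes ran, and the error message names the node at which the limit was hit.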
3. Cross-Process State Synchronization
Challenge: The Execution Engine runs in the Main process, but the User Interface (Renderer) needs real-time updates.
Solution: A high-throughput IPC event stream sends execution events (node_enter, log, error) to the renderer, where Zustand immediately updates the visual graph to show active nodes.
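
On the renderer side, handling the event stream can be reduced to a pure function from state and event to new state. The event names come from the text above; the payload fields, the `ExecutionViewState` shape, and `applyEvent` are assumptions made for this sketch.

```typescript
// Events streamed from the main process (payload fields are assumed).
type ExecutionEvent =
  | { type: 'node_enter'; nodeId: string }
  | { type: 'log'; message: string }
  | { type: 'error'; nodeId: string; message: string };

// The slice of UI state that drives node highlighting and the log panel.
interface ExecutionViewState {
  activeNodeId: string | null;
  failedNodeIds: string[];
  logs: string[];
}

const initialViewState: ExecutionViewState = { activeNodeId: null, failedNodeIds: [], logs: [] };

// Pure reducer: returns a new state object and never mutates its input.
function applyEvent(state: ExecutionViewState, event: ExecutionEvent): ExecutionViewState {
  switch (event.type) {
    case 'node_enter':
      return { ...state, activeNodeId: event.nodeId };
    case 'log':
      return { ...state, logs: [...state.logs, event.message] };
    case 'error':
      return {
        ...state,
        activeNodeId: null,
        failedNodeIds: [...state.failedNodeIds, event.nodeId],
        logs: [...state.logs, `error: ${event.message}`],
      };
  }
}
```

In an Electron app, the main process could push each event with `webContents.send` and the renderer could receive it through `ipcRenderer.on`, then hand it to a Zustand store via `set((s) => applyEvent(s, event))`. Keeping the reducer pure makes it testable without Electron.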
Future Roadmap
- Cloud Sync: Sync scripts and templates across devices.
- Plugin System: Allow community developers to create custom nodes using JavaScript/TypeScript.
- Headless Mode: Run scripts via CLI without launching the full GUI for server-side automation.