Self Operating Computer

https://github.com/OthersideAI/self-operating-computer

A framework to enable multimodal models to operate a computer. It has now gained 10,000 stars on GitHub.

Using the same inputs and outputs as a human operator, the model views the screen and decides on a series of mouse and keyboard actions to reach an objective. Released Nov 2023, the Self-Operating Computer Framework was one of the first examples of full computer-use.
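The observe, decide, act cycle described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the framework's actual code: the names `Action`, `run_agent`, `capture_screen`, `decide`, and `execute` are hypothetical, and the screen capture, model call, and input control are passed in as plain functions so the loop can be run without a real model or desktop.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step chosen by the model (hypothetical schema, not the framework's)."""
    kind: str        # "click", "type", or "done"
    x: int = 0       # screen coordinates, used by "click"
    y: int = 0
    text: str = ""   # text to enter, used by "type"


def run_agent(
    objective: str,
    capture_screen: Callable[[], bytes],
    decide: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: look at the screen, ask the model for an action, perform it."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture_screen()                    # what a human would see
        action = decide(objective, screenshot, history)  # model picks the next step
        if action.kind == "done":                        # model reports the goal is met
            break
        execute(action)                                  # mouse or keyboard input
        history.append(action)
    return history
```

In a real setup, `capture_screen` would take a screenshot, `decide` would send it to a multimodal model, and `execute` would drive the mouse and keyboard. The `max_steps` cap is a design choice in this sketch so that a model that never reports completion cannot loop forever.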

Key Features

  • Compatibility: Designed for various multimodal models.
  • Integration: Currently integrated with GPT-4o, GPT-4.1, o1, Gemini Pro Vision, Claude 3, Qwen-VL and LLaVA.
  • Future Plans: Support for additional models.

Moreover, this open-source project is compatible with macOS, Windows, and Linux.

Openwork

Openwork is an open-source AI coworker that lives on your desktop.

TuriX CUA: AI Takes Over Windows and macOS

It equips AI with “eyes” and “hands”, enabling it to look at the screen, move the mouse, and type on the keyboard just like a human, so it can get your work done.

OpenMTP: Free Android File Transfer for Mac

OpenMTP bridges the gap between macOS and Android, a divide that often feels like an ecosystem barrier.

Microsoft OmniParser V2

OmniParser is a comprehensive method for parsing user interface screenshots into structured and easy-to-understand elements, which significantly enhances the ability of GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface.
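To make "grounding an action in a region of the interface" concrete, here is a minimal sketch of what a downstream agent might do once a screenshot has been parsed into elements. It does not use OmniParser itself and does not reproduce its output format: the `UIElement` structure, its fields, and the helper names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UIElement:
    """A parsed screen element (hypothetical schema, not OmniParser's own)."""
    label: str                       # short description, e.g. "Submit button"
    bbox: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
    interactable: bool = True


def find_element(elements: List[UIElement], query: str) -> Optional[UIElement]:
    """Return the first interactable element whose label contains the query."""
    needle = query.lower()
    for element in elements:
        if element.interactable and needle in element.label.lower():
            return element
    return None


def click_point(element: UIElement) -> Tuple[int, int]:
    """Ground an action by clicking the center of the element's bounding box."""
    left, top, right, bottom = element.bbox
    return ((left + right) // 2, (top + bottom) // 2)
```

With a structured list like this, the model only has to name the element it wants, and the pixel coordinates come from the parsed bounding box rather than from the model guessing positions in the raw image.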

Open Interpreter

After installing Open Interpreter, you can chat with it through a ChatGPT-like interface in your terminal by running the interpreter command.