Omni Describer - User Guide

Giving a Voice to the Visual World with AI.

It all started with my love for movies. When I realized how many details in my favorite scenes were lost without good audio description, an idea sparked: "Well, couldn't AI make this easier for us?" I dreamed of a tool that wouldn't just generate descriptions but would also give full control to the user. After months of intense work, countless trials, and overcoming many technical hurdles, I developed Omni Describer as the product of that dream.

What's in a Name?

The "Omni" in the name comes from Latin, meaning "all" or "everything." I chose this name because I didn't want the tool to serve just one purpose. Yes, Omni Describer primarily aims to make media accessible for blind and visually impaired individuals by creating audio descriptions. However, its purpose is not limited to that.

This is also an exploration tool. A film critic, a student, an artist, or anyone curious about visual details can use features like "Scene Explorer" or "Ask More" to delve into the layers of a video like never before. Omni Describer is a window to see the world through the "eyes" of AI and understand it differently. In short, it is "a describer for everything, for everyone."

System Requirements

To get the best performance from Omni Describer, I recommend meeting the following minimum system requirements:

Getting Started: Setting Up Your API Keys

Omni Describer uses cloud-based AI services to analyze and voice descriptions. Therefore, you need to enter your own API keys before you can start.

  1. Open Settings: Go to the File menu and select Settings... (or press Ctrl + ,).
  2. AI Settings Tab:
    • Gemini API Key: This is mandatory for video analysis. Paste your key into the "Gemini API Key:" field. You can get a free API key from Google AI Studio.
    • OpenAI API Key (for TTS): This is optional, but it is needed for OpenAI's high-quality text-to-speech voices. Paste your key into this field. Without it, you can still use the built-in Windows voices, although OpenAI is recommended for the best results. You can get a key from the OpenAI Platform.
  3. Save: Click Apply or OK to save your settings. You're now ready to go!

Please note: Your API keys are stored securely on your computer in the application's settings file. They are sent only to the respective AI services, and only to connect to them.

Quick Start: Generating Your First Description

Let's get started! Just follow these simple steps:

  1. Choose a Video: Click a button like "Local Video File" on the main window or select your video source from the File menu.
  2. Select a Prompt (Optional): The dropdown menu lists pre-made instructions that guide the AI. For your first try, "Standard Description" is a great starting point.
  3. Start Processing: The application will now begin analyzing your video. You can follow the progress in the "Status Log" at the bottom of the window. This may take a few minutes, depending on the length of the video.

When the process is complete, the Described Video Player will open automatically, and you can start enjoying your newly described video!

Main Features

The Described Video Player

This is your personal, described movie theater. As the video plays normally, your installed screen reader (like JAWS or NVDA) will read the generated audio descriptions at the correct moments.
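
For the technically curious, here is a minimal Python sketch of how a player could decide which description belongs to the current playback position. It is not Omni Describer's actual code: the Cue type, the cue_at helper, and the sample cues are hypothetical, and the only assumption is that each description carries a start time in seconds.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cue:
    start: float  # seconds from the beginning of the video
    text: str     # description to be read at that moment


def cue_at(cues: List[Cue], position: float) -> Optional[Cue]:
    """Return the latest cue that starts at or before `position`.

    `cues` must already be sorted by start time.
    """
    starts = [cue.start for cue in cues]
    index = bisect_right(starts, position) - 1
    return cues[index] if index >= 0 else None


# Hypothetical sample data for illustration only.
cues = [
    Cue(2.0, "A woman in a red hat enters the cafe."),
    Cue(9.5, "She sits by the window and opens a letter."),
]

current = cue_at(cues, 10.0)
print(current.text if current else "(no description yet)")
```

A binary search keeps the lookup fast even for a feature-length film with thousands of cues.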

Managing Prompt Presets

Prompts are powerful instructions that determine what the AI focuses on. By changing the prompt, you can get descriptions in vastly different styles.

Ask More About the Scene

Ever wonder what a character is holding or what a sign in the background says? This feature lets you ask anything that comes to mind about the scene.

  1. Pause the video at the moment you're curious about.
  2. Click the Ask More... button.
  3. Type your question in the "Your New Question:" field (e.g., "What color is the woman's hat?" or "What does the writing on the wall say?").
  4. Select how many seconds of video the AI should analyze, starting from the cursor's current position.
  5. Click "Submit Question." The AI's answer will appear in the "Conversation History" area.
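
Step 4 above amounts to choosing a time window that starts at the current position. The Python sketch below shows one way such a window could be computed so that it never runs past the end of the video. The function name and parameters are hypothetical and are not taken from the application.

```python
from typing import Tuple


def analysis_window(position: float, seconds: float, duration: float) -> Tuple[float, float]:
    """Return (start, end) in seconds for the clip to analyze.

    position: where the video is paused
    seconds:  how many seconds the user asked the AI to look at
    duration: total length of the video
    """
    start = max(0.0, min(position, duration))  # keep the start inside the video
    end = min(start + seconds, duration)       # do not run past the last frame
    return start, end


# Paused 95 s into a 100 s video and asking for 10 s yields a 5 s window.
print(analysis_window(95.0, 10.0, 100.0))
```

A shorter window usually means a faster answer, because less video has to be analyzed.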

Scene Explorer

Scene Explorer is an interactive way to understand the spatial layout of a scene. It puts you in a virtual room that you can navigate with your keyboard.

  1. Pause the video on a scene you want to explore in detail.
  2. Click the Explore Scene... button, then click "Analyze Scene".

You are now in the Scene Explorer. Use your keyboard to explore:

Exporting Your Work

Once you're happy with your descriptions, you can export them from the Player Window in different formats:

A Deep Dive into Advanced Settings

The Settings window (Ctrl + ,) gives you fine-grained control over Omni Describer's behavior.

AI Settings Tab

Audio Output Tab

Tips and Tricks for the Best Results

Creating great audio descriptions is an art. While AI is an effective assistant in this art, you'll get the best results when you guide it correctly.

The Power of Prompts: Your Director's Notes

The application has a set of core rules it teaches the AI (like not talking over dialogue). Think of the Prompt Preset area on the main screen as the place where you provide your director's notes for that specific video. A good note helps the AI focus on a particular style or detail, while a vague one can lead to unexpected results.
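
To make the "core rules plus director's notes" idea concrete, here is a small Python sketch of how the two could be combined into a single instruction for the AI. This is an illustration under assumptions, not the application's real prompt logic: CORE_RULES is placeholder text and build_prompt is a hypothetical helper.

```python
# Placeholder text standing in for the application's built-in rules.
CORE_RULES = "Do not talk over dialogue. Prefer silence to guessing."


def build_prompt(core_rules: str, director_notes: str = "") -> str:
    """Combine the fixed core rules with optional per-video notes."""
    notes = director_notes.strip()
    if not notes:
        # No preset selected: the AI relies on the core rules alone.
        return core_rules
    return f"{core_rules}\n\nAdditional notes for this video:\n{notes}"


print(build_prompt(CORE_RULES, "Focus on the setting, lighting, and overall mood."))
```

In this sketch the notes are appended after the core rules, which mirrors the advice in this guide: a prompt should add a focus, not replace the fundamentals.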

When (and How) to Use a Prompt

Much of the time, the AI can produce excellent results with no special prompt, relying only on its core rules. I recommend using this feature only when you have a specific goal in mind.

Tip #1: The "Focus on Names" Prompt
In a video with many characters where names are important, the AI can sometimes be too hesitant to use a name. To prioritize name tracking, you can create a custom prompt:
"For this video, your highest priority is to identify and use the correct character names as soon as they are spoken in the dialogue. This is more important than being overly concise. While focusing on this, try to adhere to all other system rules as best you can."

Tip #2: The "Describe the Atmosphere" Prompt
In visually rich films where the atmosphere is key, you can guide the AI to focus on the environment:
"Focus on describing the setting, atmosphere, and environmental details. To create a rich visual world, mention the lighting, colors, and the overall mood of the scene. Focus less on minor character movements unless they are critical."

What to Avoid in Prompts

For best results, it's important to avoid instructions that contradict the AI's core principles. Since the AI always tries to follow instructions, giving it a flawed one can cause it to misinterpret the video.

In short: Use prompts not to change the fundamental rules of good audio description, but to guide the AI on a specific focus.

Frequently Asked Questions (FAQ)

Q: Are my API keys secure?
A: Yes. Your keys are stored only on your computer and are sent only to the Google and OpenAI services, to connect to them.

Q: Why does generating descriptions take so long?
A: The time depends on the length of your video, your internet speed (for uploading the video to the AI), the frame rate you've selected, and the current load on the AI services. Using the "Enable Video Chunking" feature is highly recommended for long videos.
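
As a rough illustration of what chunking means, the Python sketch below splits a video's duration into consecutive segments of a fixed length. How "Enable Video Chunking" actually divides a video is not documented here, so the function name and the 60-second chunk size are assumptions made for the example.

```python
from typing import List, Tuple


def chunk_ranges(duration: float, chunk_seconds: float) -> List[Tuple[float, float]]:
    """Split [0, duration] into consecutive (start, end) ranges in seconds."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    ranges = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_seconds, duration)  # the last chunk may be shorter
        ranges.append((start, end))
        start = end
    return ranges


# A 150-second video in 60-second chunks (hypothetical chunk size).
print(chunk_ranges(150.0, 60.0))
```

Smaller pieces mean smaller uploads, which is one reason chunking can help with long videos.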

Q: Why didn't the AI describe something I saw on screen?
A: The AI is trained to prefer silence over making a mistake or talking over dialogue. You can use the "Ask More..." feature to inquire about specific moments or select the "Detailed" verbosity level in Settings.

Keyboard Shortcuts

Acknowledgments, Contact, and Contributors

Thank you so much for using Omni Describer! This application is a reflection of my desire to make visual media more accessible and enjoyable for everyone. Having users like you use this tool and provide feedback is the greatest motivation to continue developing it.

Feedback and Support

Do you have a question, a bug report, or an idea for a new feature? I would love to hear from you! The best way to reach me is by email. Your feedback is invaluable for making Omni Describer even better.
