TerminalCalm committed on
Commit 09b83c3 · verified · 1 Parent(s): 4aaface

feat: finished readme

Files changed (3)
  1. README.md +16 -1
  2. app.py +15 -11
  3. src/mcp/video_tools.py +46 -19
README.md CHANGED
```diff
@@ -7,17 +7,32 @@ sdk: gradio
 app_file: app.py
 pinned: false
 hf_oauth: true
+tags: [mcp-server-track, agent-demo-track, video-processing]
 ---
 
 # June 2025 MCP Project
 
+Features:
+
 This is a Gradio application demonstrating Model-Context-Protocol (MCP) with video processing tools, powered by local Ollama models or a remote Hugging Face model.
 
+For a little context: this is a very simple project, built to learn a bit more about Gradio, since I'm more used to working directly with React, React Native, and even ComfyUI on the frontend. I also wanted to try out MCP, get used to Hugging Face's API, play around with LLM tool calling, and more. And on top of it all, there's some vibe coding going on.
+
+All in all, it's been a great experience, even though this is a very basic project compared to what else is going to be submitted during this Hackathon.
+
+One last thing: this was built primarily for running locally rather than on HF itself, since I don't want to tie anything to my HF API key without fully understanding the implications. But it's a straightforward project that should be easy to run locally, if anyone's inclined to go that far with it.
+
+Thanks for reading all this; I really didn't expect anyone to, ha ha.
+
 ## Features
 
 - **FFmpeg Check:** Verifies that FFmpeg is installed.
 - **Video Uploader:** Upload and validate MP4 files.
 - **Manual Tools:** Extract the first and last frames of a video.
 - **LLM Integration:** Connect to Ollama or Hugging Face.
-- **Tool-Calling:** Use natural language to command the LLM to execute video tools.
+- **Tool-Calling:** Use natural language to command the LLM to execute video tools: get the first frame, get the last frame, and convert to GIF with max-resolution and FPS settings (50 fps max, 100 px minimum; the resolution value caps whichever of width or height is greater, scaling the other dimension proportionally).
 - **Hugging Face OAuth:** Users can log in with their own HF accounts to use the remote LLM.
+
+## Usage Notes
+
+- Run the repo locally. Drag in an MP4. Set up your Ollama configuration on the LLM Configuration tab, or log in to Hugging Face locally. In either case, you'll ideally want to target something like llama3.2:3b-instruct, which is what I developed against. Your preferred model and configuration are saved locally. From there, go to the LLM Video Commands tab, type in your prompt ("Get me the last frame", "Get me the first frame", "Make this mp4 into a gif with a max resolution of 300 and an FPS of 50"), and hit the button. It should get you what you asked for.
```
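The resolution and FPS limits described in the Tool-Calling bullet and Usage Notes can be sketched in plain Python. This helper and its exact rounding are illustrative, not the repo's actual code:

```python
def clamp_gif_settings(width: int, height: int, max_resolution: int, fps: int):
    """Apply the limits described above: FPS capped at 50, a 100 px floor on
    the target resolution, and max_resolution bounding the larger dimension
    while the other scales proportionally. (Illustrative helper, not app code.)"""
    fps = min(fps, 50)                         # 50 fps max
    max_resolution = max(max_resolution, 100)  # 100 px minimum
    longer = max(width, height)                # cap applies to the greater side
    scale = max_resolution / longer
    return round(width * scale), round(height * scale), fps

# A 1920x1080 clip requested at "max resolution 300, 60 fps":
print(clamp_gif_settings(1920, 1080, 300, 60))  # (300, 169, 50)
```

Scaling off the longer side keeps the aspect ratio regardless of whether the clip is landscape or portrait.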
app.py CHANGED
```diff
@@ -71,7 +71,7 @@ with gr.Blocks() as demo:
     uploaded_video_path_state = gr.State("")
 
     with gr.Tabs():
-        # --- Setup & Video Upload Tab ---
+        # --- Setup & Video Tab ---
         with gr.Tab("Setup & Video"):
             gr.Markdown("## System Status")
             with gr.Row():
@@ -84,17 +84,21 @@ with gr.Blocks() as demo:
             with gr.Column(scale=1):
                 gr.Markdown("### Upload Video")
                 file_input = gr.File(label="Upload MP4", file_types=[".mp4"])
-                video_output = gr.Video(label="Preview", interactive=False)
+                video_output = gr.Video(label="Preview", interactive=False, height="50vh")
                 upload_status_text = gr.Textbox(label="Upload Status", interactive=False)
-
-            with gr.Column(scale=2, visible=False) as video_tools_group:
-                gr.Markdown("### Manual Frame Extraction")
-                with gr.Row():
-                    get_first_frame_btn = gr.Button("Get First Frame")
-                    get_last_frame_btn = gr.Button("Get Last Frame")
-                with gr.Row():
-                    first_frame_img = gr.Image(label="First Frame", type="filepath", interactive=False)
-                    last_frame_img = gr.Image(label="Last Frame", type="filepath", interactive=False)
+            with gr.Column(scale=1):
+                # This empty column will take up the other 50% of the space
+                pass
+
+        with gr.Tab("Debug"):
+            with gr.Column(scale=2, visible=False) as video_tools_group:
+                gr.Markdown("### Manual Frame Extraction")
+                with gr.Row():
+                    get_first_frame_btn = gr.Button("Get First Frame")
+                    get_last_frame_btn = gr.Button("Get Last Frame")
+                with gr.Row():
+                    first_frame_img = gr.Image(label="First Frame", type="filepath", interactive=False)
+                    last_frame_img = gr.Image(label="Last Frame", type="filepath", interactive=False)
 
         with gr.Tab("LLM Video Commands"):
             gr.Markdown("## Test MCP Tool Calls with an LLM")
```
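Behind the "LLM Video Commands" tab, tool calls emitted by the LLM have to be routed to the Python functions in src/mcp/video_tools.py. A minimal sketch of that dispatch step, where the registry contents and helper name are hypothetical rather than the app's actual wiring:

```python
def dispatch_tool_call(name: str, args: dict, registry: dict) -> str:
    """Look up a tool by the name the LLM emitted and invoke it with the
    parsed arguments. Returns the tool's result, or an error string to match
    the error-as-string convention the video tools use."""
    if name not in registry:
        return f"Error: unknown tool '{name}'"
    try:
        return registry[name](**args)
    except TypeError as e:  # e.g. the LLM invented an argument name
        return f"Error: bad arguments for '{name}': {e}"

# Stand-in tool for illustration; a real registry would point at
# getFirstFrame / getLastFrame / convert_mp4_to_gif.
registry = {
    "getFirstFrame": lambda video_path: f"{video_path} -> first frame",
}
print(dispatch_tool_call("getFirstFrame", {"video_path": "tmp/a.mp4"}, registry))
```

Returning error strings rather than raising keeps the chat loop alive when the model asks for a tool or argument that doesn't exist.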
src/mcp/video_tools.py CHANGED
```diff
@@ -51,7 +51,8 @@ def getFirstFrame(video_path: str) -> str:
 
 def getLastFrame(video_path: str) -> str:
     """
-    Extracts the last frame from an MP4 video file using FFmpeg.
+    Extracts the very last frame from a video file using a precise method
+    with ffprobe and ffmpeg.
 
     Args:
         video_path: The relative path to the MP4 file (e.g., 'tmp/my_video.mp4').
@@ -72,32 +73,58 @@ def getLastFrame(video_path: str) -> str:
     file_name_without_ext = os.path.splitext(base_name)[0]
     output_frame_path = os.path.join(output_dir, f"{file_name_without_ext}_last_frame.jpg")
 
-    # Construct and run the ffmpeg command
-    # -sseof -1 seeks to 1 second before the end of the file to grab the last frame.
-    command = [
-        "ffmpeg",
-        "-sseof", "-1",    # Seek to 1s before the end.
-        "-i", video_path,  # Input file
-        "-vframes", "1",   # Extract only one frame
-        "-q:v", "2",       # Output quality (2 is high)
-        "-y",              # Overwrite output file if it exists
-        output_frame_path
-    ]
-
     try:
-        # Use subprocess.run to execute the command
-        result = subprocess.run(
-            command,
+        # Step 1: Use ffprobe to get the exact number of frames
+        ffprobe_command = [
+            "ffprobe",
+            "-v", "error",
+            "-select_streams", "v:0",
+            "-show_entries", "stream=nb_frames",
+            "-of", "default=nokey=1:noprint_wrappers=1",
+            video_path
+        ]
+
+        probe_result = subprocess.run(
+            ffprobe_command,
+            capture_output=True,
+            text=True,
+            check=True
+        )
+
+        total_frames = int(probe_result.stdout.strip())
+        if total_frames <= 0:
+            return "Error: Video contains no frames."
+
+        last_frame_index = total_frames - 1
+
+        # Step 2: Use ffmpeg to extract the last frame by its index
+        ffmpeg_command = [
+            "ffmpeg",
+            "-i", video_path,
+            "-vf", f"select='eq(n,{last_frame_index})'",
+            "-vframes", "1",
+            "-q:v", "2",
+            "-y",
+            output_frame_path
+        ]
+
+        subprocess.run(
+            ffmpeg_command,
             capture_output=True,
             text=True,
             check=True
         )
+
         return output_frame_path
+
     except FileNotFoundError:
-        return "Error: ffmpeg is not installed or not found in the system's PATH."
+        return "Error: ffmpeg or ffprobe is not installed or not found in the system's PATH."
     except subprocess.CalledProcessError as e:
-        # Provide the stderr from ffmpeg for easier debugging
-        return f"Error: Could not extract last frame. {e}"
+        # Provide stderr for better debugging
+        error_details = e.stderr.strip() if e.stderr else "No stderr output."
+        return f"Error during frame extraction: {error_details}"
+    except (ValueError, TypeError):
+        return "Error: Could not parse the number of frames from ffprobe output."
 
 def convert_mp4_to_gif(video_path: str, maxResolution: int = 500, fps: int = 15, pingpong: bool = False) -> str:
     """
```
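One caveat with the ffprobe approach in this diff: `nb_frames` is container metadata, and some files report `N/A` instead of a number, which is presumably why the new code guards with a ValueError/TypeError handler. That parsing step can be isolated and tested on its own; the function name here is illustrative, not part of the repo:

```python
def parse_frame_count(ffprobe_stdout: str):
    """Parse ffprobe's stream=nb_frames output. Returns the frame count as an
    int, or None when the container doesn't carry the metadata (ffprobe then
    prints 'N/A', which int() would reject with a ValueError)."""
    value = ffprobe_stdout.strip()
    if not value.isdigit():
        return None
    return int(value)

print(parse_frame_count("250\n"))   # 250
print(parse_frame_count("N/A\n"))   # None
```

Returning None (rather than raising) lets a caller fall back to another strategy, such as the old `-sseof` seek, when the count is unavailable.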