docs:Add vibe coding recap (#1902)

Co-authored-by: Angie Jones <jones.angie@gmail.com>
This commit is contained in:
Adewale Abati
2025-03-28 22:22:19 +01:00
committed by GitHub
parent f72db13b4a
commit 5668fcd207
2 changed files with 95 additions and 0 deletions

View File

@@ -0,0 +1,95 @@
---
title: "Vibe Coding with Goose and the Speech MCP"
description: "Explore the new Speech MCP server that enables voice-controlled coding and natural conversation with your AI agent"
authors:
- adewale
---
![blog cover](vibe-coding.png)
Imagine creating an app just by describing what you want out loud, like youre talking to a friend. Thats the magic of vibe coding: turning natural language into working code with the help of an AI agent. And while typing a prompt gets the job done, saying it out loud hits different 🔥 The new [Speech MCP server](https://block.github.io/goose/docs/tutorials/speech-mcp) has quite literally entered the chat.
<!--truncate-->
In a recent [Wild Goose Case livestream](https://www.youtube.com/watch?v=Zey9GHyXlHY&ab_channel=BlockOpenSource), hosts [Ebony Louis](https://www.linkedin.com/in/ebonylouis/) and [Adewale Abati](https://www.linkedin.com/in/acekyd/) were joined by [Max Novich](https://www.linkedin.com/in/maksym-stepanenko-26404867) from Block's AI tools team, who demonstrated an exciting new extension - the [Speech MCP server](https://github.com/Kvadratni/speech-mcp).
During the livestream, Max demonstrated this by creating an entire web application using only voice commands - no keyboard or mouse required. This resulted in a vibrant, animated webpage with 3D effects, synthwave aesthetics, and interactive elements, all created through natural conversation with Goose.
<iframe class="aspect-ratio" src="https://www.youtube.com/embed/Zey9GHyXlHY?start=437&end=752" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
## The Speech MCP Server
[Speech MCP](https://github.com/Kvadratni/speech-mcp) is an open source MCP server that enables voice interaction with AI agents like Goose. What makes it special is that it runs entirely locally on your machine, making it:
- LLM agnostic
- Privacy-focused
- Cost-effective compared to cloud-based alternatives
- Accessible without internet connectivity
### Key Features
1. **Local Speech Processing**: Uses two main models:
- Faster Whisper: An efficient method to convert speech to text
- Coqui TTS: A Japanese-engineered text-to-speech model with 54 natural-sounding voices
2. **Voice Selection**: Choose from 54 different voices with varying characteristics and personalities
3. **Multi-Speaker Narration**: Generate and play conversations between multiple voices
4. **Audio Transcription**: Convert audio/video content to text with timestamps and speaker detection
## Live Demo Highlights
During the demonstration, Max showcased several impressive capabilities:
1. **Voice-Controlled Development**:
- Created animated text effects
- Implemented 3D transformations
- Added synthwave aesthetics with gradients and grids
- Integrated music controls
2. **System Integration**:
- Controlled applications like Discord using voice commands
- Navigated file system and development environment
- Generated and managed audio content
3. **Natural Interaction**:
- Fluid conversation with Goose
- Real-time feedback and adjustments
- Multi-voice narration for documentation
## Getting Started
To try the Speech MCP server yourself:
1. Install the required audio library (PortAudio):
```bash
# For macOS
brew install portaudio
# For Linux
apt-get install portaudio # or dnf install portaudio
```
2. Install the extension directly using the one-click [deep link install](goose://extension?cmd=uvx&&arg=-p&arg=3.10.14&arg=speech-mcp@latest&id=speech_mcp&name=Speech%20Interface&description=Voice%20interaction%20with%20audio%20visualization%20for%20Goose) in Goose
## Join the Development
The Speech MCP server is [open-source](https://github.com/Kvadratni/speech-mcp) and welcomes contributions. You can also connect with Max on [Discord](https://discord.gg/block-opensource) for questions and collaboration.
Voice interactions with AI agents like Goose with the power and tools to act on instructions provides a different kind of vibe that makes the future feel closer than ever. Whether you're interested in vibe coding, accessibility improvements, or just want to feel a bit more like Tony Stark while getting Goose to pull a J.A.R.V.I.S, the Speech MCP server offers a glimpse into the future of human-AI collaboration - and it's available today.
<head>
<meta property="og:title" content="Vibe Coding with Goose and the Speech MCP" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://block.github.io/goose/blog/2025/03/28/vibe-coding-with-goose" />
<meta property="og:description" content="Explore the new Speech MCP server that enables voice-controlled coding and natural conversation with your AI agent." />
<meta property="og:image" content="http://block.github.io/goose/assets/images/vibe-coding-74eafa34e7ae10cfb738feddecc98519.png" />
<meta name="twitter:card" content="summary_large_image" />
<meta property="twitter:domain" content="block.github.io/goose" />
<meta name="twitter:title" content="Vibe Coding with Goose and the Speech MCP" />
<meta name="twitter:description" content="Explore the new Speech MCP server that enables voice-controlled coding and natural conversation with your AI agent." />
<meta name="twitter:image" content="http://block.github.io/goose/assets/images/vibe-coding-74eafa34e7ae10cfb738feddecc98519.png" />
</head>

Binary file not shown.

After

Width:  |  Height:  |  Size: 368 KiB