readme updates

2025-12-17 07:14:24 +01:00 · 2025-09-11 17:50:21 +02:00
parent 1707bf917d
commit ffdda3d730
2 changed files with 73 additions and 53 deletions
--- a/README.md
+++ b/README.md
@@ -343,59 +343,15 @@ grpcurl -plaintext localhost:50051 transcription.TranscriptionService/HealthChec
 python test_client.py
 ```
-## Production Deployment
+## R&D Project Notice
-### Docker Swarm
+This is a research and development project for exploring real-time transcription capabilities. It is not production-ready and should be used for experimentation and development purposes only.
-```bash
+### Known Limitations
-docker stack deploy -c docker-compose.yml transcription
+- Memory usage scales with model size (1.5-6 GB for large models)
-```
+- Single model instance shared across connections
-
+- No authentication or rate limiting
-### Kubernetes
+- Not optimized for high-concurrency production use
 ```yaml
 apiVersion: apps/v1
 kind: Deployment
 metadata:
  name: transcription-api
 spec:
  replicas: 3
  selector:
    matchLabels:
      app: transcription-api
  template:
    metadata:
      labels:
        app: transcription-api
    spec:
      containers:
      - name: transcription-api
        image: transcription-api:latest
        ports:
        - containerPort: 50051
          name: grpc
        - containerPort: 8765
          name: websocket
        env:
        - name: MODEL_PATH
          value: "base"
        resources:
          requests:
            memory: "4Gi"
            cpu: "2"
          limits:
            memory: "8Gi"
            cpu: "4"
 ```
 ### Security
 For production:
 1. Enable TLS for gRPC
 2. Use WSS for WebSocket
 3. Add authentication
 4. Add rate limiting
 5. Validate input
 ## License
--- a/examples/rust-client/README.md
+++ b/examples/rust-client/README.md
@@ -60,6 +60,56 @@ cargo run --bin live-transcribe
 cargo run --bin live-transcribe -- --server http://localhost:50051 --language en
 ```
 ### 5. `stdin-transcribe` - Transcribe Audio from stdin
 Accepts audio data from stdin, which makes it convenient for piping from other tools.
 ```bash
 # Pipe audio from parec (PulseAudio/PipeWire)
 parec --format=s16le --rate=16000 --channels=1 | cargo run --bin stdin-transcribe
 # With options
 parec --format=s16le --rate=16000 --channels=1 | \
  cargo run --bin stdin-transcribe -- --language en --no-vad --chunk-seconds 2.5
 ```
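To make the `parec` settings above concrete: `s16le` at 16000 Hz mono means 2 bytes per sample, so a 2.5-second chunk is 16000 × 2 × 1 × 2.5 = 80000 bytes. The Python sketch below shows that arithmetic and one way fixed-size chunks could be read from a byte stream. It is an illustration only and does not describe how `stdin-transcribe` is implemented; `chunk_size_bytes` and `read_chunks` are hypothetical names.

```python
import io
from typing import BinaryIO, Iterator


def chunk_size_bytes(sample_rate: int, channels: int,
                     bytes_per_sample: int, seconds: float) -> int:
    """Bytes of raw PCM in `seconds` of audio, rounded down to whole frames."""
    frame_bytes = channels * bytes_per_sample
    frames = int(sample_rate * seconds)
    return frames * frame_bytes


def read_chunks(stream: BinaryIO, size: int) -> Iterator[bytes]:
    """Yield blocks of up to `size` bytes until the stream is exhausted.

    The final block may be shorter than `size`.
    """
    while True:
        block = stream.read(size)
        if not block:
            break
        yield block


if __name__ == "__main__":
    # s16le (2 bytes per sample), 16000 Hz, mono, 2.5-second chunks.
    size = chunk_size_bytes(16000, 1, 2, 2.5)
    print(size)  # 80000

    # Stand-in for sys.stdin.buffer: 6 seconds of silence (hypothetical input).
    fake_stdin = io.BytesIO(bytes(16000 * 2 * 6))
    print([len(c) for c in read_chunks(fake_stdin, size)])  # [80000, 80000, 32000]
```

In a real pipeline, `sys.stdin.buffer` would take the place of the `io.BytesIO` stand-in.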
 ### 6. `system-audio` - System Audio Capture
 Attempts to capture system audio using available audio devices.
 ```bash
 # List available audio devices
 cargo run --bin system-audio -- --list-devices
 # Capture from specific device
 cargo run --bin system-audio -- --device pulse
 ```
 ## Video Call & System Audio Transcription
 ### Transcribe Video Calls (Zoom, Teams, Meet, etc.)
 Use the provided script to transcribe any video call or system audio:
 ```bash
 # Transcribe system audio (video calls, YouTube, etc.)
 ./transcribe_video_call.sh
 # List available audio sources
 ./transcribe_video_call.sh --list
 # Use microphone instead of system audio
 ./transcribe_video_call.sh --microphone
 ```
 ### Quick YouTube/System Audio Test
 ```bash
 # Test with any playing audio (YouTube, music, etc.)
 ./test_youtube.sh
 ```
 **Note**: System audio capture requires the `pulseaudio-utils` package:
 ```bash
 sudo apt-get install pulseaudio-utils
 ```
 ## Building
 ```bash
@@ -106,6 +156,13 @@ The `realtime-playback` binary uses the `rodio` library to:
 - For live transcription: A working microphone
 - For playback: Audio output device (speakers/headphones)
 ## System Requirements
 ### For Video Call Transcription (Ubuntu/Linux)
 - PulseAudio utilities: `sudo apt-get install pulseaudio-utils`
 - PipeWire or PulseAudio audio server
 - The monitor audio source must be available
 ## Troubleshooting
 ### "Connection refused" error
@@ -122,5 +179,12 @@ docker compose up
 ### Poor transcription quality
 - Use a larger model (e.g., `--model large-v3`)
- Enable VAD to filter noise (`--vad`)
+- For system audio: use the `--no-vad` flag to disable voice activity detection
 - Ensure audio quality is good (16 kHz or higher recommended)
 - Use 2.5-3 second chunks for optimal accuracy
 ### System audio not working
 - Install pulseaudio-utils: `sudo apt-get install pulseaudio-utils`
 - Check monitor source exists: `./transcribe_video_call.sh --list`
 - Make sure audio is playing when you start transcription
 - Use headphones to avoid echo/feedback
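The monitor-source check above can also be scripted. On PulseAudio (and PipeWire's PulseAudio compatibility layer), `pactl list short sources` prints one tab-separated source per line, and monitor sources conventionally have names ending in `.monitor`. The Python sketch below filters such output. `find_monitor_sources` and `list_monitor_sources` are hypothetical helpers, the sample device names are made up for illustration, and this README does not confirm that `transcribe_video_call.sh --list` works the same way.

```python
import subprocess
from typing import List


def find_monitor_sources(pactl_output: str) -> List[str]:
    """Return source names ending in '.monitor' from `pactl list short sources` output."""
    names = []
    for line in pactl_output.splitlines():
        fields = line.split("\t")
        # Field 0 is the source index, field 1 is the source name.
        if len(fields) >= 2 and fields[1].endswith(".monitor"):
            names.append(fields[1])
    return names


def list_monitor_sources() -> List[str]:
    """Run pactl (requires pulseaudio-utils) and return the monitor source names."""
    result = subprocess.run(
        ["pactl", "list", "short", "sources"],
        capture_output=True, text=True, check=True,
    )
    return find_monitor_sources(result.stdout)


# Illustrative sample only; real device names will differ.
SAMPLE = (
    "0\talsa_output.example.analog-stereo.monitor\tmodule-alsa-card.c\ts16le 2ch 44100Hz\tIDLE\n"
    "1\talsa_input.example.analog-stereo\tmodule-alsa-card.c\ts16le 2ch 44100Hz\tSUSPENDED\n"
)

if __name__ == "__main__":
    print(find_monitor_sources(SAMPLE))
```

An empty result from `list_monitor_sources()` would suggest that no monitor source is available, which matches the "System audio not working" symptom.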