Speech-to-Text (STT/ASR) Documentation
Convert spoken Dhivehi audio to written text with AI-powered automatic speech recognition. Record live or upload audio files.
Overview
The Speech-to-Text feature transcribes spoken Dhivehi into written text automatically. It is well suited to note-taking, documentation, content creation, and accessibility.
Key Features:
- Direct microphone recording with real-time controls
- Upload pre-recorded audio files
- Audio waveform visualization
- Editable transcription results
- History management with audio playback
Input Methods
Recording Mode
Record audio directly using your device's microphone with full recording controls.
Upload Mode
Upload pre-recorded audio files for transcription.
How to Transcribe Audio
Choose Input Method
Go to the Speech-to-Text page and select your input method:
- Record: Use your microphone to record live audio
- Upload: Upload a pre-recorded audio file
Capture or Upload Audio
If Recording:
- Click "Record" to start capturing audio
- Use "Pause" to temporarily stop recording
- Click "Stop" when finished
- View waveform visualization and duration
If Uploading:
- Click the "Upload Audio" button
- Select your audio file
- Wait for file validation
- Preview audio if needed
Check Credit Cost
Credits are calculated based on audio duration:
- Formula: duration (seconds) × 0.01 credits
- Example: 120 seconds (2 min) = 1.20 credits
- Example: 300 seconds (5 min) = 3.00 credits
- Costs are calculated to 2 decimal places for accurate billing
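To sanity-check a cost before you transcribe, the formula above can be reproduced in a few lines of Python. This is a minimal sketch, not Dhavana's billing code: the function name `estimate_credits` is hypothetical, and rounding half up to 2 decimal places is an assumption, since the exact rounding rule is not stated here.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate taken from the formula above: 0.01 credits per second of audio.
CREDITS_PER_SECOND = Decimal("0.01")


def estimate_credits(duration_seconds):
    """Estimate the credit cost for audio of the given duration in seconds.

    Hypothetical helper for illustration only.
    """
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    # Convert via str() so float inputs such as 90.5 stay exact in Decimal.
    cost = Decimal(str(duration_seconds)) * CREDITS_PER_SECOND
    # Assumption: round half up to 2 decimal places; the real rule may differ.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(estimate_credits(120))  # 1.20
print(estimate_credits(300))  # 3.00
```

Using `Decimal` rather than floating point keeps the result at exactly two decimal places, which matches how the examples above are written.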
Start Transcription
Click "Transcribe" to start processing:
- Audio is processed by a RunPod ASR endpoint
- Real-time progress updates
- Duration tracking in seconds
- Credits deducted when processing starts
View and Edit Transcription
Once transcribed, you can:
- View full transcription text
- Edit text in rich text editor
- Copy to clipboard
- Download as file
- Listen to original audio
- Access from History tab
Advanced Features
Audio Waveform Display
A visual representation of your audio, rendered with WaveSurfer.js, shows the peaks and valleys of the audio signal.
• Interactive waveform for uploaded files
• Visual feedback for audio quality
Rich Text Editor
Edit transcribed text with full formatting capabilities. Correct errors, add formatting, and refine content.
• Preserve Dhivehi text formatting
• RTL support for proper display
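If you paste transcriptions into your own page or app, you may need to set the text direction yourself. The sketch below is an illustration, not part of the editor: it picks a direction from the first strongly directional character (the same idea as HTML's `dir="auto"`) and checks for characters in the Unicode Thaana block (U+0780 to U+07BF), the script used to write Dhivehi. The function names are hypothetical.

```python
import unicodedata

# Unicode Thaana block, the script used to write Dhivehi.
THAANA_START, THAANA_END = 0x0780, 0x07BF


def contains_thaana(text):
    """Return True if any character falls in the Thaana block."""
    return any(THAANA_START <= ord(ch) <= THAANA_END for ch in text)


def text_direction(text):
    """Return 'rtl' or 'ltr' based on the first strongly directional character.

    Hypothetical helper; falls back to 'ltr' when no strong character is found.
    """
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):  # right-to-left letters, including Thaana
            return "rtl"
        if bidi == "L":  # left-to-right letters, e.g. Latin
            return "ltr"
    return "ltr"


print(text_direction("ދިވެހި"))   # rtl
print(text_direction("Dhivehi"))  # ltr
```

Digits and spaces are not strongly directional, so a transcription that starts with a number still resolves to `rtl` once the first Thaana letter is reached.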
History Management
All transcriptions are automatically saved with their audio files for easy reference and re-use.
• Download audio files
• Re-access and edit transcriptions
• Delete unnecessary records
Mobile Support
Works on mobile devices, including iOS, with dedicated audio player support.
• Mobile recording support
• Responsive interface
Common Use Cases
Meeting Transcription
Record and transcribe Dhivehi meetings, interviews, or discussions for documentation and reference.
Content Creation
Convert spoken Dhivehi content into written form for blogs, articles, or social media posts.
Accessibility
Create text versions of Dhivehi audio content for users who are deaf or hard of hearing, and to improve accessibility in general.
Note-taking
Dictate notes in Dhivehi instead of typing. Dictation is often faster and more natural for quick documentation.
Voice Memos
Convert voice memos and recordings into searchable, editable text for better organization.
Subtitles & Captions
Transcribe Dhivehi videos for subtitle creation and video captioning.
Tips for Best Results
Use Clear Audio
Record in a quiet environment with minimal background noise for best transcription accuracy.
Speak Clearly
Enunciate words clearly and speak at a moderate pace for improved recognition accuracy.
Use a Good Microphone
Better-quality microphones produce better transcription results. Use a headset or external microphone if available.
Review and Edit
Always review transcribed text for accuracy. Edit any mistakes using the built-in editor.
Test Audio Quality
Listen to your recording before transcribing to ensure audio quality is sufficient.
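Listening is the main check, but a quick programmatic look at a recording can complement it. The sketch below is an illustration under stated assumptions: it reads a 16-bit PCM WAV file with Python's standard `wave` module and reports its duration in seconds and its peak level (1.0 means the signal reaches full scale and may be clipping). WAV is used only because the standard library can read it; the upload formats the feature accepts are not listed here, and the function name `inspect_wav` is hypothetical.

```python
import array
import sys
import wave


def inspect_wav(path):
    """Return (duration_seconds, peak_level) for a 16-bit PCM WAV file.

    peak_level is the largest absolute sample divided by 32768, so it
    ranges from 0.0 (silence) to 1.0 (full scale).
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        frame_count = wav.getnframes()
        frame_rate = wav.getframerate()
        raw = wav.readframes(frame_count)

    samples = array.array("h", raw)  # signed 16-bit samples
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    duration = frame_count / frame_rate
    peak = max((abs(s) for s in samples), default=0) / 32768
    return duration, peak
```

Calling `inspect_wav("recording.wav")` on a file of your own returns the two values as a tuple. A very low peak suggests the recording is too quiet, and a peak at or near 1.0 suggests clipping; what counts as acceptable is a judgment call rather than a documented threshold.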
Pricing
Pay only for the duration of your audio: 0.01 credits per second.
Ready to Transcribe Dhivehi Audio?
Access the Speech-to-Text feature in your Dhavana dashboard