Speech-to-Text (STT/ASR) Documentation

Convert spoken Dhivehi audio to written text with AI-powered automatic speech recognition. Record live or upload audio files.

Cost: 0.01 credits per second, calculated to 2 decimal places

Input Methods: Record or upload

Processing: Real-time, via a RunPod ASR endpoint

Overview

The Speech-to-Text feature transcribes spoken Dhivehi into written text automatically. It is suited to note-taking, documentation, content creation, and accessibility.

Key Features:

Direct microphone recording with real-time controls
Upload pre-recorded audio files
Audio waveform visualization
Editable transcription results
History management with audio playback

Input Methods

Recording Mode

Record audio directly using your device's microphone with full recording controls.

Record/Pause/Stop Controls: Start, pause, and stop a recording at any time

Real-time Timer: See recording duration as you record

Waveform Visualization: Visual representation of audio

Audio Preview: Listen to recording before transcription

Browser-based: Uses the browser's MediaRecorder API

Upload Mode

Upload pre-recorded audio files for transcription.

File Upload: Select audio files from your device

Duration Display: Shows audio length automatically

Audio Preview: Listen to uploaded audio before processing

File Validation: Automatic format and size checking
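
The checks described above could look roughly like the following sketch. The documentation does not list the supported formats or the size cap, so ALLOWED_EXTENSIONS and MAX_FILE_BYTES are hypothetical placeholders, and the function names are illustrative only. The duration helper handles uncompressed WAV files only, using Python's standard wave module.

```python
import os
import wave

# Hypothetical limits: the documentation does not state the supported
# formats or the maximum file size, so these values are placeholders.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".webm"}
MAX_FILE_BYTES = 25 * 1024 * 1024


def validate_audio_file(filename, size_bytes):
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_FILE_BYTES:
        problems.append("file is too large")
    return problems


def wav_duration_seconds(path):
    """Duration of a WAV file: frame count divided by sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


print(validate_audio_file("memo.wav", 1024))  # []
print(validate_audio_file("memo.txt", 1024))  # ['unsupported format: .txt']
```

Returning a list of problems rather than raising on the first one lets an interface show every issue with a file at once.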

How to Transcribe Audio

Choose Input Method

Go to the Speech-to-Text page and select your input method:

Record: Use your microphone to record live audio
Upload: Upload a pre-recorded audio file

Capture or Upload Audio

If Recording:

Click "Record" to start capturing audio
Use "Pause" to temporarily stop recording
Click "Stop" when finished
View waveform visualization and duration

If Uploading:

Click the "Upload Audio" button
Select your audio file
Wait for file validation
Preview audio if needed

Check Credit Cost

Credits are calculated based on audio duration:

Formula: duration (seconds) × 0.01 credits
Example: 120 seconds (2 min) = 1.20 credits
Example: 300 seconds (5 min) = 3.00 credits
Costs are calculated to 2 decimal places
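
As a rough illustration of the formula, the cost can be computed with decimal arithmetic so that the result always has exactly two decimal places. The function name is hypothetical, and the rounding mode (ROUND_HALF_UP) is an assumption: the documentation states the precision but not how fractional seconds are rounded.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_PER_SECOND = Decimal("0.01")  # credits per second, from the pricing above


def transcription_cost(duration_seconds):
    """Return the credit cost for a clip, rounded to 2 decimal places.

    ROUND_HALF_UP is an assumption; the documentation only states
    the precision, not the rounding rule.
    """
    duration = Decimal(str(duration_seconds))
    if duration < 0:
        raise ValueError("duration must be non-negative")
    cost = duration * RATE_PER_SECOND
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(transcription_cost(120))  # 1.20
print(transcription_cost(300))  # 3.00
```

Decimal is used instead of float so that values such as 0.30 are represented exactly and not as binary approximations.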

Start Transcription

Click "Transcribe" to start processing:

Audio is processed by the RunPod ASR endpoint
Progress updates appear in real time
Duration is tracked in seconds
Credits are deducted when processing starts

View and Edit Transcription

Once transcribed, you can:

View full transcription text
Edit text in rich text editor
Copy to clipboard
Download as file
Listen to original audio
Access from History tab

Advanced Features

Audio Waveform Display

A visual representation of your audio, rendered with WaveSurfer.js. The display shows the peaks and valleys of the audio signal.

• Real-time waveform during recording
• Interactive waveform for uploaded files
• Visual feedback for audio quality

Rich Text Editor

Edit transcribed text with full formatting capabilities. Correct errors, add formatting, and refine content.

• Full text editing capabilities
• Preserve Dhivehi text formatting
• RTL support for proper display

History Management

All transcriptions are automatically saved with audio files for easy reference and re-use.

• View all past transcriptions
• Download audio files
• Re-access and edit transcriptions
• Delete unnecessary records

Mobile Support

Works on mobile devices, including iOS, where a dedicated audio player is provided.

• iOS-specific audio player
• Mobile recording support
• Responsive interface

Common Use Cases

Meeting Transcription

Record and transcribe Dhivehi meetings, interviews, or discussions for documentation and reference.

Content Creation

Convert spoken Dhivehi content into written form for blogs, articles, or social media posts.

Accessibility

Create text versions of Dhivehi audio content for hearing-impaired users or to improve accessibility more broadly.

Note-taking

Dictate notes in Dhivehi instead of typing. Dictation is often faster and more natural for quick documentation.

Voice Memos

Convert voice memos and recordings into searchable, editable text for better organization.

Subtitles & Captions

Transcribe Dhivehi videos for subtitle creation and video captioning.

Tips for Best Results

Use Clear Audio

Record in a quiet environment with minimal background noise for best transcription accuracy.

Speak Clearly

Enunciate words clearly and speak at a moderate pace for improved recognition accuracy.

Use a Good Microphone

Better-quality microphones produce better transcription results. Use a headset or external microphone if available.

Review and Edit

Always review transcribed text for accuracy. Edit any mistakes using the built-in editor.

Test Audio Quality

Listen to your recording before transcribing to ensure audio quality is sufficient.

Pricing

0.01 Credits per Second

Pay only for the duration of your audio

30 seconds: 0.30 credits

1 minute (60 seconds): 0.60 credits

5 minutes (300 seconds): 3.00 credits

10 minutes (600 seconds): 6.00 credits
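
Going the other way, the same rate can be used to estimate how much audio a given credit balance covers. This is a minimal sketch: the function name is hypothetical, and rounding down to whole seconds is an assumption made for the estimate, not documented behavior.

```python
from decimal import Decimal, ROUND_DOWN

RATE_PER_SECOND = Decimal("0.01")  # credits per second, from the pricing above


def affordable_seconds(credit_balance):
    """Estimate how many whole seconds of audio a credit balance covers.

    Rounding down to whole seconds is an assumption for this estimate.
    """
    balance = Decimal(str(credit_balance))
    if balance < 0:
        raise ValueError("balance must be non-negative")
    seconds = (balance / RATE_PER_SECOND).to_integral_value(rounding=ROUND_DOWN)
    return int(seconds)


print(affordable_seconds(6))       # 600
print(affordable_seconds("0.30"))  # 30
```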

Ready to Transcribe Dhivehi Audio?

Access the Speech-to-Text feature in your Dhavana dashboard
