Core Module
Core functionality for fmus-vox.
This module contains the fundamental components and utilities used throughout the library.
Audio Class
- class fmus_vox.core.audio.Audio(data: ndarray, sample_rate: int)[source]
Bases: object
Main class for audio operations in fmus-vox.
The Audio class provides an intuitive interface for loading, processing, and manipulating audio data. It supports method chaining for clean, readable code.
Examples
>>> # Load and process audio
>>> audio = Audio.load("recording.wav")
>>> processed = audio.normalize().denoise().resample(target_sr=16000)
>>> processed.save("processed.wav")
>>>
>>> # Record and save audio
>>> audio = Audio.record(seconds=5)
>>> audio.save("recording.wav")
- __init__(data: ndarray, sample_rate: int)[source]
Initialize an Audio object.
- Parameters:
data – Audio data as a numpy array
sample_rate – Sample rate of the audio in Hz
- classmethod load(source: str | Path | BinaryIO | ndarray, sample_rate: int | None = None) Audio[source]
Load audio from a file path, file-like object, or numpy array.
- Parameters:
source – Audio source (file path, file-like object, or numpy array)
sample_rate – Target sample rate for loading. If None, use the source’s rate. If source is a numpy array, this must be provided.
- Returns:
Audio object
- Raises:
AudioError – If the audio cannot be loaded
- classmethod record(seconds: float | None = None, sample_rate: int = 44100, **kwargs) Audio[source]
Record audio from microphone.
- Parameters:
seconds – Duration in seconds to record. If None, records until stopped.
sample_rate – Sample rate to record at
**kwargs – Additional arguments for recording
- Returns:
Audio object containing the recorded audio
- Raises:
AudioError – If recording fails
- save(path: str | Path, format: str | None = None, **kwargs) str[source]
Save audio to file.
- Parameters:
path – Path to save the audio file
format – Audio format (inferred from path if None)
**kwargs – Additional arguments for saving
- Returns:
Path to the saved file
- Raises:
AudioError – If saving fails
- trim(start: float = 0, end: float | None = None) Audio[source]
Trim audio to specified time range.
- Parameters:
start – Start time in seconds
end – End time in seconds. If None, trim to the end of the audio.
- Returns:
New Audio object with trimmed audio
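The mapping from a time range in seconds to sample indices can be sketched as follows. This is a minimal illustration that assumes mono samples held in a plain list; `trim_samples` is a hypothetical helper, not part of fmus-vox, and the library's own rounding and bounds handling may differ.

```python
def trim_samples(samples, sample_rate, start=0.0, end=None):
    """Return the samples between start and end, both in seconds (hypothetical helper)."""
    start_idx = max(0, int(round(start * sample_rate)))
    if end is None:
        # end=None keeps everything up to the end of the audio
        end_idx = len(samples)
    else:
        end_idx = min(len(samples), int(round(end * sample_rate)))
    return samples[start_idx:end_idx]


samples = list(range(10))  # 10 samples at 5 Hz, i.e. 2 seconds of audio
middle = trim_samples(samples, sample_rate=5, start=0.4, end=1.2)
tail = trim_samples(samples, sample_rate=5, start=1.0)
```

Slicing returns a new list, which mirrors the documented behaviour of returning a new object rather than modifying the original.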
- denoise(strength: float = 0.5) Audio[source]
Remove noise from audio.
- Parameters:
strength – Denoising strength (0.0 to 1.0)
- Returns:
New Audio object with denoised audio
- normalize(target_db: float = -3) Audio[source]
Normalize audio volume.
- Parameters:
target_db – Target peak dB level
- Returns:
New Audio object with normalized audio
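Peak normalization to a dB target can be sketched as below. The sketch assumes floating-point samples in the range -1.0 to 1.0 and a target expressed in dB relative to full scale; both are assumptions, and `peak_normalize` is a hypothetical helper rather than the library's implementation.

```python
import math


def peak_normalize(samples, target_db=-3.0):
    """Scale samples so the largest absolute value sits at target_db (hypothetical helper)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: there is nothing to scale
    target_peak = 10 ** (target_db / 20)  # convert dB to linear amplitude
    gain = target_peak / peak
    return [s * gain for s in samples]


quiet = [0.0, 0.1, -0.2, 0.05]
louder = peak_normalize(quiet, target_db=-3.0)
peak_db = 20 * math.log10(max(abs(s) for s in louder))
```

Here `peak_db` measures the peak of the scaled signal so that it can be compared against the requested target.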
- resample(target_sr: int = 16000) Audio[source]
Resample audio to target sample rate.
- Parameters:
target_sr – Target sample rate in Hz
- Returns:
New Audio object with resampled audio
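To illustrate what changing the sample rate involves, the sketch below resamples by linear interpolation. This is a deliberately simple stand-in: production resamplers normally use band-limited filtering to avoid aliasing, and nothing here is confirmed to match what fmus-vox does internally. `resample_linear` is a hypothetical helper.

```python
def resample_linear(samples, orig_sr, target_sr):
    """Resample a mono signal by linear interpolation (hypothetical helper)."""
    if not samples or orig_sr == target_sr:
        return list(samples)
    n_out = int(round(len(samples) * target_sr / orig_sr))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * orig_sr / target_sr  # position in the input, measured in samples
        left = min(int(pos), last)
        right = min(left + 1, last)    # clamp at the final sample
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


ramp = [0.0, 1.0, 2.0, 3.0]
upsampled = resample_linear(ramp, orig_sr=4, target_sr=8)
downsampled = resample_linear(ramp, orig_sr=4, target_sr=2)
```

Doubling the rate doubles the number of samples while keeping the duration the same; halving it does the opposite.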
- detect_vad(threshold: float = 0.5) List[Tuple[float, float]][source]
Detect voice activity segments.
- Parameters:
threshold – Energy threshold for voice detection (0.0 to 1.0)
- Returns:
List of (start_time, end_time) tuples in seconds
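One way an energy threshold can yield (start_time, end_time) pairs is sketched below. The frame length, the use of RMS energy, and the choice to scale the threshold against the loudest frame are all assumptions made for illustration; `detect_voice_segments` is a hypothetical helper, and the library's detector may work differently.

```python
import math


def detect_voice_segments(samples, sample_rate, threshold=0.5, frame_ms=10):
    """Return (start, end) times in seconds for frames above an energy threshold."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    rms = [math.sqrt(sum(s * s for s in f) / len(f)) for f in frames]
    peak = max(rms, default=0.0)
    if peak == 0.0:
        return []  # empty or completely silent input
    segments, start = [], None
    for idx, value in enumerate(rms):
        active = value / peak >= threshold  # assumption: threshold is relative to the loudest frame
        if active and start is None:
            start = idx
        elif not active and start is not None:
            segments.append((start * frame_len / sample_rate, idx * frame_len / sample_rate))
            start = None
    if start is not None:
        # the signal ends while still active: close the segment at the end of the audio
        segments.append((start * frame_len / sample_rate, len(samples) / sample_rate))
    return segments


sr = 1000
signal = [0.0] * 100 + [0.8] * 200 + [0.0] * 100  # 0.2 s burst inside 0.4 s of audio
segments = detect_voice_segments(signal, sr)
```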
- split_on_silence(min_silence_len: int = 500, silence_thresh: float = -40) List[Audio][source]
Split audio on silence into segments.
- Parameters:
min_silence_len – Minimum silence length in milliseconds
silence_thresh – Silence threshold in dB
- Returns:
List of Audio objects, one for each non-silent segment
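The interaction between the two parameters can be sketched as follows: audio is examined in 1 ms blocks, a block counts as silent when its RMS level falls below the dB threshold, and a segment is closed only once the silence has lasted at least the minimum length. The sketch assumes float samples in the range -1.0 to 1.0 and a sample rate of at least 1000 Hz; `split_samples_on_silence` is a hypothetical helper, not the library's implementation.

```python
import math


def split_samples_on_silence(samples, sample_rate, min_silence_len=500, silence_thresh=-40.0):
    """Split a mono signal wherever silence lasts at least min_silence_len milliseconds."""
    chunk = max(1, sample_rate // 1000)  # samples per millisecond

    def is_silent(block):
        rms = math.sqrt(sum(s * s for s in block) / len(block))
        return rms == 0.0 or 20 * math.log10(rms) < silence_thresh

    segments, seg_start, silent_run = [], None, 0
    for pos in range(0, len(samples), chunk):
        if is_silent(samples[pos:pos + chunk]):
            silent_run += 1
            if seg_start is not None and silent_run >= min_silence_len:
                # close the segment at the point where this silence began
                segments.append(samples[seg_start:pos - (silent_run - 1) * chunk])
                seg_start = None
        else:
            if seg_start is None:
                seg_start = pos
            silent_run = 0
    if seg_start is not None:
        segments.append(samples[seg_start:])
    return segments


sr = 1000
signal = [0.5] * 300 + [0.0] * 600 + [0.5] * 200 + [0.0] * 100 + [0.5] * 100
pieces = split_samples_on_silence(signal, sr)
lengths = [len(p) for p in pieces]
```

In this example the 600 ms gap triggers a split, while the 100 ms gap is shorter than the minimum and stays inside the second segment.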
- change_speed(speed_factor: float = 1.0) Audio[source]
Change the playback speed of the audio.
- Parameters:
speed_factor – Speed factor (1.0 = original speed)
- Returns:
New Audio object with changed speed
Config Module
Configuration management for fmus-vox.
This module provides facilities for loading, storing, and accessing configuration settings throughout the library.
- class fmus_vox.core.config.Config[source]
Bases: object
Configuration manager for fmus-vox.
Handles loading, saving, and accessing configuration settings. Supports both global and model-specific configurations.
- get(key: str, default: Any | None = None) Any[source]
Get configuration value.
- Parameters:
key – Configuration key
default – Default value if key doesn’t exist
- Returns:
Configuration value
- set(key: str, value: Any) None[source]
Set configuration value.
- Parameters:
key – Configuration key
value – Configuration value
- update(config_dict: Dict[str, Any]) None[source]
Update multiple configuration values.
- Parameters:
config_dict – Dictionary of configuration key-value pairs
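The get, set, and update semantics described above can be sketched with a dictionary-backed stand-in. `SketchConfig` and the keys used here are hypothetical; the real Config class may store values differently, for example with nested or model-specific sections.

```python
from typing import Any, Dict, Optional


class SketchConfig:
    """Hypothetical stand-in that mirrors the documented get/set/update signatures."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        # fall back to the default when the key does not exist
        return self._values.get(key, default)

    def set(self, key: str, value: Any) -> None:
        self._values[key] = value

    def update(self, config_dict: Dict[str, Any]) -> None:
        # later values overwrite earlier ones for the same key
        self._values.update(config_dict)


config = SketchConfig()
config.set("sample_rate", 16000)
config.update({"device": "cpu", "sample_rate": 22050})
```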
- get_model_path(model_type: str, model_name: str) Path[source]
Get path to a specific model.
- Parameters:
model_type – Type of model (e.g., ‘stt’, ‘tts’)
model_name – Name of model (e.g., ‘whisper’, ‘vits’)
- Returns:
Path to model directory
Utils Module
Utility functions for fmus-vox.
This module contains various utility functions used throughout the library.
- fmus_vox.core.utils.get_logger(name: str, level: str | None = None) Logger[source]
Get a logger with the given name and level.
- Parameters:
name – Logger name
level – Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- Returns:
Configured logger instance
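A helper with this signature could be built on the standard logging module roughly as follows. The handler setup and message format are assumptions, and `get_logger_sketch` is a hypothetical name used to avoid implying this is the library's code.

```python
import logging
from typing import Optional


def get_logger_sketch(name: str, level: Optional[str] = None) -> logging.Logger:
    """Return a named logger, optionally setting its level from a string."""
    logger = logging.getLogger(name)
    if level is not None:
        logger.setLevel(getattr(logging, level.upper()))
    if not logger.handlers:
        # attach a handler only once so repeated calls do not duplicate output
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s"))
        logger.addHandler(handler)
    return logger


log = get_logger_sketch("fmus_vox.demo", level="debug")
same_log = get_logger_sketch("fmus_vox.demo")
```

Because `logging.getLogger` returns the same object for the same name, the second call reuses the logger configured by the first.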
- fmus_vox.core.utils.timed(func: Callable) Callable[source]
Decorator to time function execution.
- Parameters:
func – Function to time
- Returns:
Wrapped function that logs execution time
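A timing decorator of this kind is commonly written with `time.perf_counter` and `functools.wraps`. The sketch below follows that pattern; the log level and message format are assumptions, and `timed_sketch` is a hypothetical name.

```python
import functools
import logging
import time


def timed_sketch(func):
    """Log how long each call to func takes (hypothetical equivalent of timed)."""

    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # runs whether the call returned or raised
            elapsed = time.perf_counter() - start
            logging.getLogger(func.__module__).debug("%s took %.6f s", func.__qualname__, elapsed)

    return wrapper


@timed_sketch
def add(a, b):
    """Add two numbers."""
    return a + b


result = add(2, 3)
```

The timing message is emitted at DEBUG level in this sketch, so it appears only when the relevant logger is configured to show debug output.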
- fmus_vox.core.utils.ensure_path_exists(path: str | Path) Path[source]
Ensure that a directory path exists, creating it if necessary.
- Parameters:
path – Directory path
- Returns:
Path object for the directory
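The documented behaviour corresponds closely to `Path.mkdir` with `parents=True` and `exist_ok=True`. The following is a minimal sketch under that assumption; `ensure_path_exists_sketch` is a hypothetical name, and the directory names are made up for the example.

```python
import tempfile
from pathlib import Path
from typing import Union


def ensure_path_exists_sketch(path: Union[str, Path]) -> Path:
    """Create the directory (and any missing parents) and return it as a Path."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


with tempfile.TemporaryDirectory() as tmp:
    nested = ensure_path_exists_sketch(Path(tmp) / "models" / "stt")
    created = nested.is_dir()
    # calling it again on an existing directory does nothing and raises no error
    again = ensure_path_exists_sketch(str(nested))
    same = again == nested
```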
- fmus_vox.core.utils.download_file(url: str, dest_path: str | Path, progress: bool = True) Path[source]
Download a file from a URL to a destination path.
- Parameters:
url – URL to download from
dest_path – Path to save the file to
progress – Whether to show progress bar
- Returns:
Path to the downloaded file
- Raises:
FmusVoxError – If download fails
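The download-and-wrap-errors pattern can be sketched with the standard library alone. This version omits the progress bar, and `DownloadErrorSketch` is a hypothetical stand-in for FmusVoxError; how the library actually fetches files and reports progress is not stated here. The usage below reads from a local `file://` URL so that it runs without network access.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


class DownloadErrorSketch(Exception):
    """Hypothetical stand-in for FmusVoxError."""


def download_file_sketch(url, dest_path):
    """Stream url to dest_path and return the destination as a Path."""
    dest = Path(dest_path)
    dest.parent.mkdir(parents=True, exist_ok=True)
    try:
        with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
            shutil.copyfileobj(response, out)  # copy in chunks instead of loading everything
    except (OSError, ValueError) as exc:  # urllib's URLError is a subclass of OSError
        raise DownloadErrorSketch(f"could not download {url}") from exc
    return dest


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.bin"
    source.write_bytes(b"fmus-vox")
    saved = download_file_sketch(source.as_uri(), Path(tmp) / "cache" / "copy.bin")
    payload = saved.read_bytes()
```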
- class fmus_vox.core.utils.LazyLoader(init_func: Callable[[], T])[source]
Bases: Generic[T]
Lazy loader for objects that are expensive to initialize.
Initializes the object only when it’s first accessed.
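Deferred initialization of this kind can be sketched as a small generic wrapper that calls its factory at most once. How the real LazyLoader exposes the loaded object is not described here, so the `get` accessor below is an assumption, as are the names `LazyLoaderSketch` and `build_model`.

```python
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


class LazyLoaderSketch(Generic[T]):
    """Hypothetical lazy wrapper: runs init_func on first access and caches the result."""

    def __init__(self, init_func: Callable[[], T]) -> None:
        self._init_func = init_func
        self._value: Optional[T] = None
        self._loaded = False

    def get(self) -> T:  # hypothetical accessor name
        if not self._loaded:
            self._value = self._init_func()
            self._loaded = True
        return self._value


calls: List[str] = []


def build_model():
    calls.append("init")  # record each time the expensive step runs
    return {"name": "demo-model"}


model = LazyLoaderSketch(build_model)
calls_before_access = len(calls)
first = model.get()
second = model.get()
```

A separate `_loaded` flag is used instead of checking for None so that a factory which legitimately returns None is still called only once.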