Followup to the One Button Sound Recorder - Transcription with spchcat

@JacobCoffinWrites · 5 months ago


The transcription isn’t great - unfortunately, improving on one of the current big open source speech-to-text programs is a bit beyond my capabilities. To be fair, it’s not much worse than a handful of commercial products I’ve seen.

@j4k3@lemmy.world · 5 months ago

Oobabooga Textgen WebUI has Silero TTS built in. While messing with it, I wound up playing with Silero's CLI from GitHub.

They have STT too. It is a simple Python script that seems lightweight to me (I don't have much experience). It's not super accurate, but maybe an option. I saw someone mention that background and environmental noise is the majority of the issue, so just filtering the audio can be as effective as training in some cases. I never got that far into playing with it; I only got the example running and moved on. The license for Silero is noncommercial too, BTW. Nice project. Thanks for sharing.
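To make the "just filter the audio" idea concrete, here is a minimal sketch of a first-order high-pass filter that strips low-frequency hum and rumble from PCM samples before they go to an STT tool. This is a generic textbook filter, not something taken from Silero or spchcat; the function name `high_pass`, the 16 kHz sample rate and the 100 Hz cutoff are all assumptions picked for illustration.

```python
import math


def high_pass(samples, sample_rate=16000, cutoff_hz=100.0):
    """Apply a first-order RC high-pass filter to a sequence of PCM samples.

    Hypothetical helper for illustration: sample_rate and cutoff_hz are
    assumed defaults, not values recommended by any STT project.
    """
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # filter time constant
    dt = 1.0 / sample_rate                  # time between samples
    alpha = rc / (rc + dt)
    out = [float(samples[0])]
    for i in range(1, len(samples)):
        # Pass through changes in the signal, let steady offsets decay.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out
```

Samples could be read from a WAV file with Python's standard `wave` module, filtered, and written back out before transcription. Whether this actually helps will depend on the kind of noise in the recording, so treat it as something to experiment with.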

@JacobCoffinWrites · 5 months ago

Very cool! I’ll admit I did much less research than usual when I picked spchcat, and I wouldn’t be against trying a different STT tool. spchcat works quite easily, but it seems to be cutting off early, and I’m not sure if that’s a product of the software, a limitation of the Pi 3B, or some configuration I missed.