I have a friend who is a big fan of listening to talk shows on the radio. This friend, however, is currently working in a very remote part of South Africa and cannot always listen to his favourite shows. Fortunately for him, the station offers podcasts of their shows, downloadable from the website. But because they encode the files in very high quality, the files are too big for him to download on his poor cellphone connection out in the desert.
Since I'm a big believer in efficiency, I created a script that periodically checks for new content (they have a podcast RSS feed), downloads the file, re-encodes it to a much lower bitrate in a more efficient codec, and sends it to him on Telegram. I did some reading, experimented with different codecs and encoding parameters, and found that a 16k variable bitrate with the Opus codec delivers the smallest files while still sounding clear. As these are talk shows, the quality requirements are not very high. If you want to do something similar for music shows, you're probably going to have to up that bitrate.
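For the "check for new content" step, here is a minimal sketch using only the standard library. It assumes a standard podcast RSS feed with `<enclosure>` tags; the `seen` set is a hypothetical stand-in for whatever persistence you use to remember already-processed episodes:

```python
import urllib.request
import xml.etree.ElementTree as ET


def new_episode_urls(feed_xml: str, seen: set) -> list:
    """Return enclosure URLs from the feed that haven't been processed yet."""
    root = ET.fromstring(feed_xml)
    urls = []
    for item in root.iter('item'):
        enclosure = item.find('enclosure')
        if enclosure is None:
            continue
        url = enclosure.get('url')
        if url and url not in seen:
            urls.append(url)
    return urls


def fetch_feed(feed_url: str) -> str:
    """Download the raw RSS XML for the podcast feed."""
    with urllib.request.urlopen(feed_url) as resp:
        return resp.read().decode('utf-8')
```

Run `new_episode_urls(fetch_feed(FEED_URL), seen)` on a schedule (cron works fine) and hand each new URL to the conversion step below.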
Interestingly, I found that changing the sampling rate from 48 kHz to 16 kHz has almost no effect on the final size. Perhaps I'm just missing something - I am not an expert on media encoding or the ins and outs of each codec.
Here is an example:
```
$ du podcast.mp3
28448   podcast.mp3
```
ffprobe to check the original file:
```
$ ffprobe podcast.mp3
Input #0, mp3, from 'podcast.mp3':
  Duration: 00:27:07.35, start: 0.000000, bitrate: 128 kb/s
    Stream #0:0: Audio: mp3, 48000 Hz, stereo, s16p, 128 kb/s
    Stream #0:1: Video: mjpeg, yuvj444p(pc, bt470bg/unknown/unknown), 245x245, 90k tbr, 90k tbn, 90k tbc
```
A 128 kb/s, 48,000 Hz MP3 is overkill when it's just people talking.
Here it is after conversion:
```
$ du podcast.ogg
2888    podcast.ogg

$ ffprobe podcast.ogg
Input #0, ogg, from 'podcast.ogg':
  Duration: 00:27:07.35, start: 0.000000, bitrate: 16 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
```
The file size went from 28.45 MB to 2.89 MB - just 10% of the original size. This is in line with the other files I've processed: the average saving is about 90%.
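A quick sanity check of that figure, using the two `du` sizes from the listings above:

```python
# Sizes as reported by du (kilobytes) for the original and converted files.
original_kb = 28448
converted_kb = 2888

ratio = converted_kb / original_kb   # roughly 10% of the original
saving = 1 - ratio                   # roughly a 90% saving
print(f"{ratio:.1%} of the original size, a {saving:.1%} saving")
```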
Here's the core of the code in Python. I'm using the Python Telegram Bot API library.
```python
import subprocess
from io import BytesIO


def post_audio_link(media_url, bot, chat_id, **api_kwargs):
    converter = subprocess.run(
        [
            'ffmpeg', '-loglevel', 'error',
            '-i', media_url,
            '-c:a', 'libopus',
            '-b:a', '16k',          # bit rate
            '-vbr', 'on',           # variable bit rate
            '-map_metadata', '0',   # copy metadata
            '-f', 'opus',           # need this as we don't specify an output file
            '-',                    # pipe output to stdout
        ],
        stdout=subprocess.PIPE,
    )
    bot.api.send_audio(
        chat_id=chat_id,
        audio=BytesIO(converter.stdout),
        timeout=50,
        title='xxx',
        **api_kwargs,
    )
```
Take note of the following:

- You can pipe to and from `ffmpeg`. You can also supply a URL as input to `ffmpeg`.
- You have to specify the `-f` format flag when not writing to a file.
- `ffmpeg` discards the file metadata by default, so use `-map_metadata` to copy it.
- `libopus` supports only the following sampling rates: 48000, 24000, 16000, 12000, 8000.
- Using a variable bit rate when using `libopus` at a low bitrate is essential to keep the quality decent.
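To illustrate the first point - piping both into and out of `ffmpeg` - here is a small sketch. The `reencode` function assumes an `ffmpeg` build with `libopus` is on your PATH; `opus_cmd` just builds the argument list, with `pipe:0`/`pipe:1` as the explicit stdin/stdout forms of `-`:

```python
import subprocess


def opus_cmd(bitrate='16k'):
    """Build the ffmpeg argument list for a stdin-to-stdout Opus re-encode."""
    return [
        'ffmpeg', '-loglevel', 'error',
        '-i', 'pipe:0',                       # read the input from stdin
        '-c:a', 'libopus', '-b:a', bitrate,
        '-vbr', 'on',
        '-f', 'opus',                         # required: no output filename to infer from
        'pipe:1',                             # write the output to stdout
    ]


def reencode(audio_bytes: bytes, bitrate='16k') -> bytes:
    """Pipe audio into ffmpeg and return the re-encoded Opus bytes."""
    result = subprocess.run(
        opus_cmd(bitrate),
        input=audio_bytes,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```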
This system is working really well. My friend gets a push notification whenever a new podcast is available, and with Telegram being as useful as it is, he just needs to tap the play button and the audio starts playing while the download streams. This is probably the easiest and most user-friendly podcast distribution model there is.
I think radio stations, especially those in South Africa where data is incredibly expensive and many people don't have access to high-speed networks, should offer multiple download options for their podcasts, where at least one is encoded in the most compressed form their managers will allow.
If you are in the business of making or distributing podcasts, or any media content for that matter, let me know if you want me to help build you a better distribution model for your content.