The Future is getting closer – artificial intelligence and audio books
Audio books are becoming an increasingly popular way for people to consume their books and are now the fastest-growing area in the book world. Digital books, once the new kid on the block, are now an established format. Now it is the turn of the audio book, but where a digital book was quick and cheap to produce, an audio book is a very different beast.
The cost of an audio book
Having written and edited the words on a page, you now need them to be spoken. This costs. This costs a lot. I’ve had quotes for $6000.00; they weren’t considered extortionate. Then, of course, those files need to be checked for both quality and mistakes. Then the sound files need to be processed and accepted by the distribution platform. This can take months.
So, an audio book costs time and money and, as I’ve said, a lot of money.
Is it worth it?
Well, audio is a booming market, but an audio book is expensive for the consumer to buy. Troy by Stephen Fry costs £22.00. However, lots of people choose to stream at a fraction of the cost. With the publisher only receiving a pound or two per stream, and often less, it takes a hell of a long time to recoup the initial outlay of thousands of pounds.
This is why lots of new authors don’t get an audio deal until their publisher is confident that they can make a profit.
A new solution
In a fast-changing market, the COVID-19 pandemic has accelerated the take-up of new technologies, and one of these is GPT-3, a natural language processing tool. Soon an author will be able to synthesise their own voice or use an existing artificial voice to read their book. A job that would take a human many days can be done glitch-free in minutes. I’ve listened to the latest voices and they aren’t bad at all. In another year, they may be indistinguishable from a human reader to the casual listener. As recording speeds up, so too do editing and processing, so everything gets cheaper.
Pros and Cons
Well, for voice artists, this will become an issue. They won’t be able to compete on price, but as technologies improve, they will be able to reduce their time and fees. Where they will remain strong, however, is in their humanity. Those with real talent will continue to shine and remain in demand. Nothing, after all, beats the real thing. Quality will always outshine quantity.
- For the author/publisher: they will now be able to enter the audio book market without too great a financial risk.
- For the reader: they will hear the book however they want.
For example, they can pay a premium price to listen to the real Stephen Fry read a book.
Or they can pay slightly less for a synthesised, licensed voice. Imagine if Stephen Fry licensed his voice. Producers could then hire that voice to read the book.
For an even cheaper product, they could just listen to a generic synthetic reader, and this is where the flexibility of new technologies will explode. Imagine now that, as the listener, you could listen to a voice of your choice.
Male, female, young, old, Kenyan, Polish, English, American and so on. I don’t know about you, but it jars when the voice reading a book completely fails to sync with the voice in my head or the “voice” of the author.
In the future, I could just log into Spotify or Audible, select a book and then choose the narrator. If I wanted a human, I’d pay more; if I simply wanted a voice that chimed with my own, I could flick through a library of synthetics and proceed with the download. I reckon this will be with us within five years.
Interesting times indeed.
To investigate the world of artificial intelligence and synthetic voices, have a read of Joanna Penn’s blogs or listen to her podcasts on the subject.