At MOST, we support the Core Principles for Artificial Intelligence Applications as laid out by the Human Artistry Campaign. Permission, fair payment and transparency are widely recognized cornerstones of the ethical use of creative works, and they are reflected in these core principles.
Also, we are a boutique studio: we do specific, tailored, creative work – not mass automation.
If used responsibly and with care, AI voices can sometimes be a welcome addition to our palette. Below, we outline three ways in which we think AI can be a useful tool and deliver added value, in addition to the voice casting & recording services we offer.
AI has a huge environmental impact. If you use AI for voice generation, finalise scripts as much as possible before you generate. Think first, AI later.
With text-to-speech, you (or we) use written copy as input for the AI voice generation software and select a preset voice clone. The AI software generates an audio file based on the text and the general sound and delivery that comes with the selected voice. If the first result is not to your liking, you can rewrite the text phonetically to generate new takes that might have a different intonation. These takes can then be combined in audio software.
No permission or usage rights fees apply on our side. This is arranged between the voice talents who have allowed their voices to be cloned for the software and the software creators. As per EU law, there is an opt-out to prevent the software from using your input data (in this case, text only) to train the model.
This type of AI-generated voice can be helpful if you need something quickly and at low cost, in situations where ‘OK is good enough’. If you want to be able to fine-tune the results (as you could with a human voice talent), text-to-speech is best avoided.
With speech-to-speech, we start with a recording of the copy by a human voice. The voice can be your own, someone from our studio or a professional actor. But be aware: the speaker's native language should match the language of the copy, and the speaker should be experienced, since their performance will be the basis of the output. The voice character of this recording is then altered by AI.
For the recording that is used as an input: usually no buyout, only recording costs (and possibly a talent fee). When hiring a professional actor, a usage rights license will apply¹.
For the voice characters / voice profiles used: no extra fee. This is arranged between the AI software company and the voice talents who have allowed their voices to be cloned.
We always use the opt-out clause to make sure the audio we upload into the AI engine is not used for training.
This type of gen-AI voice is a good ‘in-between’ solution, when you need flexibility in the voice character but also want full control over the intonation. This also allows one voice talent to perform multiple roles with different voices.
In the hybrid approach, we go beyond text-to-speech and speech-to-speech and also take control of the voice character used for the output. We do this by cloning the voice of a professional voice actor (with permission, of course). From that moment, we can use text-to-speech and speech-to-speech (with e.g. our own voice as input) to create copy with the sound of that voice. Of course, we still have the option not to use AI and to record copy with the VO talent, for instance for very critical applications.
Permission and usage rights fees should be negotiated with the voice talent beforehand. There will be a fee for the first recordings and after that a license fee for the use of the voice clone, similar to current buyouts. We always use the opt-out clause to make sure the audio we upload into the AI engine is not used for training.
This type of use is good for high-end, critical VO applications where you also need a lot of variations over time, such as ‘brand voice’ applications. It will be cheaper than regular recording sessions with the same VO talent, since fewer recording sessions are needed.
We like to use AI voices to extend our possibilities and make the good even better. There are many creative ways to use AI voices, but not all of them sound good or communicate with real emotion. The words are there, but the story isn’t told. In this Note, we talk about a specially tailored workflow for Het Utrechts Archief, in which we combine voice acting and AI speech-to-speech to create a whole exhibition.
⁰¹ Text-to-speech
Input: Text
Pros: Quick; Cheap; You can do it yourself, and we can do it for you; Many different voices available (less so in Dutch)
Cons: Output is unpredictable; Trial-and-error is needed to improve results; Standard available voices are non-exclusive
Permission & usage rights fees: No fees. This is between the voice talents that have allowed their voices to be cloned for the software, and the software creators. Opt-out so our (text) input is not used for training.
When to use: “When OK is good enough” – Guide VO, instructions, if text interpretation is not important.

⁰² Speech-to-speech
Input: Audio
Pros: More control over the performance and intonation; Many different voices available (less so in Dutch); Natural sounding
Cons: A quality recording is needed; Less flexibility in adjusting copy; Standard available voices are non-exclusive
Permission & usage rights fees: For the recording that is used as an input: no buyout, only recording costs (and possibly talent fee). When hiring a professional actor, a usage rights license will apply. For the AI voice profiles used: no extra fee. Opt-out so our (audio) input is not used for training.
When to use: Mid-level projects and/or when you need multiple voice characters on a budget.

⁰³ Hybrid
Input: Audio and/or text
Pros: Combines traditional, high-quality recordings with flexibility of AI-generated voices; The voice can be specific and exclusive for a brand or client; Highly flexible – start with a real voice talent, finish with AI
Cons: Creating a custom clone takes time in the beginning, but saves time in the end; Not all voice talents are open to this approach
Permission & usage rights fees: To be negotiated with the voice talent beforehand. There will be a fee for the first recordings and after that a license fee for the use of the voice clone, similar to current buyouts. Opt-out so our (text and audio) input is not used for training.
When to use: High-end projects with lots of variations over time (for one-off projects, just record a human).
¹ Usage rights licenses on request per use case.