This Audio Editing Tool “Deep Faked” My Voice 👀 (Actually Useful or SCARY?)
I've been using a new tool, Descript, for editing audio and video, and it's definitely made my life a lot easier. It's actually a game-changer. However, within the tool, there's a feature called Overdub that allows the program to read text in your voice, without you having to record it. Let me show you:
So what do you think? Useful, or not?
Also, here's a link to a video where I show you more about how awesome this tool is for general editing for podcasts and videos:
Interested in AI and how it could potentially even write sales copy? Check out this video here:
-=-=-=-=-
Listen to podcasts? Here are a couple of podcasts that I personally host (with over 65 million combined downloads) that will help you on your online business journey:
The Smart Passive Income Podcast:
AskPat (these are real life business coaching calls):
Also, have you thought about starting your own podcast? Check out my famous podcasting tutorial here on YouTube, the #1 podcasting tutorial on the platform:
The podcasting equipment I use:
My video and live streaming equipment:
Also, follow me at these places below and say hi!
Personal site:
Instagram:
Twitter:
Cheers, and as always, #teamflynnforthewin
Before the creation of audio recordings, people probably didn’t know their voice sounded different than it did in their head.
This is so true. In my head I sound completely normal but when listening to it, I sound so stupid.
I sound fucking cool in my head, but I sound like a raspy cartoon dog in recordings.
@Dy1 😂
@Dy1 no, introvert thinking
The overdub sounds a little too clean when played right next to a recording. By itself, it’s pretty great, but side by side like that the difference is a bit jarring. I imagine at some point they’ll have the AI mimic any kind of mic and background noise when overdubbing. Not dissimilar to how you can add certain grains to photo/videos to mimic different films.
You could, however, clean those up in your base track beforehand.
Yeah, I heard the same thing. Some audio enhancements can probably smooth those out (e.g. compression).
Descript now has a new feature called Studio Sound that’s designed to clean up audio. It’s actually crazy how good it is. I’ve used it on some audio from my friend I used to do a podcast with. Unfortunately he recorded through a gaming headset into his phone so the quality was pretty terrible. Using their new Studio Sound feature, it legit sounds like he recorded it in a proper microphone with no background noise.
It already exists. It’s called iZotope Dialogue Match.
The overdub sounded like NileGreen to me.
Seems pretty useful. They just need to somehow add a hidden marker to the file to show that it was artificially generated, like an audio equivalent of a watermark that all microphones can pick up but people cannot hear.
wouldn’t be that hard to find and edit it out
The point is to make sure edited voice files can’t easily be used in court. Say it uses -Hz or 20,001+ Hz so that you don’t hear it. If there is a specific pattern of inaudible frequencies throughout the recording, it acts as a clean watermark without ruining the voice file.
@Meh And it would take 5 seconds in editing to remove.
@LilBitsDK Better than nothing. Would you rather remove the ‘watermark’ completely?
@Gwyneth Yes.
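The inaudible-watermark idea from this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Descript or any real tool marks audio: the 48 kHz sample rate, the 21 kHz marker tone, the marker level, and the function names `add_mark`, `tone_amplitude`, and `has_mark` are all made up for the example. Detection uses the Goertzel algorithm to measure a single frequency.

```python
import math

SAMPLE_RATE = 48_000   # Hz; assumed, must be more than twice MARK_FREQ
MARK_FREQ = 21_000     # Hz; assumed marker tone, above the ~20 kHz hearing limit
MARK_AMPLITUDE = 0.01  # assumed marker level, quiet relative to the voice


def add_mark(samples):
    """Mix a constant ultrasonic tone into the samples."""
    return [
        s + MARK_AMPLITUDE * math.sin(2 * math.pi * MARK_FREQ * n / SAMPLE_RATE)
        for n, s in enumerate(samples)
    ]


def tone_amplitude(samples, freq):
    """Estimate the amplitude of one frequency with the Goertzel algorithm."""
    n = len(samples)
    k = round(n * freq / SAMPLE_RATE)  # nearest DFT bin
    coeff = 2 * math.cos(2 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2  # squared magnitude of bin k
    return 2 * math.sqrt(max(power, 0.0)) / n


def has_mark(samples):
    """True if the marker tone is present at roughly its expected level."""
    return tone_amplitude(samples, MARK_FREQ) > MARK_AMPLITUDE / 2


# Stand-in for a voice recording: 0.1 s of a 220 Hz tone.
voice = [0.5 * math.sin(2 * math.pi * 220 * n / SAMPLE_RATE) for n in range(4800)]
marked = add_mark(voice)
print(has_mark(voice), has_mark(marked))
```

As the replies point out, a mark like this is fragile: a low-pass filter or resampling to a lower sample rate would strip the tone, so it is a weak safeguard at best.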
One day we’ll be able to make our own text-to-speech voice, and use it for everything we want. Probably even add it to our own virtual avatars/personas, for VR, etc.
Finally, Ren × Ryuji fans can hear Ren confess “His love” to Ryuji.
So that will be speech-to-speech? Like, the AI recognises what you’re saying, and before that you choose the voice you will talk with?
Deepfakes are about to reach a whole new level.
Now you can FAKE the voices easily too
Command voice activated
@Darren Hirst Your statement doesn’t make any sense. He could’ve just taken the thumbnail the next day or the day before he took the video, or used an entirely different picture. He said that he fed the AI information about his voice, so maybe while the AI was processing his voice, he could’ve taken a shower in between.
@Satoru Gojo I know. It was just a joke.
When he said fake my voice that’s why I mentioned the t shirt.
All in fun.
Deepfake already made us into cakes
This is getting closer to being able to import an audio file of any voice and recreate it perfectly. Imagine how good TTS things would be with fictional characters or famous people.
This could be massively useful for generating spoken dialog in RPG games, instead of having to read every word.
Yeah! Soon enough games can just generate random voices of non-existing people, which will make the development much cheaper and faster!
Wow, never thought of this. Nvidia has made AI facial animations as well.
YES. CRPGs can finally have dubbing on a small budget.
First faces of nonexistent people, now voices of nonexistent people, then bodies of nonexistent people. Throw it in a nerve-link VR game made with Unreal Engine 10, forget it’s not real life, then call the devs scanning for bugs UFOs. Imagine 😂😂😅😅🤔🤔 hmm.
This… this alone has pushed me over to the “I need this” realm
Literally, every single book author now can cheaply have audio versions of their books.
Was thinking the same. And we can redo the annoying videos with great information so they are easier to listen to. Take out the music, annoying voice sounds, etc. This is most useful.
Yah I once tried making audiobooks just for myself… this would have been so useful.
It will take years of perfecting the technology to make it pleasurable to listen to for longer periods of time. It doesn’t allow for contextual inflection, as in changing the pitch, speed and tone of the voice based on what’s going on. Basically you’d have to listen to a 15-hour audiobook in the same tone of voice. That would make me want to jump off a bridge. Though I do agree that in a decade, when this is (near) perfected, it will be awesome indeed.
It’ll sound horrible though.
would be OK for non-fiction…
Now imagine this is the future. You’re writing a novel and having it played back to you in your voice, with each of the characters sounding exactly as you imagine them because the AI saw the quote marks, looked ahead to see who was talking, and temporarily switched to that voice.
Furthermore, you could write a simple script and it would give you an AI-generated movie with any actor or actress you want, all for $30 a month.
Nice Idea bro
@Lozersheep AI will write books, so I think the conversation is over. We just give them instructions on what the book is about, kind of like what I am doing with CODEX now. The next 5-10 years are going to be crazy.
It seems you don’t have any clue about HOW AI actually works.
In order to have a voice, AI needs PARAMETERS, SAMPLES. Simply changing the pitch won’t work, and we have been able to change pitch in the studio since the beginning of time.
@Flavio Moro Furian A.I. might, but machine learning is machine learning because there are forms that are parameter-less. Samples, yes, of course, but the collective data we already have can do the OP’s idea very easily already today. We’ve had hundreds of thousands of existing audio recordings to train from for years.
To the OP: you only need a believable voice assigned to a character, not a forensically accurate voice. A trained tool can analyze and guess a large majority of the context of who is speaking. A config file can assign a character name to a voice. And there are ML models that can take guesses at the “intent” of the words (the emotion).
I bet an ML-generated short story is already possible, with voice talent. And we are pretty much there for an expressive, stylized video result too (not film quality yet, but short-film project quality is 90% there).
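The idea in this thread (spot the quote marks, look ahead to see who is talking, and pick that character's voice from a config) can be sketched roughly in Python. This is a toy illustration, not any real tool's method: the `VOICE_CONFIG` names and voice ids are hypothetical, it assumes straight double quotes, and it only recognizes simple tags such as "said Ana" or "Ben asked". A real system would need a trained model to attribute speakers.

```python
import re

# Hypothetical config mapping character names to voice ids (all made up).
VOICE_CONFIG = {"narrator": "author_voice", "Ana": "voice_a", "Ben": "voice_b"}

QUOTE = re.compile(r'"([^"]+)"')
SPEAKER = re.compile(
    r'\b(?:said|asked|replied)\s+([A-Z][a-z]+)'
    r'|\b([A-Z][a-z]+)\s+(?:said|asked|replied)\b'
)


def assign_voices(text, config=VOICE_CONFIG):
    """Split text into (voice, segment) pairs, switching voice on dialogue."""
    segments = []
    pos = 0
    for match in QUOTE.finditer(text):
        narration = text[pos:match.start()].strip()
        if narration:
            segments.append((config["narrator"], narration))
        # Look ahead, up to the next quote mark, for a speaker tag.
        next_quote = text.find('"', match.end())
        end = next_quote if next_quote != -1 else len(text)
        tag = SPEAKER.search(text[match.end():end])
        name = (tag.group(1) or tag.group(2)) if tag else None
        # Unknown or missing speakers fall back to the narrator voice.
        segments.append((config.get(name, config["narrator"]), match.group(1)))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        segments.append((config["narrator"], tail))
    return segments


sample = '"Hello," said Ana. "Who is there?" Ben asked. Nobody answered.'
for voice, segment in assign_voices(sample):
    print(f"{voice}: {segment}")
```

Each pair could then be handed to whatever text-to-speech voice the config names. Emotion or "intent" is left out entirely here.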
As a writer I would love to test this out with characters. I would love to test using one of my character’s voices and see if the AI will create a believable voice around that. UPDATE: it works with characters to an extent. You must keep in mind to speak clearly and when typing that grammar is phonetic as far as the AI is concerned.
Dude, this tool is freaking amazing! Whoever made this program, what an incredible mind behind its creation. I’m sure this’ll help anyone who doesn’t want to use their own voice, or on a day when you simply cannot talk normally but have your backup voice to do it for you. This is awesome!
I honestly expected voice deepfaking to pop up much, much sooner than graphical deepfaking.
I am not surprised in the slightest that it’s here now too.
It’s been here for a while. I remember using it 3 or 4 years ago when the first ones started coming out and became publicly available.
@Michael Caplin Can you tell me what the name of this program is and who can download it?
It did, but a video of a synthesized Tom Cruise voice isn’t as clickbaity as a video of his face but not his face
I’m a fiction writer, so this seems like it will be a good tool for recording audiobooks and editing them. Re-recording sentences and importing them never really works out because there’s always different background sound so I’d think this tool would be incredibly useful for helping even out those mess-ups so I don’t have to reread the entire paragraph.
Or just for audio previews.
Rain, have you found any program you like for this purpose?
Everyone should be absolutely terrified at the implications of this.
We all are, but at the same time we also can’t help but be curious
I run a VoiceOver business with my wife and we’ve had a few cases of people using Descript to recreate her voice for their own use. Like everything, it will be exploited because it’ll end up in the wrong hands. Luckily I think it’s not quite there in terms of fully replacing the human voice… but it’s not far away 🙁
It’s not
How can you convert audio to text accurately?
I wish I had enough of my mom’s voice recorded so I could hear her with this again since she passed.
I’m sorry. I hope she rests in peace.
Bro💀
yes😭
Wow
Aww. Man, the people in the comment section are thinking up all kinds of wonderful ideas to use this tech.
Where I see this being a HUGE benefit is for people who do audiobooks and read HUGE amounts of text. Another use is for people who voice over their videos. I think this is a great tool. Not creeped out at all by it.
This is fantastic for independent animators. Voice actors are expensive and to know that you can create a wide range of characters for next to nothing is awesome!
AI generated animations are awesome for independent voice over artists! Animators are expensive and to know that you can create a wide range of animations for next to nothing is awesome!
@Mark’s Voice I think I love you. People don’t realise how hard voice work can be, physically, mentally and technically. The equipment is expensive and we have to pay the bills somehow. You get to hoping other creative people would support us and our right to be paid fairly, but I guess not.
Wow. This thread escalated quickly.
@Johann Pascual so why don’t they use the real people over on casting call club website? There’s a bunch of really talented people over there who mostly don’t charge? It would be a symbiotic relationship. It helps give starting out Voice Actors practice and a portfolio, whereas the animators get the work for free. Why do we have to take the humans out of the equation completely?
@NickDoesCoolStuff Google Casting Call Club. Loads of young human voice actors there who will voice your stuff for practice (no charge)
What’s crazy is that this could also be used in the music industry. The more I think about it, based on what you’ve shown in the video, you could take someone’s vocal tonality and make it MIDI-based. You could type your lyrics into a plug-in, and every time a note is triggered it could read the next line on the MIDI grid, in the correct notation. This is absolutely fascinating and could work for metal vocals too.
Emvoice Pro
With various effect processing over the signal, the flatness/robotic nature of the voices could easily be masked. Especially true for music (like chorus and/or autotune).
That would be so cool
A liquid Drum and Bass album using the voices of Frank Sinatra and Nat King Cole. That would be wild, Endless possibilities.
From what I read recently, over 100 songs on the music charts currently are AI. It’s here.
Sounds really good. Hope this can help podcasters like us or even audiobook talent (imagine licensing their voice at scale and get paid for being the voice in 100 audiobooks per day) when it can do long-form audio very well (consistent on per chapter basis at minimum). This is a good example of how tools like this can synthesize our voice as part of synthetic media now available to use today. #syntheticmedia