Fallout 2 mod RPU - Talking Heads Addon

Okay, let me know if you need anything!
It's better to write to me on Nexus or on Discord (curhaj#7270);
I don't really check this forum often.
 
So I initially went about this all wrong. I was trying to find a library that could do phoneme extraction straight from the audio file alone, but determining that sort of sequence from audio by itself is the subject of many PhD theses, and the couple of APIs out there don't work that well or require a ton of machine learning training.

However, because this is all scripted and we have the text files for the speech, I should be using the 'forced alignment' approach, where the script is used as a reference for identifying when each sound in the file is made. Fortunately there are tools out there that work really well for this.

I think I might be able to generate my first .LIP file by the end of the week if I can get this working tomorrow.
 
That's awesome, you know your stuff bud :clap:
If I can help you in any way let me know.
Do you know where to get the text for the dialogue?
 
Big progress this morning. Managed to finally get the forced alignment script running. Ran it on the sound file for a certain sergeant we all know and love. Below is a visualization of the phoneme extractions with the time stamps for each sound on the left. I can use the extracted .grd file to build the .LIP file needed for the animation timing. Will work on creating the script for the .LIP file generation tonight.

RpMnkyo.png
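A rough sketch of how aligned phoneme timings could be turned into timed mouth frames for a .LIP generator. Everything here is an assumption for illustration: the `PHONEME_TO_FRAME` table, the `Interval` type, and the idea of expressing timing as sample offsets are hypothetical and are not taken from the actual script or from the .LIP specification.

```python
from typing import NamedTuple

# Hypothetical phoneme-to-mouth-frame table. The real frame numbering
# used by the talking-head FRMs is NOT taken from the thread.
PHONEME_TO_FRAME = {
    "sp": 0,                      # short pause: mouth closed
    "M": 1, "B": 1, "P": 1,       # lips pressed together
    "AA": 2, "AE": 2, "AH": 2,    # open vowels
    "IY": 3, "IH": 3, "EH": 3,    # spread vowels
    "OW": 4, "UW": 4, "AO": 4,    # rounded vowels
    "F": 5, "V": 5,               # lip against teeth
}
DEFAULT_FRAME = 6                 # fallback for any label not listed above


class Interval(NamedTuple):
    start: float   # seconds
    end: float     # seconds
    phoneme: str   # aligner label, possibly with a stress digit ("AE1")


def to_markers(intervals, sample_rate):
    """Return (sample_offset, frame) pairs, one per change of mouth frame."""
    markers = []
    for iv in intervals:
        label = iv.phoneme.rstrip("012")          # drop the stress digit
        frame = PHONEME_TO_FRAME.get(label, DEFAULT_FRAME)
        offset = round(iv.start * sample_rate)    # seconds -> sample index
        # Skip repeats so the same mouth shape is not re-triggered.
        if markers and markers[-1][1] == frame:
            continue
        markers.append((offset, frame))
    return markers
```

The intervals would come from whatever the aligner writes out; parsing that file is left out here because its exact layout is not shown in the thread.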


The text should already be in the game files, right? Any NPC with dialogue sources its text either from the talking heads directory with the .ACM and .LIP files, or from \data\text\english\dialog, where most of the dialogue is stored. We would just extract the script for each line of dialogue.
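A minimal sketch of that extraction step, assuming the dialogue files use one `{line number}{audio file name}{spoken text}` entry per line of dialogue. That layout, the function names, and the sample audio name `hak001` in the usage below are assumptions for illustration, not something confirmed in this thread.

```python
import re
from pathlib import Path

# Assumed entry layout: {line number}{audio file name}{spoken text}
ENTRY = re.compile(r"\{(\d+)\}\{([^}]*)\}\{([^}]*)\}")


def parse_msg(text):
    """Return a list of (line_id, audio_name, spoken_text) tuples."""
    entries = []
    for line_id, audio, spoken in ENTRY.findall(text):
        spoken = " ".join(spoken.split())   # collapse wrapped lines
        entries.append((int(line_id), audio.strip(), spoken))
    return entries


def write_transcripts(entries, out_dir):
    """Write one .txt per voiced entry, named after its audio file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for _, audio, spoken in entries:
        if not audio:            # unvoiced line: nothing to align
            continue
        path = out_dir / f"{audio}.txt"
        path.write_text(spoken, encoding="utf-8")
        written.append(path)
    return written
```

For example, `parse_msg("{101}{hak001}{Greetings, Chosen One.}")` would yield a single entry whose transcript is written to `hak001.txt`, ready to be paired with the matching recording.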
 

Looks great. One of my pet peeves with the lip-making software is the lack of visual guidance. Just being able to see the waveform would help so much.
 
You won't actually see or need the waveform visual. That's from Praat, which I used to check the extracted phoneme timing against the sound file.

My end goal here is to take just the voice actor's .wav files and the NPC's dialogue .txt files and batch process them through a script to auto-generate the .lip files, with no need for any manual lip syncing. It should take a few minutes to generate the lip sync files for all of a character's dialogue.
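A sketch of what the batch step might look like, assuming each recording and its transcript share a base name. The `align.py` command and its argument order (audio, transcript, output) are placeholders for whichever aligner is used, and the `.grd` output extension simply follows the naming used earlier in the thread.

```python
from pathlib import Path


def pair_inputs(wav_dir, txt_dir):
    """Match each .wav to the .txt with the same base name (case-insensitive)."""
    transcripts = {p.stem.lower(): p for p in Path(txt_dir).glob("*.txt")}
    pairs, missing = [], []
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        txt = transcripts.get(wav.stem.lower())
        if txt is None:
            missing.append(wav)      # recorded line with no transcript
        else:
            pairs.append((wav, txt))
    return pairs, missing


def build_commands(pairs, out_dir, aligner=("python", "align.py"), ext=".grd"):
    """Build one aligner command per pair; nothing is executed here."""
    out_dir = Path(out_dir)
    commands = []
    for wav, txt in pairs:
        out = out_dir / (wav.stem + ext)
        # Assumed argument order: audio, transcript, output file.
        commands.append([*aligner, str(wav), str(txt), str(out)])
    return commands
```

Building the commands separately from running them makes it easy to print and eyeball the list first; each entry can then be handed to `subprocess.run`, and anything in `missing` flags a line that still needs a transcript.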
 
giphy.gif
 
If you can pull this off it will be amazing!
 
Welp, I managed to slap together a first demo of the automatic lip sync generation... looks like Hakunin votes Democrat. Please forgive the absolutely shit video quality; I need to adjust the screen grab settings for Fallout 2 and do a better demo later.

I was stumped for quite some time on why my .acm wasn't playing back correctly, until I realized that the talking head .acm files need to have the correct bit rate encoded into the .wav file before you convert it. This doesn't matter for music, but for speech the bit rate is used by the program as part of the playback. Maybe that's how they artificially deepened Frank H's voice.
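A quick way to catch this before converting might be to read the .wav header and compare it against the values you expect. This is only a sketch using Python's standard `wave` module; the function names are made up for illustration, and the expected values are whatever your conversion needs, which the thread does not state.

```python
import wave


def describe_wav(path):
    """Read the header fields of a PCM .wav file."""
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()   # bytes per sample
        sample_rate = wav.getframerate()    # frames per second
        frames = wav.getnframes()
    return {
        "channels": channels,
        "bits_per_sample": sample_width * 8,
        "sample_rate": sample_rate,
        # Uncompressed PCM bit rate in bits per second.
        "bit_rate": sample_rate * sample_width * 8 * channels,
        "duration_s": frames / sample_rate if sample_rate else 0.0,
    }


def find_mismatches(path, expected):
    """Return {field: (expected, actual)} for every field that differs."""
    actual = describe_wav(path)
    return {
        key: (want, actual[key])
        for key, want in expected.items()
        if actual[key] != want
    }
```

An empty result from `find_mismatches` means the header matches; anything else lists the field to fix before running the .wav through the .acm converter.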

This is the third night in a row I have stayed up way too late working on this though so time to catch up on sleep. :zzz:

EDIT: Also made half a dozen updates to the .LIP fallout wiki page. Discovered what a couple of the mysterious, undocumented fields do while poking around in the binaries to write the script generator: https://falloutmods.fandom.com/wiki/LIP_File_Format

 
This is amazing man, seriously.
 
@Goat_Boy I'll take the assets (.frms / anything else you may have) for Klint if you don't mind. I'll work on voice acting his lines and start adding the batch process capability for multiple files at a time.

I would ultimately like to be able to auto-extract all lines of dialogue for a given character into individual text files, to make collecting the dialogue requirements easier. Even more ambitious, I want to have a script auto-update the game files by automatically adding the generated .LIP, .FRM, and .acm files and updating the code base to add the talking head support in one shot. The ideal workflow would be:

1. Run the dialogue extractor for the NPC
2. Collect the voice acting WAVs
3. Collect the new .FRM artwork
4. Make small hand adjustments to the dialogue .txt files to mark periods of silence / noise (e.g. a heavy breath) for better forced alignment
5. Run the forced alignment script
6. *optional* Review the extracted .grd files in Praat to ensure the alignment against the sound file looks good
7. Run the .lip extractor to generate the .lip files and automatically integrate everything into the game files.

This last step will be a bit tricky; I would just plan on manually adding everything and updating the game scripts by hand, but the hard part, the lip sync itself, will be automatic from now on.

I found out from an interview with Tim Cain that it would take them on the order of a couple months to do each talking head. Now we can do the lip sync in a few minutes.
 
@Goat_Boy I'll take the assets (.frms / anything else you may have) for Klint if you don't mind.

Of course, let me know if there's anything else you need. I imagine you'll get all the needed files from the talking head mod but I uploaded Klint's FRMs on this post just in case.
Do you know how to add a talking head to a script or will you be using this mod for testing?

I found out from an interview with Tim Cain that it would take them on the order of a couple months to do each talking head. Now we can do the lip sync in a few minutes.

Imagine if you did this back in the day. They would have made you CEO lol
 

Do you know how to add a talking head to a script or will you be using this mod for testing?
I don't know how to do this yet, which is part of the reason I wanted to get hold of one of your NPCs: to get familiar with adding them and then with integrating the speech.


Imagine if you did this back in the day.

Unfortunately the speech technology back then wasn't nearly as advanced as it is today. I used a readily available Python script called p2fa to do the forced alignment, which is arguably the hardest part. Those poor guys back in the day had to do everything by hand.
 
Do you want a quick crash course? Assigning a talking head to a script is pretty easy. Are you using vanilla Fallout 2?
 
It's not quite vanilla, but yeah, if you wouldn't mind giving me an overview when you have time. I'm probably going to take a break for a couple of days, then maybe start on Klint over the weekend.
 
I couldn't resist doing another one. I wanted to evaluate the robustness of the tool against song vocals. Phoneme extraction is much harder for singing, so it required some hand editing. (The last demo extracted perfectly; singing just distorts the syllables, and the fast tempo means you get a lot of false identifications.)

The animations would look a lot cleaner too if we weren't restricted to just the 9 frames but I think this looks pretty good all things considered.

 