Comments (7)
Relax, bro. Make sure to set up your environment exactly as described in the README (torch 2.0.1, that particular version of audiocraft, etc.). This error comes from how you installed audiocraft: make sure you install the correct version of it and, especially, the correct version of xformers. Installing xformers 0.0.22 fixes it.
from voicecraft.
Check out the Docker quick start; it should work on Windows: https://github.com/jasonppy/VoiceCraft?tab=readme-ov-file#quickstart
An update after trying the Docker setup @jasonppy provided.
Yeah, I think I was exaggerating in that post, after hours of troubleshooting that just ended in failure.
Let's talk about Docker first. The Docker setup looks pretty complete, except that Issues 1 and 2 still showed up during the process. So I switched back to the conventional setup, the one without Docker.
I managed to fix Issue 1 (wrong folder order), Issue 2 (wrong order of environment setup) and Issue 4 (basically not supported on Windows) in my conventional setup.
But now there are new issues, two actually.
- The `source` command isn't supposed to exist on Windows. You can pretty much get away with `wget`, and I believe the rest can be replicated with other commands that Windows supports.
- Issue 3 in depth
I rechecked the package list with `conda list`, following @ajayarora1235's comment. Since I reinstalled everything in the correct order, my `torch` is now 2.0.1 and `transformers` is 4.38.2.
The issue is that `xformers` remained at 0.0.25.post1, not 0.0.22 as specified in the environment.yml.
I have already tried wiping xformers 0.0.25.post1 with `conda remove` and `pip uninstall`, but they always point at the 0.0.22 one instead, and I have no idea why. I think Python is somehow still using xformers 0.0.25.post1 in the new Cell 7, which fails with `AttributeError: module 'torch' has no attribute 'compiler'`.
Possibly `conda list` merges the lists for both the `base` and `voicecraft` environments? So far I am still trying to pull out xformers 0.0.25.post1 to make Cell 7 work.
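One way to test the guess that the notebook kernel and `conda list` are looking at different environments is to ask the running interpreter directly. The sketch below is only an illustration, not something from the VoiceCraft repo: the helper name `installed_versions` is made up, and it relies on nothing beyond the standard library's `importlib.metadata`. Run it inside the notebook cell and again in Anaconda Prompt; if the interpreter path or the xformers version differs, the two are not using the same environment.

```python
import sys
from importlib import metadata

# Packages named in this thread; extend the tuple as needed.
PACKAGES = ("torch", "transformers", "xformers")


def installed_versions(packages=PACKAGES):
    """Map each package name to the version this interpreter sees, or None."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # sys.executable shows which environment this process really runs in.
    print("interpreter:", sys.executable)
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```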
FOR THOSE WHO WANT TO KNOW THE SOLUTION FOR ISSUE 1, ISSUE 2 AND ISSUE 4
Issue 1
Your folder must be laid out like this so that inference_tts.ipynb can find the data folder when you run Cell 5 and the other cells:
(wherever you keep your stuff)
|
|- VoiceCraft
   |- data
   |- demo
   |- pretrained_models
   |- src
   |  |- audiocraft (and things like audiocraft.egg-info)
   |- z_scripts
   |- inference_tts.ipynb
The first VoiceCraft folder is the folder you get from git clone https://github.com/jasonppy/VoiceCraft.git
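A quick way to confirm the layout before opening the notebook is to check for the entries listed above. This is a minimal sketch, assuming it is run from inside the cloned VoiceCraft folder; `missing_entries` and the `EXPECTED` tuple are hypothetical names, with the entries copied from the layout in this comment.

```python
from pathlib import Path

# Entries taken from the folder layout described above.
EXPECTED = (
    "data",
    "demo",
    "pretrained_models",
    "src/audiocraft",
    "z_scripts",
    "inference_tts.ipynb",
)


def missing_entries(repo_root, expected=EXPECTED):
    """Return the expected entries that do not exist under repo_root."""
    root = Path(repo_root)
    return [entry for entry in expected if not (root / entry).exists()]


if __name__ == "__main__":
    # Assumes the current working directory is the cloned VoiceCraft folder.
    missing = missing_entries(Path.cwd())
    print("layout looks fine" if not missing else f"missing: {missing}")
```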
Issue 2
This one, along with the Cell 5 import (below), is caused by an installation problem. Once you fix your installation, VSCode should recognize names like `data` and `model`.
from data.tokenizer import (
    AudioTokenizer,
    TextTokenizer,
)
My investigation found that `pip install -e git+https://github.com/facebookresearch/audiocraft.git@c5157b5bf14bf83449c17ea1eeb66c19fb4bc7f0#egg=audiocraft` pulls in torch 2.2.2 and the latest version of pretty much everything else that comes along with torch. So run that command first, then run `pip install torch==2.0.1`, then `winget install ffmpeg`, and so on.
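The order described above can be written down as a small script so the steps are not shuffled by accident. This is only a sketch: `install_steps` and `run_steps` are made-up helper names, and calling pip as `python -m pip` through `sys.executable` is my assumption (it keeps pip tied to the interpreter of the active environment), not something the comment prescribes. By default it prints the commands without running them.

```python
import subprocess
import sys

# The pinned audiocraft commit quoted in the comment above.
AUDIOCRAFT = (
    "git+https://github.com/facebookresearch/audiocraft.git"
    "@c5157b5bf14bf83449c17ea1eeb66c19fb4bc7f0#egg=audiocraft"
)


def install_steps(python=sys.executable):
    """Return the commands from the comment above, in the order that worked."""
    return [
        # 1. audiocraft first: this step pulls in a newer torch.
        [python, "-m", "pip", "install", "-e", AUDIOCRAFT],
        # 2. Then pin torch back to the version the project expects.
        [python, "-m", "pip", "install", "torch==2.0.1"],
        # 3. ffmpeg through winget (Windows only).
        ["winget", "install", "ffmpeg"],
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for step in steps:
        print(" ".join(step))
        if not dry_run:
            subprocess.run(step, check=True)


if __name__ == "__main__":
    run_steps(install_steps())  # dry run: prints the commands only
```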
Issue 4
`apt-get` does not exist on Windows, so you need to find an alternative. Fortunately for me and everyone reading this, you can install ffmpeg with `winget install ffmpeg`, which is the Windows alternative. For espeak-ng, download the latest release from https://github.com/espeak-ng/espeak-ng and add espeak-ng to PATH manually (bootphon/phonemizer#44 (comment)).
Man, this thing is giving me nightmares.
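After editing PATH it is easy to verify from Python whether the tools are actually visible to the process. A minimal sketch using the standard library's `shutil.which`; the executable names `ffmpeg` and `espeak-ng` are assumptions about what the installers provide, and `locate_tools` is a hypothetical helper. Restart the terminal (and VSCode) first, since PATH changes usually do not reach processes that are already running.

```python
import shutil

# Assumed executable names; adjust if your install uses different ones.
TOOLS = ("ffmpeg", "espeak-ng")


def locate_tools(tools=TOOLS):
    """Map each tool name to its resolved path on PATH, or None."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in locate_tools().items():
        print(f"{tool}: {path or 'NOT on PATH'}")
```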
The hell... I just casually moved on to another fork that was released recently.
When I casually ran `pip install xformers==0.0.20` (NOT the 0.0.22 I mentioned earlier) in Anaconda Prompt itself, not in VSCode's terminal, it was able to uninstall xformers 0.0.25.post1 and get past `AttributeError: module 'torch' has no attribute 'compiler'` in Cell 7.
Another error shows up instead:
AttributeError: module 'os' has no attribute 'uname'
I looked around the web and in cluster.py and found that `os.uname` isn't available on Windows. Thanks to some source, I added `import platform` to cluster.py and changed `uname = os.uname()` to `uname = platform.uname()`.
This error shows up after the modification:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[2], line 25
     22 phn2num = ckpt['phn2num']
     24 text_tokenizer = TextTokenizer(backend="espeak")
---> 25 audio_tokenizer = AudioTokenizer(signature=encodec_fn, device=device)

File c:\Users\PEY3C\TTS\VoiceCraft\data\tokenizer.py:110, in AudioTokenizer.__init__(self, device, signature)
    104 def __init__(
    105 self,
    106 device: Any = None,
    107 signature = None
    108 ) -> None:
    109 from audiocraft.solvers import CompressionSolver
--> 110 model = CompressionSolver.model_from_checkpoint(signature)
    111 self.sample_rate = model.sample_rate
    112 self.channels = model.channels

File c:\users\pey3c\tts\voicecraft\src\audiocraft\audiocraft\solvers\compression.py:287, in CompressionSolver.model_from_checkpoint(checkpoint_path, device)
    285 logger = logging.getLogger(__name__)
    286 logger.info(f"Loading compression model from checkpoint: {checkpoint_path}")
--> 287 _checkpoint_path = checkpoint.resolve_checkpoint_path(checkpoint_path, use_fsdp=False)
    288 assert _checkpoint_path is not None, f"Could not resolve compression model checkpoint path: {checkpoint_path}"
    289 state = checkpoint.load_checkpoint(_checkpoint_path)

File c:\users\pey3c\tts\voicecraft\src\audiocraft\audiocraft\utils\checkpoint.py:68, in resolve_checkpoint_path(sig_or_path, name, use_fsdp)
     56 def resolve_checkpoint_path(sig_or_path: tp.Union[Path, str], name: tp.Optional[str] = None,
     57 use_fsdp: bool = False) -> tp.Optional[Path]:
     58 """Resolve a given checkpoint path for a provided dora sig or path.
     59
     60 Args:
   (...)
     66 Path, optional: Resolved checkpoint path, if it exists.
     67 """
---> 68 from audiocraft import train
     69 xps_root = train.main.dora.dir / 'xps'
     70 sig_or_path = str(sig_or_path)

File c:\users\pey3c\tts\voicecraft\src\audiocraft\audiocraft\train.py:149
    144 return
    146 return solver.run()
--> 149 main.dora.dir = AudioCraftEnvironment.get_dora_dir()
    150 main._base_cfg.slurm = get_slurm_parameters(main._base_cfg.slurm)
    152 if main.dora.shared is not None and not os.access(main.dora.shared, os.R_OK):

File c:\users\pey3c\tts\voicecraft\src\audiocraft\audiocraft\environment.py:108, in AudioCraftEnvironment.get_dora_dir(cls)
    103 @classmethod
    104 def get_dora_dir(cls) -> Path:
    105 """Gets the path to the dora directory for the current team and cluster.
    106 Value is overridden by the AUDIOCRAFT_DORA_DIR env var.
    107 """
--> 108 cluster_config = cls.instance()._get_cluster_config()
    109 dora_dir = os.getenv("AUDIOCRAFT_DORA_DIR", cluster_config["dora_dir"])
    110 logger.warning(f"Dora directory: {dora_dir}")

File c:\users\pey3c\tts\voicecraft\src\audiocraft\audiocraft\environment.py:81, in AudioCraftEnvironment.instance(cls)
     78 @classmethod
     79 def instance(cls):
     80 if cls._instance is None:
---> 81 cls._instance = cls()
     82 return cls._instance

File c:\users\pey3c\tts\voicecraft\src\audiocraft\audiocraft\environment.py:52, in AudioCraftEnvironment.__init__(self)
     50 """Loads configuration."""
     51 self.team: str = os.getenv("AUDIOCRAFT_TEAM", self.DEFAULT_TEAM)
---> 52 cluster_type = _guess_cluster_type()
     53 cluster = os.getenv(
     54 "AUDIOCRAFT_CLUSTER", cluster_type.value
     55 )
     56 logger.info("Detecting cluster type %s", cluster_type)

File c:\users\pey3c\tts\voicecraft\src\audiocraft\audiocraft\utils\cluster.py:31, in _guess_cluster_type()
     29 uname = platform.uname()
     30 fqdn = socket.getfqdn()
---> 31 if uname.sysname == "Linux" and (uname.release.endswith("-aws") or ".ec2" in fqdn):
     32 return ClusterType.AWS
     34 if fqdn.endswith(".fair"):

AttributeError: 'uname_result' object has no attribute 'sysname'
So, OK... I think the plot thickened. It makes me feel that this is definitely designed for Linux rather than Windows.
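The second AttributeError follows from a naming difference between the two calls: `os.uname()` exposes `sysname`, `nodename`, `release`, `version` and `machine`, while `platform.uname()` names the first two fields `system` and `node`. Code written against `os.uname()`, like the `uname.sysname` check at cluster.py line 31 in the traceback, therefore breaks after a plain swap. One possible workaround is a small adapter that returns the `platform.uname()` data under the `os.uname()` field names. This is a sketch under that assumption, not the fix audiocraft itself uses, and `portable_uname` is a hypothetical name.

```python
import platform
from collections import namedtuple

# Same field names that os.uname() uses on Linux.
PosixStyleUname = namedtuple(
    "PosixStyleUname", ["sysname", "nodename", "release", "version", "machine"]
)


def portable_uname():
    """Return platform.uname() data under os.uname()-style field names."""
    info = platform.uname()
    return PosixStyleUname(
        sysname=info.system,
        nodename=info.node,
        release=info.release,
        version=info.version,
        machine=info.machine,
    )


if __name__ == "__main__":
    uname = portable_uname()
    print(uname.sysname, uname.release)
```

With this in place, the line in cluster.py would read `uname = portable_uname()` and the existing `uname.sysname` and `uname.release` accesses would keep working on Windows as well as Linux.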
Thanks for your efforts. I'm unable to test Windows-specific issues, but the Docker solution seems to work for some people. Thanks for the feedback on the audiocraft installation; I have made changes in 991b1fe.
That makes sense. I managed to solve `AttributeError: module 'os' has no attribute 'uname'` with the new .ipynb and my folder layout.
But I got this error in the same Cell 7:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[4], line 25
     22 phn2num = ckpt['phn2num']
     24 text_tokenizer = TextTokenizer(backend="espeak")
---> 25 audio_tokenizer = AudioTokenizer(signature=encodec_fn, device=device) # will also put the neural codec model on gpu

File c:\Users\PEY3C\TTS\VoiceCraft\data\tokenizer.py:109, in AudioTokenizer.__init__(self, device, signature)
    104 def __init__(
    105 self,
    106 device: Any = None,
    107 signature = None
    108 ) -> None:
--> 109 from audiocraft.solvers import CompressionSolver
    110 model = CompressionSolver.model_from_checkpoint(signature)
    111 self.sample_rate = model.sample_rate

ModuleNotFoundError: No module named 'audiocraft'
And this one in the final cell, when you go to generate:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[11], line 29
     27 decode_config = {'top_k': top_k, 'top_p': top_p, 'temperature': temperature, 'stop_repetition': stop_repetition, 'kvcache': kvcache, "codec_audio_sr": codec_audio_sr, "codec_sr": codec_sr, "silence_tokens": silence_tokens, "sample_batch_size": sample_batch_size}
     28 from inference_tts_scale import inference_one_sample
---> 29 concated_audio, gen_audio = inference_one_sample(model, ckpt["config"], phn2num, text_tokenizer, audio_tokenizer, audio_fn, target_transcript, device, decode_config, prompt_end_frame)
     31 # save segments for comparison
     32 concated_audio, gen_audio = concated_audio[0].cpu(), gen_audio[0].cpu()

NameError: name 'audio_tokenizer' is not defined
The rest is fine so far with the sample .wav.
I don't have time to test more in depth right now because of school. But yeah, I will still help out.
`ModuleNotFoundError: No module named 'audiocraft'` happened because I hadn't installed audiocraft, lol. Now it works all the way down to generation.
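Both earlier errors trace back to the same cause: the `audio_tokenizer = AudioTokenizer(...)` line in Cell 7 raised before the name was bound, so the final cell's `NameError: name 'audio_tokenizer' is not defined` was just a knock-on effect. A small pre-flight check at the top of the notebook can surface a missing package before any cell depends on it. This is a minimal sketch using the standard library's `importlib.util.find_spec`; `is_importable` is a hypothetical helper and the module list is only an example.

```python
import importlib.util


def is_importable(module_name):
    """True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Example list; these are the modules that caused trouble in this thread.
    for name in ("audiocraft", "xformers"):
        status = "found" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```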
I am still looking into generation speed and other issues. But I realized that inference_tts.ipynb doesn't do TTS; it does speech editing.
I will elaborate in a new issue once I'm done playing with this and other forks and collecting more information/issues.