Reshard any HF model for the Petals decentralized inference engine.
model_name = "codellama/CodeLlama-34b-Instruct-hf"
output_folder = "models/CodeLlama-34b-Instruct-hf"
number_of_hidden_layers = 48 # The num_hidden_layers value from the model's config.json in its HF repository
python main.py
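The core of the resharding step can be sketched as grouping checkpoint parameters by transformer layer, so that Petals can serve each block independently. The helper below is illustrative only; `main.py`'s actual logic is not shown in this README, and the parameter-name layout assumed here is the usual Llama-style one:

```python
import re
from collections import defaultdict

def group_params_by_layer(param_names, num_hidden_layers):
    """Group checkpoint parameter names into one bucket per transformer
    layer, plus a shared bucket for embeddings, final norm, and LM head.
    Hypothetical helper; main.py's real implementation may differ."""
    layer_re = re.compile(r"model\.layers\.(\d+)\.")
    shards = defaultdict(list)
    for name in param_names:
        m = layer_re.search(name)
        if m and int(m.group(1)) < num_hidden_layers:
            shards[f"layer_{int(m.group(1))}"].append(name)
        else:
            shards["shared"].append(name)
    return dict(shards)

names = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.47.input_layernorm.weight",
    "model.norm.weight",
    "lm_head.weight",
]
shards = group_params_by_layer(names, num_hidden_layers=48)
```

With this grouping, each `layer_N` bucket can be written out as its own shard file.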
When the conversion has completed, copy the following files from the official model repository and push the new model to HF:
.gitattributes
config.json
generation_config.json
special_tokens_map.json
tokenizer_config.json
tokenizer.json
tokenizer.model
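Copying these auxiliary files can be scripted; a minimal sketch (the source and destination paths are placeholders you supply, and the function name is hypothetical):

```python
import shutil
from pathlib import Path

# Non-weight files to carry over from the official repository clone,
# as listed above.
AUX_FILES = [
    ".gitattributes", "config.json", "generation_config.json",
    "special_tokens_map.json", "tokenizer_config.json",
    "tokenizer.json", "tokenizer.model",
]

def copy_aux_files(src_repo, dst_repo):
    """Copy the auxiliary files into the resharded model folder,
    silently skipping any the source repository does not ship."""
    src, dst = Path(src_repo), Path(dst_repo)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in AUX_FILES:
        if (src / name).is_file():
            shutil.copy2(src / name, dst / name)
            copied.append(name)
    return copied
```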
git lfs install
git clone https://huggingface.co/premai-io/CodeLlama-34b-Instruct-hf
cd CodeLlama-34b-Instruct-hf
huggingface-cli lfs-enable-largefiles . # run inside the cloned repository
If the machine cannot hold the whole checkpoint in memory, use the low-memory variant, which processes the original shards one at a time:
input_folder = "models/example/CodeLlama-34b-Instruct-hf"
output_folder = "models/CodeLlama-34b-Instruct-hf"
number_of_hidden_layers = 48
number_of_current_shards = 7 # Number of shard files in the original checkpoint
python main_low_mem.py
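The low-memory path plausibly iterates over the existing shard files one at a time instead of materializing the full checkpoint. Enumerating them could look like the sketch below; the file-name template is the common HF sharding convention and is an assumption about the input layout, not something this README specifies:

```python
def shard_filenames(num_shards, template="pytorch_model-{:05d}-of-{:05d}.bin"):
    """List the input shard files so each can be loaded, redistributed into
    per-layer output shards, and freed before the next one is opened.
    The template is the usual HF convention, assumed here."""
    return [template.format(i, num_shards) for i in range(1, num_shards + 1)]

files = shard_filenames(7)  # matches number_of_current_shards = 7
```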
When the conversion has completed, copy the following files from the official model repository and push the new model to HF:
.gitattributes
config.json
generation_config.json
special_tokens_map.json
tokenizer_config.json
tokenizer.json
tokenizer.model