python -m fastedit.editor \
--data data/example.json \
--model ../internlm-chat-7b \
--config llama-7b \
--template intern
Loading checkpoint shards: 100%|██████████| 2/2 [00:08<00:00, 4.37s/it]
################################
#                              #
#  Retrieving hyperparameters  #
#                              #
################################
ROMEHyperParams(layers=[5], fact_token='subject_last', v_num_grad_steps=20, v_lr=0.1, v_loss_layer=31, v_weight_decay=0.001, clamp_norm_factor=4, kl_factor=0.0625, mom2_adjustment=False, rewrite_module_tmp='model.layers.{}.mlp.down_proj', layer_module_tmp='model.layers.{}', mlp_module_tmp='model.layers.{}.mlp', attn_module_tmp='model.layers.{}.self_attn', ln_f_module='model.norm', lm_head_module='lm_head', mom2_dataset='wikipedia', mom2_n_samples=100000, mom2_dtype='float16')
################################
#                              #
#  Generating pre-update text  #
#                              #
################################
The prime minister of the United Kingdom is David Cameron<eoa>
The name of prime minister of the UK is The current prime minister of the UK is Boris Johnson.<eoa>
日本的首相叫作 安倍晋三<eoa>
日本首相名字是 岸田文雄<eoa>
############################
#                          #
#  Applying rome to model  #
#                          #
############################
Executing ROME algorithm for the update: [The prime minister of the UK is] -> [Rishi Sunak]
Computing left vector (u)...
Selected u projection object UK
Left vector shape: torch.Size([11008])
Computing right vector (v)
Lookup index found: -6 | Sentence: The prime minister of the UK isRishi Sunak | Token: UK
Rewrite layer is 5
Tying optimization objective to 31
Recording initial value of v*
loss 5.91 = 5.91 + 0.0 avg prob of [Rishi Sunak] 0.016
loss 3.773 = 3.752 + 0.021 avg prob of [Rishi Sunak] 0.0514
loss 2.498 = 2.473 + 0.025 avg prob of [Rishi Sunak] 0.1038
loss 1.481 = 1.454 + 0.027 avg prob of [Rishi Sunak] 0.2539
loss 0.769 = 0.738 + 0.031 avg prob of [Rishi Sunak] 0.4997
loss 0.273 = 0.235 + 0.037 avg prob of [Rishi Sunak] 0.804
loss 0.083 = 0.039 + 0.043 avg prob of [Rishi Sunak] 0.9628
loss 0.054 = 0.01 + 0.044 avg prob of [Rishi Sunak] 0.9896
loss 0.05 = 0.005 + 0.045 avg prob of [Rishi Sunak] 0.9952
loss 0.05 = 0.004 + 0.047 avg prob of [Rishi Sunak] 0.9965
loss 0.05 = 0.003 + 0.047 avg prob of [Rishi Sunak] 0.9971
loss 0.049 = 0.003 + 0.047 avg prob of [Rishi Sunak] 0.9974
loss 0.048 = 0.002 + 0.046 avg prob of [Rishi Sunak] 0.9977
loss 0.049 = 0.002 + 0.047 avg prob of [Rishi Sunak] 0.9978
loss 0.048 = 0.002 + 0.046 avg prob of [Rishi Sunak] 0.9979
loss 0.046 = 0.002 + 0.044 avg prob of [Rishi Sunak] 0.9979
loss 0.045 = 0.002 + 0.043 avg prob of [Rishi Sunak] 0.998
loss 0.043 = 0.002 + 0.041 avg prob of [Rishi Sunak] 0.9982
loss 0.04 = 0.002 + 0.038 avg prob of [Rishi Sunak] 0.9982
loss 0.037 = 0.002 + 0.035 avg prob of [Rishi Sunak] 0.9983
Delta norm: 34.503
Change in target norm: 9.031 to 35.53 => 26.499
Division Factor: 4.312
Traceback (most recent call last):
  File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 59, in _wrapfunc
    return bound(*args, **kwds)
TypeError: round() received an invalid combination of arguments - got (out=NoneType, decimals=int, ), but expected one of:
 * ()
 * (*, int decimals)
      didn't match because some of the keywords were incorrect: out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/raid/Chris_yuzhang/FastEdit/fastedit/editor.py", line 71, in <module>
    fire.Fire(test_rome)
  File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/raid/Chris_yuzhang/FastEdit/fastedit/editor.py", line 52, in test_rome
    model_new, _ = apply_rome_to_model(
  File "/raid/Chris_yuzhang/FastEdit/fastedit/rome/rome_main.py", line 56, in apply_rome_to_model
    deltas = execute_rome(model, tokenizer, request, hparams, batch_first)
  File "/raid/Chris_yuzhang/FastEdit/fastedit/rome/rome_main.py", line 118, in execute_rome
    right_vector: torch.Tensor = compute_v(
  File "/raid/Chris_yuzhang/FastEdit/fastedit/rome/compute_v.py", line 161, in compute_v
    print(f"Right vector norm: {np.round(right_vector.norm(), 3)}")
  File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 3360, in round
    return _wrapfunc(a, 'round', decimals=decimals, out=out)
  File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 68, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 45, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
  File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/torch/_tensor.py", line 970, in __array__
    return self.numpy()
**TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.**
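
Reading the two tracebacks together: `np.round` first tries the tensor's own `round(decimals=3, out=None)`, which torch rejects because of the `out` keyword, and then falls back to `asarray(obj)`, which fails for a tensor on `cuda:0`. A possible workaround, sketched below, is to pull the scalar out to the host before rounding. `format_norm` is a hypothetical helper, not part of FastEdit, and the sketch assumes the norm is a single-element value (a 0-dim torch tensor, a NumPy scalar, or a plain number).

```python
import numpy as np


def format_norm(value, decimals=3):
    """Return `value` as a rounded Python float, suitable for printing.

    Hypothetical helper (not part of FastEdit). A 0-dim torch tensor and a
    NumPy scalar both expose .item(), which returns a plain Python number;
    for a tensor on a GPU this copies the single value to host memory, so
    NumPy never has to convert the CUDA tensor itself.
    """
    if hasattr(value, "item"):
        value = value.item()
    return round(float(value), decimals)


# Stand-in value for illustration only; in compute_v.py the argument would be
# right_vector.norm(), assumed here to be a 0-dim tensor on cuda:0.
print(f"Right vector norm: {format_norm(np.float64(35.5304))}")
```

With a helper like this, line 161 of `compute_v.py` could read `print(f"Right vector norm: {format_norm(right_vector.norm())}")`. Calling `right_vector.norm().item()` inline and rounding the result would likely work as well.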