jaymody / picogpt

An unnecessarily tiny implementation of GPT-2 in NumPy.

License: MIT License

Languages: Python 100.00%
Topics: deep-learning, gpt, gpt-2, large-language-models, machine-learning, neural-network, python, nlp

picogpt's Introduction

PicoGPT

Accompanying blog post: GPT in 60 Lines of NumPy


You've seen openai/gpt-2.

You've seen karpathy/minGPT.

You've even seen karpathy/nanoGPT!

But have you seen picoGPT??!?

picoGPT is an unnecessarily tiny and minimal implementation of GPT-2 in plain NumPy. The entire forward pass is 40 lines of code.

picoGPT features:

  • Fast? ❌ Nah, picoGPT is megaSLOW 🐌
  • Training code? ❌ Error, 4️⃣0️⃣4️⃣ not found
  • Batch inference? ❌ picoGPT is civilized, single-file line, one at a time only
  • top-p sampling? ❌ top-k? ❌ temperature? ❌ categorical sampling?! ❌ greedy? ✅
  • Readable? gpt2.py ✅ gpt2_pico.py ❌
  • Smol??? ✅✅✅✅✅✅ YESS!!! TEENIE TINY in fact 🤏

A quick breakdown of each of the files:

  • encoder.py contains the code for OpenAI's BPE tokenizer, taken straight from their gpt-2 repo.
  • utils.py contains the code to download and load the GPT-2 model weights, tokenizer, and hyper-parameters.
  • gpt2.py contains the actual GPT model and generation code, which we can run as a Python script (see the sketch below for programmatic use).
  • gpt2_pico.py is the same as gpt2.py, but in even fewer lines of code. Why? Because why not 😎👍.
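
For reference, here is a minimal sketch of wiring these pieces together programmatically rather than through the CLI. The function names and signatures follow what gpt2.py and utils.py expose; the encoder's encode/decode methods are assumed from OpenAI's BPE tokenizer, so treat this as illustrative rather than canonical:

```python
# Illustrative sketch: load the 124M weights and generate greedily.
# load_encoder_hparams_and_params and generate are the functions defined
# in utils.py and gpt2.py respectively (signatures assumed from the repo).
from utils import load_encoder_hparams_and_params
from gpt2 import generate

encoder, hparams, params = load_encoder_hparams_and_params("124M", "models")
input_ids = encoder.encode("Alan Turing theorized that computers would one day become")
output_ids = generate(input_ids, params, hparams["n_head"], n_tokens_to_generate=40)
print(encoder.decode(output_ids))
```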

Dependencies

pip install -r requirements.txt

Tested on Python 3.9.10.

Usage

python gpt2.py "Alan Turing theorized that computers would one day become"

Which generates

 the most powerful machines on the planet.

The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.

You can also control the number of tokens to generate, the model size (one of ["124M", "355M", "774M", "1558M"]), and the directory to save the models:

python gpt2.py \
    "Alan Turing theorized that computers would one day become" \
    --n_tokens_to_generate 40 \
    --model_size "124M" \
    --models_dir "models"

picogpt's People

Contributors

aletheap, certik, jameshfisher, jaymody, kraego


picogpt's Issues

Using jax.numpy instead of numpy gives TypeError on macOS

How to reproduce using the latest master (018a1e1) on macOS M1:

$ python gpt2.py "Alan Turing theorized that computers would one day become" -n 8
generating: 100%|█████████████████████████████████| 8/8 [00:03<00:00,  2.44it/s]
 the most powerful machines on the planet.

Then apply the following patch:

diff --git a/gpt2.py b/gpt2.py
index 62549bc..daf5685 100644
--- a/gpt2.py
+++ b/gpt2.py
@@ -1,4 +1,4 @@
-import numpy as np
+import jax.numpy as np
 
 
 def gelu(x):

and:

$ python gpt2.py "Alan Turing theorized that computers would one day become" -n 8
generating:   0%|                                         | 0/8 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 121, in <module>
    fire.Fire(main)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 110, in main
    output_ids = generate(input_ids, params, hparams["n_head"], n_tokens_to_generate)
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 92, in generate
    inputs = np.append(inputs, [next_id])  # append prediction to input
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/traceback_util.py", line 163, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/api.py", line 694, in cache_miss
    execute = dispatch._xla_call_impl_lazy(fun_, *tracers, **params)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/dispatch.py", line 240, in _xla_call_impl_lazy
    return xla_callable(fun, device, backend, name, donated_invars, keep_unused,
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/linear_util.py", line 301, in memoized_fun
    ans = call(fun, *args)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/dispatch.py", line 351, in _xla_callable_uncached
    computation = sharded_lowering(fun, device, backend, name, donated_invars,
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/dispatch.py", line 342, in sharded_lowering
    return pxla.lower_sharding_computation(
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/profiler.py", line 314, in wrapper
    return func(*args, **kwargs)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/interpreters/pxla.py", line 2797, in lower_sharding_computation
    jaxpr, global_out_avals, consts = pe.trace_to_jaxpr_final(
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/profiler.py", line 314, in wrapper
    return func(*args, **kwargs)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/interpreters/partial_eval.py", line 2073, in trace_to_jaxpr_final
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/interpreters/partial_eval.py", line 2006, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/linear_util.py", line 165, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/lax_numpy.py", line 2802, in append
    return concatenate([ravel(arr), ravel(values)], 0)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/traceback_util.py", line 163, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/api.py", line 698, in cache_miss
    top_trace.process_call(primitive, fun_, tracers, params))
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/interpreters/partial_eval.py", line 1747, in process_call
    jaxpr, out_type, consts = trace_to_subjaxpr_dynamic2(f, self.main, debug_info=dbg)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/interpreters/partial_eval.py", line 2035, in trace_to_subjaxpr_dynamic2
    ans = fun.call_wrapped(*in_tracers_)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/linear_util.py", line 165, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/lax_numpy.py", line 812, in ravel
    _stackable(a) or _check_arraylike("ravel", a)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/util.py", line 345, in _check_arraylike
    raise TypeError(msg.format(fun_name, type(arg), pos))
jax._src.traceback_util.UnfilteredStackTrace: TypeError: ravel requires ndarray or scalar arguments, got <class 'list'> at position 0.

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 121, in <module>
    fire.Fire(main)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 110, in main
    output_ids = generate(input_ids, params, hparams["n_head"], n_tokens_to_generate)
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 92, in generate
    inputs = np.append(inputs, [next_id])  # append prediction to input
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/lax_numpy.py", line 2802, in append
    return concatenate([ravel(arr), ravel(values)], 0)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/lax_numpy.py", line 812, in ravel
    _stackable(a) or _check_arraylike("ravel", a)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/util.py", line 345, in _check_arraylike
    raise TypeError(msg.format(fun_name, type(arg), pos))
TypeError: ravel requires ndarray or scalar arguments, got <class 'list'> at position 0.
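
The failure is in generate: np.append(inputs, [next_id]) hands jax.numpy a Python list, which jax's ravel refuses (plain NumPy converts it silently). A hedged sketch of one possible workaround, keeping the token ids in a plain Python list and converting to an array only at the forward pass:

```python
import jax.numpy as np

def generate(inputs, params, n_head, n_tokens_to_generate):
    # Workaround sketch: keep `inputs` as a plain Python list so the append
    # is a list operation, and convert to an array only when calling gpt2()
    # (gpt2 and its signature are the ones defined in gpt2.py).
    inputs = list(inputs)
    for _ in range(n_tokens_to_generate):
        logits = gpt2(np.asarray(inputs), **params, n_head=n_head)  # forward pass
        next_id = int(np.argmax(logits[-1]))  # greedy sampling
        inputs.append(next_id)                # plain-list append, backend-agnostic
    return inputs[-n_tokens_to_generate:]     # only return the generated ids
```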

I am running in the following Conda environment:

$ conda env export
name: pico
channels:
  - conda-forge
dependencies:
  - appdirs=1.4.4=pyh9f0ad1d_0
  - brotlipy=0.7.0=py39h02fc5c5_1005
  - bzip2=1.0.8=h3422bc3_4
  - c-ares=1.18.1=h3422bc3_0
  - ca-certificates=2022.12.7=h4653dfc_0
  - cffi=1.15.1=py39h7e6b969_3
  - cryptography=39.0.1=py39he2a39a8_0
  - idna=3.4=pyhd8ed1ab_0
  - jax=0.4.3=pyhd8ed1ab_0
  - jaxlib=0.4.3=cpu_py39h99d3290_1
  - libabseil=20220623.0=cxx17_h28b99d4_6
  - libblas=3.9.0=16_osxarm64_openblas
  - libcblas=3.9.0=16_osxarm64_openblas
  - libcxx=14.0.6=h2692d47_0
  - libffi=3.4.2=h3422bc3_5
  - libgfortran=5.0.0=11_3_0_hd922786_27
  - libgfortran5=11.3.0=hdaf2cc0_27
  - libgrpc=1.51.1=hb15be72_1
  - liblapack=3.9.0=16_osxarm64_openblas
  - libopenblas=0.3.21=openmp_hc731615_3
  - libprotobuf=3.21.12=hb5ab8b9_0
  - libsqlite=3.40.0=h76d750c_0
  - libzlib=1.2.13=h03a7124_4
  - llvm-openmp=15.0.7=h7cfbb63_0
  - ncurses=6.3=h07bb92c_1
  - openssl=3.0.8=h03a7124_0
  - opt_einsum=3.3.0=pyhd8ed1ab_1
  - packaging=23.0=pyhd8ed1ab_0
  - pip=23.0=pyhd8ed1ab_0
  - pooch=1.6.0=pyhd8ed1ab_0
  - pycparser=2.21=pyhd8ed1ab_0
  - pyopenssl=23.0.0=pyhd8ed1ab_0
  - pysocks=1.7.1=pyha2e5f31_6
  - python=3.9.16=hea58f1e_0_cpython
  - python_abi=3.9=3_cp39
  - re2=2023.02.01=hb7217d7_0
  - readline=8.1.2=h46ed386_0
  - scipy=1.10.0=py39h18313fe_2
  - setuptools=67.1.0=pyhd8ed1ab_0
  - tk=8.6.12=he1e0b03_0
  - tzdata=2022g=h191b570_0
  - urllib3=1.26.14=pyhd8ed1ab_0
  - wheel=0.38.4=pyhd8ed1ab_0
  - xz=5.2.6=h57fd34a_0
  - zlib=1.2.13=h03a7124_4
  - pip:
    - absl-py==1.4.0
    - astunparse==1.6.3
    - cachetools==5.3.0
    - certifi==2022.12.7
    - charset-normalizer==2.0.12
    - fire==0.5.0
    - flatbuffers==23.1.21
    - gast==0.4.0
    - google-auth==2.16.0
    - google-auth-oauthlib==0.4.6
    - google-pasta==0.2.0
    - grpcio==1.51.1
    - h5py==3.8.0
    - importlib-metadata==6.0.0
    - keras==2.11.0
    - libclang==15.0.6.1
    - markdown==3.4.1
    - markupsafe==2.1.2
    - numpy==1.24.1
    - oauthlib==3.2.2
    - protobuf==3.19.6
    - pyasn1==0.4.8
    - pyasn1-modules==0.2.8
    - regex==2017.4.5
    - requests==2.27.1
    - requests-oauthlib==1.3.1
    - rsa==4.9
    - six==1.16.0
    - tensorboard==2.11.2
    - tensorboard-data-server==0.6.1
    - tensorboard-plugin-wit==1.8.1
    - tensorflow-estimator==2.11.0
    - tensorflow-macos==2.11.0
    - termcolor==2.2.0
    - tqdm==4.64.0
    - typing-extensions==4.4.0
    - werkzeug==2.2.2
    - wrapt==1.14.1
    - zipp==3.13.0
prefix: /Users/ondrej/mambaforge/envs/pico

Jax is slower than NumPy

With #10, I get the following timings with NumPy on my Apple M1 Max:

$ time python gpt2.py "Alan Turing theorized that computers would one day become" -n 40
generating: 100%|███████████████████████████████| 40/40 [00:18<00:00,  2.13it/s]
 the most powerful machines on the planet.

The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.

python gpt2.py "Alan Turing theorized that computers would one day become" -n  115.74s user 1.71s system 559% cpu 20.993 total

And Jax:

$ time python gpt2.py "Alan Turing theorized that computers would one day become" -n 40
generating: 100%|███████████████████████████████| 40/40 [00:21<00:00,  1.85it/s]
 the most powerful machines on the planet.

The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.

python gpt2.py "Alan Turing theorized that computers would one day become" -n  28.86s user 1.91s system 127% cpu 24.115 total

So Jax is slower. Watching htop, Jax uses roughly 1.3 CPU cores, while NumPy uses almost 6. Is NumPy automatically parallel on macOS?
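
The ~6-core NumPy usage would be consistent with the multi-threaded OpenBLAS listed in the environment below, which parallelizes matmuls by default. A quick diagnostic sketch (not a fix) to confirm which BLAS NumPy links against:

```python
# Diagnostic sketch: print NumPy's BLAS/LAPACK build info. On this
# conda-forge environment it should report OpenBLAS, whose thread count
# is controllable via OPENBLAS_NUM_THREADS / OMP_NUM_THREADS.
import numpy as np
np.show_config()
```

Pinning OPENBLAS_NUM_THREADS=1 before running would make a single-core comparison against Jax apples-to-apples.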

Here is my Conda environment:

$ conda env export
name: pico
channels:
  - conda-forge
dependencies:
  - appdirs=1.4.4=pyh9f0ad1d_0
  - appnope=0.1.3=pyhd8ed1ab_0
  - asttokens=2.2.1=pyhd8ed1ab_0
  - backcall=0.2.0=pyh9f0ad1d_0
  - backports=1.0=pyhd8ed1ab_3
  - backports.functools_lru_cache=1.6.4=pyhd8ed1ab_0
  - brotlipy=0.7.0=py39h02fc5c5_1005
  - bzip2=1.0.8=h3422bc3_4
  - c-ares=1.18.1=h3422bc3_0
  - ca-certificates=2022.12.7=h4653dfc_0
  - cffi=1.15.1=py39h7e6b969_3
  - cryptography=39.0.1=py39he2a39a8_0
  - decorator=5.1.1=pyhd8ed1ab_0
  - executing=1.2.0=pyhd8ed1ab_0
  - idna=3.4=pyhd8ed1ab_0
  - ipython=8.10.0=pyhd1c38e8_0
  - jax=0.4.3=pyhd8ed1ab_0
  - jaxlib=0.4.3=cpu_py39h99d3290_1
  - jedi=0.18.2=pyhd8ed1ab_0
  - libabseil=20220623.0=cxx17_h28b99d4_6
  - libblas=3.9.0=16_osxarm64_openblas
  - libcblas=3.9.0=16_osxarm64_openblas
  - libcxx=14.0.6=h2692d47_0
  - libffi=3.4.2=h3422bc3_5
  - libgfortran=5.0.0=11_3_0_hd922786_27
  - libgfortran5=11.3.0=hdaf2cc0_27
  - libgrpc=1.51.1=hb15be72_1
  - liblapack=3.9.0=16_osxarm64_openblas
  - libopenblas=0.3.21=openmp_hc731615_3
  - libprotobuf=3.21.12=hb5ab8b9_0
  - libsqlite=3.40.0=h76d750c_0
  - libzlib=1.2.13=h03a7124_4
  - llvm-openmp=15.0.7=h7cfbb63_0
  - matplotlib-inline=0.1.6=pyhd8ed1ab_0
  - ncurses=6.3=h07bb92c_1
  - openssl=3.0.8=h03a7124_0
  - opt_einsum=3.3.0=pyhd8ed1ab_1
  - packaging=23.0=pyhd8ed1ab_0
  - parso=0.8.3=pyhd8ed1ab_0
  - pexpect=4.8.0=pyh1a96a4e_2
  - pickleshare=0.7.5=py_1003
  - pip=23.0=pyhd8ed1ab_0
  - pooch=1.6.0=pyhd8ed1ab_0
  - prompt-toolkit=3.0.36=pyha770c72_0
  - ptyprocess=0.7.0=pyhd3deb0d_0
  - pure_eval=0.2.2=pyhd8ed1ab_0
  - pycparser=2.21=pyhd8ed1ab_0
  - pygments=2.14.0=pyhd8ed1ab_0
  - pyopenssl=23.0.0=pyhd8ed1ab_0
  - pysocks=1.7.1=pyha2e5f31_6
  - python=3.9.16=hea58f1e_0_cpython
  - python_abi=3.9=3_cp39
  - re2=2023.02.01=hb7217d7_0
  - readline=8.1.2=h46ed386_0
  - scipy=1.10.0=py39h18313fe_2
  - setuptools=67.1.0=pyhd8ed1ab_0
  - six=1.16.0=pyh6c4a22f_0
  - stack_data=0.6.2=pyhd8ed1ab_0
  - tk=8.6.12=he1e0b03_0
  - traitlets=5.9.0=pyhd8ed1ab_0
  - tzdata=2022g=h191b570_0
  - urllib3=1.26.14=pyhd8ed1ab_0
  - wcwidth=0.2.6=pyhd8ed1ab_0
  - wheel=0.38.4=pyhd8ed1ab_0
  - xz=5.2.6=h57fd34a_0
  - zlib=1.2.13=h03a7124_4
  - pip:
    - absl-py==1.4.0
    - astunparse==1.6.3
    - cachetools==5.3.0
    - certifi==2022.12.7
    - charset-normalizer==2.0.12
    - fire==0.5.0
    - flatbuffers==23.1.21
    - gast==0.4.0
    - google-auth==2.16.0
    - google-auth-oauthlib==0.4.6
    - google-pasta==0.2.0
    - grpcio==1.51.1
    - h5py==3.8.0
    - importlib-metadata==6.0.0
    - keras==2.11.0
    - libclang==15.0.6.1
    - markdown==3.4.1
    - markupsafe==2.1.2
    - numpy==1.24.1
    - oauthlib==3.2.2
    - protobuf==3.19.6
    - pyasn1==0.4.8
    - pyasn1-modules==0.2.8
    - regex==2017.4.5
    - requests==2.27.1
    - requests-oauthlib==1.3.1
    - rsa==4.9
    - tensorboard==2.11.2
    - tensorboard-data-server==0.6.1
    - tensorboard-plugin-wit==1.8.1
    - tensorflow-estimator==2.11.0
    - tensorflow-macos==2.11.0
    - termcolor==2.2.0
    - tqdm==4.64.0
    - typing-extensions==4.4.0
    - werkzeug==2.2.2
    - wrapt==1.14.1
    - zipp==3.13.0
prefix: /Users/ondrej/mambaforge/envs/pico

Is it better to use x[-1] @ wte.T?

Is it better to change

return x @ wte.T  # [n_seq, n_embd] -> [n_seq, n_vocab]

to

return x[-1] @ wte.T  # [n_embd] -> [n_vocab]

so that we can then use

next_id = np.argmax(logits)
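
A sketch of how the proposed change would read (assuming the gpt2/generate structure of gpt2.py): since greedy decoding only ever consumes the final position's logits, projecting just the last hidden state saves an [n_seq, n_embd] x [n_embd, n_vocab] matmul per generated token.

```python
# In gpt2(): project only the final hidden state to vocabulary logits.
return x[-1] @ wte.T  # [n_embd] -> [n_vocab]

# In generate(): logits is now a vector, so argmax needs no [-1] indexing.
next_id = np.argmax(logits)  # greedy sampling over the last position only
```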

Tensorflow pollutes the console output with missing GPU library warnings.

My VM doesn't have GPU acceleration. When I run a test command, the console has warnings like this:

2023-02-10 10:02:02.669259: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-10 10:02:02.864146: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-02-10 10:02:02.864303: I tensorflow/compiler/xla/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-02-10 10:02:03.882298: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-02-10 10:02:03.882433: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-02-10 10:02:03.882446: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

Setting TF_CPP_MIN_LOG_LEVEL=3 on the command line suppresses these messages. An alternative is to import os and add os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' before TensorFlow is imported, but that undercuts the idea of just importing NumPy.
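
For completeness, a sketch of that in-code alternative; the variable must be set before TensorFlow is first imported (which happens inside utils.py), so it has to come before that import:

```python
# Sketch: suppress TensorFlow's C++ logging before it is first imported.
# 3 = errors only (0 = all, 1 = drop INFO, 2 = drop INFO and WARNING).
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

from utils import load_encoder_hparams_and_params  # this import pulls in tensorflow
```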

Perhaps the quickest fix is to update README.md to recommend running the following before executing picoGPT for the first time. That would also provide a natural place to discuss acceleration, and perhaps to describe disabling the generation progress bar, if that's desirable.

export TF_CPP_MIN_LOG_LEVEL=3

On a personal note, this is a sensational introduction to GPT and it's given me the incentive to start experimenting, thank you!

TensorFlow Error - Any Workaround please?

E:\picoGPT-main>python gpt2.py "Alan Turing theorized that computers would one day become"
Traceback (most recent call last):
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
    fp, pathname, description = imp.find_module('_pywrap_tensorflow_internal', [dirname(__file__)])
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\Lib\imp.py", line 296, in find_module
    raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named '_pywrap_tensorflow_internal'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
    import _pywrap_tensorflow_internal
ModuleNotFoundError: No module named '_pywrap_tensorflow_internal'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\picoGPT-main\gpt2.py", line 121, in <module>
    fire.Fire(main)
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\fire\core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\fire\core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "E:\picoGPT-main\gpt2.py", line 98, in main
    from utils import load_encoder_hparams_and_params
  File "E:\picoGPT-main\utils.py", line 7, in <module>
    import tensorflow as tf
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
    fp, pathname, description = imp.find_module('_pywrap_tensorflow_internal', [dirname(__file__)])
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\Lib\imp.py", line 296, in find_module
    raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named '_pywrap_tensorflow_internal'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
    import _pywrap_tensorflow_internal
ModuleNotFoundError: No module named '_pywrap_tensorflow_internal'

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
