jaymody / picogpt

An unnecessarily tiny implementation of GPT-2 in NumPy.

License: MIT License

Languages: Python 100.00%
Topics: deep-learning, gpt, gpt-2, large-language-models, machine-learning, neural-network, python, nlp

picogpt's Introduction

PicoGPT

Accompanying blog post: GPT in 60 Lines of NumPy


You've seen openai/gpt-2.

You've seen karpathy/minGPT.

You've even seen karpathy/nanoGPT!

But have you seen picoGPT??!?

picoGPT is an unnecessarily tiny and minimal implementation of GPT-2 in plain NumPy. The entire forward pass is 40 lines of code.

picoGPT features:

  • Fast? ❌ Nah, picoGPT is megaSLOW 🐌
  • Training code? ❌ Error, 4️⃣0️⃣4️⃣ not found
  • Batch inference? ❌ picoGPT is civilized, single-file line, one at a time only
  • top-p sampling? ❌ top-k? ❌ temperature? ❌ categorical sampling?! ❌ greedy? ✅
  • Readable? gpt2.py ✅ gpt2_pico.py ❌
  • Smol??? ✅✅✅✅✅✅ YESS!!! TEENIE TINY in fact 🤏

A quick breakdown of each of the files:

  • encoder.py contains the code for OpenAI's BPE tokenizer, taken straight from their gpt-2 repo.
  • utils.py contains the code to download and load the GPT-2 model weights, tokenizer, and hyper-parameters.
  • gpt2.py contains the actual GPT model and generation code, which we can run as a Python script (see the sketch below for programmatic use).
  • gpt2_pico.py is the same as gpt2.py, but in even fewer lines of code. Why? Because why not 😎👍.
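
For reference, here is a minimal sketch of wiring these pieces together programmatically rather than through the CLI. The function names and signatures follow what gpt2.py and utils.py expose; the encoder's encode/decode methods are assumed from OpenAI's BPE tokenizer, so treat this as illustrative rather than canonical:

```python
# Illustrative sketch: load the 124M weights and generate greedily.
# load_encoder_hparams_and_params and generate are the functions defined
# in utils.py and gpt2.py respectively (signatures assumed from the repo).
from utils import load_encoder_hparams_and_params
from gpt2 import generate

encoder, hparams, params = load_encoder_hparams_and_params("124M", "models")
input_ids = encoder.encode("Alan Turing theorized that computers would one day become")
output_ids = generate(input_ids, params, hparams["n_head"], n_tokens_to_generate=40)
print(encoder.decode(output_ids))
```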

Dependencies

pip install -r requirements.txt

Tested on Python 3.9.10.

Usage

python gpt2.py "Alan Turing theorized that computers would one day become"

Which generates

 the most powerful machines on the planet.

The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.

You can also control the number of tokens to generate, the model size (one of ["124M", "355M", "774M", "1558M"]), and the directory to save the models:

python gpt2.py \
    "Alan Turing theorized that computers would one day become" \
    --n_tokens_to_generate 40 \
    --model_size "124M" \
    --models_dir "models"

picogpt's People

Contributors

aletheap, certik, jameshfisher, jaymody, kraego


picogpt's Issues

Using jax.numpy instead of numpy gives TypeError on macOS

How to reproduce using the latest master (018a1e1) on macOS M1:

$ python gpt2.py "Alan Turing theorized that computers would one day become" -n 8
generating: 100%|█████████████████████████████████| 8/8 [00:03<00:00,  2.44it/s]
 the most powerful machines on the planet.

Then apply the following patch:

diff --git a/gpt2.py b/gpt2.py
index 62549bc..daf5685 100644
--- a/gpt2.py
+++ b/gpt2.py
@@ -1,4 +1,4 @@
-import numpy as np
+import jax.numpy as np
 
 
 def gelu(x):

and:

$ python gpt2.py "Alan Turing theorized that computers would one day become" -n 8
generating:   0%|                                         | 0/8 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 121, in <module>
    fire.Fire(main)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 110, in main
    output_ids = generate(input_ids, params, hparams["n_head"], n_tokens_to_generate)
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 92, in generate
    inputs = np.append(inputs, [next_id])  # append prediction to input
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/traceback_util.py", line 163, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/api.py", line 694, in cache_miss
    execute = dispatch._xla_call_impl_lazy(fun_, *tracers, **params)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/dispatch.py", line 240, in _xla_call_impl_lazy
    return xla_callable(fun, device, backend, name, donated_invars, keep_unused,
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/linear_util.py", line 301, in memoized_fun
    ans = call(fun, *args)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/dispatch.py", line 351, in _xla_callable_uncached
    computation = sharded_lowering(fun, device, backend, name, donated_invars,
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/dispatch.py", line 342, in sharded_lowering
    return pxla.lower_sharding_computation(
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/profiler.py", line 314, in wrapper
    return func(*args, **kwargs)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/interpreters/pxla.py", line 2797, in lower_sharding_computation
    jaxpr, global_out_avals, consts = pe.trace_to_jaxpr_final(
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/profiler.py", line 314, in wrapper
    return func(*args, **kwargs)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/interpreters/partial_eval.py", line 2073, in trace_to_jaxpr_final
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/interpreters/partial_eval.py", line 2006, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/linear_util.py", line 165, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/lax_numpy.py", line 2802, in append
    return concatenate([ravel(arr), ravel(values)], 0)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/traceback_util.py", line 163, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/api.py", line 698, in cache_miss
    top_trace.process_call(primitive, fun_, tracers, params))
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/interpreters/partial_eval.py", line 1747, in process_call
    jaxpr, out_type, consts = trace_to_subjaxpr_dynamic2(f, self.main, debug_info=dbg)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/interpreters/partial_eval.py", line 2035, in trace_to_subjaxpr_dynamic2
    ans = fun.call_wrapped(*in_tracers_)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/linear_util.py", line 165, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/lax_numpy.py", line 812, in ravel
    _stackable(a) or _check_arraylike("ravel", a)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/util.py", line 345, in _check_arraylike
    raise TypeError(msg.format(fun_name, type(arg), pos))
jax._src.traceback_util.UnfilteredStackTrace: TypeError: ravel requires ndarray or scalar arguments, got <class 'list'> at position 0.

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 121, in <module>
    fire.Fire(main)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 110, in main
    output_ids = generate(input_ids, params, hparams["n_head"], n_tokens_to_generate)
  File "/Users/ondrej/repos/picoGPT/gpt2.py", line 92, in generate
    inputs = np.append(inputs, [next_id])  # append prediction to input
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/lax_numpy.py", line 2802, in append
    return concatenate([ravel(arr), ravel(values)], 0)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/lax_numpy.py", line 812, in ravel
    _stackable(a) or _check_arraylike("ravel", a)
  File "/Users/ondrej/mambaforge/envs/pico/lib/python3.9/site-packages/jax/_src/numpy/util.py", line 345, in _check_arraylike
    raise TypeError(msg.format(fun_name, type(arg), pos))
TypeError: ravel requires ndarray or scalar arguments, got <class 'list'> at position 0.
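
The failure is in generate: np.append(inputs, [next_id]) hands jax.numpy a Python list, which jax's ravel refuses (plain NumPy converts it silently). A hedged sketch of one possible workaround, keeping the token ids in a plain Python list and converting to an array only at the forward pass:

```python
import jax.numpy as np

def generate(inputs, params, n_head, n_tokens_to_generate):
    # Workaround sketch: keep `inputs` as a plain Python list so the append
    # is a list operation, and convert to an array only when calling gpt2()
    # (gpt2 and its signature are the ones defined in gpt2.py).
    inputs = list(inputs)
    for _ in range(n_tokens_to_generate):
        logits = gpt2(np.asarray(inputs), **params, n_head=n_head)  # forward pass
        next_id = int(np.argmax(logits[-1]))  # greedy sampling
        inputs.append(next_id)                # plain-list append, backend-agnostic
    return inputs[-n_tokens_to_generate:]     # only return the generated ids
```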

I am running in the following Conda environment:

$ conda env export
name: pico
channels:
  - conda-forge
dependencies:
  - appdirs=1.4.4=pyh9f0ad1d_0
  - brotlipy=0.7.0=py39h02fc5c5_1005
  - bzip2=1.0.8=h3422bc3_4
  - c-ares=1.18.1=h3422bc3_0
  - ca-certificates=2022.12.7=h4653dfc_0
  - cffi=1.15.1=py39h7e6b969_3
  - cryptography=39.0.1=py39he2a39a8_0
  - idna=3.4=pyhd8ed1ab_0
  - jax=0.4.3=pyhd8ed1ab_0
  - jaxlib=0.4.3=cpu_py39h99d3290_1
  - libabseil=20220623.0=cxx17_h28b99d4_6
  - libblas=3.9.0=16_osxarm64_openblas
  - libcblas=3.9.0=16_osxarm64_openblas
  - libcxx=14.0.6=h2692d47_0
  - libffi=3.4.2=h3422bc3_5
  - libgfortran=5.0.0=11_3_0_hd922786_27
  - libgfortran5=11.3.0=hdaf2cc0_27
  - libgrpc=1.51.1=hb15be72_1
  - liblapack=3.9.0=16_osxarm64_openblas
  - libopenblas=0.3.21=openmp_hc731615_3
  - libprotobuf=3.21.12=hb5ab8b9_0
  - libsqlite=3.40.0=h76d750c_0
  - libzlib=1.2.13=h03a7124_4
  - llvm-openmp=15.0.7=h7cfbb63_0
  - ncurses=6.3=h07bb92c_1
  - openssl=3.0.8=h03a7124_0
  - opt_einsum=3.3.0=pyhd8ed1ab_1
  - packaging=23.0=pyhd8ed1ab_0
  - pip=23.0=pyhd8ed1ab_0
  - pooch=1.6.0=pyhd8ed1ab_0
  - pycparser=2.21=pyhd8ed1ab_0
  - pyopenssl=23.0.0=pyhd8ed1ab_0
  - pysocks=1.7.1=pyha2e5f31_6
  - python=3.9.16=hea58f1e_0_cpython
  - python_abi=3.9=3_cp39
  - re2=2023.02.01=hb7217d7_0
  - readline=8.1.2=h46ed386_0
  - scipy=1.10.0=py39h18313fe_2
  - setuptools=67.1.0=pyhd8ed1ab_0
  - tk=8.6.12=he1e0b03_0
  - tzdata=2022g=h191b570_0
  - urllib3=1.26.14=pyhd8ed1ab_0
  - wheel=0.38.4=pyhd8ed1ab_0
  - xz=5.2.6=h57fd34a_0
  - zlib=1.2.13=h03a7124_4
  - pip:
    - absl-py==1.4.0
    - astunparse==1.6.3
    - cachetools==5.3.0
    - certifi==2022.12.7
    - charset-normalizer==2.0.12
    - fire==0.5.0
    - flatbuffers==23.1.21
    - gast==0.4.0
    - google-auth==2.16.0
    - google-auth-oauthlib==0.4.6
    - google-pasta==0.2.0
    - grpcio==1.51.1
    - h5py==3.8.0
    - importlib-metadata==6.0.0
    - keras==2.11.0
    - libclang==15.0.6.1
    - markdown==3.4.1
    - markupsafe==2.1.2
    - numpy==1.24.1
    - oauthlib==3.2.2
    - protobuf==3.19.6
    - pyasn1==0.4.8
    - pyasn1-modules==0.2.8
    - regex==2017.4.5
    - requests==2.27.1
    - requests-oauthlib==1.3.1
    - rsa==4.9
    - six==1.16.0
    - tensorboard==2.11.2
    - tensorboard-data-server==0.6.1
    - tensorboard-plugin-wit==1.8.1
    - tensorflow-estimator==2.11.0
    - tensorflow-macos==2.11.0
    - termcolor==2.2.0
    - tqdm==4.64.0
    - typing-extensions==4.4.0
    - werkzeug==2.2.2
    - wrapt==1.14.1
    - zipp==3.13.0
prefix: /Users/ondrej/mambaforge/envs/pico

Jax is slower than NumPy

With #10, I get the following timings with NumPy on my Apple M1 Max:

$ time python gpt2.py "Alan Turing theorized that computers would one day become" -n 40
generating: 100%|███████████████████████████████| 40/40 [00:18<00:00,  2.13it/s]
 the most powerful machines on the planet.

The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.

python gpt2.py "Alan Turing theorized that computers would one day become" -n  115.74s user 1.71s system 559% cpu 20.993 total

And Jax:

$ time python gpt2.py "Alan Turing theorized that computers would one day become" -n 40
generating: 100%|███████████████████████████████| 40/40 [00:21<00:00,  1.85it/s]
 the most powerful machines on the planet.

The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.

python gpt2.py "Alan Turing theorized that computers would one day become" -n  28.86s user 1.91s system 127% cpu 24.115 total

So Jax is slower. Watching htop, Jax uses roughly 1.3 CPU cores, while NumPy uses almost 6. Is NumPy automatically parallel on macOS?
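
The ~6-core NumPy usage would be consistent with the multi-threaded OpenBLAS listed in the environment below, which parallelizes matmuls by default. A quick diagnostic sketch (not a fix) to confirm which BLAS NumPy links against:

```python
# Diagnostic sketch: print NumPy's BLAS/LAPACK build info. On this
# conda-forge environment it should report OpenBLAS, whose thread count
# is controllable via OPENBLAS_NUM_THREADS / OMP_NUM_THREADS.
import numpy as np
np.show_config()
```

Pinning OPENBLAS_NUM_THREADS=1 before running would make a single-core comparison against Jax apples-to-apples.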

Here is my Conda environment:

$ conda env export
name: pico
channels:
  - conda-forge
dependencies:
  - appdirs=1.4.4=pyh9f0ad1d_0
  - appnope=0.1.3=pyhd8ed1ab_0
  - asttokens=2.2.1=pyhd8ed1ab_0
  - backcall=0.2.0=pyh9f0ad1d_0
  - backports=1.0=pyhd8ed1ab_3
  - backports.functools_lru_cache=1.6.4=pyhd8ed1ab_0
  - brotlipy=0.7.0=py39h02fc5c5_1005
  - bzip2=1.0.8=h3422bc3_4
  - c-ares=1.18.1=h3422bc3_0
  - ca-certificates=2022.12.7=h4653dfc_0
  - cffi=1.15.1=py39h7e6b969_3
  - cryptography=39.0.1=py39he2a39a8_0
  - decorator=5.1.1=pyhd8ed1ab_0
  - executing=1.2.0=pyhd8ed1ab_0
  - idna=3.4=pyhd8ed1ab_0
  - ipython=8.10.0=pyhd1c38e8_0
  - jax=0.4.3=pyhd8ed1ab_0
  - jaxlib=0.4.3=cpu_py39h99d3290_1
  - jedi=0.18.2=pyhd8ed1ab_0
  - libabseil=20220623.0=cxx17_h28b99d4_6
  - libblas=3.9.0=16_osxarm64_openblas
  - libcblas=3.9.0=16_osxarm64_openblas
  - libcxx=14.0.6=h2692d47_0
  - libffi=3.4.2=h3422bc3_5
  - libgfortran=5.0.0=11_3_0_hd922786_27
  - libgfortran5=11.3.0=hdaf2cc0_27
  - libgrpc=1.51.1=hb15be72_1
  - liblapack=3.9.0=16_osxarm64_openblas
  - libopenblas=0.3.21=openmp_hc731615_3
  - libprotobuf=3.21.12=hb5ab8b9_0
  - libsqlite=3.40.0=h76d750c_0
  - libzlib=1.2.13=h03a7124_4
  - llvm-openmp=15.0.7=h7cfbb63_0
  - matplotlib-inline=0.1.6=pyhd8ed1ab_0
  - ncurses=6.3=h07bb92c_1
  - openssl=3.0.8=h03a7124_0
  - opt_einsum=3.3.0=pyhd8ed1ab_1
  - packaging=23.0=pyhd8ed1ab_0
  - parso=0.8.3=pyhd8ed1ab_0
  - pexpect=4.8.0=pyh1a96a4e_2
  - pickleshare=0.7.5=py_1003
  - pip=23.0=pyhd8ed1ab_0
  - pooch=1.6.0=pyhd8ed1ab_0
  - prompt-toolkit=3.0.36=pyha770c72_0
  - ptyprocess=0.7.0=pyhd3deb0d_0
  - pure_eval=0.2.2=pyhd8ed1ab_0
  - pycparser=2.21=pyhd8ed1ab_0
  - pygments=2.14.0=pyhd8ed1ab_0
  - pyopenssl=23.0.0=pyhd8ed1ab_0
  - pysocks=1.7.1=pyha2e5f31_6
  - python=3.9.16=hea58f1e_0_cpython
  - python_abi=3.9=3_cp39
  - re2=2023.02.01=hb7217d7_0
  - readline=8.1.2=h46ed386_0
  - scipy=1.10.0=py39h18313fe_2
  - setuptools=67.1.0=pyhd8ed1ab_0
  - six=1.16.0=pyh6c4a22f_0
  - stack_data=0.6.2=pyhd8ed1ab_0
  - tk=8.6.12=he1e0b03_0
  - traitlets=5.9.0=pyhd8ed1ab_0
  - tzdata=2022g=h191b570_0
  - urllib3=1.26.14=pyhd8ed1ab_0
  - wcwidth=0.2.6=pyhd8ed1ab_0
  - wheel=0.38.4=pyhd8ed1ab_0
  - xz=5.2.6=h57fd34a_0
  - zlib=1.2.13=h03a7124_4
  - pip:
    - absl-py==1.4.0
    - astunparse==1.6.3
    - cachetools==5.3.0
    - certifi==2022.12.7
    - charset-normalizer==2.0.12
    - fire==0.5.0
    - flatbuffers==23.1.21
    - gast==0.4.0
    - google-auth==2.16.0
    - google-auth-oauthlib==0.4.6
    - google-pasta==0.2.0
    - grpcio==1.51.1
    - h5py==3.8.0
    - importlib-metadata==6.0.0
    - keras==2.11.0
    - libclang==15.0.6.1
    - markdown==3.4.1
    - markupsafe==2.1.2
    - numpy==1.24.1
    - oauthlib==3.2.2
    - protobuf==3.19.6
    - pyasn1==0.4.8
    - pyasn1-modules==0.2.8
    - regex==2017.4.5
    - requests==2.27.1
    - requests-oauthlib==1.3.1
    - rsa==4.9
    - tensorboard==2.11.2
    - tensorboard-data-server==0.6.1
    - tensorboard-plugin-wit==1.8.1
    - tensorflow-estimator==2.11.0
    - tensorflow-macos==2.11.0
    - termcolor==2.2.0
    - tqdm==4.64.0
    - typing-extensions==4.4.0
    - werkzeug==2.2.2
    - wrapt==1.14.1
    - zipp==3.13.0
prefix: /Users/ondrej/mambaforge/envs/pico

Is it better to use x[-1] @ wte.T?

Is it better to change

return x @ wte.T  # [n_seq, n_embd] -> [n_seq, n_vocab]

to

return x[-1] @ wte.T  # [n_embd] -> [n_vocab]

so that we can then use

next_id = np.argmax(logits)
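
A sketch of how the proposed change would read (assuming the gpt2/generate structure of gpt2.py): since greedy decoding only ever consumes the final position's logits, projecting just the last hidden state saves an [n_seq, n_embd] x [n_embd, n_vocab] matmul per generated token.

```python
# In gpt2(): project only the final hidden state to vocabulary logits.
return x[-1] @ wte.T  # [n_embd] -> [n_vocab]

# In generate(): logits is now a vector, so argmax needs no [-1] indexing.
next_id = np.argmax(logits)  # greedy sampling over the last position only
```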

Tensorflow pollutes the console output with missing GPU library warnings.

My VM doesn't have GPU acceleration. When I run a test command, the console has warnings like this:

2023-02-10 10:02:02.669259: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-10 10:02:02.864146: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-02-10 10:02:02.864303: I tensorflow/compiler/xla/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-02-10 10:02:03.882298: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-02-10 10:02:03.882433: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-02-10 10:02:03.882446: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

Setting TF_CPP_MIN_LOG_LEVEL=3 on the command line suppresses these messages. An alternative is to import os and add os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' before TensorFlow is imported, but that undercuts the idea of just importing NumPy.
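
For completeness, a sketch of that in-code alternative; the variable must be set before TensorFlow is first imported (which happens inside utils.py), so it has to come before that import:

```python
# Sketch: suppress TensorFlow's C++ logging before it is first imported.
# 3 = errors only (0 = all, 1 = drop INFO, 2 = drop INFO and WARNING).
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

from utils import load_encoder_hparams_and_params  # this import pulls in tensorflow
```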

Perhaps the quickest fix is to update README.md to recommend running the following before executing picoGPT for the first time. That would also provide a natural place to discuss acceleration, and perhaps to describe disabling the generation progress bar, if that's desirable.

export TF_CPP_MIN_LOG_LEVEL=3

On a personal note, this is a sensational introduction to GPT and it's given me the incentive to start experimenting, thank you!

TensorFlow Error - Any Workaround please?

E:\picoGPT-main>python gpt2.py "Alan Turing theorized that computers would one day become"
Traceback (most recent call last):
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
    fp, pathname, description = imp.find_module('_pywrap_tensorflow_internal', [dirname(__file__)])
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\Lib\imp.py", line 296, in find_module
    raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named '_pywrap_tensorflow_internal'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
    import _pywrap_tensorflow_internal
ModuleNotFoundError: No module named '_pywrap_tensorflow_internal'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\picoGPT-main\gpt2.py", line 121, in <module>
    fire.Fire(main)
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\fire\core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\fire\core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "E:\picoGPT-main\gpt2.py", line 98, in main
    from utils import load_encoder_hparams_and_params
  File "E:\picoGPT-main\utils.py", line 7, in <module>
    import tensorflow as tf
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
    fp, pathname, description = imp.find_module('_pywrap_tensorflow_internal', [dirname(__file__)])
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\Lib\imp.py", line 296, in find_module
    raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named '_pywrap_tensorflow_internal'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "C:\Users\AlanT\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
    import _pywrap_tensorflow_internal
ModuleNotFoundError: No module named '_pywrap_tensorflow_internal'

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
