databricks-academy / large-language-models
Notebooks for Large Language Models (LLMs) Specialization
License: Other
In LLM 03, we call models that OpenAI has since deprecated, so the commands fail when we run them.
This is the case in the demo where we define a call to text-babbage-001, which OpenAI says should now be replaced with gpt-3.5-turbo-instruct.
Further down in the Agents section, we use
from langchain.llms import OpenAI
llm = OpenAI()
which defaults to calling text-davinci-003, which is also deprecated and should be replaced by gpt-3.5-turbo-instruct.
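One way to keep the affected cells working is to route every model name through a small lookup before it reaches the client. The sketch below is only an illustration: the replacement names come from the report above, while the DEPRECATED_MODELS table and the resolve_model helper are hypothetical and not part of the course notebooks.

```python
# Replacement names are taken from the issue report above.
# The table and helper are hypothetical names introduced for illustration.
DEPRECATED_MODELS = {
    "text-babbage-001": "gpt-3.5-turbo-instruct",
    "text-davinci-003": "gpt-3.5-turbo-instruct",
}


def resolve_model(name: str) -> str:
    """Return the suggested replacement for a deprecated model name.

    Names that are not in the table are returned unchanged.
    """
    return DEPRECATED_MODELS.get(name, name)


model_name = resolve_model("text-davinci-003")
print(model_name)
```

The resolved name could then be passed explicitly instead of relying on the default, for example OpenAI(model_name=model_name), assuming the installed LangChain version accepts that keyword.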
Hey Team,
It would be great if you could provide the ipynb version.
Thanks,
Subham
Hi
Really comprehensive content.
Can you please share whether all notebooks can be run offline (without an Internet connection) after the initial model download?
Also, is any API access needed to run the notebooks in this course?
Thank you
I get the error below when I run the code in Jupyter Notebook:
Running the following command and then restarting the notebook solved the problem:
pip install -U accelerate
Here is the error discussion: https://github.com/huggingface/transformers/issues/23340
Hello!
I could not find the "Releases" section in this repo for getting the slides of the course.
Can you please point out where I can find them?
Thanks,
Florin
This lab uses the CharacterTextSplitter, which splits based on a separator, not on chunk_size and chunk_overlap.
We need to either remove those arguments or use the TokenTextSplitter.
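The difference between the two strategies can be illustrated without LangChain. The sketch below is not either splitter's actual implementation: both function names are hypothetical, and whitespace-separated words stand in for real tokenizer tokens. It only shows why a separator-based split can return one oversized piece while a size-based split honours chunk_size and chunk_overlap.

```python
from typing import List


def split_on_separator(text: str, separator: str = "\n\n") -> List[str]:
    """Split only where the separator occurs; pieces can be arbitrarily long."""
    return [piece for piece in text.split(separator) if piece]


def split_fixed_size(tokens: List[str], chunk_size: int, chunk_overlap: int) -> List[List[str]]:
    """Emit windows of at most chunk_size tokens.

    Each window repeats the last chunk_overlap tokens of the previous one.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


text = "one two three four five six seven"
# No blank line in the text, so the separator never matches.
pieces = split_on_separator(text)
# Whitespace words are a stand-in for tokens in this sketch.
windows = split_fixed_size(text.split(), chunk_size=3, chunk_overlap=1)
print(len(pieces), len(windows))
```

With a text that contains no separator, the first function returns the whole text as a single piece, whereas the second still produces bounded, overlapping windows.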
In the LLM 03L - Building LLM Chains Lab notebook, Question 3, when the user tries the map_rerank chain type, the model we use ('google/flan-t5-large') may not be powerful enough to consistently generate scores that the default parser for the default prompt can parse, so the results cannot be ranked.
The student may see error messages like
Could not parse output: [score between 0 and 100]
There are similar reports of this issue on langchain: langchain-ai/langchain#3970
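A possible mitigation is to check the model output for a usable score before ranking. The sketch below is a hypothetical helper, not LangChain's parser. It assumes the model is asked to write the word "score" followed by a number, and it returns None instead of raising when no such number is present, so the caller can skip or down-rank that document.

```python
import re
from typing import Optional

# Matches "score", an optional ":" or "=", then a 1-3 digit number.
# This pattern is an assumption about the output format, not LangChain's own regex.
_SCORE_PATTERN = re.compile(r"score\s*[:=]?\s*(\d{1,3})", re.IGNORECASE)


def extract_score(output: str) -> Optional[int]:
    """Return a score in [0, 100] if the output contains one, else None."""
    match = _SCORE_PATTERN.search(output)
    if match is None:
        return None
    value = int(match.group(1))
    return value if 0 <= value <= 100 else None


good = extract_score("The film was released in 1999.\nScore: 85")
echoed = extract_score("[score between 0 and 100]")
print(good, echoed)
```

The second call mimics the failure in the error message, where the model echoes the prompt placeholder instead of producing a number, and the helper reports that as None rather than as a parse error.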
When I ran
%run ../Includes/Classroom-Setup
for the first time, I received errors. Looking inside the code, I found the failure was at the line "DA.reset_lesson()". I was able to bypass it by commenting out this line; since it was the first time running the code, there was nothing to reset. I thought it would be helpful to share this feedback.
In the LLM 04 demo, we call imdb_ds = load_dataset("imdb") to load our fine-tuning dataset.
It looks like there was an update to this dataset, and this line now throws the error ExpectedMoreSplits: {'unsupervised'}.
This can be fixed by forcing a re-install of the latest version of Hugging Face's datasets library. However, doing so breaks the code further down, which can no longer find the train and validation splits in the dataset object.