ysymyth / react
[ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models
License: MIT License
Thank you for your code. But I cannot access the WebShop URL in your Jupyter notebook. Do I have to launch another service?
Hi, I was wondering how we could finetune the small ReAct model given the prompt data generated by the prompted LLM.
Are we supposed to use LoRA or P-Tuning for the finetuning step?
How should the prompt data be used?
(1) Letting all the actions and thoughts be the input and let the final action (answer) be the output
(2) Parse the whole ReAct process and use previous in-context info as input and current action as output
(3) Or any other way you used?
Really appreciate your help.
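To make options (1) and (2) concrete, here is a minimal sketch of how one trajectory could be turned into training pairs under each option. It is not the authors' finetuning pipeline: the trajectory text, the `Thought N:` / `Action N:` / `Observation N:` layout, and the helper names `whole_trajectory_pair` and `stepwise_pairs` are all assumptions made for illustration, and the stepwise variant emits the thought together with the action as the target.

```python
import re
from typing import List, Tuple

# Hypothetical trajectory; the content is made up for illustration only.
TRAJECTORY = """Question: Which city hosted the event?
Thought 1: I need to search for the event.
Action 1: Search[event]
Observation 1: The event was held in Paris.
Thought 2: The answer is Paris.
Action 2: Finish[Paris]"""


def whole_trajectory_pair(trajectory: str) -> Tuple[str, str]:
    """Option (1): everything before the final action is the input,
    and the final action is the output."""
    lines = trajectory.strip().splitlines()
    return "\n".join(lines[:-1]) + "\n", lines[-1]


def stepwise_pairs(trajectory: str) -> List[Tuple[str, str]]:
    """Option (2): one pair per step, mapping the context so far to the
    model's next thought and action."""
    pairs: List[Tuple[str, str]] = []
    context: List[str] = []  # lines the model is conditioned on
    target: List[str] = []   # lines the model should generate next
    for line in trajectory.strip().splitlines():
        if re.match(r"(Thought|Action) \d+:", line):
            target.append(line)
            continue
        # A Question or Observation line comes from the environment,
        # so it closes the current generation target (if any).
        if target:
            pairs.append(("\n".join(context) + "\n", "\n".join(target)))
            context.extend(target)
            target = []
        context.append(line)
    if target:
        pairs.append(("\n".join(context) + "\n", "\n".join(target)))
    return pairs


pairs = stepwise_pairs(TRAJECTORY)
final_input, final_output = whole_trajectory_pair(TRAJECTORY)
```

Option (2) yields more training examples per trajectory and matches how the model is queried at inference time, one step at a time; whether that matters for final accuracy is not something this sketch can answer.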
Hi,
For webshop env, what was the number of retrieved items displayed per page?
As per the code, it seems item names indexed after 3 are purposefully omitted, which does not seem to be clarified in the actual paper.
Could you please explicitly clarify this setting just so that I am clear whether this was a small change for visualization in code or was it done for all results reported in the paper?
I was looking through the earlier issues in the repo and couldn't find this resolved in the closed issues.
Thanks!
Hi, I'm trying to reproduce your ReAct results on Webshop using some LLM APIs. However, I sometimes encountered the following errors.
Basically, sometimes, after you select some specific options and then click[Buy Now], it's going to show the error below:
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/webshop/lib/python3.8/site-packages/flask/app.py", line 2095, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/user/anaconda3/envs/webshop/lib/python3.8/site-packages/flask/app.py", line 2080, in wsgi_app
    response = self.handle_exception(e)
  File "/home/user/anaconda3/envs/webshop/lib/python3.8/site-packages/flask/app.py", line 2077, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/user/anaconda3/envs/webshop/lib/python3.8/site-packages/flask/app.py", line 1525, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/user/anaconda3/envs/webshop/lib/python3.8/site-packages/flask/app.py", line 1523, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/user/anaconda3/envs/webshop/lib/python3.8/site-packages/flask/app.py", line 1509, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/home/user/webshop/web_agent_site/app.py", line 221, in done
    options = literal_eval(options)
  File "/home/user/anaconda3/envs/webshop/lib/python3.8/ast.py", line 59, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/home/user/anaconda3/envs/webshop/lib/python3.8/ast.py", line 47, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 1
    {'color': '2
               ^
SyntaxError: EOL while scanning string literal
To reproduce the error, you can try this:
In the ipython file of ReAct webshop, select the task id 83: i need a slim fit gray colored coat that has long sleeves. it should be in x-large size, and price lower than 40.00 dollars. Then do the following actions:
Then the error occurs. When doing these actions directly on the website, there is no such error. Therefore there may be something wrong when passing the argument to the environment.
(The errors I notice all come when an option that has '#' inside it is selected, maybe that's useful. )
Could you please help check that? Thank you so much!
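The observation about '#' is consistent with the traceback: in a URL, an unescaped '#' starts the fragment, so everything after it never reaches the route as part of the path. The sketch below reproduces that truncation with the standard library only. The host, port, session id, ASIN, and option value are hypothetical placeholders; only the `/done/<session_id>/<asin>/<options>` route shape and the `literal_eval` call come from the traceback. Percent-encoding the options segment is shown as one possible workaround, not as the fix the maintainers applied.

```python
from ast import literal_eval
from urllib.parse import quote, unquote, urlsplit

# Hypothetical option value that contains '#'.
options = {"color": "2#gray"}

# Building the URL by plain string formatting leaves '#' unescaped.
raw_url = f"http://localhost:3000/done/fixed_83/B000000000/{options}"

# Everything after '#' is parsed as the URL fragment, not the path.
truncated = urlsplit(raw_url).path.rsplit("/", 1)[-1]

try:
    literal_eval(truncated)
    parse_failed = False
except SyntaxError:
    # Same failure class as in the traceback: the string literal is cut off.
    parse_failed = True

# Possible workaround: percent-encode the segment before building the URL.
safe_url = (
    "http://localhost:3000/done/fixed_83/B000000000/"
    + quote(str(options), safe="")
)
encoded_segment = urlsplit(safe_url).path.rsplit("/", 1)[-1]
recovered = literal_eval(unquote(encoded_segment))
```

Here `truncated` ends right after the `2`, which matches the `{'color': '2` shown in the error, and `recovered` equals the original dictionary.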
Hi! I'm replicating ReAct results on WebShop, and I have several questions about webshopEnv in the Jupyter notebook.
if prod_cnt >= 3:
    processed_t = ''
Is this also what you used in the paper?
assert False
when the button Next or Prev is clicked. Is this also intentional?
Also, I got ReAct results on WebShop with session ids fixed_{1-500}, which I believe is the same setup as the paper, using this environment (unmodified) but with different LLMs (not PaLM-540B):
gpt-turbo-3.5
Act - Score: 64.99 Success Rate: 34.0
ReAct - Score: 59.9 Success Rate: 30.0
code-davinci-002
Act - Score: 64.99 Success Rate: 34.0
ReAct - Score: 65.60 Success Rate: 38.8
Is this to be expected? I'm wondering if you have any thoughts on this. After some research, I found people saying that chain-of-thought might not be as effective for models that were trained with RLHF, like ChatGPT. But I don't have much of an explanation for why I'm not seeing the performance boost from Act to ReAct with Codex (code-davinci-002).
Thank you in advance! Love the simplicity of your work and I'm trying to come up with new ideas based off of this paper :)
Hello @ysymyth, thanks for sharing your code, excellent work! Is there any plan to release the code of FEVER and WebShop? Thank you!
Hi,
I wondered if you had more details or numbers from your GPT-3 results on Alfworld? For instance, do you have the splits of accuracy across the different subtasks (as in Table 3 in the paper)?
I would try to reproduce it, but I reckon the total cost would be > $100 and would like to avoid it if possible.
Hello, thank you for this important work and project!
I'm already seeing many references to the paradigm. The problem is that there was already a massively popular project named React. This makes searches for ReAct somewhat difficult.
Hi Shunyu,
Could you provide text-davinci-002 trajectory on HotpotQA 500 (30.8EM in Table 5 of A.1 GPT-3 Experiments)?
Thank you!
I was wondering if AutoGPT was inspired by your ideas. Anyway, thanks for your great efforts.
@ysymyth Thanks for your good work!
Can you attach the output of HotpotQA (hotpotqa.ipynb), like those in (FEVER.ipynb)? Thank you!
Did you use a prompt like https://github.com/hwchase17/langchain/blob/bc2ed93b77cf9c40920ca5bf96968c90bb3e322e/langchain/agents/react/textworld_prompt.py#L4-L45 to ask GPT-3 to generate results in the ReAct format?
Or did you just create many examples and fine-tune on them so that the model generates this format? Does that only work with your fine-tuned model and not with GPT-3/4?
I'd like to know whether the method in langchain is actually correct and works.
Hi, I'm trying to run ReAct with GPT-3.5-Turbo on the HotpotQA dataset with the provided Jupyter notebook, but I only get 0.182 accuracy. Is that a reasonable result? It seems much lower than the result shown in the paper.
Don't give me links to Alfworld! The installations there don't work, the support is nonexistent.
How can I install ReAct on my Ubuntu 22.04?
I tried to run Webshop.ipython, and here are some of the outputs:
Observation: Invalid action!
Action: click[Add to Cart]
Observation:
Action: click[Add to Cart]
Observation:
Action: click[Checkout]
Observation:
Action: click[Checkout]
Observation:
Action: click[Proceed to checkout]
Observation:
Action: click[Proceed to checkout]
Observation:
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
1 0.0 0.0 0.0
-------------
-----------------
1
Action: reset
Observation:
Action: click[Buy Now]
Observation: Invalid action!
Action: click[Add to Cart]
Observation:
Action: click[Add to Cart]
Observation:
Action: click[Checkout]
Observation:
Action: click[Checkout]
Observation:
Action: click[Proceed to checkout]
Observation:
Action: click[Proceed to checkout]
Observation:
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
Action: fill out form[name: John Doe, email: [email protected], phone: 555-555-1212, address: 123 Main St, city: Anytown, state: CA, zip: 99999]
Observation: Invalid action!
2 0.0 0.0 0.0
-------------
-----------------
How can I get the right score? Thank you
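In the log above, only `click[...]` actions appear in a recognizable form, and every `fill out form[...]` action is answered with "Invalid action!". A quick way to see how often the model leaves the expected format is to check each generated action before sending it to the environment. The sketch below assumes the environment accepts only `search[...]` and `click[...]`; that assumption, the regular expression, and the helper name `parse_action` are illustrative and not taken from the notebook.

```python
import re
from typing import Optional, Tuple

# Assumed action grammar: "search[<query>]" or "click[<button text>]".
ACTION_PATTERN = re.compile(r"^(search|click)\[(.+)\]$")


def parse_action(action: str) -> Optional[Tuple[str, str]]:
    """Return (action type, argument) when the string matches the assumed
    grammar, otherwise None."""
    match = ACTION_PATTERN.match(action.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


# Actions copied from the log, plus one search for comparison.
examples = [
    "click[Buy Now]",
    "click[Proceed to checkout]",
    "fill out form[name: John Doe, state: CA, zip: 99999]",
    "search[slim fit gray coat]",
]
parsed = [parse_action(a) for a in examples]
```

A well-formed `click[...]` can still be rejected when the named button is not on the current page, so this check only catches formatting problems, not navigation mistakes.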
I am impressed with your research. Thank you for your good research.
But I have a question and would like to ask.
According to Table 2 of the paper, success and failure modes are divided.
Thanks!
Hi,
Thanks for your great work! I have a question on Table 3, where results of Act and ReAct are reported as avg/best of 6. I am wondering where does 6 come from, given that the decoding strategy is greedy.
Thank you!
Have you ever considered applying ReAct prompting to numerical reasoning tasks (like GSM8K, or datasets that contain more difficult symbolic operations)?
If you have, does it show any improvement?
Thank you.
Hello, I would like to ask whether there is a code implementation of the CoT -> ReAct and ReAct -> CoT methods mentioned in the paper.
Hi there, I cannot seem to find any information on the fine-tuning process in your paper and this repository.
A snippet from your paper:
However, when finetuned with just 3,000 examples, ReAct becomes the best method among the four, with PaLM-8B finetuned ReAct outperforming all PaLM-62B prompting methods, and PaLM-62B finetuned ReAct outperforming all 540B prompting methods. In contrast, finetuning Standard or CoT is significantly worse than finetuning ReAct or Act for both PaLM-8/62B, as the former essentially teaches models to memorize (potentially hallucinated) knowledge facts, and the latter teaches models how to (reason and) act to access information from Wikipedia, a more generalizable skill for knowledge reasoning.