cohere-ai / sandbox-grounded-qa
A sandbox repo for grounded question answering with Cohere and Google Search
License: MIT License
python3 cli_demo.py --cohere_api_key ... --serp_api_key ... --help
usage: cli_demo.py [-h] --cohere_api_key COHERE_API_KEY --serp_api_key SERP_API_KEY [--verbosity VERBOSITY]
A grounded QA bot with cohere and google search
optional arguments:
  -h, --help            show this help message and exit
  --cohere_api_key COHERE_API_KEY
                        api key for cohere
  --serp_api_key SERP_API_KEY
                        api key for serpAPI
  --verbosity VERBOSITY
                        verbosity level
What are the verbosity levels?
I ran into some cases where web pages could not be loaded, either because the server was down or, for example, because of an error such as
ERROR: Page 'https://ch.linkedin.com/in/michaelwechner' could not be loaded! Exception message: HTTP Error 999: INKApi Error
or because of SSL certificate verification issues.
Therefore I think it would be good to log the exception message inside qa/search.py:
def get_paragraphs_text_from_url(k):
    """Extract a list of paragraphs from the contents pointed to by an url."""
    i, search_result_url = k
    try:
        html = open_link(search_result_url)
        return paragraphs_from_html(html)
    except Exception as e:
        # Log the exception message so that failed page loads can be diagnosed.
        pretty_print("FAIL", f"ERROR: Page '{search_result_url}' could not be loaded! Exception: {e}")
        return []
It is critical -- for the application I have in mind -- that the API return the text (and its url) justifying the answer.
Oddly, this time I do get a response, and it is correct. But the stated url does not have this information! The entire trace does not have this information (CL). It just mysteriously produces the correct answer.
Looks like the log does not show all the texts processed...?
python3 cli_demo.py --cohere_api_key ... --serp_api_key ... --verbosity 2000
question: What is the GLOBEX Code for Crude Oil Futures on the Chicago Mercantile Exchange?
contextual question prompt: user: How much does a watermellon weight?
bot: The average watermellon weighhts about 20 - 25 pounds
user: where do they grow?
-where do watermellons grow?
---
user: who was the president in 2010
bot: Obama was president in 2010
user: when was Obama born?
bot: August 4, 1961
user: who wrote attention is all you need?
-who wrote attention is all you need?
---
user: can people live on mars?
bot: People cannot live without advanced technology.
user: what is the killers latest album?
bot: Pressure Machine
user: was it well recieved
-was Pressure Machine by the Killers well recieved?
---
user: When is ACL is this year?
bot: ACL will be september 2nd
user: where?
-where is ACL this year?
---
user: but speaking of do you know when that band formed?
bot: 1967
user: who was their guitarist?
-who was the guitarist of The Band?
---
user: when did world war 2 end?
bot: September 2, 1945
user: Who was President at that time?
bot: FDR
user: who did he serve with?
-who did President Franklin Delano Roosevelt serve with?
---
user: what year was chuck berry born?
bot: 1942
user: where?
-where was Chuck Berry born?
---
user: What year did Woodstock take place?
bot: 1969
user: where was it held?
-where was Woodstock held?
---
user: who wrote the screenplay for Rocky?
bot: Sylvester Stallone
user: what is the plot?
-what is the plot of Rocky by Sylvester Stallone?
---
user: What is the GLOBEX Code for Crude Oil Futures on the Chicago Mercantile Exchange?
-
contextual question: What is the GLOBEX Code for Crude Oil Futures on the Chicago Mercantile Exchange?
https://serpapi.com/search
all search result context: Just getting started with futures? Learn more about futures and the unique advantages of futures trading.
Companies that produce or consume crude oil can manage price risk by using crude oil futures as hedge to lock in the price for the crude oil — the selling price if they produce it or the buying price if they consume it. Investors can also use crude oil futures to hedge against investments in their portfolio that may be sensitive to crude oil price changes.
Crude oil futures provide individual investors with an easy and convenient way to participate in one of the world's most important commodity markets. In addition, a broad cross-section of companies in the energy industry-from those involved in exploration and production to refiners-can use crude oil futures contracts to hedge their price risk. Light, sweet crude is preferred by refiners because of its low sulfur content and relatively high yields of gasoline, diesel fuel, heating oil, and jet fuel. Even companies that are substantial consumers of energy products can use crude oil futures to protect against adverse price fluctuations.
Investors and traders can use crude oil futures to speculate on the future price of crude oil and can be used as an alternative to oil and gas stocks. Crude oil prices can change due to a number of factors but primarily from the perceived changes in supply and demand that comes from both overall output worldwide and the economic health of the industry’s major consuming countries.
It is important to understand the benefits and risks involved with crude oil futures before placing a futures trade. Compared to traditional investments, with crude oil futures you can trade nearly 24 hours a day during the trading week and take advantage of trading opportunities regardless of market direction. Crude oil futures also provide the ability to trade with greater leverage and allow a more efficient use of trading capital. However, trading leveraged products like crude oil futures also involves the risk that losses can exceed the amount originally invested and may not be suitable for all investors.
This site is designed for U.S. residents. Non-U.S. residents are subject to country-specific restrictions. Learn more about our services for non-U.S. residents.
The Charles Schwab Corporation provides a full range of brokerage, banking and financial advisory services through its operating subsidiaries. Its broker-dealer subsidiary, Charles Schwab & Co., Inc. (Member SIPC), offers investment services and products, including Schwab brokerage accounts. Its banking subsidiary, Charles Schwab Bank, SSB (member FDIC and an Equal Housing Lender), provides deposit and lending services and products. Access to Electronic Services may be limited or unavailable during periods of peak demand, market volatility, systems upgrade, maintenance, or for other reasons.
Crude oil futures on the New York Mercantile Exchange (NYMEX) are the world's most actively traded futures contract on a physical commodity. Because of its excellent liquidity and price transparency, the contract is used as a principal international pricing benchmark. The NYMEX also offers trading in heating oil futures and gasoline futures.
At Schwab, you get access to specialize trading tools and resources, such as real-time crude oil futures quotes, timely research and education, and other helpful insights.
Charles Schwab Futures and Forex LLC (NFA Member) and Charles Schwab & Co., Inc. (Member FINRA/SIPC) are separate but affiliated companies and subsidiaries of The Charles Schwab Corporation.
Considering trading crude oil futures? Here are the crude oil futures contract specifications.
Futures and futures options trading involves substantial risk and is not suitable for all investors. Please read the Risk Disclosure for Futures and Options prior to trading futures products. Futures accounts are not protected by SIPC. Futures and futures options trading services provided by Charles Schwab Futures and Forex LLC. Trading privileges subject to review and approval. Not all clients will qualify.
Crude oil futures are 1,000 barrels per contract, traded from 6:00 p.m. U.S. until 5:00 p.m. U.S. ET, all months of the year. However, you can trade more than just NYMEX crude oil futures online with Schwab. We also offer Brent crude oil futures as well as E-mini crude oil futures, which are just 50% of the size of a standard futures contract. E-mini crude futures trade exclusively on the Chicago Mercantile Exchange's Globex® platform nearly 24 hours per day.
Please visit this URL to
review a list of supported browsers.
The ICE West Texas Intermediate (WTI) Light Sweet Crude Oil Futures
Contract offers participants the opportunity to trade one of the
world's most liquid oil commodities in an electronic marketplace.
The contract not only brings the benefits of electronic trading a
US light sweet crude marker, but also brings together the world's
most significant crude benchmarks on a single exchange: Brent,
(Platts) Dubai, and WTI, as well as the emerging benchmarks Murban
and Midland WTI AGC. This offers a reduction in collateral
requirements through the offsetting of margins.
Take advantage of $2.25 per contract pricing plus specialized tools, research, and support.
© 2022 Charles Schwab & Co., Inc. All rights reserved. Member SIPC. Unauthorized access is prohibited. Usage will be monitored.
relevant result context: Take advantage of $2.25 per contract pricing plus specialized tools, research, and support.
© 2022 Charles Schwab & Co., Inc. All rights reserved. Member SIPC. Unauthorized access is prohibited. Usage will be monitored.
Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
answer: CL
Source:
https://www.schwab.com/futures/crude-oil
question:
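One way to make the gap in the trace above visible is to check whether the answer string actually occurs in any of the retrieved paragraphs before a source is reported. The following is only a minimal sketch, not the repository's code: `find_supporting_passages` and the `(url, paragraph)` pair layout are hypothetical names chosen for illustration.

```python
import re
from typing import List, Tuple


def find_supporting_passages(answer: str,
                             passages: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the (url, paragraph) pairs whose text contains the answer as a whole word.

    Hypothetical helper: the (url, paragraph) layout is an assumption made for
    this sketch, not the data structure used in qa/search.py.
    """
    if not answer.strip():
        return []
    # Whole-word, case-insensitive match, so that "CL" does not match "include".
    pattern = re.compile(r"(?<!\w)" + re.escape(answer.strip()) + r"(?!\w)",
                         re.IGNORECASE)
    return [(url, text) for url, text in passages if pattern.search(text)]


# Paragraph taken from the "relevant result context" in the trace above.
passages = [
    ("https://www.schwab.com/futures/crude-oil",
     "Take advantage of $2.25 per contract pricing plus specialized tools, "
     "research, and support."),
]
if not find_supporting_passages("CL", passages):
    print("WARNING: the answer does not appear in any retrieved paragraph")
```

A plain string match like this cannot judge paraphrased answers, so it would serve only as a warning signal, not as proof that an answer is ungrounded.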
I would like to suggest developing a minimal web service / REST demo, e.g. rest_demo.py, using for example Flask
from flask import Flask
from flask_swagger_ui import get_swaggerui_blueprint
which would allow other applications/services to integrate grounded-qa using REST.
WDYT?
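To make the proposal more concrete, here is a rough sketch of what such a REST endpoint could look like. It uses only the standard library's `http.server` instead of Flask so that it runs without extra dependencies; the request handling lives in a plain function that a Flask route could call as well. All names (`handle_question`, `make_handler`, `serve`, `echo_answer`) and the JSON layout are assumptions for illustration, and the placeholder answer function stands in for a call to the bot.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Tuple


def handle_question(raw_body: bytes, answer_fn: Callable[[str], str]) -> Tuple[int, dict]:
    """Turn a JSON request body into an (HTTP status, JSON-serialisable reply) pair."""
    try:
        payload = json.loads(raw_body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return 400, {"error": "request body must be valid JSON"}
    question = payload.get("question") if isinstance(payload, dict) else None
    if not isinstance(question, str) or not question.strip():
        return 400, {"error": "missing 'question' field"}
    return 200, {"question": question, "answer": answer_fn(question)}


def make_handler(answer_fn: Callable[[str], str]):
    """Build a request handler class bound to the given answer function."""

    class QaHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            status, reply = handle_question(self.rfile.read(length), answer_fn)
            body = json.dumps(reply).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return QaHandler


def echo_answer(question: str) -> str:
    """Placeholder; a real demo would call the grounded QA bot here (assumption)."""
    return f"(no bot configured) you asked: {question}"


def serve(host: str = "127.0.0.1", port: int = 8080,
          answer_fn: Callable[[str], str] = echo_answer) -> None:
    """Start the demo server and block until interrupted."""
    HTTPServer((host, port), make_handler(answer_fn)).serve_forever()
```

Calling `serve()` starts the server; a client would then POST a body such as `{"question": "..."}` and receive the question and answer back as JSON.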
Since the free plan at SerpAPI only allows 100 requests per month, I would like to suggest making it possible to mock SerpAPI for testing and development.
Inside
qa/search.py#get_results_paragraphs_multi_process(...)
maybe we could add something like
if mock:
    results = mock_search()
else:
    results = serp_api_search(search_term, serp_api_token, url)
WDYT?
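A slightly fuller sketch of the idea, under some assumptions: the real search is passed in as a callable so that the example stays self-contained, the result is assumed to be a list of URLs, and the canned data in `MOCK_RESULTS` is a made-up placeholder. Only the names `mock_search` and `serp_api_search` come from the suggestion above; everything else is hypothetical.

```python
from typing import Callable, Dict, List

# Canned results keyed by lower-cased search term; contents are placeholders.
MOCK_RESULTS: Dict[str, List[str]] = {
    "what is the capital of brazil?": ["https://example.com/brazil-capital"],
}


def mock_search(search_term: str) -> List[str]:
    """Return canned URLs for a known search term, or an empty list."""
    return MOCK_RESULTS.get(search_term.strip().lower(), [])


def search(search_term: str,
           real_search: Callable[[str], List[str]],
           mock: bool = False) -> List[str]:
    """Dispatch to the mock or to the real search function.

    `real_search` stands in for the repository's serp_api_search; its exact
    signature and return type are assumptions in this sketch.
    """
    if mock:
        return mock_search(search_term)
    return real_search(search_term)
```

With `mock=True` no request is sent at all, so the monthly SerpAPI quota is left untouched during development.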
When I entered the question "What is the capital of Brazil?", I received the answer "Rio de Janeiro", which is not correct but also not completely wrong. See the log below to understand why:
question: What is the capital of Brazil?
contextual question prompt: user: How much does a watermellon weight?
bot: The average watermellon weighhts about 20 - 25 pounds
user: where do they grow?
-where do watermellons grow?
---
user: who was the president in 2010
bot: Obama was president in 2010
user: when was Obama born?
bot: August 4, 1961
user: who wrote attention is all you need?
-who wrote attention is all you need?
---
user: can people live on mars?
bot: People cannot live without advanced technology.
user: what is the killers latest album?
bot: Pressure Machine
user: was it well recieved
-was Pressure Machine by the Killers well recieved?
---
user: When is ACL is this year?
bot: ACL will be september 2nd
user: where?
-where is ACL this year?
---
user: but speaking of do you know when that band formed?
bot: 1967
user: who was their guitarist?
-who was the guitarist of The Band?
---
user: when did world war 2 end?
bot: September 2, 1945
user: Who was President at that time?
bot: FDR
user: who did he serve with?
-who did President Franklin Delano Roosevelt serve with?
---
user: what year was chuck berry born?
bot: 1942
user: where?
-where was Chuck Berry born?
---
user: What year did Woodstock take place?
bot: 1969
user: where was it held?
-where was Woodstock held?
---
user: who wrote the screenplay for Rocky?
bot: Sylvester Stallone
user: what is the plot?
-what is the plot of Rocky by Sylvester Stallone?
---
user: What is the capital of Brazil?
-
contextual question: what is the capital of Brazil?
https://serpapi.com/search
all search result context:
Dr. David William Foster, Arizona State University
March 21, 2014 (Fri) 3:00 – 5:00 p.m.
UNM Continuing Education Auditorium
1634 University Blvd NE (at the intersection with Indian School Rd)
Supported by New Mexico Humanities Council, Sandia National Labs & University of New Mexico
David William Foster (Ph.D. & MA, University of Washington) is Regents’ Professor of Spanish, Humanities, and Women’s Studies at Arizona State University. He served as Chair of the Department of Languages and Literatures from 1997-2001. His research interests focus on urban culture in Latin America, with emphasis on issues of gender construction and sexual identity, as well as Jewish culture. He has held Fulbright teaching appointments in Argentina, Brazil, and Uruguay. He has also served as an Inter-American Development Bank Professor in Chile. Foster’s most recent books are Urban Photography in Argentina (2007) and São Paulo: Perspectives on the City and Cultural Production (2011). Foster conducted a seminar in 2010 on Brazilian Urban Fiction as part of the National Endowment for the Humanities Summer Seminars for College and University Teachers, and in 2007 he conducted a seminar, in Argentina, on Jewish Buenos Aires. Foster is past-President of the Latin American Jewish Studies Association.
In reality, Brazil has only had three official capitals, each one relating to a distinct phase in its social and cultural history. Salvador de Bahia was the early Portuguese colonial capital. Rio de Janeiro became the capital of colonial Brazil during its most dynamic development in the 19th century, and it remained capital throughout its serving as seat of the Portuguese empire, as capital of the Brazilian empire, and as capital of the Republic of Brazil, until the mid-twentieth century. In 1960, the modernist capital of Brasília, serving what is now a Latin American superpower, is inaugurated, although the transition from Rio to Brasília is not without problems. Alongside these three official capitals, from early in the 20th century, São Paulo emerges as the major industrial center of Latin America and as the financial capital of Brazil and of Latin America as a whole. Today São Paulo, while Rio remains the historical capital of Brazil and the capital of the tourist imaginary of Brazil, serves as the cultural capital of the country.
relevant result context:
David William Foster (Ph.D. & MA, University of Washington) is Regents’ Professor of Spanish, Humanities, and Women’s Studies at Arizona State University. He served as Chair of the Department of Languages and Literatures from 1997-2001. His research interests focus on urban culture in Latin America, with emphasis on issues of gender construction and sexual identity, as well as Jewish culture. He has held Fulbright teaching appointments in Argentina, Brazil, and Uruguay. He has also served as an Inter-American Development Bank Professor in Chile. Foster’s most recent books are Urban Photography in Argentina (2007) and São Paulo: Perspectives on the City and Cultural Production (2011). Foster conducted a seminar in 2010 on Brazilian Urban Fiction as part of the National Endowment for the Humanities Summer Seminars for College and University Teachers, and in 2007 he conducted a seminar, in Argentina, on Jewish Buenos Aires. Foster is past-President of the Latin American Jewish Studies Association.
In reality, Brazil has only had three official capitals, each one relating to a distinct phase in its social and cultural history. Salvador de Bahia was the early Portuguese colonial capital. Rio de Janeiro became the capital of colonial Brazil during its most dynamic development in the 19th century, and it remained capital throughout its serving as seat of the Portuguese empire, as capital of the Brazilian empire, and as capital of the Republic of Brazil, until the mid-twentieth century. In 1960, the modernist capital of Brasília, serving what is now a Latin American superpower, is inaugurated, although the transition from Rio to Brasília is not without problems. Alongside these three official capitals, from early in the 20th century, São Paulo emerges as the major industrial center of Latin America and as the financial capital of Brazil and of Latin America as a whole. Today São Paulo, while Rio remains the historical capital of Brazil and the capital of the tourist imaginary of Brazil, serves as the cultural capital of the country.
answer: Rio de Janeiro
Source:
http://www.abqinternational.org/four-capitals-of-brazil/
Even though the source is not very clear, one would expect the answer to be "Brasilia" and not "Rio de Janeiro".
I followed the installation instructions, but when running python3 cli_demo.py ... I receive the following error:
Traceback (most recent call last):
File "cli_demo.py", line 13, in <module>
from qa.bot import GroundedQaBot
File "/Users/michaelwechner/src/sandbox-grounded-qa/qa/bot.py", line 11, in <module>
import cohere
ModuleNotFoundError: No module named 'cohere'
Note that when I installed the dependencies (pip3 install -r requirements.txt), I received various errors:
Building wheel for numpy (PEP 517) ... error
ERROR: Command errored out with exit status 1:
..
ERROR: Failed building wheel for numpy
...
Building wheel for pyarrow (PEP 517) ... error
ERROR: Command errored out with exit status 1:
...
ERROR: Failed building wheel for pyarrow
....
Building wheel for pillow (setup.py) ... error
ERROR: Command errored out with exit status 1:
....
Running setup.py clean for pillow
....
Failed to build numpy pyarrow pillow
ERROR: Could not build wheels for numpy, pyarrow which use PEP 517 and cannot be installed directly
I was running it on Mac OS Monterey 12.2.1
Python 3.8.9
pip 20.2.3
Any idea what I might have to change? Thanks for any pointers!
Hi, this is a very cool project, thanks for open sourcing it!
I just watched your intro video in which you showed the failure mode on questions like "Who is the king of America?" and would like to propose a potential solution to this by adding one more step to the question answering system:
Maybe I will work on a PR sometime in the next week, but I wanted to leave the idea here in case somebody else wants to do it sooner.
My question was simple:
What is the GLOBEX Code for Crude Oil Futures on the Chicago Mercantile Exchange?
I get a traceback:
https://serpapi.com/search
Traceback (most recent call last):
File "/Users/vijaysaraswat/Documents/code/sandbox-grounded-qa/cli_demo.py", line 26, in <module>
reply = bot.answer(question, verbosity=args.verbosity, n_paragraphs=2)
File "/Users/vijaysaraswat/Documents/code/sandbox-grounded-qa/qa/bot.py", line 42, in answer
answer_text, source_urls, source_texts = answer_with_search(question,
File "/Users/vijaysaraswat/Documents/code/sandbox-grounded-qa/qa/answer.py", line 83, in answer_with_search
response = answer(question, context, co, chat_history=chat_history, model=model)
File "/Users/vijaysaraswat/Documents/code/sandbox-grounded-qa/qa/answer.py", line 39, in answer
prediction = co.generate(model=model,
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/site-packages/cohere/client.py", line 115, in generate
response = self.__request(json_body, cohere.GENERATE_URL)
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/site-packages/cohere/client.py", line 289, in __request
raise CohereError(message=res['message'], http_status=response.status_code, headers=response.headers)
cohere.error.CohereError: too many tokens: total number of tokens (prompt and prediction) cannot exceed 2048 - received 2954. Try using a shorter prompt or a smaller max_tokens value.
Exception ignored in: <function Pool.__del__ at 0x10d690160>
Traceback (most recent call last):
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/pool.py", line 268, in __del__
self._change_notifier.put(None)
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/queues.py", line 378, in put
self._writer.send_bytes(obj)
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
self._send(header + buf)
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 373, in _send
n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor
Exception ignored in: <function Pool.__del__ at 0x10d690160>
Traceback (most recent call last):
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/pool.py", line 268, in __del__
self._change_notifier.put(None)
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/queues.py", line 378, in put
self._writer.send_bytes(obj)
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
self._send(header + buf)
File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 373, in _send
n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor
Looks like a bug in extracting the text from the webpage...?
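The first exception in the traceback above says that the prompt plus prediction came to 2954 tokens against a limit of 2048. One possible mitigation is to drop trailing paragraphs until the prompt fits. The sketch below is an assumption-laden illustration, not the repository's code: `estimate_tokens` and `fit_paragraphs` are hypothetical helpers, and counting whitespace-separated words is only a crude stand-in for the model's real tokenizer, which is why a safety margin is applied.

```python
from typing import List

MODEL_TOKEN_LIMIT = 2048  # limit quoted in the error message above


def estimate_tokens(text: str) -> int:
    """Crude proxy: one token per whitespace-separated word (assumption)."""
    return len(text.split())


def fit_paragraphs(paragraphs: List[str], prompt_overhead: int, max_tokens: int,
                   limit: int = MODEL_TOKEN_LIMIT,
                   safety_margin: float = 0.75) -> List[str]:
    """Keep leading paragraphs while the estimated prompt size stays within budget.

    prompt_overhead: estimated tokens of the fixed prompt (few-shot examples, question).
    max_tokens: tokens reserved for the model's prediction.
    """
    budget = int(limit * safety_margin) - prompt_overhead - max_tokens
    kept: List[str] = []
    used = 0
    for paragraph in paragraphs:
        cost = estimate_tokens(paragraph)
        if used + cost > budget:
            break  # this and all later paragraphs are dropped
        kept.append(paragraph)
        used += cost
    return kept
```

Dropping whole paragraphs keeps each remaining passage intact, at the cost of possibly discarding the paragraph that contains the answer; ranking paragraphs by relevance before truncating would reduce that risk.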