sandbox-grounded-qa's People

Contributors

chirag127, dependabot[bot], lvwerra, michaelwechner, neilatcohere, nickfrosst, stewart-co, yichern

sandbox-grounded-qa's Issues

--help is not helpful

 python3 cli_demo.py --cohere_api_key ...  --serp_api_key  ...  --help
usage: cli_demo.py [-h] --cohere_api_key COHERE_API_KEY --serp_api_key SERP_API_KEY [--verbosity VERBOSITY]

A grounded QA bot with cohere and google search

optional arguments:
  -h, --help            show this help message and exit
  --cohere_api_key COHERE_API_KEY
                        api key for cohere
  --serp_api_key SERP_API_KEY
                        api key for serpAPI
  --verbosity VERBOSITY
                        verbosity level

What are the verbosity levels?
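
One way the help output could answer this is to state the type, default, and meaning of --verbosity directly in the parser, and to list the two required keys under their own heading instead of "optional arguments". The sketch below is only an illustration: the flag names and description are taken from the usage output above, while the default of 0, the "larger values print more" wording, and the build_parser name are assumptions, not the project's documented behaviour.

```python
import argparse


def build_parser():
    """Build a parser whose --help output documents each flag."""
    parser = argparse.ArgumentParser(
        prog="cli_demo.py",
        description="A grounded QA bot with cohere and google search",
    )
    # Group required flags separately so they are not listed as "optional".
    required = parser.add_argument_group("required arguments")
    required.add_argument("--cohere_api_key", required=True, help="api key for cohere")
    required.add_argument("--serp_api_key", required=True, help="api key for serpAPI")
    # Assumption: verbosity is an integer and larger values print more output.
    parser.add_argument(
        "--verbosity",
        type=int,
        default=0,
        help="verbosity level as an integer; larger values print more "
             "intermediate output (default: %(default)s)",
    )
    return parser
```

With a parser like this, -h would show the default value and the expected type next to --verbosity.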

Log exception when web page cannot be loaded

I ran into several cases where web pages could not be loaded, for example because the server was down or returned an error such as

ERROR: Page 'https://ch.linkedin.com/in/michaelwechner' could not be loaded! Exception message: HTTP Error 999: INKApi Error

or because SSL certificate verification failed.

Therefore I think it would be good to log the exception message inside qa/search.py:

def get_paragraphs_text_from_url(k):
    """Extract a list of paragraphs from the contents pointed to by an url."""

    i, search_result_url = k
    try:
        html = open_link(search_result_url)
        return paragraphs_from_html(html)
    except Exception as e:
        pretty_print("FAIL", f"ERROR: Page '{search_result_url}' could not be loaded! Exception: {e}")
        return []
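
To make such log lines easier to read, the exception could also be classified before it is printed. The following is a minimal sketch using only the standard library; describe_fetch_error is a hypothetical helper name, and it assumes the page is fetched with urllib, which is not confirmed for open_link.

```python
import ssl
import urllib.error


def describe_fetch_error(url, exc):
    """Return a one-line, human-readable reason why `url` failed to load."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        # HTTPError carries the status code and reason sent by the server.
        reason = f"HTTP {exc.code} {exc.reason}"
    elif isinstance(exc, urllib.error.URLError):
        # URLError wraps lower-level failures such as DNS or TLS errors.
        if isinstance(exc.reason, ssl.SSLError):
            reason = f"SSL error: {exc.reason}"
        else:
            reason = f"connection error: {exc.reason}"
    else:
        reason = f"{type(exc).__name__}: {exc}"
    return f"Page '{url}' could not be loaded! {reason}"
```

The except branch above could then pass describe_fetch_error(search_result_url, e) to pretty_print instead of formatting the message inline.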

Mystifying, but correct response

It is critical -- for the application I have in mind -- that the API return the text (and its url) justifying the answer.

Oddly, this time I do get a response, and it is correct. But the stated url does not have this information! The entire trace does not have this information (CL). It just mysteriously produces the correct answer.

Looks like the log does not show all the texts processed...?

  python3 cli_demo.py --cohere_api_key ...  --serp_api_key ... --verbosity 2000
question:  What is the GLOBEX Code for Crude Oil Futures on the Chicago Mercantile Exchange?
contextual question prompt: user: How much does a watermellon weight?
bot: The average watermellon weighhts about 20 - 25 pounds
user: where do they grow?
-where do watermellons grow? 
---
user: who was the president in 2010
bot: Obama was president in 2010
user: when was Obama born?
bot: August 4, 1961
user: who wrote attention is all you need?
-who wrote attention is all you need? 
---
user: can people live on mars?
bot: People cannot live without advanced technology.
user: what is the killers latest album?
bot: Pressure Machine
user: was it well recieved
-was Pressure Machine by the Killers well recieved?
---
user: When is ACL is this year? 
bot: ACL will be september 2nd
user: where?
-where is ACL this year? 
---
user: but speaking of do you know when that band formed?
bot: 1967
user: who was their guitarist?
-who was the guitarist of The Band?
---
user: when did world war 2 end?
bot: September 2, 1945
user: Who was President at that time?
bot: FDR
user: who did he serve with?
-who did President Franklin Delano Roosevelt serve with? 
---
user: what year was chuck berry born?
bot: 1942
user: where?
-where was Chuck Berry born? 
---
user: What year did Woodstock take place?
bot: 1969
user: where was it held?
-where was Woodstock held? 
---
user: who wrote the screenplay for Rocky?
bot: Sylvester Stallone
user: what is the plot?
-what is the plot of Rocky by Sylvester Stallone?
---
user:  What is the GLOBEX Code for Crude Oil Futures on the Chicago Mercantile Exchange?
-
contextual question: What is the GLOBEX Code for Crude Oil Futures on the Chicago Mercantile Exchange? 

https://serpapi.com/search
all search result context: Just getting started with futures? Learn more about futures and the unique advantages of futures trading.
Companies that produce or consume crude oil can manage price risk by using crude oil futures as hedge to lock in the price for the crude oil — the selling price if they produce it or the buying price if they consume it. Investors can also use crude oil futures to hedge against investments in their portfolio that may be sensitive to crude oil price changes.
Crude oil futures provide individual investors with an easy and convenient way to participate in one of the world's most important commodity markets. In addition, a broad cross-section of companies in the energy industry-from those involved in exploration and production to refiners-can use crude oil futures contracts to hedge their price risk. Light, sweet crude is preferred by refiners because of its low sulfur content and relatively high yields of gasoline, diesel fuel, heating oil, and jet fuel. Even companies that are substantial consumers of energy products can use crude oil futures to protect against adverse price fluctuations.
 
Investors and traders can use crude oil futures to speculate on the future price of crude oil and can be used as an alternative to oil and gas stocks. Crude oil prices can change due to a number of factors but primarily from the perceived changes in supply and demand that comes from both overall output worldwide and the economic health of the industry’s major consuming countries.
It is important to understand the benefits and risks involved with crude oil futures before placing a futures trade. Compared to traditional investments, with crude oil futures you can trade nearly 24 hours a day during the trading week and take advantage of trading opportunities regardless of market direction. Crude oil futures also provide the ability to trade with greater leverage and allow a more efficient use of trading capital. However, trading leveraged products like crude oil futures also involves the risk that losses can exceed the amount originally invested and may not be suitable for all investors.
This site is designed for U.S. residents. Non-U.S. residents are subject to country-specific restrictions. Learn more about our services for non-U.S. residents.
The Charles Schwab Corporation provides a full range of brokerage, banking and financial advisory services through its operating subsidiaries. Its broker-dealer subsidiary, Charles Schwab & Co., Inc. (Member SIPC), offers investment services and products, including Schwab brokerage accounts. Its banking subsidiary, Charles Schwab Bank, SSB (member FDIC and an Equal Housing Lender), provides deposit and lending services and products. Access to Electronic Services may be limited or unavailable during periods of peak demand, market volatility, systems upgrade, maintenance, or for other reasons.
Crude oil futures on the New York Mercantile Exchange (NYMEX) are the world's most actively traded futures contract on a physical commodity. Because of its excellent liquidity and price transparency, the contract is used as a principal international pricing benchmark. The NYMEX also offers trading in heating oil futures and gasoline futures.
At Schwab, you get access to specialize trading tools and resources, such as real-time crude oil futures quotes, timely research and education, and other helpful insights. 
Charles Schwab Futures and Forex LLC (NFA Member) and Charles Schwab & Co., Inc. (Member FINRA/SIPC) are separate but affiliated companies and subsidiaries of The Charles Schwab Corporation. 
Considering trading crude oil futures? Here are the crude oil futures contract specifications.
Futures and futures options trading involves substantial risk and is not suitable for all investors. Please read the Risk Disclosure for Futures and Options prior to trading futures products. Futures accounts are not protected by SIPC. Futures and futures options trading services provided by Charles Schwab Futures and Forex LLC. Trading privileges subject to review and approval. Not all clients will qualify. 
Crude oil futures are 1,000 barrels per contract, traded from 6:00 p.m. U.S. until 5:00 p.m. U.S. ET, all months of the year. However, you can trade more than just NYMEX crude oil futures online with Schwab. We also offer Brent crude oil futures as well as E-mini crude oil futures, which are just 50% of the size of a standard futures contract. E-mini crude futures trade exclusively on the Chicago Mercantile Exchange's Globex® platform nearly 24 hours per day. 

    Please visit this URL to
    review a list of supported browsers.
  
The ICE West Texas Intermediate (WTI) Light Sweet Crude Oil Futures
Contract offers participants the opportunity to trade one of the
world's most liquid oil commodities in an electronic marketplace.
The contract not only brings the benefits of electronic trading a
US light sweet crude marker, but also brings together the world's
most significant crude benchmarks on a single exchange: Brent,
(Platts) Dubai, and WTI, as well as the emerging benchmarks Murban
and Midland WTI AGC. This offers a reduction in collateral
requirements through the offsetting of margins.

Take advantage of $2.25 per contract pricing plus specialized tools, research, and support.
© 2022 Charles Schwab & Co., Inc. All rights reserved. Member SIPC. Unauthorized access is prohibited. Usage will be monitored.
relevant result context: Take advantage of $2.25 per contract pricing plus specialized tools, research, and support.
© 2022 Charles Schwab & Co., Inc. All rights reserved. Member SIPC. Unauthorized access is prohibited. Usage will be monitored.
Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
answer: CL
Source:
https://www.schwab.com/futures/crude-oil
question: 

REST Demo

I would like to suggest developing a minimal web service / REST demo, e.g. rest_demo.py, using for example Flask:

from flask import Flask
from flask_swagger_ui import get_swaggerui_blueprint

This would allow other applications/services to integrate grounded-qa using REST.

WDYT?
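
As a rough illustration of what such a demo might look like, here is a minimal sketch. It uses the standard library's http.server rather than Flask so that it runs without extra dependencies; the same request handling could be moved into a Flask route. The /ask endpoint, the JSON field names, and the answer_fn callable (assumed to return an answer string and a list of source URLs) are all hypothetical and not part of the project.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_response(raw_body, answer_fn):
    """Turn a JSON request body into an (HTTP status, response dict) pair."""
    try:
        payload = json.loads(raw_body)
    except (ValueError, UnicodeDecodeError):
        return 400, {"error": "request body must be valid JSON"}
    question = payload.get("question") if isinstance(payload, dict) else None
    if not isinstance(question, str) or not question.strip():
        return 400, {"error": "field 'question' is required"}
    # answer_fn is a stand-in for the bot; assumed to return (answer, source_urls).
    answer, sources = answer_fn(question)
    return 200, {"question": question, "answer": answer, "sources": list(sources)}


def make_handler(answer_fn):
    """Create a request handler class that answers POST /ask."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/ask":
                status, body = 404, {"error": "unknown endpoint"}
            else:
                length = int(self.headers.get("Content-Length", 0))
                status, body = build_response(self.rfile.read(length), answer_fn)
            data = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return Handler


def serve(answer_fn, host="127.0.0.1", port=8080):
    """Run the demo server until interrupted."""
    HTTPServer((host, port), make_handler(answer_fn)).serve_forever()
```

Calling serve with a function that wraps the bot would expose it at POST /ask; keeping build_response separate from the handler makes the request logic testable without opening a socket.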

Mock SerpApi

Since the free plan at SerpAPI only allows 100 requests per month, I would like to suggest making it possible to mock SerpAPI for testing and development.

Inside

qa/search.py#get_results_paragraphs_multi_process(...)

maybe we could add something like

if mock:
    results = mock_search()
else:
    results = serp_api_search(search_term, serp_api_token, url)

WDYT?
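
A slightly fuller sketch of this idea is shown below. The result shape (an "organic_results" list with "title", "link", and "snippet" entries) is an assumption modelled on SerpAPI's JSON responses and may not match what serp_api_search returns in this project; search and real_search are hypothetical names introduced for illustration.

```python
def mock_search(search_term):
    """Return a canned result set without touching the network."""
    # Assumption: results look like SerpAPI's JSON, with an "organic_results" list.
    return {
        "organic_results": [
            {
                "position": 1,
                "title": f"Mock result for {search_term}",
                "link": "https://example.com/mock",
                "snippet": "Canned snippet used for testing.",
            }
        ]
    }


def search(search_term, real_search, mock=False):
    """Dispatch to the mock or to the real search function."""
    if mock:
        return mock_search(search_term)
    return real_search(search_term)
```

In the project, real_search would wrap the existing serp_api_search(search_term, serp_api_token, url) call, for example with a lambda that closes over the token and URL. Passing the real function in as a parameter keeps the mock path free of any API key.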

The national capital of Brazil is Brasilia

When I entered the question "What is the capital of Brazil?", I received the answer "Rio de Janeiro", which is not correct, but also not completely wrong. See the log below to understand why:

question: What is the capital of Brazil?
contextual question prompt: user: How much does a watermellon weight?
bot: The average watermellon weighhts about 20 - 25 pounds
user: where do they grow?
-where do watermellons grow? 
---
user: who was the president in 2010
bot: Obama was president in 2010
user: when was Obama born?
bot: August 4, 1961
user: who wrote attention is all you need?
-who wrote attention is all you need? 
---
user: can people live on mars?
bot: People cannot live without advanced technology.
user: what is the killers latest album?
bot: Pressure Machine
user: was it well recieved
-was Pressure Machine by the Killers well recieved?
---
user: When is ACL is this year? 
bot: ACL will be september 2nd
user: where?
-where is ACL this year? 
---
user: but speaking of do you know when that band formed?
bot: 1967
user: who was their guitarist?
-who was the guitarist of The Band?
---
user: when did world war 2 end?
bot: September 2, 1945
user: Who was President at that time?
bot: FDR
user: who did he serve with?
-who did President Franklin Delano Roosevelt serve with? 
---
user: what year was chuck berry born?
bot: 1942
user: where?
-where was Chuck Berry born? 
---
user: What year did Woodstock take place?
bot: 1969
user: where was it held?
-where was Woodstock held? 
---
user: who wrote the screenplay for Rocky?
bot: Sylvester Stallone
user: what is the plot?
-what is the plot of Rocky by Sylvester Stallone?
---
user: What is the capital of Brazil?
-
contextual question: what is the capital of Brazil? 

https://serpapi.com/search
all search result context:
Dr. David William Foster, Arizona State University
March 21, 2014 (Fri) 3:00 – 5:00 p.m.
UNM Continuing Education Auditorium
1634 University Blvd NE (at the intersection with Indian School Rd)
Supported by New Mexico Humanities Council, Sandia National Labs & University of New Mexico
David William Foster (Ph.D. & MA, University of Washington) is Regents’ Professor of Spanish, Humanities, and Women’s Studies at Arizona State University. He served as Chair of the Department of Languages and Literatures from 1997-2001. His research inter­ests focus on urban culture in Latin America, with emphasis on issues of gender construction and sexual identity, as well as Jewish culture. He has held Fulbright teaching appointments in Argentina, Brazil, and Uruguay. He has also served as an Inter-American Development Bank Professor in Chile. Foster’s most recent books are Urban Photography in Argentina (2007) and São Paulo: Perspectives on the City and Cultural Production (2011). Foster conducted a seminar in 2010 on Brazilian Urban Fiction as part of the National Endowment for the Humanities Summer Seminars for College and University Teachers, and in 2007 he conducted a seminar, in Argentina, on Jewish Buenos Aires. Foster is past-President of the Latin American Jewish Studies Association.
In reality, Brazil has only had three official capitals, each one relating to a distinct phase in its social and cultural history. Salvador de Bahia was the early Portuguese colonial capital. Rio de Janeiro became the capital of colonial Brazil during its most dynamic development in the 19th century, and it remained capital throughout its serving as seat of the Portuguese empire, as capital of the Brazilian empire, and as capital of the Republic of Brazil, until the mid-twentieth century. In 1960, the modernist capital of Brasília, serving what is now a Latin American superpower, is inaugurated, although the transition from Rio to Brasília is not without problems. Alongside these three official capitals, from early in the 20th century, São Paulo emerges as the major industrial center of Latin America and as the financial capital of Brazil and of Latin America as a whole. Today São Paulo, while Rio remains the historical capital of Brazil and the capital of the tourist imaginary of Brazil, serves as the cultural capital of the country.

relevant result context:
David William Foster (Ph.D. & MA, University of Washington) is Regents’ Professor of Spanish, Humanities, and Women’s Studies at Arizona State University. He served as Chair of the Department of Languages and Literatures from 1997-2001. His research inter­ests focus on urban culture in Latin America, with emphasis on issues of gender construction and sexual identity, as well as Jewish culture. He has held Fulbright teaching appointments in Argentina, Brazil, and Uruguay. He has also served as an Inter-American Development Bank Professor in Chile. Foster’s most recent books are Urban Photography in Argentina (2007) and São Paulo: Perspectives on the City and Cultural Production (2011). Foster conducted a seminar in 2010 on Brazilian Urban Fiction as part of the National Endowment for the Humanities Summer Seminars for College and University Teachers, and in 2007 he conducted a seminar, in Argentina, on Jewish Buenos Aires. Foster is past-President of the Latin American Jewish Studies Association.
In reality, Brazil has only had three official capitals, each one relating to a distinct phase in its social and cultural history. Salvador de Bahia was the early Portuguese colonial capital. Rio de Janeiro became the capital of colonial Brazil during its most dynamic development in the 19th century, and it remained capital throughout its serving as seat of the Portuguese empire, as capital of the Brazilian empire, and as capital of the Republic of Brazil, until the mid-twentieth century. In 1960, the modernist capital of Brasília, serving what is now a Latin American superpower, is inaugurated, although the transition from Rio to Brasília is not without problems. Alongside these three official capitals, from early in the 20th century, São Paulo emerges as the major industrial center of Latin America and as the financial capital of Brazil and of Latin America as a whole. Today São Paulo, while Rio remains the historical capital of Brazil and the capital of the tourist imaginary of Brazil, serves as the cultural capital of the country.
answer: Rio de Janeiro
Source:
http://www.abqinternational.org/four-capitals-of-brazil/

Although the source is not entirely clear, one would expect the answer to be "Brasilia" and not "Rio de Janeiro".

ModuleNotFoundError: No module named 'cohere'

I followed the installation instructions, but when running python3 cli_demo.py ... I receive the following error:

Traceback (most recent call last):
  File "cli_demo.py", line 13, in <module>
    from qa.bot import GroundedQaBot
  File "/Users/michaelwechner/src/sandbox-grounded-qa/qa/bot.py", line 11, in <module>
    import cohere
ModuleNotFoundError: No module named 'cohere'

When I installed the dependencies (pip3 install -r requirements.txt), I received various errors:

Building wheel for numpy (PEP 517) ... error
 ERROR: Command errored out with exit status 1:
..
 ERROR: Failed building wheel for numpy
...
 Building wheel for pyarrow (PEP 517) ... error
 ERROR: Command errored out with exit status 1:
...
 ERROR: Failed building wheel for pyarrow
....
 Building wheel for pillow (setup.py) ... error
 ERROR: Command errored out with exit status 1:
....
 Running setup.py clean for pillow
....
Failed to build numpy pyarrow pillow
ERROR: Could not build wheels for numpy, pyarrow which use PEP 517 and cannot be installed directly

I was running it on macOS Monterey 12.2.1

Python 3.8.9
pip 20.2.3

Any idea what I might have to change? Thanks for any pointers!
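
The traceback only says that cohere is not importable by the interpreter that ran cli_demo.py, which would be consistent with the pip run aborting after the failed wheel builds. A small diagnostic like the sketch below can confirm which packages are actually missing from the active interpreter; missing_modules and environment_report are hypothetical helper names, and the check does not fix the wheel build failures themselves.

```python
import importlib.util
import platform
import sys


def missing_modules(names):
    """Return the subset of top-level module `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def environment_report(names):
    """Summarise interpreter, platform, and which modules are missing."""
    return {
        "python": platform.python_version(),
        "executable": sys.executable,
        "platform": platform.platform(),
        "missing": missing_modules(names),
    }
```

For example, environment_report(["cohere", "numpy", "pyarrow"]) shows at a glance whether python3 and pip3 point at the same environment and which of the requirements were never installed.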

Checking assumptions of a question

Hi, this is a very cool project, thanks for open sourcing it!

I just watched your intro video, in which you showed the failure mode on questions like "Who is the king of America?". I would like to propose a potential solution: adding one more step to the question answering system, as shown in this screenshot:
[Image: Screenshot 2022-11-05 at 12 25 37]

Maybe I will work on a PR sometime in the next week, but I wanted to leave the idea here in case somebody else wants to do it sooner.

traceback -- too many tokens in prompt

My question was simple:

What is the GLOBEX Code for Crude Oil Futures on the Chicago Mercantile Exchange?

I get a traceback:

https://serpapi.com/search
Traceback (most recent call last):
  File "/Users/vijaysaraswat/Documents/code/sandbox-grounded-qa/cli_demo.py", line 26, in <module>
    reply = bot.answer(question, verbosity=args.verbosity, n_paragraphs=2)
  File "/Users/vijaysaraswat/Documents/code/sandbox-grounded-qa/qa/bot.py", line 42, in answer
    answer_text, source_urls, source_texts = answer_with_search(question,
  File "/Users/vijaysaraswat/Documents/code/sandbox-grounded-qa/qa/answer.py", line 83, in answer_with_search
    response = answer(question, context, co, chat_history=chat_history, model=model)
  File "/Users/vijaysaraswat/Documents/code/sandbox-grounded-qa/qa/answer.py", line 39, in answer
    prediction = co.generate(model=model,
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/site-packages/cohere/client.py", line 115, in generate
    response = self.__request(json_body, cohere.GENERATE_URL)
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/site-packages/cohere/client.py", line 289, in __request
    raise CohereError(message=res['message'], http_status=response.status_code, headers=response.headers)
cohere.error.CohereError: too many tokens: total number of tokens (prompt and prediction) cannot exceed 2048 - received 2954. Try using a shorter prompt or a smaller max_tokens value.
Exception ignored in: <function Pool.__del__ at 0x10d690160>
Traceback (most recent call last):
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/pool.py", line 268, in __del__
    self._change_notifier.put(None)
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/queues.py", line 378, in put
    self._writer.send_bytes(obj)
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
    self._send(header + buf)
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor
Exception ignored in: <function Pool.__del__ at 0x10d690160>
Traceback (most recent call last):
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/pool.py", line 268, in __del__
    self._change_notifier.put(None)
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/queues.py", line 378, in put
    self._writer.send_bytes(obj)
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
    self._send(header + buf)
  File "/Users/vijaysaraswat/anaconda3/envs/py310/lib/python3.9/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor

Looks like a bug in extracting the text from the webpage...?
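
The error message itself points at prompt length: prompt plus prediction came to 2954 tokens against a limit of 2048. One possible mitigation is to drop trailing context paragraphs until the request fits. The sketch below assumes the context is a list of paragraph strings and estimates tokens by counting whitespace-separated words, which is only a rough stand-in for the model's real tokenizer (real token counts are usually higher, so some headroom is advisable). fit_paragraphs, approx_token_count, and their parameters are hypothetical names.

```python
def approx_token_count(text):
    """Rough token estimate: whitespace-separated words (not Cohere's tokenizer)."""
    return len(text.split())


def fit_paragraphs(paragraphs, prompt_overhead, max_new_tokens, limit=2048,
                   count_tokens=approx_token_count):
    """Keep leading paragraphs while prompt + prediction stays within `limit`."""
    # Budget left for context after the fixed prompt text and the prediction.
    budget = limit - prompt_overhead - max_new_tokens
    kept, used = [], 0
    for paragraph in paragraphs:
        cost = count_tokens(paragraph)
        if used + cost > budget:
            break
        kept.append(paragraph)
        used += cost
    return kept
```

Here prompt_overhead stands for the tokens taken by the question and any fixed prompt text, and max_new_tokens for the requested prediction length; a different count_tokens function can be passed in if an exact tokenizer is available.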
