newsapi-python's Introduction

newsapi-python

A Python client for the News API.

License

Provided under MIT License by Matt Lisivick.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

General

This is a Python client library for News API V2. The functions and methods for this library should mirror the endpoints specified by the News API documentation.

Installation

Installation for the package can be done via pip:

$ python -m pip install newsapi-python

Usage

After installation, import the client class into your project:

from newsapi import NewsApiClient

Initialize the client with your API key:

api = NewsApiClient(api_key='XXXXXXXXXXXXXXXXXXXXXXX')

Endpoints

An instance of NewsApiClient has three instance methods corresponding to three News API endpoints.

Top Headlines

Use .get_top_headlines() to pull from the /top-headlines endpoint:

api.get_top_headlines(sources='bbc-news')

Everything

Use .get_everything() to pull from the /everything endpoint:

api.get_everything(q='bitcoin')

Sources

Use .get_sources() to pull from the /sources endpoint:

api.get_sources()
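
As a rough illustration of working with what these methods return: the issue reports further down this page show responses shaped like {'status': 'ok', 'totalResults': ..., 'articles': [...]}. The sketch below assumes that shape, and assumes each article is a dict that may carry a 'title' key. The helper name list_titles is made up for this example and is not part of the library.

```python
def list_titles(response):
    """Return the article titles from a response dict.

    Assumes the shape {'status': ..., 'totalResults': ..., 'articles': [...]}
    and that each article is a dict that may carry a 'title' key.
    """
    if response.get('status') != 'ok':
        raise ValueError('unexpected status: %r' % response.get('status'))
    return [article.get('title') for article in response.get('articles', [])]


# Hand-written stand-in for what api.get_top_headlines(...) might return.
sample = {
    'status': 'ok',
    'totalResults': 2,
    'articles': [{'title': 'First headline'}, {'title': 'Second headline'}],
}
titles = list_titles(sample)
```

With a real client, the same helper could be applied to the return value of api.get_top_headlines(sources='bbc-news').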

For Windows users printing to cmd or powershell

You may encounter an error if you attempt to print the .json() object to the command line. This happens when the response contains characters (such as '\u2019') that the console's default encoding cannot represent. This becomes especially annoying if developers wish to get 'under the hood'.

Here is the error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u2019' in position 1444: character maps to <undefined>

This can be fixed by:

  • installing 'win-unicode-console':

py -m pip install win-unicode-console

  • then running it while calling your Python script:

py -m run myPythonScript.py

Another option is hardcoding your console to only print in utf-8. This is a bad idea, as it could ruin many other scripts and/or make errors MUCH more difficult to track. More information.
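
A third option that leaves the console configuration alone, offered here as a sketch rather than as the library's recommendation, is to serialize the response with the standard json module before printing. By default json.dumps writes non-ASCII characters as \uXXXX escapes, so the printed text is plain ASCII. The sample dict below is made up.

```python
import json

# Made-up response fragment containing the character from the error above.
response = {'status': 'ok', 'articles': [{'title': 'It\u2019s a headline'}]}

# ensure_ascii=True (the default) writes non-ASCII characters as \uXXXX escapes.
safe_text = json.dumps(response, indent=2, ensure_ascii=True)
print(safe_text)
```

The trade-off is readability: accented or typographic characters show up as escape sequences instead of the characters themselves.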

Support

Feel free to make suggestions or provide feedback regarding the library. Thanks. Reach out at [email protected]

newsapi-python's People

Contributors

arnmishra, bantaisaiah, bsolomon1124, carterfawson-code, danielmichaels, dependabot[bot], jborchma, jonnymaserati, mattlisiv, menarguez, michaelbalazs, reidarst, rickykim93, shubham24006, theteleforce, tmm6907, tobialbert, toxicmender, tyknot

newsapi-python's Issues

Missing searchIn

Describe the bug
There's a qintitle param, but the API supports a searchIn param, so the current functionality is limited.

Set default language as en

Hey guys, I'm using the module to create a headline display, using some dot displays, thanks for your efforts on this code, I liked it a lot!

Now, regarding my suggestion:
What about setting the default language to en? That way we can call newsapi.get_top_headlines() without any arguments. Currently, doing so raises an exception.

I can send a pull request to implement it if you want.

Enable use of proxies

It could be useful to enable the use of proxies leveraging the proxies parameter from standard requests library.

Issue with endpoint

When I type this code:

btc_headlines = newsapi.get_everything(q="bitcoin", language="en", page_size=100,sort_by="relevancy")
btc_articles = btc_headlines["articles"]
btc_articles[0]

I get error:

TypeError: expected string or bytes-like object

Could you please advise what could be the issue? Thanks.

Note: This happens on Macbook Pro M1. On Intel Mac it works fine.

str and unicode

In order to work properly with parameters passed as unicode strings, the correct way to check if a parameter is a string in Python 2.x in newsapi_client.py is:

if isinstance(q, basestring)

instead of:

if type(q) == str

basestring includes both str and unicode.

See python docs for reference.
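
A minimal sketch of such a check that also runs on Python 3, where basestring no longer exists. The names string_types and is_string are assumptions made for this example and are not taken from newsapi_client.py.

```python
try:
    string_types = (basestring,)  # Python 2: covers both str and unicode
except NameError:
    string_types = (str,)  # Python 3: str is already Unicode text


def is_string(value):
    """Return True if value is any kind of text string."""
    return isinstance(value, string_types)
```

For instance, is_string(u'bitcoin') is True on both Python versions, while is_string(42) is False.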

get full article text

Is your feature request related to a problem? Please describe.
I'd like to be able to request the full text of the article in a secondary api call.

Describe the solution you'd like
see above

Describe alternatives you've considered
n/a

Additional context
n/a

Syntax Error

Hello,

just installed the package and got the following error:

  File "<stdin>", line 1, in <module>
  File "/anaconda3/envs/ff/lib/python3.6/site-packages/newsapi/__init__.py", line 1, in <module>
    from newsapi.newsapi_client import NewsApiClient
  File "/anaconda3/envs/ff/lib/python3.6/site-packages/newsapi/newsapi_client.py", line 12
    def get_top_headlines(self, q=None: str, sources=None: str, language='en': str, country=None: str, category=None: str, page_size=20: int,
                                      ^
SyntaxError: invalid syntax

I can see the change coming in from this PR.

Do I need a specific version of something or am I missing something else?

Thanks in advance.

EDIT:

Just tested it with version 0.2.3 and I can confirm that it's working. So presumably the whole thing blew up 2 hours ago, when the version was bumped. 👼

Wrong example for python

I followed this document https://newsapi.org/docs/client-libraries/python to test the Python API.

It seems the following code produces the error pasted below.
sources = newsapi.get_sources()

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "crawler.py", line 93, in <module>
    crawler.crawl()
  File "crawler.py", line 84, in crawl
    sources = newsapi.get_sources()
  File "/Users/mabodx/anaconda/lib/python3.6/site-packages/newsapi/newsapi_client.py", line 311, in get_sources
    r = requests.get(const.SOURCES_URL, auth=self.auth, timeout=30, params=payload)
  File "/Users/mabodx/anaconda/lib/python3.6/site-packages/requests/api.py", line 72, in get
    return request('get', url, params=params, **kwargs)
  File "/Users/mabodx/anaconda/lib/python3.6/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Users/mabodx/anaconda/lib/python3.6/site-packages/requests/sessions.py", line 513, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/mabodx/anaconda/lib/python3.6/site-packages/requests/sessions.py", line 623, in send
    r = adapter.send(request, **kwargs)
  File "/Users/mabodx/anaconda/lib/python3.6/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)

TypeError: super(type, obj): obj must be an instance or subtype of type

When I run:
api.get_everything(q='bitcoin')

I get
api.get_everything(q='bitcoin')
Traceback (most recent call last):

File "", line 1, in
api.get_everything(q='bitcoin')

File "C:\Users\LapLap\Anaconda3\envs\fluffy\lib\site-packages\newsapi\newsapi_client.py", line 248, in get_everything
r = requests.get(const.EVERYTHING_URL, auth=self.auth, timeout=30, params=payload)

File "C:\Users\LapLap\Anaconda3\envs\fluffy\lib\site-packages\requests\api.py", line 72, in get
return request('get', url, params=params, **kwargs)

File "C:\Users\LapLap\Anaconda3\envs\fluffy\lib\site-packages\requests\api.py", line 57, in request
with sessions.Session() as session:

File "C:\Users\LapLap\Anaconda3\envs\fluffy\lib\site-packages\requests\sessions.py", line 386, in __init__
self.mount('https://', HTTPAdapter())

File "C:\Users\LapLap\Anaconda3\envs\fluffy\lib\site-packages\requests\adapters.py", line 120, in __init__
super(HTTPAdapter, self).__init__()

TypeError: super(type, obj): obj must be an instance or subtype of type

get_top_headlines() not working

When get_top_headlines() is called in my code, no articles are returned; I simply get: {'status': 'ok', 'totalResults': 0, 'articles': []}.

not working

from newsapi import NewsApiClient
newsapi = NewsApiClient(api_key='XXXXXXXXXXXXXXXXXXXXXXX')
top_headlines = newsapi.get_top_headlines(q='bitcoin', sources='bbc-news,the-verge', category='business', language='en', country='us')

I am getting this error in Python 3.x:
ImportError: cannot import name 'NewsApiClient'

v2/everything - multiple keywords

Hi there,

I tried using multiple keywords, but no matter how I join them, I'm not receiving results anywhere close to what I receive when I use the classic Google News search.
My code:

from newsapi import NewsApiClient
api = NewsApiClient(api_key='xxx')
keywords = ['Neubau', 'Weiße', 'Stadt']
all_articles = api.get_everything(q=','.join(keywords),
                                  sort_by='publishedAt',
                                  language='de')

besides ',' I tried joining the keywords by:

  • '&'
  • '%20'
  • '%2C'

Is there something I am missing?
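
One thing that may be worth trying, as far as the News API documentation for /everything describes it, is that q accepts the AND / OR / NOT keywords and quoted phrases rather than comma-joined terms. The sketch below builds such a string. The helper name build_query is hypothetical and not part of the library, and whether this brings results closer to Google News is not something this page confirms.

```python
def build_query(keywords, operator='AND'):
    """Join keywords into one query string, quoting multi-word phrases."""
    if operator not in ('AND', 'OR'):
        raise ValueError("operator must be 'AND' or 'OR'")
    terms = []
    for keyword in keywords:
        keyword = keyword.strip()
        if not keyword:
            continue
        # Quote phrases so they are treated as one unit, not separate words.
        terms.append('"%s"' % keyword if ' ' in keyword else keyword)
    return (' %s ' % operator).join(terms)


query = build_query(['Neubau', 'Weiße', 'Stadt'])
# The client call would then look like:
# api.get_everything(q=query, sort_by='publishedAt', language='de')
```

Passing operator='OR' widens the search instead of narrowing it.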

maximumResultsReached - get_everything method

I have been trying to get all the news articles related to, say, money laundering using the get_everything method, but after a certain number of results have been accessed, it returns this error.

image

Note: I have a paid account, not a developer account. I can access a maximum of 9,900 articles, whereas the number of articles related to money laundering is, say, 50K. How can I access the remaining articles?

Error regarding Newsapi-python get_top_headlines()

Describe the bug

When I used only sources for the given parameter, I was able to retrieve the data. However, if I used q and country or other parameters, I received 0 results. For example:

This works:

top_headlines = newsapi.get_top_headlines(sources='bbc-news, the-verge')
{'status': 'ok', 'totalResults': 26, 'articles': .......}

But this doesn't work:

top_headlines = newsapi.get_top_headlines()
top_headlines = newsapi.get_top_headlines(q='tesla', country='us')
top_headlines = newsapi.get_top_headlines(category='business')
{'status': 'ok', 'totalResults': 0, 'articles': []}

I also tried:

top_headlines = newsapi.get_top_headlines(language='en-US')

but got this error:

top_headlines = newsapi.get_top_headlines(language='en-US')
  File "myproject/venv/lib/python3.8/site-packages/newsapi/newsapi_client.py", line 120, in get_top_headlines
    raise ValueError("invalid language")
ValueError: invalid language

I replaced the en in newsapi.const.language with en-US and I can receive the data right now but the source of the data is only google-news.

Also, if I used other countries, for example:

top_headlines = newsapi.get_top_headlines(country='jp')

I received 0 results.

Issue with Page

How do I get all the results as opposed to just 20? When I retrieve results it says I have 1404, but I only get 20 of the articles in the articles section of the dictionary. I know it says to use the page parameter but that does not seem to rectify the issue.
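
A possible way to walk through the pages, sketched under the assumption that get_everything accepts page and page_size keyword arguments (both appear in calls elsewhere on this page) and returns 'totalResults' and 'articles'. The names fetch_all and fake_fetch are made up for this example, and fake_fetch stands in for the real client so the sketch runs without an API key.

```python
def fetch_all(fetch_page, page_size=100, max_pages=5):
    """Collect articles across pages.

    fetch_page is any callable taking page and page_size keyword arguments
    and returning a dict with 'totalResults' and 'articles' keys.
    """
    articles = []
    for page in range(1, max_pages + 1):
        response = fetch_page(page=page, page_size=page_size)
        batch = response.get('articles', [])
        articles.extend(batch)
        # Stop when the page is empty or everything reported has been read.
        if not batch or len(articles) >= response.get('totalResults', 0):
            break
    return articles


def fake_fetch(page, page_size):
    """Stand-in for the real client: serves 45 numbered fake articles."""
    data = [{'title': 'article %d' % i} for i in range(45)]
    start = (page - 1) * page_size
    return {'status': 'ok', 'totalResults': len(data),
            'articles': data[start:start + page_size]}


all_articles = fetch_all(fake_fetch, page_size=20)
```

With a real client, fetch_page could be something like lambda page, page_size: newsapi.get_everything(q='bitcoin', page=page, page_size=page_size). The max_pages cap is there because each page costs one request and account plans limit how many results can be read.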

No data comes back

Hello, I have set the API key and am using the correct code, but the demo returns "{'status': 'ok', 'totalResults': 0, 'articles': []}". No news data comes back. Why?

Trying to use qInTitle and getting error

I am trying to use qInTitle when using get_everything and am getting the following error:
get_everything() got an unexpected keyword argument 'qInTitle'

Is the required argument slightly different?

Bug report


TypeError Traceback (most recent call last)
in
----> 1 all_bitcoin_articles=newsapi.get_everything(q='bitcoin')

TypeError: expected string or bytes-like object

Add search_in parameters

Is your feature request related to a problem? Please describe.
Please consider adding the search_in parameter.

Describe the solution you'd like
I would like to use the search_in parameter for niche lookups.

Describe alternatives you've considered
I could just filter myself after I get the results

Content is not fully returned when run get_top_headlines()

The content is returned like this:
"content": "The company operating the National Broadband Network has claimed competition from wireless services including Elon Musks Starlink is threatening the viability of its business, as retail internet prov\u2026 [+2829 chars]"
which has [+2829] at the end

To Reproduce
sample: headlines = newsapi.get_top_headlines(q='', category='business')

Expected behavior
A response is returned okay but the content of each article is not fully returned

Not able to get sources for the specific countries

Describe the bug
The issue is that I'm not able to get sources for countries like: ae, at, be, bg, ch, cn, co, cu, ua, ru.

To Reproduce
from newsapi import NewsApiClient

newsapi = NewsApiClient(api_key="api_key")
print(newsapi.get_sources(country="ua"))

{'status': 'ok', 'sources': []}

Expected behavior
Should be able to get all sources for all countries

Documentation typo

https://newsapi.org/docs/client-libraries/python

Following example:

# /v2/everything
all_articles = newsapi.get_everything(q='bitcoin',
                                      sources='bbc-news,the-verge',
                                      domains='bbc.co.uk,techcrunch.com',
                                      from_parameter='2017-12-01',
                                      to='2017-12-12',
                                      language='en',
                                      sortBy='relevancy',
                                      page=2)

sortBy should be sort_by

Can't get top headlines for some countries (even though supported by the API)

The default language of get_top_headlines seems to be "en". When requesting news for a specific country, the article count is 0 unless the language is changed.

Changing the language to get the news could be an option.

BUT

The problem occurs when the language is not supported by the API:
top_headlines = newsapi.get_top_headlines(country="tr",language = "tr")
print(top_headlines)

The code above raises an "invalid language" error.
However, according to the API documentation, country-specific news is supported for Turkey.
Because the client requires the language to be changed to get the news, and no language is defined for Turkey in the API, it raises an error.

The same issue exists when getting category-specific news for countries.

newsapi.newsapi_exception.NewsAPIException

While trying to run the following code snippet, which is actually part of the documentation:

all_articles = newsapi.get_everything(q='bitcoin',
                                      sources='bbc-news,the-verge',
                                      domains='bbc.co.uk,techcrunch.com',
                                      from_param='2017-12-01',
                                      to='2017-12-12',
                                      language='en',
                                      sort_by='relevancy',
                                      page=2)

I am getting this error again and again :

Traceback (most recent call last):
  File "news.py", line 16, in <module>
    page=2)
  File "C:\Python27\lib\site-packages\newsapi\newsapi_client.py", line 261, in get_everything
    raise NewsAPIException(r.json())
newsapi.newsapi_exception.NewsAPIException

Any suggestions on why this might be happening ?

Library Partial Import Error

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior. For instance:

  1. conda activate venv
  2. pip install newsapi_python
from newsapi import NewsApiClient
# Init
api = NewsApiClient(api_key='xxxxxxxxxxxx')
# /v2/everything
all_articles = api.get_everything(q='mars')
print(all_articles)
  3. (venv) C:\Users\abc\anaconda3\envs\venv\bin>C:/Users/abc/anaconda3/envs/venv/python.exe

Expected behavior
News Data

Screenshots

d:/Projects/advisely/advisely/API/newsapi.py
Traceback (most recent call last):
  File "d:/Projects/newsoftoday/API/newsapi.py", line 3, in <module>
    from newsapi import NewsApiClient
  File "d:\Projects\newsoftoday\API\newsapi.py", line 3, in <module>
    from newsapi import NewsApiClient
ImportError: cannot import name 'NewsApiClient' from partially initialized module 'newsapi' (most likely due to a circular import) (d:\Projects\newsoftoday\API\newsapi.py)

Desktop (please complete the following information):

  • OS: Windows 10
  • Visual Studio Code
  • Version [1.57.1]

Additional context

(venv) C:\Users\abc\anaconda3\envs\venv\bin>pip install newsapi_python   
Collecting newsapi_python
  Using cached newsapi_python-0.2.6-py2.py3-none-any.whl (7.9 kB)
Requirement already satisfied: requests<3.0.0 in c:\users\abc\anaconda3\envs\venv\lib\site-packages (from newsapi_python) (2.25.1)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\abc\anaconda3\envs\venv\lib\site-packages (from requests<3.0.0->newsapi_python) (1.26.5)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\abc\anaconda3\envs\venv\lib\site-packages (from requests<3.0.0->newsapi_python) (2021.5.30)
Requirement already satisfied: idna<3,>=2.5 in c:\users\abc\anaconda3\envs\venv\lib\site-packages (from requests<3.0.0->newsapi_python) (2.10)
Requirement already satisfied: chardet<5,>=3.0.2 in c:\users\abc\anaconda3\envs\venv\lib\site-packages (from requests<3.0.0->newsapi_python) (4.0.0)
Installing collected packages: newsapi-python
Successfully installed newsapi-python-0.2.6

Enum field checking

Some of the fields have a limited set of strings that are acceptable, such as category, country, or language. It would probably be more user friendly if these were checked for validity before being passed to the API.

Enums would also be possible, though that could make for more work on the library user depending on how they are writing their code, and could also be a breaking change.
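
A minimal sketch of the string-based variant of this idea, which avoids the breaking change that enums might bring. The ALLOWED sets below are illustrative subsets chosen for the example, not the library's or the API's actual lists, and check_choice is a hypothetical helper name.

```python
# Illustrative subsets only; the real lists of accepted values are longer.
ALLOWED = {
    'category': {'business', 'science', 'technology'},
    'language': {'de', 'en', 'fr'},
}


def check_choice(name, value):
    """Raise ValueError if value is not an accepted option for field name."""
    if value is None:
        return  # the parameter was not supplied, nothing to check
    if value not in ALLOWED[name]:
        raise ValueError('invalid %s: %r (expected one of %s)'
                         % (name, value, sorted(ALLOWED[name])))


check_choice('language', 'en')  # passes silently
try:
    check_choice('language', 'en-US')
    rejected = False
except ValueError:
    rejected = True
```

Listing the accepted values in the error message lets the caller correct the argument without consulting the API documentation.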

I already imported newsapi and the package installed successfully, but I am getting a no module error for NewsApiClient

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior. For instance:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: [e.g. iOS]
  • Browser [e.g. chrome, safari]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

Best way to track search results by day

There's no problem with the code—I just don't know much programming.

So I have a list of terms that I'd like to create timeseries of, by days for about a 5-6 month period.

Hopefully, I can just change the keyword, run the script, and end up with a .csv file with just two or three columns for the date, and the # of results on that day.

I'm using the Dev version, which is limited to 500 requests a day. I don't think I can get the actual dates for individual articles if I use a long period of time, right? So I guess I'll have to do each day separately.

Can this be done with loops to print the total number of articles per day? And would anyone be willing to help me out (please?)? Would be very much appreciated :)
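
A loop along these lines could work. The sketch below makes one count lookup per day and writes date/count rows in CSV form. The helper names daily_counts and write_csv are made up for this example, and a stand-in counter replaces the real client so the code runs without an API key or network access.

```python
import csv
import io
from datetime import date, timedelta


def daily_counts(count_for_day, start, end):
    """Return (ISO date, count) rows for each day from start to end inclusive.

    count_for_day is any callable that takes an ISO date string and returns
    the number of results for that day.
    """
    rows = []
    day = start
    while day <= end:
        iso = day.isoformat()
        rows.append((iso, count_for_day(iso)))
        day += timedelta(days=1)
    return rows


def write_csv(rows, handle):
    """Write the rows to an open text file handle with a header line."""
    writer = csv.writer(handle)
    writer.writerow(['date', 'total_results'])
    writer.writerows(rows)


# Stand-in counter so the sketch runs without the real client.
rows = daily_counts(lambda iso: len(iso), date(2021, 1, 30), date(2021, 2, 1))
buffer = io.StringIO()
write_csv(rows, buffer)
```

With the real client, count_for_day might be lambda iso: api.get_everything(q='keyword', from_param=iso, to=iso, page_size=1)['totalResults']. That assumes the from_param and to argument names used in the example elsewhere on this page, and whether a date-only from/to pair covers the whole day should be checked against the API documentation. Each day costs one request, so a 5-6 month range for one keyword fits within 500 requests.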

Thank you!

Language list contains unavailable or invalid codes

Describe the bug
The language list in newsapi/const.py contains an invalid language, and 2 of its entries don't follow the ISO-639-1 standard.

To Reproduce
Try to use the 'se' country code with the /everything endpoint: it won't give results and isn't listed in the NewsAPI documentation.

'cn' and 'en-US' aren't ISO-639-1 codes

Desktop (please complete the following information):

  • OS: MacOS, Arch Linux
  • Browser: Chrome, VSCode (Python 3.10, iso639 library)

Pagesize is missing

Hi, and thank you very much for this API!

In the method get_everything in newsapi_client class, there is currently no parameter taking pagesize. Since it is possible through the utilization of Curl, implementing this in the Python library also, would be of great value, so you do not need to look through so many pages.

Best regards

ImportError: cannot import name 'NewsApiClient' from 'newsapi

pip install newsapi-python

from newsapi import NewsApiClient

It's supposed to let me use NewsApiClient in other scripts, but instead it says:

line 4, in
from newsapi import NewsApiClient
ImportError: cannot import name 'NewsApiClient' from 'newsapi' (C:\Users\"my name"\AppData\Local\Programs\Python\Python311\Lib\site-packages\newsapi\__init__.py)

ValueError: Timeout value connect was Timeout(connect=30, read=30, total=None), but it must be an int, float or None.

from newsapi import NewsApiClient

api = NewsApiClient(api_key='XXXXXXXXXXXXXXXXXXXXXXXX')

api.get_top_headlines(sources='bbc-news')

Traceback (most recent call last):

File "", line 1, in
api.get_top_headlines(sources='bbc-news')

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\newsapi\newsapi_client.py", line 115, in get_top_headlines
r = requests.get(const.TOP_HEADLINES_URL, auth=self.auth, timeout=30, params=payload)

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\api.py", line 70, in get
return request('get', url, params=params, **kwargs)

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\api.py", line 56, in request
return session.request(method=method, url=url, **kwargs)

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\sessions.py", line 488, in request
resp = self.send(prep, **send_kwargs)

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\sessions.py", line 609, in send
r = adapter.send(request, **kwargs)

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\adapters.py", line 423, in send
timeout=timeout

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 587, in urlopen
timeout_obj = self._get_timeout(timeout)

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 302, in _get_timeout
return Timeout.from_float(timeout)

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\packages\urllib3\util\timeout.py", line 154, in from_float
return Timeout(read=timeout, connect=timeout)

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\packages\urllib3\util\timeout.py", line 94, in __init__
self._connect = self._validate_timeout(connect, 'connect')

File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\packages\urllib3\util\timeout.py", line 127, in _validate_timeout
"int, float or None." % (name, value))

ValueError: Timeout value connect was Timeout(connect=30, read=30, total=None), but it must be an int, float or None.

is_valid_string is too restrictive

Describe the bug
The API is able to handle multiple search terms or phrases if a list of strings is passed as the q arg.

e.g. if q = [google, "google ai"] is passed and parsed to the payload dict then it will return a successful result. However, the input check won't currently allow this, limiting functionality.

Error using from_parameter

The code below errors out:

try:
    newsapi = NewsApiClient(api_key='my key')
    everything = newsapi.get_everything(q='bitcoin',from_parameter='2018-08-01',sources='us',language='en',page_size=100)
except Exception as e:
    print(e)

This is what the exception prints:

get_everything() got an unexpected keyword argument 'from_parameter'

I'm using Python3.6 on Ubuntu 16.04.

cant import newsapi

from newsapi import NewsAPIClient

ImportError: cannot import name 'NewsAPIClient'

I have newsapi installed and I have an API key, yet it still won't work.
