dankbot's Introduction

Dankbot

A Slack bot that scrapes memes from subreddits and posts them to Slack.

Steps to run

Clone into directory

cd /opt
sudo mkdir dankbot && sudo chown <user>:<user> dankbot
git clone git@github.com:DankCity/dankbot.git

Setup INI file

cd /opt/dankbot
cp dankbot/dankbot.sample.ini dankbot/dankbot.ini

Edit the INI file to fill in the missing Slack token and the Imgur client ID and client secret fields:

(.venv35)➜  dankbot git:(master) ✗ cat dankbot/dankbot.sample.ini
[dankbot]
# Leave directory blank and dankbot will determine the best place to
# log to your platform
log_to_file: true
directory:
file_name: dankbot.log
backups: 5
max_bytes: 1000000

[slack]
# Follow instructions at https://my.slack.com/services/new/bot
token: <put here>
channel: #random

[reddit]
# r/dankmemes, r/funnygifs, etc
subreddits: dankmemes, funnygifs

[imgur]
# Register at https://api.imgur.com/oauth2/addclient
# Select Anonymous usage
client_id: <your client ID>
client_secret: <your client secret>

[misc]
include_nsfw: false
max_memes: 3
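
The sample file uses the "key: value" syntax that Python's standard configparser module understands. As a rough sketch of how these settings might be read (the inlined SAMPLE string and the variable names are illustrative assumptions, not dankbot's actual loading code):

```python
import configparser

# Trimmed copy of dankbot.sample.ini, inlined so the sketch is self-contained.
SAMPLE = """
[dankbot]
log_to_file: true
directory:
file_name: dankbot.log

[slack]
token: <put here>
channel: #random

[reddit]
subreddits: dankmemes, funnygifs

[misc]
include_nsfw: false
max_memes: 3
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

# A blank "directory" comes back as an empty string; treat it as "not set".
log_directory = config.get("dankbot", "directory") or None
channel = config.get("slack", "channel")
# The subreddit list is a single comma-separated value.
subreddits = [name.strip() for name in config.get("reddit", "subreddits").split(",")]
include_nsfw = config.getboolean("misc", "include_nsfw")
max_memes = config.getint("misc", "max_memes")
```

Note that "#random" survives as a value because configparser only treats "#" as a comment marker at the start of a line unless inline comments are explicitly enabled.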

Create and activate a virtual environment

cd /opt/dankbot
virtualenv --python=`which python3` env
source env/bin/activate

Install the python package

cd /opt/dankbot
source env/bin/activate
pip install -e .

Add an entry to your crontab

Edit the crontab with your favorite editor

sudo vi /etc/crontab

And add an entry like so:

# /etc/crontab: system-wide crontab
# Unlike any other crontab you don't have to run the 'crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# m h dom mon dow user command
*/5 09-17 * * 1-5 root cd /opt/dankbot && . env/bin/activate && dankbot .

This will run dankbot every 5 minutes, Monday to Friday, from 9:00 AM through 5:55 PM (the hour field 09-17 includes the 5 PM hour), in the system's time zone (CST in this setup).
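
To sanity-check which times the schedule fields */5 09-17 * * 1-5 cover, the same rule can be written as a small Python predicate. This only illustrates the cron fields; matches_schedule is a hypothetical helper, not part of dankbot:

```python
from datetime import datetime


def matches_schedule(moment):
    """Return True if `moment` falls on a tick of `*/5 09-17 * * 1-5`."""
    every_five_minutes = moment.minute % 5 == 0
    working_hours = 9 <= moment.hour <= 17  # the hour range 09-17 is inclusive
    weekday = moment.weekday() <= 4         # Python: Monday=0 ... Friday=4
    return every_five_minutes and working_hours and weekday
```

For example, matches_schedule(datetime(2016, 5, 4, 17, 55)) holds for a Wednesday at 5:55 PM, while 6:00 PM on the same day does not match.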

dankbot's People

Contributors

levi-rs, singularperturbation

dankbot's Issues

Add CLI args

Add CLI args to dankbot, using the click package

Bug: Gallery parsing order is incorrect

The parsing order for gallery links is incorrect. Given this:

    def _parse_as_gallery(self):
        """
        Connects to Imgur to get more info on the gallery
        """
        # Entry point format: imgur.com/gallery/{gallery_post_id}
        # Entry point format: imgur.com/gallery/{gallery_post_id}/new
        gallery_post_id = self.link.split('/')[-1].split('/new')[0]

and this link: http://imgur.com/gallery/OxVV5lL/new

The result would be 'new', not the gallery post ID 'OxVV5lL'.

Likewise, given this link: http://imgur.com/gallery/OxVV5lL/

The result is an empty string.
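
One possible fix, sketched under the assumption that the ID is always the path segment right after "gallery": parse the URL path and ignore empty segments, so neither a trailing slash nor a trailing /new can shift the result. The function name gallery_post_id_from_link is hypothetical, and this is not the project's actual patch:

```python
from urllib.parse import urlparse


def gallery_post_id_from_link(link):
    """Return the segment following 'gallery' in an Imgur link, or None."""
    # Dropping empty segments makes a trailing slash harmless.
    segments = [part for part in urlparse(link).path.split("/") if part]
    try:
        return segments[segments.index("gallery") + 1]
    except (ValueError, IndexError):
        # No 'gallery' segment, or nothing after it.
        return None
```

With this approach, the three link shapes from this issue (bare, with /new, and with a trailing slash) all yield OxVV5lL.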

Groom database to remove stale memes

The current implementation of Dankbot stores every meme that has been successfully posted to Slack in a database. When a new Dankbot run starts and memes are pulled from subreddits, their links are first compared against the database, and any memes already in it are dropped from the run. This approach works, but the database grows without bound because stale memes are never removed from it.

Alter Dankbot to do the following:

  • Add a "last seen" column to the 'memes' table, which will contain a datetime string
  • If a meme is found in the database, update its datetime string
  • Add an entry to the dankbot.ini to specify how long, in days, a meme entry must be stale before it is groomed out of the database
  • Add a new method to Dankbot that grooms the database, removing any memes that haven't been seen in X days, where X is the number of days specified in the dankbot.ini

Bug: Poor exception handling logic can crash dankbot when posting to Slack

The following code correctly catches digestion exceptions:
(from dankbot.py)

        # If any memes are Imgur memes, get more information
        for meme in [meme for meme in pared_memes if isinstance(meme, ImgurMeme)]:
            try:
                meme.digest()
            except Exception:  # pylint: disable=C0103, W0612, W0703
                # TODO: Add exception logging
                pass

However, it fails to remove the meme from the pool of memes. Because it is not removed, and because the _digested flag is never set to True, an UndigestedError gets raised later on, crashing dankbot:
(from memes.py)

        if not self._digested:
            exc_str = "You must digest ImgurMeme objects before attempting to" + \
                      "run img_obj.format_for_slack(). See img_obj.digest()"
            raise UndigestedError(exc_str)
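
A minimal sketch of one way to address this: build a new list that leaves out any Imgur meme whose digest() call fails, and log the failure instead of silently passing. The ImgurMeme class below is a stripped-down stand-in so the example runs on its own, and digest_memes is a hypothetical helper name:

```python
import logging


class ImgurMeme:
    """Minimal stand-in for dankbot's ImgurMeme, just enough to run the sketch."""

    def __init__(self, link, fails=False):
        self.link = link
        self._fails = fails
        self._digested = False

    def digest(self):
        if self._fails:
            raise RuntimeError("Imgur lookup failed for " + self.link)
        self._digested = True


def digest_memes(memes, logger):
    """Digest Imgur memes, dropping (and logging) any that fail to digest."""
    kept = []
    for meme in memes:
        if isinstance(meme, ImgurMeme):
            try:
                meme.digest()
            except Exception:  # pylint: disable=broad-except
                logger.exception("Dropping meme that failed to digest: %s", meme.link)
                continue  # never reaches format_for_slack() undigested
        kept.append(meme)
    return kept
```

Because failed memes never reach the posting step, the UndigestedError check is no longer triggered by them.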

API call to reddit fails occasionally

This call to reddit's API:

for meme in r_client.get_subreddit(sub).get_hot():

Occasionally throws this exception:

[ERROR 2016-05-04 16:00:53,830] Caught exception:
Traceback (most recent call last):
  File "/opt/dankbot/env/lib/python3.4/site-packages/praw/internal.py", line 213, in _raise_response_exceptions
    response.raise_for_status()  # These should all be directly mapped
  File "/opt/dankbot/env/lib/python3.4/site-packages/requests/models.py", line 844, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: https://api.reddit.com/r/memes/.json

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/dankbot/dankbot/cli.py", line 55, in main
    DankBot(config, logger).find_and_post_memes()
  File "/opt/dankbot/dankbot/dankbot.py", line 45, in find_and_post_memes
    memes = self.get_memes()
  File "/opt/dankbot/dankbot/dankbot.py", line 94, in get_memes
    for meme in r_client.get_subreddit(sub).get_hot():
  File "/opt/dankbot/env/lib/python3.4/site-packages/praw/__init__.py", line 565, in get_content
    page_data = self.request_json(url, params=params)
  File "<decorator-gen-8>", line 2, in request_json
  File "/opt/dankbot/env/lib/python3.4/site-packages/praw/decorators.py", line 116, in raise_api_exceptions
    return_value = function(*args, **kwargs)
  File "/opt/dankbot/env/lib/python3.4/site-packages/praw/__init__.py", line 620, in request_json
    retry_on_error=retry_on_error)
  File "/opt/dankbot/env/lib/python3.4/site-packages/praw/__init__.py", line 452, in _request
    _raise_response_exceptions(response)
  File "/opt/dankbot/env/lib/python3.4/site-packages/praw/internal.py", line 215, in _raise_response_exceptions
    raise HTTPException(_raw=exc.response)
praw.errors.HTTPException: HTTP error

Move the API call to its own method, where the praw.errors.HTTPException is handled and retried using the retrying package.
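
As a rough illustration of the idea, the sketch below uses a plain standard-library retry loop with exponential backoff in place of the retrying package, so that it runs without extra dependencies. HTTPException is a local stand-in for praw.errors.HTTPException, and get_hot_memes and its parameters are hypothetical names:

```python
import time


class HTTPException(Exception):
    """Stand-in for praw.errors.HTTPException so the sketch runs without praw."""


def get_hot_memes(fetch_hot, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `fetch_hot` and return its items as a list, retrying on HTTP errors."""
    for attempt in range(1, attempts + 1):
        try:
            # Materialize the listing inside the try block: the traceback shows
            # the error surfacing while the listing is iterated, so the loop
            # over the results has to be covered by the retry as well.
            return list(fetch_hot())
        except HTTPException:
            if attempt == attempts:
                raise  # out of retries, let the caller see the error
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
```

In dankbot, fetch_hot could be something like lambda: r_client.get_subreddit(sub).get_hot(), with praw's real exception class in the except clause.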

Feature: Find a better way to handle non-ASCII characters

The following exceptions show up in the log when non-ASCII characters are in a URL:

[ERROR 2016-05-02 10:15:58,740] Bad character in meme: https://www.reddit.com/r/raiseyourdongers/comments/4gknq6/shit_there_hasnt_been_a_shit_in_shit_ヽ_益/
Traceback (most recent call last):
  File "/opt/dankbot/dankbot/dankbot.py", line 117, in in_collection
    resp = cur.execute(query)
  File "/opt/dankbot/env/lib/python3.4/site-packages/MySQLdb/cursors.py", line 213, in execute
    query = query.encode(db.unicode_literal.charset, 'surrogateescape')
UnicodeEncodeError: 'latin-1' codec can't encode character '\u30fd' in position 130: ordinal not in range(256)

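This happens because the MySQLdb cursor encodes the query as latin-1 (see the traceback above), so any character outside that range raises a UnicodeEncodeError before the query is sent to the server.

Find a better way to store these URLs in the database, and query for their presence.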
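
One option, sketched here under the assumption that storing a percent-encoded form of the link is acceptable for duplicate detection, is to normalize every URL to pure ASCII before it is written to or looked up in the database. The helper name ascii_safe_url is hypothetical:

```python
from urllib.parse import quote

# Characters that already carry meaning in a URL and are left untouched.
# '%' is included so links that are already percent-encoded are not re-encoded.
_URL_SAFE = ":/?#[]@!$&'()*+,;=%"


def ascii_safe_url(url):
    """Percent-encode non-ASCII characters (as UTF-8) so the URL is pure ASCII."""
    return quote(url, safe=_URL_SAFE)
```

The result encodes cleanly as latin-1, and applying the function twice gives the same string as applying it once. Another route, not shown here, would be to configure the database connection and table for a UTF-8 character set.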
