Coder Social home page Coder Social logo

crccheck / atx-bandc Goto Github PK

View Code? Open in Web Editor NEW
0.0 4.0 0.0 3.26 MB

Scrape Austin, TX Boards and Commissions into RSS feeds

Home Page: https://bandc.crccheck.com/

License: BSD 3-Clause "New" or "Revised" License

Makefile 0.73% Python 15.27% HTML 83.77% Dockerfile 0.23%
python django pdf rss opengovernment civic-hacking austin

atx-bandc's People

Contributors

crccheck avatar

Watchers

 avatar  avatar  avatar  avatar

atx-bandc's Issues

add edims id as a field

so we can use it in the url for the document detail.
non-documents like videos won't work though

fix timezone

only an issue for the rss feed since html pages just use dates

show thumbnail of pdf

convert -transparent white -fuzz 10% 211162.pdf z.png
convert is even available on heroku. use dream objects to host

UNIQUE constraint failed: agenda_document.edims_id

Scraping "Urban Transportation Commission" id #50
 Scraped "Urban Transportation Commission"
Scraping "Low Income Consumer Advisory Task Force" id #130
Traceback (most recent call last):
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/query.py", line 581, in get_or_create
    return self.get(**kwargs), False
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/query.py", line 435, in get
    raise self.model.DoesNotExist(
bandc.apps.agenda.models.Document.DoesNotExist: Document matching query does not exist.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/.venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/app/.venv/lib/python3.10/site-packages/django/db/backends/sqlite3/base.py", line 423, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: UNIQUE constraint failed: agenda_document.edims_id

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/manage.py", line 21, in <module>
    main()
  File "/app/manage.py", line 17, in main
    execute_from_command_line(sys.argv)
  File "/app/.venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/app/.venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/app/.venv/lib/python3.10/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/app/.venv/lib/python3.10/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/app/bandc/apps/agenda/management/commands/scrape.py", line 48, in handle
    bandc.pull()
  File "/app/bandc/apps/agenda/models.py", line 92, in pull
    return pull_bandc(self)
  File "/app/bandc/apps/agenda/utils.py", line 200, in pull_bandc
    should_process_next = _save_page(meeting_data, doc_data, bandc=bandc)
  File "/app/bandc/apps/agenda/utils.py", line 140, in _save_page
    doc, created = Document.objects.get_or_create(
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/query.py", line 588, in get_or_create
    return self.create(**params), True
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/query.py", line 453, in create
    obj.save(force_insert=True, using=self.db)
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/base.py", line 739, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/base.py", line 776, in save_base
    updated = self._save_table(
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/base.py", line 881, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/base.py", line 919, in _do_insert
    return manager._insert(
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/query.py", line 1270, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/app/.venv/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1416, in execute_sql
    cursor.execute(sql, params)
  File "/app/.venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/app/.venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/app/.venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 79, in _execute
    with self.db.wrap_database_errors:
  File "/app/.venv/lib/python3.10/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/app/.venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/app/.venv/lib/python3.10/site-packages/django/db/backends/sqlite3/base.py", line 423, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: UNIQUE constraint failed: agenda_document.edims_id

Add scrape audit log

I should be able to see all recent scrapes, what boards were scraped, what docs were found, and what scrape errors happened

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.