
readhomer's Introduction

Scaife Viewer

The new reading environment for version 5.0 of the Perseus Digital Library.

This repository is part of the Scaife Viewer project, an open-source ecosystem for building rich online reading environments.

Getting Started with Codespaces Development

This project can be developed via GitHub Codespaces.

Setting up the Codespace

  • Browse to https://github.com/scaife-viewer/scaife-viewer
  • (Optionally) fork the repo; if you're a part of the Scaife Viewer development team, you can work from scaife-viewer/scaife-viewer
  • Create a codespace from the green "Code" button
  • Configure options to:
    • Choose the closest data center to your geographical location
    • Start the codespace from the feature/content-update-pipeline branch

Install and build the frontend

  • Install and activate Node 12:
nvm install 12
nvm use 12
  • Install dependencies:
npm i
  • Rebuild the node-sass dependency:
npm rebuild node-sass
  • Build the frontend:
npm run build

Start up PostgreSQL and ElasticSearch

Note: These may be made optional in the future. Build and start the services via:

touch deploy/.env
docker-compose -f deploy/docker-compose.yml up -d sv-elasticsearch sv-postgres
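
To confirm both services came up before continuing, you can check their status (service names as used above):

docker-compose -f deploy/docker-compose.yml ps sv-elasticsearch sv-postgres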

Prepare the backend

  • Create a virtual environment and activate it:
python3 -m venv .venv
source .venv/bin/activate
  • Install dependencies:
pip install pip wheel --upgrade
pip install -r requirements.txt
pip install PyGithub
  • Set required environment variables:
export CTS_RESOLVER=local \
    CTS_LOCAL_DATA_PATH=data/cts \
    CONTENT_MANIFEST_PATH=data/content-manifests/test.yaml \
    DATABASE_URL=postgres://scaife:[email protected]:5432/scaife
  • Populate the database schema and load site fixture:
./manage.py migrate
./manage.py loaddata sites
  • Copy the static assets:
./manage.py collectstatic --noinput
  • Fetch content from the manifest configured via CONTENT_MANIFEST_PATH (data/content-manifests/test.yaml):
mkdir -p $CTS_LOCAL_DATA_PATH
./manage.py load_text_repos
./manage.py slim_text_repos
  • Ingest the data and pre-populate CTS cache:
mkdir -p atlas_data
./manage.py prepare_atlas_db --force
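
As a quick smoke test before seeding the search index, you can run Django's built-in system check (assumes the virtual environment is still active and the environment variables above are exported):

./manage.py check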

Seed the search index

We'll ingest a portion of the data into ElasticSearch.

  • Fetch the ElasticSearch template:
curl -O https://gist.githubusercontent.com/jacobwegner/68e538edf66539feb25786cc3c9cc6c6/raw/252e01a4c7e633b4663777a7e12dcb81119131e1/scaife-viewer-tmp.json
  • Install the template:
curl -X PUT "localhost:9200/_template/scaife-viewer?pretty" -H 'Content-Type: application/json' -d "$(cat scaife-viewer-tmp.json)"
  • Index content:
python manage.py indexer --max-workers=1 --limit=1000
  • Cleanup the search index template:
rm scaife-viewer-tmp.json
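
To verify that documents were actually indexed, you can list the indices ElasticSearch knows about and check their document counts:

curl "localhost:9200/_cat/indices?v"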

Run the dev server

 ./manage.py runserver

Codespaces should show a notification that a port has been mapped.

  • Click "Open in Browser" to load the dev server.
  • Click on "try the Iliad" to load the reader: image-20230622054959080

The Codespace has now been set up! Close it by opening the "Codespaces" menu (F1) and then selecting Codespaces: Stop Current Codespace.

Rename the Codespace

  • Browse to https://github.com/codespaces and find the codespace
  • Select the "..." menu and then "Rename"
  • Give the Codespace a meaningful name (e.g. Scaife Viewer / Perseus dev)

Ongoing development

  • Browse to https://github.com/codespaces and find the codespace
  • Select the "..." menu and then "Open in..." and select "Open in browser" or another of the available options. image-20230622165503906
  • After the Codespace launches, open a new terminal and reactivate the Python virtual environment:
source .venv/bin/activate
  • Populate required environment variables:
export CTS_RESOLVER=local \
    CTS_LOCAL_DATA_PATH=data/cts \
    CONTENT_MANIFEST_PATH=data/content-manifests/test.yaml \
    DATABASE_URL=postgres://scaife:[email protected]:5432/scaife
  • Start up PostgreSQL and ElasticSearch:
docker-compose -f deploy/docker-compose.yml up -d sv-elasticsearch sv-postgres
# Optionally wait 10 seconds for Postgres to finish starting (see the polling alternative after this list)
sleep 10
  • Run the dev server:
 ./manage.py runserver

Codespaces should show a notification that a port has been mapped.

  • Click "Open in Browser" to load the dev server.

Getting Started with Local Development

Requirements:

  • Python 3.6.x
  • Node 11.7
  • PostgreSQL 9.6
  • Elasticsearch 6

First, install and run Elasticsearch on port 9200. If you're on a Mac, we recommend using brew for this:

brew install elasticsearch
brew services start elasticsearch

Then, set up a postgres database to use for local development:

createdb scaife-viewer

This assumes your local PostgreSQL is configured to allow your user to create databases. If it is not, you may be able to create a superuser role for yourself:

createuser --username=postgres --superuser $(whoami)
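
To confirm the database exists and is reachable, you can run a trivial query (assumes psql is on your PATH and your user can connect):

psql scaife-viewer -c 'SELECT 1;'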

Create and activate a virtual environment. Then install the Node and Python dependencies:

npm install
pip install -r requirements-dev.txt

Set up the database:

python manage.py migrate
python manage.py loaddata sites

Seed the text inventory to speed up local development:

./bin/download_local_ti

You should now be set to run the static build pipeline and hot module reloading:

npm start

In another terminal, collect the static files and then start runserver:

python manage.py collectstatic --noinput
python manage.py runserver

Browse to http://localhost:8000/.

Note that, although Scaife is running locally, it still relies on the Nautilus server at https://scaife-cts-dev.perseus.org to retrieve texts.

Tests

You can run the Vue unit tests via:

npm run unit

Cross-browser testing is provided by BrowserStack through their open source program.

Translations

Before you work with translations, you will need gettext installed.

macOS:

brew install gettext
export PATH="$PATH:$(brew --prefix gettext)/bin"

To prepare messages:

python manage.py makemessages --all

If you need to add a language, add it to LANGUAGES in settings.py and run:

python manage.py makemessages --locale <lang>
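
Once translators have filled in the generated .po files, compile them so Django can serve the translations at runtime (compilemessages also requires gettext):

python manage.py compilemessages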

Hosting Off-Root

If you need to host at a place other than root, for example, if you need to have a proxy serve at some path off your domain like http://yourdomain.com/perseus/, you'll need to do the following:

  1. Set the FORCE_SCRIPT_NAME environment variable to the off-root path:
    export FORCE_SCRIPT_NAME=/perseus  # the leading slash is important
  2. Make sure this is set prior to running npm run build, as well as prior to and as part of your wsgi startup environment.

  3. Then point your proxy at the location where your wsgi server is running. For example, if you are running wsgi on port 8000, you can use this snippet inside your nginx config for the server:

    location /perseus/ {
        proxy_pass        http://localhost:8000/;
    }

That should be all you need to do.
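
To verify the setup end to end, you can request a page through the proxy and check for a successful response (hostname and port are placeholders for wherever nginx is listening):

curl -I http://localhost/perseus/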

Deploying via Docker

A sample docker-compose configuration is available at deploy/docker-compose.yml.

Copy .env.example and customize environment variables for your deployment:

cp deploy/.env.example deploy/.env

To build the Docker image and bring up the scaife-viewer, sv-postgres and sv-elasticsearch services in the background:

docker-compose -f deploy/docker-compose.yml up --build -d

Tail logs via:

docker-compose -f deploy/docker-compose.yml logs --follow

To host the application off-root using docker-compose, you'll need to ensure that the scaife-viewer Docker image is built with the FORCE_SCRIPT_NAME build arg:

docker-compose -f deploy/docker-compose.yml build --build-arg FORCE_SCRIPT_NAME=/<your-off-root-path>

You'll also need to ensure that FORCE_SCRIPT_NAME exists in deploy/.env:

echo "FORCE_SCRIPT_NAME=/<your-off-root-path>" >> deploy/.env

Then, bring up all services:

docker-compose -f deploy/docker-compose.yml up -d

Using Docker for development

The project also includes Dockerfile-dev and Dockerfile-webpack images which can be used with Docker Compose to facilitate development.

First, copy .env.example and customize environment variables for development:

cp deploy/.env.example deploy/.env

Then build the images and spin up the containers:

docker-compose -f deploy/docker-compose.yml -f deploy/docker-compose.override.yml up --build

To run only the scaife-viewer, sv-webpack, and sv-postgres services, set the USE_ELASTICSEARCH_SERVICE environment variable in docker-compose.override.yml to 0, and then run:

docker-compose -f deploy/docker-compose.yml -f deploy/docker-compose.override.yml up --build scaife-viewer sv-webpack sv-postgres

To run the indexer command:

docker-compose -f deploy/docker-compose.yml -f deploy/docker-compose.override.yml exec scaife-viewer python manage.py indexer

API Library Cache

The client side currently caches the results of library/json/. The cache is automatically invalidated every 24 hours. You can invalidate it manually by bumping the LIBRARY_VIEW_API_VERSION environment variable.
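
For example, on a Heroku deployment such as the scaife-perseus-org app mentioned below, bumping the variable might look like this (the value is arbitrary; any change invalidates the cache):

heroku config:set LIBRARY_VIEW_API_VERSION=2 --app=scaife-perseus-org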

ATLAS Database

bin/fetch_atlas_db can be used to fetch and extract an ATLAS database from a provided URL.

To build a copy of this database locally:

  • Run bin/download_local_ti to get a local copy of the text inventory from $CTS_API_ENDPOINT
  • Run bin/fetch_corpus_config to load corpus-specific configuration files
  • Run the prepare_atlas_db management command to ingest ATLAS data from CTS collections (assumes atlas_data directory exists; create it via mkdir -p atlas_data)
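
Putting those steps together, a local build might look like this (the --force flag matches its use in the Codespaces instructions above):

./bin/download_local_ti
./bin/fetch_corpus_config
mkdir -p atlas_data
./manage.py prepare_atlas_db --force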

Queries to ATLAS models are routed via the ATLASRouter database router (and are therefore isolated from the default database).

CTS Data

CTS data is now bundled with the application.

The deployment workflow is responsible for making corpora available at the location specified by settings.CTS_LOCAL_DATA_PATH.

For Heroku deployments, this is currently accomplished by preparing a tarball made available via $CTS_TARBALL_URL and downloading and uncompressing the tarball using bin/fetch_cts_tarball.sh.
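
A sketch of that workflow, with a placeholder tarball URL:

export CTS_TARBALL_URL=https://example.com/cts-corpora.tar.gz
./bin/fetch_cts_tarball.sh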

Build / Deploy Pipeline

The production instance of the application is built using GitHub Actions.

To deploy a new version, trigger the GitHub Actions workflows referenced below ("Reindex content", "Deploy app image", and "Promote search index").

After verifying the changes on scaife-dev.perseus.org, re-run "Deploy app image" and "Promote search index" with:

  • Heroku app: scaife-perseus-org
  • Name of the latest search index ($ELASTICSEARCH_INDEX_NAME) from the "Reindex content" workflow.

Additional maintenance tasks are documented below.

GitHub releases

After deploying to scaife.perseus.org, manually create a new release:

  1. Create a new release

    • Tag: YYYY-MM-DD-###, e.g. 2023-07-06-001
    • Title: (repeat value from "Tag")
    • Description:
      • Code feature 1
      • Code feature 2, etc
      • Content changes since the last deployment (To generate a diff, use the "Diff corpora contents" workflow documented below in "Release Tasks")
  2. Save the release as a draft

  3. After verifying changes on scaife-dev.perseus.org and promoting changes to scaife.perseus.org, publish the draft

  4. Restart the Heroku app to pick up the new release: *

* TODO: Add a workflow to restart the app.

It will be restarted when "Promote search index" is run (due to the updated environment variable).

Or manually via:

heroku ps:restart --app=scaife-perseus-org

After the application restarts, refresh the homepage to verify the latest release is linked.

Release Tasks

  • Diff corpora contents:

    Diff two versions of the corpus-metadata manifest.

    This workflow should be run after deploying to scaife-perseus-org-dev and before deploying to scaife-perseus-org.

    It uses the scaife CLI to create a diff:

    scaife diff-corpora-contents

    If the workflow is successful, the diff will be included in the job summary:

    --- old.json	2023-07-06 08:30:12
    +++ new.json	2023-07-06 08:30:12
    @@ -1527,10 +1527,10 @@
        ]
    },
    {
    -    "ref": "0.0.5350070018",
    +    "ref": "0.0.5426028119",
        "repo": "PerseusDL/canonical-greekLit",
    -    "sha": "593087513cb16dd02f0a7b8362519a3a0e2f29bc",
    -    "tarball_url": "https://api.github.com/repos/PerseusDL/canonical-greekLit/tarball/0.0.5350070018",
    +    "sha": "701d7470d6bf9a11fb6e508ddd3270bf88748303",
    +    "tarball_url": "https://api.github.com/repos/PerseusDL/canonical-greekLit/tarball/0.0.5426028119",
        "texts": [
        "urn:cts:greekLit:tlg0001.tlg001.perseus-grc2",
        "urn:cts:greekLit:tlg0003.tlg001.opp-fre1",
    \ No newline at end of file

Maintenance Tasks

The following GitHub Actions workflows are used to run maintenance tasks:

  • Check reindexing job status:

    This workflow should be run to check the status of the Reindex content job.

    It will query the Google Cloud Run API and return a description of the latest job execution.

    It also checks the completion status of the execution. If the execution has not completed, an error will be returned.

    When the execution has completed, no error is returned.
  • Delete indices:

    This workflow should be run after promoting the search index.

    It will remove all indices except for the current active index ($ELASTICSEARCH_INDEX_NAME as configured for the scaife-perseus-org Heroku app).

    This could be a scheduled workflow, but it is kept as a manual task in case multiple indices need to remain available (e.g. for testing on scaife-perseus-org-dev).

readhomer's People

Contributors: jacobwegner, mjhea0, paltman, rillian

Forkers: rillian, cdli-gh

readhomer's Issues

bring in tagging and lemmatisation

With modification and cleanup, serve up Celano's work via an API (this is already partially done in scaife.perseus.org for the token list widget), but put the information in the data plane for subsequent decoration of the text based on it.

Even though lots will be possible, a sample use case would be to colour-code words by their part of speech.

loading globalComponents fails in production

I was trying to use the netlify builds, but the output of yarn build won't actually load. It fails with:

TypeError: t.__file is undefined main.js:13:2
56d7 main.js:13
forEach self-hosted:266
56d7 main.js:11
Webpack 6

The issue seems to be related to globalComponents array containing objects without a __file property in the minified version.

show the CTS URN for passage being shown

even though we're not using the CTS protocol as such, it's still very useful to have the CTS URN widget. Could even link directly to the API output like it does in scaife.perseus.org

synchronised reading of translations

like we have in scaife.perseus.org but let's try to solve the problem of the corresponding text parts getting out of sync.

Also we don't currently serve up the English translations via any API.

Note also that there is some richer markup in at least one of the English translations than in the Greek.

infinite scroll

this might need to be a toggle via a widget (because it sort of competes with pagination) but for some projects it will be nice to have infinite scroll and this is a good project to prototype it on

add some new TOC chunking

find something for Homer that isn't necessarily complete or contiguous but which has named passages that we can use to showcase the TOC/chunking/pagination capability

translation alignment

if translation alignment data between the Greek and a translation exists and can be served up via an API, we can add it to the data plane and support hovering over words in one version to see the aligned words in the other highlighted.

Localization support

Scaife-viewer uses django's localization and internationalization features to support translating interface elements.

A flagship project like readhomer should also do something for this, to demonstrate how to improve accessibility and build toward a less colonized scholarship.

Vue.js doesn't have any built-in support for this that I can find. There's a vue-i18n package which might work. It doesn't support po files the way django does. Other options are VueLocalize (unmaintained) and v-localize (seems too simple for large projects).

tweaks to HOMER READER and HOMER REFERENCE INPUT

  • rename "Global Sync" to just "Sync" as otherwise it wraps when narrow.
  • make the Iliad / Odyssey a button group toggle placed to the right of the passage reference, e.g [ Il. | Od. ] [ Passage Ref ] [ Lookup ] [ Sync ]
  • Homer Reference Input should NOT have the Global Sync button (but it could make the work selector a button group toggle)
  • I think when Sync is on, we either need to hide the passage ref input box or change its contents, as it's confusing when it says something different from what's being shown

add word list widget

the vocab.perseus.org API should work so we should be able to port over the word list widget.

view arbitrary line ranges of the Greek text in either work

I deliberately haven't specified here how you give the passage range you want because I'll set up separate issues for that. I realise that makes this less of a user-story and more functionality but if the URL reflects the passage range, this story can be accepted based on whether I can change the URL to get new passage ranges.

Reference Model: A2 C1
