
Scaife Viewer

The new reading environment for version 5.0 of the Perseus Digital Library.

This repository is part of the Scaife Viewer project, an open-source ecosystem for building rich online reading environments.

Getting Started with Codespaces Development

This project can be developed via GitHub Codespaces.

Setting up the Codespace

  • Browse to https://github.com/scaife-viewer/scaife-viewer
  • (Optionally) fork the repo; if you're part of the Scaife Viewer development team, you can work from scaife-viewer/scaife-viewer
  • Create a codespace from the green "Code" button
  • Configure options to:
    • Choose the closest data center to your geographical location
    • Start the codespace from the feature/content-update-pipeline branch

Install and build the frontend

  • Install and activate Node 12:
nvm install 12
nvm use 12
  • Install dependencies:
npm i
  • Rebuild the node-sass dependency:
npm rebuild node-sass
  • Build the frontend:
npm run build

Start up PostgreSQL and Elasticsearch

Note: these services may be made optional in the future. Build and start them via:

touch deploy/.env
docker-compose -f deploy/docker-compose.yml up -d sv-elasticsearch sv-postgres
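To verify that both services came up, list the running services:

docker-compose -f deploy/docker-compose.yml ps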

Prepare the backend

  • Create a virtual environment and activate it:
python3 -m venv .venv
source .venv/bin/activate
  • Install dependencies:
pip install pip wheel --upgrade
pip install -r requirements.txt
pip install PyGithub
  • Set required environment variables:
export CTS_RESOLVER=local \
    CTS_LOCAL_DATA_PATH=data/cts \
    CONTENT_MANIFEST_PATH=data/content-manifests/test.yaml \
    DATABASE_URL=postgres://scaife:scaife@127.0.0.1:5432/scaife
  • Populate the database schema and load site fixture:
./manage.py migrate
./manage.py loaddata sites
  • Copy the static assets:
./manage.py collectstatic --noinput
  • Fetch content listed in data/content-manifests/test.yaml:
mkdir -p $CTS_LOCAL_DATA_PATH
./manage.py load_text_repos
./manage.py slim_text_repos
  • Ingest the data and pre-populate CTS cache:
mkdir -p atlas_data
./manage.py prepare_atlas_db --force

Seed the search index

We'll ingest a portion of the data into Elasticsearch.

  • Fetch the Elasticsearch template:
curl -O https://gist.githubusercontent.com/jacobwegner/68e538edf66539feb25786cc3c9cc6c6/raw/252e01a4c7e633b4663777a7e12dcb81119131e1/scaife-viewer-tmp.json
  • Install the template:
curl -X PUT "localhost:9200/_template/scaife-viewer?pretty" -H 'Content-Type: application/json' -d "$(cat scaife-viewer-tmp.json)"
  • Index content:
python manage.py indexer --max-workers=1 --limit=1000
  • Cleanup the search index template:
rm scaife-viewer-tmp.json
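To confirm that documents were indexed, you can query Elasticsearch's standard cat API:

curl "localhost:9200/_cat/indices?v"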

Run the dev server

 ./manage.py runserver

Codespaces should show a notification that a port has been mapped.

  • Click "Open in Browser" to load the dev server.
  • Click on "try the Iliad" to load the reader

The Codespace has now been set up! Close it by opening the "Codespaces" menu (F1) and then selecting Codespaces: Stop Current Codespace.

Rename the Codespace

  • Browse to https://github.com/codespaces and find the codespace
  • Select the "..." menu and then "Rename"
  • Give the Codespace a meaningful name (e.g. Scaife Viewer / Perseus dev)

Ongoing development

  • Browse to https://github.com/codespaces and find the codespace
  • Select the "..." menu, then "Open in...", and select "Open in browser" or another of the available options
  • After the Codespace launches, open a new terminal and reactivate the Python virtual environment:
source .venv/bin/activate
  • Populate required environment variables:
export CTS_RESOLVER=local \
    CTS_LOCAL_DATA_PATH=data/cts \
    CONTENT_MANIFEST_PATH=data/content-manifests/test.yaml \
    DATABASE_URL=postgres://scaife:scaife@127.0.0.1:5432/scaife
  • Start up PostgreSQL and Elasticsearch:
docker-compose -f deploy/docker-compose.yml up -d sv-elasticsearch sv-postgres
# Optionally wait 10 seconds for Postgres to finish starting
sleep 10
  • Run the dev server:
 ./manage.py runserver

Codespaces should show a notification that a port has been mapped.

  • Click "Open in Browser" to load the dev server.

Getting Started with Local Development

Requirements:

  • Python 3.6.x
  • Node 11.7
  • PostgreSQL 9.6
  • Elasticsearch 6

First, install and run Elasticsearch on port 9200. If you're on a Mac, we recommend using brew for this:

brew install elasticsearch
brew services start elasticsearch
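You can confirm Elasticsearch is up by querying it on port 9200; it responds with basic cluster information:

curl localhost:9200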

Then, set up a postgres database to use for local development:

createdb scaife-viewer

This assumes your local PostgreSQL is configured to allow your user to create databases. If this is not the case, you might be able to create the user yourself:

createuser --username=postgres --superuser $(whoami)
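If the app does not pick up your database automatically, you can point it there explicitly, following the same DATABASE_URL convention as the Codespaces setup above (the exact credentials depend on your local PostgreSQL configuration):

export DATABASE_URL=postgres://$(whoami)@localhost:5432/scaife-viewer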

Create and activate a virtual environment (a sketch follows these commands). Then, install the Node and Python dependencies:

npm install
pip install -r requirements-dev.txt
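As in the Codespaces setup, the virtual environment can be created and activated via:

python3 -m venv .venv
source .venv/bin/activate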

Set up the database:

python manage.py migrate
python manage.py loaddata sites

Seed the text inventory to speed up local development:

./bin/download_local_ti

You should now be set to run the static build pipeline and hot module reloading:

npm start

In another terminal, collect the static files and then start runserver:

python manage.py collectstatic --noinput
python manage.py runserver

Browse to http://localhost:8000/.

Note that, although Scaife is running locally, it relies on the Nautilus server at https://scaife-cts-dev.perseus.org to retrieve texts.

Tests

You can run the Vue unit tests via:

npm run unit

Cross-browser testing is provided by BrowserStack through their open source program.

Translations

Before you work with translations, you will need gettext installed.

macOS:

brew install gettext
export PATH="$PATH:$(brew --prefix gettext)/bin"

To prepare messages:

python manage.py makemessages --all

If you need to add a language, add it to LANGUAGES in settings.py and run:

python manage.py makemessages --locale <lang>
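LANGUAGES follows Django's standard (code, name) format; an added entry might look like this (the French entry is illustrative):

# settings.py
LANGUAGES = [
    ("en", "English"),
    ("fr", "French"),  # illustrative example of an added language
]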

Hosting Off-Root

If you need to host at a place other than root, for example, if you need to have a proxy serve at some path off your domain like http://yourdomain.com/perseus/, you'll need to do the following:

  1. Set the environment variable FORCE_SCRIPT_NAME to point to your path prefix:
    export FORCE_SCRIPT_NAME=/perseus  # this leading slash is important
  2. Make sure this is set prior to running npm run build, as well as prior to and as part of your WSGI startup environment (see the sketch after this list).

  3. Then, point your proxy at the location where your WSGI server is running. For example, if you are running WSGI on port 8000, you can use this snippet inside your nginx config for the server:

    location /perseus/ {
        proxy_pass        http://localhost:8000/;
    }
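    A minimal startup sketch (gunicorn and the scaife_viewer.wsgi module path are assumptions; substitute your own WSGI server and module):

    export FORCE_SCRIPT_NAME=/perseus
    npm run build
    gunicorn scaife_viewer.wsgi:application --bind 127.0.0.1:8000  # hypothetical module path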

That should be all you need to do.

Deploying via Docker

A sample docker-compose configuration is available at deploy/docker-compose.yml.

Copy .env.example and customize environment variables for your deployment:

cp deploy/.env.example deploy/.env

To build the Docker image and bring up the scaife-viewer, sv-postgres and sv-elasticsearch services in the background:

docker-compose -f deploy/docker-compose.yml up --build -d

Tail logs via:

docker-compose -f deploy/docker-compose.yml logs --follow

To host the application off-root using docker-compose, you'll need to ensure that the scaife-viewer Docker image is built with the FORCE_SCRIPT_NAME build arg:

docker-compose -f deploy/docker-compose.yml build --build-arg FORCE_SCRIPT_NAME=/<your-off-root-path>

You'll also need to ensure that FORCE_SCRIPT_NAME exists in deploy/.env:

echo "FORCE_SCRIPT_NAME=/<your-off-root-path>" >> deploy/.env

Then, bring up all services:

docker-compose -f deploy/docker-compose.yml up -d

Using Docker for development

The project also includes Dockerfile-dev and Dockerfile-webpack images which can be used with Docker Compose to facilitate development.

First, copy .env.example and customize environment variables for development:

cp deploy/.env.example deploy/.env

Then build the images and spin up the containers:

docker-compose -f deploy/docker-compose.yml -f deploy/docker-compose.override.yml up --build

To run only the scaife-viewer, sv-webpack, and sv-postgres services, set the USE_ELASTICSEARCH_SERVICE environment variable in docker-compose.override.yml to 0, and then run:

docker-compose -f deploy/docker-compose.yml -f deploy/docker-compose.override.yml up --build scaife-viewer sv-webpack sv-postgres

To run the indexer command:

docker-compose -f deploy/docker-compose.yml -f deploy/docker-compose.override.yml exec scaife-viewer python manage.py indexer

API Library Cache

The client-side currently caches the results of library/json/. The cache is automatically invalidated every 24 hours. You can manually invalidate it by bumping the LIBRARY_VIEW_API_VERSION environment variable.
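For example, on a Heroku deployment (app name as used elsewhere in this README; any new value works, "2" is illustrative):

heroku config:set LIBRARY_VIEW_API_VERSION=2 --app=scaife-perseus-org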

ATLAS Database

bin/fetch_atlas_db can be used to fetch and extract an ATLAS database from a provided URL.

To build a copy of this database locally:

  • Run bin/download_local_ti to get a local copy of the text inventory from $CTS_API_ENDPOINT
  • Run bin/fetch_corpus_config to load corpus-specific configuration files
  • Run the prepare_atlas_db management command to ingest ATLAS data from CTS collections (assumes atlas_data directory exists; create it via mkdir -p atlas_data)
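A combined sketch of those steps (the --force flag follows the Codespaces setup above):

./bin/download_local_ti
./bin/fetch_corpus_config
mkdir -p atlas_data
./manage.py prepare_atlas_db --force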

Queries to ATLAS models are routed via the ATLASRouter database router (and are therefore isolated from the default database).

CTS Data

CTS data is now bundled with the application.

The deployment workflow is responsible for making corpora available at the location specified by settings.CTS_LOCAL_DATA_PATH.

For Heroku deployments, this is currently accomplished by preparing a tarball made available via $CTS_TARBALL_URL and downloading and uncompressing the tarball using bin/fetch_cts_tarball.sh.
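A sketch of that step, assuming the script reads $CTS_TARBALL_URL from the environment (the URL is a placeholder):

export CTS_TARBALL_URL="https://example.com/cts-corpora.tar.gz"  # placeholder
./bin/fetch_cts_tarball.sh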

Build / Deploy Pipeline

The production instance of the application is built using GitHub Actions.

To deploy a new version, first run the "Deploy app image", "Reindex content", and "Promote search index" GitHub Actions workflows against the dev app (scaife-perseus-org-dev).

After verifying the changes on scaife-dev.perseus.org, re-run "Deploy app image" and "Promote search index" with:

  • Heroku app: scaife-perseus-org
  • Name of the latest search index: the $ELASTICSEARCH_INDEX_NAME value from the "Reindex content" workflow.

Additional maintenance tasks are documented below.

GitHub releases

After deploying to scaife.perseus.org, manually create a new release:

  1. Create a new release

    • Tag: YYYY-MM-DD-###, e.g. 2023-07-06-001
    • Title: (repeat value from "Tag")
    • Description:
      • Code feature 1
      • Code feature 2, etc.
      • Content changes since the last deployment (To generate a diff, use the "Diff corpora contents" workflow documented below in "Release Tasks")
  2. Save the release as a draft

  3. After verifying changes on scaife-dev.perseus.org and promoting changes to scaife.perseus.org, publish the draft

  4. Restart the Heroku app to pick up the new release: *

* TODO: Add a workflow to restart the app.

It will be restarted when "Promote search index" is run (due to the updated environment variable).


Or manually via:

heroku ps:restart --app=scaife-perseus-org

After the application restarts, refresh the homepage to verify that the latest release is linked.

Release Tasks

  • Diff corpora contents:

    Diff two versions of the corpus-metadata manifest.

    This workflow should be run after deploying to scaife-perseus-org-dev and before deploying to scaife-perseus-org.

    It uses the scaife CLI to create a diff:

    scaife diff-corpora-contents

    If the workflow is successful, the diff will be included in the job summary:

    --- old.json	2023-07-06 08:30:12
    +++ new.json	2023-07-06 08:30:12
    @@ -1527,10 +1527,10 @@
        ]
    },
    {
    -    "ref": "0.0.5350070018",
    +    "ref": "0.0.5426028119",
        "repo": "PerseusDL/canonical-greekLit",
    -    "sha": "593087513cb16dd02f0a7b8362519a3a0e2f29bc",
    -    "tarball_url": "https://api.github.com/repos/PerseusDL/canonical-greekLit/tarball/0.0.5350070018",
    +    "sha": "701d7470d6bf9a11fb6e508ddd3270bf88748303",
    +    "tarball_url": "https://api.github.com/repos/PerseusDL/canonical-greekLit/tarball/0.0.5426028119",
        "texts": [
        "urn:cts:greekLit:tlg0001.tlg001.perseus-grc2",
        "urn:cts:greekLit:tlg0003.tlg001.opp-fre1",
    \ No newline at end of file

Maintenance Tasks

The following GitHub Actions workflows are used to run maintenance tasks:

  • Check reindexing job status:

    This workflow should be run to check the status of the "Reindex content" job.

    It will query the Google Cloud Run API and return a description of the latest job execution.

    It also checks the completion status of the execution. If the execution has not completed, an error will be returned.

    When the execution has completed, no error is returned.
  • Delete indices:

    This workflow should be run after promoting the search index.

    It will remove all indices except for the current active index ($ELASTICSEARCH_INDEX_NAME as configured for the scaife-perseus-org Heroku app).

    This could be a scheduled workflow, but it is kept as a manual task in case multiple indices need to remain available (e.g. for testing on scaife-perseus-org-dev).
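    The manual equivalent uses Elasticsearch's standard index APIs (shown against a local cluster; the index name is a placeholder):

    curl "localhost:9200/_cat/indices?v"
    curl -X DELETE "localhost:9200/<old-index-name>"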

