Sefaria-Export

Structured Jewish texts and metadata with free public licenses, exported from Sefaria's database.

This repo contains texts, bibliographical information and lists of intertextual connections created by Sefaria.

A MongoDB dump of Sefaria's database is also available for download here, or as a smaller version (without text edit history) here. Download the file, extract it, and use mongorestore to load it into your local database.

From the parent of the unzipped dump folder, run:

mongorestore --drop

This will create (or overwrite) a MongoDB database called sefaria.

More details are available here.

For Sefaria source code see Sefaria-Project.

Contents

  • /json/ - simple JSON output of texts
  • /txt/ - simple plain text output of texts
  • /xml/ - simple XML output of texts (coming as soon as requested)
  • /links/ - CSV output of all known interconnections in texts
  • /schemas/ - JSON files corresponding to schema information about each text
  • /misc/ - other miscellaneous data outputs
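
As a minimal sketch of reading one of the CSV files in /links/, the following Python uses only the standard library. It makes no assumption about column names and simply keys each row by whatever header the file provides. The function name and the file name in the usage comment are hypothetical.

```python
import csv
from typing import Dict, Iterator, TextIO


def iter_link_rows(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield each CSV row as a dict keyed by the file's own header row."""
    reader = csv.DictReader(handle)
    for row in reader:
        yield row


# Usage (hypothetical file name; check /links/ for the actual files):
# with open("links/links.csv", newline="", encoding="utf-8") as f:
#     for row in iter_link_rows(f):
#         print(row)
```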

Text output folders are organized by category and contain separate directories for each language. Each file is named according to the version of the particular text.
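
One way to explore this layout is to walk an output folder and group the version files by the directory that holds them. The sketch below assumes only that version files share a common extension and live somewhere under a local checkout; the function name and the path in the usage comment are hypothetical.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def versions_by_directory(root: Path, extension: str = "json") -> Dict[Path, List[str]]:
    """Map each directory under root to the sorted names of the files it holds."""
    found: Dict[Path, List[str]] = defaultdict(list)
    for path in root.rglob(f"*.{extension}"):
        if path.is_file():
            # Key by the directory path relative to root; store the file name
            # without its extension (the version name, per the naming scheme).
            found[path.parent.relative_to(root)].append(path.stem)
    return {directory: sorted(names) for directory, names in found.items()}


# Usage (hypothetical local path to a checkout of this repo):
# for directory, names in versions_by_directory(Path("json")).items():
#     print(directory, names)
```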

Each terminal directory also includes a file called merged (e.g., merged.json or merged.txt). This file uses the same logic as the Sefaria website to include the maximal content available. For example, there are cases in the Mishnah where no single English version is complete by itself; the merged version provides a complete text by picking and merging from multiple sources as needed.

When we do have complete versions of texts, we still include a merged file. In that case, the merged file is a copy of the default complete version. This simplifies many applications: the merged version is always guaranteed to contain the maximal amount of text available, with preference given to the text versions we've set.
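
The idea behind a merged file can be illustrated with a small sketch. This is not Sefaria's actual merging code (that lives in sefaria/export.py in Sefaria-Project); it is a simplified illustration that assumes each version is a flat list of segment strings, with an empty string where a segment is missing, and that versions are passed in preference order. The function name is hypothetical.

```python
from typing import List, Sequence


def merge_versions(versions: Sequence[Sequence[str]]) -> List[str]:
    """Merge versions given in preference order, filling gaps from later ones.

    For each segment position, take the text from the first version that has
    non-empty content there; leave an empty string if none does.
    """
    length = max((len(v) for v in versions), default=0)
    merged: List[str] = []
    for i in range(length):
        segment = ""
        for version in versions:
            if i < len(version) and version[i].strip():
                segment = version[i]
                break
        merged.append(segment)
    return merged


# The preferred version lacks its second segment, so it is taken from the
# next version in the list.
print(merge_versions([["a", "", "c"], ["x", "b"]]))
```

Under these assumptions, when the preferred version is complete and at least as long as the others, the result is simply a copy of it.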

Code for generating these files can be found in our Sefaria-Project repo under sefaria/export.py.
