kitab-website-draft's People

Contributors: mabarber92

Forkers: pverkind

kitab-website-draft's Issues

logos in page footers

Page URL:

(all pages)

Issue with the website:

Because the footer is blue and the background of the logos is white, you get an ugly block effect around the logos.
Perhaps it's nicer to use white as the background colour of the footer?
The logos are also too small and the letters impossible to read.

Mention that passim data is created for every release of the corpus + difference between primary and secondary runs

Page URL:

https://mabarber92.github.io/data#passim-text-reuse-datasets

Suggested content change

Before:

Passim text reuse datasets

We produce two types of passim dataset: normal and aggregated.

After

Passim text reuse datasets

**We run passim on our corpus every time we release a new version of the corpus. For each passim run, we** produce two types of passim dataset: normal and aggregated.

Corpus and data message

Page URL:

https://mabarber92.github.io/corpus/

Issue with the website:

The landing page for this section talks only about the Corpus, not about Data and Corpus.

Perhaps we need to re-organize the structure of this section, so that one lands on a very short message explaining that the project is based on a corpus and that we produce a lot of data, and then have separate sections on Corpus and on Data.

Before:

CORPUS AND DATA
Message from the team
About the corpus
Using the corpus
mARkdown
OCR
Corpus releases
Read more

DATASETS
About our data
About our Vizualisations
Use Our Applications
Read more

After:

CORPUS AND DATA
Message from the team

CORPUS
Message from the team
About the corpus
Using the corpus
mARkdown
OCR
Corpus releases
Read more

DATASETS
About our data
About our Vizualisations
Use Our Applications
Read more

Documentation: add intermediate page

Page URL:

(all pages)

Currently, the "Documentation" entry in the menu at the top of the page links directly to the OpenITI documentation (https://openiti.github.io/documentation/)

Would it not be better to have a Documentation page, like the page we have for each of the other main sections, and link to the OpenITI documentation from there? At some point, we will have documentation that is not OpenITI-related as well.

Apps page: different layout?

Page URL:

https://mabarber92.github.io/#test-link

Issue with the website:

The page currently consists of a block for each application with an image, a very short text and a button that links to the application.
At the top of the page, users are referred to the visualizations page for guidance on how to use the applications. Since not all our applications are related to visualization, I don't think that's the best location for the guidance.

I think it's probably better that each app has its own page with a description of it and a link to it (so, probably more similar to the https://mabarber92.github.io/data/viz page that describes each type of visualization).

home page banner

Page URL:

https://mabarber92.github.io/

  • A good image should be used for the banner.
  • The title of the page is a bit dull, and I don't like the fact that the title of the project is abbreviated

Suggested content change

Before:

Knowledge Information and the Arabic Book
The Website for the KITAB project

After

The KITAB project
Knowledge, Information Technology, and the Arabic Book

(not that this is less dull...)

documentation for columns in passim alignment + statistics files not linked

Page URL:

https://mabarber92.github.io/data#alignment-files
https://mabarber92.github.io/data#passim-text-reuse-statistics

Issue with the website:

The text links to the "docs" for more info on the columns in the alignment and statistics files, but the URL has not been added so the link leads back to the general https://mabarber92.github.io/data page:

  • For detailed guidance on the data fields, see our [docs](For detailed guidance on the data fields, see our docs).
  • (for a full outline of the data fields, see our docs:

NB: in the latter, there's also a closing parenthesis missing

Passim text reuse datasets

Page URL:

https://mabarber92.github.io/data#passim-text-reuse-datasets

Suggested content change

Before:

The normal dataset uses passim alignments based on the milestones (the logical chunks that we use to run passim). In this dataset large alignments might be split across multiple milestones. The aggregated dataset takes large alignments the cross milestones and brings them together into one alignment.

After

The normal dataset uses passim alignments based on the milestones (the 300-word chunks into which we divide texts before running passim). In this dataset large alignments might be split across multiple milestones; this dataset is especially useful for book-to-book visualization of text reuse. The aggregated dataset takes large alignments that cross milestones and brings them together into one alignment; this dataset is particularly useful for close reading.
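
To make the difference between the two datasets concrete, here is a minimal Python sketch. It is an illustration only, not KITAB's or passim's actual code: the function names are hypothetical, the text is tokenized on whitespace (an assumption), and "aggregation" is simplified to merging consecutive milestone numbers into ranges.

```python
def split_into_milestones(text: str, size: int = 300) -> list[list[str]]:
    """Divide a text into chunks of `size` words (assumed whitespace tokenization)."""
    tokens = text.split()
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def merge_consecutive(milestones: list[int]) -> list[tuple[int, int]]:
    """Merge consecutive milestone numbers into (start, end) ranges.

    Hypothetical simplification of the aggregated dataset: an alignment
    that spans milestones 4, 5 and 6 becomes the single range (4, 6).
    """
    ranges: list[tuple[int, int]] = []
    for ms in sorted(set(milestones)):
        if ranges and ms == ranges[-1][1] + 1:
            # Extend the current range by one milestone.
            ranges[-1] = (ranges[-1][0], ms)
        else:
            # Start a new range.
            ranges.append((ms, ms))
    return ranges


chunks = split_into_milestones("word " * 650)
chunk_lengths = [len(chunk) for chunk in chunks]
merged = merge_consecutive([4, 5, 6, 9])
```

In this sketch a 650-word text yields three milestones (300, 300 and 50 words), and alignments located in milestones 4, 5, 6 and 9 collapse into two ranges.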

"When does KITAB run passim?" Add that passim runs are linked to specific corpus release

Page URL:

https://mabarber92.github.io/methods/text-reuse#when-does-kitab-run-passim

Suggested content change

Before:

Passim is run at least twice every year to account for corpus changes. We do not run it more regularly because the preparation of the corpus and subsequent analysis of the data produced by passim is very time consuming. It is important for us that the corpus is prepared appropriately and that the data and subsequent analysis is checked thoroughly, to guard against potential errors.

After

Passim is run at least twice every year to account for corpus changes. We do not run it more regularly because the preparation of the corpus and subsequent analysis of the data produced by passim is very time consuming. It is important for us that the corpus is prepared appropriately and that the data and subsequent analysis is checked thoroughly, to guard against potential errors. Each time passim is run, the corpus as it is at that point is released on Zenodo, so that the text reuse data can always be linked back to the state the texts were in when passim was run.

pairwise viz explanation: add less technical explanation?

Page URL:

https://mabarber92.github.io/data/viz

Perhaps it's useful to add a simplified explanation of the visualization that doesn't include terms like x axis, milestones, etc. Something like:

"The visualization represents each of the two works as a 300-word-wide scroll, with the top on the left and the bottom on the right. Passages that are common to both works are highlighted in red. The yellow lines connect the common passages between the books."

reorganize the alignment files explanation?

Page URL:

https://mabarber92.github.io/data#alignment-files

The order of the explanations does not feel logical. I would first explain what is in those files, and only after that, explain the naming system and the different versions.

Suggested content change

Before:

If passim identifies text reuse between two books, a file is produced. A separate file is produced for both the normal and aggregated datasets. The file name takes the format of bookid1_bookid2. (On book IDs and the way we name the books in our corpus, see our page on [using the corpus]({{ 'corpus/use#uri-structure' | relative_url }}).) The file recording alignments between Ibn Hisham's Sira and al-Tabari's Taʾrikh would, therefore be:

Shamela0009783BK1-ara1.completed_Shamela0023833-ara1.completed.csv
{: .notice--primary}

For ease of identifying text pairs, we produce each alignment file in both directions, so:

Shamela0023833-ara1.completed_Shamela0009783BK1-ara1.completed.csv
{: .notice--primary}

Would be a flipped version of the same file.

In each file, each row gives an alignment between the text pair, recording the aligned text (aligned using [Smith-Waterman]({{ '/methods/text-reuse#how-does-passim-work' | relative_url }})), the location of each alignment in the book and some statistics about each alignment. For detailed guidance on the data fields, see our docs.

The main difference between the alignment files for normal and aggregated is that the location of normal alignments is given as a milestone, where for aggregated a milestone range is provided.

After

If passim identifies text reuse between two books, a file is produced. In each file, each row gives data on one aligned passage in the text pair, recording the aligned text (aligned using the [Smith-Waterman]({{ '/methods/text-reuse#how-does-passim-work' | relative_url }}) algorithm), the location of each alignment in both books and some statistics about each alignment. For detailed guidance on the data fields, see our docs.

The file name takes the format of bookid1_bookid2. (On book IDs and the way we name the books in our corpus, see our page on [using the corpus]({{ 'corpus/use#uri-structure' | relative_url }}).) The file recording alignments between Ibn Hisham's Sira and al-Tabari's Taʾrikh would therefore be:

Shamela0009783BK1-ara1.completed_Shamela0023833-ara1.completed.csv
{: .notice--primary}

For ease of identifying text pairs, we produce each alignment file in both directions, so:

Shamela0023833-ara1.completed_Shamela0009783BK1-ara1.completed.csv
{: .notice--primary}

would be a flipped version of the same file.

A separate file is produced for both the normal and aggregated datasets. The main difference between the alignment files for normal and aggregated is that the location of normal alignments is given as a milestone, where for aggregated a milestone range is provided.
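
The naming scheme described above can be sketched in a few lines of Python. This is only an illustration: the helper name `alignment_file_names` is hypothetical, and the `.csv` extension is taken from the example file names in this issue.

```python
def alignment_file_names(book_id_1: str, book_id_2: str,
                         extension: str = ".csv") -> tuple[str, str]:
    """Return the alignment file name for a text pair and its flipped version."""
    forward = f"{book_id_1}_{book_id_2}{extension}"
    flipped = f"{book_id_2}_{book_id_1}{extension}"
    return forward, flipped


# Book IDs taken from the example in the issue text.
forward, flipped = alignment_file_names(
    "Shamela0009783BK1-ara1.completed",
    "Shamela0023833-ara1.completed",
)
```

Because both directions exist, a user looking for a text pair can search under either book ID first.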
