
nuxeo_spreadsheet

Prerequisites

Follow these instructions first, to install the prerequisites to run nuxeo_spreadsheet.

Python 3 version

Using nuxeo_spreadsheet

nuxeo_spreadsheet is a set of prototype Python scripts ("csv2dict") that import metadata from a tab-delimited spreadsheet into Nuxeo. Note that comma-separated value (CSV) spreadsheets are not supported ("csv2dict" is a misnomer).

1. Upload files to a Project Folder in Nuxeo

Once you have the prerequisites in place, upload your content files into a Project Folder in Nuxeo. Import the files through the Nuxeo UI, or use the bulk import options.

2. Get directory paths to the files

Next, generate a list of directory paths for the files in that Project Folder. You'll need the path to the Project Folder; it's reflected in the URL in your browser view of Nuxeo. Make sure you are in your Python environment (e.g., venv) and run this command:

nxls /asset-library/UCX/Project_folder --show-only-path

Optionally, append the > redirection operator followed by a .txt filename to write the list of directory paths to a .txt file in your current working directory (e.g., your home directory, C:\Users\yourname\):

nxls /asset-library/UCX/Project_folder --show-only-path > paths.txt 

If you're using miniconda within Windows, here's an overview of the process:

  • Open the Command Prompt from the Start menu
  • Activate your Python environment. In this example, we're activating a Python environment named "venv": activate venv
  • Run the command: nxls /asset-library/UCX/Project_folder --show-only-path
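
As a starting point for the next step, the saved paths can be turned into a skeleton tab-delimited file with one row per path. This is a minimal sketch and not part of nuxeo_spreadsheet: the function name paths_to_starter_sheet, the output file name, and the choice to write only a File Path column are assumptions made for illustration. It also assumes paths.txt is UTF-8 text with one path per line.

```python
import csv


def paths_to_starter_sheet(paths_file, sheet_file):
    """Write a tab-delimited file with a single "File Path" column.

    paths_file is assumed to hold one Nuxeo path per line, as saved by
    redirecting the nxls output to a .txt file. Returns the number of
    data rows written.
    """
    with open(paths_file, encoding="utf-8") as src:
        # Strip trailing newlines and skip blank lines.
        paths = [line.strip() for line in src if line.strip()]

    # newline="" lets the csv module control line endings itself.
    with open(sheet_file, "w", encoding="utf-8", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t")
        writer.writerow(["File Path"])
        for path in paths:
            writer.writerow([path])
    return len(paths)
```

Calling paths_to_starter_sheet("paths.txt", "starter.tsv") would produce a file that can be opened in a spreadsheet tool and extended with the remaining template columns.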

3. Create metadata in tab-delimited spreadsheet

Use the Nuxeo Spreadsheet Template. The first tab contains the template; the second tab provides an example for reference.

Note the following considerations:

  • We'd suggest saving a copy of the template in Google Sheets, and working directly in the Google Sheets format to build out the metadata. We do not recommend using Excel (.xlsx), based on our initial tests (Excel can add additional quotes around text, and also introduce errors with special characters).

  • The column headings in the tab-delimited spreadsheet need to exactly match the headings expected by the Python scripts constituting nuxeo_spreadsheet. You can double-check the headings by reviewing the columns.txt file in GitHub.

  • In cases where metadata elements are repeatable in Nuxeo, you can append a numeric indicator after the column heading. In the Nuxeo Tab-Delimited Spreadsheet Template, you can see examples of this for Creator. When using this function, note that you must include columns for all complex data fields (e.g., if repeating Creator information, the following fields must be in place: Creator # Name, Creator # Name Type, Creator # Role, Creator # Source, and Creator # Authority ID).

  • Each row in the spreadsheet can contain metadata for a simple object, a parent-level record for a complex object, or a component of a complex object. The main thing to ensure is that the row corresponds to the correct File Path in Nuxeo.

  • File Path (color-coded in red) is required for each row. In addition, Title, Type, Copyright Status, and/or Copyright Statement are required if and when the objects will be published in Calisphere. For additional information on the metadata requirements, see the Nuxeo user guide.

  • The File Path cell should contain the exact file directory path to the content file in Nuxeo, to be associated with the metadata record (e.g., "/asset-library/UCOP/nuxeo_tab_import_demo/ucm_li_1998_009_i.jpg").

  • If using Google Sheets directly to create your metadata records, note that some of the fields have validation rules. These fields are keyed to controlled vocabularies established in Nuxeo.

  • Once you've completed the process of creating metadata records using the template, save a copy as a tab-delimited file.
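
Some of the considerations above can be checked before import. The following is a rough sketch rather than part of nuxeo_spreadsheet. It assumes repeated Creator headings are written with a number in place of the #, as in "Creator 1 Name", and that the file is UTF-8; the function name check_sheet is hypothetical.

```python
import csv
import re

# Sub-fields that must all be present when Creator information is repeated.
CREATOR_FIELDS = {"Name", "Name Type", "Role", "Source", "Authority ID"}
# Assumed heading shape: "Creator 1 Name", "Creator 2 Role", ...
CREATOR_HEADING = re.compile(r"^Creator (\d+) (.+)$")


def check_sheet(sheet_file):
    """Return a list of human-readable problems found in a tab-delimited sheet."""
    problems = []
    with open(sheet_file, encoding="utf-8", newline="") as src:
        reader = csv.DictReader(src, delimiter="\t")
        headings = reader.fieldnames or []

        has_path = "File Path" in headings
        if not has_path:
            problems.append('missing required column "File Path"')

        # Group the Creator sub-fields by their numeric indicator.
        groups = {}
        for heading in headings:
            match = CREATOR_HEADING.match(heading)
            if match:
                groups.setdefault(match.group(1), set()).add(match.group(2))
        for number, found in sorted(groups.items()):
            for missing in sorted(CREATOR_FIELDS - found):
                problems.append(f'missing column "Creator {number} {missing}"')

        # Every row needs a File Path value. Row 1 is the heading row, so
        # data rows are counted from 2 (assuming no multi-line cells).
        if has_path:
            for row_number, row in enumerate(reader, start=2):
                if not (row.get("File Path") or "").strip():
                    problems.append(f"row {row_number}: empty File Path")
    return problems
```

An empty list means none of these particular checks failed; it does not confirm that the headings match columns.txt or that the paths exist in Nuxeo.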


If using Google Sheets, download the sheet as tab-separated values (.tsv).

4. Import metadata in tab-delimited spreadsheet into Nuxeo

Load the metadata with meta_from_csv.py. This script converts the metadata from the spreadsheet into Python dicts and calls pynux to import them directly into Nuxeo.

usage: meta_from_csv.py [-h] --datafile DATAFILE [-d] [--loglevel LOGLEVEL]
                        [--rcfile RCFILE]

optional arguments:
  -h, --help           show this help message and exit
  --datafile DATAFILE  tab-delimited spreadsheet input file -- required
  -d, --dry-run        dry run
  --blankout           blank out all fields not set in sheet

common options for pynux commands:
  --loglevel LOGLEVEL  CRITICAL ERROR WARNING INFO DEBUG NOTSET, default is
                       ERROR
  --rcfile RCFILE      path to ConfigParser compatible ini file
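
To illustrate the first half of that process, the sketch below reads a tab-delimited file into one Python dict per row using only the standard library. It is not the code in meta_from_csv.py and it does not call pynux; the function name rows_as_dicts, the UTF-8 encoding, and the choice to drop empty cells are assumptions made for illustration.

```python
import csv


def rows_as_dicts(sheet_file):
    """Yield one dict per spreadsheet row, keyed by column heading."""
    with open(sheet_file, encoding="utf-8", newline="") as src:
        for row in csv.DictReader(src, delimiter="\t"):
            # Keep only cells that hold a value. A None key marks cells
            # beyond the last heading, which are ignored here.
            yield {k: v for k, v in row.items() if k is not None and v}
```

Printing the first few dicts from an exported file is a quick way to confirm that the file really is tab-delimited: a comma-separated file would come back as a single long key per row.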

Note for Windows: you may need to run python meta_from_csv.py ... or edit the shebang. If you're using miniconda within Windows, here's an overview of the process:

  • Open the Command Prompt from the Start menu
  • Activate your Python environment. In this example, we're activating a Python environment named "venv": activate venv
  • Go to nuxeo_spreadsheet\csv2dict in your home directory, e.g.: cd C:\Users\yourname\nuxeo_spreadsheet\csv2dict
  • Run the command. In this example, the DATAFILE is the location of a tab-delimited file (named "tab-delimited-metadata.txt") that's on our Desktop: python meta_from_csv.py --datafile C:\Users\yourname\Desktop\tab-delimited-metadata.txt
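
The same invocation can be scripted. The sketch below builds the argument list from the options shown in the usage text and adds --dry-run by default, so a run can be tried with the script's dry-run option first. The helper name build_command is hypothetical and not part of nuxeo_spreadsheet.

```python
import sys


def build_command(datafile, dry_run=True, loglevel=None):
    """Return the argument list for running meta_from_csv.py."""
    # sys.executable is the interpreter of the active environment, so the
    # script's shebang line is not needed on Windows.
    command = [sys.executable, "meta_from_csv.py", "--datafile", datafile]
    if dry_run:
        command.append("--dry-run")
    if loglevel:
        command.extend(["--loglevel", loglevel])
    return command
```

Passing the returned list to subprocess.run(..., check=True) from the csv2dict directory would start the script; calling build_command with dry_run=False omits the flag for the real import.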

mets_example

Sample code for loading METS into Nuxeo
