
Comments (5)

coolharsh55 commented on August 23, 2024

Hi. Thanks - I will check the updated tables. Meanwhile, below are my notes so far for the JSON.

Notes

Interpretations

  • all fields are mandatory unless otherwise noted
  • where a field is not applicable, it must explicitly state so and, where relevant, also provide a justification for the lack of applicability
  • if a field is not present, or its value is blank, then the state of the data source is ambiguous; a lack of information therefore does not mean anything and is an error
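The interpretations above could be checked mechanically. The following is a minimal sketch, not the project's actual validator: the function name `validate_fields`, the field names in the usage example, and the dict-based "not-applicable" marker are all hypothetical and chosen only for illustration.

```python
def validate_fields(record, mandatory_fields):
    """Return a list of error messages; an empty list means no errors found.

    A field that is absent or blank is an error, because its state is ambiguous.
    A field that explicitly states it is not applicable is accepted.
    """
    errors = []
    for field in mandatory_fields:
        if field not in record:
            errors.append(f"{field}: missing")
            continue
        value = record[field]
        is_blank_string = isinstance(value, str) and not value.strip()
        if value is None or is_blank_string or value == [] or value == {}:
            errors.append(f"{field}: blank")
    return errors


# Hypothetical record: "data_source" is explicitly marked as not applicable
# (with a justification), "purpose" is blank, and "data" is absent.
record = {
    "title": "Face test set",
    "purpose": "",
    "data_source": {"status": "not-applicable", "justification": "synthetic data"},
}
errors = validate_fields(record, ["title", "purpose", "data_source", "data"])
print(errors)
```

Only the blank and the absent field are reported; the explicit "not applicable" entry passes because it states its status instead of leaving it open.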

Namespaces

Datasheet

  • new:Datasheet to represent an instance of datasheet
  • metadata about the dataset is specified as follows
    • title using dcat:title
    • creator as in the entity that generated it using dcat:creator
    • publisher as in the entity 'publishing' it using dcat:publisher
    • see dcat:Dataset for more annotations, such as format and license
  • dpv:DataSource to represent the source of data, associated using dpv:hasDataSource
    • can have multiple values
    • examples of data source can be dpv:DataSubject to indicate data is collected from individuals, dpv:ThirdParty or dpv:Organisation to specify categories of entities from which data has been collected, or it can also be another dataset (using dcat:Dataset)
    • important to distinguish this concept from Data Source, which is about the origin of data, e.g. the data origin may be individuals but the data source may be a dataset
    • if the data source information is not available (but exists), then dpv:NotAvailable can be used, and if it is unknown then dpv:Unknown can be used
    • TODO: what does clear/unclear mean?
  • dpv:Purpose represents the purposes for which data can be used
    • can have multiple values
    • examples of purposes can be TODO provide examples of dataset purposes
    • purpose cannot be unknown or not applicable; if there is no fixed purpose and the dataset can be used for any purpose, then the value should be dpv:Purpose
    • the purpose may not be the end purpose of a process, e.g. the dataset purpose might be only to test an existing face recognition model for accuracy, whereas the end process of that model and system might be detecting faces for employee identity verification on premises
    • TODO: what does clear/unclear mean?
  • dpv:Data represents the categories of data contained within the dataset, which can be present implicitly, e.g. faces in a photograph, or explicitly, e.g. annotations of a person's characteristics in a photograph
    • can have multiple values
    • can be categorised as dpv:VerifiedData (i.e. asserted to be present) or dpv:UnverifiedData (i.e. unknown if present), or as dpv:DerivedData (i.e. present implicitly) or dpv:InferredData (i.e. assumed)
    • data can also be categorised as dpv:SensitiveData (requires additional safety mechanisms) or dpv:SpecialCategoryData (requires separate legal basis or permission to use)
    • data can be dpv:PersonalData or dpv:NonPersonalData; and also indicated to be dpv:SyntheticData
    • if personal data, it can be dpv:ExplicitlyIdentifyingPersonalData or dpv:IndirectlyIdentifyingPersonalData
    • TODO: how to express "labels" and source (e.g. manual) - should they be derived/inferred data as well?
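To make the notes above concrete, here is a rough sketch of how a datasheet might be assembled as a JSON-LD-style structure. It is an illustration under assumptions, not the agreed JSON layout: the helper `make_datasheet`, the property names `dpv:hasPurpose` and `dpv:hasData`, the flat list of data categories, and the IRI used for the `new` prefix are not given in the notes and are placeholders. The metadata keys reuse the prefixed names exactly as written in the notes.

```python
import json

# Prefix map. The "new" IRI is a hypothetical placeholder; the notes do not define it.
CONTEXT = {
    "dpv": "https://w3id.org/dpv#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "new": "https://example.org/new#",
}


def make_datasheet(title, creator, publisher, data_sources, data_categories,
                   purposes=None):
    """Build a datasheet dict following the notes; every field must be stated."""
    if not data_sources:
        # If the source exists but is not available, or is unknown, say so explicitly.
        raise ValueError("state a data source, or use dpv:NotAvailable / dpv:Unknown")
    if not data_categories:
        raise ValueError("state at least one data category")
    return {
        "@context": CONTEXT,
        "@type": "new:Datasheet",
        "dcat:title": title,
        "dcat:creator": creator,
        "dcat:publisher": publisher,
        "dpv:hasDataSource": list(data_sources),
        # No fixed purpose means the dataset can be used for any purpose: dpv:Purpose.
        "dpv:hasPurpose": list(purposes) if purposes else ["dpv:Purpose"],
        "dpv:hasData": list(data_categories),
    }


sheet = make_datasheet(
    title="Face test set",
    creator="Example Lab",
    publisher="Example Org",
    data_sources=["dpv:DataSubject"],
    data_categories=["dpv:PersonalData", "dpv:VerifiedData"],
)
print(json.dumps(sheet, indent=2))
```

Data sources, purposes, and data categories are all lists because each can have multiple values; a fuller version would likely attach the categorisations (verified, derived, sensitive, and so on) to individual data items rather than to one flat list.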

from data-and-ai-internship-2023.

NLSanyu commented on August 23, 2024

Noted.

We are currently working on updating the terminology used in the tables and the JSON files (for both datasheets and model cards) into the terminology used in DPV. We had to first complete both tables before we could start on that.

The updated tables are in the documents in the drive, so it would be great if you checked that and gave us some feedback, if any. The JSON files are within this repository under the Outputs folder.

Thanks.


NLSanyu commented on August 23, 2024

Feedback noted; we will update.


NLSanyu commented on August 23, 2024

Hi @coolharsh55,

UPDATE:
We have updated the table and the JSON as directed in your notes on this GitHub issue.

Thanks.


coolharsh55 commented on August 23, 2024

Hi - I have added comments. For the missing concepts, we need working definitions before we can decide how to represent them.

