ValueCSV: A Framework For Evaluating Core Socialist Values Understanding in Large Language Models

Overview

To address the potential social risks and safety challenges associated with Large Language Models (LLMs), human values alignment has been proposed as a key step towards responsible AI technology, with the goal of keeping LLMs' outputs consistent with human values. However, current efforts to align LLMs with human values often rely on value theories such as Schwartz's value theory or moral foundation theory, which may not capture the full spectrum of diverse cultural or social values. This paper explores the extent to which existing LLMs align with Core Socialist Values (CSV), a representative set of values in China, and uses them as a benchmark for evaluating values alignment. Our framework is publicly available here.


In this repository, we provide:

  • The ValueCSV dataset, with 5,000 samples annotated with Core Socialist Values, available here.

  • Code for training the CSV evaluator: the file Multi_evaluator.py trains a multi-label classifier, and the file Binary_evaluator.py trains separate binary classifiers.

  • The checkpoints of the trained ValueCSV evaluator.

  • A question dataset with 100 questions covering diverse CSV dimensions, for testing an LLM's CSV understanding ability, available here.

Definition of CSV

Core Socialist Values comprise 12 distinct values: Prosperity, Democracy, Civility, Harmony, Freedom, Equality, Justice, Rule of Law, Patriotism, Dedication, Integrity and Friendliness. These 12 value dimensions can be categorised into three higher-level groups, i.e., National level, Society level, and Personal level, as listed below:
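
As a minimal sketch, the three-level grouping can be expressed as a Python mapping. The assignment of values to levels below assumes the commonly cited grouping of the Core Socialist Values (four values per level, in the order they are listed above). The names `CSV_LEVELS`, `VALUE_TO_LEVEL` and `level_of` are illustrative and are not taken from the repository code.

```python
# Assumed grouping: four values per level, in the order given in the text.
CSV_LEVELS = {
    "National": ["Prosperity", "Democracy", "Civility", "Harmony"],
    "Society": ["Freedom", "Equality", "Justice", "Rule of Law"],
    "Personal": ["Patriotism", "Dedication", "Integrity", "Friendliness"],
}

# Reverse lookup: value name -> level name.
VALUE_TO_LEVEL = {
    value: level
    for level, values in CSV_LEVELS.items()
    for value in values
}


def level_of(value: str) -> str:
    """Return the higher-level group a value belongs to (hypothetical helper)."""
    return VALUE_TO_LEVEL[value]


if __name__ == "__main__":
    for level, values in CSV_LEVELS.items():
        print(f"{level} level: {', '.join(values)}")
```

A lookup such as `level_of("Rule of Law")` can then be used to aggregate per-value scores into per-level scores.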

Pretrained Model

We use both bert-base-chinese and chinese-roberta-wwm-ext-large as backbones of the evaluators.

Experiment Results

The comparison of different ValueCSV evaluators is shown in the table below:

Note that M.Value-BERT and M.Value-RoBERTa are both multi-label classifiers, while Value-BERT and Value-RoBERTa each consist of 12 separate binary classifiers.
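
The difference between the two setups shows up in how the training targets are encoded. The sketch below is an illustration only, not the logic of Multi_evaluator.py or Binary_evaluator.py: it assumes each annotated sample carries a list of value names, and the helper names `multi_hot` and `binary_targets` are hypothetical. A multi-label classifier is trained against one 12-dimensional 0/1 vector per sample, whereas each of the 12 binary classifiers sees a single 0/1 label for its own value.

```python
# The 12 values, in the order given in the definition of CSV.
CSV_VALUES = [
    "Prosperity", "Democracy", "Civility", "Harmony",
    "Freedom", "Equality", "Justice", "Rule of Law",
    "Patriotism", "Dedication", "Integrity", "Friendliness",
]


def multi_hot(labels):
    """Target for one multi-label classifier: a 12-dimensional 0/1 vector."""
    unknown = set(labels) - set(CSV_VALUES)
    if unknown:
        raise ValueError(f"unknown value(s): {sorted(unknown)}")
    return [1 if value in labels else 0 for value in CSV_VALUES]


def binary_targets(labels):
    """Targets for 12 separate binary classifiers: one 0/1 label per value."""
    return dict(zip(CSV_VALUES, multi_hot(labels)))


if __name__ == "__main__":
    sample_labels = ["Justice", "Rule of Law"]  # made-up annotation
    print(multi_hot(sample_labels))
    print(binary_targets(sample_labels))
```

With this encoding, the multi-label model would typically use one sigmoid output per value, while each binary model is trained and thresholded independently.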

Checkpoints

We also release the checkpoints of the ValueCSV evaluator Value-BERT on Google Drive, because Value-BERT outperforms the other ValueCSV evaluators.

Data License

We release the dataset under the following licenses:
