First, thank you for sharing your experiment and code on brain waves.
Please note that my English is not very good, so some of my sentences may be incorrect.
We ran a replication experiment based on the provided code and achieved results similar to the performance reported in the paper.
We could not obtain Schoffelen's data, so we used only GWilliams.
While analyzing the experimental results, we found that most of the predicted (generated) sentences either match the reference exactly or get every word wrong.
My background is in natural language processing, and in my experience a generation model usually gets only some of the words in a sentence wrong.
However, in our experiments with the provided code, such partially wrong outputs are very rare.
We analyzed the data and found that the training and evaluation sets share the same sentences.
There are 23,339 training examples, but only 661 unique sentences.
Similarly, the evaluation data contains only 651 unique sentences among its 2,918 examples.
Moreover, all 651 unique sentences in the evaluation data also appear in the training data.
Every MEG path is unique and is not shared between the training and test data.
This is probably a consequence of how the dataset was generated: the same sentence was presented to multiple subjects, producing multiple MEG recordings per sentence.
We believe this data split makes an accurate evaluation difficult.
A pre-trained Whisper model can learn patterns in word sequences, so in this setting, once it correctly guesses the first word of a memorized sentence, it can easily predict all subsequent words.
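One way to avoid this leakage would be a sentence-disjoint split, where all recordings of a given sentence (from every subject) go to exactly one split. The sketch below is only an illustration under our assumptions; the `"sentence"` and `"meg"` field names are hypothetical placeholders, not the dataset's actual schema:

```python
import random
from collections import defaultdict

def sentence_disjoint_split(examples, test_frac=0.2, seed=0):
    """Split examples so no sentence appears in both train and test.

    `examples` is a list of dicts with a "sentence" key (field names
    here are illustrative; adapt them to the real jsonl schema).
    """
    # Group all recordings of the same sentence together.
    by_sentence = defaultdict(list)
    for ex in examples:
        by_sentence[ex["sentence"]].append(ex)

    # Shuffle the unique sentences and assign a fraction to the test set.
    sentences = sorted(by_sentence)
    random.Random(seed).shuffle(sentences)
    n_test = max(1, int(len(sentences) * test_frac))
    test_sents = set(sentences[:n_test])

    train = [ex for s, exs in by_sentence.items() if s not in test_sents for ex in exs]
    test = [ex for s, exs in by_sentence.items() if s in test_sents for ex in exs]
    return train, test

# Toy usage: two sentences, each recorded from two "subjects".
data = [{"sentence": s, "meg": f"subj{i}/{s}.npy"}
        for s in ("hello world", "good morning") for i in (1, 2)]
train, test = sentence_disjoint_split(data, test_frac=0.5)
# No sentence overlap between the two splits.
assert not {ex["sentence"] for ex in train} & {ex["sentence"] for ex in test}
```

With such a split, a model can no longer score well by memorizing word sequences; it must actually decode the MEG signal.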
Our simple data analysis code is shown below.
Also, since we were unable to obtain the Schoffelen data, could you tell us where it can be downloaded?
import jsonlines

train_data_path = "{data_path}/preprocess5/split1/train.jsonl"
val_data_path = "{data_path}/preprocess5/split1/val.jsonl"
test_data_path = "{data_path}/preprocess5/split1/test.jsonl"

def load_split(path):
    # Collect the sentence and MEG-file path of every example in a split.
    sentences, meg_paths = [], []
    with jsonlines.open(path, mode='r') as reader:
        for json_obj in reader:
            sentences.append(json_obj["sentence"])
            meg_paths.append(json_obj["eeg"]["path"])
    return sentences, meg_paths

train_data_sent, train_data_meg_path = load_split(train_data_path)
val_data_sent, val_data_meg_path = load_split(val_data_path)
test_data_sent, test_data_meg_path = load_split(test_data_path)

print("counting unique elements")
for name, sents, megs in [
    ("train", train_data_sent, train_data_meg_path),
    ("val", val_data_sent, val_data_meg_path),
    ("test", test_data_sent, test_data_meg_path),
]:
    print(name)
    print("sentence", len(sents))
    print("unique_sentence", len(set(sents)))
    print("meg", len(megs))
    print("unique_meg", len(set(megs)))
    print()

# Count how many test examples reuse a training sentence or MEG file.
# (Sets make the membership checks O(1) instead of scanning a list.)
train_sent_set = set(train_data_sent)
train_meg_set = set(train_data_meg_path)
same_sent = sum(sent in train_sent_set for sent in test_data_sent)
same_meg = sum(meg in train_meg_set for meg in test_data_meg_path)
print("number of test_data", len(test_data_sent))
print("counting of sentence in train-data", same_sent)
print("counting of meg in train-data", same_meg)
Result:
counting unique elements
train
sentence 23339
unique_sentence 661
meg 23339
unique_meg 23339

val
sentence 2917
unique_sentence 647
meg 2917
unique_meg 2917

test
sentence 2918
unique_sentence 651
meg 2918
unique_meg 2918

number of test_data 2918
counting of sentence in train-data 2918
counting of meg in train-data 0