jitsi / gsoc-ideas

Google Summer of Code ideas

gsoc-ideas's Introduction

Jitsi Desktop

Jitsi Desktop is a free, open-source audio/video and chat communicator that supports protocols such as SIP, XMPP/Jabber and IRC, and offers many other useful features.

Please do not confuse this project with Jitsi Meet, the online video conferencing solution with a free instance at https://meet.jit.si.

Support

Jitsi Desktop is the heritage of Jitsi Meet. While some components are still used in e.g. Jigasi, the project is not actively developed anymore. Improvements, bugfixes and builds are entirely based on community contributions.

Installation

Releases

Windows and macOS

Download the installers from GitHub releases.

Debian/Ubuntu

An APT repository is available at https://nexus.ingo.ch/jitsi-desktop/. Note the trailing slash at the end of the distro-name. This is required since the repository has no components.

deb https://nexus.ingo.ch/jitsi-desktop-unstable/ <distro>/

RPM Distros

Sorry, there are currently no rpm packages available.

Snapshots

Snapshot or pre-release builds are also available in additional repositories.

Windows and macOS: See https://github.com/jitsi/jitsi/releases
Debian/Ubuntu: https://nexus.ingo.ch/jitsi-desktop-unstable/

Helpful Resources

Contributing

Please, read the contribution guidelines before opening a new issue or pull request.

gsoc-ideas's Issues

GSOC -22, Speech to Text (ISSUE)

Hi Jitsi team,
I have used the Artyom speech-to-text API in the past, and it could be integrated well into this project. I am also attaching a video demo of a working prototype.

Why I am using artyom

It's completely free
Its accuracy is above that of the free model.
It processes the data and even rectifies any pronunciation errors with its ML model.
It converts the speech to text in real-time providing the user a lag-free & low latency experience.

Working prototype:

https://drive.google.com/file/d/15lmaRUiFTYdGFd-yCguBwRnsAr0I9J2d/view?usp=sharing

Speech to Text Feature.

Hello @saghul @nikvaessen!

I recently came across Jitsi while researching GSoC projects. Love the platform! I would like to contribute to the real-time speech-to-text feature. The options I have considered:

DeepSpeech2
Facebook's HuBERT (with no LM, it still gave a fast, quality output)
Improvements to Vosk.

To narrow these options down, it would help to know which specific areas need improvement.

Regards!

Change `prometheus-stats` to `prometheus-stats.md`

The file currently has no extension, so its Markdown source is shown as raw text and is hard to read. Adding the .md file extension will solve this.

Fix a typo in 2022 directory multiple files

In the cast-meeting.md, ios-pip-window.md, ml-audio-enhancemants.md, mobile-video-effects.md, react-native-sdk.md, spatial-audio.md and template.md files, there is a small typo: ooutcomes should be changed to outcomes.
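
A fix like this can be scripted rather than edited by hand. The following is a minimal sketch, assuming the affected files sit in a `2022` directory of a local checkout; the directory path and the helper names `fix_typo` and `fix_typo_in_files` are hypothetical and not part of the repository.

```python
from pathlib import Path

TYPO = "ooutcomes"
FIX = "outcomes"


def fix_typo(text: str) -> str:
    """Return the text with every occurrence of the typo replaced."""
    return text.replace(TYPO, FIX)


def fix_typo_in_files(directory: Path) -> list[str]:
    """Rewrite every .md file in `directory` that contains the typo.

    Returns the sorted names of the files that were changed.
    """
    changed = []
    for path in sorted(directory.glob("*.md")):
        original = path.read_text(encoding="utf-8")
        updated = fix_typo(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed


if __name__ == "__main__":
    # Assumed location of the 2022 ideas in a local clone.
    for name in fix_typo_in_files(Path("2022")):
        print(f"fixed {name}")
```

Running the script twice is harmless, because the corrected word no longer contains the typo.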

How do I post a new idea to the ideas list?

Hello there,
I have a new feature idea in mind but am not sure how to post it to the ideas list for GSoC 2022. Should I open a pull request for it directly, or should I simply post it here in the Issues section? Your guidance would be of great help.

ML audio enhancements project in GSoC' 2022 (Thread)

Hello @saghul & @hristoterezov, I'm Saurabh, a pre-final-year BTech student in India with good experience in ML and development. I'm interested in applying to GSoC for the ML audio enhancements project. I have started exploring the structure and flow of the Jitsi codebase, and I would like to have your suggestions before starting my proposal for this project.

Thanks

Had new idea that I think is worth sharing

Jitsi

Hi there, I just wanted to brainstorm something that I think is overlooked by many products such as Google Meet, and that I think could be one of the features that make Jitsi special.

If this idea doesn't make sense, please feel free to ignore it.

The idea

In Google Meet, a presenter can use a pointer while sharing their screen, but participants cannot point at the part of the screen they want to draw attention to. They have to describe the location in words every time, which is very challenging, especially in a coding presentation. As a solution, the presenter could grant participants access to use pointers from their end.

I am not really sure if it is a good idea, so please let me know what you think 😊, and I will explain more.

Hi I want to contribute to the Spatial Audio idea for Gsoc.

I want to contribute to the Spatial Audio project mentioned in the ideas list. How can I get in touch with the listed mentors, and is there a Discord or Slack channel for it?

ml_audio_enhancements

Hello,
I was looking at the ML audio enhancements project for Jitsi Meet. I just wanted to know what sort of outcomes are expected in a proposal for this project.
Thanks!

Can anyone Review my GSOC proposal for speech to text.

I have created my proposal. Please give me some feedback, @nikvaessen @saghul.
Birat Datta GSOC.pdf

Questions about language support for the speech-to-text project implementation

Hi @nikvaessen, what languages is Jitsi looking to add support for? Is the focus going to be on English or on multilingual support? Also, on a side note: I was browsing through the Wav2Vec2 documentation on HuggingFace and saw that the model sometimes predicts acoustically accurate but grammatically incorrect words or sentences. What is the expectation with regard to handling those cases?

Some examples of what I'm referring to taken from HuggingFace's blog post

Speech to Text

Hey... I was just exploring Speech to Text project ideas for GSoC 2022 and came across the JavaScript Web Speech API, which offers speech recognition. It seems to be quite accurate and works in real time, but the privacy policy is not very clear. However, it states somewhere:

Chrome currently takes the audio and sends it to Google's servers to perform the transcription.

I just wanted to know your views on this and whether it suits your requirements. I also tested the open-source DeepSpeech library on a local setup; the results are good enough for English but not very convincing for other regional languages. Do you consider it a viable solution? Looking forward to suggestions and feedback, @nikvaessen.

Regards
Rishabh

Multiple Recording Storage Providers - GSoC

Hello everyone! This is Jayanth, a second-year undergraduate at the Indian Institute of Technology, Madras. I am a software developer with strong expertise in JavaScript-based web development. I was exploring the projects put up for this year’s GSoC program, and the ‘Multiple Recording Storage Providers’ project by Jitsi caught my interest. I would like to know who the potential mentors are and what their plans for this project are. Any input from them would help me draft the proposal. I wish to make an impactful contribution to this project. Is anyone else planning to contribute to it? I'm open to discussion.

Speech-to-text GSOC project ! discussion

@nikvaessen, I was studying the backend implementation, Jigasi, and found a heading called Vosk Configuration. I read about it and found out that Vosk is an open-source speech recognition system. So in this project, will we have to find an open-source model other than Vosk? If yes, in which areas is Vosk lacking? Knowing this would help in finding the right open-source model, since my search turned up many others, such as DeepSpeech by Mozilla and OpenSeq2Seq by NVIDIA.

Speech-to-text with publicly available deep learning models

Hi!
I have read through the idea document for "Speech-to-text with publicly available deep learning models", but I feel one thing is missing there: will the inference servers have a GPU available to them? In other words, will Jigasi be running on a GPU-enabled instance?

I ask because the feasible model complexity depends on the resources available. Although we can run these heavy, high-performance models on CPU instances, we might not get the low latency the process requires. So, deciding the instance type early can help with the choice of models and so on.
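
One common way to make the latency concern concrete is the real-time factor (RTF): the time spent transcribing divided by the duration of the audio. An RTF below 1 means the model keeps up with live speech on the given hardware. The sketch below is only illustrative; `dummy_transcribe` is a hypothetical placeholder for whichever model is being evaluated, and the 16 kHz, 16-bit mono audio format is an assumption.

```python
import time
from typing import Callable


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent transcribing / duration of the audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(transcribe: Callable[[bytes], str], audio: bytes,
                audio_seconds: float) -> float:
    """Time one call to `transcribe` and return its real-time factor."""
    start = time.perf_counter()
    transcribe(audio)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, audio_seconds)


def dummy_transcribe(audio: bytes) -> str:
    """Hypothetical stand-in for a real model; does no recognition."""
    return "hello world"


if __name__ == "__main__":
    # 10 seconds of silence at an assumed 16 kHz, 16-bit mono format.
    audio = bytes(10 * 16000 * 2)
    rtf = measure_rtf(dummy_transcribe, audio, audio_seconds=10.0)
    print(f"RTF: {rtf:.6f} (below 1.0 keeps up with real time)")
```

Measuring the same model on a CPU instance and a GPU instance with this kind of harness would show whether the CPU option stays below an RTF of 1.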

Feature : Design Enhancement in the Jitsi Website.

Fixing the SEO, UI Design and Front-end Web Development of Jitsi

The audience of Jitsi largely consists of teachers, educators and students. At the heart of Jitsi are Jitsi Videobridge and Jitsi Meet, which let you hold conferences on the internet, while other projects in the community enable features such as audio, dial-in, recording, and simulcasting. One of the most requested features for Jitsi is the ability to engage large audiences, as other products on the market provide much the same functionality as Jitsi.

The current website works well but lacks some features and has a lot of scope for improvement. The main objective of this project proposal is to refresh the website and add functionality to it for a better user interface and user experience.

There have been instances of people migrating to platforms other than Jitsi because the poor user interface leads to a bad user experience on the website itself, before they even reach the actual product features. Through this project, I would like to understand exactly what users expect from a secure video conferencing environment and implement it in Jitsi. I’ll try to finish as many UI designs and their code implementations as I can during the Google Summer of Code coding period, and work on the rest afterwards as people keep suggesting better ones.

I am done with the proposal and the mockups for the new design, and would love to know whether this can be considered as a GSoC proposal.

Speech to text feature in GSoC 2022

Hi everyone,
Hope you’re all doing great!

My name is Shahryar Soltanpour, and I’m studying for an MSc in computer science at the University of Calgary. I’m very excited to join the Jitsi community. I want to contribute to the speech-to-text feature of Jigasi and I’m preparing my proposal for it. I would be more than happy to hear any advice from you about getting started, or any features and ideas that you want me to include in my proposal.

Thanks in advance

JITSI: Speech-to-text with publicly available deep learning models

Hi @nikvaessen, this is Bhargav B R. I have submitted my proposal for the Jitsi: speech-to-text project.

I have tested various publicly available open-source speech-to-text models, such as Mozilla’s DeepSpeech, Coqui STT, Facebook’s Flashlight, SpeechBrain and Kaldi, comparing the word error rate and inference speed of each. My GSoC proposal also includes the transcription results obtained by running these models against sample audio.
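
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and a model's output, divided by the number of reference words. The sketch below shows one standard way to compute it. It is not taken from the proposal, and the function name `word_error_rate` and the sample sentences are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1       # reference word missing from output
            insertion = current[j - 1] + 1   # extra word in the output
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    for output in ("the cat sat on the mat", "the cat sit on mat"):
        print(f"{output!r}: WER = {word_error_rate(reference, output):.3f}")
```

Because insertions count as errors, WER can exceed 1.0 when the output is much longer than the reference.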

To enable communication between Jigasi and the web server hosting the open-source model, I have designed a sample client-server application that considers both a WebSocket framework and a REST service architecture. It exposes the open-source speech-to-text model as a service that serves transcription results back to the client.
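
The REST variant of such a service could look roughly like the sketch below, which uses only the Python standard library. It is not the design from the proposal: the `/transcribe` endpoint, the JSON response shape, the port, and the `transcribe` stub are all assumptions made for illustration, and no real model is invoked. A WebSocket variant would need a third-party library and is not shown.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def transcribe(audio: bytes) -> str:
    """Hypothetical stand-in for a real speech-to-text model."""
    return f"<transcript of {len(audio)} bytes of audio>"


def build_response(path: str, body: bytes) -> tuple[int, bytes]:
    """Map a POST request to an HTTP status code and a JSON payload."""
    if path != "/transcribe":  # assumed endpoint name
        return 404, json.dumps({"error": "unknown endpoint"}).encode()
    if not body:
        return 400, json.dumps({"error": "empty audio"}).encode()
    return 200, json.dumps({"transcript": transcribe(body)}).encode()


class TranscriptionHandler(BaseHTTPRequestHandler):
    """Accepts raw audio bytes via POST and replies with JSON."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, payload = build_response(self.path, self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the service until interrupted; the port is an arbitrary choice."""
    HTTPServer((host, port), TranscriptionHandler).serve_forever()
```

Calling `serve()` starts the service locally; a client would then POST audio bytes to `http://127.0.0.1:8080/transcribe` and read the `transcript` field of the JSON reply. Keeping the request logic in `build_response` makes it testable without opening a socket.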

I went through the existing Jigasi implementation of speech-to-text conversion. I am very much interested in contributing to this project, making it a completely self-hosted solution for speech transcription. Please review my proposal.
Thank you.