Transcription of Video to text about undebate HOT 23 CLOSED

enciv commented on June 12, 2024

Transcription of Video to text

from undebate.

Comments (23)

tianchili11 commented on June 12, 2024 1

https://xd.adobe.com/view/628227f4-05fd-46f1-548c-83a661b5e7e3-14b3/
Here is the 2nd version.

from undebate.

epg323 commented on June 12, 2024 1

I really like the transcription tab; I feel like it uses our website real estate very efficiently and effectively.

from undebate.

ddfridley commented on June 12, 2024

After a candidate has recorded a video, take the video, use a natural-language transcription service to transcribe the speech into text with timestamps, and then attach the data to the database document for that participant. Which service? We are looking for your recommendation, but hopefully one that is free, or free at low volume; you get the idea. Accuracy is important, though.

If google, use something like this: https://cloud.google.com/speech-to-text/docs/async-time-offsets
so that we get the timestamps of the words, so that we can play them back with the video.
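
A minimal sketch of how the word timestamps might be flattened for playback. It assumes the response shape described in the Google docs linked above (`results[].alternatives[].words[]`, with each word carrying `startTime` and `endTime` as `{seconds, nanos}`). The helper names `durationToSeconds` and `flattenWords` are hypothetical and not part of any library.

```javascript
// Convert a protobuf-style Duration ({seconds, nanos}) to fractional seconds.
// `seconds` may arrive as a string or a number, so coerce with Number().
function durationToSeconds(d) {
  if (!d) return 0
  return Number(d.seconds || 0) + Number(d.nanos || 0) / 1e9
}

// Flatten a recognize response into [{word, start, end}], using only the
// top-ranked alternative of each result.
function flattenWords(response) {
  const out = []
  for (const result of response.results || []) {
    const best = (result.alternatives || [])[0]
    if (!best) continue
    for (const w of best.words || []) {
      out.push({
        word: w.word,
        start: durationToSeconds(w.startTime),
        end: durationToSeconds(w.endTime),
      })
    }
  }
  return out
}
```

A flat list like this could be stored next to each speaking URL, so the viewer can highlight the current word by comparing `start` and `end` with the video's playback time.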

app/api/create-participant.js is what gets called when the browser is submitting the recordings of a new participant.

An iota record for a participant looks like this:

 {
    "_id": {
        "$oid": "5d65a6b877fff400177d50ed"
    },
    "parentId": "5d6350b0e7179a084ef376b9",
    "subject": "Participant:School Board Candidate Conversation - Candidate Conversation",
    "description": "A participant in the following discussion:A prototype Candidate Conversation for schoolboard",
    "component": {
        "component": "MergeParticipants",
        "participant": {
            "speaking": [
                "https://res.cloudinary.com/hf6mryjpf/video/upload/v1566942893/5d5dc697d32514001766ca87-1-speaking20190827T215452394Z.webm",
                "https://res.cloudinary.com/hf6mryjpf/video/upload/v1566942898/5d5dc697d32514001766ca87-2-speaking20190827T215455964Z.webm",
                "https://res.cloudinary.com/hf6mryjpf/video/upload/v1566942903/5d5dc697d32514001766ca87-3-speaking20190827T215503161Z.webm"
            ],
            "name": "Will",
            "listening": "https://res.cloudinary.com/hf6mryjpf/video/upload/v1566942901/5d5dc697d32514001766ca87-2-nextUp20190827T215500659Z.webm"
        }
    },
    "userId": "5d5dc697d32514001766ca87"
}

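So implement something like speechToText of a videoURL:

// start with an empty list so push() has something to append to
obj.component.participant.speechToText = []
obj.component.participant.speaking.forEach(videoURL => {
  const text = speechToText(videoURL)
  obj.component.participant.speechToText.push(text)
})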
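
Since a real transcription call would almost certainly be asynchronous, a plain `forEach` would not wait for the results. Below is a minimal sketch of one way to handle that. It assumes a hypothetical async function `transcribe(videoURL)`, passed in by the caller, that resolves to the transcript for one video; `attachTranscripts` is also a made-up name.

```javascript
// Sketch: attach one transcript per speaking video to a participant record.
// `transcribe` is a hypothetical async function (videoURL -> transcript).
async function attachTranscripts(iota, transcribe) {
  const participant = iota.component.participant
  // Promise.all keeps the input order, so speechToText[i] matches speaking[i].
  participant.speechToText = await Promise.all(
    participant.speaking.map(videoURL => transcribe(videoURL))
  )
  return iota
}
```

Passing `transcribe` in as an argument keeps the choice of service open and makes the function easy to test with a stub.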

from undebate.

poornaraob commented on June 12, 2024

@epg323: Please post a quick update in the comment section covering the four key points below before today's meeting, for us to review:

Progress
Challenges / road blocks
Expected completion date
Availability during the week

from undebate.

epg323 commented on June 12, 2024

Created some api keys on gcloud for the transcribing.
Coronavirus created a lot of chaos. Will be working remotely the next few weeks.
I should be finished no later than this Friday.
I should be available all day this week after 8 am.

from undebate.

poornaraob commented on June 12, 2024

Update from 03/16 standup: Issue implementation is expected to complete by 03/20.

from undebate.

poornaraob commented on June 12, 2024

03/25: Poorna to check with Esaul about the implementation status. Also, check if he wants to pair program with DJ for this case.

from undebate.

poornaraob commented on June 12, 2024

https://www.npmjs.com/package/handbrake-js

from undebate.

epg323 commented on June 12, 2024

Pretty busy right now. It might be better if you pass this task to someone else. I am still willing to take on smaller tasks to remain involved with the project.

from undebate.

ddfridley commented on June 12, 2024

@epg323 check out https://github.com/EnCiv/smpreview

from undebate.

poornaraob commented on June 12, 2024

05/20:
Created transcription schema.
Need to work on making the API call upon recording.
Have to work with UI designer for integration.
To be complete by 05/27.

from undebate.

tianchili11 commented on June 12, 2024

https://xd.adobe.com/view/31159d3e-8ed2-4966-416b-b234103484d0-4724/
Hi. Here is the first version of the layout. There are more details about the transcription part. Should I go over it with Luis? BTW, this is a clickable prototype. Feel free to contact me with further questions.

from undebate.

poornaraob commented on June 12, 2024

05/27: To be made available by 05/28, 2:00 p.m. Pacific time.

from undebate.

ddfridley commented on June 12, 2024

David and @epg323 to meet tomorrow

from undebate.

poornaraob commented on June 12, 2024

06/10: Backend is ready. Now working on the front end. We will create a separate task for front-end related activities.
List the components here as tasks to ease our tracking - @epg323

Support for multiple speech segments
Design review to be done

from undebate.

poornaraob commented on June 12, 2024

06/17:
Dana is going to review and provide feedback
We need a separate ticket for front end related activities - Esaul @epg323

from undebate.

poornaraob commented on June 12, 2024

06/24: Esaul @epg323 to meet with David @ddfridley to discuss how to merge the iota record transcriptions into the viewer.

from undebate.

poornaraob commented on June 12, 2024

07/01: This is ready for testing. Needs to be merged after testing by David @ddfridley.

Delayed API to be investigated for long term @epg323

from undebate.

poornaraob commented on June 12, 2024

07/08: We found an issue with Google transcription for videos longer than 1 min. Need to work on another option: Cloudinary, or Google streaming with its 5 min limit (string).

from undebate.

djbowers commented on June 12, 2024

@epg323 and I had a meeting last Thursday 07/09 to discuss his current blocker on this ticket.

He was able to get Cloudinary to do the transcription automatically when the video is initially uploaded using the Google Transcription add-on, but he was having issues reading the file returned from Cloudinary, which is a .transcript file.

Turns out this is really just JSON with a custom extension, and the following script will download the JSON data and log it to the console.

const superagent = require('superagent')

// A .transcript file is plain JSON with a custom extension.
const url =
  'https://res.cloudinary.com/hrewc5ehd/raw/upload/v1594686544/5ebda7b58fb38e3ccbeff667-0-speaking20200714T002902197Z.transcript'

superagent
  .get(url)
  .then(res => {
    // Collect the response body chunk by chunk.
    let data = ''
    res.on('data', chunk => {
      data += chunk
    })
    // Once the body is complete, parse it as JSON and log it.
    res.on('end', () => {
      console.log(JSON.parse(data))
    })
  })
  .catch(err => console.log('Error: ' + err))

>>>
[
  {
    confidence: 0.8742577433586121,
    transcript: 'Testing testing one two, three, testing testing one, two three.',
    words: [
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object]
    ]
  }
]
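
The `[Object]` entries above are just `console.log` cutting off nested values at its default depth. A small sketch of how the full word objects could be printed with Node's built-in `util.inspect`; the helper name `formatDeep` is hypothetical, and no particular shape for the word objects is assumed.

```javascript
const util = require('util')

// Render a value with no depth limit, so nested objects are shown in full
// instead of being abbreviated to [Object].
function formatDeep(value) {
  return util.inspect(value, { depth: null })
}
```

In the script above, `console.log(formatDeep(JSON.parse(data)))` would then show each entry of `words` in full.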

Esaul said this was his only blocker so hopefully this will help to unblock him. I'm happy to schedule another meeting to discuss next steps if needed.

from undebate.

poornaraob commented on June 12, 2024

07/15: Dana has done the front-end refactoring (transcribe-frontend). With Google transcription you get double the free video transcription that Cloudinary offers. Google streaming is to be verified for 5 min. Work on both options; the decision can be made at a later stage.

Process flow for Cloudinary:
@epg323 Please update process followed

Process flow of Google streaming:
@epg323 Please update process followed

DJ and Esaul to pair program to fix this issue.

@epg323 Please update process followed in Cloudinary and google streaming.

from undebate.

djbowers commented on June 12, 2024

@epg323 and I just took a look at the Cloudinary transcription issue he is having.

We verified that we are correctly generating the URL for the .transcript file, but it appears that when a .transcript file first appears on Cloudinary, we get a 404 error when we try to download it. We are not sure if this is because of some inherent lag in the time it takes for Cloudinary to get the .transcript file back from Google, or if there is some bug on their end, so Esaul is going to write up a message to send to Cloudinary to get some feedback from their support team.
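
If the 404 does turn out to be lag rather than a bug, one possible workaround is to retry the download with a growing delay. A minimal sketch, assuming a hypothetical async function `download()` that resolves to an object with a numeric `status`. Note that superagent rejects on a 404 by default, so a wrapper around it would have to catch that error and return its status. The name `fetchWhenReady` and the default attempt count and delay are illustrative only.

```javascript
// Retry a download while the server answers 404, doubling the wait each time.
// `download` is a hypothetical async function resolving to { status, ... }.
async function fetchWhenReady(download, { attempts = 5, delayMs = 1000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const res = await download()
    // Any status other than 404 is returned to the caller as-is.
    if (res.status !== 404) return res
    // Wait before the next attempt, but not after the last one.
    if (i < attempts - 1) {
      await new Promise(resolve => setTimeout(resolve, delayMs * 2 ** i))
    }
  }
  throw new Error('transcript not available after ' + attempts + ' attempts')
}
```

Whether a few retries are enough depends on how long Cloudinary actually takes to produce the file, which is not known here.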

In the meantime, Esaul is going to move forward with the Google Streaming API until we can get support from Cloudinary. I will do the same for #226 if we do not hear back from Cloudinary quickly.

from undebate.

poornaraob commented on June 12, 2024

07/22: This is working using google streaming. Move to testing.

from undebate.
