classtranscribe / WebAPI
Repository for the .NET Core backend for ClassTranscribe
License: Other
AsNoTracking is described here:
https://docs.microsoft.com/en-us/ef/core/querying/tracking
Hi developers,
I have the following issue when I sign in through UIUC Shibboleth:
POST https://classtranscribe.illinois.edu/api/Account/SignIn 405
Failed to get user data and auth token from backend Error: Request failed with status code 405
at e.exports (createError.js:17)
at e.exports (settle.js:19)
at XMLHttpRequest.f.onreadystatechange (xhr.js:63)
I wonder if there is anything I can do on my end to solve it.
Assuming all files use a random UUID / random hex name and are at least 6 characters long, use a three-level aa/bb/cc (hexadecimal) directory layout, so the number of entries per directory is 100-256 rather than 10000.
e.g. Filename 'f5d32a6c6452.txt' would be in directory f5/d3/2a
e.g. Filename '13254278678.txt' would be in directory 13/25/42
For filenames (excluding the extension) less than 6 characters, just append 6 underscores before calculating the directories.
Map uppercase to lowercase and only allow the digits 0-9 and letters a-z. Map all other characters to underscore.
ABC.txt will be in ab/c_/__
my music.mp3 will be in my/_m/us
Be sure to check that the subdirectories exist before saving the file; e.g. to create the file 'f5d32a6c6452.txt' you don't know whether f5/d3/2a, f5/d3, or even f5 already exists.
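The mapping can be sketched as follows. This is a minimal Python sketch of the stated rule (drop the extension, lowercase, map anything outside 0-9a-z to underscore, pad short names with underscores, split the first six characters into pairs); the function name is illustrative, not part of the codebase.

```python
import os

def shard_dirs(filename):
    """Return the 3-level shard directory (e.g. 'f5/d3/2a') for a stored file.

    Follows the stated rule: drop the extension, lowercase, map any
    character outside 0-9a-z to '_', pad names shorter than 6 characters
    with underscores, then split the first six characters into three
    2-character directory levels.
    """
    stem = os.path.splitext(filename)[0].lower()
    mapped = ''.join(
        c if c.isascii() and (c.isdigit() or 'a' <= c <= 'z') else '_'
        for c in stem
    )
    mapped = (mapped + '______')[:6]  # pad short names, keep first 6 chars
    return '/'.join(mapped[i:i + 2] for i in range(0, 6, 2))
```

Saving a file would then first call `os.makedirs(shard_dirs(name), exist_ok=True)`, which handles the "does f5/d3 already exist?" question in one call.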
Stage 1: Do this for all new arriving files
Stage 2: Remap existing resources i.e. update database of file-type resources to include subdirectory info
Related Comments: Eventually we may need to support remote files on other systems
Allow instructors fine-grained control over the visibility of their content.
The API schemas on the Swagger page are different from the actual data. For example, the schemas for the GET endpoints for Terms and Media are much more complicated than the actual data in the responses.
Most Tasks were already creating a DB context on the fly with
using (var context = CTDbContext.CreateDbContext())
The Transcription task was not; this leads to race conditions and update errors, which are apparent in the taskengine log.
Relevant commit -
d098fd5
protected async override Task OnConsume(string videoId, TaskParameters taskParameters)
{
    using (var _context = CTDbContext.CreateDbContext())
    {
example error log -
taskengine | Error occured in RabbitMQConnection Transcribe for message TaskObject(Data=052b9ac6-0d81-4e63-a203-d3f888a771cc; TaskParameters=TaskParameters(Force = False; Metadata = );
taskengine | System.InvalidOperationException: A second operation started on this context before a previous operation completed.
This is usually caused by different threads using the same instance of DbContext. For more information on how to avoid threading issues with DbContext,
see https://go.microsoft.com/fwlink/?linkid=2097913.
taskengine | at Microsoft.EntityFrameworkCore.Internal.ConcurrencyDetector.EnterCriticalSection()
...
taskengine | at TaskEngine.Tasks.TranscriptionTask.OnConsume(String videoId, TaskParameters taskParameters)
in /src/TaskEngine/Tasks/TranscriptionTask.cs:line 52
taskengine | at CTCommons.RabbitMQConnection.<>c__DisplayClass8_0`1.<b__0>d.MoveNext() in /src/CTCommons/RabbitMQ/RabbitMQConnection.cs:line 104
You can't create a course if the course number for a department does not exist.
Allow instructors to just enter a 1-5 digit number if it does not already exist.
Microsoft Cognitive Services is used by ClassTranscribe for automatic caption generation. This process takes about 0.7x the duration of the video. Often it is interrupted by a "ServiceTimeout" or "ConnectionFailure" exception, which results in restarting caption generation from scratch. A workaround is to resume from the last failure point.
Such as EPubs, images, and other user-generated content.
Currently the frontend uses the latest commit's SHA on the FrontEnd master branch to handle the browser cache.
But there's a race condition, because there's a window of time between a commit to master and when master is deployed. Some clients will see the new commit version while the old API is still deployed, and then won't know to clear the cache when the new version is deployed.
Every 5 hours the periodic task checks for transcription tasks that have not started (and includes logic to avoid multiple attempts). If there are many transcriptions, it is possible that a task from the previous check 5 hours earlier has still not started, so the queue can end up with two tasks for the same video.
Note that two transcriptions of the same video cannot run simultaneously, because the Key logic tracks which transcription is currently in progress; however, they can both run if they do not overlap in time.
The task itself contains no logic to check whether it should exit; there should be some checking.
There is no point adding a task that is already queued. However, since that check is not easy to implement until we have a better task manager, purging the existing queue and rebuilding it would be acceptable.
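The missing duplicate check can be sketched as a filter applied before enqueueing. The field names below are illustrative, not the actual model:

```python
def pending_transcriptions(videos, queued_ids):
    """Return only videos that still need a transcription AND are not
    already in the queue, so the periodic task never enqueues a video
    twice even if a previous batch has not started yet."""
    queued = set(queued_ids)  # O(1) membership checks
    return [v for v in videos
            if v['needs_transcription'] and v['id'] not in queued]
```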
The newly created playlist has its index defaulted to 0. It should instead be the number of playlists in the offering, so that the newly added playlist is placed at the end of the list.
MediaSpace generates its own captions, and it is possible to fetch them using the MediaSpace APIs. The task is to implement a feature that fetches MediaSpace's existing captions and converts them into the formats ClassTranscribe stores.
Add more "smarts" to caption generation
e.g. New Sentences should usually start a new caption line.
Beware of end-of-caption edge cases (there are many...)
See
https://github.com/classtranscribe/WebAPI/compare/MSToVtt
which was based on converting Angrave's word-to-captions Python code. See the heuristics here:
https://github.com/classtranscribe/PythonTools/blob/master/transcribe-cli/ms_json_to_caption.py
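The new-sentence heuristic can be sketched as follows, assuming word-level (text, start, end) tuples from the ASR output. This is a simplification of the linked Python heuristics, and the 42-character limit is illustrative, not the project's actual value:

```python
MAX_CAPTION_CHARS = 42  # illustrative limit, not the project's actual value

def words_to_captions(words, max_chars=MAX_CAPTION_CHARS):
    """Group (text, start, end) word tuples into caption lines.

    Heuristic sketch: start a new caption after sentence-ending
    punctuation, or when the line has reached max_chars.
    """
    captions, current = [], []
    for text, start, end in words:
        current.append((text, start, end))
        line = ' '.join(w[0] for w in current)
        ends_sentence = text.endswith(('.', '?', '!'))
        if ends_sentence or len(line) >= max_chars:
            captions.append({'text': line,
                             'start': current[0][1],
                             'end': current[-1][2]})
            current = []
    if current:  # flush trailing words (one of the end-of-caption edge cases)
        line = ' '.join(w[0] for w in current)
        captions.append({'text': line,
                         'start': current[0][1],
                         'end': current[-1][2]})
    return captions
```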
"You might also add a task to revise the database to allow for a department having several degrees. For example,
ISE has IE and SED
MATH has Math and ASRM
MechSE has ME and EM (Engineering Mechanics)"
Only downloading from YouTube playlists is currently supported. Add support for downloading from YouTube channels.
Auth0 caches credentials, so it is not possible to sign in as someone else without using a new private window.
We used to test APIs through the Swagger UI (/swag/index.html). It is easy to use, but it requires many repetitive manual operations for backend developers to conduct rigorous API testing. The test results are also not trustworthy, because responses are compared with expectations manually. So it is quite necessary to generate a client SDK and package the API-calling code.
Since the main purpose of the client SDK is API testing, we want to be able to generate a C# client SDK from the OpenAPI spec.
All EF queries are "tracking" by default, but this incurs some overhead.
We can use no-tracking queries for read-only queries: "They're quicker to execute because there's no need to set up the change tracking information." (https://docs.microsoft.com/en-us/ef/core/querying/tracking)
Because of the soft-delete feature used by ClassTranscribe, no row is ever deleted from the database; instead it is marked as inactive. This creates an issue when an already "deleted" row is added again, in which case a key conflict occurs.
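One way to avoid the conflict is an "upsert" that reactivates the soft-deleted row instead of inserting a duplicate key. This sketch uses a plain dict and an assumed isDeleted flag in place of the real EF Core entities and status field:

```python
def soft_delete_upsert(table, key, row):
    """Re-add a possibly soft-deleted row without a key conflict.

    'table' is a dict keyed by id, standing in for the real entity set;
    'isDeleted' is an assumed flag name. If the key already exists
    (possibly marked inactive), update and reactivate it instead of
    inserting a second row with the same key.
    """
    existing = table.get(key)
    if existing is not None:
        existing.update(row)
        existing['isDeleted'] = False  # reactivate instead of re-inserting
    else:
        table[key] = {**row, 'isDeleted': False}
```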
Various video watch bugs -
(paraphrased) "When a monitor is set to 16:9 (but not 16:10), CT crops the top and bottom of the videos."
(Suggest reproducing this bug in full screen mode before trying to fix it.)
"Have an easy way to navigate from one playlist to the next. Currently you have to go back to the home page, refind the course, and select the next playlist.
Speed selector issues. If I view part 1 of a lecture, when I go to part 2 the speed is not correct. It still displays the faster speed, but the video only plays at 1X.
Can't rewind from 9 or fewer seconds back to 0 with the arrow keys. To prevent going negative, it doesn't seek at all; it should seek back to 0.
"
The current API the frontend uses to get all the offerings of an instructor is /api/CourseOfferings/ByInstructor/{userId}. This API groups the offerings by course, but for the new interface this grouping is no longer needed and would take extra work to parse.
So we want to request a new API, /api/Offerings/ByInstructor, just like /api/Offerings/ByStudent, that returns all the offerings for an instructor in an array ordered by creation time. Also, like /api/Offerings/ByStudent, this API is expected to include the term data and department data inside each offering object.
Below is a single offering object obtained by calling /api/Offerings/ByStudent. It would be great if /api/Offerings/ByInstructor returned similar ones.
To get the data used for visualization, the frontend currently has to send multiple requests through the log APIs, and for popular courses like CS241 this usually takes about 30 seconds or more.
So maybe it's time to have dedicated APIs for each data visualization: api/.../{offeringId}. See issues #150 and #131 in FrontEnd (especially #150, where we need a large table of users joined with medias, and each cell represents how much time a user spent on that media).
Hence, we need an API that can return the following data: (1) the user's id and email; (2) an array of ALL the medias in this offering (not just the medias the user watched), ordered by their indices; (3) the watch time from the .timeupdate event type in last1Hr, last3Days, lastWeek, lastMonth, and total for this user (if a media is not watched by the user, all the values should be 0):
[
{
"user": {
"id" : "",
"email": ""
},
"medias": [
{
"mediaName": "",
"id": "",
"last1Hr": 12,
"last3Days": 12,
"lastWeek": 12,
"lastMonth": 12,
"total": 12
}
]
}
]
Migrate to sentence-level ASR output, but use a best-effort mapping of the word-based timings for the caption timing.
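A best-effort mapping could look like this sketch, which locates the sentence's first and last tokens in the word-level timing stream. The function and its matching strategy are illustrative assumptions, not the project's implementation:

```python
def sentence_timings(sentence, words):
    """Best-effort (start, end) timing for a sentence from sentence-level
    ASR output, given (word, start, end) tuples from the word-level output.

    Matches the sentence's first and last tokens against the word stream
    (ignoring case and trailing punctuation); falls back to the full span
    of the word stream if no match is found.
    """
    tokens = sentence.lower().split()
    norm = [w[0].lower().strip('.,?!') for w in words]
    first = tokens[0].strip('.,?!')
    last = tokens[-1].strip('.,?!')
    try:
        i = norm.index(first)                       # first occurrence
        j = len(norm) - 1 - norm[::-1].index(last)  # last occurrence
        return words[i][1], words[j][2]
    except ValueError:
        return words[0][1], words[-1][2]
```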
The API /api/WatchHistories/GetAllWatchedMediaForUser should not return deleted medias.
Work with UIUC Box IT admin to create and use an application Box key
Find a better way to log user event data rather than dumping to a SQL database.
Course Template and course number will no longer need to be linked to a Course; their only value will be to suggest drop-downs when creating a new course, but new courses are not limited by these suggestions. They can be updated automatically when courses are created.
Offerings should be promoted to contain all information (i.e. not refer to a Course Template). "Offerings" is not standard vernacular.
Rename existing Course to CourseTemplate
Rename Offering to Course
A playlist mixes two concepts: An upstream source and a list of videos to play to the user. Let's separate these out, so that a playlist for students can contain multiple videos from multiple sources.
Rename existing Playlist to VideoSource
Create a new playlist object that includes a list of video sources.
New Playlist object is a collection of VideoSources, and can include one off videos manually uploaded.
C# and Python code create random files; e.g. the C# code calls GetTmpFile() in CommonUtils.cs.
Later these files may be renamed (e.g. a "vtt" extension is added), so the files are not actually temporary at all.
Other files are temporary and (as far as I can tell!) may never be cleaned up (e.g. WAV files, database dumps).
See also #8
Related comments:
Should we create random filenames using Guid.NewGuid() instead of the convoluted C#, and something similar for the Python code?
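On the Python side, the equivalent of the Guid.NewGuid() suggestion might look like this sketch, where the final extension is chosen up front so no later rename is needed (the helper name is hypothetical):

```python
import uuid

def new_file_name(ext):
    """Random, collision-resistant filename with its final extension
    already attached, so the file never needs renaming later.
    uuid4().hex gives 32 lowercase hex characters."""
    return f"{uuid.uuid4().hex}.{ext.lstrip('.')}"
```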
Microsoft Cognitive Services allows supplying some domain words to improve the accuracy of transcriptions. The task is to figure out how this works and implement the feature within ClassTranscribe.
When working on issue 34, I found that Video.JsonMetadata is initialized to null. It would be better to initialize it, and all the other JObject fields defined in Model.cs, to an empty JObject from the start. This would save the effort of explicitly checking whether JsonMetadata is null.
See for example,
https://hintsandmemories.wordpress.com/2014/04/10/ffmpeg-libx264-tune/
https://trac.ffmpeg.org/wiki/Encode/H.264
https://videoblerg.wordpress.com/2017/11/10/ffmpeg-and-how-to-use-it-wrong/
https://forum.videohelp.com/threads/194088-Need-help-with-ffmpeg-2-pass-VBR-encoding
Ability to regenerate translations (all/some) from latest captions.
Also: Use domain words and tag them as do-not-translate
On -dev, a manually uploaded playlist had multiple entries of the same mp4 video. In the logs -
taskengine | 2020-09-23T21:17:20.694417274Z Stop recognition.
taskengine | 2020-09-23T21:17:21.767143311Z fail: TaskEngine.Tasks.TranscriptionTask[0]
taskengine | 2020-09-23T21:17:21.767229753Z Transcription Exception: at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.IdentityMap`1.ThrowIdentityConflict(InternalEntityEntry entry)
taskengine | 2020-09-23T21:17:21.767261793Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.IdentityMap`1.Add(TKey key, InternalEntityEntry entry, Boolean updateDuplicate)
taskengine | 2020-09-23T21:17:21.767287826Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.StartTracking(InternalEntityEntry entry)
taskengine | 2020-09-23T21:17:21.767314533Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.InternalEntityEntry.SetEntityState(EntityState oldState, EntityState newState, Boolean acceptChanges, Boolean modifyProperties)
taskengine | 2020-09-23T21:17:21.767340763Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.InternalEntityEntry.SetEntityStateAsync(EntityState entityState, Boolean acceptChanges, Boolean modifyProperties, Nullable`1 forceStateWhenUnknownKey, CancellationToken cancellationToken)
taskengine | 2020-09-23T21:17:21.767368708Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.EntityGraphAttacher.PaintActionAsync(EntityEntryGraphNode`1 node, CancellationToken cancellationToken)
taskengine | 2020-09-23T21:17:21.767394094Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.EntityEntryGraphIterator.TraverseGraphAsync[TState](EntityEntryGraphNode`1 node, Func`3 handleNode, CancellationToken cancellationToken)
taskengine | 2020-09-23T21:17:21.767420988Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.EntityEntryGraphIterator.TraverseGraphAsync[TState](EntityEntryGraphNode`1 node, Func`3 handleNode, CancellationToken cancellationToken)
taskengine | 2020-09-23T21:17:21.767447355Z at Microsoft.EntityFrameworkCore.DbContext.AddRangeAsync(IEnumerable`1 entities, CancellationToken cancellationToken)
taskengine | 2020-09-23T21:17:21.767470505Z at TaskEngine.Tasks.TranscriptionTask.OnConsume(String videoId, TaskParameters taskParameters) in /src/TaskEngine/Tasks/TranscriptionTask.cs:line 113
taskengine | 2020-09-23T21:17:21.767497730Z System.InvalidOperationException: The instance of entity type 'Caption' cannot be tracked because another instance with the same key value for {'Id'} is already being tracked. When attaching existing entities, ensure that only one entity instance with a given key value is attached. Consider using 'DbContextOptionsBuilder.EnableSensitiveDataLogging' to see the conflicting key values.
taskengine | 2020-09-23T21:17:21.767515693Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.IdentityMap`1.ThrowIdentityConflict(InternalEntityEntry entry)
taskengine | 2020-09-23T21:17:21.767529456Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.IdentityMap`1.Add(TKey key, InternalEntityEntry entry, Boolean updateDuplicate)
taskengine | 2020-09-23T21:17:21.767563141Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.StartTracking(InternalEntityEntry entry)
taskengine | 2020-09-23T21:17:21.767578217Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.InternalEntityEntry.SetEntityState(EntityState oldState, EntityState newState, Boolean acceptChanges, Boolean modifyProperties)
taskengine | 2020-09-23T21:17:21.767592264Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.InternalEntityEntry.SetEntityStateAsync(EntityState entityState, Boolean acceptChanges, Boolean modifyProperties, Nullable`1 forceStateWhenUnknownKey, CancellationToken cancellationToken)
taskengine | 2020-09-23T21:17:21.767606737Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.EntityGraphAttacher.PaintActionAsync(EntityEntryGraphNode`1 node, CancellationToken cancellationToken)
taskengine | 2020-09-23T21:17:21.767620650Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.EntityEntryGraphIterator.TraverseGraphAsync[TState](EntityEntryGraphNode`1 node, Func`3 handleNode, CancellationToken cancellationToken)
taskengine | 2020-09-23T21:17:21.767635010Z at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.EntityEntryGraphIterator.TraverseGraphAsync[TState](EntityEntryGraphNode`1 node, Func`3 handleNode, CancellationToken cancellationToken)
taskengine | 2020-09-23T21:17:21.767649082Z at Microsoft.EntityFrameworkCore.DbContext.AddRangeAsync(IEnumerable`1 entities, CancellationToken cancellationToken)
taskengine | 2020-09-23T21:17:21.767663209Z at TaskEngine.Tasks.TranscriptionTask.OnConsume(String videoId, TaskParameters taskParameters) in /src/TaskEngine/Tasks/TranscriptionTask.cs:line 113
The university list should not report UNK as a listed, editable university.
This could be filtered on the frontend, but it's probably best to filter on the backend.
https://github.com/classtranscribe/WebAPI/blob/master/CTCommons/MSTranscription/MSTranscriptionService.cs
Lines 51-55: the language setting for transcription is hard-coded. It could be moved into a settings file such as the environment file.
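A sketch of reading the languages from the environment instead. LANGUAGES and its default value are assumed names for illustration, not the project's actual configuration keys:

```python
import os

def transcription_languages(default='en-US'):
    """Read the comma-separated transcription language list from the
    environment instead of hard-coding it in MSTranscriptionService.
    'LANGUAGES' is an assumed variable name for this sketch."""
    return os.environ.get('LANGUAGES', default).split(',')
```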
Were there actual compile errors, or did I read that wrong? If there were, we should really start looking at how to use GitHub Actions to compile.
This might be a nice way to do it: https://github.com/actions/setup-dotnet
Originally posted by @robkooper in #49 (comment)
The server should create trace and exception messages using Application Insights.
However, it should print to normal output when it attempts to start Application Insights (in case it fails).
I just tried uploading 5 videos to a new playlist on -dev. They are all very short (3 seconds), recorded using Zoom as local mp4s: "This is video 1", "This is video 2", etc. Only the 5th video made it to the playlist, even after I refreshed the page. However, later attempts worked fine. Parts of the log that appear relevant -
taskengine | 2020-09-23T20:42:25.517947163Z fail: TaskEngine.Tasks.DownloadMediaTask[0]
taskengine | 2020-09-23T20:42:25.517987186Z DownloadLocalPlaylist failed. mediaId 891edecd-5312-4e4b-8277-d0d9d5fac371
taskengine | 2020-09-23T20:42:25.518001689Z System.IO.FileNotFoundException: Could not find file '/data/OF4AP98FR23H'.
taskengine | 2020-09-23T20:42:25.518014610Z File name: '/data/OF4AP98FR23H'
taskengine | 2020-09-23T20:42:25.518028095Z at System.IO.File.Move(String sourceFileName, String destFileName, Boolean overwrite)
taskengine | 2020-09-23T20:42:25.518041374Z at System.IO.File.Move(String sourceFileName, String destFileName)
taskengine | 2020-09-23T20:42:25.518054185Z at ClassTranscribeDatabase.Models.FileRecord.GetNewFileRecord(String filepath, String ext) in /src/ClassTranscribeDatabase/Models/FileRecord.cs:line 33
taskengine | 2020-09-23T20:42:25.518067570Z at TaskEngine.Tasks.DownloadMediaTask.DownloadLocalPlaylist(Media media) in /src/TaskEngine/Tasks/DownloadMediaTask.cs:line 237
taskengine | 2020-09-23T20:42:25.646355898Z fail: CTCommons.RabbitMQConnection[0]
taskengine | 2020-09-23T20:42:25.646396864Z Error occured in RabbitMQConnection DownloadMedia for message TaskObject(Data=891edecd-5312-4e4b-8277-d0d9d5fac371; TaskParameters=TaskParameters(Force = False; Metadata = );
taskengine | 2020-09-23T20:42:25.646412192Z System.Exception: DownloadMediaTask failed for mediaId 891edecd-5312-4e4b-8277-d0d9d5fac371
taskengine | 2020-09-23T20:42:25.974389435Z [x] DownloadMedia Received TaskObject(Data=9df32c4d-8b9b-4ad4-be79-da631045438e; TaskParameters=TaskParameters(Force = False; Metadata = );
taskengine | 2020-09-23T20:42:25.980614133Z info: TaskEngine.Tasks.DownloadMediaTask[0]
taskengine | 2020-09-23T20:42:25.980686236Z ConsumingCastle.Proxies.MediaProxy
taskengine | 2020-09-23T20:42:25.981420773Z fail: TaskEngine.Tasks.DownloadMediaTask[0]
taskengine | 2020-09-23T20:42:25.981452815Z DownloadLocalPlaylist failed. mediaId 9df32c4d-8b9b-4ad4-be79-da631045438e
taskengine | 2020-09-23T20:42:25.981466851Z System.IO.FileNotFoundException: Could not find file '/data/MCWLSGBKZHS9'.
taskengine | 2020-09-23T20:42:25.981480331Z File name: '/data/MCWLSGBKZHS9'
taskengine | 2020-09-23T20:42:25.981493182Z at System.IO.File.Move(String sourceFileName, String destFileName, Boolean overwrite)
taskengine | 2020-09-23T20:42:25.981505960Z at System.IO.File.Move(String sourceFileName, String destFileName)
taskengine | 2020-09-23T20:42:25.981519291Z at ClassTranscribeDatabase.Models.FileRecord.GetNewFileRecord(String filepath, String ext) in /src/ClassTranscribeDatabase/Models/FileRecord.cs:line 33
taskengine | 2020-09-23T20:42:25.981533484Z at TaskEngine.Tasks.DownloadMediaTask.DownloadLocalPlaylist(Media media) in /src/TaskEngine/Tasks/DownloadMediaTask.cs:line 237
taskengine | 2020-09-23T20:42:25.982630181Z info: TaskEngine.Tasks.ProcessVideoTask[0]
taskengine | 2020-09-23T20:42:25.982666442Z ConsumingCastle.Proxies.VideoProxy
taskengine | 2020-09-23T20:42:25.994436797Z fail: CTCommons.RabbitMQConnection[0]
taskengine | 2020-09-23T20:42:25.994477664Z Error occured in RabbitMQConnection DownloadMedia for message TaskObject(Data=9df32c4d-8b9b-4ad4-be79-da631045438e; TaskParameters=TaskParameters(Force = False; Metadata = );
taskengine | 2020-09-23T20:42:25.994817636Z System.Exception: DownloadMediaTask failed for mediaId 9df32c4d-8b9b-4ad4-be79-da631045438e
taskengine | 2020-09-23T20:42:25.994865290Z at TaskEngine.Tasks.DownloadMediaTask.OnConsume(String mediaId, TaskParameters taskParameters) in /src/TaskEngine/Tasks/DownloadMediaTask.cs:line 65
taskengine | 2020-09-23T20:42:25.996048953Z at CTCommons.RabbitMQConnection.<>c__DisplayClass8_0`1.<<ConsumeTask>b__0>d.MoveNext() in /src/CTCommons/RabbitMQ/RabbitMQConnection.cs:line 104
taskengine | 2020-09-23T20:42:25.996340493Z info: CTCommons.RabbitMQConnection[0]
taskengine | 2020-09-23T20:42:25.996372410Z [x] DownloadMedia Done TaskObject(Data=9df32c4d-8b9b-4ad4-be79-da631045438e; TaskParameters=TaskParameters(Force = False; Metadata = );
Reduce the need to hand-edit the .env file. Copy-pasting keys within files or across files should be unnecessary.
From an instructor - "The ClassTranscribe web site consistently appears to time out after about 100 seconds, when the progress bar reaches about 25%, whenever I try to upload a large video file (1.6GB) from home. I say "appears to time out" because there is no error message of any kind; the web site just returns to the previous page (with the "+ UPLOAD VIDEOS" button). If this timeout behavior is intentional, please either remove the time limit, increase it to 15 minutes, or at least display an error message. ("Took too long to upload; try again when you're on campus.")"
Notice there are several items to address here:
TODO: TaskEngine should use one RabbitMQ connection for the whole process, not one per task.
TODO: Take a deep dive into how this is actually working and document it.
It would also be useful to confirm that we are actually using async co-routines and not multiple threads.
TaskEngine does not explicitly create any threads; however, it certainly has a main loop that sleeps for a couple of hours while multiple message queues are concurrently serviced!
We may need to upgrade to the latest RabbitMQ C# client.
And empirically confirm that the RabbitMQ C# implementation with a prefetch count > 1 does not implement concurrency using multiple threads.
e.g. always print out a thread ID and print out how many threads are running
e.g. take a deep dive into the C# RabbitMQ source code.
_channel.QueueDeclare(.... );
_channel.BasicQos(prefetchSize: 0, prefetchCount: concurrency, global: false);
}
var consumer = new EventingBasicConsumer(_channel);
consumer.Received += async (model, ea) =>
{ ...
Captions should be repositioned if they overlap burned-in content.
After adding the new CourseId property to CourseDTO, the GetOfferingsByStudent API should also be updated.
Steps to reproduce -
docker exec -it pythonrpcserver sh
ipython
import kaltura
k = kaltura.KalturaProvider()
c2=k.getKalturaChannelEntries(167312872)
len(c2) # Returns 30 but 37 are listed on MediaSpace
https://mediaspace.illinois.edu/channel/CS+173+Summer+2020+AL1/167312872
The MediaSpace docs suggest that paging returns 25 results; however, the autogenerated client suggests a pageSize of 30.
https://www.kaltura.com/api_v3/testmeDoc/objects/KalturaFilterPager.html
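The likely fix is for getKalturaChannelEntries to page through all results instead of reading only the first page. A hedged sketch of the missing loop, where fetch_page stands in for the real Kaltura call that fills a KalturaFilterPager (pageIndex is 1-based in the Kaltura API) and returns one page of entries:

```python
def fetch_all_entries(fetch_page, page_size=500):
    """Collect every entry from a paged Kaltura-style list endpoint.

    fetch_page(page_index, page_size) is a stand-in for the real client
    call; we keep requesting pages until a short (or empty) page signals
    the end of the result set.
    """
    entries, page_index = [], 1
    while True:
        page = fetch_page(page_index, page_size)
        entries.extend(page)
        if len(page) < page_size:  # short page == last page
            return entries
        page_index += 1
```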
We will need an ffmpeg task to create the mp4
https://video.stackexchange.com/questions/22197/ffmpeg-how-to-add-several-subtitle-streams
We will want to add the audio description/enhanced text description track too.
(We could even add chapter markers from the epub data)
For discussion: how do we avoid littering the storage with old mp4s?
There are some tricky timing corner cases to work out (e.g. captions being updated while the mp4 is being regenerated).
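One plausible shape for the ffmpeg invocation, sketched as a Python helper that builds the argv. Using mov_text is the standard way to embed selectable subtitle tracks in an mp4 container; the surrounding task wiring, file names, and language codes are assumptions:

```python
def mp4_with_subtitles(video, subs, output):
    """Build an ffmpeg argv that muxes caption files into an mp4 as
    selectable subtitle streams.

    subs is a list of (path, iso639_2_lang) pairs, e.g.
    [('en.vtt', 'eng'), ('es.vtt', 'spa')]. Video and audio streams are
    copied unchanged; text tracks are converted to mov_text, the subtitle
    codec mp4 containers support.
    """
    cmd = ['ffmpeg', '-y', '-i', video]
    for path, _ in subs:
        cmd += ['-i', path]
    for i in range(len(subs) + 1):  # map the video input and every subtitle input
        cmd += ['-map', str(i)]
    cmd += ['-c', 'copy', '-c:s', 'mov_text']
    for i, (_, lang) in enumerate(subs):
        cmd += [f'-metadata:s:s:{i}', f'language={lang}']
    cmd.append(output)
    return cmd
```

The task would pass this list to subprocess.run; returning the argv rather than running it keeps the sketch testable.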
The new version of the ePub generator on the frontend can create multiple ePubs for a media, based on the ePub data for different languages.
For each media, we need API's to:
POST: api/EPubData/../{mediaId}
GET: api/EPubData/../{mediaId}
GET: api/EPubData/../{epubId}
PUT: api/EPubData/../{epubId}
DELETE: api/EPubData/../{epubId}
My idea is to make the ePub's id, title, filename, author, createdAt, isPublished, cover, and language columns in a database table row, and the chapters a JSON-formatted string column.
It could be an issue that each ePub's chapters object can be really huge, since some may contain hundreds of chapters/sub-chapters/images/texts.
Also, we need an API that can store an image uploaded by a user (some users will need images for their ePub other than the generated screenshots), e.g. POST: api/Image/ that returns the URL of the created image.
This is the structure used by frontend to build an ePub.
{
"id": "uuid str",
"title": "string",
"filename": "string",
"language": "string",
"author": "string",
"publisher": "string",
"cover": "image URL",
"isPublished": true,
"chapters": [
{
"id": "uuid str generated by frontend",
"title": "string",
"start": "string",
"end": "string",
"contents": [
"a piece of text",
{
"src": "image src",
"alt": "image alt",
"description": "image description/AD"
},
....
],
"items": [
{
"id": "uuid str generated by frontend",
"start": "string",
"end": "string",
"image": "image url",
"text": "string"
}
],
"subChapters": [
{
"id": "uuid str generated by frontend",
"title": "string",
"start": "string",
"end": "string",
"contents": [],
"items": [
{
"id": "uuid str generated by frontend",
"start": "string",
"end": "string",
"image": "image url",
"text": "string"
}
]
}
]
}
]
}
The API UserOfferings/AddUsers/{offeringId}/{roleName} cannot add a previously deleted email.