audeering / audb
Manage audio and video databases
Home Page: https://audeering.github.io/audb/
License: Other
At the moment we create a single folder for each flavor and store media, tables, header, and the dependency files in it.
In principle this is not needed, as only the media files change between the flavors (and, in the current implementation, the db.meta['audb'] entry in the header).
audresample does not work on Windows at the moment and needs to be fixed in order to support mixing or resampling on Windows, see audeering/audresample#5
Since we store the file duration in the dependencies, we don't have to store it again in the tables of a database. However, we should add a usage example showing how to get the duration from there. Currently we only show how to get the total duration:
https://audeering.github.io/audb/load.html#metadata-and-header-only
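Such a usage example could look like the following sketch; the DataFrame here is a stand-in with the same duration column as the real dependency table:

```python
import pandas as pd

# stand-in for deps() with the columns of the real dependency table
df = pd.DataFrame(
    {'duration': [3.84, 4.61], 'type': [1, 1]},
    index=['audio/a.wav', 'audio/b.wav'],
)
# per-file duration in seconds
durations = df['duration'].to_dict()
# total duration as a timedelta
total = pd.to_timedelta(df['duration'].sum(), unit='s')
```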
For all audb.info functions that use the dependency file under the hood, it would be great to add a media and a tables option, as we have in audb.load(), in order to filter for only parts of the database. The following functions would be affected:
audb.info.bit_depths()
audb.info.channels()
audb.info.duration()
audb.info.formats()
audb.info.sampling_rates()
Currently, it's not possible to publish a database on two different repositories with the same version. This prevents ending up with different databases published under the same version. However, we maybe want to mirror a database to another repository. I propose to implement audb.mirror() for this use case.
Say you have two versions of a database, 1.0.0 and 1.0.1, and the second just changes something in the header.
If I load the second version, it will again download the audio data. Instead, we could copy it from the cache of the first version.
This would be especially useful for databases that grow over time.
At the moment it seems to be possible to publish a database whose tables contain absolute paths to the data without raising an error during publication (see https://gitlab.audeering.com/data/myai/-/issues/12).
This is unfortunate and we should see if we can add a check that only relative paths are allowed.
Try:
>>> audb.info.duration('audioset')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-141-86306742f7d6> in <module>
----> 1 audb.info.duration('audioset')
~/git/audeering/audb/audb/core/info.py in duration(name, version)
118 deps = dependencies(name, version=version)
119 return pd.to_timedelta(
--> 120 sum([deps.duration(file) for file in deps.media]),
121 unit='s',
122 )
TypeError: unsupported operand type(s) for +: 'float' and 'NoneType'
But if I inspect the entries in the dependency dataframe there are no missing values:
>>> deps = audb.dependencies('audioset')
>>> df = deps()
>>> df['duration'].isnull().sum()
0
>>> df['duration'].sum()
19879979.737141866
So maybe instead of doing
sum([deps.duration(file) for file in deps.media]),
we should just do
deps()['duration'].sum()
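A minimal sketch of that fix, using a stand-in DataFrame with the same duration column (not the actual audb implementation):

```python
import pandas as pd

def duration(df):
    # sketch of the proposed fix: sum the duration column directly,
    # which avoids per-file lookups that may return None
    return pd.to_timedelta(df['duration'].sum(), unit='s')

df = pd.DataFrame({'duration': [1.5, 2.5, 0.0]})
total = duration(df)
```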
Currently, audb.load seems to contact the repository server to download metadata even when the requested database/media are already present locally. It would be nice if there was a way to disable this behaviour, mainly to avoid the multi-second delay of fetching this metadata.
Conan provides something similar for its install command. Only if the opt-in flag --update is set will it always check the remote for newer versions. If the flag is not set and the requested artifacts can be provided by the local cache, the remote is not contacted at all and the command runs very quickly. Something similar (but maybe opt-out instead of opt-in) could work for audb.
If you do
audb.load('emodb', metadata_only=True)
instead of
audb.load('emodb', only_metadata=True)
it will not raise an error, but simply ignore the wrong argument.
This happens due to our backwards compatibility handling code.
It's not a big deal, but it is unfortunate as in this case it just downloads all the media files, which is not what the user intended.
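One possible remedy, sketched with hypothetical names (the deprecated set below is an assumption, not audb's real compatibility list): validate leftover keyword arguments and raise instead of silently ignoring them.

```python
def load(name, *, only_metadata=False, **kwargs):
    # sketch: compare unknown keyword arguments against the known
    # deprecated names and raise on anything unexpected
    deprecated = {'include', 'exclude'}  # hypothetical deprecated names
    unknown = set(kwargs) - deprecated
    if unknown:
        raise TypeError(
            f'load() got unexpected keyword argument(s): {sorted(unknown)}'
        )
    return name, only_metadata

load('emodb', only_metadata=True)    # fine
# load('emodb', metadata_only=True)  # would raise TypeError
```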
If one of the tables in your database is empty and you request format='wav' or a different format, the renaming of the files in the tables will fail with:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/scratch/shuber/envs/shuber_env/lib/python3.6/site-packages/audb/core/load.py", line 863, in load
_fix_media_ext(db.tables.values(), flavor.format, num_workers, verbose)
File "/scratch/shuber/envs/shuber_env/lib/python3.6/site-packages/audb/core/load.py", line 187, in _fix_media_ext
task_description='Fix format',
File "/scratch/shuber/envs/shuber_env/lib/python3.6/site-packages/audeer/core/utils.py", line 444, in run_tasks
results[index] = task_func(*param[0], **param[1])
File "/scratch/shuber/envs/shuber_env/lib/python3.6/site-packages/audb/core/load.py", line 179, in job
inplace=True,
File "/scratch/shuber/envs/shuber_env/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 830, in set_levels
if is_list_like(levels[0]):
File "/scratch/shuber/envs/shuber_env/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 4104, in __getitem__
return getitem(key)
IndexError: index 0 is out of bounds for axis 0 with size 0
If the database is stored in the cache and you request it with the version:
audb.load(database, version=version)
audb.load() should not need a connection to the backend, but just load the data from the cache.
This is not the case, as can be seen at line 666 in b0f6c30.
We should fix this, but it might make more sense to first tackle #46.
Earlier we had an implementation using https://github.com/frankenjoe/pandasgui.git for easily inspecting a table and playing back some audio.
I was wondering if we should instead implement a curses-based interface using urwid.
audresample needs to be fixed for macOS to support flavors that require mixing or resampling, see audeering/audresample#4
First reported in #63
Requesting single media files from a database takes much longer now:
audb 1.0.4
>>> # Starting with empty cache
>>> timeit.timeit('audb.load("msppodcast", version="2.3.0", media=["Audios/MSP-PODCAST_0001_0008.wav"], full_path=False, verbose=True)', number=1, setup="import audb")
37.210156934015686
>>> # Loading from cache
>>> timeit.timeit('audb.load("msppodcast", version="2.3.0", media=["Audios/MSP-PODCAST_0001_0008.wav"], full_path=False, verbose=True)', number=1, setup="import audb")
4.945569497998804
audb 1.1.0
>>> # Starting with empty cache
>>> timeit.timeit('audb.load("msppodcast", version="2.3.0", media=["Audios/MSP-PODCAST_0001_0008.wav"], full_path=False, verbose=True)', number=1, setup="import audb")
93.68407620198559
>>> # Loading from cache
>>> timeit.timeit('audb.load("msppodcast", version="2.3.0", media=["Audios/MSP-PODCAST_0001_0008.wav"], full_path=False, verbose=True)', number=1, setup="import audb")
39.96311020699795
As we can filter by media (files) when loading a database, I was wondering if we should provide an easy way to get all files that come with a database.
At the moment you can get that info with:
db = audb.load('emodb', only_metadata=True)
db.files
or faster with
deps = audb.dependencies('emodb')
deps.media
To get a list of all tables in a database you can do:
list(audb.info.tables('emodb'))
I was first thinking about audb.info.media('emodb'), but this exists already and returns information on the media type, as audb.info in general deals with header information and only a few functions access the dependency file instead, e.g. audb.info.channels().
So we could think about adding something like audb.info.files() or audb.list_media().
@agfcrespi also reported that he searched the documentation for files, not for media, when he was looking for such a function.
I am trying to use audb in our CI. I install it via pip install audb==1.1.2, but it appears it has an additional undocumented dependency on libsndfile:
import audb
OSError: sndfile library not found
I haven't tested it yet, but it seems the apt package libsndfile1 needs to be installed on Ubuntu: https://stackoverflow.com/questions/55086834/cant-import-soundfile-python
I think this should be documented or, ideally, the dependency included in the pip package if possible.
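Until then, a workaround on Ubuntu, following the linked Stack Overflow answer:

```shell
# The soundfile Python package needs the system libsndfile library;
# on Ubuntu it can be installed via apt
sudo apt-get update
sudo apt-get install -y libsndfile1
```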
I'm seeing this warning from time to time:
/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/site-packages/audb/core/load.py:180: FutureWarning: The default value of regex will change from True to False in a future version.
table.df.index = table.df.index.str.replace(cur_ext, new_ext)
Should we overwrite the default value to suppress it?
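For illustration, passing regex=False gives a literal replacement and silences the warning; this is a generic pandas sketch, not the audb code itself:

```python
import pandas as pd

idx = pd.Index(['a.wav', 'b.wav'])
# regex=False makes the replacement literal and pins the behaviour,
# so the FutureWarning about the changing default disappears
new = idx.str.replace('.wav', '.flac', regex=False)
```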
Before, we had a command line interface based on fire, but we removed it as we had several problems with it.
It would be nice to re-add a subset of those functions.
At the moment audb.load() and audb.load_to() are implemented more or less independently of each other, which makes no sense and is risky to maintain. We should try to share as much code between them as possible.
It is ok in audformat.Database to allow for empty author and license entries, but audb.publish() should raise an error if:
author is missing
license is missing

In the shared folder, databases have to be shared by different users, which is not working at the moment:
>>> audb.load("mgb5", version="1.0.0", cache_root="/data/audb")
PermissionError: [Errno 13] Permission denied: '/data/audb/mgb5/1.0.0/ebbb9037
as the data was downloaded before by another user.
In older versions of audb we handled this; I do not remember exactly how, but I guess it is related to these lines of code:
# Set permissions for to be stored files to the one from cache folder
current_permission = os.stat(cache_root).st_mode & 0o777
mask = 0o777 - current_permission
current_mask = os.umask(mask)
Currently publish() always checks if media was changed. To do that, the checksum of all media files in the database has to be calculated, which can take quite a while on large databases. However, most of the time only the metadata is changed and maybe new media is added; that existing media changes is a rather rare case. So I wonder if we should give the user the option to skip the test for altered media.
I tried to load audioset on compute4 and got:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-7-1d6b92c78f3c> in <module>
----> 1 db = audb.load(name, version=version, only_metadata=True, full_path=False, tables='Human-sounds.unbalanced-train')
~/.envs/audb/lib/python3.6/site-packages/audb/core/load.py in load(name, version, only_metadata, bit_depth, channels, format, mixdown, sampling_rate, tables, media, removed_media, full_path, cache_root, num_workers, verbose, **kwargs)
821 cache_root,
822 num_workers,
--> 823 verbose,
824 )
825
~/.envs/audb/lib/python3.6/site-packages/audb/core/load.py in _load_tables(tables, backend, db_root, db_root_tmp, db, version, cached_versions, deps, flavor, cache_root, num_workers, verbose)
502 version,
503 flavor,
--> 504 cache_root,
505 )
506 if cached_versions:
~/.envs/audb/lib/python3.6/site-packages/audb/core/load.py in _cached_versions(name, version, flavor, cache_root)
33
34 df = cached(cache_root=cache_root)
---> 35 df = df[df.name == name]
36
37 cached_versions = []
~/.envs/audb/lib/python3.6/site-packages/pandas/core/generic.py in __getattr__(self, name)
5139 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5140 return self[name]
-> 5141 return object.__getattribute__(self, name)
5142
5143 def __setattr__(self, name: str, value) -> None:
AttributeError: 'DataFrame' object has no attribute 'name'
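My guess at the cause (an assumption, not verified against the audb code): cached() may return an empty DataFrame without a name column, so the attribute access df.name raises. A minimal reproduction with a possible guard:

```python
import pandas as pd

# empty cache: a DataFrame without any columns
df = pd.DataFrame()
reproduced = False
try:
    df[df.name == 'emodb']  # no 'name' column -> AttributeError
except AttributeError:
    reproduced = True

# guarding with a column check and bracket access avoids the crash
if 'name' in df.columns:
    df = df[df['name'] == 'emodb']
```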
At the moment we have audbget to download single tables.
There are three changes that we might want to do:
switch to audb get instead
support the table argument of audb.load() directly
add a media argument

At the moment audb.info.header()
always downloads the header file from the backend,
but it might be that the header is already stored in the cache folder.
So, in principle we could load it from there.
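A sketch of such a cache-first lookup; the cache layout and the download callback are hypothetical:

```python
import os
import tempfile

def load_header(cache_root, name, version, download):
    # sketch: look in the cache first and only contact the backend
    # via `download` if the header file is missing
    path = os.path.join(cache_root, name, version, 'db.yaml')
    if os.path.exists(path):
        return path
    return download(name, version)

# usage with a dummy backend callback
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'emodb', '1.0.0'))
open(os.path.join(root, 'emodb', '1.0.0', 'db.yaml'), 'w').close()
path = load_header(root, 'emodb', '1.0.0', lambda n, v: None)
```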
If you have stored your wav files using upper case letters, e.g. WAV or Wav, loading the wav flavor might fail: audb knows from the dependency file that the format is already WAV and will not convert the data, but it will still rename all the table entries to lower case letters.
Here is an example of a corresponding dependency file of aspeechdb:
>>> deps = audb.dependencies('aspeechdb', version='1.0.0')
>>> deps()
archive bit_depth channels checksum duration format removed sampling_rate type version
db.dev.csv dev 0 0 d2978b3411c3f4ab69c5577e5d06c1ba 0.000 csv 0 0 0 1.0.0
db.test.csv test 0 0 4c8cbb71a5e7ffe60b5fb3c5ba45debb 0.000 csv 0 0 0 1.0.0
db.train.csv train 0 0 2f97e2352310462ae397b8ad3bffd1b7 0.000 csv 0 0 0 1.0.0
audio/1.1.WAV audio 16 1 fd2045cd60e1b9b12e84b5862c5f835f 3.840 wav 0 16000 1 1.0.0
audio/1.10.WAV audio 16 1 7e726d6eacd6dc52cff3d26e078c7da7 4.608 wav 0 16000 1 1.0.0
... ... ... ... ... ... ... ... ... ... ...
audio/99.79.WAV audio 16 1 afc7d8211fb5a31a635293a95bdbfa68 4.096 wav 0 16000 1 1.0.0
audio/99.78.WAV audio 16 1 8c802aef907927a4dfa319edb5b22a6b 4.608 wav 0 16000 1 1.0.0
audio/99.8.WAV audio 16 1 74fe29eecd8f2ac01a04b4fe00a0690b 3.840 wav 0 16000 1 1.0.0
audio/99.9.WAV audio 16 1 cd18866eaf082a01d9b6ec1df4de9c2d 4.096 wav 0 16000 1 1.0.0
audio/99.80.WAV audio 16 1 cf14e49c36e809bc637ba92d8df274eb 4.608 wav 0 16000 1 1.0.0
[16404 rows x 10 columns]
As we need to load the dependency table for nearly every operation we do in audb, it would be nice to speed this up.
E.g. for audioset, running audb.Dependencies.load() takes around 130 s.
The problem is we don't have that many options; we already specify datatypes for every column of the corresponding CSV file.
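One option might be to cache the parsed table in a binary format next to the CSV; a sketch (file names and layout are assumptions):

```python
import os
import tempfile

import pandas as pd

def load_dependencies(csv_path):
    # sketch: cache the parsed table as a pickle next to the CSV;
    # loading the pickle avoids re-parsing the CSV on every call
    pkl_path = csv_path + '.pkl'
    if os.path.exists(pkl_path):
        return pd.read_pickle(pkl_path)
    df = pd.read_csv(csv_path, index_col=0)
    df.to_pickle(pkl_path)
    return df

csv_path = os.path.join(tempfile.mkdtemp(), 'db.csv')
pd.DataFrame({'duration': [1.0]}, index=['a.wav']).to_csv(csv_path)
df1 = load_dependencies(csv_path)  # parses the CSV, writes the pickle
df2 = load_dependencies(csv_path)  # served from the pickle
```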
As the cache folder is given by the flavor, it can happen that two users download to the same folder at the same time.
This can fail at the moment, e.g.
FileNotFoundError: [Errno 2] No such file or directory: '/data/audb/projectsmile-salamander-agent-tone/12.4.1/e2677cd6~/data/2020_09_08/7af930bb3f0645be933bc717826e9635_
7KiW/7af930bb3f0645be933bc717826e9635_7KiW.wav' -> '/data/audb/projectsmile-salamander-agent-tone/12.4.1/e2677cd6/data/2020_09_08/7af930bb3f0645be933bc717826e9635_7KiW/$
af930bb3f0645be933bc717826e9635_7KiW.wav'
as we don't lock the temporary (and the cache?) folder.
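A possible direction is a simple lock file created atomically with O_EXCL; this is a sketch, not the proposed audb implementation:

```python
import os
import tempfile
import time

def acquire_lock(lock_path, timeout=60.0, poll=0.1):
    # sketch: create the lock file atomically with O_EXCL; the process
    # that succeeds owns the folder until it removes the file again
    deadline = time.time() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return
        except FileExistsError:
            if time.time() > deadline:
                raise TimeoutError(f'could not lock {lock_path}')
            time.sleep(poll)

def release_lock(lock_path):
    os.remove(lock_path)

lock = os.path.join(tempfile.mkdtemp(), '.lock')
acquire_lock(lock)
# ... download / convert files ...
release_lock(lock)
```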
The following functions all have dict as return type, but the correct return type is audformat.core.common.HeaderDict:
audb.info.media()
audb.info.meta()
audb.info.raters()
audb.info.schemes()
audb.info.splits()
audb.info.tables()
There are two solutions:
1. document audformat.core.common.HeaderDict as the return type
2. convert the result to dict

The advantage of 1. would be that the result remains identical to calling e.g. db.media; its disadvantage is that audformat.core.common.HeaderDict is not documented.
The advantage of 2. would be that it returns a well known type; the disadvantage of 2. is that it would no longer be identical to calling e.g. db.media.
It is totally fine to publish different versions of a database on different backends. For example, emodb 0.2.2 and 1.0.1 are stored on Repository('data-public-local', 'https://artifactory.audeering.com/artifactory', 'artifactory'), whereas version 1.1.0 is published on Repository('data-public', 'https://audeering.jfrog.io/artifactory', 'artifactory'). But if you request the list of available databases:
>>> df = audb.available(only_latest=True)
>>> df.loc['emodb']
backend artifactory
host https://artifactory.audeering.com/artifactory
repository data-public-local
version 1.1.0
Name: emodb, dtype: object
it shows the wrong repository.
This happens because we set backend information and database version independently of each other in audb.available():
if name not in match:
match[name] = {
'backend': repository.backend,
'host': repository.host,
'repository': repository.name,
'version': [],
}
match[name]['version'].append(version)
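A possible fix, sketched with stand-in objects and a simplified string-based version comparison: record the repository per version, so the latest version reports the repository it was actually published to.

```python
from collections import namedtuple

Repository = namedtuple('Repository', ['name', 'host', 'backend'])

# sketch: store backend information per version instead of once per name
match = {}

def add(name, version, repository):
    match.setdefault(name, []).append({
        'backend': repository.backend,
        'host': repository.host,
        'repository': repository.name,
        'version': version,
    })

add('emodb', '1.0.1', Repository(
    'data-public-local',
    'https://artifactory.audeering.com/artifactory',
    'artifactory',
))
add('emodb', '1.1.0', Repository(
    'data-public',
    'https://audeering.jfrog.io/artifactory',
    'artifactory',
))
# pick the entry of the latest version (string comparison for brevity)
latest = max(match['emodb'], key=lambda e: e['version'])
```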
When loading a large database (e.g. voxceleb) it might take over 60 seconds before audb shows the first progress bar after printing the text message:
Get: voxceleb2-videos v1.0.0
Cache: /data/work3/hwierstorf/audb/voxceleb2-videos/1.0.0/2208f75e
We should try to speed up audb or, if that is not possible, maybe show another progress bar or a text message indicating that audb is doing something.
If you try to resample MP4 files but do not specify format='wav' or format='flac', you will get an error that you have to specify it.
However, this error only appears after all the data has been downloaded. It seems much more convenient to raise this error before downloading.
It should be possible, as we store the format inside the database dependency files.
As it can take a long time to download the dependency file of a big database, and it is loaded by audb.load() anyway, we should also cache it with audb.dependencies().
We should raise an error instead of returning empty tables when requesting a non-existing table.
E.g. at the moment we get:
>>> db = audb.load('emodb', tables='noise', verbose=False)
>>> db.files
Index([], dtype='object', name='file')
>>> list(db.tables)
[]
Instead it should raise an error and maybe present a list of available tables, which could be generated with:
>>> list(audb.info.tables('emodb'))
['emotion', 'files']
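A sketch of such a check (the function name is hypothetical):

```python
def check_tables(requested, available):
    # sketch: fail early and point the user at the available tables
    missing = sorted(set(requested) - set(available))
    if missing:
        raise ValueError(
            f'Could not find table(s) {missing} in the database. '
            f'Available tables: {sorted(available)}'
        )

check_tables(['emotion'], ['emotion', 'files'])  # passes silently
# check_tables(['noise'], ['emotion', 'files'])  # raises ValueError
```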
The search is broken in the documentation, e.g.: https://audeering.github.io/audb/search.html?q=full_path&check_keywords=yes&area=default gives a JavaScript error:
Uncaught ReferenceError: Stemmer is not defined
query https://audeering.github.io/audb/_static/searchtools.js:158
setIndex https://audeering.github.io/audb/_static/searchtools.js:98
<anonymous> https://audeering.github.io/audb/search.html?q=full_path&check_keywords=yes&area=default line 2 > injectedScript:1
We had this issue a while ago with other packages and it was fixed there by updating audeering-sphinx-theme or one of the other packages if I remember correctly.
It might be possible to download single media files also for the case that they are stored in an archive.
The only problem is that this needs to be supported somehow by the backend.
For example, in Artifactory you can do something like this:
r = audfactory.rest_api_get(
    f'{host}/{repository}/{name}/media/{archive}/{version}/'
    f'{archive}-{version}.zip!/(unknown)'
)
with open(dst_filename, 'wb') as fp:
    fp.write(r.content)
In the tests we are getting the following warnings at the moment (Python 3.7), copied from https://github.com/audeering/audb/pull/40/checks?check_run_id=2444852881
/home/runner/work/audb/audb/audb/core/load.py:127: FutureWarning: The default value of regex will change from True to False in a future version.
  table.df.index = table.df.index.str.replace(cur_ext, new_ext)
/home/runner/work/audb/audb/audb/core/load.py:130: FutureWarning: The default value of regex will change from True to False in a future version.
  table.df.index.levels[0].str.replace(cur_ext, new_ext),
/home/runner/work/audb/audb/audb/core/load.py:132: FutureWarning: inplace is deprecated and will be removed in a future version.
  inplace=True,
/home/runner/work/audb/audb/audb/core/load.py:156: FutureWarning: inplace is deprecated and will be removed in a future version.
  root + table.df.index.levels[0], 'file', inplace=True,
Not sure if the regex warnings need any action, but I list them here.
It seems that audb.load() always creates a temporary folder, even if it is not needed, e.g. when the database was already completely downloaded. Usually a user will not notice, unless she is missing write permissions, e.g. when reading from the shared cache. So it would be safer to only create a temporary folder if it is actually needed.
When we store a database in the cache, we also cache its dependency file.
This means that when calling audb.dependencies() there is no need to always download the dependency table from the backend; instead we could first look into the cache.
Make sure we are not starting a publishing process that will fail and leave behind a corrupted database.
After updating to audb 1.1.0, the following fails with an exception:
b = audbenchmark.load(
name='arousal',
subgroup='ser.msppodcast.regression',
version='1.0.0',
verbose=True
)
b.load_test_set()
Get: msppodcast v1.0.1
Cache: /media/chausner/Linux/Secured/audb/msppodcast/1.0.1/854d7d2f
  0%| [00:00<?] Missing tables
  0%| [00:00<?] Cached files
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/chausner/.local/lib/python3.6/site-packages/audbenchmark/core/benchmark.py", line 266, in load_test_set
return self._load_set(self.test_set)
File "/home/chausner/.local/lib/python3.6/site-packages/audbenchmark/core/benchmark.py", line 695, in _load_set
_, y = d()
File "/home/chausner/.local/lib/python3.6/site-packages/audbenchmark/core/data.py", line 72, in __call__
db, data = self._call()
File "/home/chausner/.local/lib/python3.6/site-packages/audbenchmark/core/data.py", line 189, in _call
return self._columns._call()
File "/home/chausner/.local/lib/python3.6/site-packages/audbenchmark/core/data.py", line 332, in _call
verbose=self.verbose,
File "/home/chausner/.local/lib/python3.6/site-packages/audb/core/load.py", line 737, in load
verbose,
File "/home/chausner/.local/lib/python3.6/site-packages/audb/core/load.py", line 451, in _get_tables_from_cache
task_description='Copy tables',
File "/home/chausner/.local/lib/python3.6/site-packages/audeer/core/utils.py", line 441, in run_tasks
disable=not progress_bar,
File "/home/chausner/.local/lib/python3.6/site-packages/audeer/core/tqdm.py", line 100, in progress_bar
leave=config.TQDM_LEAVE,
File "/home/chausner/.local/lib/python3.6/site-packages/tqdm/_tqdm.py", line 945, in __init__
self.display()
File "/home/chausner/.local/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1335, in display
self.sp(self.__repr__() if msg is None else msg)
File "/home/chausner/.local/lib/python3.6/site-packages/tqdm/_tqdm.py", line 979, in __repr__
return self.format_meter(**self.format_dict)
File "/home/chausner/.local/lib/python3.6/site-packages/tqdm/_tqdm.py", line 452, in format_meter
return bar_format.format(bar='?', **format_dict)
KeyError: 'percentage'
After downgrading to audb 1.0.4, it works again.
See:
>>> deps = audb.dependencies('kit01')
>>> deps()
archive bit_depth channels checksum duration format removed sampling_rate type version
db.files.csv files 0 0 aa46f52940b57000779ddb63b375da35 0.0 csv 0 0 0 1.0.0
db.emotion.csv emotion 0 0 a5301d2fd6744287df9e8f61f8734bc8 0.0 csv 0 0 0 1.0.0
data/2N5YF_action_prompt_angry_bordcomputer_zoo... kit01-data 0 0 02d7854ab410b7ce47fdaacb05e8d2e3 0.0 mp3 0 0 1 1.0.0
data/2N5YF_action_prompt_angry_connected_drive_... kit01-data 0 0 1e47f6130648091be7f542174bae3dff 0.0 mp3 0 0 1 1.0.0
data/2N5YF_action_prompt_angry_fahrzeugstatus_z... kit01-data 0 0 a42a3294d985597da7ccb092a6a09f72 0.0 mp3 0 0 1 1.0.0
... ... ... ... ... ... ... ... ... ... ...
data/Z1XVJ_dialog_prompt_surprised2_vier_zoom_c... kit01-data 0 0 a858311a1e0a34b9dd62f7dfa733dbb7 0.0 mp3 0 0 1 1.0.0
data/Z1XVJ_dialog_prompt_surprised2_zwei_zoom_c... kit01-data 0 0 e75227d19a940bca27ba96039bb228eb 0.0 mp3 0 0 1 1.0.0
data/Z1XVJ_dialog_prompt_surprised2_menue_zoom_... kit01-data 0 0 3f0ac983669e6ecbd9cfaa5cfee27fad 0.0 mp3 0 0 1 1.0.0
data/Z1XVJ_dialog_prompt_surprised2_vorherige_z... kit01-data 0 0 cbbae5a9ec9fb1533bc257c07d08b475 0.0 mp3 0 0 1 1.0.0
data/Z1XVJ_dialog_prompt_surprised2_zurueck_zoo... kit01-data 0 0 8764cf91e87517033fd00fad0d859b92 0.0 mp3 0 0 1 1.0.0
[6077 rows x 10 columns]
I guess this can happen if we publish on a device that does not have MP3 support when using audiofile.
import audb
db = audb.load(
'testdata',
tables='emotion.dev.gold',
)
db['emotion.dev.gold'].get()
emotion
file start end
/media/jwagner/Data/audb/testdata/1.6.0/d3b62a9... 0 days 00:00:02.374641 0 days 00:00:04.101248 unhappy
0 days 00:00:05.445999 0 days 00:00:13.061626 happy
0 days 00:00:13.960496 0 days 00:00:14.897836 happy
0 days 00:00:21.454417 0 days 00:00:28.235479 unhappy
0 days 00:00:31.573883 0 days 00:00:35.081475 happy
0 days 00:00:46.336832 0 days 00:00:49.666294 neutral
0 days 00:00:53.288169 0 days 00:00:57.397128 neutral
/media/jwagner/Data/audb/testdata/1.6.0/d3b62a9... 0 days 00:00:04.441153 0 days 00:00:05.069330 unhappy
0 days 00:00:08.263919 0 days 00:00:13.812035 neutral
0 days 00:00:15.163421 0 days 00:00:18.361817 neutral
0 days 00:00:23.164433 0 days 00:00:23.922398 neutral
0 days 00:00:32.178272 0 days 00:00:35.268576 neutral
0 days 00:00:42.320708 0 days 00:00:42.838252 neutral
0 days 00:00:47.715380 0 days 00:00:48.870772 neutral
0 days 00:00:50.283749 0 days 00:00:51.663219 happy
0 days 00:00:57.899072 0 days 00:00:58.337701 unhappy
/media/jwagner/Data/audb/testdata/1.6.0/d3b62a9... 0 days 00:00:04.113521 0 days 00:00:06.757677 unhappy
0 days 00:00:07.189011 0 days 00:00:09.499757 neutral
0 days 00:00:10.056658 0 days 00:00:17.380463 neutral
0 days 00:00:20.189824 0 days 00:00:24.043259 happy
0 days 00:00:25.961979 0 days 00:00:26.743246 neutral
0 days 00:00:27.502626 0 days 00:00:27.814136 neutral
0 days 00:00:36.057617 0 days 00:00:39.101987 neutral
0 days 00:00:43.284854 0 days 00:00:46.313790 neutral
0 days 00:00:49.399081 0 days 00:00:49.681210 happy
0 days 00:00:53.789372 0 days 00:00:59.017524 happy
Looks ok, but if we set format='flac' we get:
db = audb.load(
'testdata',
tables='emotion.dev.gold',
format='flac',
)
db['emotion.dev.gold'].get()
Empty DataFrame
Columns: [emotion]
Index: []
To solve the issue we should change the file extension in the tables after applying the filtering.
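Sketched with a plain pandas index (not the audb code itself): filter first, rename the extensions afterwards.

```python
import pandas as pd

# sketch of the proposed order: first filter the table index down to
# the requested media, then rename the extensions of what is left
index = pd.Index(['a.wav', 'b.wav', 'c.wav'], name='file')
requested = {'a.wav', 'c.wav'}
filtered = index[index.isin(requested)]
renamed = filtered.str.replace('.wav', '.flac', regex=False)
```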
We need to revisit the helper functions _find_media() and _put_media() in audb.publish(). The functions are not well separated yet and need some more comments.
As long as long file paths are not supported in audeer (audeering/audeer#15), loading a database with long file paths might fail.