benfmiller / audalign
Package for aligning audio files through audio fingerprinting
License: MIT License
Hi, I wanted to try out audalign but had some trouble with the installation. The installation instructions in the readme won't let me install the module. I always get this error:
pip install audalign
ERROR: Could not find a version that satisfies the requirement audalign (from versions: none)
ERROR: No matching distribution found for audalign
So I tried it with the following commands
pip install git+https://github.com/benfmiller/audalign.git
or (after downloading from github)
pip install audalign-master.zip
In both cases pip will state it "Successfully installed audalign-0.0.2", but it seems like the installation is not working. I just tried to import audalign and get:
Traceback (most recent call last):
  File "aligndashit.py", line 1, in <module>
    import audalign
ModuleNotFoundError: No module named 'audalign'
Also, the Lib\site-packages\ of my Python env has an audalign-0.0.2.dist-info folder but no audalign folder.
Any idea what's wrong?
EDIT: I'm on Windows10, Python 3.7.6
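A quick way to narrow this down is to ask Python itself whether it can locate the package from the interpreter that runs the script. The sketch below uses only the standard library; `describe_module` is a hypothetical helper name and not part of audalign.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the location(s) `name` would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Packages expose one or more directories; plain modules expose a single origin.
    if spec.submodule_search_locations:
        return list(spec.submodule_search_locations)
    return [spec.origin]


if __name__ == "__main__":
    # The interpreter path matters when several environments are installed.
    print("interpreter:", sys.executable)
    print("audalign found at:", describe_module("audalign"))
```

If this prints `None` from the same environment in which pip reported success, the package files were most likely never copied into site-packages, which matches the missing audalign folder described above.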
Hi! Once again, great work on this project!
Is there any chance that the requirements can be split into modular parts?
Like this:
pip install audalign[correlation,fingerprint]
At the moment the installation takes a very long time for parts of the code that I'm not using. For now, I simply extracted the correlation bits of code into my project in order to make it slimmer, but it would be great if this could be supported officially so I don't miss out on any helpful updates from you.
Thank you
Hi there,
dunno if it falls within the scope of the project but often, after the aligning, some phase/polarity "errors" could degrade the recording.
Here's a couple of interesting resources about those issues:
Dunno if this software may help...
Last but not least, here's a very interesting piece of research about phase recovery by @magronp:
Phase recovery with Bregman divergences for audio source separation
Hope that inspires !
When recording with two microphones where one microphone picks up the sound of the other, a slight offset causes a nasty echo. Would you be interested in such examples? If yes, how should I deliver them?
I was working to add in ML-based fingerprinting recognitions, but I hit a wall with the current layout of the program. It was originally designed for simplicity of use, but as more recognition techniques have been added, it has become increasingly difficult to keep track of what each parameter is actually used for in the alignments.
The recognitions will take a recognition object (one for each type of recognition technique), which will also contain a corresponding config object. This way, all configuration for a specific technique will be contained within that technique's config. Align will also be rewritten for generic objects.
This will make it much easier for those who want to extend the functionality of the program and add their own recognizers. It will also hopefully make audalign much easier to use and understand.
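The layout described above could look roughly like the following sketch. Every class, field, and function name here is hypothetical and chosen only for illustration; it is not the actual audalign API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class BaseConfig:
    """Settings shared by every technique (hypothetical fields)."""
    sample_rate: int = 44100
    multiprocessing: bool = True


@dataclass
class CorrelationConfig(BaseConfig):
    """Settings for one technique; overrides a shared default (hypothetical)."""
    sample_rate: int = 8000


class BaseRecognizer(ABC):
    """One recognizer per technique, each carrying its own config object."""

    def __init__(self, config=None):
        self.config = config if config is not None else self.default_config()

    @abstractmethod
    def default_config(self):
        """Return the config used when the caller supplies none."""

    @abstractmethod
    def recognize(self, target, against):
        """Compare two files and return a result dictionary."""


class CorrelationRecognizer(BaseRecognizer):
    def default_config(self):
        return CorrelationConfig()

    def recognize(self, target, against):
        # Placeholder result; a real recognizer would compare the audio here.
        return {"technique": "correlation", "sample_rate": self.config.sample_rate}


def align(target, against, recognizer):
    """Generic entry point: works with any object that follows BaseRecognizer."""
    return recognizer.recognize(target, against)
```

With this shape, a caller changes behaviour by passing a different recognizer or config, for example `align("a.wav", "b.wav", CorrelationRecognizer(CorrelationConfig(sample_rate=16000)))`, instead of threading technique-specific keyword arguments through one large function.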
Hi,
I have trouble aligning audio files with a sample rate higher than 44100 (in my case 48000).
import audalign

def main():
    ada = audalign.Audalign()
    rough_alignment = ada.align(
        "./not_aligned/",
        cor_sample_rate=48000,
    )

    fine_alignment = ada.fine_align(
        rough_alignment,
        destination_path="./aligned",
        cor_sample_rate=48000,
    )

if __name__ == "__main__":
    main()
It gets stuck after the fingerprinting:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/audalign/fingerprint.py", line 109, in _fingerprint_worker
channel, _ = audalign.filehandler.read(file_path, start_end=start_end)
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/audalign/filehandler.py", line 154, in read
audiofile = create_audiosegment(
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/audalign/filehandler.py", line 59, in create_audiosegment
audiofile = AudioSegment.from_file(filepath)
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/pydub/audio_segment.py", line 685, in from_file
info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/pydub/utils.py", line 279, in mediainfo_json
info = json.loads(output)
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "aligner.py", line 17, in <module>
main()
File "aligner.py", line 5, in main
rough_alignment = ada.align(
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/audalign/__init__.py", line 22, in wrapper_decorator
results = func(*args, **kwargs)
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/audalign/__init__.py", line 1141, in align
return align._align(
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/audalign/align.py", line 48, in _align
set_ada_file_names(
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/audalign/align.py", line 176, in set_ada_file_names
ada_obj.fingerprint_directory(file_dir)
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/audalign/__init__.py", line 302, in fingerprint_directory
result = self._fingerprint_directory(
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/site-packages/audalign/__init__.py", line 388, in _fingerprint_directory
result = self.pool.map(
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.8/multiprocessing/pool.py", line 771, in get
raise self._value
Are higher sample rates supported anyway?
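The `JSONDecodeError` in the worker traceback is raised while pydub parses the output of its media-info probe, which suggests the probe returned nothing for at least one file, rather than that 48000 Hz itself is rejected. A minimal sketch for checking this, assuming the cause is either a missing `ffprobe` binary or a non-audio file in the directory (both are guesses; `AUDIO_EXTENSIONS` and `unexpected_files` are hypothetical names):

```python
import os
import shutil

# Assumed set of extensions for this check; adjust to the files actually in use.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".m4a", ".ogg", ".mp4"}


def unexpected_files(directory):
    """List files whose extension is not in AUDIO_EXTENSIONS (e.g. hidden OS files)."""
    found = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        extension = os.path.splitext(name)[1].lower()
        if os.path.isfile(path) and extension not in AUDIO_EXTENSIONS:
            found.append(name)
    return found


if __name__ == "__main__":
    # None here means no ffprobe executable is visible on PATH.
    print("ffprobe on PATH:", shutil.which("ffprobe"))
    if os.path.isdir("./not_aligned/"):
        print("unexpected files:", unexpected_files("./not_aligned/"))
```

If the directory turns out to contain something like a hidden `.DS_Store` file, moving it out and rerunning would show whether the sample rate is involved at all.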
Just wanted to let you know about https://github.com/protyposis/AudioAlign
It's coded in C# but maybe helpful for reference.
Good luck with the project
Currently total.wav seems to contain the cumulative result. I think it would be much more valuable to have a total file that contains all tracks correctly aligned. It could be even nicer to use a container format that does not require the 'shift' to be encoded in the audio, so that the offset would only be stored as metadata (in an edit decision list fashion).
At the moment many of the dependencies are becoming unsupported. It would be nice if the package used the latest versions of noisereduce, librosa, scikit, typed-ast, typing-extensions and others.
Hi again, this time it's lines 114-117 in align.py
finally:
    ada_obj.file_names = temp_file_names
    ada_obj.fingerprinted_files = temp_fingerprinted_files
    ada_obj.total_fingerprints = temp_total_fingerprints
All the temps seem to be empty when declared, and they never appear again in the code. I noticed because save_fingerprinted_files was saving 3 empty variables that get replaced with the empty temp_ values in this snippet. Removing these 3 lines solves the issue, but that probably isn't the intended behavior or an optimal solution.
As a user I have multiple recording devices: in practice a camera and two Rode Wireless GO II devices. I would like to achieve alignment between the different recordings. The source data of each device may be assumed to be sequential in nature, but the recordings of different devices may not have been continuous, and thus have a different overlap.
What I would like to see is that blocks within the same folder are not correlated with each other, but different folders are. In addition, given that the input sequence is not a 'bag of files' but a 'sorted list of files', this knowledge should be used in the alignment process: a forward search given the last prior.
At the moment, after the fingerprinting process, I get the following error when I try to add my video folder. I have also attempted to make a wav file out of all the videos. The same error applied.
From the description above I could take an iterative approach which would sequentially align single files, by initial fingerprint. My preference would obviously be an unsupervised method.
VID_20220409_115501.mp4: Finding Matches... Aligning matches
Traceback (most recent call last):
File "/mnt/storage/home/skinkie/Sources/audalign/run_align.py", line 284, in <module>
main(args=args)
File "/mnt/storage/home/skinkie/Sources/audalign/run_align.py", line 196, in main
results = ad.align(
File "/mnt/storage/home/skinkie/Sources/audalign/audalign/__init__.py", line 36, in wrapper_decorator
results = func(*args, **kwargs)
File "/mnt/storage/home/skinkie/Sources/audalign/audalign/__init__.py", line 91, in align
return aligner._align(
File "/mnt/storage/home/skinkie/Sources/audalign/audalign/align/__init__.py", line 48, in _align
files_shifts = calc_final_alignments(
File "/mnt/storage/home/skinkie/Sources/audalign/audalign/align/__init__.py", line 169, in calc_final_alignments
files_shifts = find_matches_not_in_file_shifts(
File "/mnt/storage/home/skinkie/Sources/audalign/audalign/align/__init__.py", line 290, in find_matches_not_in_file_shifts
nmatch_wt_most[main_name][audalign.Audalign.OFFSET_SECS] = None
AttributeError: module 'audalign' has no attribute 'Audalign'. Did you mean: 'datalign'?
Hi there, audalign is very cool !
We would suggest taking some interesting fingerprinting projects into consideration in order to evolve it even more:
Please check out AudioAlign - a tool written for research purposes to automatically synchronize audio and video recordings that have either been recorded in parallel at the same event or contain the same aural information - also by @protyposis, which has a very cool advanced GUI.
Hope that inspires !
Hello,
I've been working with your code; thank you very much, it is so useful.
My problem happens when I try to play back the audio: the audio is amplified, and therefore the noise floor is as well. Is there a way to keep the original audio gains?
Thank you!
I need to get a list of all possible matches that an audio file could have, and by looking at the code I found that I could get a list of matches using match_len_filter. However, most of the results were around the same second and filter_matches wasn't useful for solving that problem, so I made this function for my script. It filters out numbers that are close to each other while preserving the original order. For example, given 1.5, 1.1, 60.1, 60.5, 30.4 it would return 1.5, 60.1, 30.4. I thought it could also be useful for this program.
In my case I needed to remove matches that have less than 0.5 absolute difference, but it could be changed easily.
def remove_close_numbers_by_abs_diff(nums):
    if not nums:
        return []
    output = [nums[0]]
    for num in nums[1:]:
        if all(abs(num - prev) > 0.5 for prev in output):
            output.append(num)
    return output

import unittest

class TestRemoveCloseNumbers(unittest.TestCase):
    def test_remove_close_numbers(self):
        self.assertEqual(remove_close_numbers_by_abs_diff([1.5, 1.1, 3.2, 3.9, 5, 5.9, 0.5, 3.3, 3.3]), [1.5, 3.2, 3.9, 5, 5.9, 0.5])
        self.assertEqual(remove_close_numbers_by_abs_diff([1, 1, 2.2, 2.3, 2.5, 3.5, 4.4, 4.8]), [1, 2.2, 3.5, 4.4])
        self.assertEqual(remove_close_numbers_by_abs_diff([10, 10, 2.2, 2.3, 2.5, 1.5, 1, 0.8]), [10, 2.2, 1.5, 0.8])
        self.assertEqual(remove_close_numbers_by_abs_diff([2, 3]), [2, 3])
        self.assertEqual(remove_close_numbers_by_abs_diff([1, 3, 5, 7, 9]), [1, 3, 5, 7, 9])
        self.assertEqual(remove_close_numbers_by_abs_diff([]), [])

if __name__ == '__main__':
    unittest.main()
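Since the post notes that the 0.5 threshold "could be changed easily", here is a small variant of the function above with the threshold exposed as a parameter. The name `remove_close_numbers` and the `min_diff` argument are illustrative choices, not part of audalign.

```python
def remove_close_numbers(nums, min_diff=0.5):
    """Keep a number only if it is more than `min_diff` away from every kept number."""
    kept = []
    for num in nums:
        # all() over an empty list is True, so the first number is always kept.
        if all(abs(num - prev) > min_diff for prev in kept):
            kept.append(num)
    return kept


if __name__ == "__main__":
    offsets = [1.5, 1.1, 60.1, 60.5, 30.4]
    print(remove_close_numbers(offsets))                # default threshold of 0.5
    print(remove_close_numbers(offsets, min_diff=0.3))  # a stricter notion of "close"
```

Like the original, this compares each candidate against every value kept so far, so the cost grows quadratically with the number of kept matches. That is fine for short match lists.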
I need to make a script that finds the cuts in an audio file, and for that I have decided to split the audio into one-second pieces. Apparently, audalign already does that, but since I don't know how to modify it, I have not chosen that option.
After making the cuts with my script and then finding the offset with audalign, in some cases it gives bad results, as each audio piece is normalized individually. So I would like to know if there is any way to prevent the audio from being normalized, so that I can do the normalization myself.
Sorry to bother you again.
I used audalign to convert two video files to wav. This worked like a charm. Now I was trying to align the output:
import audalign
ada = audalign.Audalign()
...
ada.convert_audio_file(filepath1, filepath1wav)
ada.convert_audio_file(filepath2, filepath2wav)
print(ada.align(r'.\files'))
I get the following error then:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "c:\Users\RetroHelix\Programming\Audio\aligndashit.py", line 12, in <module>
print(ada.align(r'.\files'))
File "C:\Users\RetroHelix\Envs\audalignPy38\lib\site-packages\audalign\__init__.py", line 528, in align
self.fingerprint_directory(directory_path)
File "C:\Users\RetroHelix\Envs\audalignPy38\lib\site-packages\audalign\__init__.py", line 223, in fingerprint_directory
result = self._fingerprint_directory(path, plot, nprocesses, extensions)
File "C:\Users\RetroHelix\Envs\audalignPy38\lib\site-packages\audalign\__init__.py", line 289, in _fingerprint_directory
with multiprocessing.Pool(nprocesses) as self.pool:
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\context.py", line 119, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild,
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\pool.py", line 212, in __init__
self._repopulate_pool()
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\pool.py", line 303, in _repopulate_pool
return self._repopulate_pool_static(self._ctx, self.Process,
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\pool.py", line 326, in _repopulate_pool_static
w.start()
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "c:\users\retrohelix\appdata\local\programs\python\python38\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
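The message points at the usual fix on Windows, where child processes are started by re-importing the main module: anything that creates worker processes has to run under the `if __name__ == '__main__':` guard. A minimal, library-independent sketch follows; the `square` worker and the `main` function are placeholders standing in for the `ada.align(...)` call in the script above.

```python
import multiprocessing


def square(x):
    """Worker function; defined at module level so child processes can import it."""
    return x * x


def main():
    # The pool is created only when main() is called, never at import time.
    with multiprocessing.Pool(2) as pool:
        return pool.map(square, [1, 2, 3])


if __name__ == "__main__":
    # Child processes re-import this file on Windows; the guard keeps them
    # from trying to start worker pools of their own.
    print(main())
```

Applied to the script above, this would mean moving the `convert_audio_file` and `align` calls into a function and calling that function under the guard.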
I have some trouble understanding the values in the match_info dictionary. Having two audio files where one file is a short snippet of the other one I got these values for the match_info dictionary:
{'match_time': 148.59945511817932,
'match_info':
{'line_170.mp3':
{'confidence': [99],
'offset_samples': [-16972],
'locality': [[(83, 26)]],
'locality_setting': [4.96907],
'offset_seconds': [-788.17814],
'locality_seconds': [[(3.85451, 1.20744)]]}}}
I interpret this as follows:
For the file "line_170.mp3" a match was found that has a confidence of 99. locality_setting just states that the match was found in a ~5 second window. offset_seconds gives the offset in seconds to where the match was found. But what about the tuples in locality_seconds/locality and the offset_samples value?
Can you please explain the meaning behind these values?
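One observation that may help: the numbers in the posted dictionary are mutually consistent if `offset_samples` and `locality` are counted in analysis frames of 2048 audio samples at 44100 Hz, rather than in individual audio samples. That frame size and sample rate are inferred purely from the values above, not taken from the audalign documentation. A small check of the arithmetic:

```python
# Assumed conversion factor, inferred only from the values in the post.
HOP_SAMPLES = 2048
SAMPLE_RATE = 44100


def frames_to_seconds(frames):
    """Convert a frame count to seconds under the assumed hop size and sample rate."""
    return round(frames * HOP_SAMPLES / SAMPLE_RATE, 5)


if __name__ == "__main__":
    print("offset_seconds:", frames_to_seconds(-16972))
    print("locality_seconds:", (frames_to_seconds(83), frames_to_seconds(26)))
```

Under the same assumption, the `locality_setting` of 4.96907 corresponds to a window of 107 frames.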
Thank you very much :)
Is there someone who can align two audio files I have, so that I don't have to do it myself using this command-line program (audalign)?
I have a source audio file from a video and a target audio file, which is a cloned audio of the source audio file. I am trying to sync the cloned audio onto the original video. I tried align_files, but the saved final file has two channels (both source and target audio).
How do we align the target with the source?
Thanks in advance for your help; so far, this repo gives the best results for the alignment task I am trying.
Hello there,
The newest version of the library, audalign==1.0.0, is giving me the following error while importing the module.
>>> import audalign
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/pdaawr/Documents/project/metrics/venv.nosync/lib/python3.8/site-packages/audalign/__init__.py", line 21, in <module>
import audalign.align.aligner as aligner
ModuleNotFoundError: No module named 'audalign.align'
There is no issue with the previous one, i.e. 0.7.2. I tested both versions on Python 3.8.
import os

import audalign

def prelim_fingerprint_checks(ada_obj, target_file, directory_path):
    all_against_files = audalign.filehandler.find_files(directory_path)
    all_against_files_full = [x[0] for x in all_against_files]
    all_against_files_base = [os.path.basename(x) for x in all_against_files_full]
    if (
        # Current code:
        #     os.path.basename(target_file) in all_against_files_base
        # If the target file is outside directory_path, the line above means it is
        # never fingerprinted. Shouldn't it look like this instead?
        os.path.basename(target_file) not in all_against_files_base
        and target_file not in all_against_files_full
    ):
        ada_obj.fingerprint_file(target_file)
Could be me being bad at programming, but this change seems to solve the issue I had with target_align.
I'm looking for a possibility to perform (potentially destructive) audio tracks synchronization from old (dubbed in different language) and remastered versions of movies.
In my scenario, applying single audio shift is not enough: sooner or later audios become out of sync at least due to
Any interest in supporting such a scenario?
Any existing projects that try to accomplish this problem?
Any ideas what's the best way to implement it?
Naive idea for implementation:
Thanks!
I would like to request a feature.
It's nice to be able to easily align various audio files with audalign, but it would also be nice to see how the files differ. When you explained the structure of the match dictionary to me, I posted a screenshot of an offset graph. Something like this, or a graph of waveforms with the matching parts colored, would be nice. I hope you get what I mean :D
Could you please incorporate something like this into audalign?
Hi there, we just realized that we've never asked for this: can you please "standardize" the license file?
Although it may sound like a minor aspect, a license file that GH does not recognize causes inconsistent generation of the corresponding badge:
(badge-generator URL: https://flat.badgen.net/github/license/benfmiller/audalign/?label=LICENSE)
You can easily set a "correct" one through the GH's license wizard tool.
Last but not least, we're revising the AUDIO category \ Tools section \ Alignment/synch subsection where your project is listed, so let us know how - in your opinion - we could improve our categorizations and links to resources in order to favor collaboration between developers (and therefore evolution) of listed projects.
Thanks in advance.