Comments (7)
I am having exactly the same issue.
from whisper-jax.
This is fixed on main in transformers. Can you run:
pip install git+https://github.com/huggingface/transformers.git
to install transformers from main?
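As a quick sanity check that the main-branch install actually took effect (a hedged sketch; `is_dev_build` is a hypothetical helper, and the exact version number will differ), you can look for the `.dev` suffix that development builds installed from git typically report:

```python
from importlib.metadata import version, PackageNotFoundError

def is_dev_build(package: str) -> bool:
    """Return True if the installed package reports a '.dev' version,
    which is what an install from a git main branch typically produces."""
    try:
        return ".dev" in version(package)
    except PackageNotFoundError:
        return False

# Example with a stand-in version string, since the exact number varies:
print(".dev" in "4.30.0.dev0")  # → True
```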
I have tried but the error persists.
Any update on this? I'm getting exactly the same error when I try to embed the result with SpeechBrain audio diarization.
```python
def transform_timestamp_list(input_list, duration):
    output_list = []
    for item in input_list:
        output_item = {
            "start": item["timestamp"][0],
            "end": item["timestamp"][1] if item["timestamp"][1] is not None else duration,
            "text": item["text"],
        }
        output_list.append(output_item)
    return output_list

result = pipeline(temp_file.name, task="transcribe", language="pt", return_timestamps=True)
print("transcribe result", result)
segments = transform_timestamp_list(result["chunks"], duration)

# Create embedding
def segment_embedding(segment):
    audio = Audio()
    start = segment["start"]
    # Whisper overshoots the end timestamp in the last segment
    end = min(duration, segment["end"])
    clip = Segment(start, end)
    waveform, sample_rate = audio.crop(temp_file.name, clip)
    return embedding_model(waveform[None])

print("starting embedding")
embeddings = np.zeros(shape=(len(segments), 192))
for i, segment in enumerate(segments):
    embeddings[i] = segment_embedding(segment)
embeddings = np.nan_to_num(embeddings)
print(f"Embedding shape: {embeddings.shape}")

# Assign speaker label
clustering = AgglomerativeClustering(num_speakers).fit(embeddings)
labels = clustering.labels_
for i in range(len(segments)):
    segments[i]["speaker"] = "SPEAKER " + str(labels[i] + 1)

# Make output
output = []  # Initialize an empty list for the output
for segment in segments:
    # Append the segment to the output list
    output.append({
        "start": str(convert_time(segment["start"])),
        "end": str(convert_time(segment["end"])),
        "speaker": segment["speaker"],
        "text": segment["text"],
    })
print("done with embedding")

time_end = time.time()
time_diff = time_end - time_start
system_info = f"-----Processing time: {time_diff:.5} seconds-----"
print(system_info)

# Added at the end of the handler function, before the return statement
os.remove(temp_file.name)
return Response(
    json=output,
    status=200,
)
```
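The `transform_timestamp_list` helper above is the piece that papers over the missing final timestamp the error message complains about. A minimal, self-contained check (with made-up chunk data standing in for real pipeline output):

```python
def transform_timestamp_list(input_list, duration):
    output_list = []
    for item in input_list:
        output_list.append({
            "start": item["timestamp"][0],
            # Fall back to the clip duration when the end timestamp is missing
            "end": item["timestamp"][1] if item["timestamp"][1] is not None else duration,
            "text": item["text"],
        })
    return output_list

# Made-up chunks mimicking pipeline output where the last end timestamp is None
chunks = [
    {"timestamp": (0.0, 4.2), "text": "ola"},
    {"timestamp": (4.2, None), "text": "mundo"},  # missing end timestamp
]
segments = transform_timestamp_list(chunks, duration=10.0)
print(segments[1]["end"])  # → 10.0 (falls back to the clip duration)
```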
LOGS
01 May, 11:19:36
There was an error while processing timestamps, we haven't found a timestamp as last token. Was WhisperTimeStampLogitsProcessor used?
01 May, 11:19:37
starting embedding
01 May, 11:19:38
Embedding shape: (12, 192)
01 May, 11:19:38
-----Processing time: 38.261 seconds-----
01 May, 11:19:38
done with embedding
I am having exactly the same issue too.
Hey @nachoh8 - just double-checked your code sample: we shouldn't be using stride_length_s=0.0, since this means we have no overlap between chunks (which will severely degrade the performance of your transcription). Could you try leaving this set to None, so that it defaults to chunk_length_s / 6 = 30 / 6 = 5? This probably explains why only your first batch had timestamps, and not the successive ones.
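The default stride arithmetic from the comment above can be made concrete; this is a sketch of the numbers only, assuming the stated rule that an unset stride defaults to one sixth of the chunk length:

```python
# Per the comment above: when stride_length_s is left as None, the chunking
# algorithm defaults it to chunk_length_s / 6, giving overlapping chunks.
chunk_length_s = 30
stride_length_s = chunk_length_s / 6
print(stride_length_s)  # → 5.0

# With stride_length_s=0.0 there is no overlap at all between chunks,
# which is what degrades transcription quality at the chunk boundaries.
overlap_with_zero_stride = 0.0
print(stride_length_s > overlap_with_zero_stride)  # → True
```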
Any update? I'm getting the same error, running on a Google Colab GPU.
Related Issues (20)
- In need of batch inference explanations HOT 1
- Is there any way to reduce the first jit compile time HOT 1
- [Feature Request] Youtube Compatible Transcript HOT 4
- Whisper JAX is not faster than Whisper in colab GPU environment. HOT 3
- Out of vram and reboot HOT 2
- Cannot instantiate FlaxWhisperPipline with parameters anymore HOT 4
- colab, kaggle notebook has a library dependency issue HOT 1
- New library possibly faster than Jax or just a hoax?
- Adaptation for Whisper-Large-V3 model HOT 2
- please create true comparisons with other whisper implementations HOT 12
- cannot import name 'FlaxWhisperPipline' from partially initialized module 'whisper_jax' (most likely due to a circular import) .. HOT 1
- How to finetune whisper on kaggle TPU? HOT 3
- ERROR: Could not find a version that satisfies the requirement jaxlib==0.4.5 (from versions: 0.4.18, 0.4.19, 0.4.20, 0.4.21) HOT 1
- there is a requirements.txt file of whisper-jax? HOT 2
- Using mulaw audio buffer data
- The demo throws error when uploading file
- Is there some code for Whisper jax to produce srt subtitle? HOT 1
- How to add millisecond for the timestamp?
- I have downloaded the flax_model, where can I call it?
- why whisper-jax did not use my GPU? HOT 3