specgram

Small program that computes and plots spectrograms, either in a live window or to disk, with support for stdin input.

Preview

[Image: example spectrogram]

Build and install

If you are using ArchLinux you can grab the latest release from the AUR package specgram, or get the main branch with specgram-git.

Otherwise, you can build and install the program from source:

# Clone the repo
git clone https://github.com/rimio/specgram.git
cd specgram && mkdir build && cd build

# Build
cmake ..
make

# Install
sudo make install

Dependencies

This program dynamically links against FFTW and SFML 2.5.

The source code of Taywee/args is embedded in the program (see src/args.hxx).

Usage

For a complete description of the program functionality please see the manpage.

Input and output modes

specgram has two mutually exclusive input methods: from standard input and from file.

In order to generate a spectrogram from an input file infile and write the output to outfile.png:

specgram -i infile outfile.png

If no input file is specified, then the default behaviour is to read input data indefinitely from standard input. For example, we can query PulseAudio for audio sources:

$ pactl list sources short
1	alsa_output.usb-BEHRINGER_UMC204HD_192k-00.analog-surround-40.monitor	module-alsa-card.c	s16le 4ch 44100Hz	IDLE
2	alsa_input.usb-BEHRINGER_UMC204HD_192k-00.analog-stereo	module-alsa-card.c	s32le 2ch 44100Hz	SUSPENDED
3	alsa_output.pci-0000_00_1f.3.iec958-stereo.monitor	module-alsa-card.c	s16le 2ch 44100Hz	SUSPENDED
4	stereo.A.monitor	module-remap-sink.c	s16le 2ch 44100Hz	IDLE
5	stereo.B.monitor	module-remap-sink.c	s16le 2ch 44100Hz	SUSPENDED
11	alsa_output.pci-0000_01_00.1.hdmi-stereo.monitor	module-alsa-card.c	s16le 2ch 44100Hz	SUSPENDED

$ export PASOURCE="stereo.A.monitor"

In my case the default sink is stereo.A, and I use the monitor source stereo.A.monitor to capture what I'm hearing in my headphones:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram outfile.png
[2020-12-27 15:56:15.058] [info] Creating 1024-wide FFTW plan
[2020-12-27 15:56:15.058] [info] Input stream: signed 16bit integer at 44100Hz

The program will keep reading data and computing FFTs until the end of time or until it receives a SIGINT. In a Linux terminal, SIGINT can be sent by pressing CTRL+C.

Once the signal is received, the program stops reading data from input and writes to outfile.png whatever it cached so far:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram outfile.png
[2020-12-27 15:57:29.967] [info] Creating 1024-wide FFTW plan
[2020-12-27 15:57:29.967] [info] Input stream: signed 16bit integer at 44100Hz
^C[2020-12-27 15:57:31.813] [info] Terminating ...

It is sometimes useful to see the spectrogram in real time. Live mode can be enabled with the -l, --live flag:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l outfile.png

The spectrogram will now be displayed as it is being computed from standard input. When either SIGINT is received or the live window is closed, the program will terminate and write outfile.png.

If file output is not desired, live mode can be used on its own, in which case nothing is written upon termination:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l

For obvious reasons, live mode cannot be used with file input.

Input options

In the above examples we assumed that the program input is 16-bit signed integer at 44.1kHz, which happens to be what my sound card (and many others) outputs by default.

We can, however, specify any other rate with -r, --rate and datatype with -d, --datatype:

$ parec --channels=1 --device="${PASOURCE}" --raw --format=float32 --rate=48000 | specgram -l -r 48000 -d f32

The above example will read 32-bit floating point input at 48kHz. For a full list of supported data types see the manpage.
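To try these input options without a sound card, a test signal can be generated as raw samples. The following is a minimal sketch, not part of specgram: the function name `sine_tone` and its `datatype` labels are chosen here for illustration, and it assumes mono, little-endian output matching the s16 default and the f32 option described above.

```python
import math
import struct


def sine_tone(freq_hz, seconds, rate=44100, datatype="s16"):
    """Return a mono sine tone as raw little-endian sample bytes."""
    n = int(seconds * rate)
    samples = (math.sin(2.0 * math.pi * freq_hz * i / rate) for i in range(n))
    if datatype == "s16":
        # Scale [-1, 1] to the signed 16-bit integer range.
        return struct.pack(f"<{n}h", *(int(round(s * 32767)) for s in samples))
    if datatype == "f32":
        # 32-bit floats are written unscaled.
        return struct.pack(f"<{n}f", *samples)
    raise ValueError(f"unsupported datatype: {datatype}")
```

Writing the returned bytes to `sys.stdout.buffer` from a small script and piping that script into `specgram outfile.png` should then draw a single horizontal line at the tone frequency, assuming the rate passed to specgram matches the one used to generate the tone.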

NOTE: The specified rate is used only for display purposes and for interpreting other command line parameters. There's nothing stopping us from using a different rate than the actual device rate, going down to nanohertz:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -r 1e-8
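The effect of the declared rate on the frequency axis follows from the standard DFT relation between bin index and frequency. The sketch below illustrates that relation only; `bin_frequency` is a name made up here, and how specgram computes its axis labels internally is an assumption.

```python
def bin_frequency(k, fft_width, rate):
    """Frequency (in Hz) that bin k represents for a given declared rate."""
    return k * rate / fft_width


# The same bin gets a different label depending only on the declared rate.
nyquist_bin = 1024 // 2
label_at_44k = bin_frequency(nyquist_bin, 1024, 44100)  # 22050.0
label_at_48k = bin_frequency(nyquist_bin, 1024, 48000)  # 24000.0
```

The samples themselves are untouched: declaring a different rate rescales the labels, not the data.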

Usage with FFmpeg

In order to generate a spectrogram for an encoded audio file, it is necessary to decode it first. This can be done with FFmpeg, using any of the raw audio formats available.

For example, to generate the spectrogram for an MP3 file:

$ ffmpeg -i input.mp3 -f s16le - | specgram output.png

Or, in order to use 32-bit data:

$ ffmpeg -i input.mp3 -f s32le - | specgram -d s32 output.png

Note that you will have to manually stop specgram with a SIGINT once the ffmpeg stream is finished.

FFT options

The FFT window width can be specified with -f, --fft_width and the stride, that is the distance between the beginning of two subsequent FFT windows, can be specified with -g, --fft_stride:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -f 2048 -g 1024 

The above will compute 2048-element FFT windows with a stride of 1024 elements, that is, with a 50% overlap between consecutive windows.
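The relation between width, stride and overlap can be sketched as follows. The helper names are hypothetical, and the sketch assumes that only complete windows are processed; how specgram treats a trailing partial window is not stated here.

```python
def overlap_fraction(fft_width, fft_stride):
    """Fraction of each window shared with the next one (0.0 if none)."""
    return max(0.0, 1.0 - fft_stride / fft_width)


def window_starts(n_samples, fft_width, fft_stride):
    """Start offsets of all complete windows that fit in n_samples."""
    return list(range(0, n_samples - fft_width + 1, fft_stride))
```

For example, `overlap_fraction(2048, 1024)` gives 0.5, matching the 50% overlap mentioned above.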

Usually, a larger FFT window will give better frequency resolution but worse time resolution (i.e. it will be harder to locate signals in the time domain).
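This trade-off can be put into numbers: the spacing between frequency bins shrinks as the window grows, while the time span covered by one window grows by the same factor. A minimal sketch, with function names chosen here for illustration:

```python
def frequency_resolution_hz(fft_width, rate):
    """Spacing between adjacent FFT bins, in Hz."""
    return rate / fft_width


def window_duration_s(fft_width, rate):
    """Time span covered by a single FFT window, in seconds."""
    return fft_width / rate
```

At 48 kHz, a 1024-wide window gives bins 46.875 Hz apart, and doubling the width halves that spacing while doubling the window duration. The product of the two quantities is always 1.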

A smaller stride will give a smoother and richer output, but will strain the CPU more.

NOTE: You will notice that there isn't much difference between the output of the above command and the others. That is because the display width is different from the FFT window width. To change the display width, see Display Options below.

Lastly, if you encounter high sample rate signals, for which you can't display a wide enough (or often enough) window, you can use window averaging (-A, --average).

$ rx_sdr -d 0 -g 50 -f 97300000 -s 960000 -F CF32 - | ./specgram -lq -r 960000 -d cf32 -A 20 

The above example consumes input at 960k samples per second from an RTL-SDR dongle, which with a 1024-wide FFT window would mean displaying over 900 windows per second; a bit much for the average PC, and for the average human to follow.

Averaging 20 windows gives us a much more reasonable 47 windows per second.
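The arithmetic behind these figures, together with one possible way of averaging, is sketched below. This is an illustration under assumptions: the stride is taken to equal the window width (as the figures above imply), and the element-wise arithmetic mean over consecutive groups is a guess at what averaging means, not a description of specgram's implementation.

```python
def windows_per_second(rate, fft_stride, average=1):
    """Displayed windows per second when `average` windows are merged into one."""
    return rate / (fft_stride * average)


def average_windows(windows, count):
    """Element-wise mean of each run of `count` consecutive windows.

    A trailing group with fewer than `count` windows is dropped.
    """
    out = []
    for i in range(0, len(windows) - count + 1, count):
        group = windows[i:i + count]
        out.append([sum(col) / count for col in zip(*group)])
    return out
```

Here `windows_per_second(960000, 1024)` is 937.5, and with `average=20` it drops to 46.875, which rounds to the 47 quoted above.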

Display options

To change the display width we can use -w, --width:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -f 1024 -w 1024

As you will notice, the spectrogram is somewhat blurry, because the program is resampling the 513-element-wide positive part of the FFT output to the display width of 1024. If you need sharp, crisp spectrograms, then you can use -q, --no_resampling to disable resampling:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -lq -f 1024

When not resampling, you can no longer specify the display width, as it is computed from the rest of the parameters.
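The 513 figure is the number of non-negative frequency bins of a 1024-point FFT of real input (N/2 + 1). The sketch below shows that count and a simple linear interpolation to a target width. Linear interpolation is used here only as an example; the resampling method specgram actually uses is not described in this document, and both function names are hypothetical.

```python
def positive_bins(fft_width):
    """Number of non-negative frequency bins for real-valued input."""
    return fft_width // 2 + 1


def resample_linear(values, width):
    """Stretch or shrink `values` to `width` points by linear interpolation."""
    if width < 1 or not values:
        raise ValueError("need at least one value and a positive width")
    if width == 1 or len(values) == 1:
        return [values[0]] * width
    scale = (len(values) - 1) / (width - 1)
    out = []
    for i in range(width):
        pos = i * scale
        lo = min(int(pos), len(values) - 2)  # keep lo + 1 inside the list
        frac = pos - lo
        out.append(values[lo] * (1.0 - frac) + values[lo + 1] * frac)
    return out
```

Stretching 513 bins over 1024 pixels means most pixels fall between two bins and show a blend of both, which is where the blur comes from.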

Another use case is displaying a specific band of frequencies, using -x, --fmin and -y, --fmax to set the frequency bounds. For example, to zoom in on the 500-3000Hz band:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -lq -f 9192 -x 500 -y 3000 
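When resampling is disabled, the number of bins that fall inside the requested band determines how wide the result can be. The following sketch estimates that range; `band_bins` is a hypothetical helper, and the rounding (ceiling for the lower bound, floor for the upper) is an assumption rather than specgram's documented behaviour.

```python
import math


def band_bins(fft_width, rate, fmin, fmax):
    """Return (first_bin, last_bin, bin_count) for the band [fmin, fmax]."""
    spacing = rate / fft_width  # Hz between adjacent bins
    first = max(0, math.ceil(fmin / spacing))
    last = min(fft_width // 2, math.floor(fmax / spacing))
    return first, last, max(0, last - first + 1)
```

A narrow band combined with a small FFT width leaves very few bins, which is why zooming in on a band is usually paired with a large window.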

The colormap can be specified with -c, --colormap; see image below the example for supported colormaps:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -c orange

[Image: supported colormaps]

In order to see how to specify the background, foreground and custom colormap colors, please see the manpage.

While the live view has both axes and legend enabled by default, file output does not. To enable them use -a, --axes or -e, --legend. Please note that axes are implicit when displaying the legend.

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -le outfile.png

Finally, if you'd like to rotate the spectrogram 90 degrees counter-clockwise, so as to read it from left to right, you can use -z, --horizontal:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -lz

This flag applies to both the live window and output file.

Live options

Use -k, --count to control the number of FFT windows displayed in live view:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -k 128

Use -t, --title to specify the live window title:

$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -t "My spectrogram"

License

Copyright (c) 2020-2023 Vasile Vilvoiu

specgram is free software; you can redistribute it and/or modify it under the terms of the MIT license. See LICENSE for details.

Acknowledgements

Taywee/args library by Taylor C. Richberger and Pavel Belikov, released under the MIT license.

Program icon by Flavia Fabian, released under the CC-BY-SA 4.0 license.

Share Tech Mono font by Carrois Type Design, released under Open Font License.

Special thanks to Eugen Stoianovici for code review and various fixes.

specgram's Issues

Really cool default colormap.

The jet colormap is a classic, but I have to admit it looks like crap for live (scrolling) spectrograms.

A cool but easy on the eyes colormap is required.

Build failure

Build fails under ArchLinux, cannot find std::clamp().

Process is busywaiting when no input is present

While good for latency during high throughput scenarios, it's not necessarily nice to keep a CPU at 100% while waiting for input.

Explore the ideas of:

  • introducing std::this_thread::sleep_for in the two cases where we wait for input within the main loop
  • introducing a flag for activating the above behavior (i.e. busywait by default, wait on flag set)
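The two ideas above can be sketched together as a polling loop with an opt-in sleep. This is an illustration in Python rather than the program's C++, with `time.sleep` standing in for std::this_thread::sleep_for; the function name, the parameters and the 1 ms default are all assumptions made for the example.

```python
import time


def wait_for_input(has_data, sleep_when_idle=False, idle_sleep_s=0.001,
                   sleep=time.sleep, max_polls=None):
    """Poll has_data() until it returns True.

    By default this busy-waits (lowest latency, one core at 100%).
    With sleep_when_idle=True it sleeps between polls instead.
    Returns False if max_polls is reached without any data.
    """
    polls = 0
    while not has_data():
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return False
        if sleep_when_idle:
            sleep(idle_sleep_s)
    return True
```

Keeping busy-waiting as the default preserves the current latency characteristics, while the flag lets idle pipelines give the CPU back.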

Functional testing

Write functional tests that make sure the overall output to known inputs is correct.

Drop spdlog as a dependency

The program's logging needs are rather simplistic, so there's no need to keep spdlog as a dependency, especially after the incompatibility issues with fmt 8.0.0.

Write own logging routine.

Allow custom scale string

Allow definition of custom scale strings using the -s, --scale options.

dBFS will remain a special case and default option, as it uses the dBFSValueMap which clamps the values to an upper limit of 0.

Add new LinearValueMap and dBValueMap value maps which map the values accordingly.

If the string begins with db or dB then it defaults to dBValueMap and the rest of the string is used for display purposes. Limits can be specified with, maybe, commas? I.e. -s dBm,20,-20 specifies the limits for a dBValueMap.

Otherwise it uses a LinearValueMap with the limits specified as above.
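A possible shape for the parsing rules proposed above is sketched below. It is only an interpretation of this proposal: the return format is invented for the example, and because the proposal does not say which of the two limits comes first, the sketch simply sorts them.

```python
def parse_scale(spec):
    """Parse a scale string such as 'dBFS', 'dBm,20,-20' or 'V,0,5'.

    Returns (kind, unit, bounds), where kind is 'dBFS', 'dB' or 'linear'
    and bounds is None or a (lower, upper) tuple.
    """
    parts = [p.strip() for p in spec.split(",")]
    name = parts[0]
    if len(parts) == 1:
        bounds = None
    elif len(parts) == 3:
        a, b = float(parts[1]), float(parts[2])
        bounds = (min(a, b), max(a, b))  # order is ambiguous, so sort
    else:
        raise ValueError("expected NAME or NAME,LIMIT,LIMIT")
    if name == "dBFS":
        return "dBFS", "FS", bounds      # special case and default
    if name[:2] in ("db", "dB"):
        return "dB", name[2:], bounds    # rest of the string is the unit
    return "linear", name, bounds


def clamp_dbfs(value):
    """dBFS values never exceed 0, so clamp to that upper limit."""
    return min(value, 0.0)
```

With these rules, `-s dBm,20,-20` would select a decibel map with unit `m` and limits from -20 to 20.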

Refactoring #1

  • LiveOutput caches the Configuration object, but also caches ColorMap and ValueMap. Maybe split relevant stuff from Configuration?
    Renderer caches Configuration, maybe shouldn't?
  • Renderer::*fft_area* stuff refers to the spectrogram fields, so should be named accordingly.
  • const auto MustDumpToStdout() const -> auto MustDumpToStdout() const

Allow stdout output with `-`

Allow dumping of PNG image data to stdout, where the user can display it with imagemagick. E.g.:

specgram -i infile - | display

Unit testing

Unit testing for testable things like ColorMap, FFT, InputParser, InputReader, parts of Renderer, ValueMap, WindowFunction.

  • ColorMap
  • FFT
  • InputParser
  • InputReader
  • Renderer
  • ValueMap
  • WindowFunction

Assertion failed on small frequency band

$ ./specgram -lq -x 1 -y 100
[INFO] Creating 1024-wide FFTW plan
[INFO] Scale decibel, unit FS, bounds [-120, 0]
specgram: /home/rimio/git/specgram/src/renderer.cpp:396: std::__cxx11::list<std::tuple<double, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > Renderer::GetNiceTicks(double, double, const string&, unsigned int, unsigned int, bool): Assertion `fval <= v_max' failed.
Aborted (core dumped)
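The failed assertion checks that a generated tick value does not exceed the upper bound. As an illustration of a tick generator that holds this invariant by construction, here is a sketch using the common 1-2-5 step scheme. It is not the code of Renderer::GetNiceTicks, whose algorithm is not shown here, and the name `nice_ticks` and its parameters are hypothetical.

```python
import math


def nice_ticks(v_min, v_max, max_ticks=10):
    """Return at most max_ticks round tick values inside [v_min, v_max]."""
    if not (v_max > v_min) or max_ticks < 2:
        return [v_min]
    raw_step = (v_max - v_min) / (max_ticks - 1)
    magnitude = 10.0 ** math.floor(math.log10(raw_step))
    # Smallest step of the form 1, 2, 5 or 10 times a power of ten.
    step = next((m * magnitude for m in (1, 2, 5, 10)
                 if m * magnitude >= raw_step), 10 * magnitude)
    first = math.ceil(v_min / step)
    last = math.floor(v_max / step)
    ticks = (k * step for k in range(first, last + 1))
    # Guard against floating-point overshoot at either end.
    return [t for t in ticks if v_min <= t <= v_max]
```

Because ticks are multiples of the step taken between the rounded-up lower bound and the rounded-down upper bound, a narrow range such as 1 to 100 yields fewer ticks instead of one that lands outside the range.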

README typos

The first sentence under Display Options reads "You will notice that there isn't much difference between the output of the above command and the others", which makes no sense now that the rx_sdr example has been added.

"then you can use the -q" -> "then you can use -q"

Fix these.

Legend ticks are badly spaced

Command line:

./specgram -eq -f 8192 -y 4000 --bg-color=00000000 --fg-color=000000ff -i ../resources/clips/rammstein_dalai_lama example_file.png

Output:
[Image: example_file]

Stop on EOF from stdin

As per one user's request:

(also, I wonder if it would make sense for specgram to stop whenever the stream ends; I'm guessing around AsyncInputReader::ReachedEOF(). Since, once we get an EOF on stdin, there should be no way to get more data in... right?)
