spiketools / spiketools
Tools for analyzing spiking data.
Home Page: https://spiketools.github.io
License: Apache License 2.0
From @LucaKolibius (but he didn't want to open an issue....)
Suggestion: have an option / implementation for circular shuffle that operates on spike times (adding different offsets) without converting to spike trains.
If the position input has a single data point in row orientation (position.shape == (2, 1)), there is an error when spiketools.spatial.utils.get_position_xy tries to infer the orientation.
This happens because check_array_orientation(arr) returns 'column' when it should return 'row'.
This is due to the condition position.shape[-2] > position.shape[-1] in check_array_orientation(arr), which yields 'column' as an output. This output is usually correct, but for the case of position.shape == (2, 1), it is not.
I assume this happens with all other functions that infer the orientation if the position input has position.shape == (2, 1), but I have not checked.
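A minimal reproduction of the described condition (a sketch of the check as described above, not the library's exact code):

```python
import numpy as np

# Sketch of the orientation check described above: it compares the last two
# shape dimensions, which misfires for a single row-oriented data point of
# shape (2, 1).
def check_array_orientation(arr):
    return 'column' if arr.shape[-2] > arr.shape[-1] else 'row'

position = np.array([[3.0], [7.0]])      # one (x, y) point, row orientation
result = check_array_orientation(position)
# result is 'column', though 'row' is the intended answer here
```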
ToDo: Add spike simulations.
Notes / to finish:
- shuffle
Plot quirks to check:
- plot_task_structure: check inputs for event lines (seems to require lists, but inputs of arrays are common)
- plot_position_1d: passing in a custom alpha value fails
- The description of the bins input in compute_spatial_bin_edges isn't great - it could do with specifying that it takes two integers - the number of X & Y bins to create.
ToDo:
We are currently not guaranteed to create and plot different spatial representations in the same orientation - meaning, for example, that computing and plotting a spatially binned firing rate map could end up in a flipped orientation as compared to plotting the raw position data.
ToDo:
Targets:
- compute_spatial_bin_assignment
- plot_heatmap (imshow plots the data in a different orientation than standard plots - this could be it)

This issue is to keep track of some ideas that are not a priority for a 0.1 release, but could go into a 0.2 update.
Potential Refactors:
- np.histogramdd
Potential Updates:
- get_value_by_time: (optionally?) return the difference between the requested timepoint and the identified closest time sample (this would allow for keeping track of proximity). Similar to threshold exclusion, but allows for more detail.
- plot_rasters: add option to colour raster dots by spike amplitude
- plot_positions: add option to color trajectory by elapsed time
Potential Additions:
Things to do:
Need to figure out if/how to use these, and clean them up accordingly.
Spike trains have a sampling rate, but right now when we use them, this is left implicit (defaults to 1kHz, but this isn't documented, and there are no options for changing it). This should be documented, and made more generalizable.
Functions to check for this:
- convert_train_to_times
- create_spike_train
Also: computing spike trains depends on the unit of the data (ms vs. seconds), and the start time of the task. For example, it blows up if times are encoded in UNIX time (as it tries to create a spike_train from time 0).
Note: relates to #5
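A sketch of the failure mode and a workaround, assuming the 1 kHz default noted above (the times here are illustrative):

```python
import numpy as np

fs = 1000  # assumed default sampling rate (1 kHz)

# Spike times encoded in UNIX time (seconds): naively allocating a train
# from time 0 would need trillions of samples and blow up memory.
spike_times = np.array([1680000000.005, 1680000000.091, 1680000000.186])

# Workaround sketch: restart the clock at the first spike (or task start)
# before converting to a train, keeping the allocation bounded.
offsets = spike_times - spike_times[0]
train = np.zeros(int(np.ceil(offsets[-1] * fs)) + 1)
train[np.round(offsets * fs).astype(int)] = 1
```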
Functions that need unit-tests adding:
Currently when we compute binned firing rates, we don't explicitly compute the time values they relate to, meaning that it defaults to plotting the value at the beginning of each bin. For visualization purposes, it should really plot the data values at the center of the bins.
This relates to:
- compute_trial_frs: computes the binned firing rates (does not compute time values)
- plot_rate_by_time: plots the binned firing rates (does not offset the time values)
Note: this could be fixed in one or the other, maybe requiring updates in both.
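A minimal sketch of the centering fix (variable names here are illustrative, not the spiketools API):

```python
import numpy as np

# Binned firing rates across a trial: plotting at bin centers rather than
# at the left edges shifts each value by half a bin width.
t_start, t_stop, bin_width = 0.0, 2.0, 0.5
edges = np.arange(t_start, t_stop + bin_width, bin_width)  # [0, .5, 1, 1.5, 2]
centers = edges[:-1] + bin_width / 2                       # bin-center times
```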
Plot helpers / utilities:
- move the plot_line(s) plot to plts/data (done in #86)
- _add_vline and _add_shade (done in #89)
- move the plot_dots plot to plts/data
Plot edits:
- plot_rasters: update to take a shade region (done in #89)
- plot_positions: update to allow for trial-by-trial input (done in #100)
- plot_positions: update to visualize extra landmarks, etc
Naming consistency:
- plot_waveform and plot_waveform3d
Goal - add doctest examples across the module
Files to add doctests to (check off when PR's are merged):
- spiketools/measures/measures (Claire) - #40
- spiketools/measures/conversions (Claire) - #70
- spiketools/utils/spikes (Claire) - #69
- spiketools/spatial/information (Sandra) - #42
- spiketools/spatial/occupancy (Sandra) - #26, #27, #35, #36
- spiketools/spatial/position (Sandra) - #41
- spiketools/sims/prob (Sandra) - #55
- spiketools/sims/dist (Claire) - #71
For examples of doctests, check the NeuroDSP module (https://github.com/neurodsp-tools/neurodsp). Note that one of the goals of doctests is to write in the Examples section, which gets rendered on the documentation website.
The goal of doctests is to provide a minimal example of running the function.
In terms of scope, typically we want:
Technical notes on doctests:
- any code prefixed with >>> will get run by doctests
- to skip a line, add # doctest:+SKIP at the end of the line
Once you have pytest installed, you can check doctests locally by doing:
# Move into the 'spiketools' folder '~/spiketools/spiketools' (eg. cd spiketools)
# Run doctests, ignoring the tests folder
pytest --doctest-modules --ignore=tests
OR
# Run this command from the base repository ('~/spiketools/')
pytest --doctest-modules --ignore=spiketools/tests spiketools
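For reference, a hedged sketch of the target docstring format (the function here is illustrative, not a spiketools API):

```python
def compute_firing_rate(spikes, duration):
    """Compute a firing rate from a set of spike times.

    Examples
    --------
    Compute the rate for 4 spikes across 2 seconds:

    >>> compute_firing_rate([0.1, 0.5, 1.2, 1.9], 2.0)
    2.0
    """
    return len(spikes) / duration

# Lines prefixed with >>> are executed by `pytest --doctest-modules`; append
# '# doctest:+SKIP' to any line that should not be run.
```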
Hey @TomDonoghue , there is an issue with the sim_spiketrain_poisson function. In the documentation, n_samples is optional, but it raises a TypeError when not included.
Example:
rate = 0.4
sim_spiketrain_poisson(rate, 1000, bias=0)
Error:
TypeError: sim_spiketrain_poisson() missing 1 required positional argument: 'fs'
ToDo:
This issue is to keep track of updates to the documentation / tutorials:
Code documentation:
Tutorials:
Make functions allow for 1d and 2d positions, bins, and edges.
The goal is to reduce unnecessary overlapping code.
PR #102:
- compute_spatial_bin_edges
- compute_spatial_bin_assignment
- compute_occupancy
- get_pos_ranges (also added example/docstring test for this one)

This issue is to keep a running list of potential refactors, as we approach a 0.1 release.
Potential additions:
- restrict_range: to reset time values

Hi, I am reviewing spiketools as part of openjournals/joss-reviews#5268.
https://spiketools.github.io/spiketools/search.html?q=test&check_keywords=yes&area=default
just says "Searching..."
The console reports:
searchtools.js:243 Uncaught ReferenceError: Stemmer is not defined
at Object.query (searchtools.js:243:21)
at Object.performSearch (searchtools.js:234:35)
at HTMLDocument.init (searchtools.js:175:23)
There is an issue with the create_spike_train function, in that it sometimes ignores the last spike time.
This affects the shuffle_circular and shuffle_bins functions. Might be one cause of #5
Example:
spikes = np.array([ 5, 91, 186, 206, 236, 325, 378, 677, 720, 763])
spike_train = create_spike_train(spikes)
# the last spike is missing, as shown by:
np.where(spike_train)
This seems to be happening because the pre-allocated space for the output (spike_train) in create_spike_train is too small by one. As a result, when setting 1 at the corresponding spike_train positions, the last one is missing.
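A sketch of the off-by-one fix (an assumed implementation, not the library's code): allocate one extra sample so the last spike time has a slot.

```python
import numpy as np

def create_spike_train_sketch(spikes):
    # Allocate up to and including the last spike time (the +1 fixes the
    # off-by-one described above), then set a 1 at each spike index.
    spike_train = np.zeros(int(np.max(spikes)) + 1)
    spike_train[np.asarray(spikes, dtype=int)] = 1
    return spike_train

spikes = np.array([5, 91, 186, 206, 236, 325, 378, 677, 720, 763])
spike_train = create_spike_train_sketch(spikes)
# np.where(spike_train)[0] now includes the final spike at 763
```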
ToDo:
Sometime soonish, we can make an initial 0.1.0 release with an initial alpha version of the module. This issue collects things to do for this initial release.
Code updates (functionality):
- sims sub-module (done in #21)
Module updates:
- spiketools/spatial/ and spiketools/plts/space is inconsistent
Documentation updates:
There is an issue with the shuffle_circular function, in that it sometimes fails with a shape error (from line 216).
It's unclear why this is happening.
ToDo:
Example use case:
Have continuously sampled player position (with times), and an object position of interest. Goal: get the time point the player position is closest to the object. How: get position index closest to object, index into timestamps.
Notes: can do this with get_ind_by_time, however, we are actually selecting index by value (so the name & description are not right / obvious).
Potential ToDo: add some kind of updated description or new function to explicitly support this kind of selection by values.
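A sketch of the value-based selection described above (the function name and signature are hypothetical):

```python
import numpy as np

def get_time_closest_to_value(timestamps, position, target):
    # position: array of shape (2, n_samples); target: (x, y) of the object.
    # Find the sample where the player is closest to the object, then index
    # into the timestamps at that sample.
    dists = np.linalg.norm(position - np.asarray(target)[:, None], axis=0)
    return timestamps[np.argmin(dists)]

timestamps = np.array([0.0, 0.5, 1.0, 1.5])
position = np.array([[0.0, 1.0, 2.0, 3.0],
                     [0.0, 1.0, 2.0, 3.0]])
closest_time = get_time_closest_to_value(timestamps, position, (2.1, 2.1))
```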
General ToDos:
Plot tweaks:
plot_trial_rasters
Plots to add:
Main issue: spike times in seconds vs. milliseconds
For example, shuffle_spikes
assumes ms, can fail weirdly with seconds (#5).
ToDo:
- spike_times inputs)

Opening one full issue for item 7 of #159
Context: The shuffle_poisson function is just permutations of one spike train.
In the source code of shuffle_poisson, we have isis = permute_vector(compute_isis(poisson_spikes), n_permutations=n_shuffles), and permute_vector says that there is no randomness - the permutation is just moving an ISI and a spike from the end to the beginning.
Issue: If your n_shuffles is small, then your random permutations will be very similar to each other while being quite different from the original data. Therefore, this would not serve as a good baseline distribution for stats.
Recommendation from @rly to address this issue by doing one of the following:
- add a warning about this behavior in the shuffle_poisson documentation
- introduce randomness such that poisson_generator is called once for each shuffle instead of just once
- introduce randomness in permute_vector

If we want to mimic the organizational structure of NWB files, note that we currently expect position data as row data, whereas NWB expects it to be in column data. This means that we currently transpose position data when we pull it from NWB files.
Nice job using sphinx-gallery to give users downloadable, executable examples!
If you want to go a step farther, I'd recommend exploring configuring Sphinx-Gallery to generate Binder links as well. This will let prospective users run your tutorials with one click (rather than needing to configure a local environment and download the examples).
There's more info on how to do this in the Sphinx-Gallery docs here: https://sphinx-gallery.github.io/stable/configuration.html#generate-binder-links-for-gallery-notebooks-experimental
(related to openjournals/joss-reviews#5268)
There is possibly a bug / issue with using the time_threshold in occupancy computations - that presumably relates back to create_position_df (where the time_threshold is applied).
Notes:
- compute_trial_occupancy

There are some noted potential refactors for the repo, that need to be checked & updated.
This includes:
- occupancy (mostly around dropping pandas -> numpy)

It's great that the Install page notes that statsmodels is an optional dependency; however, you can make this even easier for users, with one (or both) of the following:
- pip install statsmodels
- add statsmodels as an extra dependency for setuptools, so that pip install spiketools[all] or the like will install all dependencies
(related to openjournals/joss-reviews#5268)
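A sketch of the second suggestion, as a hypothetical pyproject.toml fragment (the extra names here are illustrative, not the project's actual configuration):

```toml
# Declaring statsmodels as an optional extra, so `pip install spiketools[stats]`
# (or `[all]`) pulls it in alongside the core dependencies.
[project.optional-dependencies]
stats = ["statsmodels"]
all = ["statsmodels"]
```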
compute_bin_counts_assgn & compute_bin_counts_pos should have the same top-level description, and a clearer output description (can be counts, if not normalized, or rates, if normalized).

I think there's a bit of an issue with how the spatial information analyses are implemented: basically the public facing functions take in spike positions and occupancy, and then recompute the binning and later normalize by occupancy, before computing the actual spatial information.
There are a couple quirks with this approach:
It seems it might make more sense that we could have a general compute_spatial_info function, that takes in a 1d or 2d array that is the binned firing rate (normalized as desired), and then simply computes the spatial information.
@maestapereira & @claire98han - I'm tagging you in since you've both worked on this code and explored some place cell analyses - so let me know if you have any thoughts here!
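A sketch of what such a general compute_spatial_info could look like (Skaggs information in bits/spike; this is a proposed interface, not the current implementation):

```python
import numpy as np

def compute_spatial_info_sketch(bin_firing, occupancy):
    # bin_firing: 1d or 2d binned firing rate (normalized as desired);
    # occupancy: time spent per bin, same shape as bin_firing.
    occ_prob = occupancy / np.sum(occupancy)
    mean_rate = np.sum(occ_prob * bin_firing)
    nz = bin_firing > 0                     # skip empty bins (log of 0)
    rel_rate = bin_firing[nz] / mean_rate
    return np.sum(occ_prob[nz] * rel_rate * np.log2(rel_rate))

# A uniform rate map carries no spatial information:
uniform = compute_spatial_info_sketch(np.full((4, 4), 2.0), np.ones((4, 4)))
```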
A running list of quirks with the current version that we might fix:
1) The name spatial/utils/compute_bin_time is a bit confusing, as 'bin' does not refer to spatial bins (as is generally standard across the rest of spatial), but instead refers to something like "compute time bins of sampling".
2) The shuffling functions shuffle_bins and shuffle_circular throw errors when there is more than one spike within a single millisecond.
- convert_times_to_train: asking for an increase in sampling rate
- sim_spiketimes
3) The naming of parameters that refer to time indices, timepoints, and timestamps is not consistent.
- utils/epoch: epoch_data_by_time
- utils/extract.py: get_value_range, get_ind_by_time, get_inds_by_times, get_value_by_time, get_values_by_times, get_values_by_time_range, threshold_spikes_by_times, threshold_spikes_by_values
4) Since the plts.annotate functions can and do get imported for use by the user, they shouldn't have a leading underscore. Also, some related fixes in plts.annotate, fixing _add_side_text.
Some things to check and potentially update in the code:
- When using check_orientation to infer array orientation, there can be an issue with 1-element arrays, which can't be inferred properly (see #162)
- The shuffle_poisson function, including permute_vector: subsequent shuffles aren't independent of previous ones, but rather are a "step" over (see stats tutorial, and also #163)
- detect_empty_ranges and compute_presence_ratio

There appears to be an issue with shuffle-bins ('bincirc'), that needs digging into.
Updates:
- plot_rasters
- plot_position_by_time should have an option to indicate events (such as responses)
I was trying to mix plt plot functions with spiketools plot functions and manually add a legend, and noticed the spiketools plot function label did not work properly.
Here are the scenarios:
ax = get_grid_subplot(grid, 1, slice(0, 3))
a1 = ax.plot(trace_times, trace_values, label='trace')
a2 = plot_scatter(spike_times, np.zeros(len(spike_times)), c='r', ax=ax, label='spikes')
ax.legend([a1, a2])
This shows a legend, but the spiketools plot has no label and the plt plot has a weird label (not "trace").
ax = get_grid_subplot(grid, 1, slice(0, 3))
a1 = ax.plot(trace_times, trace_values)
a2 = plot_scatter(spike_times, np.zeros(len(spike_times)), c='r', ax=ax)
ax.legend([a1, a2], ['trace', 'spikes'])
This doesn't show a legend (just a tiny square where the legend is supposed to be). Adding handles= and labels= to the above code in ax.legend also shows no legend.
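For reference, a sketch of one likely cause and workaround (assuming standard matplotlib behavior, not spiketools internals): ax.plot returns a list of Line2D objects, so passing that list directly as a legend handle breaks; unpacking it, and using the scatter's returned artist, gives a working legend.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for this sketch
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
trace_times = np.linspace(0, 10, 100)
a1, = ax.plot(trace_times, np.sin(trace_times), label='trace')  # unpack the list
a2 = ax.scatter([1, 3, 5], [0, 0, 0], c='r', label='spikes')    # PathCollection
leg = ax.legend(handles=[a1, a2])
```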
Currently the spatial information functions take spike X & Y data and pre-computed occupancy, and then re-bin the spike positions, given a bin definition. If passing a number of bins, this does not ensure that the binning is the same as was used to compute the occupancy, in which case the results are invalid. To fix.
https://spiketools.github.io/spiketools/auto_examples/index.html gives a 404 error.
Hi, I am reviewing spiketools as part of openjournals/joss-reviews#5268. Great documentation and tutorials. I appreciate the effort put into both. In going through the tutorials, I found a few points of potential improvement and typos to address.
bin_widths = np.array([1, 1, 1, 2, 1.5, 1, 1, 1, 2, 1, 1, 1, 1, 1, 0.5]) is defined but not used, and is therefore confusing. I suggest removing it.
The bins variable defined in that cell is also not used until considerably later, and might be confusing to define this early. I suggest moving the definition to right before it is used.
This code:
# Compute speed at each point
bin_widths = np.array([1, 1, 1, 2, 1.5, 1, 1, 1, 2, 1, 1, 1, 1, 1, 0.5])
is confusing - it is unclear where the values come from. I suggest replacing it with np.diff(timestamps), which is equivalent and clearer.
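Illustrating the suggested replacement, with hypothetical timestamps whose successive differences reproduce the hard-coded bin widths:

```python
import numpy as np

bin_widths = np.array([1, 1, 1, 2, 1.5, 1, 1, 1, 2, 1, 1, 1, 1, 1, 0.5])
# Hypothetical timestamps consistent with those widths:
timestamps = np.concatenate([[0], np.cumsum(bin_widths)])
# np.diff recovers the per-sample durations directly from the timestamps
durations = np.diff(timestamps)
```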
The comment # Compute the durations of the timestamps does not make sense to me. "the durations of the samples" might be better.
I think in the comment # Define speed threshold, used to position values for speed less than the threshold, "position" should be changed to "remove" or "ignore" or "remove position".
The shuffle poisson raster looks like one spike train permuted. Indeed, the source code says isis = permute_vector(compute_isis(poisson_spikes), n_permutations=n_shuffles), and permute_vector says that there is no randomness - the permutation is just moving an ISI and a spike from the end to the beginning. If your n_shuffles is small, then your random permutations will be very similar to each other while being quite different from the original data. Therefore, this would not serve as a good baseline distribution for stats. I recommend one of the following: add a warning about this behavior in the function documentation, introduce randomness such that poisson_generator is called once for each shuffle instead of just once, or introduce randomness in permute_vector.
In the code shuffled_spikes = shuffle_spikes(spikes, 'CIRCULAR', shuffle_min=200, n_shuffles=10), I imagine that the 'CIRCULAR' arg is not case sensitive, but having it in all caps perhaps suggests that it might be? Not important, but it just looks odd.
In the cell with df_pre_post = create_dataframe(data), I think it would be useful to print df_pre_post in the same cell to give readers a better understanding of the table.
Now that we have our data organized into a dataframe, we can run an ANOVA using.
- remove "using" at the end.
Next line analyze whether there is an of the event on firing rates
- "an of" -> "an effect"
This will give us a the surrogate distribution
- "a the" -> "a"
To do so, we will simulating spike trains across trials (8 seconds) across spatial bins.
- "simulating" -> "simulate"
Like in point 9 above, in the cell with df = create_dataframe_bins(bin_firing_all, dropna=True), I think it would be useful to print df in the same cell to give readers a better understanding of the table.
We should make tutorial pages for each of our core modules.
Tutorials to create:
Development suggestions:
- tutorials folder
Note: tutorial pages are something we will workshop and edit a bunch. Start with drafts, hitting any of the main points you think we should hit. It's fine to leave template sections (like: "do X here", or "describe Y here"). Based on the early drafts we can organize what to look further into, and how to edit these.
Examples from NeuroDSP:
Links to documentation tools we are using:
Naming things to check and make consistent:
There is an issue with the compute_spatial_bin_assignment function, in that it misplaces positions that were supposed to be in the last bin. Instead, they are placed in the second-to-last bin.
Here is an example:
position = np.array([[1.5, 2.5, 3.5, 4.5], [6.5, 7.5, 8.5, 9.5]])
x_edges = np.array([1, 2, 3, 4, 5])
y_edges = np.array([6, 7, 8, 9, 10])
compute_spatial_bin_assignment(position, x_edges, y_edges)
Expected output:
(array([1, 2, 3, 4], dtype=int64), array([1, 2, 3, 4], dtype=int64))
Actual output:
(array([1, 2, 3, 3], dtype=int64), array([1, 2, 3, 3], dtype=int64))
ToDo:
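For reference, a minimal np.digitize-based sketch (not the library's implementation) that yields the expected assignments for the example above, with values on the right-most edge clipped into the last bin:

```python
import numpy as np

def assign_bins(values, edges):
    # np.digitize gives 1-indexed bins for values in [edges[0], edges[-1]);
    # clip so a value equal to the last edge lands in the last bin.
    return np.minimum(np.digitize(values, edges), len(edges) - 1)

position = np.array([[1.5, 2.5, 3.5, 4.5], [6.5, 7.5, 8.5, 9.5]])
x_edges = np.array([1, 2, 3, 4, 5])
y_edges = np.array([6, 7, 8, 9, 10])
x_bins = assign_bins(position[0], x_edges)   # expected: [1, 2, 3, 4]
y_bins = assign_bins(position[1], y_edges)   # expected: [1, 2, 3, 4]
```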