Comments (27)
The UUIDs I cannot smooth with my non-patched pytables plus good version of libhdf5:
- Katja's 66a392f402f811e5acb4d850e6c4a608 (20150525_180912.mainbrain.h5)
- Matthew's edd9d0dc23f811e5a038bcee7bdac3c6 (20150706_180558.mainbrain.h5)
Right now I think these might be genuinely broken mainbrain files that show a different problem from the one that appeared after the pytables update.
I run this command:
flydra_analysis_export_flydra_hdf5 --dest-file ~/katja.h5 /mnt/strawscience/data/auto_pipeline/raw_archive/by_uuid/66a392f402f811e5acb4d850e6c4a608/*.mainbrain.h5
And I get this log (same for Matthew's file)
STAGE 1: finding timestamps
opening file /mnt/strawscience/data/auto_pipeline/raw_archive/by_uuid/66a392f402f811e5acb4d850e6c4a608/20150525_180912.mainbrain.h5...
caching raw 2D data... done
(cached index of 32154178 frame values of dtype int64)
hostname time_gain time_offset
-------- --------- -----------
'localhost' 1.0 -0.000182314803206
caching Kalman obj_ids...
finding unique obj_ids...
(found 16928)
(will export 16928)
finding 2d data for each obj_id...
/home/santi/Proyectos/imp/software/flydra/flydra/a2/data2smoothed.py:165: UserWarning: no host flycube6 in timestamp data. making up data.
'data.'%remote_hostname)
STAGE 2: running Kalman smoothing operation
detected file loaded with dynamic model "EKF mamarama, units: mm"
for smoothing, will use dynamic model "mamarama, units: mm"
/home/santi/Proyectos/imp/software/flydra/flydra/a2/core_analysis.py:1554: UserWarning: passing data_file as string to core_analysis.CachingAnalyzer.load_data()
warnings.warn('passing data_file as string to '
/home/santi/Utils/Science/anaconda/lib/python2.7/site-packages/adskalman-0.3.4-py2.7.egg/adskalman/adskalman.py:453: RuntimeWarning: invalid value encountered in isnan
Traceback (most recent call last):
File "/home/santi/Utils/Science/anaconda/bin/flydra_analysis_export_flydra_hdf5", line 9, in <module>
load_entry_point('flydra==0.6.6', 'console_scripts', 'flydra_analysis_export_flydra_hdf5')()
File "/home/santi/Proyectos/imp/software/flydra/flydra/a2/data2smoothed.py", line 311, in export_flydra_hdf5
main(hdf5_only=True)
File "/home/santi/Proyectos/imp/software/flydra/flydra/a2/data2smoothed.py", line 410, in main
**kwargs)
File "/home/santi/Proyectos/imp/software/flydra/flydra/a2/data2smoothed.py", line 259, in convert
**kwargs)
File "/home/santi/Proyectos/imp/software/flydra/flydra/a2/core_analysis.py", line 1663, in load_data
elevation_up_bias_degrees=elevation_up_bias_degrees,
File "/home/santi/Proyectos/imp/software/flydra/flydra/a2/core_analysis.py", line 943, in query_results
allocate_space_for_direction=have_body_axis_information,
File "/home/santi/Proyectos/imp/software/flydra/flydra/a2/core_analysis.py", line 372, in observations2smoothed
dynamic_model_name=dynamic_model_name)
File "/home/santi/Proyectos/imp/software/flydra/flydra/a2/core_analysis.py", line 347, in kalman_smooth
valid_data_idx=idx)
File "build/bdist.linux-x86_64/egg/adskalman/adskalman.py", line 565, in kalman_smoother
File "build/bdist.linux-x86_64/egg/adskalman/adskalman.py", line 455, in kalman_filter
ValueError: cannot do Kalman filtering with nan values in parameters
Closing remaining open files:/mnt/strawscience/data/auto_pipeline/raw_archive/by_uuid/66a392f402f811e5acb4d850e6c4a608/20150525_180912.mainbrain.kh5-smoothcache...done/mnt/strawscience/data/auto_pipeline/raw_archive/by_uuid/66a392f402f811e5acb4d850e6c4a608/20150525_180912.mainbrain.h5...done
from flydra.
With the files in the previous comment I seem to have 0% success rate.
With the other files, for which smoothing crashes randomly in strawcore, I seem to have a 100% success rate.
I guess I could systematically explore the last 3 months of experiments. So far, when crashes happen on my side, they happen at the very beginning.
from flydra.
There is a stochastic part to this bug. I have run the following script overnight on my machine and on strawcore:
#!/bin/bash
FILES="
20150525_180912.mainbrain.h5
20150624_175622.mainbrain.h5
20150703_173108.mainbrain.h5
20150706_180558.mainbrain.h5
20150703_173318.mainbrain.h5
20150703_174525.mainbrain.h5
20150702_174748.mainbrain.h5
20150702_175828.mainbrain.h5
20150702_175709.mainbrain.h5
20150630_175211.mainbrain.h5
20150702_175121.mainbrain.h5
"
while true
do
    for f in $FILES
    do
        # year and month are taken from the file name (YYYYMMDD_HHMMSS...)
        y=${f:0:4}
        m=${f:4:2}
        #ipath=/mnt/strawscience/
        path="/mnt/strawscience/data/auto_pipeline/raw_archive/by_date/$y/$m/$f"
        cp --update "$path" "$f"
        # remove any existing smooth cache before running the export
        fn=${f:0:25}
        cachefn="$fn.kh5-smoothcache"
        rm -f "$cachefn"
        echo -n "$f "
        flydra_analysis_export_flydra_hdf5 "$f" --dest-file /dev/null >"${f}.log" 2>&1
        if [ $? -eq 0 ]; then
            echo OK
        else
            echo FAIL
        fi
    done
done
On my machine
20150525_180912.mainbrain.h5 OK
20150624_175622.mainbrain.h5 OK
20150703_173108.mainbrain.h5 FAIL
20150706_180558.mainbrain.h5 FAIL
20150703_173318.mainbrain.h5 OK
20150703_174525.mainbrain.h5 FAIL
20150702_174748.mainbrain.h5 OK
20150702_175828.mainbrain.h5 OK
20150702_175709.mainbrain.h5 OK
20150630_175211.mainbrain.h5 FAIL
20150702_175121.mainbrain.h5 OK
20150525_180912.mainbrain.h5 OK
20150624_175622.mainbrain.h5 OK
20150703_173108.mainbrain.h5 FAIL
20150706_180558.mainbrain.h5 FAIL
20150703_173318.mainbrain.h5 FAIL
20150703_174525.mainbrain.h5 FAIL
20150702_174748.mainbrain.h5 OK
20150702_175828.mainbrain.h5 OK
20150702_175709.mainbrain.h5 OK
20150630_175211.mainbrain.h5 FAIL
20150702_175121.mainbrain.h5 OK
20150525_180912.mainbrain.h5 OK
20150624_175622.mainbrain.h5 OK
20150703_173108.mainbrain.h5 FAIL
20150706_180558.mainbrain.h5 FAIL
20150703_173318.mainbrain.h5 passes once and fails once.
On strawcore
20150525_180912.mainbrain.h5 OK
20150624_175622.mainbrain.h5 OK
20150703_173108.mainbrain.h5 FAIL
20150706_180558.mainbrain.h5 FAIL
20150703_173318.mainbrain.h5 OK
20150703_174525.mainbrain.h5 FAIL
20150702_174748.mainbrain.h5 OK
20150702_175828.mainbrain.h5 OK
20150702_175709.mainbrain.h5 OK
20150630_175211.mainbrain.h5 FAIL
20150702_175121.mainbrain.h5 OK
20150525_180912.mainbrain.h5 OK
20150624_175622.mainbrain.h5 OK
20150703_173108.mainbrain.h5 FAIL
20150706_180558.mainbrain.h5 FAIL
20150703_173318.mainbrain.h5 OK
20150703_174525.mainbrain.h5 FAIL
20150702_174748.mainbrain.h5 OK
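To make the stochastic part easier to see, the OK/FAIL lines printed by the loop can be tallied per file. This is only a minimal sketch: the function names `tally` and `flaky` are hypothetical, and the sample lines are a small excerpt of the "my machine" output above.

```python
from collections import defaultdict


def tally(lines):
    """Count OK and FAIL outcomes per file from '<filename> <status>' lines."""
    counts = defaultdict(lambda: {"OK": 0, "FAIL": 0})
    for line in lines:
        parts = line.split()
        # Skip anything that is not exactly '<filename> OK' or '<filename> FAIL',
        # e.g. a run that was interrupted before printing its status.
        if len(parts) == 2 and parts[1] in ("OK", "FAIL"):
            counts[parts[0]][parts[1]] += 1
    return dict(counts)


def flaky(counts):
    """Return the files that both passed and failed at least once."""
    return sorted(name for name, c in counts.items() if c["OK"] and c["FAIL"])


sample = [
    "20150525_180912.mainbrain.h5 OK",
    "20150703_173108.mainbrain.h5 FAIL",
    "20150703_173318.mainbrain.h5 OK",
    "20150525_180912.mainbrain.h5 OK",
    "20150703_173108.mainbrain.h5 FAIL",
    "20150703_173318.mainbrain.h5 FAIL",
]

counts = tally(sample)
print(flaky(counts))
```

Files that always fail or always pass are not reported by `flaky`; only files with mixed outcomes are.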
from flydra.
@nzjrs but your results about stochasticity are without 9179b57, right? I predict with that commit you won't have stochasticity anymore.
from flydra.
those files which @sdvillal and I have never been able to process - are those corrupt and unrecoverable?
from flydra.
@astraw So far this made it worse (testing with master)
20150525_180912.mainbrain.h5 FAIL
20150624_175622.mainbrain.h5 FAIL
20150703_173108.mainbrain.h5 FAIL
20150706_180558.mainbrain.h5 FAIL
sample log
cat 20150525_180912.mainbrain.h5.log
STAGE 1: finding timestamps
opening file 20150525_180912.mainbrain.h5...
caching raw 2D data.../home/stowers/Straw/flydra.git/flydra/a2/data2smoothed.py:165: UserWarning: no host flycube6 in timestamp data. making up data.
'data.'%remote_hostname)
/home/stowers/Straw/flydra.git/flydra/a2/core_analysis.py:1554: UserWarning: passing data_file as string to core_analysis.CachingAnalyzer.load_data()
warnings.warn('passing data_file as string to '
done
(cached index of 32154178 frame values of dtype int64)
hostname time_gain time_offset
-------- --------- -----------
'localhost' 0.999999999999 0.000729657720369
caching Kalman obj_ids...
finding unique obj_ids...
(found 16928)
(will export 16928)
finding 2d data for each obj_id...
STAGE 2: running Kalman smoothing operation
detected file loaded with dynamic model "EKF mamarama, units: mm"
for smoothing, will use dynamic model "mamarama, units: mm"
Traceback (most recent call last):
File "/home/stowers/.virtualenvs/flydranew/bin/flydra_analysis_export_flydra_hdf5", line 9, in <module>
load_entry_point('flydra==0.6.6', 'console_scripts', 'flydra_analysis_export_flydra_hdf5')()
File "/home/stowers/Straw/flydra.git/flydra/a2/data2smoothed.py", line 311, in export_flydra_hdf5
main(hdf5_only=True)
File "/home/stowers/Straw/flydra.git/flydra/a2/data2smoothed.py", line 410, in main
**kwargs)
File "/home/stowers/Straw/flydra.git/flydra/a2/data2smoothed.py", line 259, in convert
**kwargs)
File "/home/stowers/Straw/flydra.git/flydra/a2/core_analysis.py", line 1663, in load_data
elevation_up_bias_degrees=elevation_up_bias_degrees,
File "/home/stowers/Straw/flydra.git/flydra/a2/core_analysis.py", line 943, in query_results
allocate_space_for_direction=have_body_axis_information,
File "/home/stowers/Straw/flydra.git/flydra/a2/core_analysis.py", line 372, in observations2smoothed
dynamic_model_name=dynamic_model_name)
File "/home/stowers/Straw/flydra.git/flydra/a2/core_analysis.py", line 347, in kalman_smooth
valid_data_idx=idx)
File "/home/stowers/Straw/adskalman.git/adskalman/adskalman.py", line 566, in kalman_smoother
full_output=full_output)
File "/home/stowers/Straw/adskalman.git/adskalman/adskalman.py", line 454, in kalman_filter
raise ValueError("cannot do Kalman filtering with nan values in %s (shape %r)" % (name,arr.shape))
ValueError: cannot do Kalman filtering with nan values in R (shape (21642, 3, 3))
Closing remaining open files:20150525_180912.mainbrain.h5...done/mnt/ssd/CORRUPT/20150525_180912.mainbrain.kh5-smoothcache...done
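The error in this log names the offending array (R, with shape (21642, 3, 3)). A guard of that kind, plus a way to find which entries along the first axis contain NaNs, could be sketched as follows. The helper names `check_no_nans` and `nan_rows` are hypothetical and this is not the adskalman code; only the message format is copied from the traceback above.

```python
import numpy as np


def check_no_nans(**named_arrays):
    """Raise ValueError naming the first array that contains a NaN."""
    for name, arr in named_arrays.items():
        arr = np.asarray(arr, dtype=float)
        if np.isnan(arr).any():
            raise ValueError(
                "cannot do Kalman filtering with nan values in %s (shape %r)"
                % (name, arr.shape))


def nan_rows(arr):
    """Return indices along the first axis whose entries contain any NaN."""
    arr = np.asarray(arr, dtype=float)
    flat = arr.reshape(len(arr), -1)
    return np.flatnonzero(np.isnan(flat).any(axis=1))


# Toy stand-in for a per-frame 3x3 covariance stack with one bad frame.
R = np.zeros((4, 3, 3))
R[2, 0, 0] = np.nan

try:
    check_no_nans(A=np.eye(3), R=R)
    message = None
except ValueError as err:
    message = str(err)

print(message)
print(nan_rows(R))
```

Knowing which frames carry the NaNs helps decide whether they sit inside or outside the valid observation range.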
from flydra.
So far @sdvillal gets a prize for fixing the crash with fce3436
20150525_180912.mainbrain.h5 OK
20150624_175622.mainbrain.h5 OK
20150703_173108.mainbrain.h5 OK
20150706_180558.mainbrain.h5 OK
20150703_173318.mainbrain.h5 OK
20150703_174525.mainbrain.h5 OK
the question now is: are the final h5 files the same?
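One cheap first check for "are the files the same" is to group output files by checksum. This is a sketch under a strong assumption: byte-identical files certainly hold the same data, but two HDF5 files can hold the same data and still differ in bytes, so a mismatch here only means a dataset-level comparison is needed. The function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical(paths):
    """Group paths whose contents are byte-for-byte identical."""
    groups = defaultdict(list)
    for path in paths:
        groups[sha256_of(path)].append(path)
    return list(groups.values())
```

A single group containing every run of a given mainbrain file would mean the outputs are byte-identical across runs.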
from flydra.
In my opinion, fce3436 is not a fix but a dangerous workaround that lets a bug propagate bad data into our system. I would delete those resulting processed files as they definitely will contain problematic results. I am working on a real fix but my internet access here is really terrible and hence I'm slow with a real fix.
from flydra.
Some observations:
I'm concerned that this bug started getting hit frequently in the last few days (07/02 and 07/03). This coincided with the upgrading of flydra / pytables on strawcore. Maybe that is coincidence. Maybe not.
Are you suggesting your fix, which made all those files impossible to process, was the correct fix or a partial fix? My reading of "propagate bad data into our system" is that the original files contained invalid data and therefore should fail. Thus your fix was correct. Yet this is a terribly large loss of data.
I understand that you don't have good internet for a fix, but a little communication on what you think the outcome is, or what you are currently thinking, would be welcome. As you might recall, regardless of this particular problem, we have gone down a one-way street (that we can't back out of) by moving to new pytables (as a rollback to old version leaves recent h5 files unreadable).
I am also very nervous about this whole pytables monkey patching business.
Between isilon and this, it has been a complete write-off week for us here.
from flydra.
These are the parallel smoothing + simple flydra comparison scripts I'm using
I'm right now running these in strz (Non sunt multiplicanda entia sine necessitate, I know...), using the same conda environment but for 3 different flydra versions:
- 066: release 0.6.6 (with empty array initialisation)
- master: 73bd9df (current master, with nan initialisation + the workaround for the race condition)
- nonans: fce3436 (the reverted, initialise to nonnan commit)
I will maybe also run the same setup in str22 (my machine). In any case, I will report here the results once they are there (sneak peek, both 066 and master fail to smooth most of these files).
from flydra.
I have added a new variant to the mix:
- nonans0: 614d469 like nonans, but initialising to 0; in my mind it should reproduce the errors I had with initialisation to empty, maybe showing that we are using data outside valid observations ranges somewhere other than in nanchecks.
I'm running in both strz and str22, to check if the distro / processor make a difference.
@astraw Let me know if at any time you would like to run the tests with more proper fixes, it takes me no time now.
from flydra.
Will test it very soon. In case it does not fix it, I will call this variant:
- masterskip: a91f618 like master but with context manager for pytables + non-contiguous skipping fix
Anything that fixes it needs to be proven also in production
from flydra.
Santi, let me know a reduced subset you want me to test today (which HEADs
etc). I need my computer to not be painfully slow for the next few hours.
On 10 July 2015 at 09:31, Santi Villalba wrote:
Will test it very soon
from flydra.
mmmm let me write a bit of analysis code (yes, I cannot help it, will put stuff in a pandas dataframe) and I will tell you better but...
Yesterday I was very puzzled by you being able to smooth successfully these two in strawcore with 0.6.6:
- Katja's 66a392f402f811e5acb4d850e6c4a608 (20150525_180912.mainbrain.h5)
- Matthew's edd9d0dc23f811e5a038bcee7bdac3c6 (20150706_180558.mainbrain.h5)
Could you try again (0.6.6, these two) and tell me how often do you succeed?
from flydra.
Sure no problem, with master? with conda? with ubuntu?
On 10 July 2015 at 09:41, Santi Villalba wrote:
mmmm let me write a bit of analysis code (yes, I cannot help it, will put stuff in a pandas dataframe) and I will tell you better but...
Yesterday I was very puzzled by you being able to smooth successfully these two in strawcore with 0.6.6:
- Katja's 66a392f402f811e5acb4d850e6c4a608 (20150525_180912.mainbrain.h5)
- Matthew's edd9d0dc23f811e5a038bcee7bdac3c6 (20150706_180558.mainbrain.h5)
Could you try again (0.6.6, these two) and tell me how often do you succeed?
from flydra.
Actually, I have changed my opinion. Better to try first with Andrew's last commit (master) and ubuntu. If that fixes it, we will be in a much better position, with less work to do.
If not, I meant the released 0.6.6 (so choose conda or ubuntu, I would start with conda). I have a hypothesis about why these files might have been failing until then, which I will try to falsify later on (by checking if nans make it to these arrays when initialised to empty).
From my side, I will in this order:
- try first Andrew's last commit
- write that little results analysis code + report results
- if we still have the problem, maybe debug a bit flydra (why should I not have the fun too?)
from flydra.
conda with activate or without (as I get different results)?
from flydra.
It is interesting. If we get different results with and without accelerate, I believe it means we are looking at blas/lapack dependent errors. I would just use accelerate to limit options (that's what I'm using everywhere, so we could cross-compare).
The first few minutes of tests with masterskip in strz look promising.
from flydra.
maybe I wait. your tests are much faster than mine.
can you also test on strz outside of conda? i.e. it's just stock ubuntu, right?
(oh, and as I said, they are different stochastic results, so not really deterministically different)
from flydra.
Yes, just run master with ubuntu, that is the interesting bit, and otherwise let your machine and yourself take a rest.
I'm afraid that even if using stock ubuntu in strz is not a big deal, that cannot make it for a proper test in production (it is a 14.04 and I do not install from Andrew's repos).
from flydra.
ok, its running now. I put the interesting files (for you) first in the list
On 10 July 2015 at 10:25, Santi Villalba wrote:
Yes, just run master with ubuntu, that is the interesting bit, and otherwise let your machine and yourself take a rest.
I'm afraid that even if using stock ubuntu in strz is not a big deal, that cannot make it for a proper test in production (it is a 14.04).
from flydra.
So far no crashes with master, as expected given we just drop the invalid data.
20150525_180912.mainbrain.h5 OK
20150706_180558.mainbrain.h5 OK
20150624_175622.mainbrain.h5 OK
20150703_173108.mainbrain.h5 OK
from flydra.
Same good thing in strz, all uuids have been smoothed at least once (yohooo!). Not counting the chickens before they're hatched, these fixes make much sense and could even explain why we were having so many different varying and confusing failures (by just understanding the differences of np.empty between machines + smoothing dynamics with senseless data).
Let's compare these files when all is finished. We should also assess how many trajectories we are removing and think about why these trajectories with holes are happening in mainbrain files. All things being good, we should just resmooth all mainbrains that had trajectories with holes, and probably we will be able to rescue the data for experiments like these examples from Katja and Matthew.
strz 20150703_173318.mainbrain.h5 0 OK
strz 20150706_180558.mainbrain.h5 0 OK
strz 20150703_174525.mainbrain.h5 0 OK
strz 20150706_180558.mainbrain.h5 1 OK
strz 20150703_173318.mainbrain.h5 1 OK
strz 20150703_173108.mainbrain.h5 0 OK
strz 20150702_175709.mainbrain.h5 0 OK
strz 20150624_175622.mainbrain.h5 1 OK
strz 20150624_175622.mainbrain.h5 0 OK
strz 20150525_180912.mainbrain.h5 1 OK
strz 20150525_180912.mainbrain.h5 0 OK
strz 20150703_174525.mainbrain.h5 1 OK
strz 20150703_173108.mainbrain.h5 1 OK
strz 20150702_175828.mainbrain.h5 0 OK
strz 20150702_175709.mainbrain.h5 1 OK
strz 20150702_174748.mainbrain.h5 0 OK
strz 20150703_173318.mainbrain.h5 2 OK
strz 20150525_180912.mainbrain.h5 2 OK
strz 20150702_175121.mainbrain.h5 0 OK
strz 20150624_175622.mainbrain.h5 2 OK
strz 20150703_173108.mainbrain.h5 2 OK
strz 20150706_180558.mainbrain.h5 2 OK
strz 20150703_174525.mainbrain.h5 2 OK
strz 20150630_175211.mainbrain.h5 0 OK
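The point about np.empty can be illustrated with a small sketch. It is not flydra's code, and the array names and the frame layout are made up for the example: a buffer is filled only at valid frames, and the frames in between keep whatever the initialisation left there.

```python
import numpy as np

n_frames = 6
valid_idx = np.array([0, 1, 4, 5])  # frames 2 and 3 have no observation
observations = np.arange(12, dtype=float).reshape(4, 3)

# np.empty does not initialise memory: frames 2 and 3 hold arbitrary values
# that may differ between runs and machines, and may or may not be NaN.
unsafe = np.empty((n_frames, 3))
unsafe[valid_idx] = observations

# Initialising to NaN makes the missing frames explicit and detectable.
safe = np.full((n_frames, 3), np.nan)
safe[valid_idx] = observations

missing = np.isnan(safe).any(axis=1)
print(missing)
```

With the NaN-initialised buffer, any code that reads outside the valid range fails loudly on every run instead of silently using leftover memory on some of them.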
from flydra.
Yes, although I'm more concerned about writing garbage than inducing it during read because of np.empty
from flydra.
Agreed, that is the scariest part
from flydra.
Quick summary of smoothing in strz with masterskip:
- No errors, all runs produce same files (therefore no indeterminism)
- There were very few obj_ids with holes in each file, and in all but two cases these happened at the beginning (obj_id < 200, and usually much earlier)
The special cases with respect to holes are:
- 20150702_175709 did not have holes; actually smoothing never failed for me with that file, which is reassuring
- 20150624_175622 has a hole in obj_id 500019
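For reference, this is roughly how holes can be looked for, assuming a "hole" means that the frame numbers recorded for one obj_id are not consecutive. It is a sketch on toy arrays, not the code in flydra, and the function names are hypothetical.

```python
import numpy as np


def find_holes(frames):
    """Return (last_frame_before, first_frame_after) for each gap in sorted frames."""
    frames = np.asarray(frames)
    gaps = np.flatnonzero(np.diff(frames) > 1)
    return [(int(frames[i]), int(frames[i + 1])) for i in gaps]


def obj_ids_with_holes(obj_ids, frames):
    """Map each obj_id with non-contiguous frames to its list of gaps."""
    obj_ids = np.asarray(obj_ids)
    frames = np.asarray(frames)
    result = {}
    for obj_id in np.unique(obj_ids):
        own_frames = np.sort(frames[obj_ids == obj_id])
        holes = find_holes(own_frames)
        if holes:
            result[int(obj_id)] = holes
    return result


# Toy data: obj_id 1 is contiguous, obj_id 2 skips frames 22 to 24.
toy_obj_ids = [1, 1, 1, 2, 2, 2, 2]
toy_frames = [10, 11, 12, 20, 21, 25, 26]
print(obj_ids_with_holes(toy_obj_ids, toy_frames))
```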
from flydra.
Cool.
20150624_175622 is not an exception; rather, there is a bug in 0.6.6 where obj ids start from 500000 rather than 1 (31118a5)
On 12/07/2015 1:11 PM, "Santi Villalba" wrote:
Quick summary of smoothing in strz with masterskip:
- No errors, all runs produce same files (therefore no indeterminism)
- There were very few obj_ids with holes in each trajectory, and in all but two cases these happened at the beginning (obj_id < 200, and usually much earlier)
The special cases with respect to holes are:
- 20150702_175709 did not have holes; actually smoothing never failed for me with that file, which is reassuring
- 20150624_175622 has a hole in obj_id 500019
from flydra.