
Comments (13)

lvdmaaten commented on June 9, 2024

FFI = Foreign function interface

I think there exist several FFIs for Python; I've been using ctypes in the past.
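
For readers who have not used ctypes, here is a minimal sketch of what an FFI call looks like from Python. It calls abs from the C standard library, not anything in bhtsne, and it assumes a POSIX system (Linux or macOS) where the C library can be found.

```python
import ctypes
import ctypes.util

# Locate the C standard library. On POSIX, if find_library returns None,
# CDLL(None) falls back to the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: int abs(int)
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

result = libc.abs(-42)
print(result)
```

Declaring argtypes and restype up front lets ctypes check and convert the arguments instead of guessing.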

from bhtsne.

gpapadop79 commented on June 9, 2024

result.dat is created in a temporary folder. In your case, the wrapper tries to create it under /var.

Check whether your user has permission to create folders in /var.

Nightrord commented on June 9, 2024

@gpapadop79 thanks for the info. The input matrix is large (95 rows and 745544 columns); if there is not enough memory, could that cause this problem? And since the folder is a temporary one, how do I check the permission to create folders?

Thanks in advance

lvdmaaten commented on June 9, 2024

Is the t-SNE algorithm itself actually being run? Do you see a loss being printed every 50 iterations or something like that? The result file should be really small (a 95x2 matrix), so I would be surprised if this were an OOM problem.

To check permissions, you can run something like ls -la /var/folders/92/8ty0c6392m773r5tbp4s9gy80000gp/T/tmpakdFT0 and confirm that w is set for all three classes of users (owner, group, other). Alternatively, try running this with sudo to see if that helps.
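
The same permission check can also be done from Python. This is only a sketch: can_write_to is a hypothetical helper, not part of bhtsne.py, and it inspects the system temp root because the tmpXXXX folder itself no longer exists once the wrapper has exited.

```python
import os
import tempfile

def can_write_to(directory):
    """Return True if the current user can create entries in `directory`."""
    # Creating a file in a directory needs both the write and the
    # execute (search) bit on that directory.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)

tmp_root = tempfile.gettempdir()  # parent folder that mkdtemp() uses by default
print(tmp_root, can_write_to(tmp_root))
```

If this prints False, the wrapper cannot create its working folder at all, which would point at a permission problem rather than at t-SNE itself.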

@rohit-gupta Perhaps it makes sense to have an input option to specify the folder for intermediate results? I am not a Python user, so I am not familiar with the exact behavior of mkdtemp().
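
On the mkdtemp() behavior: tempfile.mkdtemp creates a new, uniquely named directory that only the creating user can read, write, and search, and it accepts a dir argument to choose the parent folder. A minimal sketch of how such an option could look follows; make_workdir and the "bhtsne_" prefix are hypothetical names, not the wrapper's actual code.

```python
import os
import shutil
import tempfile

def make_workdir(parent=None):
    """Create a fresh working directory, optionally under a chosen parent.

    With parent=None, mkdtemp falls back to tempfile.gettempdir().
    """
    return tempfile.mkdtemp(prefix="bhtsne_", dir=parent)

# Stand-in for a folder the user would pass on the command line.
chosen_parent = tempfile.mkdtemp()

default_dir = make_workdir()
custom_dir = make_workdir(parent=chosen_parent)
print(default_dir)
print(custom_dir)

# mkdtemp never cleans up after itself; the caller has to.
shutil.rmtree(default_dir)
shutil.rmtree(chosen_parent)
```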

Nightrord commented on June 9, 2024

The t-SNE algorithm doesn't run; it directly shows the error: IOError: [Errno 2] No such file or directory: '/var/folders/92/8ty0c6392m773r5tbp4s9gy80000gp/T/tmpllxm8j/result.dat'. I think the problem is that the input matrix is so large. I will try it on another machine.

lvdmaaten commented on June 9, 2024

Can you please copy-paste the full output? Does the data.dat file get written by the Python wrapper?

Nightrord commented on June 9, 2024

Here is the full output:

~/Projects/bhtsne (master*) $ python bhtsne.py -i ~/Projects/tsne_python/lan_uid_matrix_tsne1.txt -o ~/Dropbox/github/data/lan_uid_coordinate.txt -p 5 -d 2 -t 1 -v

Error: could not open data file.
Traceback (most recent call last):
  File "bhtsne.py", line 233, in <module>
    exit(main(argv))
  File "bhtsne.py", line 224, in main
    verbose=argp.verbose, initial_dims=argp.initial_dims, use_pca=argp.use_pca, max_iter=argp.max_iter):
  File "bhtsne.py", line 211, in run_bh_tsne
    for result in bh_tsne(tmp_dir_path, verbose):
  File "bhtsne.py", line 164, in bh_tsne
    with open(path_join(workdir, 'result.dat'), 'rb') as output_file:
IOError: [Errno 2] No such file or directory: '/var/folders/92/8ty0c6392m773r5tbp4s9gy80000gp/T/tmpllxm8j/result.dat'

The problem is that it cannot find the intermediate file result.dat.

lvdmaaten commented on June 9, 2024

The way the code works is: (1) Python wrapper writes data.dat, (2) binary runs t-SNE on data.dat, (3) binary writes results into result.dat, and (4) Python wrapper reads result.dat. Therefore, we first need to determine in which step things go wrong. The output suggests the problem is actually in step 1. Can you confirm by checking whether or not data.dat gets written?
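
The four steps above suggest a simple way to narrow things down: look at which files exist in the working folder. The sketch below is a hypothetical helper (diagnose is not part of bhtsne.py) and it assumes you can inspect the folder while the binary is running or before the wrapper cleans it up.

```python
import os

def diagnose(workdir):
    """Guess which of the four steps failed from the files left in workdir."""
    data_path = os.path.join(workdir, "data.dat")
    result_path = os.path.join(workdir, "result.dat")

    if not os.path.isfile(data_path):
        return "step 1: the wrapper did not write data.dat"
    if not os.path.isfile(result_path):
        return "step 2 or 3: the binary did not produce result.dat"
    if os.path.getsize(result_path) == 0:
        return "step 3: result.dat exists but is empty"
    return "steps 1-3 look fine; check step 4 (reading result.dat)"
```

For example, diagnose("/tmp/tmpiXA4u_") on a folder without data.dat would report step 1.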

Nightrord commented on June 9, 2024

The file data.dat has been written, but there is no result.dat. I think the problem happens in step 3.

Nightrord commented on June 9, 2024

I ran the data on another computer and hit the same problem. Here is the output:

Traceback (most recent call last):
  File "bhtsne.py", line 233, in <module>
    exit(main(argv))
  File "bhtsne.py", line 224, in main
    verbose=argp.verbose, initial_dims=argp.initial_dims, use_pca=argp.use_pca, max_iter=argp.max_iter):
  File "bhtsne.py", line 206, in run_bh_tsne
    init_bh_tsne(input_file, tmp_dir_path, no_dims=no_dims, perplexity=perplexity, theta=theta, randseed=randseed,verbose=verbose, initial_dims=initial_dims, use_pca=use_pca, max_iter=max_iter)
  File "bhtsne.py", line 118, in init_bh_tsne
    cov_x = np.dot(np.transpose(samples), samples)
MemoryError
Error: could not open data file.
Traceback (most recent call last):
  File "bhtsne.py", line 233, in <module>
    exit(main(argv))
  File "bhtsne.py", line 224, in main
    verbose=argp.verbose, initial_dims=argp.initial_dims, use_pca=argp.use_pca, max_iter=argp.max_iter):
  File "bhtsne.py", line 211, in run_bh_tsne
    for result in bh_tsne(tmp_dir_path, verbose):
  File "bhtsne.py", line 164, in bh_tsne
    with open(path_join(workdir, 'result.dat'), 'rb') as output_file:
IOError: [Errno 2] No such file or directory: '/tmp/tmpiXA4u_/result.dat'

It still cannot find result.dat, and it also reports a MemoryError.
This time, the data.dat file has not been written.
I think the problem is that the Python wrapper does not write data.dat.
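
A rough size estimate makes the MemoryError at line 118 unsurprising. np.dot(np.transpose(samples), samples) builds a square matrix with one row and one column per input column. Assuming samples holds 8-byte float64 values (an assumption; the thread does not state the dtype), the arithmetic for the 95 x 745544 input looks like this:

```python
n_rows, n_cols = 95, 745544   # matrix shape reported earlier in the thread
itemsize = 8                  # bytes per value, assuming float64

# transpose(samples) . samples has shape (n_cols, n_cols)
cov_bytes = n_cols * n_cols * itemsize

# for comparison only: samples . transpose(samples) has shape (n_rows, n_rows)
gram_bytes = n_rows * n_rows * itemsize

print("columns x columns: %.2f TB" % (cov_bytes / 1e12))
print("rows x rows:       %.1f kB" % (gram_bytes / 1e3))
```

That is roughly 4.4 TB for the product computed at line 118, far more than a typical machine has, so the failure happens before data.dat is written, regardless of folder permissions.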

EMCP commented on June 9, 2024

@lvdmaaten, just wanted to ask: for those of us using the wrapper from within another Python program, we should not expect to see a data.dat file, since the data is fed in via the call to run_bh_tsne, correct?

lvdmaaten commented on June 9, 2024

The Python wrapper writes a data.dat file and then calls the bh_tsne binary. The binary writes the results to a result.dat file, which the wrapper reads in. Afterwards, the wrapper deletes both .dat files. So you would expect to see a data.dat file whilst the binary is running.

The whole thing is pretty clunky... I've been meaning to change this to an FFI call, but I haven't got around to doing that yet.

EMCP commented on June 9, 2024

Understood. I guess I'm not seeing a data.dat then, but that's likely related to the NumPy crash I'm experiencing on OS X...

Edit: After installing OpenBLAS and compiling NumPy, the wrapper works (except inside a Jupyter notebook, but that's an understood limitation, I think).

One last thing, @lvdmaaten: does FFI mean https://cffi.readthedocs.io/en/latest/overview.html ?
