Comments (6)
This turned out to be less trivial than hoped, because mmap() wants a file descriptor (int)
rather than a FILE*. fopen() and fclose() are the portable way to do things, but under
the covers they do buffering that we don't need for our mmap() call. On the bright side,
fileno() (POSIX, declared in stdio.h) gives us the file descriptor behind the FILE*.
I don't expect any problems from mmap'ing the region read-only as long as we also
fopen() in "r" (read-only) mode, even if stdio mmaps the same region under the covers.
All unit tests pass in both dynamic and non-dynamic modes with the attached patch,
so I think everything should be alright.
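For readers following along, here is a minimal sketch of the fileno() approach described in this comment. It is not the attached patch: `map_via_stdio` is a hypothetical name, the code is POSIX-only, and it assumes the whole file should be mapped read-only.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes fileno() under strict -std=c99 */
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper, not taken from the patch: maps a whole file
 * read-only using the descriptor underneath a FILE*. */
void *map_via_stdio(const char *path, size_t *length_out) {
  assert(path != NULL && length_out != NULL);

  FILE *fp = fopen(path, "r");
  if (fp == NULL) return NULL;

  int fd = fileno(fp);  /* descriptor owned by the stream */
  struct stat st;
  if (fd < 0 || fstat(fd, &st) != 0 || st.st_size == 0) {
    fclose(fp);
    return NULL;
  }

  void *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  fclose(fp);  /* also closes fd; an established mapping stays valid */
  if (base == MAP_FAILED) return NULL;

  *length_out = (size_t)st.st_size;
  return base;
}
```

The caller would release the region with munmap(base, length) when done. Nothing is read through the stream here, so the stdio buffer is never filled; the FILE* serves only as a way to obtain the descriptor.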
Reported by [email protected]
on 2014-08-13 10:02:42
- _Attachment: [patch](https://storage.googleapis.com/google-code-attachments/cld2/issue-19/comment-1/patch)_
from cld2.
I had a pro/con discussion with another engineer on this. Summary: the patch fixes the
problem, but it is a little sketchy because it mixes fopen() with mmap(). It would be
better to #include unistd.h and keep using open() instead of fopen(), but that is
specific to Unix and wouldn't work on Windows. That said, the code is *already* broken
on Windows because it uses sys/mman.h (for mmap).
So: if we use unistd.h, we deepen the problems for Windows. If we use fopen(), we do
something questionable on non-Windows (it works, but it isn't really the right thing
to do).
I think the answer here is to use unistd.h on Linux and a workaround like this
to fix win32:
// Based on definitions from http://sourceforge.net/p/predef/wiki/OperatingSystems/:
#include <fcntl.h>  // O_RDONLY and friends; declares open() on POSIX
#ifdef _WIN32
#include <io.h>  // _open, _close
#define OPEN _open
#define CLOSE _close
#else
#include <unistd.h>  // close
#define OPEN open
#define CLOSE close
#endif
We can extend this solution to fix the mmap problem as well, in the near future.
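As a rough illustration of how call sites would look with these macros, here is a small sketch. `can_open_readonly` is a hypothetical helper, not something in CLD2, and the sketch assumes the Windows `<fcntl.h>` exposes `O_RDONLY` under that name (otherwise `_O_RDONLY` would be needed there).

```c
#include <assert.h>
#include <stddef.h>
#include <fcntl.h>   /* O_RDONLY; declares open() on POSIX */

#ifdef _WIN32
#include <io.h>      /* _open, _close */
#define OPEN _open
#define CLOSE _close
#else
#include <unistd.h>  /* close */
#define OPEN open
#define CLOSE close
#endif

/* Hypothetical helper: returns 1 if path can be opened read-only, else 0.
 * The body never names open()/_open() directly, only the macros. */
int can_open_readonly(const char *path) {
  assert(path != NULL);
  int fd = OPEN(path, O_RDONLY);
  if (fd < 0) return 0;
  CLOSE(fd);
  return 1;
}
```

The point of the macros is that the platform difference stays in one #ifdef at the top of the file, and the rest of the loader is written once.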
Reported by [email protected]
on 2014-08-13 11:12:51
I've created issue 20 to track the overall Windows compatibility problem for the dynamic
data loader.
Reported by [email protected]
on 2014-08-13 11:40:31
Here is a new patch based on comment #2 above. As described in issue 20, it disables
support for win32 in the "file"-based dynamic data APIs; only raw pointer mode will work
on win32 until we have proper mmap support there.
So to be clear, we didn't end up changing open() to fopen(), because open() makes
sense for this use case. Updating the bug description accordingly.
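A minimal sketch of what this arrangement could look like, under stated assumptions: `load_table_from_file` is a hypothetical name rather than the CLD2 API, and the real patch may guard the code differently. On win32 the file-based path is compiled out and reports failure; elsewhere it uses open() and mmap() with no stdio involved.

```c
#include <assert.h>
#include <stddef.h>

#ifndef _WIN32
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#endif

/* Hypothetical loader, not the CLD2 API: maps a data file read-only.
 * Returns NULL on failure, and always on win32, where callers would
 * have to use raw-pointer mode instead. */
const void *load_table_from_file(const char *path, size_t *length_out) {
  assert(path != NULL && length_out != NULL);
#ifdef _WIN32
  (void)path;
  *length_out = 0;
  return NULL;  /* file mode unsupported until there is an mmap equivalent */
#else
  int fd = open(path, O_RDONLY);
  if (fd < 0) return NULL;

  struct stat st;
  if (fstat(fd, &st) != 0 || st.st_size == 0) {
    close(fd);
    return NULL;
  }

  void *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);  /* the mapping remains valid after the descriptor is closed */
  if (base == MAP_FAILED) return NULL;

  *length_out = (size_t)st.st_size;
  return base;
#endif
}
```

On the POSIX branch the caller would eventually munmap() the returned region using the length written to `length_out`.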
Reported by [email protected]
on 2014-08-13 12:31:26
- _Attachment: [patch2](https://storage.googleapis.com/google-code-attachments/cld2/issue-19/comment-4/patch2)_
Should be fixed in r166.
Reported by [email protected]
on 2014-08-13 12:35:43
- Status changed: Fixed
Issue 23 has been merged into this issue.
Reported by [email protected]
on 2014-08-29 11:33:08