Comments (8)
Hi,
I have added this check to prevent the multiple-reads exception and the script freeze (#9).
Elasticsearch has a 30m timeout per scroll page and a 120 sec timeout per HTTP request.
You should provide more information about your Elasticsearch setup (version, architecture, index settings, index mapping) and about your es2csv args.
If you are losing some information, it could be a hardware issue. Logs from Elasticsearch during the scroll process would help pin down the details.
from es2csv.
@WormsCH, @conradlee can you still reproduce this issue?
Yes, I encountered it again, even with your patch.
@conradlee You should provide more information about your Elasticsearch setup (version, architecture, index settings, index mapping), as well as your es2csv args and version, Python and pip versions, and OS version.
Sorry, I'm on the road now, but I'll try to replicate this problem and document all those important details when I'm done traveling next week.
Ok, I can provide you with some information:
- Elasticsearch version: 1.7
- elasticsearch-py version: 2.4.0
- Python Version: 2.7.3
- Pip version: 8.1.1
I have a theory about what's causing the infinite loop. The query I'm running selects all documents with a saved date less than some specified cutoff. It's a big query though, so it takes around 12 hours for es2csv to scroll through all the results and save them. In the meantime, some of the documents in the original result set have been re-saved, removing them from the result set.
Depending on how the scrolling is implemented, this could mean that the final result set is smaller than the original result set, which means that the while loop never exits.
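To illustrate the theory, here is a minimal sketch of a scroll loop that stops on the first empty page instead of waiting until the number of rows read reaches the total reported at the start. This is not es2csv's actual code. `scroll_all` and `FakeClient` are hypothetical names made up for this example; the fake client only mimics the shape of the `search` and `scroll` responses from elasticsearch-py so the sketch can run without a cluster.

```python
def scroll_all(client, index, query, scroll="30m", page_size=100):
    """Yield every hit, stopping as soon as a scroll page comes back empty."""
    page = client.search(index=index, body=query, scroll=scroll, size=page_size)
    while True:
        hits = page["hits"]["hits"]
        if not hits:
            # Exit on an empty page rather than on a count comparison, so a
            # result set that ends up smaller than the reported total still ends.
            break
        for hit in hits:
            yield hit
        page = client.scroll(scroll_id=page["_scroll_id"], scroll=scroll)


class FakeClient:
    """Hypothetical stand-in for an elasticsearch-py client (demo only)."""

    def __init__(self, pages, reported_total):
        self._pages = list(pages)
        self._total = reported_total

    def _next_page(self):
        # Serve the queued pages in order, then empty pages forever.
        hits = self._pages.pop(0) if self._pages else []
        return {"_scroll_id": "demo", "hits": {"total": self._total, "hits": hits}}

    def search(self, index, body, scroll, size):
        return self._next_page()

    def scroll(self, scroll_id, scroll):
        return self._next_page()


# The fake cluster reports 5 matches but only ever returns 3 documents.
client = FakeClient([[{"_id": 1}, {"_id": 2}], [{"_id": 3}]], reported_total=5)
exported = [hit["_id"] for hit in scroll_all(client, "my-index", {})]
print(exported)
```

A loop written as "keep going while rows read < reported total" would never finish against this fake client, which is the behaviour described above.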
Under the hood, es2csv uses the scroll API, or more precisely the scroll API as exposed by elasticsearch-py.
I have never tested it on indexes that are being modified, and I cannot find any documentation on how it behaves in that case.
So my advice is to copy your index (so that the copy is read-only) and run your query against the copy.
Logs from ES could help too.
@conradlee
Oh, it looks like I found the root cause (source):
For Elasticsearch 2.0 and later, use the major version 2 (2.x.y) of the library.
For Elasticsearch 1.0 and later, use the major version 1 (1.x.y) of the library.
So the issue may be that es2csv is using elasticsearch-py 2.4.0 while you are running Elasticsearch 1.7.
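As a rough sketch of the rule quoted above, the check below compares only the major version of the server with the major version of the client library. The function names are hypothetical and the version strings are the ones reported in this thread; nothing here is part of es2csv or elasticsearch-py.

```python
def major(version):
    """Return the major component of a dotted version string such as '2.4.0'."""
    return int(version.split(".")[0])


def client_matches_server(server_version, client_version):
    """True when the client library and the server share a major version."""
    return major(server_version) == major(client_version)


# Elasticsearch 1.7 with elasticsearch-py 2.4.0, as reported above.
print(client_matches_server("1.7", "2.4.0"))  # False
```

Against a live setup, the two strings could presumably be taken from the cluster's info response and from the installed client package, but that is an assumption and is not shown here.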