Comments (13)
Ah, I think I see the problem. It's the way Elasticsearch operates, which can sometimes cause strange results when you drive it from code (usually when doing integration tests).
By default, Elasticsearch performs a refresh operation every second. This refresh makes new documents visible to search...until the refresh is executed the documents are effectively invisible to search operations. A count request is just a special type of search that counts the total number of documents, so it is influenced by this refresh interval.
When you are indexing into ES rapidly, it is possible to index and then call a Count request in just a few milliseconds...way below the 1s threshold. When you run the curl request manually, enough time has elapsed so the docs are "visible".
To fix this in your test code, just add a refresh command:
$client = new \Elasticsearch\Client();
$deleteParams['index'] = 'test';
$client->indices()->delete($deleteParams);
$indexParams['index'] = 'test';
$client->indices()->create($indexParams);
$doc = new \stdClass();
$doc->id = 123;
$doc->field = "abc";
$doc->field2 = "xyz";
$params = array();
$params['id'] = $doc->id;
$params['index'] = 'test';
$params['type'] = 'item';
$params['body'] = (array)$doc;
$client->index($params);
// This refresh command will force a refresh and you'll see correct counts
$client->indices()->refresh(array('index' => 'test'));
$result = $client->count(array('index' => 'test'));
print_r($result);
Array
(
    [count] => 1
    [_shards] => Array
        (
            [total] => 5
            [successful] => 5
            [failed] => 0
        )
)
Of course, this is just for testing...you shouldn't call a refresh after every document is indexed or you will hurt your indexing speed and performance.
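If a test needs several documents, one way to avoid a refresh per document is to send them in a single bulk request and refresh once at the end. This is only a sketch: `buildBulkBody` and `indexFixtures` are hypothetical helper names (not part of elasticsearch-php), each fixture is assumed to be an array with an `id` key, and the `_type` entry follows the Elasticsearch 1.x usage in this thread.

```php
<?php
/**
 * Build the body of a bulk request: one action line followed by one
 * document line per fixture. (Hypothetical helper, not part of the client.)
 */
function buildBulkBody(array $docs, $index, $type)
{
    $body = array();
    foreach ($docs as $doc) {
        // Action/metadata line telling Elasticsearch where the document goes
        $body[] = array('index' => array(
            '_index' => $index,
            '_type'  => $type,
            '_id'    => $doc['id'],
        ));
        // The document itself
        $body[] = $doc;
    }
    return $body;
}

/**
 * Index all fixtures in one bulk request, then refresh once so that a
 * following count or search can see them. (Hypothetical helper.)
 */
function indexFixtures($client, array $docs, $index, $type)
{
    $client->bulk(array('body' => buildBulkBody($docs, $index, $type)));
    // A single refresh for the whole batch instead of one per document
    $client->indices()->refresh(array('index' => $index));
}
```

Calling `indexFixtures($client, $docs, 'test', 'item')` before `$client->count(array('index' => 'test'))` keeps the refresh explicit while paying for it only once per test.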
from elasticsearch-php.
Yes - that did help. I added the refresh just before the count and everything works as expected now.
Thanks for your help! Made my starting experience with Elasticsearch very pleasant.
from elasticsearch-php.
I'm a little confused, the output from your curl command also shows a count of zero?
[count] => 0
??
If you can paste your entire set of commands I'd be happy to run them myself and see if I can recreate the situation. Are you on Elasticsearch 1.0 or an older version?
from elasticsearch-php.
I am sorry - I copied in the wrong example.
Now updated...
from elasticsearch-php.
Ok, that makes more sense :)
Can you paste a set of commands which recreate the issue? It's much easier to debug an example than to just start digging through the code. Thanks!
from elasticsearch-php.
I posted the commands I am using and the log output here:
PS: Yes - I am using Elasticsearch 1.0.0
from elasticsearch-php.
Great! Glad to help...some parts of Elasticsearch can be confusing for new users, and you ran into one of them. I'll see about adding a blurb to the docs about this problem for future users.
Lemme know if you run into anything else, bug or otherwise! :)
from elasticsearch-php.
@polyfractal @bretrzaun It is possible to index the document and refresh the index in a single request by adding one parameter...I have updated your example as follows:
$client = new \Elasticsearch\Client();
$deleteParams['index'] = 'test';
$client->indices()->delete($deleteParams);
$indexParams['index'] = 'test';
$client->indices()->create($indexParams);
$doc = new \stdClass();
$doc->id = 123;
$doc->field = "abc";
$doc->field2 = "xyz";
$params = array();
// -------------------------
$params['refresh'] = true; // no need for a separate refresh request
// -------------------------
$params['id'] = $doc->id;
$params['index'] = 'test';
$params['type'] = 'item';
$params['body'] = (array)$doc;
$client->index($params);
$result = $client->count(array('index' => 'test'));
from elasticsearch-php.
Yep, you can indeed do this :)
I generally hide this option from new users because it is easy to slap a "refresh" onto all your indexing commands...and then forget that "refresh" has been toggled. This becomes very expensive because it is refreshing on each new document.
I prefer to introduce the concept as an explicit API call so that people are aware that it is an additional operation which is being performed, so they consider the overhead of calling it repeatedly.
But you're right, you can absolutely add it to individual commands instead of a second request.
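For test code, one way to keep the per-request option from being left on by accident is to make it an explicit, off-by-default argument. A minimal sketch: `buildIndexParams` and `indexDocument` are hypothetical wrappers (not part of elasticsearch-php), and each document is assumed to be an array with an `id` key; only `index()` and its `refresh` parameter come from the example above.

```php
<?php
/**
 * Build the parameters for $client->index(). The 'refresh' flag is only
 * added when the caller explicitly opts in. (Hypothetical helper.)
 */
function buildIndexParams($index, $type, array $doc, $refresh = false)
{
    $params = array(
        'index' => $index,
        'type'  => $type,
        'id'    => $doc['id'],
        'body'  => $doc,
    );
    if ($refresh) {
        // Forces a refresh as part of this request: fine for a single
        // test document, expensive if applied to every document.
        $params['refresh'] = true;
    }
    return $params;
}

/**
 * Index one document, refreshing only on request. (Hypothetical wrapper.)
 */
function indexDocument($client, $index, $type, array $doc, $refresh = false)
{
    return $client->index(buildIndexParams($index, $type, $doc, $refresh));
}
```

With this shape, `indexDocument($client, 'test', 'item', $doc)` never refreshes, and a reader of the test can see at the call site when `true` is passed as the last argument.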
from elasticsearch-php.
"I prefer to introduce the concept as an explicit API call so that people are aware that it is an additional operation which is being performed, so they consider the overhead of calling it repeatedly."
Fair point 😄 👍
from elasticsearch-php.
Where is the curl & JSON equivalent of this in the Elasticsearch docs? I found it once before, but can't seem to find it again.
from elasticsearch-php.
@eddiejaoude It's described (briefly, without code sample) here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html#index-refresh
from elasticsearch-php.
Thanks @polyfractal, that is the documentation I had found too. I was looking for an actual example.
However, I got more information from IRC and documented it here: https://github.com/TransformCore/elasticsearch-example-docs/blob/master/docs/4-runtime-parameters/refresh-parameter.md#example
from elasticsearch-php.