medcl / elasticsearch-rtf

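A Chinese distribution of Elasticsearch with the relevant Chinese-language plugins pre-integrated, making it easy for newcomers to learn and test.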

License: Apache License 2.0

Shell 15.11% Batchfile 19.63% JavaScript 46.30% HTML 13.97% CSS 4.99%

elasticsearch-rtf's People

Contributors

likaiguo, medcl, php-cpm

elasticsearch-rtf's Issues

Why can't I ever set the default analyzer?

PUT /index/
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "default": {
            "type": "ik"
          }
        }
      }
    }
  }
}

It always returns:
{
  "error": {
    "root_cause": [
      {
        "type": "index_creation_exception",
        "reason": "failed to create index"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Unknown Analyzer type [ik] for [default]"
  },
  "status": 400
}
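
One possible cause, offered as an assumption rather than a confirmed diagnosis: releases of the IK plugin built for Elasticsearch 5.x no longer register an analyzer named ik and expose ik_smart and ik_max_word instead. The sketch below builds the same settings body with one of those names. The helper name default_analyzer_settings is hypothetical, and the body would still have to be sent with PUT /index/ by whatever client you use.

```python
import json


def default_analyzer_settings(analyzer_type="ik_max_word"):
    """Build index-creation settings that make `analyzer_type` the default analyzer.

    Assumes the IK plugin is installed and registers this analyzer name.
    """
    return {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "default": {"type": analyzer_type}
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body to send with: PUT /index/
    print(json.dumps(default_analyzer_settings(), indent=2))
```

If the error persists with these names, the plugin is probably not loaded at all, which the node's startup log should show.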

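Startup script file-format problem on Linux

On Linux, ./elasticsearch console cannot run and fails with /bin/sh^M: bad interpreter.
The fix is to open the script in vim and run :set ff=unix.
The file was probably edited on Windows at some point.
Adjusting the git config is another option.
I'm reporting this in case it helps others who run into the same problem.
Thanks!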
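
For anyone who prefers not to open each script in vim, the same conversion can be scripted. This is a minimal sketch that assumes the only problem is Windows-style CRLF line endings. The function name crlf_to_lf is made up for illustration.

```python
from pathlib import Path


def crlf_to_lf(path):
    """Rewrite `path` in place with LF line endings; return True if it changed."""
    file = Path(path)
    original = file.read_bytes()
    converted = original.replace(b"\r\n", b"\n")
    if converted != original:
        file.write_bytes(converted)
        return True
    return False


if __name__ == "__main__":
    import sys

    # Usage: python crlf_to_lf.py bin/elasticsearch
    for name in sys.argv[1:]:
        print(name, "converted" if crlf_to_lf(name) else "unchanged")
```

The file is rewritten in place rather than replaced, so its executable bit is left as it was.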

This version has many vulnerabilities; I upgraded to ES 1.3.2 myself

_search?source={'size":1,"query":{"filtered":{"query":{"match_all":{}}}},"script_fields":{"t":{"script":"Integer.toHexString(31415926)"}}}}&callback=json

_plugin/head/../../../../../../../../etc/passwd

_nodes/stats

How do I set ik as the default analyzer?

After adding index.analysis.analyzer.ik.type : “ik” to elasticsearch.yml, the following message appears:


Since elasticsearch 5.x index level settings can NOT be set on the nodes
configuration like the elasticsearch.yaml, in system properties or command line
arguments.In order to upgrade all indices the settings must be updated via the
/${index}/_settings API. Unless all settings are dynamic all indices must be closed
in order to apply the upgradeIndices created in the future should use index templates
to set default values.

Please ensure all required values are updated on all indices by executing:

curl -XPUT 'http://localhost:9200/_all/_settings?preserve_existing=true' -d '{
  "index.analysis.analyzer.ik.type" : "“ik”"
}'

However, after starting elasticsearch, running the suggested curl command had no effect either. It reports:

$ curl -XPUT 'http://localhost:9200/_all/_settings?preserve_existing=true' -d '{
>   "index.analysis.analyzer.ik.type" : "“ik”"
> }'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   445  100   395  100    50   5064    641 --:--:-- --:--:-- --:--:--  6269{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Can't update non dynamic settings [[index.analysis.analyzer.ik.type]] for open indices [[flask_reminders/di-2kgxaQuegxc6Qbo2P-Q]]"}],"type":"illegal_argument_exception","reason":"Can't update non dynamic settings [[index.analysis.analyzer.ik.type]] for open indices [[flask_reminders/di-2kgxaQuegxc6Qbo2P-Q]]"},"status":400}

Thanks!
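
The error text says the setting is not dynamic and the index is open, which suggests closing the index, applying the settings, and reopening it. Separately, the value echoed back as "“ik”" suggests that typographic quotes ended up inside the setting itself. Below is a hedged sketch of the request sequence. The helper name plan_analyzer_update and the analyzer name ik_max_word are assumptions, and the requests are only printed, not sent.

```python
import json


def plan_analyzer_update(index, analysis_settings):
    """Return the (method, path, body) requests for a non-dynamic settings change.

    The index is closed first, updated, then reopened.
    """
    return [
        ("POST", f"/{index}/_close", None),
        ("PUT", f"/{index}/_settings", json.dumps({"analysis": analysis_settings})),
        ("POST", f"/{index}/_open", None),
    ]


if __name__ == "__main__":
    # Hypothetical analyzer definition; note the plain ASCII quotes.
    steps = plan_analyzer_update(
        "flask_reminders",
        {"analyzer": {"default": {"type": "ik_max_word"}}},
    )
    for method, path, body in steps:
        print(method, path, body or "")
```

While the index is closed it cannot serve searches or accept writes, so this is best done during a quiet period. For indices created later, the message quoted above recommends index templates.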

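ES 2.2.0 shuts down by itself on CentOS 6.5 x64

I downloaded the 2.2.0 release of rtf and installed it on CentOS 6.5 x64, and also installed head with plugin install mobz/elasticsearch-head.

ES holds a single index containing 2 types, with 1 shard and 0 replicas. The data volume is tiny: 500 documents, about 200 KB in total.

After ES starts, it shuts down by itself after a while. The first time this took 2 days; since then it has always been less than half a day. I checked /var/log/messages and found no sign that the process was killed abnormally.

The ES log shows stopping, stopped, closing, and closed, but gives no reason for the shutdown. Here is the final log output: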

[2016-10-18 01:55:57,872][INFO ][node                     ] [Tom Thumb] stopping ...
[2016-10-18 01:55:57,893][DEBUG][indices                  ] [Tom Thumb] [nggirl-esdb-test] closing ... (reason [shutdown])
[2016-10-18 01:55:57,894][DEBUG][indices                  ] [Tom Thumb] [nggirl-esdb-test] closing index service (reason [shutdown])
[2016-10-18 01:55:57,894][DEBUG][index                    ] [Tom Thumb] [nggirl-esdb-test] [0] closing... (reason: [shutdown])
[2016-10-18 01:55:57,895][DEBUG][index.shard              ] [Tom Thumb] [nggirl-esdb-test][0] state: [STARTED]->[CLOSED], reason [shutdown]
[2016-10-18 01:55:57,895][DEBUG][index.shard              ] [Tom Thumb] [nggirl-esdb-test][0] operations counter reached 0, will not accept any further writes
[2016-10-18 01:55:57,895][DEBUG][index.engine             ] [Tom Thumb] [nggirl-esdb-test][0] flushing shard on close - this might take some time to sync files to disk
[2016-10-18 01:55:57,897][DEBUG][index.engine             ] [Tom Thumb] [nggirl-esdb-test][0] close now acquiring writeLock
[2016-10-18 01:55:57,897][DEBUG][index.engine             ] [Tom Thumb] [nggirl-esdb-test][0] close acquired writeLock
[2016-10-18 01:55:57,898][DEBUG][index.translog           ] [Tom Thumb] [nggirl-esdb-test][0] translog closed
[2016-10-18 01:55:57,909][DEBUG][index.engine             ] [Tom Thumb] [nggirl-esdb-test][0] engine closed [api]
[2016-10-18 01:55:57,910][DEBUG][index.store              ] [Tom Thumb] [nggirl-esdb-test][0] store reference count on close: 0
[2016-10-18 01:55:57,910][DEBUG][index                    ] [Tom Thumb] [nggirl-esdb-test] [0] closed (reason: [shutdown])
[2016-10-18 01:55:57,910][DEBUG][indices                  ] [Tom Thumb] [nggirl-esdb-test] closing index cache (reason [shutdown])
[2016-10-18 01:55:57,910][DEBUG][index.cache.query.index  ] [Tom Thumb] [nggirl-esdb-test] full cache clear, reason [close]
[2016-10-18 01:55:57,911][DEBUG][index.cache.bitset       ] [Tom Thumb] [nggirl-esdb-test] clearing all bitsets because [close]
[2016-10-18 01:55:57,912][DEBUG][indices                  ] [Tom Thumb] [nggirl-esdb-test] clearing index field data (reason [shutdown])
[2016-10-18 01:55:57,912][DEBUG][indices                  ] [Tom Thumb] [nggirl-esdb-test] closing analysis service (reason [shutdown])
[2016-10-18 01:55:57,912][DEBUG][indices                  ] [Tom Thumb] [nggirl-esdb-test] closing mapper service (reason [shutdown])
[2016-10-18 01:55:57,912][DEBUG][indices                  ] [Tom Thumb] [nggirl-esdb-test] closing index query parser service (reason [shutdown])
[2016-10-18 01:55:57,917][DEBUG][indices                  ] [Tom Thumb] [nggirl-esdb-test] closing index service (reason [shutdown])
[2016-10-18 01:55:57,917][DEBUG][indices                  ] [Tom Thumb] [nggirl-esdb-test] closed... (reason [shutdown])
[2016-10-18 01:55:57,918][INFO ][node                     ] [Tom Thumb] stopped
[2016-10-18 01:55:57,918][INFO ][node                     ] [Tom Thumb] closing ...
[2016-10-18 01:55:57,925][INFO ][node                     ] [Tom Thumb] closed

What could be causing this? Thanks.

java.lang.IllegalStateException: Received message from unsupported version: [1.0.0] minimal compatible version is: [5.0.0]

[2016-11-25T11:56:54,526][WARN ][o.e.t.n.Netty4Transport ] [node-p1] exception caught on transport layer [[id: 0x1a560625, L:/192.168.1.51:9300 - R:/192.168.1.51:34760]], closing connection
java.lang.IllegalStateException: Received message from unsupported version: [1.0.0] minimal compatible version is: [5.0.0]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1199) ~[elasticsearch-5.0.0.jar:5.0.0]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) ~[transport-netty4-5.0.0.jar:5.0.0]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293) [netty-codec-4.1.5.Final.jar:4.1.5.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:280) [netty-codec-4.1.5.Final.jar:4.1.5.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:396) [netty-codec-4.1.5.Final.jar:4.1.5.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248) [netty-codec-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:610) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:513) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:467) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:437) [netty-transport-4.1.5.Final.jar:4.1.5.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873) [netty-common-4.1.5.Final.jar:4.1.5.Final]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
What is causing this warning?

discovery.zen.ping.timeout doesn't work

Hello,
We used two machines to build an Elasticsearch cluster. When we start the second machine, it publishes a ping, but the first machine doesn't respond within 3 seconds, so the first machine ends up as a separate cluster. We set discovery.zen.ping.timeout to 30s, but it has no effect: the second machine still waits only 3 seconds.

plugin error

Exception in thread "main" java.nio.file.FileSystemException: /opt/elasticsearch/plugins/.DS_Store/plugin-descriptor.properties: Not a directory
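
The path in the exception suggests that the plugin loader treated .DS_Store, a metadata file created by the macOS Finder, as a plugin directory. Assuming that reading is right, removing stray files from the plugins directory should avoid the error. A small sketch for listing them follows; the function name stray_plugin_entries is hypothetical.

```python
from pathlib import Path


def stray_plugin_entries(plugins_dir):
    """Return names of entries in `plugins_dir` that are not directories."""
    return sorted(p.name for p in Path(plugins_dir).iterdir() if not p.is_dir())


if __name__ == "__main__":
    # Path taken from the stack trace above; adjust for your installation.
    plugins = Path("/opt/elasticsearch/plugins")
    if plugins.is_dir():
        for name in stray_plugin_entries(plugins):
            print("not a plugin directory:", name)
```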

Error when starting elasticsearch as a service

On some of our Ubuntu 12.04 servers, after starting elasticsearch with the service script "/etc/init.d/elasticsearch start", we cannot access http://127.0.0.1:9200/_plugin/rtf/ and our client's calls to the RESTful API time out.

It looks as if some services have died, but "ps -ef | grep elasticsearch" still shows the elasticsearch process.

I say "some" because this only happens on a subset of our Ubuntu 12.04 servers.

I cannot find any errors in the logs.

@medcl can you help point out how to troubleshoot this issue?

ik analyzer is not updated

The master branch of the ik analyzer is at 1.2.7, and the ik analyzer repo has a download link pointing here, but the ik analyzer bundled here is 1.2.6, which has a performance issue.

SLF4J version incompatibility

SLF4J: Found binding in [jar:file:/home/dopool/elasticsearch-rtf/elasticsearch/plugins/mapper-attachments/tika-app-1.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/dopool/elasticsearch-rtf/elasticsearch/plugins/transport-thrift/slf4j-log4j12-1.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: The requested version 1.5.6 by your slf4j binding is not compatible with [1.6]
SLF4J: See http://www.slf4j.org/codes.html#version_mismatch for further details.
My Java version is java version "1.6.0_26".
Do I need to configure anything?

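Cannot modify elasticsearch.yml

I downloaded the latest version, and running ./bin/elasticsearch directly starts it fine. But after I change something in elasticsearch.yml, for example cluster.name, startup fails with the following error:
(screenshot not available)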

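jdbc

  1. The latest version does not seem to include the jdbc plugin?
  2. My current bottleneck in learning ES is migrating data from a database into an index. Are there any resources on this?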

Error when starting as a service

Starting it the default way works, but starting it as a service fails with: FATAL | wrapper | Unable to get the path for ''-No such file or directory

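Urgent: 400 error when creating an index

I wrote a PINYIN analysis plugin tailored to my project and integrated it into elasticsearch. After configuring it in elasticsearch.yml, creating an index fails with:
400 : {"error":"RemoteTransportException[[Thunderbird][inet[/192.168.0.105:9300]][indices/create]];
Any help from an expert would be much appreciated!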

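English words are not recognized

I use it together with Spring JPA. Chinese is recognized well, but English words are not. Does this require some special configuration?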

Cannot search on the _all field in the new version

Commit 7f7dc245142d4fb145b446c75295a068e922adae works fine,
but 8fb58e47ce1030a7fef36b4fcd398812b106ffbe does not.

I am using Ubuntu 12.04 server with the default OpenJDK installed:
java version "1.6.0_27"
OpenJDK Runtime Environment (IcedTea6 1.12.6) (6b27-1.12.6-1ubuntu0.12.04.2)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
Do you need to see the mapping?

import model error

When I run a command like

rake environment elasticsearch:import:model CLASS=Topic FORCE=y

it gives the error below:

Elasticsearch::Transport::Transport::Errors::BadRequest: [400] {"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"Failed to parse mapping [topic]: The [string] type is removed in 5.0 and automatic upgrade failed because parameters [term_vector] are not supported for automatic upgrades. You should now use either a [text] or [keyword] field instead for field [title]"}],"type":"mapper_parsing_exception","reason":"Failed to parse mapping [topic]: The [string] type is removed in 5.0 and automatic upgrade failed because parameters [term_vector] are not supported for automatic upgrades. You should now use either a [text] or [keyword] field instead for field [title]","caused_by":{"type":"illegal_argument_exception","reason":"The [string] type is removed in 5.0 and automatic upgrade failed because parameters [term_vector] are not supported for automatic upgrades. You should now use either a [text] or [keyword] field instead for field [title]"}},"status":400}

The model is:

  mapping do
    indexes :title, term_vector: :yes
    indexes :body, term_vector: :yes
    indexes :node_name
  end

  def as_indexed_json(_options = {})
    {
      title: self.title,
      body: self.full_body,
      node_name: self.node_name
    }
  end

  def related_topics(size = 5)
    opts = {
      query: {
        more_like_this: {
          fields: [:title, :body],
          docs: [
            {
              _index: self.class.index_name,
              _type: self.class.document_type,
              _id: id
            }
          ],
          min_term_freq: 2,
          min_doc_freq: 5
        }
      },
      size: size
    }
    self.class.__elasticsearch__.search(opts).records.to_a
  end
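
The error message itself points at the fix: Elasticsearch 5.0 replaced the string type with text and keyword, so the mapping presumably needs an explicit type for each field. The sketch below illustrates the conversion on plain mapping dictionaries. It assumes that index: not_analyzed marked exact-value fields (which become keyword) and that every other string field becomes text. The function name upgrade_string_field is hypothetical and not part of any library.

```python
import json


def upgrade_string_field(field):
    """Convert a pre-5.0 `string` field mapping to `text` or `keyword`."""
    if field.get("type") != "string":
        return dict(field)  # not a legacy string field, leave untouched
    upgraded = {k: v for k, v in field.items() if k != "index"}
    index = field.get("index", "analyzed")
    if index == "not_analyzed":
        upgraded["type"] = "keyword"
        # term vectors only apply to analyzed text fields
        upgraded.pop("term_vector", None)
    else:
        upgraded["type"] = "text"
        if index == "no":
            upgraded["index"] = False
    return upgraded


if __name__ == "__main__":
    # Legacy mapping roughly matching the model above (assumed shape).
    legacy = {
        "title": {"type": "string", "term_vector": "yes"},
        "body": {"type": "string", "term_vector": "yes"},
        "node_name": {"type": "string"},
    }
    upgraded = {name: upgrade_string_field(f) for name, f in legacy.items()}
    print(json.dumps(upgraded, indent=2))
```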

mmseg word segmentation problem

I added some custom words, such as “西红柿”, to elasticsearch-rtf/config/mmseg/words-my.dic, but the segmentation in the final result looks like this:

{
   "took": 6,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 0.7263499,
      "hits": [
         {
            "_index": "index",
            "_type": "fulltext",
            "_id": "5",
            "_score": 0.7263499,
            "_source": {
               "content": "西红柿,番茄,鸡蛋,面条,西红柿鸡蛋面"
            },
            "highlight": {
               "content": [
                  "<tag1>西</tag1><tag2>红</tag2><tag1>柿</tag1>,番茄,<tag1>鸡蛋</tag1>,面条,<tag1>西</tag1><tag2>红</tag2><tag1>柿</tag1><tag1>鸡蛋</tag1>面"
               ]
            }
         }
      ]
   }
}

How should I deal with this, or is there any relevant documentation?
Thanks

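How can I search for special characters such as '+'?

My environment is the rtf release with no other dedicated plugins, and the language is PHP.
Searches are submitted through PHP's cURL functions: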

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response as a string instead of printing it
if ($post) {
    if ($isjson) {
        // $data is already a JSON string
        curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
    } else {
        // encode the array as JSON before sending
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data, JSON_UNESCAPED_SLASHES));
    }
}
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow any redirects
$data = curl_exec($ch); // execute the request
$data = json_decode($data, true);
return $data;

The filter passed in is as follows:

$filter = array(
    "bool" => array(
        "must" => array(
            array("match" => array("subject" => $keyword)),
            array("range" => array("displayorder" => array("gte" => 0)))
        ),
        "must_not" => array(),
        "should" => array()
    )
);

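head plugin problem in version 2.2.1

I have been using your distribution in my project and it works well. I recently tried version 2.2.1 and found some problems with head that I don't know how to resolve. First, the head plugin is now in English. In addition, the page cannot display cluster information, and the ES log is reporting errors.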

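Is Chinese word segmentation integrated?

I don't see the ik analyzer I have heard about. A full-text match search for "卡" finds nothing, while "学生卡" does. Is that because no word segmentation is taking place?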

Some words cannot be found

Set a field's value to “今天是周五”.

Searching for “周” finds it, but searching for “是” does not. Why is that?

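Word segmentation question

An article contains the word “古希腊”.

Searching for “古希腊” returns results, but searching for “希腊” does not. Is this by design, or is some additional configuration needed?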
