elasticsearch-analysis-decompound's Issues

plugin [decompound] is incompatible with version [5.4.0]; was designed for version [5.1.1]

-> Downloading http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-analysis-decompound/5.4.0.0/elasticsearch-analysis-decompound-5.4.0.0-plugin.zip
[=================================================] 100%  
Exception in thread "main" java.lang.IllegalArgumentException: plugin [decompound] is incompatible with version [5.4.0]; was designed for version [5.1.1]
at org.elasticsearch.plugins.PluginInfo.readFromProperties(PluginInfo.java:146)
at org.elasticsearch.plugins.InstallPluginCommand.verify(InstallPluginCommand.java:428)
at org.elasticsearch.plugins.InstallPluginCommand.install(InstallPluginCommand.java:495)
at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:215)
at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:199)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:67)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122)
at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:69)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122)
at org.elasticsearch.cli.Command.main(Command.java:88)
at org.elasticsearch.plugins.PluginCli.main(PluginCli.java:47)

Running on:
[2017-05-15T11:02:40,575][INFO ][o.e.n.Node ] version[5.4.0], pid[9016], build[780f8c4/2017-04-28T17:43:27.229Z], OS[Mac OS X/10.12.4/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_25/25.25-b02]

elasticsearch 1.2.*

When will the plugin be compatible with version 1.2.* of Elasticsearch? Or is there a way I can install it manually?

Matching tokens

Hi,

I am stuck with this issue and I am quite sure I am missing something really essential:

I set up the analyzer as below, and it works quite well:

GET /myIndex/_analyze?analyzer=german&text=Straßenbahnschienenritzenreiniger

gives me all kinds of tokens. But searching returns all documents containing just ONE of the tokens (with an OR operator, so to speak), ranking documents containing "straße" higher than documents containing "reiniger", and ignoring multiple matches in the score. This is of course not what I intended...

However, I can see that an AND operator for tokens would not do the right thing either... In fact, the operation that could work would be something like (tokens derived from "straße" combined with OR) AND (tokens derived from "bahn" combined with OR) AND (...)

I could run _analyze from the external application and build the AND/OR query there, but that does not seem very elegant.

Is there another/better way?

"analysis": {
    "filter": {
       "baseform": {
          "type": "baseform",
          "language": "de"
       },
       "decomp": {
          "type": "decompound"
       }
    },
    "analyzer": {
       "german": {
          "filter": [
             "decomp",
             "baseform"
          ],
          "type": "custom",
          "tokenizer": "baseform"
       }
    },
    "tokenizer": {
       "baseform": {
          "filter": [
             "decomp",
             "baseform"
          ],
          "type": "standard"
       }
    }
 }
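A client-side workaround for the AND-of-ORs query described above is to build a bool query: one `should` group per original compound part, all groups combined via `must`. A minimal sketch (the token groups are hard-coded here; in practice they would come from the `_analyze` API, and the field name `text` is a placeholder):

```python
# Build a bool query that requires a match for each original compound part,
# while accepting any of that part's analysis variants (OR within the group).
def build_and_of_ors(field, token_groups):
    return {
        "bool": {
            "must": [  # AND across the compound's parts
                {
                    "bool": {
                        "should": [  # OR within one part's variants
                            {"term": {field: tok}} for tok in group
                        ],
                        "minimum_should_match": 1,
                    }
                }
                for group in token_groups
            ]
        }
    }

# Example: tokens derived from "straße" OR'd together, AND'd with "bahn".
query = build_and_of_ors("text", [["strasse", "straße"], ["bahn"]])
```

The resulting dict can be sent as the `query` part of a search request body.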

incorrect offsets / fast vector highlighter

Hi,

I use the german-decompounder in conjunction with the fast vector highlighter. The offsets of decompounded words seem to be incorrect.

For example, the analyze API returns for "Die Jahresfeier der Rechtsanwaltskanzleien auf dem Donaudampfschiff hat viel Ökosteuer gekostet.":

{
    "tokens": [
        {
            "token": "Die",
            "start_offset": 1,
            "end_offset": 4,
            "type": "<ALPHANUM>",
            "position": 1
        },
        {
            "token": "Die",
            "start_offset": 1,
            "end_offset": 4,
            "type": "<ALPHANUM>",
            "position": 1
        },
        {
            "token": "Jahresfeier",
            "start_offset": 5,
            "end_offset": 16,
            "type": "<ALPHANUM>",
            "position": 2
        },
        {
            "token": "Jahr",
            "start_offset": 5,
            "end_offset": 9,
            "type": "<ALPHANUM>",
            "position": 2
        },
        {
            "token": "feier",
            "start_offset": 9,
            "end_offset": 14,
            "type": "<ALPHANUM>",
            "position": 2
        },
        ...

The fast vector highlighter returns "Die Jahr<tag1>esfei</tag1>er der Rechtsanwaltskanzleien..." when searching for "Feier", since the offset of the "feier" token is incorrect.
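For reference, the offsets one would expect for a decompounded part can be derived from the parent token's start offset plus the part's position inside the compound. A small illustrative sketch, assuming the part occurs verbatim (case-insensitively) inside the compound:

```python
def expected_part_offsets(compound, compound_start, part):
    """Locate `part` inside `compound` and map it to absolute text offsets."""
    idx = compound.lower().find(part.lower())
    if idx < 0:
        return None  # part is not a literal substring (e.g. after stemming)
    return compound_start + idx, compound_start + idx + len(part)

# "Jahresfeier" starts at offset 5; "feier" begins after "Jahres" (6 chars),
# so the correct span is (11, 16), not (9, 14) as reported above.
print(expected_part_offsets("Jahresfeier", 5, "feier"))  # → (11, 16)
```

This matches the "Jahr" token above, whose reported span (5, 9) happens to be correct because the part starts at the beginning of the compound.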

Add option to exclude certain words

It would be nice to have an option that excludes certain words, like "leinwand" or "haushalt", from decompounding. I need this because the terms "wand" and "halt" created otherwise are causing relevance issues.
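Until such an option exists, the exclusion could be approximated outside the plugin by post-filtering the subtokens. The sketch below is illustrative only: `decompose` stands in for whatever produces the subwords, and the decomposition table in the example is made up:

```python
PROTECTED = {"leinwand", "haushalt"}  # words that must stay whole

def decompound_with_exclusions(token, decompose):
    """Return the original token plus its parts, unless the token is protected."""
    if token.lower() in PROTECTED:
        return [token]  # keep protected words whole, emit no subtokens
    return [token] + decompose(token)

# Hypothetical decomposition function for illustration only.
fake = {"leinwand": ["lein", "wand"], "haustür": ["haus", "tür"]}
result = decompound_with_exclusions("leinwand", lambda t: fake.get(t, []))
```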

Controlling decomposition

I'd like to be able to use a dictionary-based approach to control which words will not be decomposed, similar to: https://www.elastic.co/guide/en/elasticsearch/guide/current/controlling-stemming.html

The words in the dictionary would not be decomposed by the plugin and would only produce the original token as output.

Example:
I'm indexing product data and merchant information. Some of the words are merchant names, like "Interdiscount". I want to be able to control the decompounding by providing a dictionary of words that must not be decomposed.

Support ES 2.4.4

I get the following error when installing:

ERROR: Plugin [decompound] is incompatible with Elasticsearch [2.4.4]. Was designed for version [2.4.1]

A patch version increase shouldn't break compatibility.
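A common, unsupported workaround for a strict patch-level check like this is to rewrite `elasticsearch.version` in the `plugin-descriptor.properties` inside the plugin zip before installing. A sketch (assuming the two patch versions are actually binary-compatible, which is not guaranteed):

```python
import re
import zipfile

def bump_plugin_es_version(src_zip, dst_zip, new_version,
                           descriptor="plugin-descriptor.properties"):
    """Copy a plugin zip, rewriting elasticsearch.version in its descriptor."""
    with zipfile.ZipFile(src_zip) as src, zipfile.ZipFile(dst_zip, "w") as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename.endswith(descriptor):
                text = re.sub(r"(?m)^elasticsearch\.version=.*$",
                              "elasticsearch.version=" + new_version,
                              data.decode("utf-8"))
                data = text.encode("utf-8")
            dst.writestr(item, data)  # writestr recomputes sizes and CRC
```

The patched zip can then be installed with `bin/plugin install file:///path/to/patched.zip`.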

Can not get decompound to work

I have the current Elasticsearch version (1.5.2) and tried to set up decompound following the thin readme, but I did not get the expected results.

PUT /leads
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "decomp": {
            "type": "decompound"
          }
        },
        "tokenizer": {
          "decomp": {
            "type": "standard",
            "filter": [
              "decomp"
            ]
          }
        }
      }
    }
  }
}

Tested with:
GET leads/_analyze?
{Die Jahresfeier der Rechtsanwaltskanzleien auf dem Donaudampfschiff hat viel Ökosteuer gekostet}
This results in the following, which is not the same as shown in the readme:

{
   "tokens": [
      {
         "token": "die",
         "start_offset": 1,
         "end_offset": 4,
         "type": "<ALPHANUM>",
         "position": 1
      },
      {
         "token": "jahresfeier",
         "start_offset": 5,
         "end_offset": 16,
         "type": "<ALPHANUM>",
         "position": 2
      },
      {
         "token": "der",
         "start_offset": 17,
         "end_offset": 20,
         "type": "<ALPHANUM>",
         "position": 3
      },
      {
         "token": "rechtsanwaltskanzleien",
         "start_offset": 21,
         "end_offset": 43,
         "type": "<ALPHANUM>",
         "position": 4
      },
      {
         "token": "auf",
         "start_offset": 44,
         "end_offset": 47,
         "type": "<ALPHANUM>",
         "position": 5
      },
      {
         "token": "dem",
         "start_offset": 48,
         "end_offset": 51,
         "type": "<ALPHANUM>",
         "position": 6
      },
      {
         "token": "donaudampfschiff",
         "start_offset": 52,
         "end_offset": 68,
         "type": "<ALPHANUM>",
         "position": 7
      },
      {
         "token": "hat",
         "start_offset": 69,
         "end_offset": 72,
         "type": "<ALPHANUM>",
         "position": 8
      },
      {
         "token": "viel",
         "start_offset": 73,
         "end_offset": 77,
         "type": "<ALPHANUM>",
         "position": 9
      },
      {
         "token": "ökosteuer",
         "start_offset": 78,
         "end_offset": 87,
         "type": "<ALPHANUM>",
         "position": 10
      },
      {
         "token": "gekostet",
         "start_offset": 88,
         "end_offset": 96,
         "type": "<ALPHANUM>",
         "position": 11
      }
   ]
}

An equivalent setup via the Java API did not change the outcome.

        final XContentBuilder mappingBuilder2 = jsonBuilder()
            .startObject()
                .startObject("index") // decompound filter
                    .startObject("analysis")
                        .startObject("filter")
                            .startObject("decomp").field("type", "decompound").endObject()
                        .endObject()
                        .startObject("tokenizer")
                            .startObject("decomp").field("type", "standard")
                            .startArray("filter")                                               
                                .field("decomp")
                            .endArray()
                            .endObject()
                        .endObject()
                    .endObject()
                .endObject()
            .endObject();


        final CreateIndexRequestBuilder createIndexRequestBuilder = client.admin().indices().prepareCreate(indexName);
        createIndexRequestBuilder.setSettings(ImmutableSettings.settingsBuilder().loadFromSource(mappingBuilder2.string()));

I also tried your bundled pack of plugins, with the same result.
And yes, I did restart my test Elasticsearch server; otherwise it would have refused to create a filter of type decompound.
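One likely cause worth noting: in Elasticsearch, token filters are applied by analyzers, not by tokenizers, so a `filter` list inside a tokenizer definition (as in the settings above) has no effect. A sketch of settings that wire the filter through a custom analyzer instead; the analyzer name `decomp_analyzer` is made up for illustration:

```python
import json

# Token filters belong in an analyzer definition; a "filter" key inside a
# tokenizer definition is not part of the tokenizer schema and is ignored.
settings = {
    "settings": {
        "index": {
            "analysis": {
                "filter": {"decomp": {"type": "decompound"}},
                "analyzer": {
                    "decomp_analyzer": {      # illustrative name
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["decomp"],
                    }
                },
            }
        }
    }
}
print(json.dumps(settings, indent=2))
```

The analyzer then has to be referenced explicitly when testing, e.g. `GET leads/_analyze?analyzer=decomp_analyzer&text=...`; otherwise the default analyzer is used and no decompounding happens.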

Highlighting seems to be broken

Hi,

I tried this on 1.5.2 and 1.7.2. This script should reproduce the error (NOTE: I'm using port 9400 locally):

curl -XDELETE http://localhost:9400/xyz/
curl -XPUT http://localhost:9400/xyz/ -d '
index:
  analysis:
    analyzer:
      search_analyzer:
        type: "custom"
        tokenizer: "standard"
        filter:
          - lowercase
          - x_compound
      index_analyzer:
        type: "custom"
        tokenizer: "standard"
        filter:
          - lowercase
          - x_compound
    filter:
      x_compound:
        type: "decompound"
'

curl -XPUT http://localhost:9400/xyz/_mapping/entries -d '
{
  "properties": {
    "title": {
      "type": "string",
      "search_analyzer": "search_analyzer",
      "analyzer": "index_analyzer"
    }
  }
}'

curl -XPOST http://localhost:9400/xyz/entries -d '
{"title": "dies ist ein test"}
'
curl -XPOST http://localhost:9400/xyz/entries -d '
{"title": "dies ist ein testbeitrag"}
'

curl -XPOST http://localhost:9400/xyz/entries -d '
{"title": "dies ist ein titeltest"}
'

curl -XGET http://localhost:9400/xyz/_search?pretty -d '
{
  "fields": ["title"],
  "query": {
    "multi_match": {
      "fields": ["title"],
      "query": "test",
      "analyzer": "search_analyzer"
    }
  },
  "size": 10,
  "highlight": {
    "number_of_fragments": 1,
    "fields": {
      "title": {"number_of_fragments": 1}
    }
  }
}'

The result returned by the query is:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.7123179,
    "hits" : [ {
      "_index" : "xyz",
      "_type" : "entries",
      "_id" : "AVEBt3eMGZQfGkXb9v9D",
      "_score" : 0.7123179,
      "fields" : {
        "title" : [ "dies ist ein test" ]
      },
      "highlight" : {
        "title" : [ "dies ist ein<em>dies ist ein test</em>" ]
      }
    }, {
      "_index" : "xyz",
      "_type" : "entries",
      "_id" : "AVEBt3ezGZQfGkXb9v9E",
      "_score" : 0.5036848,
      "fields" : {
        "title" : [ "dies ist ein testbeitrag" ]
      },
      "highlight" : {
        "title" : [ "dies ist ein<em>dies</em> testbeitrag" ]
      }
    }, {
      "_index" : "xyz",
      "_type" : "entries",
      "_id" : "AVEBt3fmGZQfGkXb9v9F",
      "_score" : 0.5036848,
      "fields" : {
        "title" : [ "dies ist ein titeltest" ]
      },
      "highlight" : {
        "title" : [ "dies ist ein<em>st e</em> titeltest" ]
      }
    } ]
  }
}

As you can see, there are two problems with the highlights:

  1. The matched word is not highlighted: [ "dies ist ein<em>st e</em> titeltest" ]
  2. Parts of the sentence are duplicated: [ "dies ist ein<em>dies ist ein test</em>" ]

Maybe I have to change the mapping to use different word positions?

Thanks

Support for Elastic 2.0

The readme.md references a link to a 2.0-rc download archive, but the link is broken.
Any thoughts on supporting a recent ES version?

Still maintained?

The last update was two years ago, so I am not sure whether this plugin is still maintained.
Or maybe there is no need for it anymore, in case Elasticsearch has its own implementation?

Thanks for clarifications in advance 👍

German decomp adds ";" symbol for certain words

Configuration:

default:
  tokenizer: standard
  filter: [german_decomp]

german_decomp:
  type: decompound

Query: _analyze?text="tomaten"

Result:

{
  "tokens": [
    {
      "token": "tomaten",
      "start_offset": 1,
      "end_offset": 8,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": ";",
      "start_offset": 1,
      "end_offset": 2,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}

License

Is the license really GPL?

Failure to decompose "Taschenhersteller"

Hi,

First of all, thanks for your plugin, which avoids having to use the obscure compound word token filter with hyphenation_decompounder (https://www.elastic.co/guide/en/elasticsearch/reference/2.0/analysis-compound-word-tokenfilter.html).

That said, I cannot decompose "Taschenhersteller", a German word which should be decomposed into two words: Taschen & Hersteller.
Having installed your plugin, I created the following (possibly erroneous) mapping:

-XPOST localhost:9200/my_index {
  "index": {
    "analysis": {
      "filter": {
        "decomp": {
          "type": "decompound"
        }
      },
      "tokenizer": {
        "decomp": {
          "type": "standard",
          "filter": [
            "decomp"
          ]
        }
      },
      "analyzer": {
        "my_anal": {
          "type": "custom",
          "tokenizer": "decomp"
        }
      }
    },
    "mappings": {
      "type1": {
        "properties": {
          "field1": {
            "type": "string",
            "analyzer": "my_anal"
          }
        }
      }
    }
  }
}

When trying to analyze the text "Taschenhersteller"

-XPOST localhost:9200/my_index {
    "analyzer": "my_anal",
    "text": "Taschenhersteller"
}

It gives me

{
    "tokens": [
        {
            "token": "Taschenhersteller",
            "start_offset": 0,
            "end_offset": 17,
            "type": "<ALPHANUM>",
            "position": 0
        }
    ]
}

I don't understand what I'm doing wrong...

Could you help me, please? :)

Support for ES 2.2.1

When I try to install the plugin, I get this error:

ERROR: Plugin [decompound] is incompatible with Elasticsearch [2.2.1]. Was designed for version [2.2.0]

Is there going to be a release for this version?

Thanks!

xbib.com expired

It seems your website (xbib.com) has expired, so downloads are not available anymore.
Probably a good occasion to also tackle #29 and set up an actual CI system. :)

Not recognizing 'Blutorange' as compound word

Hi, how exactly does this plugin work? Is it based on a German dictionary? We have the concrete problem that it does not decompose the word 'Blutorange'. Is there a way this can be fixed?

Build/release for 5.1.2

A release for 5.1.2 would be really awesome 👍

Isn't it possible to declare your plugin as compatible with the whole 5.1.* series, so that only one build per minor release is needed?

Support for ES 2.3.2

Currently it is not possible to install this plugin on ES 2.3.2. The error:
ERROR: Plugin [decompound] is incompatible with Elasticsearch [2.3.2]. Was designed for version [2.3.0]

Is it possible to have this plugin work without a new release for every minor Elasticsearch release?

Thanks!

Failure to decompound Wandhalter

The term Wandhalterung is split into the tokens wand, alterung instead of wand, halterung. When setting the threshold to 0.63 or higher, the tokens are wandh and alterung. What can I do to fix this?

These are my settings:

index :
    analysis :
        analyzer :
            analyzer_decomp :
                type : custom
                tokenizer : standard
                filter : [lowercase, decomp]
        filter :
            decomp:
                type: decompound
        tokenizer:
            decomp:
                type: standard
                filter:
                  - decomp

I'm using Elasticsearch 2.1.1 and elasticsearch-analysis-decompound 2.1.1.0

Failure to decompound "Kinderzahnheilkunde"

The plugin fails to decompound the German word "Kinderzahnheilkunde". The resulting tokens are ["kinderzahnheilkunde", "kinderzahnhe", "ilkunde"]. The expected tokens are ["kinderzahnheilkunde", "kinder", "zahn", "heil", "kunde"].

I'm using plugin Version 2.2.0.0 and elasticsearch 2.2.0.

Index settings are

{
        "analysis": {
            "filter": {
                "german_stop": {
                    "type": "stop",
                    "stopwords": "_german_"
                },
                "german_stemmer": {
                    "type": "stemmer",
                    "language": "light_german"
                },
                "german_decompound": {
                    "type": "decompound"
                }
            },
            "analyzer": {
                "german_with_decompounder": {
                    "tokenizer": "standard",
                    "filter": [
                            "lowercase",
                            "german_decompound",
                            "unique",
                            "german_stop",
                            "german_normalization",
                            "german_stemmer"
                    ]
                }
            }
        }
    }

I got the results from the _analyze API with the explain=true option.

{
    "detail": {
        "custom_analyzer": true,
        "charfilters": [
        ],
        "tokenizer": {
            "name": "standard",
            "tokens": [
                {
                    "token": "Kinderzahnheilkunde",
                    "start_offset": 0,
                    "end_offset": 19,
                    "type": "<ALPHANUM>",
                    "position": 0,
                    "bytes": "[4b 69 6e 64 65 72 7a 61 68 6e 68 65 69 6c 6b 75 6e 64 65]",
                    "positionLength": 1
                }
            ]
        },
        "tokenfilters": [
            {
                "name": "lowercase",
                "tokens": [
                    {
                        "token": "kinderzahnheilkunde",
                        "start_offset": 0,
                        "end_offset": 19,
                        "type": "<ALPHANUM>",
                        "position": 0,
                        "bytes": "[6b 69 6e 64 65 72 7a 61 68 6e 68 65 69 6c 6b 75 6e 64 65]",
                        "positionLength": 1
                    }
                ]
            },
            {
                "name": "german_decompound",
                "tokens": [
                    {
                        "token": "kinderzahnheilkunde",
                        "start_offset": 0,
                        "end_offset": 19,
                        "type": "<ALPHANUM>",
                        "position": 0,
                        "bytes": "[6b 69 6e 64 65 72 7a 61 68 6e 68 65 69 6c 6b 75 6e 64 65]",
                        "keyword": false,
                        "positionLength": 1
                    },
                    {
                        "token": "kinderzahnhe",
                        "start_offset": 0,
                        "end_offset": 19,
                        "type": "<ALPHANUM>",
                        "position": 0,
                        "bytes": "[6b 69 6e 64 65 72 7a 61 68 6e 68 65]",
                        "keyword": false,
                        "positionLength": 1
                    },
                    {
                        "token": "ilkunde",
                        "start_offset": 0,
                        "end_offset": 19,
                        "type": "<ALPHANUM>",
                        "position": 0,
                        "bytes": "[69 6c 6b 75 6e 64 65]",
                        "keyword": false,
                        "positionLength": 1
                    }
                ]
            },

Any suggestions for getting better results are much appreciated. Thanks.

Release for 5.5.0

A release supporting Elasticsearch 5.5.0 would be much appreciated.

I took the master, bumped elasticsearch.version in gradle.properties and haven't had any issues so far.

There is no zip for ElasticSearch 5.2.1

In case someone needs these builds:
elasticsearch-analysis-decompound-5.2.0-plugin.zip
elasticsearch-analysis-decompound-5.2.1-plugin.zip

Installation:

$ sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install file:///path/to/elasticsearch-analysis-decompound-5.2.1-plugin.zip

or

$ sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install https://github.com/jprante/elasticsearch-analysis-decompound/files/807131/elasticsearch-analysis-decompound-5.2.1-plugin.zip

CI builds

jprante, can I offer you some help with setting up a CI host that automatically builds and publishes jar files whenever a new Elasticsearch is released?

java.lang.NumberFormatException: For input string: ""

Hi!

Thanks for your plugin.

Sometimes I get this exception:

[2013-06-06 16:57:49,918][DEBUG][action.bulk              ] [Quantum] [2] failed to execute bulk item (index) index 
java.lang.NumberFormatException: For input string: "" 
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:504)
    at java.lang.Integer.parseInt(Integer.java:527)
    at org.elasticsearch.analysis.decompound.Decompounder.reduceToBaseForm(Decompounder.java:223)
    at org.elasticsearch.analysis.decompound.Decompounder.decompound(Decompounder.java:61)
    at org.elasticsearch.index.analysis.DecompoundTokenFilter.decompound(DecompoundTokenFilter.java:68)
    at org.elasticsearch.index.analysis.DecompoundTokenFilter.incrementToken(DecompoundTokenFilter.java:55)
    at org.apache.lucene.analysis.miscellaneous.UniqueTokenFilter.incrementToken(UniqueTokenFilter.java:55)
    at org.apache.lucene.analysis.de.GermanNormalizationFilter.incrementToken(GermanNormalizationFilter.java:57)
    at org.elasticsearch.common.lucene.all.AllTokenStream.incrementToken(AllTokenStream.java:57)
    at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:202)
    at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:278)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:766)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2328)
    at org.elasticsearch.index.engine.robin.RobinEngine.innerIndex(RobinEngine.java:583)
    at org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:489)
    at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:330)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:158)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:533)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:431)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)


I tried to debug your module but couldn't find anything. It happens from time to time (when I do a bulk reindex, 50-100 docs at a time).

E.g. the first time it crashes, but the second time it works correctly with the same data.

Do you have any thoughts about the problem?

Thanks a lot anyway!

Decompound adds letters

Hi,

I just got stuck with some "FetchPhaseExecutionException" when using the highlighting and the decomp filter:

InvalidTokenOffsetsException[Token verzinnte exceeds length of provided text sized 83]

Drilling down into that was a little tricky, since the words causing the exceptions did not occur in the indexed text! After a while I found the following:

Using decompound adds some tokens to the index that are longer than the original:

E.g. for "Kupferleiter, verzinnt" it adds "verzinnt" AND "verzinnte".
I have no clue what "verzinnte" is good for; it sounds to me like the plural. However, since it is the last word in the text, highlighting fails because it exceeds the end of the text.

Here is an example analysis of "verzinnt":

{
  "tokens": [
    {
      "token": "verzinnt",
      "start_offset": 0,
      "end_offset": 8,
      "type": "",
      "position": 1
    },
    {
      "token": "verzinnte",
      "start_offset": 0,
      "end_offset": 9,
      "type": "",
      "position": 1
    }
  ]
}

My guess: the end_offset of 9 is the problem here, because the analyzed text is only 8 characters long. So when it comes to highlighting, the highlighter probably tries to highlight "verzinnte" as well, which leads to the exception...
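Until the underlying bug is fixed, a defensive post-processing step would be to clamp token offsets to the length of the analyzed text before they reach a highlighter. An illustrative sketch, not part of the plugin:

```python
def clamp_offsets(tokens, text_len):
    """Truncate spans that extend past the end of the analyzed text,
    and drop spans that start beyond it entirely."""
    out = []
    for t in tokens:
        if t["start_offset"] >= text_len:
            continue  # span lies entirely outside the text
        out.append({**t, "end_offset": min(t["end_offset"], text_len)})
    return out

# The two tokens reported above; "verzinnte" claims offset 9 in an
# 8-character text, which is what trips the highlighter.
tokens = [
    {"token": "verzinnt", "start_offset": 0, "end_offset": 8},
    {"token": "verzinnte", "start_offset": 0, "end_offset": 9},
]
print(clamp_offsets(tokens, 8))
```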

Installation doesn't work, incorrect download location

ElasticSearch Version: 0.20.5

When calling bin/plugin -install as given in the README, installation fails because the plugin cannot be downloaded from the given locations. I tried to download the plugin manually from those locations but got only 404 errors. What's the correct URL for downloading the plugin?

# bin/plugin -install jprante/elasticsearch-analysis-decompound/1.0.0
-> Installing jprante/elasticsearch-analysis-decompound/1.0.0...
Trying http://download.elasticsearch.org/jprante/elasticsearch-analysis-decompound/elasticsearch-analysis-decompound-1.0.0.zip...
Trying http://search.maven.org/remotecontent?filepath=jprante/elasticsearch-analysis-decompound/1.0.0/elasticsearch-analysis-decompound-1.0.0.zip...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/jprante/elasticsearch-analysis-decompound/1.0.0/elasticsearch-analysis-decompound-1.0.0.zip...
Trying https://github.com/jprante/elasticsearch-analysis-decompound/zipball/v1.0.0... (assuming site plugin)
Failed to install jprante/elasticsearch-analysis-decompound/1.0.0, reason: failed to download out of all possible locations...
