
Comments (5)

costin commented on September 23, 2024

Pushed fixes for 1 and 2 to master. 3 is left for later as it ties into mapping.
The handling of the URL (between the Rest and BufferedRest clients) is not settled yet, but it will be addressed once the mapping feature comes into play.

Thanks for the feedback. It would be great to get feedback on the ESTap as well, in particular with regard to [1]. This was added through [2] because without it only one split was used instead of one per shard (default of 5), and I'm still not sure why that occurs (only with Cascading).

Cheers,

[1] https://github.com/elasticsearch/elasticsearch-hadoop/blob/master/src/main/java/org/elasticsearch/hadoop/cascading/ESHadoopTap.java#L52
[2] e04a5ce

from elasticsearch-hadoop.

osinitsin commented on September 23, 2024

Thanks man, that was fast :-)


osinitsin commented on September 23, 2024

Hi Costin,

We at Cascading (http://cascading.org) are about to deliver a new feature: (data) provider plugins for Cascading and Lingual.
As an example, I've created an ES plugin.
Essentially, it has this contract:

public class ElasticsearchProviderFactory
    public String description()
    public Scheme createScheme(Fields fields, Properties properties)
        // {new ESTap(,"dummy-resource",), esTap.sourceConfInit(FlowProcess.NULL,), esTap.getScheme()}
    public Tap createTap(Scheme scheme, String path, Properties properties)
        // {new ESTap(,scheme.getSourceFields())}

plus cascading/lingual/catalog/provider.properties with at least

factory.class.name = cascading.lingual.catalog.ElasticsearchProviderFactory
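
As a rough illustration of how a loader might pick up the provider.properties entry above, here is a minimal Java sketch. It is not Lingual's actual loading code: the names ProviderFactoryLoader, factoryClassName, loadFactory and the stand-in DemoFactory are all hypothetical. It only assumes that the factory.class.name key holds a class name and that the class has a no-argument constructor.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical stand-in for a provider factory; a real one would build Cascading Taps and Schemes.
class DemoFactory {
    public String description() {
        return "demo provider";
    }
}

// Hypothetical loader, not part of Cascading or Lingual.
final class ProviderFactoryLoader {
    static final String FACTORY_KEY = "factory.class.name";

    // Reads the properties text and returns the trimmed factory class name.
    static String factoryClassName(Reader propertiesText) throws IOException {
        Properties props = new Properties();
        props.load(propertiesText);
        String name = props.getProperty(FACTORY_KEY);
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("missing " + FACTORY_KEY);
        }
        return name.trim();
    }

    // Instantiates the named class through its no-argument constructor.
    static Object loadFactory(Reader propertiesText)
            throws IOException, ReflectiveOperationException {
        Class<?> type = Class.forName(factoryClassName(propertiesText));
        return type.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        String text = FACTORY_KEY + " = " + DemoFactory.class.getName() + "\n";
        Object factory = loadFactory(new StringReader(text));
        System.out.println(factory.getClass().getName());
    }
}
```

Note that in the java.util.Properties format the key and its value have to sit on one line (or use a trailing backslash continuation); otherwise the value is read as empty and the sketch rejects it.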

There are 2 implementations:

Loading 1000 records from a tab-delimited file (your 'artists') into ES:

hits: {
  total: 994
  max_score: 1
  hits: [
    {
      _index: artists
      _type: artist
      _id: Mexa6CRkQTSm8k7JS6OBsg
      _score: 1
      _source: {
        Id: 16
        Name: London After Midnight
        PageUrl: http://www.last.fm/music/London+After+Midnight
        PictureUrl: http://userserve-ak.last.fm/serve/252/5364091.jpg
      }
    }
    {
      _index: artists
      _type: artist
      _id: ML1lBWjYT0usmmaEBOP7-w
      _score: 1
      _source: {
        Id: 18
        Name: The Crüxshadows
        PageUrl: http://www.last.fm/music/The+Cr%C3%BCxshadows
        PictureUrl: http://userserve-ak.last.fm/serve/252/10323129.jpg
      }
    }, ........

and then doing search "artists/artist/_search?q=me*":

artists artist N1S_cm9SQ1WjyecLgEXS2Q 0.0 {Id=86, Name=Katie Melua, PageUrl=http://www.last.fm/music/Katie+Melua, PictureUrl=http://userserve-ak.last.fm/serve/252/38702721.png}
artists artist wOzfEndsSVa2VI0hxT9tUA 0.0 {Id=471, Name=Metro Station, PageUrl=http://www.last.fm/music/Metro+Station, PictureUrl=http://userserve-ak.last.fm/serve/252/8127003.jpg}
artists artist tXH8osdPQYaLefAN-FED5Q 0.0 {Id=707, Name=Metallica, PageUrl=http://www.last.fm/music/Metallica, PictureUrl=http://userserve-ak.last.fm/serve/252/7560709.jpg}
artists artist Ely-7TbWSlmRh6xqm0mRjA 0.0 {Id=914, Name=Medina, PageUrl=http://www.last.fm/music/Medina, PictureUrl=http://userserve-ak.last.fm/serve/252/60964027.png}
artists artist KbtwPj3PQH2HAchnkO9Hyg 0.0 {Id=996, Name=Mike & The Mechanics, PageUrl=http://www.last.fm/music/Mike%2B%2526%2BThe%2BMechanics, PictureUrl=http://userserve-ak.last.fm/serve/252/57142699.png}
artists artist Kr_oQ7bXRMyGJtfvjO6krw 0.0 {Id=779, Name=Bring Me The Horizon, PageUrl=http://www.last.fm/music/Bring+Me+The+Horizon, PictureUrl=http://userserve-ak.last.fm/serve/252/51720179.jpg}
artists artist gd_FBH9RToa4xHyEL6LijQ 0.0 {Id=918, Name=Megadeth, PageUrl=http://www.last.fm/music/Megadeth, PictureUrl=http://userserve-ak.last.fm/serve/252/8129787.jpg}
artists artist d_9Gj_GLR-CWBJkgjRt4Xg 0.0 {Id=657, Name=Paolo Meneguzzi, PageUrl=http://www.last.fm/music/Paolo+Meneguzzi, PictureUrl=http://userserve-ak.last.fm/serve/252/8575439.jpg}
artists artist Q52O_Jj5QxaiMZrrINpFhQ 0.0 {Id=721, Name=Wim Mertens, PageUrl=http://www.last.fm/music/Wim+Mertens, PictureUrl=http://userserve-ak.last.fm/serve/252/35625237.png}
artists artist xz4iayREQvysM4V3nYkzmQ 0.0 {Id=847, Name=The Mercury Arc, PageUrl=http://www.last.fm/music/The+Mercury+Arc, PictureUrl=http://userserve-ak.last.fm/serve/252/39053993.jpg}
artists artist 15vP6TBhSi6-UxOeuK9Qrg 0.0 {Id=477, Name=Daniel Merriweather, PageUrl=http://www.last.fm/music/Daniel+Merriweather, PictureUrl=http://userserve-ak.last.fm/serve/252/53480041.png}
artists artist 7lDFkbFUTmCEYp2dH7J13g 0.0 {Id=777, Name=The Crystal Method, PageUrl=http://www.last.fm/music/The+Crystal+Method, PictureUrl=http://userserve-ak.last.fm/serve/252/26115391.jpg}
artists artist OghYAvWhRlWYknoOH58LFA 0.0 {Id=170, Name=Mew, PageUrl=http://www.last.fm/music/Mew, PictureUrl=http://userserve-ak.last.fm/serve/252/42247291.jpg}
artists artist o_d8GDDSSrSKmnREVz71eA 0.0 {Id=359, Name=Maria Mena, PageUrl=http://www.last.fm/music/Maria+Mena, PictureUrl=http://userserve-ak.last.fm/serve/252/13556587.jpg}
artists artist gKC8UA62S3Sgzd6OCK3lUA 0.0 {Id=643, Name=Nikolas Metaxas, PageUrl=http://www.last.fm/music/Nikolas+Metaxas, PictureUrl=http://userserve-ak.last.fm/serve/252/61486893.png}

flows:

2013-06-07 16:19:17,829 INFO [main] provider.TestCatalogProviderUtil (TestCatalogProviderUtil.java:testElasticsearchProvider(341)) - loading data into elasticsearch
2013-06-07 16:19:18,266 INFO [flow] flow.Flow (BaseFlow.java:logInfo(1300)) - [] source: FileTap["TextDelimited[['Id', 'Name', 'PageUrl', 'PictureUrl']]"]["/home/oleg/dev/git/lingual/lingual-core/src/test/resources/artists.tab"]
2013-06-07 16:19:18,266 INFO [flow] flow.Flow (BaseFlow.java:logInfo(1300)) - [] sink: ESLocalTap["ESLocalScheme[['Id', 'Name', 'PageUrl', 'PictureUrl']]"]["artists"]

2013-06-07 16:19:18,893 INFO [main] provider.TestCatalogProviderUtil (TestCatalogProviderUtil.java:testElasticsearchProvider(375)) - searching elasticsearch
2013-06-07 16:19:19,004 INFO [flow] flow.Flow (BaseFlow.java:logInfo(1300)) - [] source: ESLocalTap["ESLocalScheme[['Id', 'Name', 'PageUrl', 'PictureUrl']]"]["artists/artist/_search?q=me*"]
2013-06-07 16:19:19,005 INFO [flow] flow.Flow (BaseFlow.java:logInfo(1300)) - [] sink: StdOutTap["TextLine[['num', 'line']->[ALL]]"]["stdOut"]

Best,
Oleg

On 06/11/2013 02:05 PM, Costin Leau wrote:

..would be great to get feedback on the ESTap


costin commented on September 23, 2024

Moving this to milestone 1.3 M2 to address the last bit, namely being id-aware.


costin commented on September 23, 2024

I know this is an old bug, but I want to check whether the type is still required. The bulk already supports the various metadata options, and the type and index are built in (they come from es.resource).
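
To make the point about the resource concrete: a resource string such as the "artists/artist/_search?q=me*" used earlier in this thread already carries both the index and the type. The sketch below is a hypothetical helper (EsResource is not a class from elasticsearch-hadoop) that splits such a string, assuming it always has the form index/type, optionally followed by further path segments and a "?query" part.

```java
// Hypothetical helper, not part of elasticsearch-hadoop.
// Assumes the resource looks like "index/type[/_search][?query]".
final class EsResource {
    final String index;
    final String type;
    final String query; // empty when the resource has no "?query" part

    private EsResource(String index, String type, String query) {
        this.index = index;
        this.type = type;
        this.query = query;
    }

    static EsResource parse(String resource) {
        String path = resource;
        String query = "";
        int q = resource.indexOf('?');
        if (q >= 0) {
            // Everything after the first '?' is treated as the query string.
            path = resource.substring(0, q);
            query = resource.substring(q + 1);
        }
        String[] parts = path.split("/");
        if (parts.length < 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("expected index/type, got: " + resource);
        }
        return new EsResource(parts[0], parts[1], query);
    }

    public static void main(String[] args) {
        EsResource r = parse("artists/artist/_search?q=me*");
        System.out.println(r.index + " " + r.type + " " + r.query);
    }
}
```

A resource that names only an index (for example "artists") is rejected by this sketch, which is exactly the case where a separately supplied type would still be needed.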
