
garvincasimir / elasticsearch-azure-paas


Visual Studio Project which creates an Elasticsearch cluster on Microsoft Azure using worker roles

License: MIT License

C# 85.32% PowerShell 14.68%
azure c-sharp elasticsearch worker-role

elasticsearch-azure-paas's People

Contributors

cata, garvincasimir, gitter-badger


elasticsearch-azure-paas's Issues

Elasticsearch binding 'problem' when deployed to Azure

This is not really an issue with the project, but something I learned that might save others time. Elasticsearch 2.4.1 (I don't know about other versions) binds to the loopback address by default if you don't specify an address in the configuration file. This is fine when running in the emulator, but it can be a problem when you deploy to Azure: even if you define the correct endpoint in the role settings, you won't be able to connect to Elasticsearch. Binding to the address '0' (Elasticsearch shorthand for 0.0.0.0, all interfaces) allows the instance to accept requests on the endpoint you have defined.
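For reference, this is the relevant elasticsearch.yml line (in 2.x, `network.host: 0` is Elasticsearch shorthand for binding to all interfaces):

```yaml
# elasticsearch.yml
# Bind to all interfaces so the endpoint defined in the role settings is reachable.
# '0' is Elasticsearch shorthand for 0.0.0.0.
network.host: 0
```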

Azure Files share

I have been testing the performance of Elasticsearch using an Azure Files share as persistent storage for the indexes, and the performance is not good.
I ran two tests, one using local storage and one using the file share; both tests indexed 13 million documents. With local storage, indexing took about 45 minutes; with the share, it took over 2 hours.

There should be a config value specifying whether to use local storage or Azure Files.
For systems that use an external data store to populate the index, there may not be any need for a persistent disk for the index.
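Such a switch could look something like this in the service configuration (the setting name UseAzureFilesStorage is hypothetical, not an existing setting in this project):

```xml
<!-- ServiceConfiguration.cscfg (sketch; setting name is hypothetical) -->
<ConfigurationSettings>
  <!-- false = local (instance) storage for indexes; true = Azure Files share -->
  <Setting name="UseAzureFilesStorage" value="false" />
</ConfigurationSettings>
```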

Error using Azure worker role temp folder for download

In softwareManager.cs line 40 download is called using:

_artifact.Download(_binaryArchive);

This causes the download method to use the Azure worker role temp folder. This folder has a size limit of 100 MB, which is too small to download both Elasticsearch and Java.
Changing the download call to the following fixes the problem:

_artifact.Download(_binaryArchive, false);

JAVA_HOME not set

I added some more logging and found the reason why Elasticsearch does not start on new instances during scaling.
When Elasticsearch is started, it complains about a missing JAVA_HOME variable. The variable has been set, but it is somehow not picked up.
After restarting the worker role, everything works as expected.

By starting the Elasticsearch process like this:

_process = new Process();
_process.StartInfo = new ProcessStartInfo
{
    FileName = startupScript,
    UseShellExecute = false,
    RedirectStandardOutput = true
};
_process.Start();
_process.BeginOutputReadLine();

and capturing the standard output like this:

_process.OutputDataReceived +=
    delegate(object sender, DataReceivedEventArgs args)
    {
        var output = args.Data;
        if (output == "JAVA_HOME environment variable must be set!")
        {
            Trace.TraceInformation("JAVA_HOME not set, restarting");
            throw new Exception("JAVA_HOME not set, restarting");
        }
    };

it is possible to listen for the text "JAVA_HOME environment variable must be set!", which the elasticsearch.bat file outputs, and restart the worker role by throwing an exception. Not very elegant, but it works as a temporary solution.

RoleRoot missing separator character when deployed to Azure

When deployed to Azure, RoleRoot points to a drive, not a folder as in the emulator.
This leads to an invalid path when constructing PackagePluginPath; the path ends up looking like
E:approot\plugins. The code should append a path separator to the drive, like:
roleRoot = roleRoot + @"\";
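A slightly more defensive version of that fix (a sketch, not the project's actual code) appends the separator only when it is missing, so the same code works both in the emulator and in Azure:

```csharp
using System;
using System.IO;

string roleRoot = Environment.GetEnvironmentVariable("RoleRoot");

// In Azure, RoleRoot can be a bare drive such as "E:"; in the emulator it is a folder.
// Without a trailing separator, the combined path comes out as "E:approot\plugins".
if (!roleRoot.EndsWith(Path.DirectorySeparatorChar.ToString()))
{
    roleRoot += Path.DirectorySeparatorChar;
}

string packagePluginPath = Path.Combine(roleRoot, @"approot\plugins");
```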

Implement internal load balancing

Configure worker roles (Elasticsearch Nodes) to have a single internally load balanced endpoint on the default Elasticsearch port.

Store additional plugins in storage

It would be nice if we moved marvel and any additional plugins to a configurable storage container. The only plugin required by this solution is the discovery plugin so we can make an exception and keep it in the solution. We don't want to be dictating what plugins should be used. We also don't want to make it difficult for people to add their own plugins for their purposes.

Elasticsearch command line plugin install

This project currently supports installing plugins by placing the zip files in a special storage container. However, using a local zip file is only one of the ways an Elasticsearch plugin can be installed. It would be nice if the project supported the following pattern

plugin --install <org>/<user/component>/<version>

I am not sure how this would work but we would need to store a list of these org/user/component/version combinations somewhere then feed it into the plugin installer for processing. It would be nice if the service configuration supported arbitrary lists of strings in a setting.

<ListSettings>
    <ListSetting name="ElasticsearchPlugins">
        <Setting>elasticsearch/marvel</Setting>
        <Setting>elasticsearch/shield</Setting>
    </ListSetting>
</ListSettings>
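Until something like the XML above is supported, one workaround (a sketch; the setting name and the InstallPlugin helper are hypothetical) is to store the list as a delimited string, since service configuration settings are plain strings:

```csharp
using System;
using Microsoft.WindowsAzure.ServiceRuntime;

// Hypothetical setting value: "elasticsearch/marvel;elasticsearch/shield"
string raw = RoleEnvironment.GetConfigurationSettingValue("ElasticsearchPlugins");

string[] plugins = raw.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);

foreach (string plugin in plugins)
{
    // Feed each org/user/component/version entry to the plugin installer.
    InstallPlugin(plugin.Trim()); // InstallPlugin is a placeholder
}
```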

es_heap_size

We need a way to set ES_HEAP_SIZE. I think this value should be set automatically based on the available memory on the worker role; the recommended setting is 50% of the available memory. It should also be possible to override it through a config value.

Elasticsearch recommends setting this value via the ES_HEAP_SIZE environment variable, but I think a better approach would be to set the -Xms and -Xmx parameters when starting Elasticsearch, to minimize dependencies.
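A sketch of the automatic calculation (the method and its inputs are hypothetical; obtaining the total physical memory, e.g. via a P/Invoke call, is left out to keep it short):

```csharp
// Build the JVM heap arguments for the Elasticsearch startup script.
// totalMemoryMb is assumed to come from the worker role's physical memory.
static string BuildHeapArguments(long totalMemoryMb, long? overrideMb = null)
{
    // Recommended default: 50% of available memory, overridable via config.
    long heapMb = overrideMb ?? totalMemoryMb / 2;

    // Set -Xms and -Xmx directly instead of relying on ES_HEAP_SIZE.
    return string.Format("-Xms{0}m -Xmx{0}m", heapMb);
}
```

For example, BuildHeapArguments(3584) would produce "-Xms1792m -Xmx1792m".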

Can this be run in web jobs?

I'm pretty unfamiliar with how this actually works. I have a really low-traffic site, but I need Elasticsearch and was hoping I could just add it via a web job.

Elasticsearch shutdown

I have been trying to debug this part of the code, which tries to shut down Elasticsearch:

public virtual void Stop()
{
    if (_process != null)
    {
        return;
    }

    if (_process.HasExited)
    {
        return;
    }

    _process.CloseMainWindow();
}

To me it seems that _process is never null, so the first check returns immediately and _process.CloseMainWindow() is never called.

Should the correct implementation be this?

public virtual void Stop()
{
    if (_process == null)
    {
        return;
    }

    if (_process.HasExited)
    {
        return;
    }

    _process.CloseMainWindow();
}
