shayhatsor / zookeeper
This project forked from apache/zookeeper
Apache ZooKeeper .NET async Client
Home Page: https://nuget.org/packages/ZooKeeperNetEx/
License: Apache License 2.0
I'm working on porting Apache Curator to .NET, and it calls ZooKeeper.updateServerList(String).
Any chance we can get that added to your ZooKeeper class?
Is there any way to provide a directory name that the ZK.*.log files should be created in, rather than the current working directory?
I installed this SDK via NuGet in my ASP.NET Core project, but I cannot use the package. Why?
Is there a demo?
using ZooKeeperNetEx;
gives an error.
Hi,
Do you know when the 3.5.x releases will be on NuGet? Also, could the RC releases be published as prerelease versions?
Thanks!
When I add a firewall rule to block the ZK connection, memory usage grows to 1 GB in a few seconds. A profiler showed that memory is flooded with Task objects created in the AsyncManualResetEvent class.
This also happens in production when CPU usage on the VM is high. First a ZooKeeper ConnectionLoss exception is logged, and soon afterwards the process memory starts to grow very fast. It seems the ThreadPool doesn't give the connection ping Tasks a chance to fire because CPU usage is high. Then a ConnectionLoss exception is logged, and the client starts to create Task objects without any delay.
I tested this behaviour with 3.4.8.3 and 3.4.9.1.
Here's the NUnit test fixture:
[TestFixture]
public class ZookeeperMemoryLeakTest_Manual {
// Zookeeper must run on Linux machine, this test adds firewall rule
private static readonly string zkSrvIp = "192.168.60.10";
private static readonly string zkSrvUser = "root";
private static readonly string zkSrvPassw = "Password";
private static readonly string zkPort = "2181";
private static readonly string fwRule = $"INPUT -p tcp --dport {zkPort} -j DROP;";
private static readonly string zkConnStr = $"{zkSrvIp}:{zkPort}";
[TearDown]
public virtual void TestFixtureTearDown() {
Console.WriteLine("Removing Firewall rule");
SshUtils.ExecuteSshCommand(zkSrvIp, zkSrvUser, zkSrvPassw, $"sudo iptables -D {fwRule}");
}
[Test]
[Explicit]
[Description("See Ram and Processor usage. Long running.")]
public void MemoryLeakTest() {
var zookeeper = new ZooKeeper(zkConnStr, 15000, new ClientWatch());
while (zookeeper.getState() != ZooKeeper.States.CONNECTED) {
Thread.Sleep(10);
}
// Add FW rule
SshUtils.ExecuteSshCommand(zkSrvIp, zkSrvUser, zkSrvPassw, $"sudo iptables -I {fwRule}");
var totalBytesOfMemoryUsedBefore = Process.GetCurrentProcess().WorkingSet64;
Console.WriteLine("Waiting..");
Thread.Sleep(20000);
Console.WriteLine("Done Waiting..");
long totalBytesOfMemoryUsedAfter = Process.GetCurrentProcess().WorkingSet64;
var memoryUsageDiffMB = (totalBytesOfMemoryUsedAfter - totalBytesOfMemoryUsedBefore) / 1024 / 1024;
Console.WriteLine($"memoryUsageDiffMB: {memoryUsageDiffMB} MB");
Assert.Less(memoryUsageDiffMB, 100);
}
#region Watcher for zookeeper
private class ClientWatch : Watcher {
public override Task process(WatchedEvent we) {
var eventState = we.getState();
Console.WriteLine($"State from watcher: {eventState}");
return Task.FromResult<object>(null);
}
}
#endregion
}
The application process used 447 MB of memory and had 70 threads after a reset.
When I kill the ZooKeeper server, the app tries to reconnect to it.
At that point, CPU usage, memory usage, and thread count all increase.
They keep increasing, up to 1605 MB of memory and 262 threads. CPU usage hits 100% from time to time.
Please see the attached .gif below.
Please focus on the dotnet.exe process.
This was tested on .NET Core 2.1 on Windows 10 and CentOS 7.
The issue happens on all versions that support .NET Core (3.4.8.5, 3.4.9.0, 3.4.9.2, 3.4.9.3, 3.4.9.4, 3.4.10.1).
Is there any way to resolve or mitigate this issue?
I am using C# on .NET Core 1.0. When I call createNode, I get back this response object: Id = 1, Status = WaitingForActivation, Method = "{null}", Result = "{Not yet computed}", followed by an error like:
Exception: One or more errors occurred. (Exception of type 'org.apache.zookeeper.KeeperException+NoNodeException' was thrown.)
Please help me with this.
Regards,
Ram
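The "One or more errors occurred" wrapper in the message above is what .NET produces when a faulted Task is read through .Result or .Wait(); awaiting the task surfaces the inner exception directly. The sketch below uses only the base class library. FailAsync is a hypothetical stand-in for a failing asynchronous call and is not a ZooKeeperNetEx API, and it does not address why the NoNodeException was raised in the first place.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitVsResult
{
    // Hypothetical stand-in for an async call that fails.
    private static async Task<int> FailAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("no node");
    }

    // Blocking on .Result wraps the failure in an AggregateException.
    public static string ViaResult()
    {
        try
        {
            int unused = FailAsync().Result;
            return "none";
        }
        catch (Exception e)
        {
            return e.GetType().Name;
        }
    }

    // Awaiting rethrows the original exception type.
    public static async Task<string> ViaAwait()
    {
        try
        {
            await FailAsync();
            return "none";
        }
        catch (Exception e)
        {
            return e.GetType().Name;
        }
    }
}
```

With await, a catch block for the specific exception type can be written directly instead of unwrapping AggregateException.InnerException.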
Dear Shay,
I have a question about absence of await configuration in ZooKeeperNetEx implementation.
Could you please explain the rationale behind not using ConfigureAwait(false)?
(seems I'm missing a point as long as I'm new to .Net dev)
With ZooKeeper 3.4.12 released on May 1, 2018, is there a timeline for releasing a corresponding version of ZooKeeperNetEx? We never saw a 3.4.11 version of ZooKeeperNetEx.
I'm happy to help out if needed especially if you can describe your process for porting each release? Do you start from scratch or do you do diffs of the java code and then make corresponding changes to the C# code?
This client was used for a long time on Windows without any issues. A couple of months ago we tried to use it on .NET Core, and we tested it on Linux and Windows.
In our project we use ZooKeeperClient a lot to read nodes and set watchers.
The Windows version works flawlessly.
However, the Linux version throws a "Connection reset by peer" exception.
I investigated this problem and read the ZooKeeper logs. I found that ZooKeeper didn't reset its connection.
I didn't capture any TCP dumps, but I'm pretty sure there are no TCP RST packets.
Upgrading to .NET 5 makes the situation even worse (ConnectionLossExceptions appear more often).
I decided to go deeper into the ZooKeeperClient code.
I found a check which causes a falsely detected connection loss.
Unfortunately, I was not able to determine what causes this effect or how to reproduce the problem. It looks like a problem with sockets on Linux.
Removing this check solves the problem.
Also, this client sends KeepAlive pings anyway, so if there IS a real connection loss, we will know about it soon (either the next time we try to send something or at the next ping).
Please use this gist to reproduce - https://gist.github.com/nj4x/b496d492b6b31c7279dc
I can get initial data from ZooKeeper using the following code, but my watches don't work. Could you please review my usage of the library and let me know what I'm missing?
Here is my code:
void Main() {
var zooKeeper = new ZooKeeper(ConfigurationManager.AppSettings["ZooKeeperConnection"], 60000, connectionWatcher, true);
var watcher = new ZooKeeperDataWatcher(_zooKeeper, ConfigurationManager.AppSettings["ZooKeeperKey"] + "/table", _logFactory);
watcher.ValueStream.Subscribe(val => Debug.WriteLine("Got: " + val));
}
public class ZooKeeperDataWatcher : Watcher
{
private readonly ZooKeeper _zooKeeper;
private readonly string _node;
private readonly ISubject<string> _subject;
private readonly ILogger _log;
public ZooKeeperDataWatcher(ZooKeeper zooKeeper, string node)
{
_zooKeeper = zooKeeper;
_node = node;
_subject = new ReplaySubject<string>(1);
_zooKeeper.getDataAsync(_node, this)
.ContinueWith(async (task, o) =>
{
_subject.OnNext(Encoding.UTF8.GetString(task.Result.Data));
await _zooKeeper.getDataAsync(_node, this);
}, null, TaskContinuationOptions.OnlyOnRanToCompletion)
.ContinueWith((task, o) =>
{
_log.Warn("Unable to watch znode " + node, task.Exception);
}, null, TaskContinuationOptions.NotOnRanToCompletion);
}
public IObservable<string> ValueStream
{
get { return _subject; }
}
public override Task process(WatchedEvent ev)
{
return new Task(() => { }); // BREAKPOINT HERE IS NEVER HIT
}
}
Now the initial value is loaded correctly from ZooKeeper. The ClientCnxn class logs the following when I change the value in ZK: "Got WatchedEvent state:SyncConnected type:NodeDataChanged path:/znode-location for sessionid 0x15902e4bae003c4" but my process() method doesn't get called back.
FYI I'm running .NET 4.5.2 x32 on Win7, using v3.4.9.2 of ZooKeeperNetEx.
TIA,
John
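One detail in the snippet above is worth noting: new Task(() => { }) creates a task that has not been started, so anything that awaits it never sees it complete. Whether this is related to the missing callback reported here is not confirmed. The sketch below uses only the base class library (no ZooKeeperNetEx types) to show the difference between an unstarted task and an already completed one.

```csharp
using System.Threading.Tasks;

public static class WatcherReturnValues
{
    // A task created with the constructor stays in the Created state
    // until Start() is called on it.
    public static Task Unstarted()
    {
        return new Task(() => { });
    }

    // An already completed task; this form also works on .NET 4.5.x,
    // where Task.CompletedTask is not available.
    public static Task Completed()
    {
        return Task.FromResult<object>(null);
    }

    // Returns true if the task finishes within the timeout.
    public static bool FinishesWithin(Task task, int milliseconds)
    {
        return task.Wait(milliseconds);
    }
}
```

A process override that has nothing asynchronous to do can return a completed task of this kind.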
When the application is under heavy workload, especially when there are many threads (100+), it frequently receives a 'Disconnected' event followed by a 'SyncConnected' event. After a while, it eventually gets an 'Expired' event.
(I've tried raising the session timeout from 15 to 30 seconds. It didn't help.)
I guess it's because heartbeats were not sent in time due to too many Tasks running. Is there a way to work around this?
My app logs:
2018-06-14 00:09:19,303 [54] DEBUG ZooClient - Got WatchedEvent Disconnected.
2018-06-14 00:09:20,526 [51] DEBUG ZooClient - Got WatchedEvent SyncConnected.
2018-06-14 00:09:20,526 [51] DEBUG ZooClient - Connected with session id = [99864533958005145]
2018-06-14 00:11:13,084 [33] DEBUG ZooClient - Got WatchedEvent Disconnected.
2018-06-14 00:11:14,309 [49] DEBUG ZooClient - Got WatchedEvent SyncConnected.
2018-06-14 00:11:14,310 [49] DEBUG ZooClient - Connected with session id = [99864533958005145]
2018-06-14 00:13:11,964 [36] DEBUG ZooClient - Got WatchedEvent Disconnected.
2018-06-14 00:13:28,892 [35] DEBUG ZooClient - Got WatchedEvent Expired.
I am trying to create a port of the Curator library.
The Java code of the Curator library actively uses KeeperException (and its subtypes) and KeeperException.Code, but in your library these types have very restricted access.
Can you change the accessibility modifiers to public for the constructors of KeeperException (and its subtypes), and also make the KeeperException.Code enum public? This would help me :)
I wrote some simple leader election code.
Everything works fine in the initial run, and the code connects to ZooKeeper.
When one of the nodes gets disconnected, the other node's Process method gets called, but when I make any ZooKeeper API call (such as getChildren()) from inside this Process method, the call gets stuck on Wait() indefinitely. For example, with:
Process(){
Task t= this.zk.getChildrenAsync(node,null);
t.Wait();
}
the code waits on t.Wait() and does not proceed. A few seconds later the ephemeral znode for this node gets deleted from ZooKeeper, but the code is still stuck on the t.Wait() line.
Very rarely, about 1 time out of 20, the code proceeds without any problem.
I am trying this on .NET Core version 3.1.101, Ubuntu 18.04.
Hey Shay.
I downloaded the source and built 'ZooKeeperNetEx', but I get some build errors,
like this:
"netcoreapp1.0" is an unsupported framework. ZooKeeperNetEx.Tests D:\githubProj\zookeeper\src\csharp\test\ZooKeeperNetEx.Tests\project.json 14
"netstandard13" is an unsupported framework. ZooKeeperNetEx D:\githubProj\zookeeper\src\csharp\src\ZooKeeperNetEx\project.json 34
I don't know what is wrong. Can you help me?
For example, why is it deleteAsync and not DeleteAsync?
A recent CoreCLR change created a regression when trying to connect to ZooKeeper with this package.
Specifically, the change causes a failure in ClientCnxnSocketNIO.connect(DnsEndPoint addr)
because it may attempt to connect to multiple addresses that resolve for this endpoint. I get the same result whether I pass an IP address or a DNS hostname to the ZooKeeper constructor. I think the correct fix would be to resolve the IP address before making the call to Socket.Connect.
https://github.com/dotnet/corefx/issues/5829
Connecting works fine under mono and Windows .NET, but on OS X or Linux, running under the latest CoreCLR results in the error message below.
[2016-03-31 18:41:42.517 GMT ERROR ClientCnxnSocketNIO Unable to open socket to Unspecified/10.0.1.36:2181]
Exc level 0: System.PlatformNotSupportedException: Sockets on this platform are invalid for use after a failed connection attempt, and as a result do not support attempts to connect to multiple endpoints.
at System.Net.Sockets.Socket.ConnectAsync(SocketAsyncEventArgs e)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(Socket sock, DnsEndPoint addr)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(DnsEndPoint addr)
[2016-03-31 18:41:42.586 GMT WARNING ClientCnxn Session 0x0 for server Unspecified/10.0.1.36:2181, unexpected error, closing socket connection and attempting reconnect]
Exc level 0: System.PlatformNotSupportedException: Sockets on this platform are invalid for use after a failed connection attempt, and as a result do not support attempts to connect to multiple endpoints.
at System.Net.Sockets.Socket.ConnectAsync(SocketAsyncEventArgs e)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(Socket sock, DnsEndPoint addr)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(DnsEndPoint addr)
at org.apache.zookeeper.ClientCnxn.d__56.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at org.apache.zookeeper.ClientCnxn.d__59.MoveNext()
How would I connect to a secured instance of Zookeeper using this library? Would it just know which directory has the correct SSL certs?
If I understand correctly, nothing calls addListener in LeaderElectionSupport. Also, the dummy variable is always empty.
So how should we use addListener?
Zookeeper 3.4.14 has been released. We'd really appreciate an updated ZookeeperNetEx when you have time.
I have a client that changes the state of the nodes very quickly.
As a result, the server side, which uses a watcher, loses some events.
Is there a robust implementation that handles event loss?
Several times a day I get the following error log for a few of my ZooKeepers:
ClientCnxn Session 0x0 for server {192.168.1.243:8181}, unexpected error, closing socket connection and attempting reconnect]
Exc level 0: System.IO.IOException: Packet len1213486160 is out of range!
at org.apache.zookeeper.ClientCnxnSocket.readLength() in D:\ZooKeeper\src\csharp\src\ZooKeeperNetEx\zookeeper\ClientCnxnSocket.cs:line 92
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO() in D:\ZooKeeper\src\csharp\src\ZooKeeperNetEx\zookeeper\ClientCnxnSocketNIO.cs:line 60
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport() in D:\ZooKeeper\src\csharp\src\ZooKeeperNetEx\zookeeper\ClientCnxnSocketNIO.cs:line 288
at org.apache.zookeeper.ClientCnxn.startSendTask() in D:\ZooKeeper\src\csharp\src\ZooKeeperNetEx\zookeeper\ClientCnxn.cs:line 726
Any idea what could be the cause for such a packet length issue?
It returns the node data, which makes it difficult to know which node it is.
Is it possible to get the leader node?
``` c#
[TestClass]
public class UnitTest7 : Watcher
{
    private bool connected = false;

    public override async Task process(WatchedEvent @event)
    {
        if (@event.getState() == Event.KeeperState.SyncConnected)
        {
            this.connected = true;
        }
        await Task.FromResult(1);
    }

    [TestMethod]
    public async Task TestMethod1()
    {
        foreach (var i in Com.Range(100))
        {
            var client = new ZooKeeper("localhost:32771",
                (int)TimeSpan.FromSeconds(5).TotalMilliseconds, this);
            try
            {
                while (!this.connected)
                {
                    await Task.Delay(10);
                }
                if (await client.existsAsync("/home", false) == null)
                {
                    var path = await client.createAsync("/home", "".GetBytes(),
                        Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                }
                var bs = new { id = 2, name = "fas", age = 44, time = DateTime.Now }.ToJson().GetBytes();
                await client.setDataAsync("/home", bs);
                var data = await client.getDataAsync("/home", false);
                var children = await client.getChildrenAsync("/home");
                var t = client.transaction();
                //t.delete("/home");
                t.setData("/home", $"{DateTime.Now.Ticks}".GetBytes());
                var res = await t.commitAsync();
            }
            catch (Exception e)
            {
                //
            }
            finally
            {
                await client.closeAsync();
            }
        }
    }
}
```
This is a bug in the Java code, but I am also posting it here in case you want to fix it in the ZooKeeperNetEx recipes before it's fixed in core.
I'm actually using the WriteLock from the ZooKeeperNetEx C# code, but I've verified that the same issue exists in the Java recipe. On a busy system, I fairly frequently see a WriteLock that is never granted to the client and gets stuck.
What I believe is happening is that the lock sets a watch on the request before it via this code:
Stat stat = await writeLock.zookeeper.existsAsync(writeLock.lastChildId, new LockWatcher(writeLock)).ConfigureAwait(false);
if (stat != null)
{
return false;
}
LOG.warn("Could not find the" + " stats for less than me: " + lastChildName.Name);
The problem (as I see it and I'm still fairly new to Zookeeper) is that if the node represented by lastChildId has been deleted before the call to exists is made, stat will return null and the watch will only ever be invoked when the znode is created. And of course that will never happen.
The message is appearing in my log and my watcher for the lock is never invoked.
[2018-02-13 16:49:17.905 GMT WARNING WriteLock Could not find the stats for less than me: /token/SegmentProfileQueueToken/x-72057953399865370-0000000724]
I'm not entirely sure of the proper way to fix this, but I think setting
Id = null;
when stat is null should work.
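For comparison, the lock recipe in the ZooKeeper documentation handles this race by going back to listing the children whenever exists() reports that the predecessor is already gone. The sketch below illustrates that retry loop only; it is not the WriteLock implementation and is a different approach from the Id = null suggestion above. The two delegates are hypothetical stand-ins for getChildrenAsync and existsAsync(path, watcher), and sorting by the full child name is a simplifying assumption (a real recipe orders by the sequence suffix).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class LockRetrySketch
{
    // Returns true when myId is the lowest child (lock held).
    // Returns false when a watch was left on an existing predecessor,
    // in which case the caller waits for that watch to fire.
    public static async Task<bool> TryAcquireAsync(
        string myId,
        Func<Task<IList<string>>> listChildren,        // hypothetical stand-in
        Func<string, Task<bool>> existsWithWatch)      // hypothetical stand-in
    {
        while (true)
        {
            var sorted = (await listChildren().ConfigureAwait(false))
                .OrderBy(c => c, StringComparer.Ordinal)
                .ToList();

            var index = sorted.IndexOf(myId);
            if (index < 0)
            {
                throw new InvalidOperationException("Own node is missing: " + myId);
            }
            if (index == 0)
            {
                return true;
            }

            var predecessor = sorted[index - 1];
            if (await existsWithWatch(predecessor).ConfigureAwait(false))
            {
                return false;
            }
            // The predecessor was deleted between the listing and the exists
            // call, so its watch would never fire. Re-list and try again.
        }
    }
}
```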
Please find a gist reproducing the issue here: https://gist.github.com/nj4x/a13c7b8c4362ded9cedc
A null pointer exception is thrown at ZooKeeper.cs line 396 when watcher is null. It would be more stable if watcher were allowed to be null.
See also the following code:
public ZooKeeper(string connectString, int sessionTimeout, Watcher watcher,
long sessionId, byte[] sessionPasswd, bool canBeReadOnly = false) {
LOG.info("Initiating client connection, connectString=" + connectString
+ " sessionTimeout=" + sessionTimeout
+ " watcher=" + watcher
+ " sessionId=" + sessionId.ToHexString()
+ " sessionPasswd="
+ (sessionPasswd == null ? "<null>" : "<hidden>"));
sessionPasswd = sessionPasswd ?? NO_PASSWORD;
watchManager.defaultWatcher = new WatcherDelegate(@event =>
{
if (@event.getState() == Watcher.Event.KeeperState.SyncConnected || @event.getState() == Watcher.Event.KeeperState.ConnectedReadOnly)
{
connectedSignal.Set();
}
else
{
if(connectedSignal.Task.IsCompleted) connectedSignal.Reset();
}
return watcher.process(@event);
});
Hi,
Here is the page of the latest NuGet package:
https://www.nuget.org/packages/ZooKeeperNetEx
It contains a package with assemblies named ZooKeeperNetEx.dll and all of them are compiled with:
[AssemblyConfiguration("Debug")]
This means that these assemblies are Debug assemblies and are not intended to be distributed in the Release package.
Many previous versions contain only Debug assemblies as well.
If you take care of that, can you please also sign the new Released assemblies?
Thank you!
Hi!
I have a very simple example: I am reading a node from ZK, and then I want to be notified when updates are made to that node. However, only my first change to the node triggers a notification; the rest do not. Also, I am unsure what I must return from the process method when extending the Watcher abstract class.
Thank you,
Alex
class Program
{
async static Task Main(string[] args)
{
Watcher watcher = new MyWatcher();
ZooKeeper zk = new ZooKeeper("kafka1:2181,kafka2:2181,kafka3:2181", 30000, watcher);
DataResult stat = await zk.getDataAsync("/alex/test", watcher);
string json = Encoding.UTF8.GetString(stat.Data);
while (true)
{
await Task.Delay(1000);
}
}
}
public class MyWatcher : Watcher
{
public override Task process(WatchedEvent @event)
{
Console.WriteLine(@event.ToString());
return Task.Delay(0);
}
}
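ZooKeeper watches are one-time triggers: once a watch fires, it is removed, and a new read has to register it again. The sketch below simulates that behaviour without any ZooKeeperNetEx types. OneShotNode and RearmingReader are hypothetical names used only for illustration; they show the re-arming pattern, in which the watch callback reads the data again and passes itself as the watcher.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory node whose watchers fire once and are then dropped.
public sealed class OneShotNode
{
    private readonly List<Action> _watchers = new List<Action>();
    private string _value = "";

    public string GetData(Action watcher)
    {
        if (watcher != null)
        {
            _watchers.Add(watcher);
        }
        return _value;
    }

    public void SetData(string value)
    {
        _value = value;
        var fired = _watchers.ToArray();
        _watchers.Clear();              // one-time trigger: registrations are consumed
        foreach (var watcher in fired)
        {
            watcher();
        }
    }
}

// Reads the node and re-registers itself as the watcher on every notification.
public sealed class RearmingReader
{
    private readonly OneShotNode _node;

    public List<string> Seen { get; } = new List<string>();

    public RearmingReader(OneShotNode node)
    {
        _node = node;
        Read();
    }

    private void Read()
    {
        Seen.Add(_node.GetData(Read));
    }
}
```

In the program above, the equivalent step would be to call getDataAsync with the watcher again from inside process, which is an assumption about how the poster would want to structure it rather than a tested fix.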
Hi, can you please add a removeWatcher method to this API?
Hello,
I am using the C# client 3.4.12.1 and ZooKeeper server 3.4.13 (I also tried updating to 3.5.5). I have a RESTful API that creates a client per request; when the request ends, I dispose of the client by calling closeAsync() synchronously. After disposal, I can see in code that the connection state is closed, but on the ZooKeeper side the connections are not closed and their number keeps increasing (see screenshot 1, taken with telnet and the stat command). The count of established TCP ports increases too (the maximum was 582 ports; see screenshots 2 and 3 from the performance monitor).
maxClientCnxns is set to 500 in zoo.cfg. When I ran a stress test, the connection count exceeded 500 and I got a LossException in my code; the connection count then dropped to 319. In this case, the ports are sometimes released when I shut down the RESTful API.
I tried to understand the problem. I tried rewriting the code in the ClientCnxn.close() and disconnect methods: instead of calling clientCnxnSocket.wakeupCnxn(); I used the following code:
private async Task disconnect() {
if (LOG.isDebugEnabled()) {
LOG.debug("Disconnecting client for session: 0x"
+ getSessionId().ToHexString());
}
await close();
queueEventOfDeath();
}
private async Task close()
{
state.Value = (int)ZooKeeper.States.CLOSED;
LOG.info("clientCnxnSocket cleanup() in close()");
await clientCnxnSocket.cleanup().ConfigureAwait(false);
LOG.info("clientCnxnSocket close()");
clientCnxnSocket.close();
}
but this did not resolve the issue.
Please advise.
Hit this exception when using WriteLock without LockListener callback:
Location: org.apache.zookeeper.recipes.lock.WriteLock.LockZooKeeperOperation
Reason: writeLock.callback is null
lock (writeLock.callback) {
if (writeLock.callback != null) {
writeLock.callback.lockAcquired();
}
}
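The C# lock statement throws ArgumentNullException when its operand is null, so the null check inside the block above is never reached when no callback was supplied. One common way around this is to lock on a dedicated, never-null object and check the callback inside. The sketch below is illustrative only; LockCallbackHolder is a hypothetical class, not part of the WriteLock recipe, and an Action stands in for the LockListener.

```csharp
using System;

public sealed class LockCallbackHolder
{
    // Dedicated lock object that is never null.
    private readonly object _sync = new object();

    // Optional callback; may legitimately be left unset.
    public Action LockAcquired { get; set; }

    // Invokes the callback if one is set; returns whether it was invoked.
    public bool NotifyAcquired()
    {
        lock (_sync)
        {
            var callback = LockAcquired;
            if (callback == null)
            {
                return false;
            }
            callback();
            return true;
        }
    }
}
```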
I noticed that the C# port of LeaderElectionSupport has a mistake when compared with the original Java. On this line:
the C# port is comparing an event type to a node path. However, in the original Java here:
The C# version doesn't seem correct.
Hi
I am using the csharp client (3.4.12.1). I have a thrift service, which uses ephemeral nodes for service registry. The problem is that even though the application is running, the ephemeral zookeeper node created by the service gets deleted.
Upon looking into the logs, I found that I get the following message
"org.apache.zookeeper.ClientCnxn+SessionTimeoutException: Client session timed out, have not heard from server in 6687ms for sessionid 0x201dd32015e2c80
at org.apache.zookeeper.ClientCnxn.startSendTask()"
It seems like the csharp library is not able to read messages from zookeeper and closes the session.
Has anyone else faced any such issues with deletion of ephemeral nodes?
I'm seeing some ZooKeeper disconnects that are almost surely caused by business logic running for a significant amount of time in synchronous code. Ideally, I would like to spare application developers from having to go down the async rabbit hole if possible.
This is also an issue when I'm stopped in a debugger, as the client will likely disconnect.
It seems it should be possible to run ZooKeeper's main loop on a dedicated thread so that it can send its heartbeats to the ZooKeeper server even if the application is blocking in synchronous code.
Are there any recommendations or best practices to achieve this?
ZooKeeperNetEx 3.4.12
Microsoft.AspNetCore.All 2.0.5
netcoreapp2.1
Zookeeper 3.4.12
Ubuntu 18.04.1 LTS
I can connect and perform actions on ZK with Python (kazoo client 2.5.0), but not with C#. I switched to your library from "ZooKeeper.Net" (Version="3.4.6.2"), which was working but really unstable: on the 3rd or 5th request it could crash with ConnectionExpired, even though I was recreating and disposing the ZKClient on every client request.
I thought your library could help, but in my case I can't even check whether a node exists.
Could you please point out what I did wrong? The ZK instances are running on the same machine, as you can see. Could that be the problem, or is it because of the watcher, and I need to implement something in it?
My case is simple: I just need to create a node from C#, and then the Python part will read it. That's all.
private async Task<ZooKeeper> GetZooKeeper()
{
var testZk = new ZooKeeper("localhost:2181,localhost:2182,localhost:2183", 4000, ZkWatcher.Instance);
while (testZk.getState() != ZooKeeper.States.CONNECTED)
{
await Task.Delay(1000);
}
var stat = await testZk.existsAsync("/v1", false); //Watcher prints session expired
return testZk;//never goes here
}
private class ZkWatcher : Watcher
{
public static readonly ZkWatcher Instance = new ZkWatcher();
private ZkWatcher() { }
public override Task process(WatchedEvent @event)
{
Console.WriteLine(@event.ToString());
return Task.CompletedTask;
}
}
Console output:
WatchedEvent state:SyncConnected type:None path:
Server Warning: 0 : [2018-12-06 22:30:07.550 GMT WARNING ClientCnxn Client session timed out, have not heard from server in 15455ms for sessionid 0x3000000da550002]
WatchedEvent state:Disconnected type:None path:
WatchedEvent state:Expired type:None path:
Local compiler error, missing the following files: org.apache.zookeeper.data and org.apache.zookeeper.proto
Hello,
I was trying to implement a ZooKeeper change monitor. It seems that a situation can occur where calling getChildrenAsync takes forever. I made a small test program to demonstrate this behavior.
Steps:
TrackChildren(eventPath); line in the ChildrenElementsChanged method.
TrackChildren call and step over _zookeeper.getChildrenAsync.
The program:
namespace ZooKeeperDemo
{
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using org.apache.zookeeper;
internal sealed class ChildrenBug
{
public static async Task Start(string connectString, string root)
{
var connection = new ZooKeeper(connectString, 10_000_000, null);
var listener = new TreeMonitor(connection, "/");
listener.Start();
await CreateElement(root, connection);
var first = Guid.NewGuid();
for (var i = 0; i < 3; i++)
{
await CreateElement(root + "/" + first, connection);
await connection.deleteAsync(root + "/" + first);
await CreateElement(root + "/" + Guid.NewGuid(), connection);
}
while (true)
{
await Task.Delay(TimeSpan.FromDays(1));
}
}
private static async Task CreateElement(string path, ZooKeeper zookeeper)
{
var channelExists = await zookeeper.existsAsync(path);
if (channelExists != null)
{
return;
}
await zookeeper.createAsync(path, Array.Empty<byte>(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
}
class TreeMonitor
{
private readonly object _lockObject = new object();
private readonly ZooKeeper _zookeeper;
private readonly string _root;
private readonly SimpleWatcher _childrenWatcher;
private readonly SimpleWatcher _childWatcher;
private readonly Dictionary<string, byte[]> _state = new Dictionary<string, byte[]>();
public TreeMonitor(ZooKeeper zookeeper, string root)
{
_zookeeper = zookeeper;
_root = root == "/" ? string.Empty : root;
_childrenWatcher = new SimpleWatcher(ChildrenElementsChanged);
_childWatcher = new SimpleWatcher(ChildElementsChanged);
}
public void Start()
{
lock (_lockObject)
{
TrackChildren(_root);
}
}
private void TrackChildren(string trackPath)
{
var rootToUse = trackPath == string.Empty ? "/" : trackPath;
var children = _zookeeper.getChildrenAsync(rootToUse, _childrenWatcher).ConfigureAwait(false).GetAwaiter().GetResult();
foreach (var child in children.Children.Where(child => !_state.ContainsKey(child)))
{
var childPath = $"{(trackPath != "/" ? trackPath : string.Empty)}/{child}";
TrackChild(childPath);
TrackChildren(childPath);
}
}
private void TrackChild(string path)
{
if (path == null)
{
return;
}
try
{
var dataFetch = _zookeeper.getDataAsync(path, _childWatcher).ConfigureAwait(false).GetAwaiter().GetResult();
_state[path] = dataFetch.Data;
}
catch (KeeperException.NoNodeException)
{
_state.Remove(path);
}
}
private Task ChildrenElementsChanged(WatchedEvent watchedEvent)
{
if (watchedEvent.get_Type() == Watcher.Event.EventType.NodeChildrenChanged)
{
lock (_lockObject)
{
var eventPath = watchedEvent.getPath();
TrackChildren(eventPath);
}
}
return Task.CompletedTask;
}
private Task ChildElementsChanged(WatchedEvent watchedEvent)
{
var child = watchedEvent.getPath();
var eventType = watchedEvent.get_Type();
lock (_lockObject)
{
if (eventType == Watcher.Event.EventType.NodeDataChanged || eventType == Watcher.Event.EventType.NodeCreated)
{
TrackChild(child);
}
if (eventType == Watcher.Event.EventType.NodeDeleted)
{
_state.Remove(child);
// The child could already be created in the meantime. Try to track it, it will fail when it is still missing.
TrackChild(child);
}
}
return Task.CompletedTask;
}
}
}
}
Please consider making Fenced and SignalTask (especially the latter) public instead of internal. Curator's InterProcessMutex uses Java wait/notifyAll, which would normally be converted to Monitor.Wait/Monitor.PulseAll, except that it needs to call PulseAll in a watcher that doesn't hold the lock.
SignalTask looks like it provides the appropriate type of signaling, and I would prefer not to make a copy of it.
I want to debug ZooKeeper. I downloaded the source (3.4.6.1006) and opened it in VS2017, but I cannot find "org.apache.zookeeper.data" and so on. Where are "org.apache.zookeeper.data" and "org.apache.zookeeper.proto"? Thanks.
I understand that it is not part of the Java implementation, but is it correct that a method with this description:
Closes this strategy and releases any ZooKeeper resources; but keeps the ZooKeeper instance open
does not release the lock?