zalando-incubator / clin
CLI for Nakadi for event types and subscriptions management
License: MIT License
Set up CI/CD pipeline to build, test and publish to PyPI
Set up a pipeline that builds and tests all pull requests. When a release is created, the app is packaged and pushed to PyPI: https://pypi.org/project/clin
When running `clin process -d --env=staging -t $TOK ./nakadi/mops/paas.clin.yaml` with lots of event types / subscriptions to process, clin sometimes fails with an exception (see below).
EDIT: It fails especially often on the TEST environment. I have already made 8 attempts, and each one fails sooner or later.
If possible, it would be great to avoid terminating the app and to have some retry logic inside (or to increase the number of retries if they are already present).
I ran clin several times in a row, and it failed with the following errors:
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Traceback (most recent call last):
File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
chunked=chunked,
File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 426, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 421, in _make_request
httplib_response = conn.getresponse()
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 1373, in getresponse
response.begin()
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 319, in begin
version, status, reason = self._read_status()
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 288, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/adapters.py", line 450, in send
timeout=timeout
File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 725, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/util/retry.py", line 403, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/packages/six.py", line 734, in reraise
raise value.with_traceback(tb)
File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
chunked=chunked,
File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 426, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 421, in _make_request
httplib_response = conn.getresponse()
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 1373, in getresponse
response.begin()
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 319, in begin
version, status, reason = self._read_status()
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 288, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/clin/run.py", line 184, in process
processor.apply(task.target, task.envelope)
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/clin/processor.py", line 55, in apply
apply(env, envelope.spec)
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/clin/processor.py", line 147, in apply_subscription
sub.event_types, sub.owning_application, sub.consumer_group
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/clin/clients/nakadi.py", line 84, in get_subscription
resp = requests.get(url, headers=self._headers, params=params)
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/sessions.py", line 529, in request
resp = self.send(prep, **send_kwargs)
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/sessions.py", line 645, in send
r = adapter.send(request, **kwargs)
File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/adapters.py", line 501, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
clin process -d --env=staging -t $TOK ./nakadi/mops/paas.clin.yaml
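The retry logic requested above could look roughly like the following. This is a minimal sketch using only the standard library, not clin's actual implementation: the name `call_with_retries` and its parameters are hypothetical, and the caller decides which exception types count as transient.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exception types with exponential backoff.

    Hypothetical helper for illustration; not part of clin.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            # The delay doubles on every failure (0.5s, 1s, 2s, ...),
            # with up to 10% jitter added so parallel runs do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")
```

For the traceback above, `retry_on` would plausibly be `(requests.exceptions.ConnectionError,)`, wrapped around calls such as the `requests.get` in `get_subscription`. Retrying is only safe for idempotent requests like that GET; whether it is appropriate for create or update calls is a separate decision.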
Creating a Nakadi SQL query with `repartition_parameters` is possible when `repartitioning` is set in the spec (within `outputEventType`).
The `repartitioning` dict is successfully parsed, but it is output in the wrong place in the body of the POST that creates the SQL query.
According to the Nakadi SQL documentation, `repartition_parameters` is a key within `output_event_type`, not a top-level key itself.
I've hacked a patch in to make the change and can confirm that the following change in JSON output works:
 {
   "id": "event-name.multiple-partitions",
   "sql": "SELECT * FROM \"event-name\" AS a",
   "envelope": false,
   "output_event_type": {
     "name": "event-name.multiple-partitions",
     "owning_application": "app",
     "category": "business",
     "audience": "component-internal",
     "cleanup_policy": "delete",
-    "retention_time": 86400000
-  },
-  "repartition_parameters": {
-    "number_of_partitions": 6,
-    "partition_strategy": "hash",
-    "partition_key_fields": [
-      "config_sku"
-    ]
+    "retention_time": 86400000,
+    "repartition_parameters": {
+      "number_of_partitions": 6,
+      "partition_strategy": "hash",
+      "partition_key_fields": [
+        "config_sku"
+      ]
+    }
   }
 }
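To make the intended shape concrete, here is a small sketch of assembling the request body so that `repartition_parameters` ends up nested inside `output_event_type`. The function name and signature are hypothetical and do not reflect clin's internal code; only the JSON keys are taken from the payload above.

```python
from typing import Any, Dict, Optional


def build_sql_query_payload(
    query_id: str,
    sql: str,
    envelope: bool,
    output_event_type: Dict[str, Any],
    repartition_parameters: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the POST body for a SQL query (hypothetical helper, for illustration)."""
    # Copy so the caller's dict is not mutated.
    output = dict(output_event_type)
    if repartition_parameters is not None:
        # Nested under output_event_type, not at the top level of the body.
        output["repartition_parameters"] = repartition_parameters
    return {
        "id": query_id,
        "sql": sql,
        "envelope": envelope,
        "output_event_type": output,
    }
```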
Add a `repartitioning` key to an event like so:
diff --git a/docs/examples/single/sql-query.yaml b/docs/examples/single/sql-query.yaml
index 5c2d0f2..7f5932b 100644
--- a/docs/examples/single/sql-query.yaml
+++ b/docs/examples/single/sql-query.yaml
@@ -13,6 +13,10 @@ spec:
cleanup:
policy: delete
retentionTimeDays: 2
+ repartitioning:
+ partitionCount: 6
+ strategy: hash
+ keys: ["important_key"]
auth:
users:
admins:
Side note: This may be good to add to the documentation/examples. It wasn't immediately clear how to specify this.
clin apply -e ... -t ... docs/examples/single/sql-query.yaml -v -p -X
Note the location of `repartition_parameters`.
The event will be created, but with the default number of partitions, not the one you specified in `repartitioning`.
Fetching SQL queries that already exist should work smoothly.
```
return OutputEventType(
TypeError: __init__() missing 1 required positional argument: 'partition_compaction_key_field'
```
## Steps to Reproduce the Problem
1. Create a SQL query
2. Try to re-create the same query
## Specifications
- Version: 1.4.2
clin creates a valid JSON Schema from the YAML specs. If the JSON Schema is not correct, clin fails fast and notifies the user before trying to create or update an event type in Nakadi.
This also allows for better error messages that point the user to the actual error in the JSON Schema.
If `EventTypeSchema.schema` is not a valid JSON Schema, clin still posts to Nakadi and receives a `422` error.
Location: "@@@./definitions/location.yaml"
clin apply -X -e production -t $(token) wrong.yaml
Recently Nakadi introduced the `compact_and_delete` cleanup policy, which needs to be reflected in the clin tool.
- `compact_and_delete` as a value for $.cleanup.policy
- `compact_and_delete` policy still taken from $.cleanup.policy. retentionTimeDays
- `compact_and_delete` policy can be created/updated/dumped by clin
The link in SECURITY.md leads to meaningful information on how to report security issues or apply for the Bug Bounty program on HackerOne.
The provided link leads to a corporate contact page that does not explain how to report bugs or apply for the bug bounty program.
Creating a Nakadi SQL query should succeed.
{"detail":"Compacted output event type requires partition_compaction_key","status":400,"title":"Bad Request"}
Run `clin` as usual.
Nakadi event YAML files should be read in a defined order.
The `source.glob("*.yaml")` call does not return the YAML files in any guaranteed order. In our case, we have dependent Nakadi SQL queries that must be executed in the right order.
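One way to get a deterministic order is to sort the glob results by file name before processing them. The sketch below assumes `source` is a `pathlib.Path` directory, as the `source.glob("*.yaml")` call suggests; the function name `list_manifests` is hypothetical.

```python
from pathlib import Path
from typing import List


def list_manifests(source: Path) -> List[Path]:
    """Return the *.yaml files directly under source, sorted by file name."""
    # Path.glob makes no ordering promise, so sort explicitly.
    return sorted(source.glob("*.yaml"), key=lambda path: path.name)
```

With this approach, dependent queries can be ordered through their file names, for example with numeric prefixes. Note that the sort is lexical, so `10-x.yaml` sorts before `2-x.yaml` unless the prefixes are zero-padded.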
When modifying the SQL of a Nakadi SQL query with `clin apply`, the request either updates successfully (if the SQL change is valid) or fails with an error explaining why the SQL change is not valid (i.e. the error coming from Nakadi).
Updating the SQL results in an error:
× Modifying output event type is forbidden: ...
It looks like clin is trying to update the whole event, and not using the `/sql` sub-resource of the query to update the query itself.
clin apply -t ... -e ... docs/examples/single/sql-query.yaml -X
diff --git a/docs/examples/single/sql-query.yaml b/docs/examples/single/sql-query.yaml
index 5c2d0f2..fb90be2 100644
--- a/docs/examples/single/sql-query.yaml
+++ b/docs/examples/single/sql-query.yaml
@@ -4,7 +4,7 @@ spec:
sql: |
SELECT *
FROM "derokhin.clin.test" AS e
- WHERE e."important_key" = 'hello world'
+ WHERE e."important_key" = 'hello world' OR e."important_key" = 'new phone. Who dis?'
envelope: false
outputEventType:
category: business # business | data | undefined
clin apply -t ... -e ... docs/examples/single/sql-query.yaml -X
When dumping a schema to a file, I expect valid JSON or YAML output
e.g.
{
"name": "my-event-type",
"category": "data",
...
The output contains terminal colour codes, and out.json contains:
{
�[38;2;0;128;0;01m"name"�[39;00m: �[38;2;186;33;33m"my-event-type"�[39m,
�[38;2;0;128;0;01m"category"�[39;00m: �[38;2;186;33;33m"data"�[39m,
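A common way to avoid this is to emit colour codes only when the output goes to an interactive terminal. The following is a minimal sketch under that assumption; `strip_ansi` and `write_output` are hypothetical names, not functions from clin.

```python
import re
import sys
from typing import TextIO

# Matches SGR ("Select Graphic Rendition") sequences such as ESC[38;2;0;128;0;01m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Remove terminal colour codes from text."""
    return ANSI_SGR.sub("", text)


def write_output(text: str, stream: TextIO = sys.stdout) -> None:
    """Write text, keeping colours only for interactive terminals."""
    # When stdout is redirected to a file or a pipe, isatty() is False,
    # so the file receives plain JSON or YAML.
    if not stream.isatty():
        text = strip_ansi(text)
    stream.write(text)
```

An alternative design is to skip colouring at the source when the stream is not a TTY, which avoids the regular expression entirely.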
It would be great to have a verbose mode in the tool.
When you run it now, you see:
⦿ Will update: event type event.type.1
⦿ Will update: event type event.type.2
⦿ Will update: event type event.type.3
...
✔ Up to date: event type event.type.4
...
but it is not entirely clear what kind of changes are going to happen in `⦿ Will update ...`.
It would therefore be helpful to see the diff between the current state and the new one.
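As an illustration of what such a diff could look like, the sketch below compares two payloads with the standard library's `difflib`. It is only one possible approach; `render_diff` and the `current` / `desired` labels are hypothetical names.

```python
import difflib
import json
from typing import Any, Dict, List


def render_diff(current: Dict[str, Any], desired: Dict[str, Any]) -> str:
    """Return a unified diff between two JSON-serializable payloads."""

    def dump(payload: Dict[str, Any]) -> List[str]:
        # Sorted keys keep the diff stable regardless of dict ordering.
        return json.dumps(payload, indent=2, sort_keys=True).splitlines()

    lines = difflib.unified_diff(
        dump(current),
        dump(desired),
        fromfile="current",
        tofile="desired",
        lineterm="",
    )
    return "\n".join(lines)
```

An empty string means the two payloads are identical, which corresponds to the `✔ Up to date` case.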
Clin updates the event type retention policy for all event types, even log-compacted ones (to the default value).
The retention time field is not pushed to Nakadi for log-compacted event types.
When an event type is log-compacted, the cleanup field is not actually used by Nakadi. However, it is present in the payloads, leading to divergences like the following (which are impossible to fix via Clin):
⦿ Found 1 changes: event type _____
values_changed:
root.cleanup.retention_time_days:
new_value: 1
old_value: 4
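One possible way to avoid this divergence is to drop the unused retention time from both payloads before comparing them. The sketch below is illustrative only: the function name is hypothetical, the `cleanup.retention_time_days` path is taken from the diff output above, and the `policy` key with the value `"compact"` is an assumption about how the payload marks log-compacted event types.

```python
import copy
from typing import Any, Dict


def drop_unused_retention(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of payload without retention time for compacted event types."""
    normalized = copy.deepcopy(payload)  # leave the caller's payload untouched
    cleanup = normalized.get("cleanup")
    # Assumption: log-compacted event types carry cleanup.policy == "compact".
    if isinstance(cleanup, dict) and cleanup.get("policy") == "compact":
        cleanup.pop("retention_time_days", None)
    return normalized
```

Applying this to both the current and the desired state before diffing would make the `retention_time_days` change above disappear for log-compacted event types, while leaving other cleanup policies unaffected.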