romaklimenko / cluedin
CluedIn Python SDK
License: MIT License
The scope of this issue is two features to support CluedIn Rules:
In both cases, we need to consider that a rule consists of:
Where each action can have its own conditions.
Hence, to decide whether to execute an action, we must evaluate the rule's conditions first and then the action's conditions.
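The two-level check described above can be sketched as follows. This is only an illustration: the `Rule` and `Action` classes, the `actions_to_execute` helper, and the idea that conditions are plain predicates combined with a logical AND are all assumptions made for this example, not the SDK's or CluedIn's actual model.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Entity = Dict[str, Any]
Condition = Callable[[Entity], bool]  # hypothetical: a condition is a predicate


@dataclass
class Action:
    name: str
    conditions: List[Condition] = field(default_factory=list)


@dataclass
class Rule:
    conditions: List[Condition]
    actions: List[Action]


def actions_to_execute(rule: Rule, entity: Entity) -> List[str]:
    """Return the names of the actions that should run for this entity."""
    # The rule's own conditions gate every action.
    if not all(condition(entity) for condition in rule.conditions):
        return []
    # Then each action is checked against its own conditions
    # (an action without conditions always passes).
    return [
        action.name
        for action in rule.actions
        if all(condition(entity) for condition in action.conditions)
    ]


rule = Rule(
    conditions=[lambda e: e.get("entityType") == "/IMDb/Title"],
    actions=[
        Action("tag_old", [lambda e: e.get("year", 0) < 1950]),
        Action("always"),
    ],
)
```

With this sketch, an entity that fails the rule's conditions triggers no actions at all, regardless of what the individual actions' conditions say.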
For rules evaluation, we need to map raw entities' properties to data frame column names.
We also need methods to get all rules (of a given type: data part, golden record, or survivorship).
And we need to get one rule.
Hence, a draft plan for this scope is:
main branch (what if I don't change code?)
We already have the entries method:
Instead of writing this every time:
def flatten_properties(d):
    for k, v in d['properties'].items():
        if k == 'attribute-type':
            continue
        if k.startswith('property-'):
            k = k[9:]  # len('property-') == 9
        k = k.replace('.', '_')
        d[k] = v
    del d['properties']
    return d

df_titles = pd.DataFrame(
    map(
        flatten_properties,
        cluedin.gql.entries(ctx, query, { 'query': 'entityType:/IMDb/Title', 'pageSize': 10_000 })))
Add a cluedin.utils.flatten_properties method and use it by default in cluedin.gql.entries:
df_titles = pd.DataFrame(cluedin.gql.entries(ctx, query, { 'query': 'entityType:/IMDb/Title', 'pageSize': 10_000 }))
Which is equivalent to:
df_titles = pd.DataFrame(
    cluedin.gql.entries(
        context=ctx,
        query=query,
        variables={ 'query': 'entityType:/IMDb/Title', 'pageSize': 10_000 },
        flat=True))
For search GraphQL queries, implement an iterator (or a generator) to return paged results.
API should be like:
cluedin.gql.entries(context, query, variables)
Inside the entries method, send a GraphQL request, and if there's a cursor and entries in the response, yield the entries and send the same GraphQL request again, this time with the cursor.
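A minimal sketch of that loop, kept independent of the network so it can be run as-is. The `paged_entries` name, the `fetch_page` callable, and the `{"data": {"search": {...}}}` response shape are assumptions for this example; the real `entries` method would call the GraphQL endpoint instead of `fake_fetch_page`.

```python
from typing import Any, Callable, Dict, Iterator


def paged_entries(
    fetch_page: Callable[[Dict[str, Any]], Dict[str, Any]],
    variables: Dict[str, Any],
) -> Iterator[Dict[str, Any]]:
    """Yield entries page by page, following the cursor between requests."""
    variables = dict(variables)  # do not mutate the caller's dict
    while True:
        response = fetch_page(variables)
        search = response.get("data", {}).get("search", {})
        entries = search.get("entries") or []
        cursor = search.get("cursor")
        if not entries:
            return  # an empty page ends the iteration
        yield from entries
        if not cursor:
            return  # no cursor means there is nothing more to request
        variables["cursor"] = cursor


def fake_fetch_page(variables: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for the GraphQL call: three pages keyed by cursor."""
    pages = {
        None: {"cursor": "c1", "entries": [{"id": 1}, {"id": 2}]},
        "c1": {"cursor": "c2", "entries": [{"id": 3}]},
        "c2": {"cursor": "c3", "entries": []},
    }
    return {"data": {"search": pages[variables.get("cursor")]}}


ids = [e["id"] for e in paged_entries(fake_fetch_page, {"query": "*", "pageSize": 2})]
```

Because it is a generator, callers can pass the result straight to `pd.DataFrame(...)` or stop early without fetching the remaining pages.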
import os

import cluedin

ctx = cluedin.Context.from_json_file(os.environ['CLUEDIN_CONTEXT'])
ctx.get_token()
rules = cluedin.rules.get_rules(ctx, scope=cluedin.rules.scope.DATA_PART)
GraphQL:
query getRules($searchName: String, $isActive: Boolean, $pageNumber: Int, $sortBy: String, $sortDirection: String, $scope: String) {
  management {
    id
    rules(
      searchName: $searchName
      isActive: $isActive
      pageNumber: $pageNumber
      sortBy: $sortBy
      sortDirection: $sortDirection
      scope: $scope
    ) {
      total
      data {
        id
        name
        order
        description
        isActive
        createdBy
        modifiedBy
        ownedBy
        createdAt
        modifiedAt
        author {
          id
          username
          __typename
        }
        scope
        isReprocessing
        __typename
      }
      __typename
    }
    __typename
  }
}
with variables
{
  "scope": "DataPart"
}
import os

import cluedin

context = cluedin.utils.load(os.environ['CLUEDIN_CONTEXT'])
cluedin.load_token_into_context(context)

query = """
    query searchEntities($cursor: PagingCursor, $query: String, $pageSize: Int) {
        search(query: $query, sort: DATE, cursor: $cursor, pageSize: $pageSize) {
            totalResults
            cursor
            entries {
                id
                name
                entityType
            }
        }
    }
"""

variables = {
    "query": "entityType:/Infrastructure/User",
    "pageSize": 1
}

response = cluedin.gql.gql(context, query, variables)
evaluator = cluedin.rules.get_evaluator(rule)
filtered_entities = evaluator.get_matching_entities(rule, entities)
Commit the Postman collection with CluedIn API calls and keep it up to date.
Better late than never: the context passed to each method is currently a dict, but it would be better as an instance of a class.
This may be considered a breaking change, but in practice we only need to change load_token_into_context to return a Context instead of a dict, and then deprecate it.
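One way this migration could look is sketched below. The field names mirror the keys used in the context dicts elsewhere in these issues; the `from_dict` helper, the `access_token` field, and the warning text are assumptions for illustration, and the actual token request is left out.

```python
import warnings
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class Context:
    domain: str
    organization: str
    user: str
    password: str
    protocol: str = "https"
    access_token: Optional[str] = None  # hypothetical: filled in after login

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "Context":
        """Build a Context from a legacy dict, ignoring unknown keys."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})


def load_token_into_context(context: Dict[str, Any]) -> Context:
    """Deprecated shim: accepts the old dict and returns a Context."""
    warnings.warn(
        "load_token_into_context is deprecated; use Context instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # The real function would also request a token here (omitted in this sketch).
    return Context.from_dict(context)


ctx = Context.from_dict({
    "domain": "cluedin.local",
    "organization": "foobar",
    "user": "user",
    "password": "secret",
})
```

Existing callers keep working through the shim while seeing a `DeprecationWarning`, and new code can construct `Context` directly.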
/Infrastructure/User
entity. AspNetUsers, but there's no corresponding entity.
The goal is to post a Clue to restore this entity.
cluedin.gql.gql calls {api_url}/graphql, but CluedIn provides another GraphQL endpoint at {org_url}/graphql; this is what we need to support.
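A small sketch of how the endpoint choice might be expressed. The `graphql_url` helper, the `endpoint` parameter, and the `api_url` / `org_url` context keys are hypothetical names chosen for this example; how the SDK actually stores these URLs is not shown in this issue.

```python
from typing import Dict


def graphql_url(context: Dict[str, str], endpoint: str = "api") -> str:
    """Return the GraphQL URL for the chosen endpoint ('api' or 'org')."""
    base_urls = {
        "api": context["api_url"],  # assumed key for the API base URL
        "org": context["org_url"],  # assumed key for the organization base URL
    }
    if endpoint not in base_urls:
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    return base_urls[endpoint].rstrip("/") + "/graphql"


example_context = {
    "api_url": "https://app.cluedin.local/api/",
    "org_url": "https://foobar.cluedin.local",
}
api_endpoint = graphql_url(example_context)
org_endpoint = graphql_url(example_context, endpoint="org")
```

Defaulting to `"api"` would keep the current behavior of `cluedin.gql.gql` unchanged for existing callers.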
/entity/blob
/entity/schema
/entity/clue
Account:
{{auth}}/api/account/accounts
{{auth}}/api/account/accounts?organizationId={{organization_id}}
Availability:
{{auth}}/api/account/available?clientId={{organization}}
{{auth}}/api/account/username?username={{user}}&clientId={{organization}}
Register:
InvitationCode
new
Register
User:
user
user?id={}
import os

import cluedin

ctx = cluedin.Context.from_json_file(os.environ['CLUEDIN_CONTEXT'])
ctx.get_token()
rule = cluedin.rules.get_rule(ctx, id='...')
import cluedin

context = {
    "org": "http://foobar.cluedin.local",
    "user": "[email protected]",
    "password": "Foobar23!"
}

cluedin.get_token(context)

search_results = cluedin.graphql.search(
    context=context,
    query="*",
    sort="DATE",
    payload=[ "totalResults", "cursor" ],
    entries=[ "id", "name" ])
Use Docstrings: https://realpython.com/documenting-python-code/
Add the following helper methods:
cluedin.utils.load_json(file)
cluedin.utils.save_json(obj, file)
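A possible shape for these two helpers, using only the standard library. The UTF-8 encoding, the two-space indentation, and treating `file` as a path are choices made for this sketch rather than anything the issue specifies.

```python
import json
from typing import Any


def load_json(file: str) -> Any:
    """Read a JSON document from the given path."""
    with open(file, "r", encoding="utf-8") as f:
        return json.load(f)


def save_json(obj: Any, file: str) -> None:
    """Write obj to the given path as indented JSON."""
    with open(file, "w", encoding="utf-8") as f:
        json.dump(obj, f, indent=2, ensure_ascii=False)
```

For example, `save_json(context, 'context.json')` followed by `load_json('context.json')` would round-trip a context dict.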
Name: cluedin
Basic usage:
import cluedin

context = {
    "protocol": "http",  # default - `https`
    "domain": "cluedin.local",
    "organization": "foobar",
    "user": "[email protected]",
    "password": "Foobar23!"
}

cluedin.load_token_into_context(context)

cluedin.graphql.search(context, ...)
The search API is to be designed and implemented in #3.