dfcx-scrapi's Introduction

Scrappy, the SCRAPI mascot!

Python Dialogflow CX Scripting API (SCRAPI)

A high level scripting API for bot builders, developers, and maintainers.

Table of Contents
  1. Introduction
  2. Getting Started
  3. Usage
  4. Library Composition
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgements

Introduction

The Python Dialogflow CX Scripting API (DFCX SCRAPI) is a high level API that extends the official Google Python Client for Dialogflow CX. SCRAPI makes using DFCX easier, more friendly, and more pythonic for bot builders, developers, and maintainers.

SCRAPI --> Python Dialogflow CX
as
Keras --> TensorFlow

What Can I Do With DFCX SCRAPI?

With DFCX SCRAPI you can perform many bot building and maintenance actions at scale including, but not limited to:

  • Create, Update, Delete, Get, and List for all CX resource types (i.e. Intents, Entity Types, Pages, Flows, etc.)
  • Convert commonly accessed CX Resources to Pandas Dataframes
  • Have fully automated conversations with a CX agent (powerful for regression testing!)
  • Extract Validation information
  • Extract Change History information
  • Search across all Flows/Pages/Routes to find a specific parameter or utterance using Search Util functions
  • Quickly move CX resources between agents using Copy Util functions!
  • Build the fundamental protobuf objects that CX uses for each resource type using Builder methods
  • ...and much, much more!

Built With

  • Python 3.8+

Authentication

Authentication can vary depending on how and where you are interacting with SCRAPI.

Google Colab

If you're using SCRAPI with a Google Colab notebook, you can add the following to the top of your notebook for easy authentication:

project_id = '<YOUR_GCP_PROJECT_ID>'

# this will launch an interactive prompt that allows you to auth with GCP in a browser
!gcloud auth application-default login --no-launch-browser

# this will set your active project to the `project_id` above
!gcloud auth application-default set-quota-project $project_id

After running the above, Colab will pick up your credentials from the environment and pass them to SCRAPI directly. No need to use Service Account keys! You can then use SCRAPI simply like this:

from dfcx_scrapi.core.intents import Intents

agent_id = '<YOUR_AGENT_ID>'
i = Intents() # <-- Creds will be automatically picked up from the environment
intents_map = i.get_intents_map(agent_id)

Cloud Functions / Cloud Run

If you're using SCRAPI with Cloud Functions or Cloud Run, SCRAPI can pick up on the default environment creds used by these services without any additional configuration!

  1. Add dfcx-scrapi to your requirements.txt file
  2. Ensure the Cloud Function / Cloud Run service account has the appropriate Dialogflow IAM Role

Once you are set up with the above, your function code can be used easily like this:

from dfcx_scrapi.core.intents import Intents

agent_id = '<YOUR_AGENT_ID>'
i = Intents() # <-- Creds will be automatically picked up from the environment
intents_map = i.get_intents_map(agent_id)

Local Python Environment

Similar to Cloud Functions / Cloud Run, SCRAPI can pick up on your local authentication creds if you are using the gcloud CLI.

  1. Install gcloud CLI.
  2. Run gcloud init.
  3. Run gcloud auth login.
  4. Run gcloud auth list to ensure your principal account is active.

This will authenticate your principal GCP account with the gcloud CLI, and SCRAPI can pick up the creds from here.
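For example, assuming the Agents class exposes a get_agent method (as referenced later on this page), usage with local gcloud credentials could look like this:

from dfcx_scrapi.core.agents import Agents

agent_id = '<FULL_DFCX_AGENT_ID_PATH>'
a = Agents()  # <-- Creds will be automatically picked up from the local gcloud environment
my_agent = a.get_agent(agent_id)
print(my_agent.display_name)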


Exceptions and Misc.

There are some classes in SCRAPI which still rely on Service Account Keys, notably the DataframeFunctions class due to how it authenticates with Google Sheets.

In order to use these functions, you will need a Service Account that has appropriate access to your GCP project.
For more information and to view the official documentation for service accounts go to Creating and Managing GCP Service Accounts.

Once you've obtained a Service Account Key with appropriate permissions, you can use it as follows:

from dfcx_scrapi.core.intents import Intents
from dfcx_scrapi.tools.dataframe_functions import DataframeFunctions

agent_id = '<YOUR_AGENT_ID>'
creds_path = '<PATH_TO_YOUR_SERVICE_ACCOUNT_JSON_FILE>'

i = Intents(creds_path=creds_path)
dffx = DataframeFunctions(creds_path=creds_path)

df = i.bulk_intent_to_df(agent_id)
dffx.dataframe_to_sheets('GOOGLE_SHEET_NAME', 'TAB_NAME', df)

Getting Started

Environment Setup

Set up Google Cloud Platform credentials and install dependencies.

gcloud auth login
gcloud auth application-default login
gcloud config set project <project name>
python3 -m venv venv
source ./venv/bin/activate
pip install -r requirements.txt

Usage

To run a simple bit of code you can do the following:

  • Import a Class from dfcx_scrapi.core
  • Assign your Service Account to a local variable
from dfcx_scrapi.core.intents import Intents

creds_path = '<PATH_TO_YOUR_SERVICE_ACCOUNT_JSON_FILE>'
agent_path = '<FULL_DFCX_AGENT_ID_PATH>'

# DFCX Agent ID paths are in this format:
# 'projects/<project_id>/locations/<location_id>/agents/<agent_id>'

# Instantiate your class object and pass in your credentials
i = Intents(creds_path, agent_id=agent_path)

# Retrieve all Intents and Training Phrases from an Agent and push to a Pandas DataFrame
df = i.bulk_intent_to_df()

Library Composition

Here is a brief overview of the SCRAPI library's structure and the motivation behind that structure.

Core

The Core folder is synonymous with the core Resource types in the DFCX Agents (agents, intents, flows, etc.)

  • This folder contains the high level building blocks of SCRAPI
  • These classes and methods can be used to build higher level methods or custom tools and applications

Tools

The Tools folder contains various customized toolkits that allow you to do more complex bot management tasks, such as

  • Manipulate Agent Resource types into various DataFrame structures
  • Copy Agent Resources between Agents and GCP Projects on a resource by resource level
  • Move data to and from DFCX and other GCP Services like BigQuery, Sheets, etc.
  • Create customized search queries inside of your agent resources

Builders

The Builders folder contains simple methods for constructing the underlying protos in Dialogflow CX

  • Proto objects are the fundamental building blocks of Dialogflow CX
  • Builder classes allow the user to construct Dialogflow CX resources offline without any API calls
  • Once the resource components are constructed, they can then be pushed to a live Dialogflow CX agent via API
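As a rough, hedged sketch of that workflow (the import path and helper names shown here, such as create_new_proto_obj, proto_obj, and create_intent(..., obj=...), are assumptions and may differ from the actual SCRAPI API):

from dfcx_scrapi.builders.intents import IntentBuilder
from dfcx_scrapi.core.intents import Intents

# Construct the intent proto entirely offline -- no API calls yet.
builder = IntentBuilder()
builder.create_new_proto_obj(display_name='head_intent.billing')
builder.add_training_phrase(['I have a question about my bill'], [''])

# Push the finished proto to a live agent via the API.
i = Intents(creds_path='<PATH_TO_YOUR_SERVICE_ACCOUNT_JSON_FILE>')
i.create_intent(agent_id='<YOUR_AGENT_ID>', obj=builder.proto_obj)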

Contributing

We welcome any contributions or feature requests you would like to submit!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the Apache 2.0 License. See LICENSE for more information.

Contact

Patrick Marlow - [email protected] - @kmaphoenix
Milad Tabrizi - [email protected] - @MRyderOC

Project Link: https://github.com/GoogleCloudPlatform/dfcx-scrapi

Acknowledgements

Dialogflow CX Python Client Library
Hugging Face - Pegasus Paraphrase

dfcx-scrapi's People

Contributors

adiegocalonso, cbradgoog, dcsan, dependabot[bot], dtstry, greenford, gyar-denim, hgithubacct, hjosiah, hkhaitan1, jkshj21, jmound, karkipra, katherinez22, kmaphoenix, lambertaurellegcp, mejindal, mrpatekful, mryderoc, omerside, seanscripts, sidpagariya, totemws, zya-codes


dfcx-scrapi's Issues

[BUG] Operations.get_lro does not use the correct API host

Expected Behavior

Operations.get_lro should use the correct API host with the region prefix, for example us-central1-dialogflow.googleapis.com.

Current Behavior

Operations.get_lro always uses dialogflow.googleapis.com as the API host. This results in the error below, for example, when accessing an LRO that is scoped to a specific region:

google.api_core.exceptions.InvalidArgument: 400 Please switch to 'us-central1-dialogflow.googleapis.com' to access resources located in 'us-central1'.

Possible Solution

Check the location in the LRO ID and use the correct API host.

Steps to Reproduce

  1. Export an agent using Agents.export_agent, which returns an LRO
  2. Call Operations.get_lro
  3. Observe the error as described above

Context (Environment)

N/A

Detailed Description

N/A

Possible Implementation

An older version of SCRAPI had this that worked:

        location = lro.split("/")[3]
        if location != "global":
            base_url = "https://{}-dialogflow.googleapis.com/v3beta1".format(
                location
            )
        else:
            base_url = "https://dialogflow.googleapis.com/v3beta1"

A similar logic may need to be reinstated.

[BUG] Misaligned DataFrame Schemas

Current Behavior

Currently, there are several areas in SCRAPI that we export and import DataFrames, and their schemas are misaligned.
This causes issues with streamlining a pipeline of events because column renaming or ETLs need to be done.

Examples:
Intents.intent_proto_to_dataframe exports columns = display_name, training_phrase in basic mode.
In advanced mode for the same method, the utterance is now called text.
Mismatch of schema and semantics in the same method.

In DataframeFunctions.bulk_update_intents_from_dataframe, the basic mode expects input columns of display_name and text.
This is misaligned from the above schemas of the generated dataframes in Intents class.

So if your workflow is this:

  1. Intents to Dataframe
  2. Dataframe to Sheet
  3. Sheet to Dataframe

Step 3 will break due to misaligned schema.
We should always be in alignment with "like for like" export/import (i.e. basic and basic should match 100%).
We should also be in alignment semantically across modes (i.e. basic and advanced have different schemas, but the columns that are shared are 100% named identically)

Expected Behavior

All DataFrame schemas within the same Resource type (i.e. Intents, Entity Types, etc.) should be in alignment.

Possible Solution

Centralize the creation and validation of all schema types to a file outside of the class that is using them.
Introduce core/schemas.py or similar to maintain a central schema repository.
Then each respective class can pull their schema and schema validation rules from the central class, ensuring that we have continuity in DataFrame resources.
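A minimal sketch of what such a central module could look like (the module name and the column sets below are illustrative, not the actual SCRAPI schemas):

# core/schemas.py (hypothetical) -- single source of truth for DataFrame schemas
import pandas as pd

SCHEMAS = {
    # "basic" intent columns are taken from the examples above; the
    # "advanced" columns here are illustrative placeholders only.
    ("intents", "basic"): ["display_name", "training_phrase"],
    ("intents", "advanced"): ["display_name", "training_phrase", "parameter_id", "repeat_count"],
}

def validate_columns(df: pd.DataFrame, resource: str, mode: str) -> None:
    """Raises if the DataFrame columns do not match the shared schema."""
    missing = set(SCHEMAS[(resource, mode)]) - set(df.columns)
    if missing:
        raise ValueError(f"DataFrame is missing expected columns: {missing}")

Each respective class (Intents, DataframeFunctions, etc.) would then import its schema and validation rules from this one place.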

Steps to Reproduce

Try the following

  1. Intents to Dataframe
  2. Dataframe to Sheet
  3. Sheet to Dataframe (without modifying your sheet. leave it as-is)

Change dataframe format for create_test_dataset

Is your proposal related to a problem?

Currently, create_test_dataset returns a dataframe with columns - utterances and intent_display_name - which is not a suitable test set format for run_intent_detection.

Describe the solution you'd like

Change output dataframe format for create_test_dataset to match dataframe format required for run_intent_detection. run_intent_detection requires a dataframe with column names - flow_display_name, page_display_name, and utterance.
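Until the formats are aligned, a stop-gap is to reshape the output manually; a sketch (the default flow/page values here are assumptions, not values run_intent_detection requires):

import pandas as pd

# Stand-in for the DataFrame returned by create_test_dataset today.
test_df = pd.DataFrame(
    {"utterances": ["where is my order"], "intent_display_name": ["order.status"]}
)

# Reshape to the columns run_intent_detection expects.
test_df = test_df.rename(columns={"utterances": "utterance"})
test_df["flow_display_name"] = "Default Start Flow"   # assumed default
test_df["page_display_name"] = "START_PAGE"           # assumed default
test_df = test_df[["flow_display_name", "page_display_name", "utterance"]]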

[FR] Add Session Entity Type Class

Is your proposal related to a problem?

Need to create parity with the base Dialogflow CX Python class.

Describe the solution you'd like

Create a net-new class in core/session_entity_types.py that has all of the functionality provided by the base library.

Missing update_entity_type function in core.entity_types

I found this project quite useful for beginners like me to access the Dialogflow CX API using Python.
However, I think the update_entity_type function in core.entity_types is missing, so I would like to know if you plan to add it in the future.

Best!

[FR] Entity Types to DataFrame

As a user, I want to be able to quickly export all Entity Types and their related properties (i.e. synonyms, settings, etc.) to a DataFrame in order to inspect, review, and update as needed.

Describe the solution you'd like

Similar to bulk_intent_to_df we should include a method that allows the user to extract Entity Types to a DF.

[BUG] Page Validation Regex Does not handle Special Page Cases

When passing a page_id to the regex validator in scrapi_base that contains one of the special page cases, the validator returns an invalid response, even though it is a valid id.

Special Page cases are START_PAGE, END_SESSION, END_FLOW

Expected Behavior

Validator should accept this as a valid page_id

Current Behavior

Validator rejects the valid page_id

Possible Solution

Update the regex as follows:

r"[-0-9a-f]{1,36}|START_PAGE|END_SESSION|END_FLOW"

[FR] Ability to specify the environment when exporting an agent

Is your proposal related to a problem?

When exporting an agent using the DFCX web GUI, there is the option to specify the environment. The ExportAgentRequest class in the underlying protobuf lib also has the optional environment attribute:

environment (str):
    Optional. Environment name. If not set, draft environment is
    assumed.

Describe the solution you'd like

Add an optional environment attribute to Agents.export_agent and pass it to ExportAgentRequest.
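For reference, the underlying request already supports this today via the base client library; a sketch (client options and credentials omitted for brevity):

from google.cloud.dialogflowcx_v3beta1 import types
from google.cloud.dialogflowcx_v3beta1.services import agents

client = agents.AgentsClient()
request = types.ExportAgentRequest(
    name='<FULL_DFCX_AGENT_ID_PATH>',
    agent_uri='gs://<YOUR_BUCKET>/<EXPORT_FILE_NAME>',
    # Full environment resource path; the draft environment is used if unset.
    environment='<FULL_DFCX_AGENT_ID_PATH>/environments/<ENVIRONMENT_ID>',
)
lro = client.export_agent(request=request)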

Describe alternatives you've considered

N/A

Additional context

The ability to specify the environment is important to make sure that the correct version gets exported. In an automation workflow, we would like to use a special environment to designate the version of a flow to be exported. While working on building a flow, a conversational architect can cut a version that is ready to be exported. Then, they may continue to make more changes. Having the ability to specify the environment to export will allow a CA to rest assured that only the version that has been cut will be sent elsewhere and not the draft.

Pre-configured Issue and Pull Request Templates for this project

Background

I noticed that this repository does not have an issue template or PR template yet, so I was thinking of following these guides to create the respective templates. This can make writing bug reports, issues, and PRs quicker and less ambiguous.

Proposed solution

Issue Template:

Pull Request Template:

Questions

  • Should these templates be public or hidden?

`update_transition_route_group` should include a `language_code` flag for multilingual agents

Is your proposal related to a problem?

Certain DFCX agents need to support fulfillment messages in multiple languages.

Describe the solution you'd like

The update_transition_route_group method can be implemented similarly to update_intent in intents.py, which already supports a language_code flag. This can be done by expanding the TransitionRoutesGroup class to include the language_code flag.

Describe alternatives you've considered

N/A that I'm aware of

Additional context

@kmaphoenix is probably already very knowledgeable about this 😃

@zya-codes

[FR] Add Entry Fulfillment as Type for Fulfillment Message DF Output

Is your proposal related to a problem?

When using the SearchUtil.get_agent_fulfillment_messsage_df method, it's not immediately evident when a fulfillment message comes from the Entry Fulfillment portion of a page.
With all of the other Fulfillment types, you can track them to an Event, Intent, Condition, or Intent+Condition route.

Describe the solution you'd like

Add entry_fulfillment as type for the output of this df:
https://github.com/GoogleCloudPlatform/dfcx-scrapi/blob/main/src/dfcx_scrapi/tools/search_util.py#L730

Additional context

Motivation is so that users can modify fulfillments in Google Sheets, then push those results back to Dialogflow CX when they have been updated.
On the "push back to CX" stage, it's important to know exactly where this fulfillment needs to be updated.

[FR] Add implicit behavior to CopyUtil.copy_intent_to_agent()?

Situation:

A tagged intent is dependent on 1+ entity types. The current behavior of copy_intent_to_agent() (via _remap_parameters_in_intent()) is to ONLY create or update the intent itself, without considering its entity type dependencies. Entity type dependencies that are absent from the destination agent cause the method to throw an error when it attempts to retag the intent's training phrases with an entity type of the same display name in the destination agent.

Proposal:

Edit copy_intent_to_agent() to create or update entity type dependencies and log the changes, possibly adding a parameter to make this behavior optional. Work started in branch feature/copy_intent_update

Alternative:

Add error handling around CopyUtil line 338 providing more explicit explanation than KeyError:<entity_type.display_name>

Ask:

Thoughts? Do you prefer the proposal or alternative?

[FR] CLI Driven Testing via App / Tool module

Is your proposal related to a problem?

Some users are not able to run Colab notebooks due to enterprise security concerns with web-based runtimes. Additionally, using Python scripts via cron (i.e. Cloud Scheduler or similar) may also be off limits due to security.

In these cases, many organizations only allow CLI based commands (i.e. gcloud commands and similar) to be run ad-hoc via terminal sessions.

Describe the solution you'd like

We've developed an internal CLI-based tool that utilizes SCRAPI under the hood, but is able to run directly on the command line to perform many of the basic functions that SCRAPI provides, including but not limited to:

  • CRUD functions for Resources
  • End to End Conversation Testing
  • NLU Regression Testing

Describe alternatives you've considered

One alternative that exists is a Go-based CLI Tool developed by another Google CE. While the Go library is completely functional and works great, we've had requests to consolidate tools like this into a single library like SCRAPI.

Since SCRAPI is broadly used across the organization and community, we will start to build this CLI tool into SCRAPI to provide a "one stop shop" for many of these types of resources.

[Style] Repo-wide style cleanup

Sub-issue 1:

Google Python Style Guide 2.2 says:
"Use import statements for packages and modules only, not for individual classes or functions."
This means a lot of our imports are style-breaking, like:
from dfcx_scrapi.core.scrapi_base import ScrapiBase
A module is a .py file and is named in snake_case by convention. A class is a section of code within a module and is named in CamelCase by convention.

There are probably other class or function imports that should be module imports instead. This is a repo-wide style issue.

Recommend Solution

For example, the above import would be correctly written as:
from dfcx_scrapi.core import scrapi_base
and code within the module that previously referenced ScrapiBase directly should be changed to
scrapi_base.ScrapiBase. Other class and function imports should be changed to module imports in the same way.

[BUG] _set_region not properly setting quota_project_id

Expected Behavior

When the _set_region method is called, it should properly set the quota_project_id

Current Behavior

quota_project_id is not being set, leading to SCRAPI having to infer the Project ID from the environment.
This is sub-optimal in some situations, and breaking in others.
For example, the inferred project ID could be different from the ID being passed in the resource, leading to authentication issues.
In other scenarios, like working in Google Colab, the inferred project ID is ephemeral (colab runtime) so it will not properly pick up the project ID from the provided resource.

Possible Solution

Implement the method as follows:

  @staticmethod
  def _set_region(item_id: str, project_id: str):
      """Different regions have different API endpoints

      Args:
        item_id: agent/flow/page - any type of long path id like
          `projects/<GCP PROJECT ID>/locations/<LOCATION ID>

      Returns:
        A dictionary containing the api_endpoint to use when
        instantiating other library client objects, or None
        if the location is "global"
      """
      try:
          location = item_id.split("/")[3]
      except IndexError as err:
          logging.error("IndexError - path too short? %s", item_id)
          raise err

      if location != "global":
          api_endpoint = f"{location}-dialogflow.googleapis.com:443"
          client_options = {
              "api_endpoint": api_endpoint,
              "quota_project_id": project_id}
          return client_options

      else:
          api_endpoint = "dialogflow.googleapis.com:443"
          client_options = {
              "api_endpoint": api_endpoint,
              "quota_project_id": project_id}
          return client_options

Default value results in AttributeError

https://github.com/GoogleCloudPlatform/dfcx-scrapi/blob/57a97d416a15df8fd3470fdd217b153013a53b51/src/dfcx_scrapi/tools/dataframe_functions.py#L777-782

In function create_entity_from_dataframe
If meta = None, which is the default value, these lines throw the error:

AttributeError: 'NoneType' object has no attribute 'get'

In function bulk_create_entity_from_dataframe, the docstring says that the required dataframe columns are display_name, value, synonyms. Does it also require meta?
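A likely fix is to guard against the None default before calling .get; an abbreviated sketch (the signature and the meta keys/defaults below are illustrative, not the actual function body):

def create_entity_from_dataframe(display_name, entity_df, meta=None):
    """Abbreviated sketch of guarding the `meta=None` default."""
    meta = meta or {}  # prevents AttributeError: 'NoneType' object has no attribute 'get'
    kind = meta.get("kind", "KIND_MAP")                       # illustrative key/default
    auto_expansion = meta.get("auto_expansion_mode", False)   # illustrative key/default
    return kind, auto_expansion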

[FR] Implement Automation to Support Test Driven Design (TDD) in Dialogflow CX Agents

Is your proposal related to a problem?

As a user, I need a way to author Dialogflow CX Test Cases outside of the console/IDE.
Preferably, this can be in a format that is easily human-readable like Google Sheets or YAML.

Describe the solution you'd like

  • Implement a Test Case Builders class to support creating the Test Case Protos
  • Implement a Test Case Utils class that offers several methods including:
    • Converting a Google Sheet of Test Case data into a List of Test Case Protos ready for upload
    • Converting a YAML file of Test Case data into a List of Test Case Protos ready for upload
    • Methods for parsing inputs from all formats
    • Methods for validating input from all formats

Additional context

A primer on Test Driven Development and why it can be a powerful paradigm for designing Dialogflow CX Agents.

[FR] Add Conditional Response as response_type for Search Util Fulfillment Message DF

Is your proposal related to a problem?

When using the SearchUtil.get_agent_fulfillment_messsage_df method, conditional response types are not marked appropriately in the response_type column. Currently they come back as null/blank.

Describe the solution you'd like

If the response_type is of conditional_response, add that to the response_type value.

[FR] <Create Custom Model for Utterance Generation>

Problem

To make SCRAPI utterance generation more flexible, it would be worthwhile to supply users with a function that allows them to fine-tune their own custom Pegasus model for a given use case.

Solution

Reconfigure utterance_generator.py to set the SCRAPI custom model as the default, but provide a create_pegasus_model function that allows users to fine-tune a Pegasus model on an existing or custom TensorFlow dataset.

[BUG] Intent Parameters overwritten on Bulk Update Intent from Dataframe

Expected Behavior

When updating Intents from dataframe, all existing parameters in the Intent should be preserved and any new parameters should be appended to the existing parameters.

Current Behavior

Existing parameters are being deleted.

Possible Solution

Some Intent attributes are not being copied over appropriately in the internal method _remap_intent_values.

Intent Attribute ref:
https://github.com/googleapis/python-dialogflow-cx/blob/main/google/cloud/dialogflowcx_v3beta1/types/intent.py#L54

Steps to Reproduce

  1. Create an Intent with Training Phrases and tagged entities. Note in the UI that parameters exist for your Intent.
  2. Using the bulk_update_intent_from_dataframe method in advanced mode, update the Intent from the API without passing any Parameters in with your advanced update.
  3. Check to see that Parameters have been removed. (Use Change History in UI/Console)

Importing Changelogs has an unresolved dependency on pandas

from dfcx_scrapi.core.changelogs import Changelogs

File ".../.local/lib/python3.9/site-packages/dfcx_scrapi/core/changelogs.py", line 20, in
import pandas as pd
ModuleNotFoundError: No module named 'pandas'

Running pip3 install pandas resolves the issue. Need to check whether pandas is actually used in Changelogs. If so, we may need to add it to a requirements.txt?

[BUG] update_webhook doesn't work properly when you pass a webhook_obj

Expected Behavior

Update the webhook in the target agent.

Current Behavior

Does not perform any action.

Possible Solution

based on https://github.com/googleapis/python-dialogflow-cx/blob/f2d12c53804dec7b236509aa29b200aebcc53c8a/google/cloud/dialogflowcx_v3beta1/types/webhook.py#L305-L308

we have to update the mask only if kwargs are passed to the function, and not when a webhook_obj is passed.

Steps to Reproduce

  1. Create a webhook in the agent.
  2. Pull the webhook using the list_webhooks method in dfcx_scrapi.core.webhooks.Webhooks
  3. Change an attribute of the webhook, e.g. timeout:
from datetime import timedelta
my_webhook.timeout = timedelta(seconds=12)
  4. Update the webhook:
webhook_instance.update_webhook(
    webhook_id=my_webhook.name,
    webhook_obj=my_webhook
)

Possible Implementation

mask = None
if kwargs:
    ... # existing code
    mask = field_mask_pb2.FieldMask(paths=paths)
... # existing code
if mask:
    request.update_mask = mask
... # existing code 

[FR] <Replace utterance_generator.py model>

Problem

Pegasus models are developed by Google Research. The current model implemented in utterance_generator.py is a fine-tuned Pegasus model for paraphrasing that was fine-tuned external to Google.

Solution

Fine-tune a Pegasus model for paraphrasing tasks that will serve as a base model for SCRAPI.

[Pull Request] Agent Assist Class Addition

Pull Request Template

Description

The change aims to add modules in the test_cases class. This change will help the DDs & CAs get detailed information on test coverage rate with respect to intents, flows, transitions & route groups coverage.

No new dependencies need to be added or changed.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

I extended the ScrapiBase class to create a new test_case_extend class. This was done to ensure that the module will integrate with the existing test cases class. Then I added the new modules as methods to this class.
I created a dummy agent in a GCP project and manually generated a few test cases. I then used the agent ID and a service account key from the above GCP project to test my code. The results were validated by matching the numbers against the DFCX console.

Requirements to run the above script:

  • Agent ID with some DFCX test cases configured
  • SAK (service account key) with the Dialogflow test cases admin role


Pseudo Code:

Code Snippets:

# Ex:
# from dfcx_scrapi.core.intents import Intents
# i = Intents()
# intents = i.list_intents()
# for intent in intents:
#   print(intent.display_name)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My code passes the linter as defined in the .pylintrc file
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

[FR] export_flow should not require gcs bucket

Problem

At the moment, exporting a flow (core.flows.export_flow()) requires a bucket URI as the destination. However, I have run into an instance where I would rather have the serialized flow returned inline.

Proposed Solution

The parameter gcs_path should be optional, and in its absence, the exported flow should be returned as a serialized value. This functionality is supported by the underlying cloud.dialogflowcx_v3beta1.types.ExportFlowRequest object, which itself does not require a bucket URI.
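A sketch using the base client library directly (SCRAPI wiring, client options, and credentials omitted); when no flow_uri is set, the response carries the serialized flow inline:

from google.cloud.dialogflowcx_v3beta1 import types
from google.cloud.dialogflowcx_v3beta1.services import flows

client = flows.FlowsClient()
request = types.ExportFlowRequest(name='<FULL_DFCX_FLOW_ID_PATH>')  # no flow_uri set

lro = client.export_flow(request=request)
response = lro.result()                  # ExportFlowResponse
serialized_flow = response.flow_content  # inline bytes when no GCS URI was given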

[BUG] copy_entity_type_to_agent method is not working

I'm having an issue trying to copy an entity from a source to a destination agent that is totally new (and empty) using the method
cu.copy_entity_type_to_agent

Expected Behavior

Copy of the source agent's entity to the destination agent.

Current Behavior

I'm having the following error message:
TypeError: name: "projects/projectID/locations/global/agents/agentID/e has type EntityType, but expected one of: bytes, unicode

(note the '/e' in the project path)

Steps to Reproduce

  1. Instantiate a CopyUtil object
  2. Use the method copy_entity_type_to_agent. Ex:

cu.copy_entity_type_to_agent("entityName", "projects/projectId/locations/global/agents/sourceAgentID", "projects/projectId/locations/global/agents/destAgentID")

[FR] Parameters to DataFrame

Is your proposal related to a problem?

As a user, it is difficult to know what Parameters are "active" across the CX Agent at any given time since CX Agents can have numerous Flows / Pages and it can be hard to track all of the various Parameters.

Describe the solution you'd like

Similar to bulk_intent_to_df we should include a method that allows the user to extract all Form Parameters, their associated Entity Types, and any Form Settings (i.e. redact in log, etc.) that may be available in a CX Agent at design-time.

[FR] Examples for annotated TPs uploading

Is your proposal related to a problem?

It's related to understanding how to upload annotated TPs: what the input file format is, and which SCRAPI resource I need to use to do that.

Describe the solution you'd like

A notebook with some examples.


[FR] Allow Optional Schema Mapping for Sheets to Dataframe Method

Is your proposal related to a problem?

As a user who imports Google Sheets to Dataframe, I want the ability to specify the data schema that I'm providing so everything is not interpreted as object, or potentially misinterpreted.

Describe the solution you'd like

Provide an optional arg in sheets_to_dataframe called schema that has a default value of None.
A user can then provide a schema map of the column names and dtypes that they want for each column.

Under the hood, SCRAPI will handle the assignment of the dtypes when importing the Sheet to Dataframe.
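This could be as simple as applying the provided mapping with pandas; a sketch (the schema argument is the proposed one, not an existing parameter, and the column names/dtypes are illustrative):

from dfcx_scrapi.tools.dataframe_functions import DataframeFunctions

dffx = DataframeFunctions(creds_path='<PATH_TO_YOUR_SERVICE_ACCOUNT_JSON_FILE>')

# Proposed optional schema map of column names to dtypes.
schema = {"display_name": "string", "training_phrase": "string", "repeat_count": "int64"}

df = dffx.sheets_to_dataframe('GOOGLE_SHEET_NAME', 'TAB_NAME')
if schema:
    df = df.astype(schema)  # assign the requested dtypes instead of leaving everything as object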

Additional context

This would save downstream schema manipulation for tools, scripts, apps, etc.

[FR] Add Example Notebook for NLU Regression Testing

Is your proposal related to a problem?

In light of recent additions to the conversations.py class, we now have the ability to perform some very robust NLU Regression Testing. However, it may not be immediately apparent to users how to accomplish this.

Describe the solution you'd like

We have an internal standard Colab that we use for Ad Hoc NLU Testing, which I will open source for the community to utilize.

Additional context

Artifacts to deliver will be:

  1. Google Colab Notebook w/step-by-step instructions for Ad Hoc NLU testing
  2. Python file for scheduled NLU Testing via cron (i.e. Cloud Scheduler + Cloud Function or similar)
  3. Sample Input File
  4. Sample Output File
  5. Sample Logs

Access or roles required for a service account key to integrate with DFCX #PermissionDenied #Roles

I got a service account key from my organization to make use of SCRAPI and automate some functionality in DFCX, like creating bulk test cases and uploading intents with TPs in bulk, etc.
The organization gave some access to that service account, but when I use it to even read the agent using get_agent() it throws "PERMISSION DENIED" and basically says that my account doesn't have the right permission to perform this task, without specifying which roles are needed.
I have little idea what roles/access the organization has given to the service account, and I found no way to check without access to IAM, which I don't have.
I would like clarity on which roles a service account needs to perform basic functions like uploading intents, creating test cases, etc.
I can see that every example has a prerequisite of the "API admin role" being assigned to the account, but my organization can't provide me that access. Is that the issue, or are there alternative roles that can compensate for it?
Please make use of this link https://cloud.google.com/dialogflow/cx/docs/concept/access-control
and state the exact roles a service account needs to perform basic functions in DFCX.

[BUG] Update Text Truncate to 256 char limit and throw WARNING instead of ERROR

Expected Behavior

Change logging.error to logging.warning instead since it is not causing the process to fail.

Update the truncate of text to 256.

Current Behavior

There is a logging.error being posted at this stage of DialogflowConversation.reply:
https://github.com/GoogleCloudPlatform/dfcx-scrapi/blob/main/src/dfcx_scrapi/core/conversation.py#L261

We are also truncating the text input to 250 chars, but it should be 256:
https://github.com/GoogleCloudPlatform/dfcx-scrapi/blob/main/src/dfcx_scrapi/core/conversation.py#L260

Possible Solution

Change logging.error to logging.warning instead since it is not causing the process to fail.

Update the truncate of text to 256.
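In other words, something along these lines (a sketch; variable and function names differ from the actual conversation.py code):

import logging

MAX_TEXT_LENGTH = 256  # per the Dialogflow CX quota for text input

def _truncate_text(text: str) -> str:
    if len(text) > MAX_TEXT_LENGTH:
        logging.warning(
            "Text input exceeds %s characters and will be truncated.", MAX_TEXT_LENGTH
        )
        text = text[:MAX_TEXT_LENGTH]
    return text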

Context

https://cloud.google.com/dialogflow/quotas

[BUG] restore_agent should return a google.api_core.Operation

def restore_agent(self, agent_id: str, gcs_bucket_uri: str) -> str:

When restore_agent returns a value, it should not be a str, it should be a google.api_core.Operation, which is the actual type of the response returned from line 422:

response = client.restore_agent(request)

This is a call to Dialogflow CX v3beta1's client.restore_agent(), which returns a google.api_core.Operation object. When we return it as a string, we are essentially casting the Operation object to a string, so we trigger its str() conversion and receive a string in the format 'projects/{project_id}/locations/{region_id}/operations/{operation_id}'

https://github.com/googleapis/python-dialogflow-cx/blob/main/google/cloud/dialogflowcx_v3beta1/services/agents/client.py#L1127
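The change would roughly look like this (a sketch; only the signature and return are shown, and the client-construction helper is hypothetical):

from google.api_core import operation
from google.cloud.dialogflowcx_v3beta1 import types

def restore_agent(self, agent_id: str, gcs_bucket_uri: str) -> operation.Operation:
    """Restores a DFCX agent from GCS and returns the LRO instead of a string."""
    request = types.RestoreAgentRequest(name=agent_id, agent_uri=gcs_bucket_uri)
    client = self._build_agents_client()  # hypothetical helper for brevity
    return client.restore_agent(request)  # return the Operation itself, not str(response)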

dataframe_functions _make_schema concat dataframe with dictionary

There is an error in _make_schema when concatenating a dataframe with a dictionary.

Possible Solution

Implement a better type-coercion method.
Use a dictionary to add rows to the dataframe.

Context (Environment)

when calling bulk_create_intent_from_dataframe function

or try running this example https://github.com/GoogleCloudPlatform/dfcx-scrapi/blob/main/examples/bot_building_series/bot_building_102-intents-with-annotated_tp.ipynb

[FR] Add more example notebooks to showcase SCRAPI usage

Is your proposal related to a problem?

New users of the SCRAPI library may not know where to get started or how to start coding with SCRAPI.
We will provide simple notebooks with step by step instructions to do various tasks in the SCRAPI library.

Describe the solution you'd like

Add more notebooks to examples/ folder.

[BUG] Can not delete pages with incoming transition

Expected Behavior

Delete a page if it has one or more incoming transition(s).

Current Behavior

Throws a FailedPrecondition error.

Possible Solution

Add a parameter to force the delete.

Steps to Reproduce

  1. Create two pages A and B.
  2. Create a random transition route in page A with target page of B.
  3. Try to delete page B using SCRAPI.

Possible Implementation

Based on the DFCX client library, we can add a force parameter to get this behavior.
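A sketch using the base client library (client options and credentials omitted); the DeletePageRequest force field allows deletion even when other resources still reference the page:

from google.cloud.dialogflowcx_v3beta1 import types
from google.cloud.dialogflowcx_v3beta1.services import pages

client = pages.PagesClient()
request = types.DeletePageRequest(
    name='<FULL_DFCX_PAGE_ID_PATH>',
    force=True,  # clears incoming references instead of raising FailedPrecondition
)
client.delete_page(request=request)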

[BUG] KeyError in route_groups_to_dataframe for special pages

If there are special pages (CURRENT_PAGE / PREVIOUS_PAGE) in one of the route groups, the route_groups_to_dataframe method will throw a KeyError exception.

Expected Behavior

Interpret the special pages as they are, i.e. CURRENT_PAGE or PREVIOUS_PAGE

Current Behavior

Throw a KeyError.

Possible Solution

Use .get instead of the [] operator and extract the special page name from the route.target_page:

Steps to Reproduce

  1. Create a new route group in an existing dialogflow CX agent
  2. Add a route that transitions to "Current Page" or "Previous Page"
  3. Use route_groups_to_dataframe to generate the dataframe

Possible Implementation

Replace

if route.target_page:
    temp_dict.update(
        {"target_page": all_pages_map[route.target_page]}
    )

with

if route.target_page:
    t_p = all_pages_map.get(route.target_page)
    # Handling special pages
    if not t_p:
        t_p = str(route.target_page).split("/")[-1]

    temp_dict.update({"target_page": t_p})

[FR] Non-technical error reporting for service account errors

Should we do some error handling for common service-account related errors? For instance, if the service key is expired then initializing a core class with creds_path would throw:

RefreshError: ('invalid_grant: Invalid JWT Signature.', {'error': 'invalid_grant', 'error_description': 'Invalid JWT Signature.'})

but instead we could try/catch and throw a plain-language error like "Your service account key has expired. Please make a new one and try again".

This would hopefully keep a non-technical user working in a notebook from getting blocked and needing a programmer's help.

Another error case might be when the service account key and agent id are for different projects.
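One way this could look (a sketch; the wrapping location and the message wording are illustrative):

from google.auth.exceptions import RefreshError
from dfcx_scrapi.core.intents import Intents

try:
    i = Intents(creds_path='<PATH_TO_YOUR_SERVICE_ACCOUNT_JSON_FILE>')
    intents_map = i.get_intents_map('<YOUR_AGENT_ID>')
except RefreshError as err:
    raise RuntimeError(
        "Your service account key appears to be expired or invalid. "
        "Please make a new one and try again."
    ) from err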

[BUG] Intent builder adds a space before TP parts that start with an apostrophe when adding a new TP

When using the intent builder to add a new training phrase as the one shown below, the builder adds a space between the annotated part and the following part whose text starts with the possessive "'s":

"phone's not working" where "phone" is annotated with ProvidedProductLine

Expected Behavior

Adding the tp as follows:
intent_builder.add_training_phrase(["phone", "'s not working"], ["ProvidedProductLine", ""])

should result in a tp "phone's not working" where "phone" is annotated with ProvidedProductLine

Current Behavior

Adding the tp as follows:
intent_builder.add_training_phrase(["phone", "'s not working"], ["ProvidedProductLine", ""])

results in a tp "phone 's not working" where "phone" is annotated with ProvidedProductLine

Possible Solution

Check if a part starts with the possessive "'s", and, if it does, do not add a space before it. This may be true in other situations as well, for example "," and "?" (these may already be accounted for, I haven't seen them, just a thought.)
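A sketch of the part-joining logic with such a check (illustrative only; the real builder code differs):

NO_LEADING_SPACE = ("'", ",", ".", "?", "!")

def join_parts(parts):
    """Joins training phrase parts without inserting a space before punctuation."""
    text = ""
    for part in parts:
        if text and not part.startswith(NO_LEADING_SPACE):
            text += " "
        text += part
    return text

assert join_parts(["phone", "'s not working"]) == "phone's not working"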

[FIX] Replace Frame.append with pd.concat

Context:

/layers/google.python.pip/pip/lib/python3.9/site-packages/dfcx_scrapi/core/conversation.py:117: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
/layers/google.python.pip/pip/lib/python3.9/site-packages/dfcx_scrapi/core/conversation.py:117: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.

Reference:

flow_mapped = flow_mapped.append(
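The replacement pattern would roughly be (a sketch; the columns and the appended row are illustrative):

import pandas as pd

flow_mapped = pd.DataFrame(columns=["flow_name", "page_name"])
row = {"flow_name": "Default Start Flow", "page_name": "START_PAGE"}

# Deprecated: flow_mapped = flow_mapped.append(row, ignore_index=True)
flow_mapped = pd.concat([flow_mapped, pd.DataFrame([row])], ignore_index=True)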

[BUG] non-intuitive behavior in tools/search_util function

Current Behavior

search = search_util.SearchUtil(creds_path=creds)
handlers = search.find_event_handlers()

2nd line throws AttributeError: 'SearchUtil' object has no attribute 'agent_id'
agent_id isn't a parameter in find_event_handlers.

Possible Solution

  1. agent_id should be a required constructor param, or
  2. agent_id should be a parameter for find_event_handlers().

Recommend option 2.

[BUG] _page_level_handlers() doesn't return df with all pages

Expected Behavior

The function should return a dataframe for all pages in all flows within the agent.

Current Behavior

The function returns only a fraction of the expected pages.

I wrote my own code to pull page event handlers before I was aware of find_event_handlers() (of which _page_level_handlers is roughly a third). My code returned event handlers for 1379 pages for a certain agent; for the same agent, find_event_handlers() pulled event handlers for only 77 pages.

[FR] languageCode argument for Scrapi methods

Is your proposal related to a problem?

We are using the SCRAPI tools for several processes on a multilingual agent. We need to translate entire flows, and we are also copy/pasting elements, because if the experience in one language differs from the other we need to gate it with a conditional handler based on a languageCode session parameter.

Describe the solution you'd like

We need, in the methods where it applies, a languageCode argument to select which language version of the agent the API calls via SCRAPI affect.


Additional context

I was applying my own changes to the code for some of the methods. Here is my fork as example:

migoogle@5ee4450

[FR] Intents to DataFrame Transposed

Problem Statement

For some Intent/Training Phrase analysis tasks, Conversational Designers and Linguists find that editing/modifying Intents/TPs in a transposed manner is more manageable than in the basic method we offer in bulk_intent_to_df.

By providing another method to export Intents/TPs in a transposed manner, we can enable these teams to work more efficiently to tune NLU in DFCX.

Example of Standard Output

display_name             training_phrase
Default Welcome Intent   hi
Default Welcome Intent   heya
Default Welcome Intent   hello
Default Welcome Intent   howdy
Default Welcome Intent   hey there
head_intent.billing      bill
head_intent.billing      billing
head_intent.billing      about my bill
head_intent.billing      a question regarding billing
head_intent.billing      need to discuss my bill

Example of Transposed Output

Default Welcome Intent   head_intent.billing
hi                       bill
heya                     billing
hello                    about my bill
howdy                    a question regarding billing
hey there                need to discuss my bill

Describe the solution you'd like

We can approach this in 1 of 2 ways:

  1. Add a boolean flag to the existing bulk_intent_to_df method called transpose with a default value of False.
  2. Create a net new method dedicated to transposing this data like bulk_intent_to_df_transposed

I'm personally leaning toward [1] above, but there would be a caveat we would need to handle. The transposition is only usable/applicable for the basic mode of the extract method. Therefore, we would want to throw an Error/Exception if the user provides the following input:

  • mode = advanced
  • transpose = True

The simplest way to implement this would be with the following 2 lines of code:

_, intent_dict = i.intents_to_df_cosine_prep(pmarlow_demo)
df_transposed = pd.DataFrame.from_dict(intent_dict, orient='index').transpose()

The existing method intents_to_df_cosine_prep exists to support the Cosine Similarity Distance computations outside of the SCRAPI library (although this is on the roadmap to include here). The output of this is a Dictionary that can be easily dumped into a DataFrame and transposed.
