
azure-search-custom-skill-python's Introduction

Custom API Skill for Azure Search with Serverless (Python)

Background

Cognitive Search is an AI feature in Azure Search, used to extract text from images, blobs, and other unstructured data sources - enriching the content to make it more searchable in an Azure Search index. Extraction and enrichment are implemented through cognitive and custom skills attached to an indexing pipeline.

This repository contains an Azure Function (Python HTTP Trigger) that implements the Web API custom skill interface, allowing you to extend Cognitive Search by calling out to an API endpoint providing custom operations.
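As a rough illustration of the Web API custom skill interface: the indexer posts a JSON body with a values array, where each record carries a recordId and a data object, and expects a values array back with the same recordId on every record. The sketch below uses only the standard library and is not the repository's code. The input field text and the output field processedText are hypothetical names chosen for the example.

```python
import json


def handle_skill_request(body: str) -> str:
    """Return a custom-skill style response for a custom-skill style request.

    Assumption: each record's data holds a string field named "text"
    (hypothetical); the output field "processedText" is hypothetical too.
    """
    payload = json.loads(body)
    results = []

    for record in payload.get("values", []):
        text = record.get("data", {}).get("text")

        if isinstance(text, str):
            # Placeholder business logic: uppercase the input text.
            results.append({
                "recordId": record["recordId"],
                "data": {"processedText": text.upper()},
                "errors": [],
                "warnings": [],
            })
        else:
            # Report the problem on the record itself instead of failing the batch.
            results.append({
                "recordId": record["recordId"],
                "data": {},
                "errors": [{"message": "Missing or non-string 'text' input."}],
                "warnings": [],
            })

    return json.dumps({"values": results})
```

Each output record echoes the recordId it received, which is how the indexer matches results back to the documents in the batch.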

Prerequisites

Before you start, you must have the following:

For a better development experience, we recommend using Visual Studio Code with the Python and Azure Functions extensions.

About the sample

The sample is a framework that can be used for any Azure Search custom skill. It is not tied to any specific service except Azure Functions. Key features and advantages:

  • Developers only need to write the business logic and populate the values property with the output.
  • Built with Marshmallow schemas, strengthening data consistency by serializing/deserializing objects to primitive Python types and simplifying data validation.

How to use

import logging
import azure.functions as func
from typing import List
from .models.output import OutputRecord
from .utils.schemas_helper import output_dumps
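from .utils.functions_helper import load_request, bad_request, ok
# Note: RequestResult and InputSkill are used in the annotations below and must
# also be imported from the sample's own modules (their paths are not shown here).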

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Custom skill processed a request.')

    req_result: RequestResult = load_request(req)

    if not req_result.valid:
        return bad_request(req_result.error)

    input_skill: InputSkill = req_result.input_skill

    # YOUR CODE HERE

    values: List[OutputRecord] = [] # Update your values property
    output_json, error = output_dumps(values)

    if error:
        return bad_request('Invalid output format')
    
    return ok(output_json)

In the Azure Functions main file, there are basically three tasks that need to be done:

  1. Read data from input_skill and apply your processing logic (e.g. replace words in documents, apply a regex, etc.).
  2. Update the OutputData class with the property name you defined in Azure Search. In this sample, the generic property name is contractTextProcessed.
  3. Update the values list with results from the previous processing.
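The three tasks above can be sketched as follows. This is a minimal, self-contained illustration, not the repository's code. The dataclasses are simplified stand-ins for the sample's Marshmallow-backed models, InputRecord and the input field name contractText are hypothetical, and whitespace normalization stands in for whatever processing you need.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class InputRecord:
    """Hypothetical stand-in for one record of the incoming skill payload."""
    record_id: str
    data: Dict[str, Any]


@dataclass
class OutputData:
    # Task 2: the property name must match the one defined in Azure Search.
    contractTextProcessed: str


@dataclass
class OutputRecord:
    """Simplified stand-in for the sample's OutputRecord model."""
    record_id: str
    data: OutputData
    errors: List[str] = field(default_factory=list)


def process_records(records: List[InputRecord]) -> List[OutputRecord]:
    values: List[OutputRecord] = []

    for record in records:
        # Task 1: read the input and apply the processing logic.
        # "contractText" is an assumed input field name.
        raw = str(record.data.get("contractText", ""))
        cleaned = re.sub(r"\s+", " ", raw).strip()

        # Task 3: add the result to the values list.
        values.append(OutputRecord(record.record_id, OutputData(cleaned)))

    return values
```

Keeping the processing in a plain function like process_records, separate from the HTTP handler, makes it straightforward to test without an Azure Functions host.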

Running Unit Tests

The sample uses the unittest framework. You can follow the Python testing in Visual Studio Code article to configure VS Code to run the unit tests. Alternatively, you can run them from the command line:

python -m unittest discover ./skill/tests

Note: All tests live under the skill/tests folder, so make sure test discovery points at skill/tests and not only at the skill folder.
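For reference, a test module that unittest discovery would pick up might look like the sketch below. By default, discovery matches files named test*.py. The function mask_digits and the test class are hypothetical and are not part of the sample; they only show the general shape of a test case.

```python
import re
import unittest


def mask_digits(text: str) -> str:
    """Hypothetical processing step: replace every digit with '#'."""
    return re.sub(r"\d", "#", text)


class MaskDigitsTests(unittest.TestCase):
    def test_replaces_each_digit(self):
        self.assertEqual(mask_digits("Invoice 42"), "Invoice ##")

    def test_leaves_text_without_digits_unchanged(self):
        self.assertEqual(mask_digits("no digits"), "no digits")


def run_tests() -> unittest.TestResult:
    """Run the test case programmatically instead of via the command line."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MaskDigitsTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Saved as a test*.py file inside the tests folder, the same test case would also be found by the discover command shown above.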
