Add API to Turkle (turkle, CLOSED)

hltcoe commented on July 22, 2024
Add API to Turkle

from turkle.

Comments (21)

cfortune commented on July 22, 2024

I notice that mTurk uses XML as its data format, but Turkle uses CSV with JSON-formatted data. Will this create a problem for us when it comes to reusing a prebuilt mTurk client? We would need to either replace the CSV format or write a data emulation layer, i.e. two-way XML <-> CSV+JSON translation. Personally I hate XML and prefer working with CSV+JSON due to its simplicity, but it depends on your project goals -- how much do you want to clone mTurk in every aspect?

https://micropyramid.com/blog/how-to-convert-xml-content-into-json-using-xmltodict/
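As a rough illustration of what one direction of such a translation layer involves, here is a minimal sketch using only the Python standard library (the linked post uses the third-party xmltodict package instead). The element names in the sample are made up for illustration and are not taken from the mTurk schema; attributes, namespaces, and mixed content are ignored.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Recursively convert an element into nested dicts and strings.

    Simplifications: attributes and namespaces are ignored, and repeated
    child tags are collected into a list.
    """
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # A second occurrence of a tag turns the entry into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(xml_text):
    """Parse an XML string and return an equivalent JSON string."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_dict(root)}, sort_keys=True)


# Hypothetical sample payload, not an actual mTurk document.
SAMPLE = "<Answer><Field>sentiment</Field><Value>positive</Value></Answer>"
print(xml_to_json(SAMPLE))
```

The reverse direction (JSON back to schema-valid XML) is the harder half, since element order and namespaces matter to an XML validator but are not represented in this dict form.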

cfortune commented on July 22, 2024

Usually in Django (using the Django REST Framework, or DRF), your API resources are mapped to your models through serializers. I believe we would normally map out the existing models like this, but Turkle's application design seems really different from mTurk's:

  • class Task
  • class TaskAssignment
  • class Batch
  • class Project

Example:
GET /tasks # Returns a list of tasks
GET /tasks/<id> # Returns information for a specific task
POST /tasks # Create a new task
PUT /tasks/<id> # Completely modifies a specific task
PATCH /tasks/<id> # Partially updates a specific task
DELETE /tasks/<id> # Remove a specific task

So, we would need to build a bunch of custom serializers using the DRF, as an application layer which would respond to mTurk client requests, in order to access the existing models.
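To make the verb semantics in the example above concrete, here is a framework-free sketch over an in-memory store. It is not DRF code: `TaskStore` and its fields are hypothetical, and the method names merely borrow the action names DRF viewsets use. In DRF the same behavior would come from serializers and viewsets bound to the real models.

```python
class TaskStore:
    """In-memory stand-in for the Task model, keyed by integer id."""

    def __init__(self):
        self._tasks = {}
        self._next_id = 1

    def list(self):                          # GET /tasks
        return [dict(id=i, **t) for i, t in sorted(self._tasks.items())]

    def retrieve(self, task_id):             # GET /tasks/<id>
        return dict(id=task_id, **self._tasks[task_id])

    def create(self, data):                  # POST /tasks
        task_id = self._next_id
        self._next_id += 1
        self._tasks[task_id] = dict(data)
        return task_id

    def update(self, task_id, data):         # PUT /tasks/<id>
        # PUT replaces the whole resource: fields not sent are dropped.
        self._tasks[task_id] = dict(data)

    def partial_update(self, task_id, data): # PATCH /tasks/<id>
        # PATCH changes only the fields that were sent.
        self._tasks[task_id].update(data)

    def destroy(self, task_id):              # DELETE /tasks/<id>
        del self._tasks[task_id]


store = TaskStore()
tid = store.create({"input": "a", "done": False})
store.partial_update(tid, {"done": True})
print(store.retrieve(tid))
```

The difference between `update` and `partial_update` is the one that tends to matter for API clients: a PUT that omits a field removes it, while a PATCH leaves it alone.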

cash commented on July 22, 2024

mTurk uses XML for the question parameter when creating a HIT using their API (https://docs.aws.amazon.com/AWSMechTurk/latest/AWSMturkAPI/ApiReference_CreateHITOperation.html). Have you noticed it anywhere else?

cfortune commented on July 22, 2024

These API actions require sending and/or receiving XML payloads:

  • CreateHIT
  • CreateHITWithHITType
  • CreateQualificationType
  • GetAssignmentsForHIT
  • GetQualificationRequests
  • GetQualificationType
  • UpdateQualificationType

Here is a list of data structures which are in XML. https://docs.aws.amazon.com/AWSMechTurk/latest/AWSMturkAPI/ApiReference_SchemaLocationArticle.html

We would need to decide how much of this to support. For example, there is a pretty elaborate XML templating language (see "Formatted Content: XHTML" in the docs) which we may not need to support, because plain HTML/JavaScript templates work pretty well anyway...

cash commented on July 22, 2024

We're not interested in qualifications. Is the same true for your use cases?

GetAssignmentsForHIT is deprecated so that leaves CreateHIT and CreateHITWithHITType. Right now, we are only using html/javascript templates and are happy with them.

charman commented on July 22, 2024

Turkle currently cares about Group membership, which might be functionally equivalent to the mTurk notion of a Qualification. Only Users who (are part of a particular Group|have a particular Qualification) can work on a particular task.

charman commented on July 22, 2024

But, for our use cases, Group membership is always assigned by an Admin. We don't have a need for a process where Users complete a Qualification Task that is then reviewed to determine if they would be (assigned a Qualification|added to a Group).

I do think we want the API to support the programmatic creation of Tasks that are restricted to specific Groups, so limited API support for "Qualifications" (to the extent that they provide the same functionality as Groups) may be worthwhile.
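The Group-as-Qualification idea can be sketched as a simple set intersection. This is only an illustration: the `User` and `Task` classes below are hypothetical stand-ins rather than Turkle's models, and treating an empty `allowed_groups` as "open to everyone" is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    groups: set = field(default_factory=set)


@dataclass
class Task:
    name: str
    # Assumption: an empty set means the task is not restricted.
    allowed_groups: set = field(default_factory=set)


def can_work_on(user, task):
    """Return True if the user belongs to at least one allowed group."""
    if not task.allowed_groups:
        return True
    return bool(user.groups & task.allowed_groups)


alice = User("alice", {"annotators"})
bob = User("bob")
restricted = Task("label-images", {"annotators", "reviewers"})
print(can_work_on(alice, restricted), can_work_on(bob, restricted))
```

Because membership is assigned by an Admin, nothing here models requesting or granting a Qualification; the API would only need a way to attach group names when a Task is created.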

cash commented on July 22, 2024

I've been assuming that one motivation for implementing the mturk API is using boto as the client. Looks like boto has a schema file per service: https://github.com/boto/botocore/blob/develop/botocore/data/mturk/2017-01-17/service-2.json

The schema file is then used by a validator. This is fine as long as we use a subset of their API. But if we add to it by adding a Group parameter to CreateHIT, it won't validate. I haven't actually tried this yet.
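The concern can be illustrated with a toy validator. This is not botocore's validator, and the shape dict below is a heavily abbreviated, hypothetical stand-in for the CreateHIT input shape in service-2.json; it only shows why a parameter missing from the schema is rejected and why extending the schema makes it pass.

```python
# Abbreviated, illustrative shape; the real CreateHIT input has more members.
CREATE_HIT_SHAPE = {
    "required": ["Title", "Description"],
    "members": {"Title": str, "Description": str, "Question": str},
}

# Hypothetical extension that adds a Turkle-specific Group parameter.
EXTENDED_SHAPE = {
    "required": list(CREATE_HIT_SHAPE["required"]),
    "members": {**CREATE_HIT_SHAPE["members"], "Group": str},
}


def validate(params, shape):
    """Return a list of error strings; an empty list means the call is valid."""
    errors = []
    for name in shape["required"]:
        if name not in params:
            errors.append(f"Missing required parameter: {name}")
    for name, value in params.items():
        if name not in shape["members"]:
            errors.append(f"Unknown parameter: {name}")
        elif not isinstance(value, shape["members"][name]):
            errors.append(f"Invalid type for parameter: {name}")
    return errors


request = {"Title": "Label images", "Description": "demo", "Group": "annotators"}
print(validate(request, CREATE_HIT_SHAPE))
print(validate(request, EXTENDED_SHAPE))
```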

cfortune commented on July 22, 2024

Seen in the clear light of day, the more I read the mTurk specs, the more I see an impedance mismatch with Turkle's. How would these Turkle script operations map to the mTurk API operations?

  • add_user
  • import_users
  • upload_tasks
  • download_results

I'm thinking that maybe we should just let Turkle "be who he is", meaning we write a custom API and client that give us our feature set with minimal fuss. This may be the shortest development path, even with the added burden of writing our own API client software. Thoughts?

cash commented on July 22, 2024

add_user and import_users are both operations that are not supported on mTurk. They have a different user management approach.

upload_tasks is the same as create HITs. download_results is the same as get assignments.
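Written out as a lookup table, the mapping looks roughly like this. The operation names on the right are an assumption about which mTurk operations are meant: `CreateHIT` is named earlier in the thread, while `ListAssignmentsForHIT` is taken here to be the current "get assignments" operation. `None` marks a script with no mTurk counterpart.

```python
# Turkle script -> assumed mTurk API operation (None = no counterpart).
TURKLE_TO_MTURK = {
    "add_user": None,
    "import_users": None,
    "upload_tasks": "CreateHIT",
    "download_results": "ListAssignmentsForHIT",
}


def without_counterpart(mapping):
    """Return the script names that an mTurk-compatible API could not cover."""
    return sorted(name for name, op in mapping.items() if op is None)


print(without_counterpart(TURKLE_TO_MTURK))
```

Anything returned by `without_counterpart` would have to be added as a non-mTurk extension or handled outside the API.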

I'm not quite ready to develop our own API, but am certainly open to that. I'd like to see if I can pass a custom service definition to boto and use it. I have a lot of meetings today, but will try to squeeze that in.

cash commented on July 22, 2024

I've been working on an issue that we discovered related to unicode characters. I hope to get back to this soon...

cash commented on July 22, 2024

I've been able to get the boto client to work with my mock mturk site. I had to pass an endpoint_url to the client. It also required a fake region and a fake AWS access key and secret. The access key and secret are used for authentication, so we would have to implement the same authentication scheme in Turkle.

I added a parameter not in their spec and as expected, the boto validator failed. I haven't seen a way to turn that off. I then grabbed the mturk service definition, modified it, set an environment variable, and it worked. So we would be able to add parameters and methods to their API without much trouble.
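A sketch of that client setup, assuming boto3 is the client library. The endpoint URL, region, and credentials are placeholders. The environment variable is not named above; botocore documents `AWS_DATA_PATH` as an extra search path for service definitions, so that is what this sketch assumes, with the modified definition saved under `<data_path>/mturk/2017-01-17/service-2.json`.

```python
import os


def mock_mturk_client_kwargs(endpoint_url):
    """Build keyword arguments for boto3.client() aimed at a non-AWS endpoint."""
    return {
        "service_name": "mturk",
        "endpoint_url": endpoint_url,
        # Placeholder values: the client wants a region and credentials
        # even though the mock site is not an AWS service.
        "region_name": "us-east-1",
        "aws_access_key_id": "fake-access-key",
        "aws_secret_access_key": "fake-secret-key",
    }


def make_client(endpoint_url, data_path=None):
    """Create the client, optionally pointing botocore at a custom data path."""
    if data_path is not None:
        # Assumption: AWS_DATA_PATH is the variable in question. It has to be
        # set before the first boto3 session is created to take effect.
        os.environ["AWS_DATA_PATH"] = data_path
    import boto3  # third-party; imported lazily so the helper above has no dependency
    return boto3.client(**mock_mturk_client_kwargs(endpoint_url))


print(mock_mturk_client_kwargs("http://localhost:8000/"))
```

`make_client` is not called here because it needs boto3 installed and a server listening at the endpoint.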

Still not sure this is worth it. I'm checking up on the authentication code next - hoping it is some standard like OAuth.

cfortune commented on July 22, 2024

Hi Cash and Craig,

I hope you are both well. I've been studying up on Django APIs and clients. I think we could possibly use code generators to do the heavy lifting for both the server and client development. The generated code would have a consistent, standard design across all models, views, etc. It could eliminate weeks of trial and error, if all goes as promised....

Thoughts?

cash commented on July 22, 2024

Thanks @cfortune - I took a look at drf-generators and it seems to want to blow away the views.py file in the turkle app to do its magic. Maybe it is intended for API-only sites? Using the generator may not be possible for an HTML-first site.

I'm starting to read up on DRF - specifically applications that already have HTML views and want to add an api.

cfortune commented on July 22, 2024

Hi @cash, drf-generators lets you choose which types of serializers to generate, but I think it assumes you will generate your files at the start of an empty project, so, ya, it would blow away existing files. Maybe the way to use it is to let it overwrite the files and then merge them with the existing Turkle functions. Git merge tools should allow for that. It would be great to contact the authors of the drf-generators project to get their input on modifying Turkle.

cash commented on July 22, 2024

drf-generators created an API for CRUD operations on projects, batches, tasks, and task assignments. I believe we would only want to keep the methods for projects and batches. I'm not expecting the workers to use the API, so having methods for working with tasks and task assignments doesn't make as much sense to me.

Maybe it's possible to create another app called api that imports the models for projects and batches from turkle and builds the API on top of them.

cfortune commented on July 22, 2024

> Maybe it's possible to create another app called api that imports the model from turkle for projects and batches and create the API using that.

I think that is probably the right approach.

I assume we could also import the user and group models from Django admin for use by DRF auth, and limit actions via a DRF group or by adding DRF permissions (read only, read/write, no access).

cfortune commented on July 22, 2024

brobin, the author of drf-generators, wrote this:

> If the files already exist (urls, views, etc.) they would overwrite existing code. It will warn you before overwriting. You could always run it and then merge back your existing stuff.

cash commented on July 22, 2024

After looking through the generated code, I'm less interested in this. It was really simple code that doesn't save that much time over doing it yourself with DRF.

I'd like to get a list of design requirements for the API. On our side we want support for:

  • Managing user accounts and groups
  • Creating projects and batches
  • Monitoring progress
  • Downloading results

The above list does not include

  • Assigning a task to a specific user
  • CRUD operations on tasks (right now this is done at the batch level)
  • Completing assignments

@charman Do you have any comments on the above list?

@cfortune What are your highest priority items?

cfortune commented on July 22, 2024

That's too bad about drf-generators. I thought it would do more introspection of the models and generate more code. It still may be worthwhile to use it to create a nicely scoped scaffold initially.

The highest priority item for us is batch/task management. We can create projects and batches and do user management manually via CRUD, because they won't change much over time. Our AI system will be hitting the API day and night, though. I would be interested in the ability to do REST operations on individual tasks rather than on a batch of tasks as a whole. For example:

  • put one or more tasks to an existing batch
  • get and delete (archive) all completed tasks (in one step).

Can we simply reuse existing batches, or does the program assume that each new collection of tasks will need a new batch?
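The two operations listed above can be sketched against an in-memory batch. The `Batch` class and its method names are hypothetical and say nothing about how Turkle stores tasks; the point is only that "get and delete completed" happens as a single step, so a polling client does not see the same result twice.

```python
class Batch:
    """Hypothetical long-lived batch that accepts new tasks over time."""

    def __init__(self, name):
        self.name = name
        self.tasks = {}       # task_id -> {"input": ..., "completed": bool}
        self._next_id = 1

    def add_tasks(self, inputs):
        """Append tasks to the existing batch and return their ids."""
        ids = []
        for item in inputs:
            self.tasks[self._next_id] = {"input": item, "completed": False}
            ids.append(self._next_id)
            self._next_id += 1
        return ids

    def complete(self, task_id):
        """Mark a task as finished (stands in for a worker submitting it)."""
        self.tasks[task_id]["completed"] = True

    def pop_completed(self):
        """Return all completed tasks and remove them from the batch."""
        done = {tid: t for tid, t in self.tasks.items() if t["completed"]}
        for tid in done:
            del self.tasks[tid]
        return done


batch = Batch("sentiment")
first, second = batch.add_tasks(["text one", "text two"])
batch.complete(first)
print(batch.pop_completed())
print(sorted(batch.tasks))
```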

cash commented on July 22, 2024

Using the HTML interface, you are restricted to one-time batch creation. It doesn't support adding new tasks to a batch - at least not currently.

The mTurk API is set up to work like you describe. It doesn't have the concept of batches. I'm looking through the code to see what assumptions we made on this.
