Define JSON schema about coral-ask-election-2016 HOT 21 CLOSED

kadamwhite commented on May 28, 2024

Define JSON schema

from coral-ask-election-2016.

Comments (21)

jde commented on May 28, 2024

Here's a draft! Notes:

It's all in one big dump. We can talk about how to slice it up.
The first level (starting with "all") are the "Groups". There is a special group called "all"; each other group represents an answer to the question.
In each group there are three things: a count, an array of text answers for each question marked "include in group", and counts for each multiple choice answer.
The keys are pretty ugly and can be ignored on the front end. They are required for us to create the aggregations.

https://gist.github.com/jde/1874baf1521e7d2317b7c09d66e01ce8

jde commented on May 28, 2024

Notes from Standup on data structure:

@kadamwhite - we need to find a way to reliably key to the question/answer we want.

David and kadam to solve issue.

Edit: https://gist.github.com/jde/75fbf858d543ebde17fe0a073f5afbf2 proposing "groups" key alongside aggregations

kadamwhite commented on May 28, 2024

It could be useful to have a key on the top-level JSON file be a definition of the questions and available responses, independently of their counts and relations:

questions: {
  emoji: {
    type: 'mc',
    question_id: '86d88d9e4e39979f13b822d1643a95f4',
    options: [
      '🇺🇸',
      '😒',
      '😳',
      '😸',
      '😃',
      '😪'
    ]
  },
  sentiment: {
    type: 'mc',
    question_id: '9e5a1ebfc5b29bbccdbfdde4c0dc621e',
    options: [
      'Economy',
      'Justice',
      'Pestilence',
      'Famine',
      'War'
    ]
  },
  shortanswer: {
    type: 'text',
    question_id: 'd1bccc85fc14440d4201e4fa2a7a88a2'
  }
}

kadamwhite commented on May 28, 2024

@jde notes that in this structure the options: [] arrays should provide a key for each option:

{
  value: '😳',
  id: '214120abe871590'
}

which we can use to identify the output JSON files on S3 that should be loaded for specific results for that response. This should be provided for all multiple choice questions.

@jde Do the forms have a sense of strict field ordering, and could we put an order: n key on each question, too, so that we have context for how to order the keys when requesting response pages for multi-dimensional data?
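
A minimal sketch of how the front end might consume such a definition, assuming each option becomes a `{ value, id }` object and each question carries the proposed `order: n` key. The helper names `orderedQuestions` and `optionIdFor` are hypothetical, and every option id except the one quoted above is made up for illustration.

```javascript
// Hypothetical questions definition following the shape discussed above.
// Only the id '214120abe871590' comes from the thread; the others are invented.
const questions = {
  sentiment: {
    type: 'mc',
    order: 2,
    question_id: '9e5a1ebfc5b29bbccdbfdde4c0dc621e',
    options: [
      { value: 'Economy', id: 'opt-economy' },
      { value: 'Justice', id: 'opt-justice' }
    ]
  },
  emoji: {
    type: 'mc',
    order: 1,
    question_id: '86d88d9e4e39979f13b822d1643a95f4',
    options: [
      { value: '😳', id: '214120abe871590' },
      { value: '😪', id: 'opt-sleepy' }
    ]
  }
};

// Return the question objects sorted by their `order` key, so the caller
// knows which answer id comes first when building a multi-dimensional request.
function orderedQuestions(defs) {
  return Object.values(defs).sort((a, b) => a.order - b.order);
}

// Find the id of the option with the given display value, or null if absent.
function optionIdFor(question, value) {
  const match = question.options.find((option) => option.value === value);
  return match ? match.id : null;
}
```

With this shape, `optionIdFor(questions.emoji, '😳')` yields the id that would identify the matching output file on S3.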

kadamwhite commented on May 28, 2024

Ideally, the root JSON file would expose three things:

The list of questions, as hypothesized above
Aggregations for the grouping questions vs the other multiple choice questions
A list of the latest n questions

A full file is mocked up here, in JS format so I could leave comments. I still have open questions about how we can be sure to map things properly, but I hypothesize a groupby: [] top-level array that would contain the ID of any question marked for grouping. If we specify that the grouping question has to be the emoji, I think we can have some flexibility with how the rest of it falls out. In this context, keys like "emoji:" would probably change to the ID of their respective question, or else that keyed object just becomes an array of question objects, given that order is relevant and that each question already contains its question ID.

The aggregation I would imagine to be a subset of the existing aggregations, where the grouping IDs are keys for objects containing an mc property keyed by multiple choice question ID.

  aggregations: {
    // ID of grouping question
    "86d88d9e4e39979f13b822d1643a95f4": {
      // answer_id #1
      "e2ae2d7ea54e4fc189b463a5b60af2c3": {
        count: 9,
        mc: {
          // ID of multiple choice question (second data dimension)
          "9e5a1ebfc5b29bbccdbfdde4c0dc621e": [
            {
              id: "fdcf0e4d3d4f49b8807617f74262d70d",
              count: 1
            },
            {
              id: "0acdb2d80a174eddbf4c3157f7574dc0",
              count: 4
            },
            {
              id: "8a877a4816564ceea5af6cf3d8df8196",
              count: 2
            },
            {
              id: "a027d0d2c8a54663b62b4e1de0f333cd",
              count: 0
            },
            {
              id: "d904fa4351de41de868254d23b0acc9b",
              count: 2
            }
          ]
        }
      },
      // .. next q, and so on
    }
  }

I couldn't think of a way to make this more concise without using the value of the answer itself as the key, and given that answer_ids are unique and we control them, it felt like a better key to use.
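
As a rough sketch of how this nesting could be read on the front end, the lookup below walks from grouping question to grouping answer to second-dimension question. The function name `countFor` is hypothetical, and the sample data is a shortened copy of the mock-up above.

```javascript
// Sample data copied from the mock-up above (two of the five answers shown).
const aggregations = {
  '86d88d9e4e39979f13b822d1643a95f4': {
    'e2ae2d7ea54e4fc189b463a5b60af2c3': {
      count: 9,
      mc: {
        '9e5a1ebfc5b29bbccdbfdde4c0dc621e': [
          { id: 'fdcf0e4d3d4f49b8807617f74262d70d', count: 1 },
          { id: '0acdb2d80a174eddbf4c3157f7574dc0', count: 4 }
        ]
      }
    }
  }
};

// Look up how many people gave `groupAnswerId` to the grouping question and
// also gave `mcAnswerId` to the multiple choice question `mcQuestionId`.
// Returns 0 when any level of the hierarchy is missing.
function countFor(aggs, groupQuestionId, groupAnswerId, mcQuestionId, mcAnswerId) {
  const group = aggs[groupQuestionId] && aggs[groupQuestionId][groupAnswerId];
  if (!group || !group.mc || !group.mc[mcQuestionId]) return 0;
  const entry = group.mc[mcQuestionId].find((answer) => answer.id === mcAnswerId);
  return entry ? entry.count : 0;
}
```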

Finally, having a straight array of the most recent approved questions always baked in to the root JSON will allow us to render something right off the bat, before the user interacts with the app:

  latest: [
    {
      "86d88d9e4e39979f13b822d1643a95f4": "😪",
      "9e5a1ebfc5b29bbccdbfdde4c0dc621e": "Justice",
      "d1bccc85fc14440d4201e4fa2a7a88a2": "Let's get our criminal justice system in line with any coherent conception of justice.",
      "e36f865a311bb2bcec6a270ab3083028": "New York, NY"
    },
    {
      "86d88d9e4e39979f13b822d1643a95f4": "😳",
      "9e5a1ebfc5b29bbccdbfdde4c0dc621e": "Justice",
      "d1bccc85fc14440d4201e4fa2a7a88a2": "Congrats on your victory, Mrs. President! Here's to ours!",
      "e36f865a311bb2bcec6a270ab3083028": "Wrigley Field, Chicago"
    },
    // ... and so on
  ]

kadamwhite commented on May 28, 2024

If we assume a fixed "responses per page" size for each page of emoji-only (one-dimension slice) or emoji-and-focus (two-dimension slice) answer lists, we can avoid any additional paging parameters and instead calculate the total pages available from the counts on the member objects of the nested MC question answer arrays.
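
The calculation could look something like the sketch below. The name `totalPages` and the page size of 20 are assumptions for illustration; the count would come from the aggregation entry for the slice being viewed.

```javascript
// Hypothetical fixed page size; treat this as a placeholder value.
const RESPONSES_PER_PAGE = 20;

// Derive the number of response pages from an aggregation count.
// A count of 0 means there are no pages to request.
function totalPages(count, perPage = RESPONSES_PER_PAGE) {
  return Math.ceil(count / perPage);
}
```

For example, a count of 38 at 20 responses per page gives 2 pages, and a count of exactly 20 gives 1.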

iros commented on May 28, 2024

@kadamwhite that makes a lot of sense to me. One thing the sample file doesn't have is short answers related to the mc questions, but I assume for our purposes that will just be another text question, right?

kadamwhite commented on May 28, 2024

@iros Can you clarify what you mean?

For the user submission,

Multiple Choice 1, "😪"
Multiple Choice 2, "Justice"
Short answer, "Let's get our criminal justice system in line with any coherent conception of justice."
Second text field, "New York, NY"

the only values I understand we need to aggregate are the multiple choice question ones: how many people answered "😪" and, of them, how many chose "Justice". If I know the answer to that is 38, and we have settled on 20 responses per page of output JSON for the collections results, I can get the second page of questions for 😪/Justice (answer_ids AAAAA and BBBBB for simplicity) at,

responses-AAAAA-BBBBB-0002.json or similar.

@jde & I were discussing using

summary.json,
responses-[Emoji Answer ID]-nnn.json, and
responses-[Emoji Answer ID]-[MC Answer ID]-nnn.json

to split up the data in a "queryable" fashion, but I didn't fully articulate that file naming idea above. Does that answer the question about how short-answer responses correlating to a specific combination of values could be retrieved?
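
A small sketch of that naming idea, assuming the answer ids are joined with hyphens in question order and the page number is zero-padded to four digits as in the "0002" example above (the "nnn" placeholder leaves the exact width open). The helper name `responsesFileName` is hypothetical.

```javascript
// Build the file name for one page of responses.
// `answerIds` holds one id (emoji only) or two ids (emoji plus second MC answer).
// The four-digit zero padding mirrors the "0002" example and is an assumption.
function responsesFileName(answerIds, page) {
  const pageSuffix = String(page).padStart(4, '0');
  return ['responses', ...answerIds, pageSuffix].join('-') + '.json';
}
```

Under these assumptions, `responsesFileName(['AAAAA', 'BBBBB'], 2)` produces `responses-AAAAA-BBBBB-0002.json`.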

iros commented on May 28, 2024

@kadamwhite I thought the intention was for each MC question to have an accompanying short statement, so you'd say "Justice" and "because crime is OUT OF CONTROL" or something like that.
Then there would be a separate text question to the president, which is standalone.
So, a total of 7 questions, right?

kadamwhite commented on May 28, 2024

@iros That wasn't present in the reduced question set that @losowsky shared. We should discuss in Slack what the final question list is/should be.

kadamwhite commented on May 28, 2024

From Slack:

@jde latest draft, w/ aggregations broken down by a single group: https://gist.github.com/jde/aea38e7760f306b9efcdada8a02d6418

My initial reaction: a separate aggregation for each grouping doesn't quite map to what I proposed above. There's no nesting relationship between the grouping and other multiple choice questions. I don't need an aggregation file per answer; I need pages of individual responses per answer combination and one single aggregation file that has a hierarchical relationship between the grouping question and each subsidiary mc question.

@jde, clarifying: "Are the breakdowns of the other multiple choice by emoji not needed? Do you want to show: 'Of all the people who clicked :smile-face: 4 of them chose economy, 12 chose immigration, etc…'"

@kadamwhite: Yes, we need to show that. That hierarchy is represented in the aggregations key within the json I mocked up.

@jde: That was the intent with this. The proffered candidate format can be used to reference that data without needing to do nesting, which would be much more complex for me.

@kadamwhite: If we don't have nesting then we need one entry and count for each pair of emoji and other-mc-question responses, which is a significantly more complex object for me.

  aggregations: {
    [ID of grouping question]: {
      [grouping question answer_id #1]: {
        count: 9,
        mc: {
          [ID of non-grouping MC question #1]: [
            {
              id: "[Non-grouping MC question answer_id #1]",
              count: 1
            },
            {
              id: "[non-grouping MC question answer_id #2]",
              count: 4
            },
            // other answers
          ]
        }
      }
    }
  }

would be best for me

@jde: Sounds good. And I like this format. What I have is just leaving off the “aggregations” obj and ID of Grouping Question levels.

kadamwhite commented on May 28, 2024

Re: the structure of the individual multiple choice rollups nested within each grouping question's answer, if we have the questions dictionary then this format:

        "63ea32f4ca977c1a7ab037224d9d2561": {
          "answer": "Greatness",
          "count": 1
        },

could really be abbreviated to

        "63ea32f4ca977c1a7ab037224d9d2561": 1,
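
The abbreviation amounts to a simple mapping, sketched below. The function name `abbreviateRollup` is hypothetical; the answer text is dropped on the assumption that it can be recovered from the questions dictionary.

```javascript
// Collapse { answerId: { answer, count } } into { answerId: count }.
function abbreviateRollup(rollup) {
  return Object.fromEntries(
    Object.entries(rollup).map(([answerId, entry]) => [answerId, entry.count])
  );
}

const verbose = {
  '63ea32f4ca977c1a7ab037224d9d2561': { answer: 'Greatness', count: 1 }
};
const compact = abbreviateRollup(verbose);
```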

kadamwhite commented on May 28, 2024

Follow-up Q: what do you think about whether the mc answers need to be embedded in the output JSON? e.g., instead of,

    {
      "86d88d9e4e39979f13b822d1643a95f4": "😳",
      "9e5a1ebfc5b29bbccdbfdde4c0dc621e": "Justice",
      "d1bccc85fc14440d4201e4fa2a7a88a2": "Congrats on your victory, Mrs. President! Here's to ours!",
      "e36f865a311bb2bcec6a270ab3083028": "Wrigley Field, Chicago"
    },

do this:

    {
      "86d88d9e4e39979f13b822d1643a95f4": "[answer_id corresponding to 😳]",
      "9e5a1ebfc5b29bbccdbfdde4c0dc621e": "[answer_id corresponding to Justice]",
      "d1bccc85fc14440d4201e4fa2a7a88a2": "Congrats on your victory, Mrs. President! Here's to ours!",
      "e36f865a311bb2bcec6a270ab3083028": "Wrigley Field, Chicago"
    },
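
If responses carried answer_ids, the front end would need to map them back to display values. A sketch of that step follows, assuming a questions dictionary keyed by question ID whose options are `{ id, value }` objects. The name `resolveResponse` and the sample option id are hypothetical.

```javascript
// Hypothetical questions dictionary keyed by question ID.
const questionsById = {
  '86d88d9e4e39979f13b822d1643a95f4': {
    type: 'mc',
    options: [{ id: '214120abe871590', value: '😳' }]
  },
  'd1bccc85fc14440d4201e4fa2a7a88a2': { type: 'text' }
};

// Replace answer ids with display values for multiple choice questions.
// Text answers, unknown questions, and unknown ids pass through unchanged.
function resolveResponse(response, defs) {
  const resolved = {};
  for (const [questionId, answer] of Object.entries(response)) {
    const question = defs[questionId];
    if (question && question.type === 'mc') {
      const option = question.options.find((opt) => opt.id === answer);
      resolved[questionId] = option ? option.value : answer;
    } else {
      resolved[questionId] = answer;
    }
  }
  return resolved;
}
```

The trade-off is a smaller payload and a single source of truth for answer text, at the cost of one lookup per multiple choice field when rendering.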

jde commented on May 28, 2024

Form digest json:
https://gist.github.com/jde/b31747705a0a4a89b6058b2d0a3fc858

jde commented on May 28, 2024

New candidate is up. This time for the entire initial packet! (I cheated and stitched this together by hand, but the three sets of endpoints are getting pretty solid and flexible. We should be able to construct packets up to one level of grouping with what I have now.)

https://gist.github.com/jde/8acd74874d6c2972f7f55b6aaa1f550d

Initial data load < 12kb! (This assumes a page size of 10 for the submissions.)

jde commented on May 28, 2024

Added code to strip the redundant aggregations of the same question that the group was based on... now coming in at 10.6kb! (same link as last comment.)
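
For illustration only, here is how that kind of stripping might look in JavaScript against the nested aggregations shape discussed earlier in this thread. This is not the askd implementation, and the function name `stripRedundant` and the sample ids are hypothetical. The idea is that inside a group keyed by question X, a breakdown of question X itself carries no new information.

```javascript
// Remove, from every group, the breakdown of the grouping question itself.
// Returns a new object and leaves the input untouched.
function stripRedundant(aggs) {
  const result = {};
  for (const [groupQuestionId, answers] of Object.entries(aggs)) {
    result[groupQuestionId] = {};
    for (const [answerId, group] of Object.entries(answers)) {
      // Drop the entry keyed by the grouping question, keep the rest.
      const { [groupQuestionId]: _redundant, ...otherMc } = group.mc || {};
      result[groupQuestionId][answerId] = { ...group, mc: otherMc };
    }
  }
  return result;
}

// Hypothetical input: group "q-emoji" redundantly breaks down "q-emoji" too.
const withRedundancy = {
  'q-emoji': {
    'a-smile': {
      count: 3,
      mc: {
        'q-emoji': [{ id: 'a-smile', count: 3 }],
        'q-focus': [{ id: 'a-justice', count: 2 }]
      }
    }
  }
};
const stripped = stripRedundant(withRedundancy);
```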

jde commented on May 28, 2024

First draft data packet, generated and published end to end:

coral-bocoup-fileset.zip

All submissions here are reverse chronological and have been bookmarked.
I have not yet implemented pagination.

(x-posted to slack)

jde commented on May 28, 2024

Updates to the current schema to be implemented:

Replace the mc key with MultipleChoice for consistency with type.
Ensure that the current saved state of the form drives the aggregations.
Add an "order" field to the questions in the digest.

jde commented on May 28, 2024

Updates completed and pushed to askd in the coral-bocoup branch. To update:

cd $GOPATH/src/github.com/coralproject/shelf/cmd/askd
git checkout coral-bocoup
git pull origin coral-bocoup

coral-ask-form-aggregations.zip

kadamwhite commented on May 28, 2024

@jde are you still thinking about paginating the JSON feeds for each grouping question?

kadamwhite commented on May 28, 2024

Marking this as resolved
