Orchestrate the execution of GrimoireLab tools to produce a dashboard

License: GNU General Public License v3.0


grimoirelab-sirmordred's Introduction

SirMordred

SirMordred is the tool used to coordinate the execution of the GrimoireLab platform via two main configuration files, setup.cfg and projects.json, which are described in their corresponding sections below.


Setup.cfg

The setup file holds the configuration to arrange all processes underlying GrimoireLab. It is composed of sections that define general settings, such as which phases to activate (e.g., collection, enrichment) and where to store the logs, as well as the location and credentials of SortingHat and of the Elasticsearch instances where the raw and enriched data are stored. It also includes backend sections to set up the parameters used by Perceval to access the software development tools (e.g., GitHub tokens, Gerrit username) and fetch their data.
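
For orientation, the following is a minimal sketch of a setup.cfg, assuming a local Elasticsearch instance and SortingHat database; all values (URLs, credentials, index names) are placeholders, and every option is documented in the sections that follow:

[general]
short_name = Example
update = false
debug = true
logs_dir = logs

[projects]
projects_file = ./projects.json

[es_collection]
url = http://localhost:9200

[es_enrichment]
url = http://localhost:9200

[sortinghat]
host = 127.0.0.1
user = root
password = xxxx
database = sortinghat_db
affiliate = true
autoprofile = [customer, git, github]
matching = [email]
sleep_for = 3600
unaffiliated_group = Unknown

[phases]
collection = true
identities = true
enrichment = true
# set panels to true and add a [panels] section to upload the dashboards
panels = false

[git]
raw_index = git_raw
enriched_index = git_enriched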

Dashboards can be automatically uploaded via the setup.cfg if the panels phase is enabled. The Data Status and Overview dashboards will contain widgets that summarize the information of the data sources declared in the setup.cfg. Note that these widgets are not updated when new data sources are added: you need to manually delete the Data Status and Overview dashboards (under Stack Management > Saved Objects in Kibiter) and restart SirMordred, making sure that the panels phase is still enabled.
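
For instance, assuming a local Kibiter instance (the URL is a placeholder), enabling automatic dashboard upload amounts to something like:

[phases]
panels = true

[panels]
kibiter_url = http://localhost:5601
kibiter_default_index = git
kibiter_time_from = now-90d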

[es_collection]

[es_enrichment]

  • autorefresh (bool: True): Execute the autorefresh of identities
  • autorefresh_interval (int: 2): Time interval (days) to autorefresh identities
  • url (str: http://172.17.0.1:9200): Elasticsearch URL (Required)

[general]

  • bulk_size (int: 1000): Number of items to write in Elasticsearch using bulk operations
  • debug (bool: True): Debug mode (logging mainly) (Required)
  • logs_dir (str: logs): Directory with the logs of sirmordred (Required)
  • min_update_delay (int: 60): Short delay between tasks (collect, enrich ...)
  • scroll_size (int: 100): Number of items to read from Elasticsearch when scrolling
  • short_name (str: Short name): Short name of the project (Required)
  • update (bool: False): Execute the tasks in loop (Required)
  • aliases_file (str: ./aliases.json): JSON file to define aliases for raw and enriched indexes
  • menu_file (str: ./menu.yaml): YAML file to define the menus to be shown in Kibiter
  • global_data_sources (list: bugzilla, bugzillarest, confluence, discourse, gerrit, jenkins, jira): List of data sources collected globally; they are declared in the unknown section of the projects.json
  • retention_time (int: None): Maximum time (in minutes), with respect to the current date, for which data is retained
  • update_hour (int: None): Hour of the day at which the tasks (collect, enrich ...) will run, ignoring min_update_delay
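
To illustrate the less obvious options above, a hypothetical [general] section that runs the tasks once a day at 03:00 and retains roughly 90 days of data (90 days ≈ 129600 minutes) could look like:

[general]
short_name = Example
update = true
debug = false
logs_dir = logs
update_hour = 3
retention_time = 129600
global_data_sources = bugzilla, bugzillarest, confluence, discourse, gerrit, jenkins, jira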

[panels]

  • community (bool: True): Include community section in dashboard
  • kibiter_default_index (str: git): Default index pattern for Kibiter
  • kibiter_time_from (str: now-90d): Default time interval for Kibiter
  • kibiter_url (str): Kibiter URL (Required)
  • kibiter_version (str: None): Kibiter version
  • kafka (bool: False): Include KIP section in dashboard
  • github-comments (bool: False): Enable GitHub comments menu. If enabled, the github2:issue and github2:pull sections must be declared in the setup.cfg and projects.json
  • github-events (bool: False): Enable GitHub events menu. If enabled, the github:event section must be declared in the setup.cfg and projects.json
  • github-repos (bool: False): Enable GitHub repo stats menu. If enabled, the github:repo section must be declared in the setup.cfg and projects.json
  • gitlab-issues (bool: False): Enable GitLab issues menu. If enabled, the gitlab:issue section must be declared in the setup.cfg and projects.json
  • gitlab-merges (bool: False): Enable GitLab merge requests menu. If enabled, the gitlab:merge section must be declared in the setup.cfg and projects.json
  • mattermost (bool: False): Enable Mattermost menu
  • code-license (bool: False): Enable code license menu. If enabled, the colic sections must be declared in the setup.cfg and projects.json
  • code-complexity (bool: False): Enable code complexity menu. If enabled, the cocom sections must be declared in the setup.cfg and projects.json
  • strict (bool: True): Enable strict panels loading
  • contact (str: None): Support repository URL

[phases]

  • collection (bool: True): Activate collection of items (Required)
  • enrichment (bool: True): Activate enrichment of items (Required)
  • identities (bool: True): Do the identities tasks (Required)
  • panels (bool: True): Load panels, create alias and other tasks related (Required)

[projects]

  • projects_file (str: projects.json): Projects file path with repositories to be collected grouped by projects
  • projects_url (str: None): Projects file URL, the projects_file is required to store the file locally

[sortinghat]

  • affiliate (bool: True): Affiliate identities to organizations (Required)
  • autogender (bool: False): Add gender to the profiles (executes autogender)
  • autoprofile (list: ['customer', 'git', 'github']): Order in which to get the identities information for filling the profile (Required)
  • database (str: sortinghat_db): Name of the Sortinghat database (Required)
  • host (str: 127.0.0.1): Host with the Sortinghat database (Required)
  • port (int: None): GraphQL server port
  • path (str: None): GraphQL path
  • ssl (bool: False): Use SSL/TLS for the GraphQL server connection
  • matching (list: ['email']): Algorithm for matching identities in Sortinghat (Required)
  • password (str: ): Password to access the Sortinghat database (Required)
  • reset_on_load (bool: False): Unmerge and remove affiliations for all identities on load
  • sleep_for (int: 3600): Delay between task identities executions (Required)
  • strict_mapping (bool: True): Rigorous check of values in identities matching (e.g., well-formed email addresses, non-overlapping enrollment periods)
  • unaffiliated_group (str: Unknown): Name of the organization for unaffiliated identities (Required)
  • user (str: root): User to access the Sortinghat database (Required)
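
As a sketch, a [sortinghat] section pointing at a SortingHat GraphQL server might look like the following; the host, port, path and credentials are placeholders:

[sortinghat]
host = 127.0.0.1
port = 8000
path = /api/
ssl = false
user = root
password = xxxx
database = sortinghat_db
affiliate = true
autoprofile = [customer, git, github]
matching = [email]
sleep_for = 3600
unaffiliated_group = Unknown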

[backend-name:tag] (tag is optional)

  • collect (bool: True): Enable/disable the collection phase
  • raw_index (str: None): Index name in which to store the raw items (Required)
  • enriched_index (str: None): Index name in which to store the enriched items (Required)
  • studies (list: []): List of studies to be executed
  • anonymize (bool: False): Enable/disable anonymization of personal user information
  • backend-param-1: ..
  • backend-param-2: ..
  • backend-param-n: ..
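
As a concrete, hypothetical illustration of the template above, a tagged git section that skips collection, anonymizes personal information and runs one study could be:

[git:example]
collect = false
raw_index = git_example_raw
enriched_index = git_example_enriched
anonymize = true
studies = [enrich_demography:git]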

The template above shows the generic structure of a backend section. Further information about Perceval backend parameters is available at:

Note that some backend sections allow specifying extra enrichment options, which are listed below.

[jenkins]

  • node_regex: Regular expression for extracting the node name from the builtOn field. The expression must contain at least one group; the first group is used to extract the node name, and any additional groups are ignored. See the example below.
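
For instance (the pattern is hypothetical and depends on how your Jenkins instance names its build nodes), a node_regex that keeps everything before a trailing numeric suffix of the builtOn value could be:

[jenkins]
raw_index = jenkins_raw
enriched_index = jenkins_enriched
node_regex = ^(.+?)-\d+$

With this pattern, a builtOn value such as builder-03 would yield the node name builder.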

[studies-name:tag] (tag is optional)

  • study-param-1: ..
  • study-param-2: ..
  • study-param-n: ..
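
For example, the onion study for git, which also appears later in this document, takes the input and output index names as parameters (the index names are illustrative):

[enrich_onion:git]
in_index = git_enriched
out_index = git-onion_enriched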

A template of a study section is shown above. A complete list of study parameters is available at:

Projects.json

The projects.json file describes the repositories to be analyzed, grouped by project, as they will be shown on the dashboards.

The project file enables users to list the instances of the software development tools to analyse, such as local and remote Git repositories, the URLs of GitHub and GitLab issue trackers, and the names of Slack channels. It also allows users to organize these instances into nested groups, whose structure is reflected in the visualization artifacts (i.e., documents and dashboards). Groups can be useful to represent projects within a single company, sub-projects within a large project such as Linux or Eclipse, or the organizations within a collaborative project. The file has three levels:

  1. First level: project names
  2. Second level: data sources and metadata
  3. Third level: data source URLs

There are some filters, labels, and a special section:

  • Filter --filter-no-collection=true: This filter is used to show, within the dashboards, old enriched data from repositories that no longer exist upstream.
  • Filter --filter-raw and the section unknown: Data sources are only collected in the unknown section, but this filter allows adding the same source to other sections so that its data is enriched per project.
  • Label --labels=[example]: The data source items will carry the label example, which can be used to create visualisations for specific sets of data.
  • Section unknown: If a data source appears only under this section, it is enriched as the project main.
{
    "Chaoss": {
        "gerrit": [
            "gerrit.chaoss.org --filter-raw=data.projects:CHAOSS"
        ],
        "git": [
            "https://github.com/chaoss/grimoirelab-perceval",
            "https://<username>:<api-token>@github.com/chaoss/grimoirelab-sirmordred"
        ],
        "github": [
            "https://github.com/chaoss/grimoirelab-perceval --filter-no-collection=true",
            "https://github.com/chaoss/grimoirelab-sirmordred --labels=[example]"
        ]
    },
    "GrimoireLab": {
        "gerrit": [
            "gerrit.chaoss.org --filter-raw=data.projects:GrimoireLab"
        ],
        "meta": {
            "title": "GrimoireLab",
            "type": "Dev",
            "program": "Bitergia",
            "state": "Operating"
        }
    },
    "unknown": {
        "gerrit": [
            "gerrit.chaoss.org"
        ],
        "confluence": [
            "https://wiki.chaoss.org"
        ]
    }
}

In the projects.json above:

  • The data included in the repo gerrit.chaoss.org will be collected entirely, since the repo is listed in the unknown section. However, only the project GrimoireLab will be enriched, as declared in the GrimoireLab section.
  • In the Chaoss section, for the github data source, the repository grimoirelab-perceval is not collected into the raw index, but its data will be enriched in the enriched index.
  • In the GrimoireLab section, the metadata will be shown in the enriched index as extra fields.
  • In the unknown section, the confluence data source will be enriched as the project main.

Supported data sources

These are the data sources GrimoireLab supports: askbot, bugzilla, bugzillarest, cocom, colic, confluence, crates, discourse, dockerhub, dockerdeps, dockersmells, functest, gerrit, git, github, github2, githubql, gitlab, gitter, google_hits, groupsio, hyperkitty, jenkins, jira, kitsune, mattermost, mbox, mediawiki, meetup, mozillaclub, nntp, pagure, phabricator, pipermail, puppetforge, redmine, remo, rocketchat, rss, slack, stackexchange, supybot, telegram, twitter, weblate

askbot

Questions and answers from Askbot site

  • projects.json
{
    "Chaoss": {
        "askbot": [
            "https://askbot.org/"
        ]
    }
}
  • setup.cfg
[askbot]
raw_index = askbot_raw
enriched_index = askbot_enriched

bugzilla

Bugs from Bugzilla

  • projects.json
{
    "Chaoss": {
        "bugzilla": [
            "https://bugs.eclipse.org/bugs/"
        ]
    }
}
  • setup.cfg
[bugzilla]
raw_index = bugzilla_raw
enriched_index = bugzilla_enriched
backend-user = yyyy (optional)
backend-password = xxxx (optional)
no-archive = true (suggested)

bugzillarest

Bugs from Bugzilla server (>=5.0) using its REST API

  • projects.json
{
    "Chaoss": {
        "bugzillarest": [
            "https://bugzilla.mozilla.org"
        ]
    }
}
  • setup.cfg
[bugzillarest]
raw_index = bugzillarest_raw
enriched_index = bugzillarest_enriched
backend-user = yyyy (optional)
backend-password = xxxx (optional)
no-archive = true (suggested)

cocom

Code complexity backend. Some Graal dependencies, such as cloc, might be required; see https://github.com/chaoss/grimoirelab-graal#how-to-installcreate-the-executables

  • projects.json
{
    "Chaoss":{
        "cocom": [
            "https://github.com/chaoss/grimoirelab-toolkit"
        ]
    }
}
  • setup.cfg
[cocom]
raw_index = cocom_chaoss
enriched_index = cocom_chaoss_enrich
category = code_complexity_lizard_file
studies = [enrich_cocom_analysis]
branches = master
worktree-path = /tmp/cocom/

colic

Code license backend.

  • projects.json
{
    "Chaoss":{
        "colic": [
            "https://github.com/chaoss/grimoirelab-toolkit"
        ]
    }
}
  • setup.cfg
[colic]
raw_index = colic_chaoss
enriched_index = colic_chaoss_enrich
category = code_license_nomos
studies = [enrich_colic_analysis]
exec-path = /usr/share/fossology/nomos/agent/nomossa
branches = master
worktree-path = /tmp/colic

confluence

Contents from Confluence

  • projects.json
{
    "Chaoss": {
        "confluence": [
            "https://wiki.open-o.org/"
        ]
    }
}
  • setup.cfg
[confluence]
raw_index = confluence_raw
enriched_index = confluence_enriched
no-archive = true (suggested)

crates

Packages from Crates.io

  • projects.json
{
    "Chaoss": {
        "crates": [
            ""
        ]
    }
}
  • setup.cfg
[crates]
raw_index = crates_raw
enriched_index = crates_enriched

discourse

Topics from Discourse

  • projects.json
{
    "Chaoss": {
        "discourse": [
            "https://foro.mozilla-hispano.org/"
        ]
    }
}
  • setup.cfg
[discourse]
raw_index = discourse_raw
enriched_index = discourse_enriched
no-archive = true (suggested)

dockerhub

Repositories info from DockerHub

  • projects.json
{
    "Chaoss": {
        "dockerhub": [
            "bitergia kibiter"
        ]
    }
}
  • setup.cfg
[dockerhub]
raw_index = dockerhub_raw
enriched_index = dockerhub_enriched
no-archive = true (suggested)

dockerdeps

Dependencies extracted from Dockerfiles. Requires https://github.com/crossminer/crossJadolint

  • projects.json
{
    "Chaoss": {
        "dockerdeps": [
            "https://github.com/chaoss/grimoirelab"
        ]
    }
}
  • setup.cfg
[dockerdeps]
raw_index = dockerdeps_raw
enriched_index = dockerdeps_enrich
category = code_dependencies_jadolint
exec-path = <jadolint-local-path>/jadolint.jar
in-paths = [Dockerfile, Dockerfile-full, Dockerfile-secured, Dockerfile-factory, Dockerfile-installed]

dockersmells

Smells extracted from Dockerfiles. Requires https://github.com/crossminer/crossJadolint

  • projects.json
{
    "Chaoss": {
        "dockersmells": [
            "https://github.com/chaoss/grimoirelab"
        ]
    }
}
  • setup.cfg
[dockersmells]
raw_index = dockersmells_raw
enriched_index = dockersmells_enrich
category = code_quality_jadolint
exec-path = <jadolint-local-path>/jadolint.jar
in-paths = [Dockerfile, Dockerfile-full, Dockerfile-secured, Dockerfile-factory, Dockerfile-installed]

functest

Tests from functest

  • projects.json
{
    "Chaoss": {
        "functest": [
            "http://testresults.opnfv.org/test/"
        ]
    }
}
  • setup.cfg
[functest]
raw_index = functest_raw
enriched_index = functest_enriched
no-archive = true (suggested)

gerrit

Reviews from Gerrit

You have to add your public SSH key to the Gerrit server.

  • projects.json
{
    "Chaoss": {
        "gerrit": [
            "review.opendev.org"
        ]
    }
}
  • setup.cfg
[gerrit]
raw_index = gerrit_raw
enriched_index = gerrit_enriched
user = xxxx
no-archive = true (suggested)
blacklist-ids = [] (optional)
max-reviews = 500 (optional)
studies = [enrich_demography:gerrit, enrich_onion:gerrit, enrich_demography_contribution:gerrit] (optional)

[enrich_demography:gerrit] (optional)

[enrich_onion:gerrit] (optional)
in_index = gerrit_enriched
out_index = gerrit-onion_enriched

[enrich_demography_contribution:gerrit] (optional)
date_field = grimoire_creation_date
author_field = author_uuid

git

Commits from Git

Note: If you want to analyze private git repositories, make sure you pass the credentials directly in the URL.

  • projects.json
{
    "Chaoss": {
        "git": [
            "https:/github.com/chaoss/grimoirelab-perceval",
            "https://<username>:<api-token>@github.com/chaoss/grimoirelab-sirmordred"
        ]
    }
}
  • setup.cfg
[git]
raw_index = git_raw
enriched_index = git_enriched
latest-items = true (suggested)
studies = [enrich_demography:git, enrich_git_branches:git, enrich_areas_of_code:git, enrich_onion:git, enrich_extra_data:git] (optional)

[enrich_demography:git] (optional)

[enrich_git_branches:git] (optional)
run_month_days = [1, 23] (optional)

[enrich_areas_of_code:git] (optional)
in_index = git_raw
out_index = git-aoc_enriched

[enrich_onion:git] (optional)
in_index = git_enriched
out_index = git-onion_enriched

[enrich_extra_data:git]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt

[enrich_forecast_activity]
out_index = git_study_forecast

github

Issues and PRs from GitHub

issue
  • projects.json
{
    "Chaoss": {
        "github:issue": [
            "https:/github.com/chaoss/grimoirelab-perceval",
            "https:/github.com/chaoss/grimoirelab-sirmordred"
        ]
    }
}
  • setup.cfg
[github:issue]
raw_index = github_raw
enriched_index = github_enriched
api-token = xxxx
category = issue
sleep-for-rate = true
no-archive = true (suggested)
studies = [enrich_onion:github,
           enrich_geolocation:user,
           enrich_geolocation:assignee,
           enrich_extra_data:github,
           enrich_backlog_analysis,
           enrich_demography:github] (optional)

[enrich_onion:github] (optional)
in_index_iss = github_issues_onion-src
in_index_prs = github_prs_onion-src
out_index_iss = github-issues-onion_enriched
out_index_prs = github-prs-onion_enriched

[enrich_geolocation:user] (optional)
location_field = user_location
geolocation_field = user_geolocation

[enrich_geolocation:assignee] (optional)
location_field = assignee_location
geolocation_field = assignee_geolocation

[enrich_extra_data:github]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt

[enrich_backlog_analysis]
out_index = github_enrich_backlog
interval_days = 7
reduced_labels = [bug,enhancement]
map_label = [others, bugs, enhancements]

[enrich_demography:github]
pull request
  • projects.json
{
    "Chaoss": {
        "github:pull": [
            "https:/github.com/chaoss/grimoirelab-perceval",
            "https:/github.com/chaoss/grimoirelab-sirmordred"
        ]
    }
}
  • setup.cfg
[github:pull]
raw_index = github-pull_raw
enriched_index = github-pull_enriched
api-token = xxxx
category = pull_request
sleep-for-rate = true
no-archive = true (suggested)
studies = [enrich_geolocation:user,
           enrich_geolocation:assignee,
           enrich_extra_data:github,
           enrich_demography:github] (optional)

[enrich_geolocation:user]
location_field = user_location
geolocation_field = user_geolocation

[enrich_geolocation:assignee]
location_field = assignee_location
geolocation_field = assignee_geolocation

[enrich_extra_data:github]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt

[enrich_demography:github]
repo

The number of forks, stars, and subscribers in the repository.

  • projects.json
{
    "Chaoss": {
        "github:repo": [
            "https:/github.com/chaoss/grimoirelab-perceval",
            "https:/github.com/chaoss/grimoirelab-sirmordred"
        ]
    }
}
  • setup.cfg
[github:repo]
raw_index = github-repo_raw
enriched_index = github-repo_enriched
api-token = xxxx
category = repository
sleep-for-rate = true
no-archive = true (suggested)
studies = [enrich_extra_data:github, enrich_demography:github]

[enrich_extra_data:github]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt

[enrich_demography:github]

githubql

Events from GitHub

The corresponding dashboards can be automatically uploaded by setting github-events to true in the panels section within the setup.cfg.

  • projects.json
{
    "Chaoss": {
        "githubql": [
            "https://github.com/chaoss/grimoirelab-toolkit"
        ]
    }
}
  • setup.cfg
[panels]
github-events = true

[githubql]
raw_index = github_event_raw
enriched_index = github_event_enriched
api-token = xxxxx
sleep-for-rate = true
sleep-time = "300" (optional)
no-archive = true (suggested)
studies = [enrich_duration_analysis:kanban, enrich_reference_analysis] (optional)

[enrich_duration_analysis:kanban]
start_event_type = MovedColumnsInProjectEvent
fltr_attr = board_name
target_attr = board_column
fltr_event_types = [MovedColumnsInProjectEvent, AddedToProjectEvent]

[enrich_duration_analysis:label]
start_event_type = UnlabeledEvent
target_attr = label
fltr_attr = label
fltr_event_types = [LabeledEvent]

[enrich_reference_analysis] (optional)

github2

Comments from GitHub

The corresponding dashboards can be automatically uploaded by setting github-comments to true in the panels section within the setup.cfg.

issue
  • projects.json
{
    "Chaoss": {
        "github2:issue": [
            "https:/github.com/chaoss/grimoirelab-perceval",
            "https:/github.com/chaoss/grimoirelab-sirmordred"
        ]
    }
}
  • setup.cfg
[github2:issue]
api-token = xxxx
raw_index = github2-issues_raw
enriched_index = github2-issues_enriched
sleep-for-rate = true
category = issue
no-archive = true (suggested)
studies = [enrich_geolocation:user, enrich_geolocation:assignee, enrich_extra_data:github2, enrich_feelings] (optional)

[enrich_geolocation:user] (optional)
location_field = user_location
geolocation_field = user_geolocation

[enrich_geolocation:assignee] (optional)
location_field = assignee_location
geolocation_field = assignee_geolocation

[enrich_extra_data:github2]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt

[enrich_feelings]
attributes = [title, body]
nlp_rest_url = http://localhost:2901
pull request
  • projects.json
{
    "Chaoss": {
        "github2:pull": [
            "https:/github.com/chaoss/grimoirelab-perceval",
            "https:/github.com/chaoss/grimoirelab-sirmordred"
        ]
    }
}
  • setup.cfg
[github2:pull]
api-token = xxxx
raw_index = github2-pull_raw
enriched_index = github2-pull_enriched
sleep-for-rate = true
category = pull_request
no-archive = true (suggested)
studies = [enrich_geolocation:user, enrich_geolocation:assignee, enrich_extra_data:git, enrich_feelings] (optional)

[enrich_geolocation:user] (optional)
location_field = user_location
geolocation_field = user_geolocation

[enrich_geolocation:assignee] (optional)
location_field = assignee_location
geolocation_field = assignee_geolocation

[enrich_extra_data:github2]
json_url = https://gist.githubusercontent.com/zhquan/bb48654bed8a835ab2ba9a149230b11a/raw/5eef38de508e0a99fa9772db8aef114042e82e47/bitergia-example.txt

[enrich_feelings]
attributes = [title, body]
nlp_rest_url = http://localhost:2901

gitlab

Issues and MRs from GitLab

GitLab issues and merge requests need to be configured in two different sections. The corresponding dashboards can be automatically uploaded by setting gitlab-issues and gitlab-merges to true in the panels section within the setup.cfg.

If a given GitLab repository is nested more than one level deep, all the slashes / starting from the second level have to be replaced by %2F. For instance, a repository such as https://gitlab.com/Molly/lab/first must be written as https://gitlab.com/Molly/lab%2Ffirst.

issue
  • projects.json
{
    "Chaoss": {
        "gitlab:issue": [
            "https://gitlab.com/Molly/first",
            "https://gitlab.com/Molly/lab%2Fsecond"
        ]
    }
}
  • setup.cfg
[panels]
gitlab-issues = true

[gitlab:issue]
category = issue
raw_index = gitlab-issues_raw
enriched_index = gitlab-issues_enriched
api-token = xxxx
sleep-for-rate = true
no-archive = true (suggested)
studies = [enrich_onion:gitlab-issue] (optional)

[enrich_onion:gitlab-issue] (optional)
in_index = gitlab-issues_enriched
out_index = gitlab-issues-onion_enriched
data_source = gitlab-issues
merge request
  • projects.json
{
    "Chaoss": {
        "gitlab:merge": [
            "https://gitlab.com/Molly/first",
            "https://gitlab.com/Molly/lab%2Fsecond"
        ]
    }
}
  • setup.cfg
[panels]
gitlab-merges = true

[gitlab:merge]
category = merge_request
raw_index = gitlab-mrs_raw
enriched_index = gitlab-mrs_enriched
api-token = xxxx
sleep-for-rate = true
no-archive = true (suggested)
studies = [enrich_onion:gitlab-merge] (optional)

[enrich_onion:gitlab-merge] (optional)
in_index = gitlab-mrs_enriched
out_index = gitlab-mrs-onion_enriched
data_source = gitlab-merges

gitter

Messages from Gitter rooms

You have to join the rooms you want to mine.

  • projects.json
{
    "Chaoss": {
        "gitter": [
            "https://gitter.im/jenkinsci/jenkins",
        ]
    }
}
  • setup.cfg
[gitter]
raw_index = gitter_raw
enriched_index = gitter_enriched
api-token = xxxxx
sleep-for-rate = true
sleep-time = "300" (optional)
no-archive = true (suggested)

google_hits

Number of hits for a set of keywords from Google

  • projects.json
{
    "Chaoss": {
        "google_hits": [
            "bitergia grimoirelab"
        ]
    }
}
  • setup.cfg
[google_hits]
raw_index = google_hits_raw
enriched_index = google_hits_enriched

groupsio

Messages from Groupsio

To know the lists you are subscribed to: https://gist.github.com/valeriocos/ad33a0b9b2d13a8336230c8c59df3c55

  • projects.json
{
    "Chaoss": {
        "groupsio": [
            "group1",
            "group2"
        ]
    }
}
  • setup.cfg
[groupsio]
raw_index = groupsio_raw
enriched_index = groupsio_enriched
email = yyyy
password = xxxx

hyperkitty

Messages from a HyperKitty archive

  • projects.json
{
    "Chaoss": {
        "hyperkitty": [
            "https://lists.mailman3.org/archives/list/[email protected]"
        ]
    }
}
  • setup.cfg
[hyperkitty]
raw_index = hyperkitty_raw
enriched_index = hyperkitty_enriched

jenkins

Builds from a Jenkins server

  • projects.json
{
    "Chaoss": {
        "jenkins": [
            "https://build.opnfv.org/ci"
        ]
    }
}
  • setup.cfg
[jenkins]
raw_index = jenkins_raw
enriched_index = jenkins_enriched
no-archive = true (suggested)

jira

Issues data from JIRA issue trackers

  • projects.json
{
    "Chaoss":{
        "jira": [
            "https://jira.opnfv.org"
        ]
    }
}
  • setup.cfg
[jira]
raw_index = jira_raw
enriched_index = jira_enriched
no-archive = true (suggested)
backend-user = yyyy (optional)
backend-password = xxxx (optional)

kitsune

Questions and answers from Kitsune

  • projects.json
{
    "Chaoss": {
        "kitsune": [
            ""
        ]
    }
}
  • setup.cfg
[kitsune]
raw_index = kitsune_raw
enriched_index = kitsune_enriched

mattermost

Messages from Mattermost channels

  • projects.json
{
    "Chaoss": {
        "mattermost": [
            "https://chat.openshift.io 8j366ft5affy3p36987pcugaoa"
        ]
    }
}
  • setup.cfg
[mattermost]
raw_index = mattermost_raw
enriched_index = mattermost_enriched
api-token = xxxx
studies = [enrich_demography:mattermost] (optional)

[enrich_demography:mattermost] (optional)

mbox

Messages from MBox files

For mbox files, you need the name of the mailing list and the path where the mbox files can be found. In the example below, the name of the mailing list is set to "mirageos-devel".

  • projects.json
{
    "Chaoss": {
        "mbox": [
            "mirageos-devel /home/bitergia/mbox/mirageos-devel/"
        ]
    }
}
  • setup.cfg
[mbox]
raw_index = mbox_raw
enriched_index = mbox_enriched

mediawiki

Pages and revisions from MediaWiki

  • projects.json
{
    "Chaoss": {
        "mediawiki": [
            "https://www.mediawiki.org/w https://www.mediawiki.org/wiki"
        ]
    }
}
  • setup.cfg
[mediawiki]
raw_index = mediawiki_raw
enriched_index = mediawiki_enriched
no-archive = true (suggested)

meetup

Events from Meetup groups

For Meetup groups, you only need the identifier of the group and an API token: https://chaoss.github.io/grimoirelab-tutorial/gelk/meetup.html#gathering-meetup-groups-data

  • projects.json
{
    "Chaoss": {
        "meetup": [
        "Alicante-Bitergia-Users-Group",
        "South-East-Bitergia-User-Group"
        ]
    }
}
  • setup.cfg
[meetup]
raw_index = meetup_raw
enriched_index = meetup_enriched
api-token = xxxx
sleep-for-rate = true
sleep-time = "300" (optional)
no-archive = true (suggested)

mozillaclub

Events from Mozillaclub

  • projects.json
{
    "Chaoss": {
        "mozillaclub": [
            "https://spreadsheets.google.com/feeds/cells/1QHl2bjBhMslyFzR5XXPzMLdzzx7oeSKTbgR5PM8qp64/ohaibtm/public/values?alt=json"
        ]
    }
}
  • setup.cfg
[mozillaclub]
raw_index = mozillaclub_raw
enriched_index = mozillaclub_enriched

nntp

Articles from NNTP newsgroups

To set up NNTP, add the server and the newsgroups to be monitored. In the example below, news.myproject.org is the server name.

  • projects.json
{
    "Chaoss": {
        "nntp": [
            "news.myproject.org mozilla.dev.tech.crypto.checkins",
            "news.myproject.org mozilla.dev.tech.electrolysis",
            "news.myproject.org mozilla.dev.tech.gfx",
            "news.myproject.org mozilla.dev.tech.java"
        ]
    }
}
  • setup.cfg
[nntp]
raw_index = nntp_raw
enriched_index = nntp_enriched

pagure

Issues from Pagure repositories

  • projects.json
{
    "Chaoss": {
        "pagure": [
            "https://pagure.io/Test-group/Project-example-namespace"
        ]
    }
}
  • setup.cfg
[pagure]
raw_index = pagure_raw
enriched_index = pagure_enriched
api-token = xxxx
sleep-for-rate = true
sleep-time = "300" (optional)
no-archive = true (suggested)

phabricator

Tasks from Phabricator

  • projects.json
{
    "Chaoss": {
        "phabricator": [
            "https://phabricator.wikimedia.org"
        ]
    }
}
  • setup.cfg
[phabricator]
raw_index = phabricator_raw
enriched_index = phabricator_enriched
api-token = xxxx
no-archive = true (suggested)

pipermail

Messages from Pipermail

  • projects.json
{
    "Chaoss": {
        "pipermail": [
            "https://lists.linuxfoundation.org/pipermail/grimoirelab-discussions/"
        ]
    }
}
  • setup.cfg
[pipermail]
raw_index = pipermail_raw
enriched_index = pipermail_enriched

puppetforge

Modules and their releases from Puppet Forge

  • projects.json
{
    "Chaoss": {
        "puppetforge": [
            ""
        ]
    }
}
  • setup.cfg
[puppetforge]
raw_index = puppetforge_raw
enriched_index = puppetforge_enriched

redmine

Issues from Redmine

  • projects.json
{
    "Chaoss": {
        "redmine": [
            "http://tracker.ceph.com/"
        ]
    }
}
  • setup.cfg
[redmine]
raw_index = redmine_raw
enriched_index = redmine_enriched
api-token = XXXXX

remo

Events, people and activities from ReMo

  • projects.json
{
    "Chaoss": {
        "remo": [
            "https://reps.mozilla.org"
        ]
    }
}
  • setup.cfg
[remo]
raw_index = remo_raw
enriched_index = remo_enriched

rocketchat

Messages from Rocketchat channels

  • projects.json
{
    "Chaoss": {
        "rocketchat": [
            "https://open.rocket.chat general"
        ]
    }
}
  • setup.cfg
[rocketchat]
raw_index = rocketchat_raw
enriched_index = rocketchat_enriched
api-token = xxxx
sleep-for-rate = true
user-id = xxxx
no-archive = true (suggested)

rss

Entries from RSS feeds

  • projects.json
{
    "Chaoss": {
        "rss": [
            "https://reps.mozilla.org"
        ]
    }
}
  • setup.cfg
[rss]
raw_index = rss_raw
enriched_index = rss_enriched

slack

Messages from Slack channels

The only information needed to monitor Slack channels is the channel ID.

  • projects.json
{
    "Chaoss": {
        "slack": [
            "A195YQBLL",
            "A495YQBM2"
        ]
    }
}
  • setup.cfg
[slack]
raw_index = slack_raw
enriched_index = slack_enriched
api-token = xxxx
no-archive = true (suggested)

stackexchange

Questions, answers and comments from StackExchange

  • projects.json
{
    "Chaoss": {
        "stackexchange": [
            "http://stackoverflow.com/questions/tagged/chef",
            "http://stackoverflow.com/questions/tagged/chefcookbook",
            "http://stackoverflow.com/questions/tagged/ohai",
            "http://stackoverflow.com/questions/tagged/test-kitchen",
            "http://stackoverflow.com/questions/tagged/knife"
        ]
    }
}
  • setup.cfg
[stackexchange]
raw_index = stackexchange_raw
enriched_index = stackexchange_enriched
api-token = xxxx
no-archive = true (suggested)

supybot

Messages from Supybot log files

For Supybot log files, you need the name of the IRC channel and the path where the logs can be found. In the example below, the name of the channel is set to "irc://irc.freenode.net/atomic".

  • projects.json
{
    "Chaoss": {
        "supybot": [
            "irc://irc.freenode.net/atomic /home/bitergia/irc/percevalbot/logs/ChannelLogger/freenode/#atomic"
        ]
    }
}
  • setup.cfg
[supybot]
raw_index = supybot_raw
enriched_index = supybot_enriched

telegram

Messages from Telegram

You need to have an API token: https://github.com/chaoss/grimoirelab-perceval#telegram

  • projects.json
{
    "Chaoss": {
        "telegram": [
            "Mozilla_analytics"
        ]
    }
}
  • setup.cfg
[telegram]
raw_index = telegram_raw
enriched_index = telegram_enriched
api-token = XXXXX

twitter

Messages from Twitter

You need to provide a search query and an API token (which requires creating an app). The script at https://gist.github.com/valeriocos/7d4d28f72f53fbce49f1512ba77ef5f6 helps obtain a token.

  • projects.json
{
    "Chaoss": {
        "twitter": [
            "bitergia"
        ]
    }
}
  • setup.cfg
[twitter]
raw_index = twitter_raw
enriched_index = twitter_enriched
api-token = XXXX

weblate

Changes from Weblate

You need an API token, which can be obtained after registering on a Weblate instance (e.g., https://translations.documentfoundation.org/), via the page /accounts/profile/#api.

  • projects.json
{
    "Chaoss": {
        "weblate": [
            "https://translations.documentfoundation.org"
        ]
    }
}
  • setup.cfg
[weblate]
raw_index = weblate_raw
enriched_index = weblate_enriched
api-token = XXXX
no-archive = true (suggested)
sleep-for-rate = true (suggested)
studies = [enrich_demography:weblate] (optional)

[enrich_demography:weblate] (optional)

Micro Mordred

Micro Mordred is a simplified version of Mordred which omits the use of its scheduler. Thus, Micro Mordred allows running single Mordred tasks (e.g., raw collection, enrichment) per execution.

Micro Mordred is located in the sirmordred/utils folder of this repository. It can be executed from the command line; its parameters are summarized below:

--debug: execute Micro Mordred in debug mode

--raw: activate raw task

--enrich: activate enrich task

--identities: activate merge identities task

--panels: activate panels task

--cfg: path of the configuration file

--backends: list of cfg sections where the active tasks will be executed

--logs-dir: path of the folder in which logs are stored

Examples of possible executions are shown below:

cd .../grimoirelab-sirmordred/sirmordred/utils/
micro.py --raw --enrich --cfg ./setup.cfg --backends git # execute the Raw and Enrich tasks for the Git cfg section
micro.py --panels # execute the Panels task to load the Sigils panels to Kibiter
micro.py --raw --enrich --debug --cfg ./setup.cfg --backends groupsio --logs-dir logs # execute the raw and enriched tasks for the groupsio cfg section with debug mode on and logs being saved in the folder logs in the same directory as micro.py

grimoirelab-sirmordred's People

Contributors

abhiandthetruth, albertinisg, alch-emi, animeshk08, aswanipranjal, canasdiaz, dependabot[bot], dlumbrer, dpose, heming6666, imnitishng, inishchith, jgbarah, jjmerchante, k----n, kaushik-p9, kritisingh1, kshitij3199, mafesan, rcheesley, ria18405, sduenas, snack0verflow, sp35, sumitskj, trbs, valeriocos, vchrombie, willemjiang, zhquan


grimoirelab-sirmordred's Issues

[studies] Error in configuration file

The following configuration for studies raises an error when running Mordred (see below):

studies = []

[enrich_demography]

[enrich_areas_of_code]

The error is raised after retrieval of git data is done:

Enrichment for git: starting...
2018-06-04 15:29:09,565 - mordred.task_enrich - ERROR - Missing config for study :
2018-06-04 15:29:09,565 - mordred.task_manager - ERROR - Exception in Task Manager Missing config for study :
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_manager.py", line 92, in run
    task.execute()
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 291, in execute
    raise e
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 281, in execute
    self.__enrich_items()
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 126, in __enrich_items
    studies_args = self.__load_studies()
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 73, in __load_studies
    raise DataEnrichmentError(msg)
mordred.error.DataEnrichmentError: Missing config for study :
Exception in thread git:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_manager.py", line 92, in run
    task.execute()
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 291, in execute
    raise e
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 281, in execute
    self.__enrich_items()
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 126, in __enrich_items
    studies_args = self.__load_studies()
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 73, in __load_studies
    raise DataEnrichmentError(msg)
mordred.error.DataEnrichmentError: Missing config for study :

This is with 18.05-03.

Maybe this could be related to #155?

keyError: 'assignee_data'

I am getting this error with mordred 'bitergia/mordred:18.04-01'

2018-04-13 07:45:52,128 - grimoire_elk.elk - ERROR - Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk.py", line 583, in enrich_backend
total_ids = load_identities(ocean_backend, enrich_backend)
File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk.py", line 380, in load_identities
identities = enrich_backend.get_identities(item)
File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/enriched/github.py", line 134, in get_identities
user = self.get_sh_identity(item[identity + "_data"])
KeyError: 'assignee_data'

Git Repositories Not Updating

After configuring and setting up the dashboard using mordred, my dashboard is unable to retrieve new data from my git repositories.

Issue running areas of code study

From chaoss/grimoirelab-elk#349

In the original issue you can find more details on installed packages and tests done to try to reproduce it from https://github.com/chaoss/grimoirelab-elk.

@jsmanrique wrote:

I am getting the following error running an areas of code analysis:

Executing for git the studies ['enrich_demography', 'enrich_areas_of_code']
/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/pandas/core/ops.py:816: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
 result = getattr(x, name)(y)
2018-05-11 07:46:03,905 - grimoire_elk.elk - ERROR - Problem executing study <bound method GitEnrich.enrich_areas_of_code of <grimoire_elk.enriched.git.GitEnrich object at 0x7f0f6ccf9c18>>
Traceback (most recent call last):
 File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/grimoire_elk/elk.py", line 460, in do_studies
   study(enrich_backend, no_incremental)
 File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/grimoire_elk/enriched/git.py", line 787, in enrich_areas_of_code
   areas_of_code(git_enrich=enrich_backend, in_conn=in_conn, out_conn=out_conn)
 File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/grimoire_elk/enriched/study_ceres_aoc.py", line 194, in areas_of_code
   ndocs = aoc.analyze()
 File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/grimoire_elk/enriched/ceres_base.py", line 93, in analyze
   process_results = self.process(item_block)
 File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/grimoire_elk/enriched/study_ceres_aoc.py", line 158, in process
   events_df = data_filtered.filter_(["filepath"], "-")
 File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/cereslib/dfutils/filter.py", line 71, in filter_
   self.data = self.data[self.data[column] != value]
 File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/pandas/core/ops.py", line 879, in wrapper
   res = na_op(values, other)
 File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/pandas/core/ops.py", line 818, in na_op
   raise TypeError("invalid type comparison")
TypeError: invalid type comparison

Config files:

My mordred.cfg file:

[general]
short_name = Project_Name
update = false
min_update_delay = 10
debug = true
logs_dir = /tmp/logs

[projects]
projects_file = projects.json

[es_collection]
url = http://localhost:9200
user =
password =

[es_enrichment]
url = http://127.0.0.1:9200
user =
password =
autorefresh = true

[sortinghat]
host = localhost
user = root
password = *******************
database = shdb
load_orgs = true
orgs_file = orgs.json
unaffiliated_group = Unknown
affiliate = true
autoprofile = [github,git]
matching = [email-name]
sleep_for = 0
#bots_names = []

[panels]
kibiter_time_from= "now-90d"
kibiter_default_index= "git"

[phases]
collection = true
identities = true
enrichment = true
panels = true

[git]
raw_index = git-gathered-raw
enriched_index = git-gathered
studies = [enrich_demography, enrich_areas_of_code]

[github]
raw_index = github-gathered-raw
enriched_index = github-gathered
api-token = **************************************************
sleep-for-rate = true
sleep-time = 300
no-archive = true

[*pipermail]
raw_index = mls-gathered-raw
enriched_index = mls-gathered

[*meetup]
raw_index = meetup-gathered-raw
enriched_index = meetup-gathered
api-token = *****************************************
sleep-for-rate = true
no-archive = true

[*stackexchange]
raw_index = stackexchange-gathered-raw
enriched_index = stackexchange-gathered
api-token = *************************************

Grimoirelab deployed via docker does not render anything in dashboard

Hi!

I have recently stood up a grimoirelab deployment via docker. It seems to be up! I can access it on a URL! Yay!

However, the first screen I see when loading it up in the browser is this:

[screenshot]

I wonder what I did wrong? I'm not even sure how to start debugging this. Would someone be willing to help me sort this out?

My relevant configs:

-bash-4.2$ cat credentials.cfg
[github]
api-token = supersecret
enterprise-url = https://git.corp.adobe.com

[projects]
projects_file = /projects.json
-bash-4.2$ cat projects.json
{
    "opensource_submission_process": {
        "github": [
            "https://git.corp.adobe.com/OpenSourceAdvisoryBoard/opensource_submission_process"
        ],
        "git": [
            "https://git.corp.adobe.com/OpenSourceAdvisoryBoard/opensource_submission_process"
        ]
    }
}

Here's how the docker container was stood up:

docker run --name grim -d -p 127.0.0.1:9200:9200 -p 127.0.0.1:5601:5601 -v $(pwd)/logs:/logs -v $(pwd)/es-data:/var/lib/elasticsearch -v $(pwd)/credentials.cfg:/override.cfg -v $(pwd)/projects.json:/projects.json -t grimoirelab/full

Thanks in advance for any help!

Cheers,
Filip Maj

[conf] Split conf file in parts

Most of the configuration file is about the dashboard itself. But a tiny part has to do with the infrastructure needed to produce it: now, that's mainly links to ElasticSearch and MariaDB. I think it would be better to have that part in a separate file. The first one could even be public, so that anyone seeing the dashboard could have access to it. The second one could be private, so that infrastructure details don't get compromised.

Time spent enriching data is not well calculated

Using the release elasticgirl.17 I've seen that we are doing the wrong time calculation when the enrichment lasts more than a day.

See these two lines:

2017-10-26 11:39:37,216 - mordred.task_enrich - INFO - [git] enrichment starts                 
..
2017-10-27 18:26:58,117 - mordred.task_enrich - INFO - [git] enrichment finished in 06:47:20 

Complete log below:

2017-10-26 11:39:35,183 - mordred.mordred - INFO -                                             
2017-10-26 11:39:35,183 - mordred.mordred - INFO - ----------------------------                
2017-10-26 11:39:35,183 - mordred.mordred - INFO - Starting Mordred engine ...                 
2017-10-26 11:39:35,183 - mordred.mordred - INFO - - - - - - - - - - - - - - -                 
2017-10-26 11:39:36,215 - mordred.mordred - INFO - [<class 'mordred.task_projects.TaskProjects'>] will be executed on Fri, 27 Oct 2017 11:39:36                                               
2017-10-26 11:39:37,216 - mordred.task_enrich - INFO - [git] enrichment starts                 
2017-10-27 11:39:37,317 - mordred.task_projects - INFO - Reading projects data from  /home/bitergia/conf/sources/projects.json                                                                
2017-10-27 18:26:58,117 - mordred.task_enrich - INFO - [git] enrichment finished in 06:47:20   
2017-10-27 18:26:59,182 - mordred.task_enrich - INFO - [git] enrichment starts 

Doubt about how to configure GrimoireLab / mordred.cfg

I am deploying a new demo of GrimoireLab and I would like to test affiliations information management.

As a first step, I've seen there are dedicated params in mordred.cfg about this. I've been reading the docs, but it's not clear to me which params I need to touch. Let me explain why.

If I've understood right, affiliate, load_orgs, and orgs_file, are the main params. And I plan to use this orgs.json file as data source:

And this is what the doc says:

affiliate (bool: True): Affiliate identities to organizations (Required)
load_orgs (bool: False):
orgs_file (str: None): File path with the organizations to be loaded in Sortinghat

So, without any further docs (I've not found anything else about these params anywhere), I think the values should be:

affiliate: True
load_orgs: True
orgs_file: orgs.json

But, then, I check current examples provided in the main GrimoireLab repository (1, 2) and I see this:

affiliate: True
load_orgs: false
orgs_file: orgs.json

Could someone clarify which is the right way to configure this?

My objective is that at least, people committing using a corporate account email domain are rightly affiliated.

Thanks!

Plan for documentation

I've been looking at ways of improving the documentation for this project (i.e. #166) and just recently stumbled upon the README that exists under the docker/ subdirectory. Cool! Also noticed that there is auto-generated docs under the doc/ dir. Nice!

TL;DR I think we need to consolidate the docs a little bit, between the various doc locations in this repo, as well as the tutorial repo.

Between the docs under the subdirectories, and the README, I feel like we're getting a bit lost. It would be nice to converge on a single approach.

I personally am a fan of immediate, getting-started type stuff being documented in the README. Things like requirements (needed dependencies, accounts that need to be setup), a quickstart-style usage guide, and how to run the tests are all things I think every project should put in a README.

But then the README-in-a-subdir approach that the docker/ subdir has taken, vs. a single top-level docs/ directory, are approaches that I think are against each other. I think we should choose one.

As an aside, I like the fact that detailed config documentation is auto-generated and the docs live with the code. That is great. Perhaps we could even add some automation to the CI in this project to generate that code on every push to master and publish it to e.g. this project's github page? Perhaps long-term we could document mordred's API as well and publish an API reference? This kind of documentation feels very specific to this project, so I think this information living within the repo (or on this repo's GitHub Pages) makes sense.

It feels like having documentation that is specific around Docker for this project makes sense and is needed as this project is the keystone piece when it comes to the grimoirelab docker container. But does it make sense for the docker/ subdir to have its own README? Maybe. Or maybe some of that information belongs in the docker guide that is part of the grimoirelab tutorial? The main sections I see in the current docker/README are:

  • The different configuration and setup files mordred uses, feels like it could belong in the top-level README if we could condense it down. OR, we could move that to a guide in the tutorial under Mordred? I think I prefer the latter option.
  • I believe we already have tutorial documentation around the projects.json file. If so, we should simply link to that and avoid duplicating our documentation.
  • I'm not super familiar with how docker-compose.yml fits in, but is this still needed?
  • Again, it's not clear to me what the advanced features are about, but it sounds like this could be better served as a tutorial guide.

Let me know your thoughts! I know it's a lot to take in but I think these kinds of writeups from someone like me, a newcomer with no context around how a lot of this stuff works, can be helpful to smooth out the contribution process and therefore scale the development of GrimoireLab up by getting more committers and contributors involved! As always I am happy to send pull requests with ideas on how we could change stuff, or implementations of ideas that come out of this conversation.

Cheers,
Fil

Mordred should remove repositories not included in the projects file

In order to offer a better interaction with the user, the system must keep the enriched indexes updated based on the projects.json file. The update process is not 100% complete, as data sources are never removed from the enriched indexes even when they are deleted from the file.

This feature request aims to make Mordred smart enough to remove repositories when they are deleted from the projects.json file.

document requirements and how to run the tests

It would be nice to know how to get set up in this repository, as well as how to run the tests.

My assumptions / attempts so far have included:

  1. Get a virtualenv, install/update pip, setuptools and wheel as per https://grimoirelab.gitbooks.io/tutorial/content/before-you-start/installing-grimoirelab.html
  2. Run python3 setup.py install
  3. Run pip install -r requirements.txt - maybe? Maybe the last command implicitly installs the reqs?

But how do I run the tests? I tried python3 tests/run_tests.py, but that failed completely. I tried to run it from the tests/ directory and got farther, but it looks like I'm missing a ton of dependencies:

src/grimoirelab-mordred/tests on master [?] via grimoirelab
➔ ./run_tests.py
.EEEEEE
======================================================================
ERROR: test_task (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: test_task
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/Users/maj/src/grimoirelab-mordred/tests/test_task.py", line 33, in <module>
    from mordred.task import Task
  File "../mordred/task.py", line 27, in <module>
    from grimoire_elk.elk import get_ocean_backend
ImportError: cannot import name 'get_ocean_backend'


======================================================================
ERROR: test_task_collection (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: test_task_collection
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/Users/maj/src/grimoirelab-mordred/tests/test_task_collection.py", line 35, in <module>
    from mordred.task_collection import TaskRawDataCollection
  File "../mordred/task_collection.py", line 42, in <module>
    from grimoire_elk.elk import feed_backend
ImportError: cannot import name 'feed_backend'


======================================================================
ERROR: test_task_enrich (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: test_task_enrich
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/Users/maj/src/grimoirelab-mordred/tests/test_task_enrich.py", line 36, in <module>
    from mordred.task_projects import TaskProjects
  File "../mordred/task_projects.py", line 36, in <module>
    from mordred.task import Task
  File "../mordred/task.py", line 27, in <module>
    from grimoire_elk.elk import get_ocean_backend
ImportError: cannot import name 'get_ocean_backend'


======================================================================
ERROR: test_task_identities (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: test_task_identities
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/Users/maj/src/grimoirelab-mordred/tests/test_task_identities.py", line 26, in <module>
    import httpretty
ModuleNotFoundError: No module named 'httpretty'


======================================================================
ERROR: test_task_panels (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: test_task_panels
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/Users/maj/src/grimoirelab-mordred/tests/test_task_panels.py", line 33, in <module>
    from mordred.task_panels import TaskPanels
  File "../mordred/task_panels.py", line 35, in <module>
    from mordred.task import Task
  File "../mordred/task.py", line 27, in <module>
    from grimoire_elk.elk import get_ocean_backend
ImportError: cannot import name 'get_ocean_backend'


======================================================================
ERROR: test_task_projects (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: test_task_projects
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/Users/maj/src/grimoirelab-mordred/tests/test_task_projects.py", line 27, in <module>
    import httpretty
ModuleNotFoundError: No module named 'httpretty'


----------------------------------------------------------------------
Ran 7 tests in 1.630s

FAILED (errors=6)

If someone could help me with this, I will happily issue a pull request to update the README.md.

Thanks!
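A hedged note on the failures above: the ModuleNotFoundError means the test dependency httpretty is simply not installed, while the ImportError suggests the installed GrimoireELK no longer exposes those functions (i.e., a version mismatch between mordred and grimoire-elk). Assuming a pip-based environment, a first step could be:

# install the missing test dependency
pip3 install httpretty

# align mordred and GrimoireELK to matching releases
pip3 install --upgrade grimoire-mordred grimoire-elk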

Move docker/unified_releases to grimoirelab/grimoirelab

Now, we have the grimoirelab/grimoirelab for stuff common to all of GrimoireLab. This unified_releases file seems like a clear example of that, since it controls which commits in all repos are a part of a release. So, I propose to move the directory to be the releases directory in grimoirelab/grimoirelab.

If somebody else agrees, I volunteer to produce the corresponding pull requests (here and in grimoirelab/grimoirelab).

GitHub data source works well, git data source not so much

My projects.json:

{
    "opensource_submission_process": {
        "github": [
            "https://REDACTED/OpenSourceAdvisoryBoard/opensource_submission_process"
        ],
        "git": [
            "https://REDACTED/OpenSourceAdvisoryBoard/opensource_submission_process"
        ]
    },
    "react-spectrum": {
        "github": [
            "https://REDACTED/React/React-spectrum"
        ],
        "git": [
            "https://REDACTED/React/React-spectrum"
        ]
    }
}

The GitHub data shows up great on the dashboard. The git data gives me a bunch of warnings along the lines of "could not locate that index-pattern-field":

(screenshot attached: 2018-06-26, 1:44 PM)

The URLs from my projects.json all point to the same GitHub Enterprise instance my company uses.

Thanks for any help!
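For reference, projects.json examples elsewhere in the GrimoireLab documentation usually give the git backend the repository clone URL (ending in .git), while the github backend takes the plain repository URL. Whether that is the cause here is uncertain, but a sketch of that convention (URLs are placeholders) would be:

{
    "opensource_submission_process": {
        "github": [
            "https://github.example.com/OpenSourceAdvisoryBoard/opensource_submission_process"
        ],
        "git": [
            "https://github.example.com/OpenSourceAdvisoryBoard/opensource_submission_process.git"
        ]
    }
}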

"Missing mbox index" on page load

I have set up my own GrimoireLab instance via the handy Docker container.

I only have a couple of GitHub data sources defined in my projects.json.

When I load GrimoireLab I see the following warnings at the top of the page:

(screenshot attached: 2018-06-26, 11:16 AM)

Can I do anything with my configuration to avoid this warning?

Thanks for any info! I am also happy to send a PR to fix stuff up for the tutorial/docs to reflect what I learn :)

Error when executing areas_of_code study

When running master/HEAD of all the tools, and launching mordred, after a while I see the following error:

2018-03-07 22:07:24,248 - grimoire_elk.arthur - ERROR - Problem executing study <bound method GitEnrich.enrich_areas_of_code of <grimoire_elk.elk.git.GitEnrich object at 0x7f68fb4a0940>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/arthur.py", line 479, in do_studies
    study(enrich_backend, no_incremental)
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk/git.py", line 748, in enrich_areas_of_code
    areas_of_code(git_enrich=enrich_backend, in_conn=in_conn, out_conn=out_conn)
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk/study_ceres_aoc.py", line 194, in areas_of_code
    ndocs = aoc.analyze()
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk/ceres_base.py", line 80, in analyze
    from_date = self._out.latest_date()
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk/ceres_base.py", line 264, in latest_date
    raise nfe
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/elk/ceres_base.py", line 252, in latest_date
    response = search.execute()
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch_dsl/search.py", line 679, in execute
    **self._params
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch/client/utils.py", line 76, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch/client/__init__.py", line 636, in search
    doc_type, '_search'), params=params, body=body)
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch/transport.py", line 314, in perform_request
    status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch/connection/http_urllib3.py", line 163, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/local/lib/python3.5/dist-packages/elasticsearch/connection/base.py", line 125, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.NotFoundError: TransportError(404, 'index_not_found_exception', 'no such index')

The dashboard seems to be generated correctly, though. Any idea of what's happening? Just in case it matters, I'm testing this with Elasticsearch 6.1.

Mordred requires manuscripts to be installed

I have updated the grimoire-mordred Python package and now, when I run mordred -c mordred.cfg, I get this error message:

$ mordred -c mordred-ubuconeu.cfg 
Traceback (most recent call last):
  File "/home/jsmanrique/grimoirelab/venv/bin/mordred", line 39, in <module>
    from mordred.mordred import Mordred
  File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/mordred/mordred.py", line 52, in <module>
    from mordred.task_report import TaskReport
  File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/mordred/task_report.py", line 35, in <module>
    from manuscripts.report import Report
ImportError: No module named 'manuscripts'

Why?

And if the manuscripts Python package is needed, why is it not installed by pip install --upgrade grimoire-mordred?
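Until the packaging is fixed, a likely workaround (assuming the package is published on PyPI under that name) is to install the missing dependency explicitly:

# install the reporting library that mordred.task_report imports
pip3 install manuscripts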

How to configure studies?

Hey, I noticed that studies are now configured in each backend section, but I can't find documentation for it.
I noticed that the git backend accepts the following studies: studies = [enrich_demography, enrich_onion, enrich_areas_of_code]. But what about other backends (github, ...)?
Is mordred taking care of onion and areas of code, or do I have to follow this doc to create aliases?

[git]
studies = [enrich_demography, enrich_onion,enrich_areas_of_code]
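Putting the pieces from this and related issues together, a hedged sketch of a backend section with studies enabled might look like the following (index names are placeholders; which studies each backend accepts depends on the grimoire-elk version):

[git]
raw_index = git-raw
enriched_index = git
# studies to run after the enrichment of this backend
studies = [enrich_demography, enrich_onion, enrich_areas_of_code]

# each active study also needs its own section for its arguments (possibly empty)
[enrich_demography]

[enrich_onion]

[enrich_areas_of_code]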

GitHub backend category config?

What's the goal of the category config in Perceval?
Should we use 'pull_request', 'issue', or both?

.mordred logs

perceval.errors.BackendError: ['pull_request', 'issue'] category not valid for GitHub

.mordred config

category = [pull_request, issue]

Add a config section per study with the params for the study

The approach will be to add a new section per study to the mordred config; all the params included in those sections will be passed as a kwargs dict to the study. All of this must be done in mordred, so this ticket is opened here.
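A hedged sketch of how such a per-study section could look (the parameter names below are purely illustrative, not an existing API):

[enrich_demography]
# everything declared here would be passed to the study as a kwargs dict
no_incremental = true
date_field = author_date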

[Jira] Project name should be in projects.json, not in mordred.cfg

The Perceval Jira backend allows users to specify a project:

$ perceval jira 'https://tickets.puppetlabs.com' --project PUP

In this case, only issues for that project will be retrieved from the Jira API.

The current way of specifying this for Mordred is in mordred.cfg:

[jira]
raw_index = jira_test-raw
enriched_index = jira_test
project = PUP
max-issues = 10

But this is weird, and only allows a single project to be filtered. This should in fact be in projects.json, and should be specified in a way that allows any list of projects from the same Jira API endpoint. For example, it could be like:

"jira": [
     "https://tickets.puppetlabs.com --project project1",
     "https://tickets.puppetlabs.com --project project2",
     "https://tickets.puppetlabs.com --project project3",
     ...
],

See some more details in a recent question, #121.

Mordred should be able to read a remote projects.json file from Gitlab

In order to have a quicker response to the changes applied in production, we need the projects.json file to be read from a remote GitLab file, which can be private. It is exactly the same behavior we have for the identities file.

[general]
..
gitlab_api_token = ***

[projects]
projects_file = https://URL/raw/master/projects.json

By including gitlab_api_token we avoid having the token repeated.

Mordred continuously exiting with "Exit 127"

👋 Hey all, I've been trying to get your example of Mordred up and running from the example folder. I've made all the directory calls relative to the path and included a copy of the docker-compose file on my branch here.

Once I run it (docker-compose up -d), I get into Kibana but do not have any items in ES, as visible here:

(screenshot attached: 2017-06-01, 11:36 AM)

I believe it is due to an error that prevents Mordred from running to completion. I can see it fail in this gist.

[studies] Weird working of the config file for studies

Apparently, when specifying no studies (e.g., studies = []), the corresponding section for specifying the study arguments is still needed. My impression is that only sections for active studies should be in the config file. That is, both of the following configurations should be valid:

The first configuration (no studies at all):

studies = []

The second (active studies, each with its corresponding arguments section):

studies = [enrich_demography, enrich_areas_of_code]

[enrich_demography]

[enrich_areas_of_code]

Right now, apparently only the second one is valid.

Does mordred support kibiter 6.1.0?

Hi,
I checked the new Kibiter version release (https://github.com/grimoirelab/kibiter/releases),
so I tried to use this Kibiter and Elasticsearch 6.1.0 with mordred.
But it looks like it is not working well now.

2018-01-26 11:30:07,219 - mordred.mordred - DEBUG -  Waiting for all threads to complete. This could take a while ..
2018-01-26 11:30:07,226 - mordred.task_manager - DEBUG - Executing task <mordred.task_panels.TaskPanels object at 0x7fc416ab3320>
2018-01-26 11:30:07,229 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): 127.0.0.1
2018-01-26 11:30:07,271 - urllib3.connectionpool - DEBUG - http://127.0.0.1:9200 "GET /.kibana/config/_search HTTP/1.1" 503 145
2018-01-26 11:30:07,272 - mordred.task_manager - ERROR - Exception in Task Manager 503 Server Error: Service Unavailable for url: http://127.0.0.1:9200/.kibana/config/_search
2018-01-26 11:30:07,275 - mordred.mordred - DEBUG - No exceptions in threads. Let's continue ..
2018-01-26 11:30:07,276 - mordred.mordred - DEBUG -  Task manager and all its tasks (threads) finished!
2018-01-26 11:30:07,276 - mordred.mordred - DEBUG - Tasks Manager starting ..
2018-01-26 11:30:07,276 - mordred.mordred - DEBUG - backend_tasks = []

Thank you

Perceval exceptions cause Mordred steps to fail

Trying to run some Meetup group analyses with Mordred, I get exceptions related to "forbidden items" from the Meetup API (depending on group configuration).

These issues cause errors in the threads running in Mordred, which makes some of the following steps, like enriched index creation, fail.

Could Mordred handle that? Or is it a Grimoire ELK issue?

Create a kibiter section in mordred.cfg

When #80 is merged, we will have two separate kinds of information in the panels section: information about the dashboard (such as the default index pattern, or the default time period for the dashboard), and about the infrastructure supporting it (Kibiter url, for example).

I propose to split that into two sections: panels and kibiter. The first one would cover what is directly related to the dashboard, and the second one would reflect details of the infrastructure, related to Kibiter, that could change across different deployments of the same dashboard without affecting the dashboard itself.

For now, the new parameters introduced in #80 would come under kibiter.
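A hedged sketch of how the proposed split could look in mordred.cfg (parameter names are illustrative only):

[panels]
# settings that describe the dashboard itself
default_index = git
default_time_frame = now-90d

[kibiter]
# settings that describe the deployment infrastructure
url = http://localhost:5601
version = 6.1.0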

Missing files

It seems that this repository doesn't have any license, and the README file seems to be lost somewhere.

Always pointing to GitHub but not to GHE (Question)

When using mordred and pointing the git URLs to GitHub Enterprise, it always tries to point to GitHub only.

How can I point it to GitHub Enterprise?

Which API token should be used, and do the orgs_file and identities file need to be created?

Error: ElasticSearch object has no attribute bulk_upload_sync

When launching mordred for CHAOSS, with 18.03-03 packages, I get the following error:

Elasticsearch aliases for pipermail: creating...
Elasticsearch aliases for pipermail: created!
Enrichment for pipermail: finished after 00:00:10 hours
2018-03-20 22:47:28,881 - mordred.task_manager - ERROR - Exception in Task Manager 'ElasticSearch' object has no attribute 'bulk_upload_sync'
Exception in thread pipermail:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_manager.py", line 92, in run
    task.execute()
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 257, in execute
    self.__autorefresh()
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 178, in __autorefresh
    enrich_backend.elastic.bulk_upload_sync(eitems, field_id)
AttributeError: 'ElasticSearch' object has no attribute 'bulk_upload_sync'

This happens for all the data sources. It seems that something is happening with the ElasticSearch object...

Menu set up as a phase (feature request)

Currently, Mordred always tries to set up a sidebar menu for Kibiter / Kibana.

I don't need that menu, so wouldn't it be nice to have that as an optional phase in the Mordred setup file?

Problem with Mordred Docker Container

Hello,

I tried to follow the instructions found at mordred/docker/readme.md; I followed everything exactly, but only switched the git repos I'm tracking in projects.json.

All of the containers launch successfully, but after about 5 minutes the mordred container crashes and it crashes again immediately if I try to start it with docker-compose. I checked the docker logs and found this at the end:

/usr/lib/python3.4/distutils/dist.py:260: UserWarning: Unknown distribution option: 'namespaces'
  warnings.warn(msg)
/home/bitergia/stage: line 79: bin/mordred: No such file or directory

It seems the bin/mordred executable is, in fact, missing.

Here is my docker-compose.yml config:

mordred:
  image: bitergia/mordred:latest
  volumes:
    - devel/mordred/docker/conf/:/home/bitergia/conf/
    - devel/logs/:/home/bitergia/logs/
  links:
    - mariadb
    - elasticsearch

mariadb:
  restart: "always"
  image: mariadb:10.0
  expose:
    - "3306"
  ports:
    - "3306:3306"
  environment:
    - MYSQL_ROOT_PASSWORD=
    - MYSQL_ALLOW_EMPTY_PASSWORD=yes

elasticsearch:
  restart: "always"
  image: elasticsearch:2.2
  command: elasticsearch -Des.network.bind_host=0.0.0.0 -Dhttp.max_content_length=2000mb
  ports:
    - "9200:9200"

kibana:
  image: bitergia/kibiter:4.4.1
  environment:
    - PROJECT_NAME=mytest
    - NODE_OPTIONS=--max-old-space-size=200
  links:
    - elasticsearch
  ports:
    - "8081:5601"

kibana-ro:
  image: bitergia/kibiter:4.4.1-public
  environment:
    - PROJECT_NAME=mytest
    - NODE_OPTIONS=--max-old-space-size=200
  links:
    - elasticsearch
  ports:
    - "8091:5601"

Errors during mordred config file consumption #128

I tried to run mordred with the mordred config file in https://grimoirelab.gitbooks.io/tutorial/mordred/a-grimoirelab-dashboard-in-one-step.html, but I get errors like ('Wrong section param:', 'general', 'sleep'), ('Wrong section param:', 'general', 'kibana') and Error while consuming configuration: ('Wrong section param:', 'es_enrichment', 'studies').
It happens in this line, https://github.com/chaoss/grimoirelab-mordred/blob/master/mordred/config.py#L623, where the params in the cfg file are compared against params_general, which does not have the fields kibana, sleep, etc., and that raises the error. So either the documentation is old or there is a problem in the params_general variable.

Bugzilla Products

For Bugzilla, is there a way to specify which product to collect information from in the projects.json file?

Add support for refreshing identities files from mailmap

The idea of this feature is to have a way to automate the refreshing of the identities file taken from a remote repository, like gitdm or mailmap.

Without having this, if a user wants this data in the identities database she has to:

  • download the files
  • run the scripts available in sortinghat/misc to convert them to Sorting Hat format
  • place the files into an identities folder inside each project
  • load them using Mordred, listing the files in the identities_file parameter

Issue when running mordred from command line

When I run mordred from the command line:
$ mordred -c infra.cfg dashboard.cfg project.cfg override.cfg

I get the error below, and it is not clear what the exact issue is:
2018-04-03 18:11:33,843 - mordred.mordred - INFO -
2018-04-03 18:11:33,844 - mordred.mordred - INFO - ----------------------------
2018-04-03 18:11:33,844 - mordred.mordred - INFO - Starting Mordred engine ...
2018-04-03 18:11:33,844 - mordred.mordred - INFO - - - - - - - - - - - - - - -
2018-04-03 18:11:33,850 - requests.packages.urllib3.connectionpool - WARNING - Retrying (Retry(total=20, connect=11, read=8, redirect=5, status=None)) after connection broken by 'NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f984b5f4780>: Failed to establish a new connection: [Errno 111] Connection refused',)': /
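The retry trace shows a connection to "/" being refused, which usually means one of the services Mordred depends on is not reachable yet. A quick sanity check, assuming Elasticsearch is expected at the URL configured in the es_collection/es_enrichment sections (the host below is a placeholder), would be:

# confirm Elasticsearch answers before launching mordred
curl http://<elasticsearch-host>:9200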

Useless log message about refreshing identities fields

Using master version of GrimoireELK+Mordred I've seen this log message:

2017-11-14 13:33:48,247 - mordred.task_enrich - INFO - Refreshing identities fields in enriched index

It does not add any useful information.

So it should contain:

  • the name of the index it is refreshing
  • the number of identities to be refreshed (wishlist)

Error installing mordred

Hi, I am following the guide: https://chaoss.github.io/grimoirelab-tutorial/before-you-start/installing-grimoirelab.html

An error occurs when I type (grimoirelab) $ pip3 install grimoire-mordred
It shows a long message. In the middle, it says
Failed building wheel for dulwich
and at the end:
Command "/home/emresulun93/venvs/grimoirelab/bin/python3.5 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-hu_ieezp/dulwich/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-6gh_7yc7/install-record.txt --single-version-externally-managed --compile --install-headers /home/emresulun93/venvs/grimoirelab/include/site/python3.5/dulwich" failed with error code 1 in /tmp/pip-install-hu_ieezp/dulwich/

Python 3.6 is installed on the system, and I tried to create the virtual environment with Python 3.5 instead of 3.6, but I got the same error. Also, the same error occurred while installing perceval.
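Failed wheels for dulwich usually mean the compiler or the Python development headers needed to build its C extension are missing. A hedged first step, assuming a Debian/Ubuntu host, would be:

# install a C toolchain and the Python headers used to build dulwich
sudo apt-get install build-essential python3-dev

# then retry inside the virtualenv
pip3 install grimoire-mordred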

Docker container logs should receive the all.log file

The way dockerized applications log messages is by sending the logs to stdout and stderr. So, for Mordred it would be great if those logs were redirected that way directly from all.log, so they could be available via the docker logs command.

Mordred should allow to unify identities by source field

Recent versions of SortingHat allow us to merge accounts using different algorithms per data source, but this feature is not possible with Mordred yet. By adding this feature, we'll allow our system to get a list of GitHub users and merge them into the identities database without side effects on usernames from other sources like IRC, where the username is not a good match indicator at all.

This is a proposal for the syntax:
matching = [email, username:github]

It would unify by 'email', and by 'username' only among identities where source = github.

[grimoirelab2sh] Can not generate the SH JSON file from GrimoireLab yaml file

We get this error when we add the identities.yaml

2017-12-04 11:32:44,565 - mordred.task_identities - ERROR - [sortinghat] Error in command ['grimoirelab2sh', '-i', '/tmp/tmpf88twrz3', '-s', 'cord:manual', '-o', '/tmp/tmphoxmsc1b']
2017-12-04 11:32:44,568 - mordred.task_identities - ERROR - Can not generate the SH JSON file from GrimoireLab yaml file. Do the files exists? Is the API token right?

Why menu.yml is needed

If I've understood correctly, there should be a default menu.yaml file as part of the grimoire-mordred Python package. But it seems there is not, and the mordred command still requires a menu.yaml in the folder it's executed from:

Exception in thread Global tasks:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/mordred/task_manager.py", line 76, in run
    task = tc(self.config)
  File "/home/jsmanrique/grimoirelab/venv/lib/python3.5/site-packages/mordred/task_panels.py", line 61, in __init__
    with open(TaskPanelsMenu.MENU_YAML, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'menu.yaml'

Wouldn't it be nice if mordred used a default menu.yaml when no other is provided?

Standard panels and menu must be updated automatically

The current behavior of Mordred does not update an existing panel even if the panels flag is set to true. This is confusing, because fresher panels are available that our users are not seeing unless they wipe out the .kibana index.

Mordred should allow user to:

  • get fresh panels when they are released
  • get the menu updated (as the menu is sometimes manually modified, overwriting it could be a pain in the neck, so we must allow users to disable this)

My proposal is to decouple the current panels flag into panels and menu. When they are true, the content will be replaced based on the standard product; when they are false, neither the standard panels nor the menu will be updated.

[studies] Unverified access to SSL Elasticsearch

SirMordred seems to access SSL Elasticsearch without verifying certificates, which is useful, for example, for accessing Elasticsearch with a self-signed SSL certificate. But this is not the case for studies, which seem to always verify. We need to either:

  • make not-verifying certs the behavior for studies, as is now for the rest of the interactions with Elasticsearch, or
  • have an option in the corresponding sections in mordred.cfg files (sections on raw and enriched indexes) to declare that access to Elasticsearch is verifying / non-verifying.

I think we could start with the former, and leave this or another issue for the latter, which is maybe a bit more complex (a sketch of what such an option could look like follows).
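As an illustration of the second option, a hedged sketch with a purely hypothetical parameter name (not an existing setting) could be:

[es_collection]
url = https://elasticsearch.example.com:9200
# hypothetical flag: do not verify the (self-signed) SSL certificate
verify_certs = false

[es_enrichment]
url = https://elasticsearch.example.com:9200
verify_certs = false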

Latest version of mordred doesn't work with latest version of elk

When launching master/HEAD of mordred, with master/HEAD of GrimoireELK:

Starting Mordred to build a GrimoireLab dashboard
This will usually take a while...
Traceback (most recent call last):
  File "/usr/local/bin/mordred", line 39, in <module>
    from mordred.mordred import Mordred
  File "/usr/local/lib/python3.5/dist-packages/mordred/mordred.py", line 47, in <module>
    from mordred.task_enrich import TaskEnrich
  File "/usr/local/lib/python3.5/dist-packages/mordred/task_enrich.py", line 33, in <module>
    from grimoire_elk.elk.elastic import ElasticSearch
ImportError: No module named 'grimoire_elk.elk.elastic'
Failed to start Mordred: 1

I looked in GrimoireELK and it seems grimoire_elk/elk/elastic.py is not there anymore...
