
hub.getdbt.com's Introduction

hub.getdbt.com

Package hub for dbt.

Adding a new package

The hubcap.py script, which generates PRs for new versions of packages, is hosted at https://github.com/dbt-labs/hubcap and runs hourly on Heroku. To add a new package to the hub index, create a PR that adds the package name to the hub.json file in that repository.

Rename a package

Renaming a package involves two steps:

  1. Redirecting from the old name to the new name
  2. Removing the old name from the hub homepage

To notify users of a new package name, add a "redirectname" key to data/packages/ORG_NAME/OLD_PACKAGE_NAME/index.json.

To remove an old package name from the hub homepage, add it to blocklist.json.

See #1539 for an example of both steps.
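
As a rough illustration of the first step, the Ruby sketch below adds a "redirectname" key to the parsed contents of an old package's index.json. It assumes index.json holds a single JSON object. The helper name add_redirect and the sample field values are hypothetical. The blocklist.json step is left out because its format is not described here.

```ruby
require "json"

# Hypothetical helper: returns a copy of the parsed index.json contents
# with a "redirectname" key pointing at the new package name.
def add_redirect(index, new_name)
  index.merge("redirectname" => new_name)
end

# Sample contents are made up for illustration; a real index.json has its own fields.
old_index = JSON.parse('{"name": "old_package", "namespace": "ORG_NAME"}')
new_index = add_redirect(old_index, "new_package")

puts JSON.pretty_generate(new_index)
```

The helper returns a new hash rather than mutating its input, so the original contents stay available for comparison before anything is written back to disk.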

hub.getdbt.com's People

Contributors

absorbb, amychen1776, aneiderhiser, annafil, axelazaid, b-per, callum-mcdata, clrcrl, cmcarthur, dave-connors-3, davidbloss, dbeatty10, dependabot[bot], drewbanin, dvalexhiggs, dylanbaker, eogilvy12, fishtownbuildbot, github-actions[bot], jdw818, jkarlavige, joellabes, jtcohen6, jthandy, kristin-bagnall, martinguindon, mirnawong1, saras-daton, seub, versusfacit


hub.getdbt.com's Issues

Revert release of dbt-artifacts 2.2.3

I created a release yesterday for dbt-artifacts 2.2.3 that shouldn't have been released as a patch, as it contained breaking changes. I've temporarily repointed the tag at the old 2.2.2 version, but could you please provide instructions on how to remove the 2.2.3 release?

Many thanks.

Latest package releases not having PRs created to be merged to the dbt hub

Hi dbt-labs team!

I know there have been some changes to the CI/CD process for hubcap, but I have noticed that two of my team's latest package releases did not have PRs opened on this repo to be merged into the dbt hub. The two releases in question are below:

Is there anything my team should update within hubcap or within this repo to help open the PRs on new releases to be merged into the dbt hub?

Thanks!

Feature: embedded project documentation site

Can we render a project's documentation website on the Hub?

Potentially just for the integration tests (which makes it even more important to have integration tests)

Esp. with macros now able to be documented.

I think the answer to this is going to rely on changes to how we release projects. We'll need some step that runs dbt docs generate and attaches the artifacts to the release (as binaries???), which Hub can then consume.

Feature: add master merge check to automerge

If several packages are in the same hubcap batch, this repo will get stuck on what to merge and thus merge nothing. Let's check if we can tell it to merge master and retry until merging (or a merge conflict arises).

Instructions in `CONTRIBUTING.md` are not working for MacOS >= 12.3

I am following the instructions to run the Hub locally but bundle install is failing and the troubleshooting comments don't work either.

I get this error with libv8

    current directory: /Users/bper/.rbenv/versions/2.7.2/lib/ruby/gems/2.7.0/gems/libv8-3.16.14.19/ext/libv8
    /Users/bper/.rbenv/versions/2.7.2/bin/ruby -I /Users/bper/.rbenv/versions/2.7.2/lib/ruby/2.7.0 -r ./siteconf20220810-84506-ypu5nl.rb extconf.rb
    creating Makefile
    /Users/bper/.rbenv/versions/2.7.2/lib/ruby/gems/2.7.0/gems/libv8-3.16.14.19/ext/libv8/builder.rb:86:in `setup_python!': libv8 requires python 2 to be installed in order to build,
    but it is currently 3.9.12 (RuntimeError)
            from /Users/bper/.rbenv/versions/2.7.2/lib/ruby/gems/2.7.0/gems/libv8-3.16.14.19/ext/libv8/builder.rb:53:in `build_libv8!'
            from /Users/bper/.rbenv/versions/2.7.2/lib/ruby/gems/2.7.0/gems/libv8-3.16.14.19/ext/libv8/location.rb:24:in `install!'
            from extconf.rb:7:in `<main>'

    extconf failed, exit code 1

    Gem files will remain installed in /Users/bper/.rbenv/versions/2.7.2/lib/ruby/gems/2.7.0/gems/libv8-3.16.14.19 for inspection.
    Results logged to /Users/bper/.rbenv/versions/2.7.2/lib/ruby/gems/2.7.0/extensions/arm64-darwin-21/2.7.0/libv8-3.16.14.19/gem_make.out

This seems to be because Python 2 was removed from macOS in 12.3, and I am on 12.4.

Feature: support repos from other git providers

I'm not sure if this issue more precisely belongs on the hubcap repo, since that's where the limitation is documented, but I figured I should open it here. This repo is the better catalog of limitations and requested features re: the Hub site more generally.

The Hub site isn't currently able to support repositories hosted on other git providers, e.g. GitLab. The GitLab team has worked around this by hosting mirrors of their packages on GitHub (dbt-labs/hubcap#15).

See also: dbt-labs/hubcap#39

Saras-Daton/Shopify and Saras-Daton/shopify cause issues on case-insensitive filesystems (like Mac APFS)

In the Saras-Daton package directory, there exist two shopify packages—one capitalized Shopify and one all lowercase shopify: https://github.com/dbt-labs/hub.getdbt.com/tree/master/data/packages/Saras-Daton

This causes issues when cloning the repo onto a machine with a case-insensitive filesystem like MacOS (APFS). It looks like the capitalized version is the one still under development, while the all lowercase version is stale. The latest version of hubcap/hub.json confirms that: https://github.com/dbt-labs/hubcap/blob/main/hub.json#L304

Can we remove the all lowercase version of the shopify package from the Saras-Daton directory?
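
A quick way to spot this kind of clash before it reaches a case-insensitive filesystem is to group directory names by their lowercase form. The Ruby sketch below is only an illustration: the helper name case_collisions is hypothetical, and the list of names is hard-coded rather than read from data/packages.

```ruby
# Hypothetical helper: groups names by their lowercase form and keeps
# only the groups that contain more than one distinct spelling.
def case_collisions(names)
  names.group_by(&:downcase).values.select { |group| group.size > 1 }
end

# "Shopify" and "shopify" come from the issue above; "other_package" is made up.
dirs = ["Shopify", "shopify", "other_package"]
collisions = case_collisions(dirs)

p collisions  # => [["Shopify", "shopify"]]
```

A check like this could run in CI against each organization directory, although where it would best live is an open question.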

Only one auto-merge per hour can cause latency for new package versions

Potentially related to #1484

Current functionality

A script runs at the top of every hour and looks for newly released package versions. When a new tag is added to a GitHub repo that is listed on the dbt Package Hub, a pull request is automatically opened within this (hub.getdbt.com) repo. These pull requests are somehow merged automatically.

The problem is that only one pull request is merged per hour. If multiple pull requests are opened within a single hour, then duplicate pull requests will be opened every subsequent hour until all PRs are able to merge.

Desired functionality

Do at least one of the following:

  1. Merge multiple PRs per hour, or
  2. Open a single PR containing all new package versions (rather than separate PRs)

Either one of these options should solve the problems listed above.

Also, it would be nice if 100% duplicate PRs weren't opened. Although related, that could be a separate issue opened within https://github.com/dbt-labs/hubcap. It could be accomplished by naming the PR or the branch using an MD5 hash of the git diff or something similar.
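
The hash-based naming idea can be sketched in a few lines. This is written in Ruby purely for illustration and is not how hubcap works today: the helper name branch_name_for, the "bump" prefix, and the sample diffs are all assumptions.

```ruby
require "digest/md5"

# Hypothetical helper: derives a branch name from the diff contents, so that
# two identical diffs always map to the same branch name.
def branch_name_for(diff, prefix: "bump")
  "#{prefix}-#{Digest::MD5.hexdigest(diff)}"
end

# Made-up diffs for illustration.
diff_a = "+ version: 0.8.1\n"
diff_b = "+ version: 0.8.2\n"

puts branch_name_for(diff_a)
puts branch_name_for(diff_a) == branch_name_for(diff_a)  # identical diffs, same name
puts branch_name_for(diff_a) == branch_name_for(diff_b)  # different diffs, different name
```

With deterministic names, a second run that produces the same diff would target a branch that already exists, which gives the script an easy signal to skip opening another PR.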

Implementation

The 2nd option would actually be implemented within https://github.com/dbt-labs/hubcap rather than this repo. 👈 I lean towards trying this one first.

The 1st option could theoretically be accomplished within this repo, but I don't know what is dictating only one PR merge per hour -- it might be something with these GitHub Actions, but it might be something else.

Research necessity of `www.getdbt.com` submodule

Problem

@pdebelak noted here that there's a private submodule currently listed in the contributing instructions for hub.getdbt.com that is a blocker for building locally for any community members outside of the dbt-labs organization.

Next actions to take

Determine if this submodule is still necessary or not:

If not, then:

  • remove it from .gitmodules
  • remove it from any instructions (like here)

Allow community members to run the website locally

Overview

As noted in #1774, there's a private submodule that is a blocker for building locally for any community members outside of the dbt-labs organization.

If we remove this blocker, then any community member can build the hub.getdbt.com website locally and participate in development.

Implementation steps

  • copy the CSS files from the fishtown-analytics/www.getdbt.com submodule to an internal folder.
  • remove it from .gitmodules
  • remove it from any instructions (like here)
  • add a Dockerfile
  • add a docker-compose and instructions

Feature: GitHub stars?

This is an entirely cosmetic benefit, but it would be interesting to see at a glance, by show of popularity, which packages have a greater "following", as a leading indicator of utility.

Terrible visual example of this idea (fair warning, not a designer)

hub repo stars

Just something to indicate utility level to the community, recent activity, etc.

Change installation instructions to be bounded to minor version, not patch

Now that we're actively encouraging people to install the latest patch version instead of a specific one, we should be showing

packages:
  - package: dbt-labs/dbt_utils
    version: [">=0.8.0", "<0.9.0"]

instead of what we currently have:

[image]

My gut feeling is that I can make a new function to calculate the range in the same way as we strip a leading v:

hub.getdbt.com/config.rb

Lines 24 to 30 in 8786986

def strip_leading_v(version)
  if version.start_with?("v")
    version[1..-1]
  else
    version
  end
end

Add the new version_bounds into the version_data object here

hub.getdbt.com/config.rb

Lines 32 to 41 in 8786986

def _build_package(package, org, name)
  entry = package['index'].clone
  versions = package['versions']
  new_versions = {}
  versions.each do |version_num, version_data|
    version_num = strip_leading_v(version_num)
    version_data['version'] = strip_leading_v(version_data['version'])
    new_versions[version_num] = version_data

and reference version.version_bounds here instead of version.version

<p>Include the following in your <code>packages.yml</code> file:
<pre id='install'>
packages:
  - package: <%= package.namespace %>/<%= package.name %>
    version: <%= version.version %></pre>
<p> Run <code>dbt deps</code> to install the package.

but I haven't written any Ruby before so I don't know what traps await me (or how to test this)

@annafil as the resident Ruby fanatic you seem like my best bet for some pointers 🙏
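
One possible shape for the range function is sketched below. It reuses strip_leading_v as quoted from config.rb. The name version_bounds comes from the proposal above, but its body is an assumption and has not been tested against the real site build. It also ignores pre-release suffixes.

```ruby
# Copied from config.rb as quoted above.
def strip_leading_v(version)
  if version.start_with?("v")
    version[1..-1]
  else
    version
  end
end

# Hypothetical helper: bounds a version to its minor release,
# e.g. "0.8.6" becomes [">=0.8.0", "<0.9.0"].
def version_bounds(version)
  major, minor = strip_leading_v(version).split(".").map(&:to_i)
  minor ||= 0  # treat a bare major version such as "1" as "1.0"
  [">=#{major}.#{minor}.0", "<#{major}.#{minor + 1}.0"]
end

p version_bounds("v0.8.6")  # => [">=0.8.0", "<0.9.0"]
```

Inside _build_package, the result could be stored as version_data['version_bounds'] next to the existing version_data['version'] assignment, and the template could then render that value in place of version.version.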

Remove the link to the dbt Core GitHub site at the top of the page

Carry over from #1719
[image]
The GitHub button here actually points to dbt Core's GH page, not the package as I'd expect.

There is already a View on GitHub text link inline which goes to the repo's page. I think that's a reasonable location, and don't think we need two buttons that go to the same destination so I'd be OK removing the teal "⭐ GitHub" button altogether.

Include release notes in Hub script, use it to populate PR body.

I've set up a couple of bots to post to #package-releases in dbt Slack.

These are linked directly to the individual package repos since I can then include the release notes. If we instead are a little smarter about including release notes here, I can just have One Zap To Rule Them All.

dbt Hub README doesn't match the GitHub README format

Overview of the Issue

We are making updates to our dbt package READMEs to streamline the onboarding experience for first-time package users. One of the updates is including our optional package configuration steps within an expandable section of the README. This section renders correctly on GitHub, but it loses all formatting on the dbt Hub site. See the comparisons below:

Expected Outcome

The formatting and style within the dbt hub README matches that of GitHub

Additional Notes

I noticed a few other formatting options don't translate across as well. For example the badges are all centered on GitHub and are strangely formatted within the dbt hub.

I am happy to contribute if possible to address this issue. Thanks!

Snowflake spend is out of date

https://github.com/gitlabhq/snowflake_spend is a mirror of the GitLab Data team's https://gitlab.com/gitlab-data/snowflake_spend package.

The Hub indexes the package version that lives on GitHub, but that version is significantly behind the GitLab-hosted one, and depending on it results in duplicate package errors.

Resolution options:

  • make sure mirroring across the GitLab and GitHub versions of the package is set up and working correctly (effort size: small)
  • rewrite hubcap to support GitLab hosted repositories (effort size: large)
  • implement dbt Core redirects such that fishtown-analytics/dbt_utils and dbt-labs/dbt_utils dependencies resolve to the same location, thereby not triggering the error (effort size: medium to large)

Release instructions

As a maintainer, I'd like release instructions so that I know how to perform production deployments.

Enable dashes in package names on hub.getdbt.com

Description

Repo names in GitHub can have dashes, but package names displayed on hub.getdbt.com do not support dashes. A common convention is to use dashes in git repo names but underscores on hub.getdbt.com. This leads to a mismatch that is confusing and/or annoying to some users.

Options

  1. Leave Hub as-is. (Mostly) dashes in repo names and underscores in Hub package names.
  2. Leave Hub as-is. Standardize on underscores for repo names.
  3. Enable Hub to support dashes in package names. Let there be a mix of naming conventions.
  4. Enable Hub to support dashes in package names. Standardize package names to match the repo names.

Trade-offs

(Trade-offs to be determined.)

Renaming a package repo breaks ability of existing package users to see updates

The problem

Renaming a repository that houses a dbt package results in multiple copies of said package on the Hub.

Exhibit A: dbt-utils

A couple of years ago, the (then) Fishtown team moved away from - to _ in package names. This created two copies of dbt-utils:

  • fishtown-analytics/dbt-utils [link]; and
  • fishtown-analytics/dbt_utils [link]

Hubcap continued automatically updating the latest version of the package, but only for dbt_utils -- the new package name.

In July, we switched over the fishtown-analytics repository to dbt-labs as part of the brand launch. This created a third copy of dbt_utils under the dbt-labs folder:

This is now the only version of the package that is being updated by Hubcap when new versions are released.

User impact

Pretty soon, more and more folks on the older copies of this package are going to start running into compatibility issues, like this one. We're already getting customer reports about this as well as Community feedback.

80% of packages installed after the fishtown-analytics -> dbt-labs cutover still use the old package name, which means they aren't able to see the upgrade path for dbt-utils 0.7.1.

Screen Shot 2021-08-18 at 2 14 18 PM

❗ This affects not only dbt-utils but any other package that has ever been renamed, including all fishtown-analytics packages and packages from partners and the community that were created before Jul 1st.❗

Things I've tried

I've tried simply using symlinks to point to the new location of the packages, but this doesn't pass registry checks in Core.

Resolution

  1. Short term: we should work on removing FA packages from search results and clearly mark packages as deprecated in their descriptions. @annafil will run point on this one

  2. Long term: after chatting with @drewbanin, sounds like we need to take a few steps back and create a process for deprecating and renaming packages on the Hub. Ideally said process allows us to give a warning message to users about upgrading/removing a dependency when they run dbt deps, i.e. "hey you're using an old version of this package -- it is now called [myfancynamehere/myfancypackage]". @jtcohen6 and @leahwicz let's chat more about working through this one together! :)

GitHub Dark and Light mode hash fragments not being respected

We've recently updated our GitHub page with a different logo for light and dark mode, so that our logo is legible regardless of the appearance the user has selected.

Unfortunately, dbt Hub does not account for this and displays both. Please see below:

[image]

Please see here for the live version

The code in our README looks like this:

  <img src="https://user-images.githubusercontent.com/25080503/237990810-ab2e14cf-a449-47ac-8c72-6f0857816194.png#gh-light-mode-only" alt="AutomateDV">
  <img src="https://user-images.githubusercontent.com/25080503/237990915-6afbeba8-9e80-44cb-a57b-5b5966ab5c02.png#gh-dark-mode-only" alt="AutomateDV">

Here are the GitHub Docs on this feature

Is there any way to enable support for this on dbt Hub? I imagine it might be as simple as adding some CSS to the stylesheets.

I think, as the dbt Hub theme has a white background and this cannot be user-configured, we'd only want to show the gh-light-mode-only image.

Thanks

Package versions that aren't the latest shouldn’t be indexed by google etc

[image]
https://getdbt.slack.com/archives/CU4MRJ7QB/p1659984818869459

/calogica/dbt_expectations/everything-not-latest/ should be no-index'd in search, but https://hub.getdbt.com/calogica/ is a valid page to be indexed too.

@krevitt says

the best way to do that would be to put a <meta name="robots" content="noindex"> in the <head> of each version-specific page
it looks like there's intent to do that in the repo, but the noindex isn't applying properly due to some of the if statement criteria
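
The condition described in the quote can be expressed as a small Ruby helper, sketched here under assumptions: the name robots_meta is hypothetical, the version strings are made up, and the real template's if-statement criteria are not shown in this issue, so this does not reproduce them.

```ruby
# Hypothetical helper: returns the robots meta tag for any version page that
# is not the latest, and an empty string for the latest version.
def robots_meta(version, latest_version)
  version == latest_version ? "" : '<meta name="robots" content="noindex">'
end

# Made-up version numbers for illustration.
latest = "0.8.6"
puts robots_meta("0.8.5", latest)         # older version: tag is emitted
puts robots_meta("0.8.6", latest).empty?  # latest version: nothing is emitted
```

Keeping the decision in one helper makes it easier to check the criteria in isolation than when they are spread across template conditionals.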

Customize org names

Right now, org names are pulled from GitHub directly. Instead, orgs should be able to supply custom names and descriptions for their organizations.

Fix Dockerfile

Bug description

When trying docker-compose build, I get the error

ERROR: Service 'web' failed to build : When using COPY with more than one source file, the destination must be a directory and end with a /

For info, I'm using docker 20.10.7.

Proposed solution

This error can easily be fixed: in the Dockerfile, this line

COPY Gemfile* .

should be replaced by:

COPY Gemfile* ./

Pull request

#2387

Display Glitch

I noticed there is a display glitch on the website, most likely due to a long package name:

Screen Shot 2022-11-14 at 8 51 51 AM

I Did Something Terrible And Now This Repo Has 70,000 Active Branches

uh.......... sorry!

@drewbanin to investigate why this is happening..... ideally before the kind folks from the GitHub SRE team come knocking.....

See https://github.com/fishtown-analytics/hubcap - My guess is that we're pushing a branch every hour when the script runs, whereas we really only want to be pushing a branch when there are changes to report!

This issue can be closed when:

  • The hubcap script is no longer creating so many frivolous branches
  • The ~70k active branches in this repo have been pruned

Screen Shot 2021-06-29 at 7 07 04 PM

Duplicated HubCap bump PRs with merge conflicts

Description

Some pull requests (PRs) that bump HubCap are double-created and include merge conflicts. An example: #1477.

Root cause(s)

GitHub Actions is used to manage the queue of merge requests, and it is unaware when certain merge requests are essentially duplicates.

Potential solutions

  • If the only merge conflict is in relation to the published_at key, it is likely that the pull request can just be closed manually since it is redundant.
  • Is there a method to auto-close pull requests that have merge conflicts plus certain properties?
  • Is there a way to prevent adding a merge request that would introduce a conflict proactively?
  • Could the published_at property be removed without loss of desirable functionality?
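
To make the first option concrete, the Ruby sketch below checks whether two parsed version files differ only in their published_at key. The helper name and the sample JSON are hypothetical. It assumes each version file is a single JSON object with published_at at the top level.

```ruby
require "json"

# Hypothetical helper: true when two parsed version files differ, but become
# identical once the published_at key is ignored.
def differs_only_in_published_at?(a, b)
  strip = ->(h) { h.reject { |key, _| key == "published_at" } }
  a != b && strip.call(a) == strip.call(b)
end

# Made-up file contents for illustration.
existing = JSON.parse('{"version": "1.0.0", "published_at": "2021-01-01T00:00:00Z"}')
incoming = JSON.parse('{"version": "1.0.0", "published_at": "2021-01-01T01:00:00Z"}')
changed  = JSON.parse('{"version": "1.0.1", "published_at": "2021-01-01T01:00:00Z"}')

puts differs_only_in_published_at?(existing, incoming)  # => true
puts differs_only_in_published_at?(existing, changed)   # => false
```

A check along these lines could flag a conflicting PR as redundant and safe to close, though wiring it into the merge queue is a separate question.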

Solution for now

Periodically close these pull requests manually for now, since it is easy to do and carries no risk. It wouldn't prevent us from pursuing another option down the road.

Package from elementary_data is duplicated

Problem

The "elementary_data_reliability" repo in GitHub was renamed to "elementary", but the package is duplicated here:
[image]

Solution

Combine the two package listings into one by following this example.

(If the namespace was being renamed, we'd use this example instead.)

Use github repo description for featured repos

Currently, featured packages have a description that says:
"dbt models for {{ repo_name }}"

Screen Shot 2020-06-16 at 9 48 59 AM

Would be cool if these instead said:

  • dbt-utils: Utility functions for dbt projects (source)
  • audit-helper: Useful macros when performing data audits (source)
  • etc

Auto-bump PRs failing to merge inconsistently/fluky

This is happening very rarely and for unknown reasons.

Recent examples:

Since the PR opened the following hour has consistently succeeded, the impact is limited to:

  • an hour's delay before the new package version is released to hub.getdbt.com
  • a stale PR with merge conflicts that needs to be closed manually

It's not currently a high priority to fix this.

It is reminiscent of:
