A survey on Golang's dependency management modes (GOPATH and Go Modules): status quo, problems and challenges

This report has been collected into [golang/go/wiki/ExperienceReports](https://github.com/golang/go/wiki/ExperienceReports#modules).

The following empirical findings are summarized based on a research paper:
[ICSE’21] Ying Wang, Liang Qiao, Chang Xu*, Yepang Liu, Na Meng, Shing-Chi Cheung, Hai Yu and Zhiliang Zhu. Hero: On the Chaos When PATH Meets Modules, In 43rd International Conference on Software Engineering (ICSE 2021).

Background

Golang has two dependency management modes, GOPATH and Go Modules.

GOPATH Mode:

Before Golang 1.11, Golang used GOPATH mode to manage libraries. Libraries referenced by a project are fetched with the go get command. This mode does not require developers to provide any configuration file. It works by matching the URLs of the sites hosting the referenced libraries with the import paths passed to go get. However, it fetches only a library's latest version. To overcome this restriction, developers use third-party tools such as Dep and Glide to manage different library versions under the same vendor directory.

https://divan.dev/posts/gopath/

GOPATH offered a simple and clean structure for your directory - bin/, pkg/ and src/ triplet. 
Down the directory depth, the structure was mirroring Go import names, which were mirroring URL of the version control system for the package.
GOPATH didn’t try to solve the versioning problem. It was postponed and offloaded to the third-party tools, and ultimately resulted in Go Modules.
With GOPATH, there is only one ‘master’ version, and that’s it.

https://www.ardanlabs.com/blog/2019/10/modules-01-why-and-what.html

When operating in GOPATH mode, the solution was to use go get to identify and clone all the repos for all the dependencies into your GOPATH workspace. However, this wasn’t a perfect solution since go get only knows how to clone and update the latest code from the master branch for each dependency. Pulling code from the master branch for each dependency might be fine when you write your initial code. Eventually after a few months (or years) of dependencies evolving independently, the dependencies’ latest master code is likely to no longer be compatible with your project. This is because your project is not respecting the version tags so any upgrade might contain a breaking change.
When operating in the new module mode, the option for go get to clone the repos for all the dependencies into a single well defined workspace is no longer preferred. Plus, you need to find a way of referencing a compatible version of each dependency that would work for the entirety of the project. Then there is supporting the use of different major semantic versions of the same dependency within your project in case your dependencies are importing different major versions of the same package.
Although some solutions to these problems already existed in the form of community-developed tooling (dep, godep, glide, …), Go needed an integrated solution. The solution was to reuse the module file to maintain a list of direct and sometimes indirect dependencies by version. Then treat any given version of a repo as a single immutable bundle of code. This versioned immutable bundle is called a module.

Go Modules Mode:

Golang 1.11 introduced the Go Modules mode, which allows multiple library versions to be referenced by a module using different paths. A module comprises a tree of Golang source files with a go.mod configuration file in the tree's root directory. The configuration file explicitly specifies the module's dependencies on specific library versions, as well as a module path by which the module itself can be uniquely referenced by other projects. The file must follow the semantic import versioning (SIV) rules. For instance, projects whose major versions are v2 or above should include a version suffix like "/v2" at the end of their module paths.

https://github.com/golang/go/wiki/Modules#semantic-import-versioning

As a result of Semantic Import Versioning, code opting in to Go modules must comply with these rules:
1.	Follow semver. (An example VCS tag is v1.2.3).
2.	If the module is version v2 or higher, the major version of the module must be included as a /vN at the end of the module paths used in go.mod files (e.g., module github.com/my/mod/v2, require github.com/my/mod/v2 v2.0.1) and in the package import path (e.g., import "github.com/my/mod/v2/mypkg"). This includes the paths used in go get commands (e.g., go get github.com/my/mod/v2@v2.0.1. Note there is both a /v2 and a @v2.0.1 in that example. One way to think about it is that the module name now includes the /v2, so include /v2 whenever you are using the module name).
3.	If the module is version v0 or v1, do not include the major version in either the module path or the import path.
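
Rules 2 and 3 can be sketched as a small helper that derives the module path, an import path, and a go get argument from a repository path and a release tag. This is a minimal illustration, not part of the Go toolchain: the function names sivPath and majorOf are hypothetical, and the tag v1.4.0 is made up for the v1 case.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major version number from a tag such as "v2.0.1".
func majorOf(tag string) (int, error) {
	if !strings.HasPrefix(tag, "v") {
		return 0, fmt.Errorf("tag %q does not start with v", tag)
	}
	head, _, _ := strings.Cut(tag[1:], ".")
	return strconv.Atoi(head)
}

// sivPath returns the module path that rules 2 and 3 call for:
// the repository path unchanged for v0/v1, and with "/vN" appended for v2+.
func sivPath(repo, tag string) (string, error) {
	major, err := majorOf(tag)
	if err != nil {
		return "", err
	}
	if major < 2 {
		return repo, nil
	}
	return fmt.Sprintf("%s/v%d", repo, major), nil
}

func main() {
	// "v1.4.0" is a hypothetical tag; "v2.0.1" is the tag used in the rules above.
	for _, tag := range []string{"v1.4.0", "v2.0.1"} {
		p, err := sivPath("github.com/my/mod", tag)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("module %s\n", p)
		fmt.Printf("import %q\n", p+"/mypkg")
		fmt.Printf("go get %s@%s\n", p, tag)
	}
}
```

The helper ignores special cases such as gopkg.in paths, which use a different suffix convention.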

https://www.ardanlabs.com/blog/2019/10/modules-01-why-and-what.html

Modules provide an integrated solution for three key problems that have been a pain point for developers since Go’s initial release:
Ability to work with Go code outside of the GOPATH workspace.
Ability to version a dependency and identify the most compatible version to use.
Ability to manage dependencies natively using the Go tooling.

Module-aware VS Module-unaware:

Module-awareness: The capability of recognizing a virtual path ending with a version suffix like "/v2" from projects in Go Modules.


Module-aware project: A project is module-aware if and only if it uses a compatible or new Golang version and does not use any third-party tools.
Module-unaware project: A project is module-unaware if and only if it uses a legacy Golang version, or it uses a compatible or new Golang version with a third-party tool.

[Figure: how module-aware and module-unaware projects differ in parsing an import path with or without a v2+ version suffix.]
For an import path like "github.com/user/projectA", a module-aware project can reference a specific v0.∗.∗ or v1.∗.∗ version of projectA (by default, the latest version below v2), while a module-unaware project references the version on projectA's main branch (typically the latest version).
For an import path like "github.com/user/projectA/v2", a module-aware project can reference a specific v2.∗.∗ version of projectA (by default, the latest version below v3), while a module-unaware project fails to recognize it.
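
The four cases above can be written down as a small lookup. The sketch below is a simplified model of the described behaviour, not the go command's actual resolution algorithm; the function name interpret is hypothetical, and it assumes the import path names the module root and that the repository has no physical vN directory.

```go
package main

import (
	"fmt"
	"regexp"
)

// versionSuffix matches a trailing "/vN" path element with N >= 2.
var versionSuffix = regexp.MustCompile(`^(.+)/v([2-9]|[1-9][0-9]+)$`)

// interpret describes in words which code an import path selects,
// depending on whether the importing project is module-aware.
func interpret(importPath string, moduleAware bool) string {
	m := versionSuffix.FindStringSubmatch(importPath)
	switch {
	case moduleAware && m != nil:
		// The suffix is virtual: it selects a major version of the base path.
		return fmt.Sprintf("%s at a v%s.*.* version", m[1], m[2])
	case moduleAware:
		return importPath + " at a v0.*.* or v1.*.* version"
	case m != nil:
		// A module-unaware project treats "/vN" as a real directory.
		return fmt.Sprintf("not recognized: %s has no physical v%s directory", m[1], m[2])
	default:
		return importPath + " at the latest commit on its main branch"
	}
}

func main() {
	for _, path := range []string{"github.com/user/projectA", "github.com/user/projectA/v2"} {
		fmt.Println("module-aware:  ", interpret(path, true))
		fmt.Println("module-unaware:", interpret(path, false))
	}
}
```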

Some concrete issues

Many projects have suffered from issues caused by mixing the two dependency management modes. Go Modules is not backward compatible with GOPATH, and SIV rules can be violated even if a Golang project and its referenced upstream projects both use Go Modules. Resolving these issues for a Golang project requires up-to-date knowledge of its upstream and downstream projects, as well as their possibly heterogeneous uses of the two dependency management modes. The concrete issues are described below.

Issue A:

Build errors can occur when projects in GOPATH with no module-awareness directly or transitively depend on projects in Go Modules which have virtual paths with version suffixes.


E.g., issues: pierrec/lz4#33, golang/dep#1962, gin-gonic/gin#2427, micro/go-micro#1839, urfave/cli#866, golang/go#37995, Masterminds/glide#1017, redis/go-redis#1143, Masterminds/glide#968, libp2p/go-libp2p-kad-dht#258, gofrs/uuid#67.

golang/dep#1962

If a library has a major version 2, then its module line in go.mod will be module github.com/foo/bar/v2 even if it is being fetched from github.com/foo/bar. Go, since 1.10.3, will build just fine when using imports like import bar "github.com/foo/bar/v2", but dep complains that the repo doesn't have a submodule v2.
As a result, we can't use dep at all if we depend on packages using go.mod.

Issue B:

A project that has migrated to Go Modules may fail to find the libraries referenced through the GOPATH mode projects it depends on, or may fetch unintended library versions, because the two modes interpret import paths differently.

  • Issue B.1:
    Project A in Go Modules depends on project B in GOPATH, and B further depends on C in Go Modules with import path "C/pkg". Suppose that C has released a v2+ version with the major branch strategy. B interprets the import path as C's latest version (i.e., the v2+ version on C's main branch). However, in A's build environment, the import path is interpreted as a v0/v1 version of C (no version suffix in the path). As a result, A fails to fetch C's correct version and can encounter errors when building with B.
    E.g., issues: cockroachdb/cockroach#47246, micro/go-micro#1731, kubernetes/client-go#474.

  • Issue B.2:
    Project A in Go Modules depends on project B in GOPATH, and B further depends on project C, which is managed in B's Vendor directory. Project A references C by the import path "C/pkg" declared in B's source files rather than from B's Vendor directory. Although the build may work for the time being, A can fail to fetch C if C is deleted or moved to another repository (e.g., by renaming). Even if the fetch succeeds, the version on C's hosting site could differ from the one in B's Vendor directory, and the inconsistency can cause build errors.
    E.g., issues: hybridgroup/gobot#689, sirupsen/logrus#1041, go-kit/kit#940, golang/lint#436, go-macaron/macaron#185, moby/moby#39302, grpc/grpc-go#2700, google/go-cloud#429.

hybridgroup/gobot#689

This is due to the usage of the dependency "github.com/codegangsta/cli", which has been renamed to "github.com/urfave/cli"

Issue C:

Errors will occur when projects in Go Modules depend on projects also in Go Modules but not following SIV rules:
(1) lacking version suffixes like "/v2" in module paths or import paths, although the versions of concerned projects are v2+ (e.g., issue kataras/iris#1355, pierrec/lz4#39, v2ray/v2ray-core#2438, etcd-io/etcd#11154, prometheus/prometheus#6048, vitessio/vitess#5019, golang/go#32695, dgrijalva/jwt-go#301, shirou/gopsutil#663);
(2) version tags not following the semver (e.g., issue osrg/gobgp#1848, gohugoio/hugo#5639, gin-gonic/gin#1388, rclone/rclone#2960, robfig/cron#196);
(3) module paths in go.mod files inconsistent with URLs associated with concerned projects on their hosting sites (e.g., issue jwplayer/jwplatformgo#9, micro/micro#272, etcd-io/etcd#11808).
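
The three kinds of violation can be checked mechanically once a project's hosting path, go.mod module path, and release tag are known. The following is a rough sketch under simplifying assumptions: the function name sivViolations is hypothetical, the tag pattern is a reduced form of the semver grammar, and check (3) compares paths literally, so it would wrongly flag projects that legitimately serve a custom import path from a different host.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// semverTag is a reduced pattern for tags of the form vMAJOR.MINOR.PATCH
// with optional pre-release and build metadata.
var semverTag = regexp.MustCompile(`^v(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$`)

// sivViolations reports which of the three violation kinds apply.
// hostPath is the repository path on the hosting site, modulePath is the
// path declared in go.mod, and tag is the release tag being checked.
func sivViolations(hostPath, modulePath, tag string) []string {
	var out []string
	m := semverTag.FindStringSubmatch(tag)
	if m == nil {
		// Without a valid tag the major version is unknown, so stop here.
		return append(out, fmt.Sprintf("(2) tag %q does not follow semver", tag))
	}
	base := modulePath
	if major := m[1]; major != "0" && major != "1" {
		suffix := "/v" + major
		if strings.HasSuffix(modulePath, suffix) {
			base = strings.TrimSuffix(modulePath, suffix)
		} else {
			out = append(out, fmt.Sprintf("(1) module path %q lacks the %q suffix", modulePath, suffix))
		}
	}
	if base != hostPath {
		out = append(out, fmt.Sprintf("(3) module path %q does not match hosting path %q", modulePath, hostPath))
	}
	return out
}

func main() {
	// A v2+ tag combined with a module path that has no "/v2" suffix.
	for _, v := range sivViolations("github.com/pierrec/lz4", "github.com/pierrec/lz4", "v2.2.4") {
		fmt.Println(v)
	}
}
```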

golang/go#31543

Of the various go.mod files I have looked at in repos with v2+ semver tags over the last several months, I estimate more than 50% of those go.mod files are incorrect due to missing the required /vN at the end of the module path.

golang/go#32695

Observe people accidentally creating and using modules that have v2+ semver tags but that have not adopted Semantic Import Versioning.
For example, if there is a module example.com/foo that:
>has a go.mod file (that is, it has adopted modules)
>has a v2.0.0 semver tag as its latest tag
>and its module line reads module example.com/foo (without the in theory required /v2)
then this still works for a module-based consumer, even though the module is in a "bad" state:
go get example.com/foo@v2.0.0+incompatible
This means people can and do create usable v2+ modules that did not adopt Semantic Import Versioning, and consumers can and do consume those "bad" modules.
However, other things such as upgrades or go get example.com/foo@latest do not work as expected, which leads to confusion.

Fixing Solutions

Developers have come up with various solutions to the issues above. However, resolving these issues for a Golang project requires up-to-date knowledge of its upstream and downstream projects, as well as their possibly heterogeneous uses of the two dependency management modes. Fixing an issue locally in one project, without considering the ecosystem as a whole, can easily cause new issues in its downstream projects.

Cases

  • Case 1:
    Project pierrec/lz4 migrated to Go Modules in version v2.0.7. Following SIV rules, it declared module path "github.com/pierrec/lz4/v2" in its go.mod file, with version suffix "/v2". Although the project built successfully after migration, it induced Issue A in downstream projects still in GOPATH, since the latter cannot recognize the version suffix in the module path.
    E.g., issues: pierrec/lz4#33, IBM/sarama#1163, mholt/archiver#86, filebrowser/filebrowser#530.
    To fix the issue, pierrec/lz4 released version v2.2.4, still in Go Modules, but removed version suffix "/v2" from its module path as a workaround. This resolved Issue A in its downstream projects in GOPATH, but induced Issue C in downstream projects that had already migrated to Go Modules, causing build errors, since this workaround violates SIV rules.
    E.g., issues: pierrec/lz4#39, golang/go#32695, golang/go#34189, golang/go#31428.
    As there was no accurate way to estimate the impact of migration on its downstream projects, pierrec/lz4 chose to roll back to GOPATH in v2.2.6 and suspended its migration until most of its downstream projects had completed their own migrations.
    Finally, pierrec/lz4 migrated to Go Modules again in version v3.0.1, which again induced Issue A in downstream projects still in GOPATH.
    E.g., issues: kythe/kythe#4208, pierrec/lz4#62, prometheus-community/postgres_exporter#408, forensicanalysis/artifactcollector#18, go-rod/rod#108.

  • Case 2:
    Project golang-migrate/migrate migrated to Go Modules in version v3.5.2 and declared its module path as "github.com/golang-migrate/migrate/v3" in its go.mod file, with version suffix "/v3". This induced Issue A in downstream projects still in GOPATH, since the latter cannot recognize the version suffix in the module path.
    To fix the issue, golang-migrate/migrate held a vote among its users and learned that they preferred rolling back to GOPATH until the next major version, v4.

The eight common fixing solutions, each with different trade-offs, are summarized below.

Solution 1: Projects in GOPATH migrate to Go Modules.

This solution fixes Issue A by encouraging migration from GOPATH to Go Modules, since Issue A arises when projects still in GOPATH cannot recognize import paths with version suffixes.
For example, in issue redis/go-redis#1154, project go-redis/redis migrated to Go Modules while its downstream projects were still in GOPATH. The downstream projects were advised to migrate to Go Modules as well to avoid build errors. This solved Issue A for those projects, but passed it on to their own module-unaware downstream projects, where new instances of Issue A occurred (e.g., issues redpanda-data/connect#232, redpanda-data/connect#270).
Examples: golang/go#37995, filebrowser/filebrowser#530, gotestyourself/gotest.tools#203, kataras/iris#1385, labstack/echo#1321, micro/go-micro#1839, urfave/cli#866, DataDog/dd-trace-go#606, oauth2-proxy/oauth2-proxy#642, prometheus/client_golang#673.
[This solution will affect downstream projects in GOPATH]

Solution 2: Projects in Go Modules roll back to GOPATH.

This solution cancels a previous migration to Go Modules. It fixes Issue A and Issue C by removing the incompatibility that the migration caused.
For example, in issue gofrs/uuid#61 (Issue A), project gofrs/uuid's migration to Go Modules broke the builds of many downstream projects in GOPATH. As a compromise, gofrs/uuid rolled back to GOPATH, waiting for downstream projects to migrate first. In issue shirou/gopsutil#663 (Issue C), shirou/gopsutil and its downstream projects were all in Go Modules, but shirou/gopsutil violated SIV rules (its module path lacked a version suffix for a v2+ release), causing build errors in downstream projects. shirou/gopsutil therefore chose to roll back to GOPATH, which temporarily made downstream projects work again. This solves the immediate problem, but hinders the ecosystem's migration to Go Modules.
Examples: dgraph-io/badger#4662, golang-migrate/migrate#103, go-mail/mail#39, patrickmn/go-cache#89, stripe/stripe-go#712, cenkalti/backoff#76, redis/go-redis#1149, go-chi/chi#327, pierrec/lz4#39, go-chi/jwtauth#42, sercand/kuberesolver#11.

Solution 3: Changing the strategy of releasing v2+ projects in Go Modules from major branch to subdirectory.

It targets Issue A, where module-unaware projects cannot recognize virtual import paths for v2+ libraries in Go Modules. The new strategy creates physical paths by copying the code into a version subdirectory, so that the libraries can be referenced by module-unaware projects. However, this is a workaround that needs extra maintenance in subsequent releases.
Examples: mediocregopher/radix#128, nicksnyder/go-i18n#184, olivere/elastic#1145, gomodule/redigo#366, golang/go#37995, twitchtv/twirp#169.

Solution 4: Maintaining v2+ libraries in Go Modules in downstream projects' Vendor directories rather than referencing them by virtual import paths.

It targets Issue A. By keeping a copy of the libraries in their own Vendor directories, downstream projects avoid fetching them by virtual import paths.
For example, in issue mediocregopher/radix#141, mediocregopher/radix refused to use the major subdirectory strategy for its v2+ release in Go Modules. Its downstream projects had to copy mediocregopher/radix’s code into their Vendor directories, which causes extra maintenance and may lead to Issue B in the future.
Examples: brianvoe/gofakeit#88, gopherjs/gopherjs#881, redis/go-redis#1143, mholt/archiver#192, moby/moby#40371, vmihailenco/msgpack#237.
[This solution may affect downstream projects in Go Modules]

Solution 5: Using a replace directive with version information to avoid using import paths in referencing libraries.

It addresses Issue B.1 (problematic import path interpretations) and Issue C (import path violating SIV rules).
For example, in issue andrewstuart/goq#12, a client project used the directive "replace github.com/andrewstuart/goq => astuart.co/goq v1.0.0" to replace the original import path and reference the version of andrewstuart/goq it expected. However, developers can then no longer use the go get command to fetch upgraded libraries automatically.
Examples: maistra/istio#78, scylladb/gocql#3, etcd-io/etcd#10773, cockroachdb/cockroach#47246, micro/micro#1149, moby/moby#39302, grpc/grpc-go#3500.

Solution 6: Updating import paths for libraries that have changed their repositories.

It fixes Issue B.2, where libraries in a project's Vendor directory may be inconsistent with the ones referenced by their import paths. It updates import paths to help a project's downstream projects in Go Modules fetch consistent library versions.
For example, in issue google/go-cloud#429, google/go-cloud managed library coreos/etcd in its Vendor directory, which later changed its hosting repository from github.com/coreos/etcd to go.etcd.io/etcd. To fix build errors for google/go-cloud's downstream projects in Go Modules, it updated coreos/etcd's import path to the latest one for consistency. This fixes the issue, benefiting all affected downstream projects, without impacting others in the ecosystem.
Examples: hybridgroup/gobot#689, kythe/kythe#3344, pion/webrtc#1082, census-instrumentation/opencensus-go#1052, micro/go-plugins#372, go-kit/kit#940, golang/lint#436, golang/oauth2#395, sirupsen/logrus#1041, nats-io/nats.go#478, unknwon/com#26.
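
Updating import paths after a repository move can be automated with the standard go/parser and go/format packages. The sketch below is one possible way to do it, not the method used by the projects above; the function name rewriteImports is hypothetical, and the example paths are the renamed dependency quoted from hybridgroup/gobot#689.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// rewriteImports returns src with every import of oldPath, or of one of its
// sub-packages, changed to the corresponding path under newPath.
func rewriteImports(src, oldPath, newPath string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	for _, imp := range file.Imports {
		// Path.Value holds the quoted literal, so unquote before comparing.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return "", err
		}
		if path == oldPath || strings.HasPrefix(path, oldPath+"/") {
			imp.Path.Value = strconv.Quote(newPath + strings.TrimPrefix(path, oldPath))
		}
	}
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	src := `package app

import "github.com/codegangsta/cli"

var _ = cli.NewApp
`
	out, err := rewriteImports(src, "github.com/codegangsta/cli", "github.com/urfave/cli")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Only import declarations are rewritten; any dependency entries in go.mod or in a vendoring tool's manifest would still need to be updated separately.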

Solution 7: Projects in Go Modules fix configuration items to strictly follow SIV rules.

It urges projects that have migrated to Go Modules to follow Golang's official guidelines on SIV rules, for fixing Issue C.
For example, in issue redis/go-redis#1149, project go-redis/redis added version suffix "/v7" at the end of its module path to follow SIV rules. This fixed Issue C, but the project's downstream projects in GOPATH may then encounter Issue A because they cannot recognize such version suffixes (e.g., issue redis/go-redis#1151).
Examples: etcd-io/etcd#11154, godbus/dbus#125, gotestyourself/gotest.tools#140, redpanda-data/connect#232, kataras/iris#1355, labstack/echo#1244, mholt/archiver#187, golang/go#27009, golang/go#33879, gin-gonic/gin#1388, golangci/golangci-lint#371, osrg/gobgp#1848, golang/go#37529, gohugoio/hugo#5639, istio/api#1201, golang/go#29731, vitessio/vitess#5019, googleforgames/open-match#675, micro/micro#272, microsoft/go-winio#156, golang/go#31437.
[This solution may affect downstream projects in GOPATH]

Solution 8: Using a hash commit ID for a specific version to replace a problematic version number in library referencing.

Downstream projects use this approach to avoid Issue C, where some projects in Go Modules violate SIV rules in their version numbers and cause build errors in downstream projects that are also in Go Modules. It avoids referencing problematic version numbers by using require directives with a specific commit hash in the downstream projects' go.mod files.
For example, in issue prometheus/prometheus#6048, one of prometheus/prometheus's downstream projects in Go Modules chose to use directive "require github.com/prometheus/prometheus 43acd0e" to reference its expected version v2.12.0. As with Solution 5, developers can then no longer fetch upgraded libraries automatically.
Examples: argoproj/argo-workflows#2602, concourse/concourse#3952, ibm-messaging/mq-golang#121, dexidp/dex#1710, zouyx/agollo#78, rwynn/monstache#316, pingcap/parser#812.
[This solution will affect downstream projects in Go Modules]
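
When a go.mod file references a commit rather than a tagged version, the go command records it as a pseudo-version that encodes the commit time and a 12-character hash prefix. The sketch below builds only the simplest pseudo-version form, v0.0.0-yyyymmddhhmmss-abcdefabcdef; the go command may choose a different base version depending on earlier tags. The function name pseudoVersion, the timestamp, and the hash in main are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// pseudoVersion formats the basic pseudo-version form from a commit time
// (given as Unix seconds) and a commit hash of at least 12 characters.
func pseudoVersion(commitUnix int64, commitHash string) (string, error) {
	if len(commitHash) < 12 {
		return "", fmt.Errorf("need at least 12 hash characters, got %d", len(commitHash))
	}
	// Pseudo-versions use the commit time in UTC.
	stamp := time.Unix(commitUnix, 0).UTC().Format("20060102150405")
	return fmt.Sprintf("v0.0.0-%s-%s", stamp, commitHash[:12]), nil
}

func main() {
	// Hypothetical commit time and hash, for illustration only.
	v, err := pseudoVersion(1566086400, "0123456789abcdef0123456789abcdef01234567")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("require example.com/some/module", v)
}
```

Because the version is pinned to one commit, a later upstream release is not picked up until the directive is edited by hand, which is the trade-off noted above.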

As it stands, GOPATH mode and Go Modules mode will co-exist for a while in the ecosystem. Developers are likely to encounter these issues again.

The purpose of this report is to help developers understand these issues better and choose the most suitable solution for each. We hope it helps the ecosystem make a smooth transition from GOPATH to Go Modules.
