dominikbraun / graph Goto Github PK

View Code? Open in Web Editor NEW

1.7K 1.7K 94.0 429 KB

A library for creating generic graph data structures and modifying, analyzing, and visualizing them.

Home Page: https://graph.dominikbraun.io

License: Apache License 2.0

Go 100.00%

algorithm graph graph-algorithms graph-theory graph-traversal graph-visualization graphs graphviz visualization

graph's People

Contributors

Stargazers

Watchers

Forkers

avineshwar dolanor-galaxy geoah 0xforked altoplano wnz27 isgasho super-rain dextercai serhankarakoc wu-xian wahello grapery daniel-007 nightfurex kokizzu rizalgowandy zsigacsaba freeformz simonrichardson jpdro icodein marcel-zaripov shnwang donnol pangeran-bottor therockstardba demonoid81 luz71930 colindickson tobikris aryanlath indreshrsingh hrily postmannen pd93 novamacintyre spirosorfanos rickdana woshizilong schofi libmonsoon-dev jarrodhroberson sparrowochon rachlenko subruber sbromberger app-creative vcsfrl slary1260 shijieqin tzq0301 williamfzc jackyyangpassion liucy1898 lkarlslund yojn jtlsq agfn ljk314 letterbeezps jonjohnsonjr jiangpluss 5l1v3r1 ymm135 donnaaazz0101 christie1992 7velenpiggu jonbrandenburg wylswz manuelnixon930 warpmatrix mengjiaheng s111 gavincarr klothoplatform davidovich versaurabh hoyychoi ozoniuss zeidlermicha umangachapagain shehackedyou catundercar dtomasi akashkumar7902 kamibababa twartzek orzel7 zoonage snowonion albaltimore klases mnhafid

graph's Issues

Graph attributes

Is it possible to have a generic Graph attributes that can be passed like Vertex/Edge to GraphViz?

I was looking at adding this but couldn't find anything :
rankdir=LR;

altering an edge property

Given an already created edge (1 to 2), changing the weight doesn't seems to be currently possible

Tested

anEdge, _ := g.Edge(1, 2)
anEdge.Properties.Weight = 5

anEdge.Properties = graph.EdgeProperties{Weight: 11}`

_ = g.AddEdge(1, 2, graph.EdgeWeight(50))`

none changes the weight.

A function to do this or a workaround in documentation would be great. Very nice library btw.

Add `AdjacencyList` method to `Graph`

There should be an AdjacencyList method that returns an adjacency list as a map[K][]K containing a list of adjacency hashes for each vertex hash.

Behavior

AdjacencyList should return an adjacency list that contains a key for all vertices of the graph, even if a vertex has no adjacent vertices and the corresponding list will be empty. This allows users to safely retrieve the adjacency list for a vertex without a preceding check.

Example:

A -> B -> C
B -> D

Adjacency list:

A: [B]
B: [C, D]
C: []
D: []

Draw straight lines

func DrawGraph() {
    // new a graph    
    g := graph.New(graph.StringHash, graph.Directed())    
    
    // add vertexes    
    for vertex := range file2index {    
        g.AddVertex(vertex, graph.VertexWeight(5))    
    }    
    
    for i := 0; i < fileAmount; i++ {    
        forerunner := index2file[i]    
        for j := 0; j < fileAmount; j++ {    
            if directedGraph[i][j] {    
                successor := index2file[j]    
                // add egdes    
                g.AddEdge(forerunner, successor)    
                fmt.Println(forerunner, successor)    
            }    
        }    
    }    
    
    // save the graph    
    file, _ := os.Create("./mygraph.gv")    
    draw.DOT(g, file)    
    
    // print strongly connected components    
    scc, _ := graph.StronglyConnectedComponents(g)    
    fmt.Println(scc)    
    
    // print topological sort result    
    order, _ := graph.TopologicalSort(g)    
    fmt.Println(order)    
    
}

Hi bro, can you tell me how to make edges straight? Just like your examples of readme.md. Thank you very much!

Change unit tests to use subtests

All unit tests should use subtests. That is, calling the entire testing logic inside t.Run.

Subgraphs

Hey folks, really cool library! I was wondering if its possible to have subgraphs. If I had two graphs of two kinds with different hashes, I'd like to connect them somehow.

New public API

At version 0.8.0, there is a central Graph interface that offers two kinds of methods: Methods to build, modify, and "read" the graph, and methods to perform operations and algorithms on the graph. These two things should be separated. Algorithms shouldn't be the graph's concern.

What I mean by "reading" is to retrieve vertices and edges.

I'm thinking of a lighter Graph interface that only has methods for building and reading the graph itself, and standalone package functions for algorithms such as pathfinding, that accept a Graph instance and do the rest themselves.

The obvious challenge is to find an interface surface for Graph that satisfies the needs of pretty much all algorithms, which was the reason why I didn't go for this approach in the first place. Now that there are many functions and methods in place, figuring out the needs for the algorithms should be a bit easier.

Example for finding SCCs:

Old:

g := graph.New(graph.IntHash, graph.Directed())
scc := g.StronglyConnectedComponents()

New:

g := graph.New(graph.IntHash, graph.Directed())
scc := graph.StronglyConnectedComponents(g)

Why?

The advantage is that this scales much better. For 20 new algorithms, the Graph interface doesn't grow by 20 new methods.

This is relevant in that there are many algorithms that roughly look the same for directed and undirected graphs, and yet they need to be implemented twice. A standalone package function would not only make the Graph interface simpler but also prevent code and test duplication.

Method for updating an edge

There should be a method for updating an edge, for example UpdateEdge[K comparable](source, target K, options ...func(*EdgeProperties)) error. The API should be exactly the same as for AddEdge, so that the user can use the existing functional options for setting the edge properties.

DFS and BFS only work for connected undirected graphs

The current DFS and BFS implementations, that is, DFS, DFSByHash, BFS, and BFSByHash, only work correctly for connected undirected graphs. In case one or more of the vertices are not reachable from the running DFS/BFS, these vertices will never be traversed. Vertices are not reachable if the graph is disconnected, or if it is directed and the edge directions don't allow a traversal between traversed and "isolated" vertices.

Add graph isomorphism computation

Graph isomorphism can be used to determine whether two given graphs are structurally equal, or simply put "the same".

It may look like a trivial check at first glance, but it is one of the hardest problems to solve in graph theory.

There are a couply of algorithms for computing isomorphisms of two graphes, and I don't have a strong opinion on which one of them I would like to see implemented in this library.

At this time, I would prefer a simpler implementation that works for 80% of use cases and graph sizes over a more complex implementation that covers 99% of possible graphs.

AdjacencyMap may cause panic

Concurrent calls to AddVertex, AddEdge, and AdjacencyMap redundantly panic.

panic: assignment to entry in nil map

goroutine 1996 [running]:
github.com/dominikbraun/graph.(*undirected[...]).AdjacencyMap(0xc0009542c0?)
	github.com/dominikbraun/[email protected]/undirected.go:167 +0x225

ListVertices and ListEdges directly occur AddVertex and AddEdge will cause this problem

Question: how are the example images generated?

Hey there nice module Just wanna ask how you generated those example graph images? Is there any plan to add visualisation feature into the module?

please support remove vertex

How to draw a graph as beautiful as in README?

Hi, I am trying to find a package that can draw beautiful directed graphs. It seems the DOT files generated by draw package does not contains any predeinfed attributes. Does the images in README is drawed by DOT? What vertex/edge attributes is it using?

Thank you~

Example code:

g := graph.New(graph.IntHash)

_ = g.AddVertex(1)
_ = g.AddVertex(2)
_ = g.AddVertex(3)
_ = g.AddVertex(4)
_ = g.AddVertex(5)

_ = g.AddEdge(1, 2)
_ = g.AddEdge(1, 4)
_ = g.AddEdge(2, 3)
_ = g.AddEdge(2, 4)
_ = g.AddEdge(2, 5)
_ = g.AddEdge(3, 5)

Corrsponding image in REDAME:

The image I got:

List all path between vertices

I have a need to list all the path available to traverse between any 2 given vertices. Does this exist? Thoughts on how to implement it?

Store adjacency map instead of two maps internally?

The current Graph implementations store at least two maps containing the edges, and adjacency maps are computed on the fly using AdjacencyMap.

This could possibly replaced with an implementation that just stores an adjacency map internally (instead of maintaining two maps), and returns that adjacency map directly when needed.

Which errors should be returned by `Store` implementations?

Now that the Store interface is available, it should be clarified which and what kind of errors should be returned by its implementations.

Example: AddEdge adds a new edge. Should the store return ErrEdgeAlready exists if the edge already exists, or should the Graph implementation check this in advance and return ErrEdgeAlready itself?

In short: Should a Store only handle technical errors or domain errors, too?

Advantages of only handling technical errors

The developer that implements a Store doesn't have to wrap their head around domain problems and focus on the actual storage logic. They don't need to be familiar with the behavior of the graph in edge cases.
Separation of concerns: The Store implementation stores data and does nothing else. Everything graph-related happens in the Graph implementations.

Advantages of handling technical and domain errors

More control over the behavior of the graph. If it is ok for the user to overwrite existing edges, they just don't return an error in AddEdge if the edge already exists and can safely rely on that.

CC @geoah

Unify graph traits and properties

This is a v1.0 feature. As long as graph is in version 0, it must not be implemented.

At the moment, there are Traits and Properties. Traits contains graph traits such as IsDirected, and Properties contains dynamic graph properties such as Attributes. These two types shall be unified in a new Properties type.

Motivation

The reason is that New only accepts functional options for traits as a variadic parameter:

func New[K comparable, T any](hash Hash[K, T], options ...func(*Traits)) Graph[K, T]

Accepting a ...func(*Traits) implies that you can't pass functional options for Properties (func(*Properties)). An example for a functional option for Properties is Attribute(key, value string) func(*Properties) that adds a key-value pair to the properties, as it is the case for VertexProperties. Therefore, the user can't create a directed graph with a key-value pair directly:

g := graph.New(graph.IntHash, graph.Directed(), graph.Attribute("name", "my-graph")) // Won't work!

The motivation of this issue is to unify both in a Properties type, accept functional options for Properties as a variadic parameter of New, and allow New calls as shown above.

Add support for edge attributes

At the moment, edges consist of a source vertex, a target vertex, and a weight. It would be great to have a label as well:

type Edge[T any] struct {
	Source T
	Target T
	Weight int
	Label string
}

Possible APIs

One approach to provide an API for adding a label to an edge is to implement methods such as EdgeWithLabel and WeightedEdgeWithLabel. This is a very simple approach, but the drawback is that we would end up with 4 new methods (because of the additional ByHash methods), which would make a total of 8 methods to create an edge.
The second approach would be to introduce functional options for both the weight and the label for an edge. Those functional options would be added to the Edge and EdgeByHashes functions. Since they are an optional parameter, adding them would not be a breaking change. Functional options are already being used by the New function.

Functional options

I personally prefer the second approach. The Edge and EdgeByHashes functions would look as follows:

Edge(source, target T, options ...func(Edge[T])) error
EdgeByHashes(sourceHash, targetHash K, options ...func(Edge[T])) error

For adding the weight and labels, there would be two predefined functional options:

func EdgeWeight[T any](weight int) func(Edge[T]) {
    return func(edge Edge[T]) {
        edge.Weight = weight
    }
}

Same for the Label field. Creating an edge from "A" to "B" would look like this:

g.Edge("A", "B", graph.EdgeWeight(5), graph.EdgeLabel("my label"))

As a result, the WeightedEdge and WeightedEdgeByHashes functions could be deprecated.

DOT not found

Can anyone tell me how to fix this problem?

Method for adding vertices from another graph

The Graph interface should provide a method for adding vertices from another graph, for example AddVerticesFrom[K comparable, T any](source Graph[K, T]) error. This method should clone all vertices from source, including their properties, and add them to the method receiver.

Shortest path is not the shortest

func TestShortestPath(t *testing.T) {
	g := graph.New(graph.StringHash, graph.Weighted(), graph.Directed())

	g.AddVertex("start")
	g.AddVertex("a")
	g.AddVertex("b")
	g.AddVertex("fin")

	g.AddEdge("start", "a", graph.EdgeWeight(6))
	g.AddEdge("start", "b", graph.EdgeWeight(2))
	g.AddEdge("a", "fin", graph.EdgeWeight(1))
	g.AddEdge("b", "a", graph.EdgeWeight(3))
	g.AddEdge("b", "fin", graph.EdgeWeight(5))

	path, _ := graph.ShortestPath(g, "start", "fin")

	wanted := []string{"start", "b", "a", "fin"}
	if !reflect.DeepEqual(path, wanted) {
		t.Errorf("shortest path failed, result: %v, wanted: %v", path, wanted)
	}
}

shortest path failed, result: [start b fin], wanted: [start b a fin]

Change `New` signature to accept `...func(*Properties)`

This is a v1.0 feature. As long as graph is in version 0, it must not be implemented.

To enable #66, the function signature of New has to look as follows:

func New[K comparable, T any](hash Hash[K, T], options ...func(*Properties)) Graph[K, T]

Introduce `PreventCycles()` option and don't prevent cycles in `AddEdge` for acyclic graphs by default

At the moment, if the graph has been created with the Acyclic() option, AddEdge will return an error if a cycle would be introduced.

This behavior should be transferred to a new PreventCycles option: A graph created using Acyclic() should still allow the introduction of cycles in AddEdge, and only if the creation of cycles has been explicitly prevented using PreventCycles(), AddEdge should return an error.

Why?

There are two reasons for this:

The cycle check is an O(n) operation. Running an O(n) operation in AddEdge in the background might be an unexpected behavior for the user, who probably assumes that AddEdge is a O(1) operation (which it should be).
There are scenarios where you have an acyclic graph but still want to maintain and allow temporary cycles for an indefinite period, and delay the cycle checks.

Fix golangci-lint in CI pipeline

golangci-lint doesn't seem to work anymore. This job reports some weird errors that haven't appeared before, and I'm having issues with golangci-lint in other repos, too.

Add `SimpleCycles` function

There should be a SimpleCycles standalone function that detects all simple cycles in a given graph.

Analogous to functions like StronglyConnectedComponents, SimpleCycles should store the hashes of the vertices that form a cycle as a []K and return all of them as a slice, so that the return type is[][]K.

Example

In the graph shown above, there are two cycles: 1-2-4 and 2-3-5. This is how SimpleCycle should look and work like:

g := graph.New(graph.IntHash)

// Add vertices and edges ...

cycles, _ := graph.SimpleCycles(g)

fmt.Println(cycles)

[[1 2 4] [2 3 5]]

The function signature should be SimpleCycles[K comparable, T any](g Graph[K, T]) ([][]K, error).

Utility functions for "querying" graph?

Hi,
I'm the developer of Fasten Health (which is an open-source personal health record aggregator), that uses Graph under the hood to help "answer" patient questions using their medical records.

FHIR medical records have complicated relationships. I've stored these relationships in an acyclic directed graph using your library, but it's been a long time since I've use Graph theory professionally, and I'm running into some use-cases where native query functionality would make my life much easier.

Right now, our medical record graph easily answers the question: "Give me a list of all patient Conditions, and all related resources (encounters, practitioners, organizations, devices, locations, ect)"

I'd also like to use the graph to answer other questions like:

generate a Phonebook/list of contacts - "create a list of all Locations, Organizations and Practitioners and then find the related Conditions and/or Encounters where the patient interacted with that entity"

Since Locations, Organizations and Practitioners may not have a direct relationship with Conditions and Encounters, I have to walk the graph. Is there a utility function I can use to "query" data within the Graph?

Totally understand if none of this makes sense and it's completely out-of-scope for this library. I might just need to do more reading on Graph theory.

Resolve undefined behavior for unreachable targets in `ShortestPath`

The behavior of the current ShortestPath implementation is not clearly defined when the target vertex isn't reachable from the source vertex. The documentation says that an error is returned when the target isn't reachable, but I'm not so sure whether this is true at all... At the moment, this scenario isn't covered by too many test cases either.

The behavior should be clearly defined, documented, and tested. Further checks in the implementation to determine whether the target is reachable are needed in any case.

Is graph a DAG?

It's common practice to have a graph library where you are primarily dealing with DAG's but one in which you first add nodes and edges, and then later on determine if there is a cycle. The specific cycle found is sometimes useful information. We do this a lot in https://github.com/purpleidea/mgmt/

We have our own internal graph library. I thought about switching it out for this one, but the following feature seems currently impossible:

Create a graph.
Add vertices and edges
When I'm ready, determine if there is a cycle. (eg: topological sort)

The problem with this lib is that a DAG will block this at AddEdge time... And there is not AFAICT a topological sort function for non DAG's.

See here:

if !isDAG(g) {
return nil, errors.New("topological sort can only be performed on DAGs")
}

from:

graph/dag.go

Line 15 in a1affd9

if !isDAG(g) {

Look at AddEdge as well:

graph/directed.go

Line 63 in a1affd9

// If the graph was declared to be acyclic, permit the creation of a cycle.

Note I think the above comment has the wording logic backwards. I can send a patch if you like.

Cheers!

Escape node id in graphviz dot diagram

Hello, this is a really cool library. Thanks so much for creating it!

I found a small bug in the dotTemplate used to render graphviz dot diagrams; it doesn't quote node ids or escape special characters.

Could we improve support by adding quotes to the dotTemplate and escaping any quotes in the node id?

Here's an id i've got in my graph for reference: /home/folder:hello.world.

If i manually quote the strings in the generated dot diagram it works.

strict digraph {

	"/home/folder:hello.world" -> "something/else" [  weight=0 ];

}

Has graph supported finding all paths between Vertex A and Vertex B ？

draw.DOT escaping

Using draw.DOT over a network with valid golang string nodes, generates invalid .gv files if any string contain some special characters

example:

    g := graph.New(graph.StringHash, graph.Directed())

    _ = g.AddVertex("#cat:\"&(")
    _ = g.AddVertex("@dog")
    _ = g.AddVertex("\"big bird\"")

    _ = g.AddEdge("#cat:\"&(", "@dog")
    _ = g.AddEdge("#cat:\"&(", "\"big bird\"")

    file, _ := os.Create("./mygraph.gv")
    _ = draw.DOT(g, file)

Will generate

strict digraph {

	"#cat:"&(" [  weight=0 ];

	"#cat:"&(" -> ""big bird"" [  weight=0 ];

	"#cat:"&(" -> "@dog" [  weight=0 ];

	"@dog" [  weight=0 ];

	""big bird"" [  weight=0 ];

}

which will fail trying to use dot -Tsvg -O mygraph.gv since it has non escaped characters

dot -Tsvg -O mygraph.gv
Error: mygraph.gv: syntax error in line 3 near '&'

I think a flag to optionaly escape html entities may be a good alternative

Graph attributes

Hi, as I see in https://github.com/dominikbraun/graph/blob/main/draw/draw.go current logic doesn't contains possibility to add graph attributes.
Example:

strict digraph {
  rankdir="LR";
  dpi="200.0";

  "a" [  weight=0 ];
  "b" [  weight=0 ];
  "a" -> "b" [ weight=0 ];
}

I mean rankdir and dpi attributes.

Is it okay if I add this logic if it is really not present?

Algorithms to implement

This is an umbrella issue for various algorithms that might or should be implemented mid- to long-term.

Clustering

Transitivity

Connectivity

Is connected
Strongly connected components
Weakly connected components
Condensation

Cycles

Simple cycles (#39)
Creates cycle

DAGs

Topological sort
Transitive reduction
Transitive closure

Eulerian

Eulerian circuit
Eulerian path

Paths

Dijkstra
A*
All simple paths between two vertices

Traversal

DFS
BFS
DFS tree
BFS tree

Trees

Minimum spanning tree
Maximum spanning tree

Isomorphism & Comparisons

Tree isomorphism
Graph isomorphism (#47)
Equals

Sets

Union-find

These algorithms along with their tests should live in a file named after their category, e.g. DFS is located in traversal.go and tested in traversal_test.go. They should accept a Graph[K comparable, T any] instance and vertex values (T) or vertex hashes (K) where appropriate.

(In-)Neighbors Directed Graph

Hi,

I am trying to find the in-neighbors for a node in a directed graph. To do so, I thought about using the AdjacencyMap since the documentation says:

AdjacencyMap computes and returns an adjacency map containing all vertices in the graph.
There is an entry for each vertex, and each of those entries is another map whose keys are
the hash values of the adjacent vertices.

However, it turns out, that in this example (omitting error checks for shortness):

g.AddVertex(0)
g.AddVertex(1)
g.AddVertex(2)

g.AddEdge(0, 1)
g.AddEdge(0, 2)

m, err := g.AdjacencyMap()
log.Println("Testing key 0: ", m[0])
log.Println("Testing key 1: ", m[1])
log.Println("Testing key 2: ", m[2])

we get:

Testing key 0:  map[1:{0 1 {map[] 0}} 2:{0 2 {map[] 0}}]
Testing key 1:  map[]
Testing key 2:  map[]

Empty maps for node 1 and node 2 while they do have node 0 as in-neighbor.
This is not necessarily wrong if one says that vertices are only adjacent if there is an edge from the source vertex to this vertex. However, for directed graphs this is not well-defined without further information.

Hence, I'd say at least this should be described better in the documentation. Even more, I want to propose something along a neighbors(vertex) function which could then also be configured with in- and out-neighbors in the case of directed graphs.

Let me know how you think about this proposal and if there are meaningful starting points for this.

Add support for graph visualization using Graphviz

Visualizing the graphs created with this library is an often-requested feature. The best way to implement this is to generate an output in the DOT language and let the user pass the generated file to Graphviz.

Since this is not an operation of the graph itself, the corresponding function should not be part of the Graph interface. A standalone function in an own package that accepts a Graph instance makes more sense.

The name of this package is to be determined. Maybe something like draw?

import "github.com/dominikbraun/graph/draw"

// Create a graph `g` ...

err := draw.Graph(g, draw.Directory("."))

This function should create a .dot in the current directory. There might be other functional options that could make sense.

stable topological sort

Sometimes it is desirable to have a stable output for operations such as topological sort.

There are two ways of doing this (depending on if the goal is stability, or merely determinism):

remember the order in which edges were declared, and use that order when walking;
or, sort the edges of each node in a deterministic way before walking.

I tend to favor the former approach, because it provides both properties at once (and also because it feels more general: edge order is a nice tiebreaker to have in a lot of scenarios, not just this one). But either works for producing deterministic outputs.

The current implementation can produce random reordering among nodes that aren't constrained. (I assume this is because a golang map is used somewhere.) Sometimes this is fine! But in some scenarios, less fune: unstable output ordering is a big bummer if you have application level predictability desires... Or even just are doing text fixtures which depend on consistency :)

TransitiveReduction intermittently fails with "transitive reduction cannot be performed on graph with cycle"

The following code rarely completes more that 10 iterations without failing with:

transitive reduction cannot be performed on graph with cycle

Code:

package main

import (
	"fmt"
	"os"
	"testing"

	"github.com/dominikbraun/graph"
)

func TestGraphStandard(t *testing.T) {

	for i := 1; i <= 100; i++ {

		vRoot := "_root"
		vA := "A"
		vB := "B"
		vC := "C"
		vD := "D"
		vE := "E"
		vF := "F"

		g := graph.New(graph.StringHash, graph.Directed(), graph.Acyclic())

		g.AddVertex(vRoot)
		g.AddVertex(vA)
		g.AddVertex(vB)
		g.AddVertex(vC)
		g.AddVertex(vD)
		g.AddVertex(vE)
		g.AddVertex(vF)

		g.AddEdge(vRoot, vA)
		g.AddEdge(vRoot, vB)
		g.AddEdge(vRoot, vC)
		g.AddEdge(vRoot, vD)
		g.AddEdge(vRoot, vE)
		g.AddEdge(vRoot, vF)

		g.AddEdge(vE, vC)
		g.AddEdge(vF, vD)
		g.AddEdge(vF, vC)
		g.AddEdge(vF, vE)
		g.AddEdge(vC, vA)
		g.AddEdge(vC, vB)

		_, err := graph.TransitiveReduction(g)
		if err != nil {
			fmt.Println(err.Error())
			os.Exit(1)
		}

		fmt.Printf("iteration: %d\n", i)

	}
}

eg output:

Running tool: /opt/homebrew/bin/go test -timeout 30s -run ^TestGraphStandard$ bugtesting -v

=== RUN   TestGraphStandard
iteration: 1
iteration: 2
iteration: 3
iteration: 4
iteration: 5
iteration: 6
iteration: 7
transitive reduction cannot be performed on graph with cycle
FAIL	bugtesting	1.079s

Version: v0.16.0
Go version: 1.20.1 (macOS 12.6.3 homebrew default)

Make edge layout and memory consumption configurable

Currently, an edge is represented as an Edge[T any], where the source and target vertices are of type T. This is convenient when retrieving an edge, because the user can operate on the source and target value directly - but the downside is that if T is a larger custom type, the memory consumption is equally large. In fact, each instance of T is stored multiple times: In vertices, edges, outEdges, and inEdges.

The other option is to only store the hash values of the vertices instead of the vertices themselves. This means that the user has to look up the actual vertices by their hash values first, but the memory consumption will be lower in cases where there are large custom data types.

All in all, this is a trade-off between memory efficiency and developer-friendliness. Whether one or the other is preferable depends on the use case, and therefore, this behavior should be configurable.

My proposal is to introduce a new functional option that can be passed to New when creating a graph. It should make Edge[T any] store hash values instead of vertices (which is possible because T is any and hash values are of type K comparable). This function could be called WithHashValues:

g := graph.New(myHashingFunction, graph.Directed(), graph.WithHashValues())

Another name could be WithHashEdges or something like that.

Add support for vertex attributes in `draw` package

Coloring edges when rendering a graph using draw.DOT is no problem, because edges can have arbitrary attributes:

_ = g.AddEdge("A", "B", graph.EdgeAttribute("color", "red"))

These attributes are rendered into the DOT description. So everything's fine here.

This doesn't work for vertices, because vertices are of no dedicated type defined by the library, but of any type T instead. Therefore, there is no straightforward way to add "vertex attributes" like there is for edges.

At the moment, the only viable solution I see is to provide a map when rendering the graph: Each vertex hash is mapped against a set of custom attributes.

attributes := map[string]map[string]string{
    "A": map[string]string{"color": "red"},
    "B": map[string]string{"color": "blue"},
}

_ = draw.DOT(g, myFile, attributes)

I'm open to other ideas and approaches.

Update documentation to explicitly ignore errors returned by `AddVertex`

At the moment, AddVertex calls in the documentation (README + GoDoc) look like this:

g.AddVertex(1, 2)

But AddVertex returns an error, and that should be made clear in the docs. The error should be ignored explicitly, just as with the AddEdge calls:

_ = g.AddVertex(1, 2)

Again, this should be done in both the README and the code comments.

Add `Order` and `Size` methods

The Graph interface should have an Order method returning the number of vertices and a Size method returning the number of edges in the graph.

CI pipeline: Don't check code for `gofumpt`

The golangci-lint configuration in graph's CI pipeline checks for gofumpt. I don't like it. The very purpose of gofmt is to provide an official, unified formatting for all Go projects, and gofumpt pretty much defeats this purpose.

If you format your code with gofmt, the pipeline may still fail because it checks for gofumpt -extra. I want this to be removed.

New `ShortestPaths` function for computing multiple shortest paths at once

In situations with a fixed starting vertex and a number of different target vertices, calling ShortestPath for all of these vertices is quite expensive. Because ShortestPath already computes the cost of reaching a vertex from the start for all vertices in the graph, the algorithm can possibly easily be optimized for multiple targets.

This should be implemented in a new function ShortestPaths[K comparable, T any](g Graph[K, T], source K, targets ...K) ([][]K, error) where [n][]K contains the vertex hashes of the shortest path for targets[n].

The difference to the current implementation is that the backtracking needs to be performed for all target vertices.

graph/paths.go

Lines 130 to 142 in f1f0bea

    
           // Backtrack the predecessors from target to source. These are the least-weighted edges. 
        
           path := []K{target} 
        
           hashCursor := target 
        
           for hashCursor != source { 
        
           	// If hashCursor is not a present key in bestPredecessors, hashCursor is set to the zero 
        
           	// value. Without this check, this leads to endless prepending of zeros to the path. 
        
           	if _, ok := bestPredecessors[hashCursor]; !ok { 
        
           		return nil, fmt.Errorf("vertex %v is not reachable from vertex %v", target, source) 
        
           	} 
        
           	hashCursor = bestPredecessors[hashCursor] 
        
           	path = append([]K{hashCursor}, path...) 
        
           }

Change `TopologicalSort` and `TransitiveReduction` to fail at runtime for acyclic graphs

With release 0.12.0, TopologicalSort and TransitiveReduction check whether the graph actively prevents the creation of cycles with the PreventCycles option being set.

In the future, those functions should not rely on the graph not having any cycles, but should just run and fail if a cycle is detected. The preceding isDag() check should be removed.

ShortestPath does not acually return the shortest path on a big graph

This question is from advent of code Day 12 part 1. https://adventofcode.com/2022/day/12

Problem

The heightMap is a 2d array with shape [41 143], when I create a graph and input all the data, ShortestPath returns a path that looks to be semi random, ranging from 500 to 600.

After this failed I implimented my own Dijkstra's Algorithm and got the accepted answer 468. Maybe you can look into Dijkstra's Algorithm?

This method works fine with the small example on the question page, but no the much bigger input data.

Code

func findShortestPathToHighest(heightMap [][]int, start Coordinate, end Coordinate) int {
	coordinateHash := func(c Coordinate) string {
		return strconv.Itoa(c.X) + "," + strconv.Itoa(c.Y)
	}
	g := graph.New(coordinateHash, graph.Directed())
	for y, line := range heightMap {
		for x, _ := range line {
			err := g.AddVertex(Coordinate{X: x, Y: y})
			if err != nil {
				panic(err)
			}
		}
	}

	for y, line := range heightMap {
		for x, height := range line {
			neighbors := getNeighbors(heightMap, x, y)
			for _, neighbor := range neighbors {
				if heightMap[neighbor.Y][neighbor.X] <= height+1 {
					err := g.AddEdge(coordinateHash(Coordinate{X: x, Y: y}), coordinateHash(neighbor))
					if err != nil {
						panic(err)
					}
				}
			}
		}
	}
	shortest := 999999999
	for i := 0; i < 1000; i++ {
		path, err := graph.ShortestPath(g, coordinateHash(start), coordinateHash(end))
		if err != nil {
			panic(err)
		}
		if len(path)-1 < shortest {
			println("new shortest:", len(path)-1)
			shortest = len(path) - 1
		}
	}
	return shortest
}

My input data

https://gist.github.com/foxwhite25/06ee39218c9f3340056d2a14b81b3f50

Function for creating a graph with the same traits as another graph

There should be a function (or method) that creates (or updates) a graph so that the graph has the same traits and hash as a given input graph, for example NewWithSameTraitsAs[K comparable, T any](g Graph[K, T]) (Graph[K, T], error) or Graph.TraitsFrom[K comparable, T any](source Graph[K, T]). Note that these two suggestions have the problem that they don't indicate that the hashing function is adopted as well.

This call should then be replaced accordingly:

graph/trees.go

Line 38 in af07d94

mst := New(g.(*undirected[K, T]).hash)

Method for adding edges from another graph

There should be a method for adding edges from another graph, for example AddEdgesFrom[K comparable, T any](source Graph[K, T]) error. Maybe accepting a []Edge[K] instead of a Graph[K, T] might make sense, too. In that case, a new method for retrieving all edges as a []Edge[K] would be necessary as well.

Implement priority queue using a binary heap

The current priority queue implementation uses a simple slice to store its items.

This approach is simple, but because all the entries in the queue need to be sorted by priority descendingly, inserting a new item has O(n) performance. Using a binary heap will optimize this towards O(log n) time complexity.

Configure golangci-lint for CI pipeline

graph's CI pipeline should perform more extensive linting using golangci-lint.

A pretty standard linter configuration would suffice here, the only particularity that needs to be taken into account is that GoDoc comments should only go until column 100.

	// Backtrack the predecessors from target to source. These are the least-weighted edges.
	path := []K{target}
	hashCursor := target

	for hashCursor != source {
	// If hashCursor is not a present key in bestPredecessors, hashCursor is set to the zero
	// value. Without this check, this leads to endless prepending of zeros to the path.
	if _, ok := bestPredecessors[hashCursor]; !ok {
	return nil, fmt.Errorf("vertex %v is not reachable from vertex %v", target, source)
	}
	hashCursor = bestPredecessors[hashCursor]
	path = append([]K{hashCursor}, path...)
	}