Explore multi layer networks and infomap and make proposals for how Graphoscope could support this.

Update NetworkDensity for FGraph

As mentioned in #33, the network density on FGraph works with the calculations for maximum edge count in reference to undirected graphs. Update this to match the directed character of the FGraph.

Description
The Triangle Count algorithm counts the number of triangles for each node in the graph. A triangle is a set of three nodes where each node has a relationship to the other two. In graph theory terminology, this is sometimes referred to as a 3-clique. The Triangle Count algorithm in the GDS library only finds triangles in undirected graphs.

Pointers
The clustering coefficient makes use of triangles, which might give some ideas about the implementation.

References
https://neo4j.com/docs/graph-data-science/current/algorithms/triangle-count/

Define and implement core data structure

This should involve comparing different options in terms of performance and usability.

Define initial list of graph types and their ideal data structures.

Further info is here

[Feature Request] `FGraph.getNodes`

Atm., it is tedious to simply retrieve all nodes from an FGraph. It'd be nice to have a built-in function for this.

Current workaround (provided by @LibraChris):

let getNodes (graph : FGraph<'Nk,'Nd,'Ed) =
    graph
    |> Seq.map (
	fun kvp ->
            let nodeKey = kvp.Key
            let p,nd,s = kvp.Value
            nodeKey, nd
    )

Add literature reference in code comment for all measure and algorithms.

.Using the “Hierarchy of truth” -> Atlas, Networkx, Konnect, Wikipedia

Dijkstra overhaul

Dijkstra for FGraph assumes that 'EdgeData = float.
Rewrite the function and add parameter function that converts 'EdgeData to float.

Impment Random graph generators

Ideally Erdos-Renyi and Watts–Strogatz on the core structure.

Implement project structure

Following the sprint 2 discussion and any further input from Timo please set up the project with the following principles:

Overloaded methods for basic operations so users don't need to open different modules for different data structures

Different namespaces and/or modules for directed and undirected graphs
For now we can treat unweighted graphs as having a weight of 1 in every edge.

Consistency in definition for retrieving incoming and outgoing edges/neighbours in DiGraph

Discussed in #40

^{Originally posted by general-rishkin August 29, 2023}
For a DiGraph G, to retrieve the incoming neighbours of a node (say, node 1), you write:

1 |> DiGraph.getInEdges G

whereas to retrieve the outgoing neighbours, you write:

G |> DiGraph.getOutEdges 1

It would be more consistent to select either:

1 |> DiGraph.getInEdges G
1 |> DiGraph.getOutEdges G

OR

G |> DiGraph.getInEdges 1
G |> DiGraph.getOutEdges 1
```</div>

Immutability when consuming FGraph

Might be a matter of taste but when using FSX files I find immutability very helpful.

Should we update operations like removeNode to create, update and return a copy of the passed in graph.

This keeps the usage closer to a functional F# style and removes cognitive overhead of mutating objects in scope of interactive sessions. Any thoughts?

Decide on matrix structure

discuss with Benedict to ensure alignment with fsharp.stats

Current options include using libtorch

Implement Watts-Strogatz Model for Small-World Networks

Overview:

The Watts-Strogatz model, introduced by Duncan J. Watts and Steven Strogatz in 1998, is a fundamental concept in network science and complex systems. It allows us to simulate and understand small-world networks, which possess unique characteristics including high local clustering and short average path lengths between nodes.

Problem Statement:

We need to implement the Watts-Strogatz model to create small-world networks within our project. This model will be invaluable for studying network structures in various applications such as social networks, transportation systems, and more.

Expected Behavior:

Create a regular lattice network as the starting point.

Implement a random rewiring process with a parameter 'p' to introduce randomness.
Ensure that, as 'p' varies from 0 to 1, the network transitions from a regular lattice to a small-world network with shorter average path lengths while maintaining high local clustering.

Acceptance Criteria:

Code implementation of the Watts-Strogatz model.
Ability to configure the 'p' parameter to control the level of randomness.
Verify that the model generates networks with appropriate small-world characteristics.

Additional Information:

The Watts-Strogatz model has wide-ranging applications in fields such as sociology, epidemiology, and network analysis. Implementing this model will enhance our project's capability to simulate and analyze complex systems and networks.

Sources

Implement Dijkstra for Adjacency Matrix

Parallelise and produce benchmarks is possible.

Restructure namespaces

Rename of DiGraph to LGraph and restructure top level namespaces.

Following structure should be used (Note we can discuss changing this structure in sprint 6.)

Barabási-Albert model doesn't appear to work

I have tried it for small graphs but it never seems to finish.

#r "nuget: Graphoscope, 0.2.0"
open Graphoscope.RandomModels
open Graphoscope
open FGraph

let N = 10
let edgesPerIteration = 2
let myBarabasiAlbert = BarabasiAlbert.initFGraph  N edgesPerIteration id id (fun x -> 1.0) FGraph.empty

Do you have sample which works?

thanks

Produce v0.1 Module for Directed Graph

Using type alias approach with functions and mutable objects. Key decision is whether to use dictionarys or resize arrays. The module is likely to be combination of FGraph and DiGraph. Need to determine if an IdMap object significantly improves performance and on which operations.

Improve test coverage

Aim to double test coverage to about 60% for both LGraph and FGraph

Benchmark conversion to Adjacency Matrix

Do this for both for DiGraph and FGraph.

This is a preliminary step for the Floyd-Warshall algorithm.

Implement basic operations on core graph representation

Further to this discussion implement agreed syntax for

Add Node
Get Edge
Create Edge
Get Weight
Get Neighbours
Get Degree

Implement 2 additional graph representations with conversion functions to core structure

The other representation will be defined by Michele as per this issue. Chose two that you think will be useful :)

Conversions will be to core structure defined here

Further discussion is here

Implement Floyd-Warshall for Adjacency Matrix

And produce benchmarks.

Identify verification graph(s)

We need one or more graphs of sufficient size and complexity to be used for testing. Find such a graph and source of reliable metrics for test assertions.

Implement "copy" functionality for Graph and DiGraph

Description
Our ultimate goal is to leave the decision to use mutation or not to the user. Mutation makes sense for heavy duty operations that run on the cloud. But while developing and working with smaller graphs mutation gets in the way and can be unwieldy to use. The initial step for this is to have a deep copy functionality.

Pointers
https://learn.microsoft.com/en-us/dotnet/api/system.object.memberwiseclone?view=net-7.0

Implement wedge count

Description
The Wedge Count algorithm, like the Triangle Count algorithm, is a graph analysis technique used to identify certain structural patterns within a graph. In this case, it focuses on detecting wedges. A wedge (sometimes called Triad) is a collection of three nodes in the graph where two of these nodes share a direct edge or relationship, and the third node serves as a common neighbour to the other two. This concept is often referred to as a "2-clique" in graph theory.

Pointers
The clustering coefficient makes use of triangles, which might give some ideas about the implementation.

References

http://konect.cc/statistics/twostars/

Implement Star graph generator

Description
Star graph is a special type of graph in which n-1 vertices have degree 1 and a single vertex have degree n – 1. This looks like n – 1 vertex is connected to a single central vertex.

Pointers
Assume graph is undirected.

References
https://www.geeksforgeeks.org/check-star-graph/
https://networkx.org/documentation/stable/reference/generated/networkx.generators.classic.star_graph.html

Setup nuget publication process

We need to process for automating the publishing of nuget packages from sprint releases.

This will make Michele's testing feedback easier (he can reference the current version in fsx rather than having to build locally) and I think rapid iterative publication will improve quality faster.

I think we could publish v0.1 to the public nuget feed. Clearly we will need to caveat it with "alpha version - unfit for consumption" etc. Alternatively we could use a private package manager.

Happy to discuss.

loopcount for DiGraph does not work

The following code to obtain the loop count of a digraph fails. none of the suggestions from the error words. the FGraph version works fine.

let mutable lg: DiGraph<int, float> = 
  DiGraph.createFromEdges [| (0, 2, 1.0); (0, 1, 1.0)|]
  |> DiGraph.addEdge (1, 2, 1.0)
  |> DiGraph.addEdge (0, 1, 1.0)

Measures.Loop.loopCount lg

typecheck error Value restriction. The value 'it' has been inferred to have generic type
val it: '_a
Either define 'it' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.

Update documentation to introduce alternative graph structures and their benefits .

Resolve Type-Based Overload Issue in Compute Functionality

Issue Description:

In the current state of the GraphoscopeRestructure branch, the compute functions for directed and undirected FContextMaps are facing a challenge where they cannot share the same name. Consequently, they have been temporarily separated into compute and computeUndirected. This distinction was introduced as a workaround until a more sustainable solution can be implemented.

Longest path algorithm

Hello,

I picked up about the existence of this library during the Data Science in F# conference!
There are two ways to determine the shortest path in a graph and I'd like to know if it would be difficult to get a longest path algorithm.

My use case is the following: recently, a new F# compiler flag was added (--test: GraphBasedChecking). This allows your files to be type-checked in parallel where possible. In short, a graph of a project is being constructed that lists all the dependencies (edges) between files.
In the theoretical scenario that you have infinite cores for maximum paralyzation, the times it will take to type-check a code base would be the longest chain in the graph.

As I'm not quite familiar (and clever) enough to figure this out, I'm wondering if this could be added to this library by someone. I naively hope this isn't much work since the shortest path is already available. In return, I'd be happy to help out with something else. A blog post about the compiler flag comes to mind and there I could showcase this package and expose it outside the pure data science scene. Or anything else really.

Happy to hear your thoughts!

Cheers,

Florian

Rename Degree measures of FGraph to match DiGraph implementation

Rename Degree measures of FGraph to match DiGraph implementation as seen in #33

Implement Leiden Community Detection

Description
Community detection is often used to understand the structure of large and complex networks. One of the most popular algorithms for uncovering community structure is the so-called Louvain algorithm. We show that this algorithm has a major defect that largely went unnoticed until now: the Louvain algorithm may yield arbitrarily badly connected communities. In the worst case, communities may even be disconnected, especially when running the algorithm iteratively. In our experimental analysis, we observe that up to 25% of the communities are badly connected and up to 16% are disconnected. Leiden algorithm overcomes these problems.

Pointers
Work your way up from Louvain as it is described in the Appendix of the paper to Leiden.

References
https://www.nature.com/articles/s41598-019-41695-z
https://static-content.springer.com/esm/art%3A10.1038%2Fs41598-019-41695-z/MediaObjects/41598_2019_41695_MOESM1_ESM.pdf

Implement and update Code coverage tool .

Assert some basic tests on test graphs

Tests should include
node/edge count
average degree

[Feature Request] Graph partitioning in FGraph

Graph partitioning refers to dividing a graph into smaller, non-overlapping subgraphs while optimizing some objective function or constraint.
It can be used to optimise graph reduction and fast component parting.
Since FGraph is based on a dictionary structure, graph partitioning can be based on dictionary partitioning.

Create Function overhaul

At the moment the create functions work with simple collections of tupel or tripel.
Add a create functions that work of type lists:
NodSeq as seq seq and EdgeSeq as seq.

Create graph types from reference graphs

Chose a canonical example from the reference graphs for the following graph types:

unweighted undirected
unweighted directed
weighted directed
weight undirected

Add create graph helper methods to the test lib. Graphs should be built from edgelist txt files

Produce FSdcos Documentation for v0.1

This should include section on creating graphs from edge lists and running basic operations and algorithms in an fsx environment.

Implement "copy" functionality for FGraph

Description
Our ultimate goal is to leave the decision to use mutation or not to the user. Mutation makes sense for heavy duty operations that run on the cloud. But while developing and working with smaller graphs mutation gets in the way and can be unwieldy to use. The initial step for this is to have a deep copy functionality.

Pointers
https://learn.microsoft.com/en-us/dotnet/api/system.object.memberwiseclone?view=net-7.0

Add tests to new graph structure for v0.1

Expand existing tests to cover end to end use cases from creation to running algorithms.

fslaborg / graphoscope Goto Github PK

graphoscope's People

Contributors

Stargazers

Watchers

Forkers

graphoscope's Issues