
benchmarks's Introduction

benchmarks

Fortran benchmarks

benchmarks's People

Contributors

arunningcroc, certik, euler-37, milancurcic


benchmarks's Issues

precision problem for optimized.f90

y<0.8
should be changed to y<0.8_dp

and

a=0.01

should be changed to a=0.01_dp

With these changes, the result (using GCC 11.2.0 on Windows, from equation.com, with flag -Ofast) is:

Fortran:  0.45312500000000000            46337
C      : iters 46213 Execution time: 0.473000
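
For context, a minimal sketch of the underlying pitfall (the variable names are illustrative, not taken from optimized.f90): a default-real literal such as 0.01 carries only single precision, even when it is assigned to or compared against a real(dp) value.

program literal_precision
! Illustrative sketch, not code from optimized.f90: default-real literals
! are single precision, and the rounding error survives widening to real64.
use, intrinsic :: iso_fortran_env, only: dp => real64
implicit none
real(dp) :: a_default, a_dp
a_default = 0.01      ! single-precision literal, widened on assignment
a_dp      = 0.01_dp   ! true double-precision literal
print *, a_default    ! ~0.0099999998 (single-precision rounding visible)
print *, a_dp         ! 0.010000000000000000
end program literal_precision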

Random number generation

There are several algorithms that might be good for benchmarks:

LCG algorithm:

subroutine lcg_int32(x)
! Linear congruential generator (Park-Miller "minimal standard")
! https://en.wikipedia.org/wiki/Linear_congruential_generator
use, intrinsic :: iso_fortran_env, only: int32, int64
integer(int32), intent(out) :: x
integer(int32), save :: s = 26250493
! Multiply in 64-bit: s*48271 overflows a 32-bit integer.
s = int(modulo(int(s, int64) * 48271_int64, 2147483647_int64), int32)
x = s
end subroutine

subroutine lcg_real32(x)
! Uniform deviate in [0, 1) derived from lcg_int32
use, intrinsic :: iso_fortran_env, only: int32, real32
real(real32), intent(out) :: x
integer(int32) :: s
call lcg_int32(s)
x = s / 2147483647._real32
end subroutine
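
A minimal driver for sanity-checking the stream (the wrapper program below is an assumption, not part of the proposal; the two routines are repeated as internal procedures so the example is self-contained):

program lcg_demo
use, intrinsic :: iso_fortran_env, only: int32, int64, real32
implicit none
integer :: i
real(real32) :: r
do i = 1, 5
  call lcg_real32(r)
  print *, r  ! identical sequence on every run and with every compiler
end do
contains
  subroutine lcg_int32(x)
    integer(int32), intent(out) :: x
    integer(int32), save :: s = 26250493
    ! 64-bit intermediate avoids int32 overflow in s*48271
    s = int(modulo(int(s, int64) * 48271_int64, 2147483647_int64), int32)
    x = s
  end subroutine
  subroutine lcg_real32(x)
    real(real32), intent(out) :: x
    integer(int32) :: s
    call lcg_int32(s)
    x = s / 2147483647._real32
  end subroutine
end program lcg_demo

One motivation for such a generator over the intrinsic random_number is that the stream is reproducible across compilers, so every benchmark implementation consumes identical random inputs.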

Displaying results

I think it would be good if the final output is HTML+JSON that we can serve up at fortran-lang.org/benchmarks via gh-pages.
This can include various customizable and interactive plots to allow easily comparing different scenarios, and links for raw CSV and JSON download.

Some additional features that would be really nice for each benchmark case: the ability to show compiler reports for optimisation and vectorisation, as well as disassembled output (as on godbolt.org) for checking the generated instructions.

Should we benchmark languages other than Fortran, why, and how?

I see great value in implementing a variety of simple yet real-world algorithms in Fortran and benchmarking them along multiple axes:

  • Different problem sizes (e.g. array or matrix size)
  • Different compilers
  • Different optimization flags
  • Different hardware

How about different languages? What would be the main purpose of that?

Are we interested in comparing the performance of Fortran and other language implementations using idiomatic, naive code (i.e. the code that a novice would write), and thus comparing the compilers' ability to optimize?

Or are we interested in writing code in different languages that produces the same (or as similar as possible) assembly, and then comparing the source code?

Automatic benchmarks via Github workflow

I created a repository (St-Maxwell/benchmark) that uses a GitHub workflow to perform automatic benchmarks and publish the results to GitHub Pages. I imitated Julia's Microbenchmarks to accomplish this prototype.

From working on these automatic benchmarks, I think fortran-lang/benchmarks requires the following:

  • a basic benchmark framework for each language
  • a workflow for the automatic benchmarks (and deployment to GitHub Pages)
  • documentation

I believe my repository is helpful for the first two tasks. Of course, it needs further polishing. So I'd like to hear your suggestions. Thanks!

USM3D CFD benchmark

Discussed here, summarized here, and at Hacker News here. Quoting the 2nd source,

"Hunter benchmarks USM3D, is described by NASA as “a tetrahedral unstructured flow solver that has become widely used in industry, government, and academia for solving aerodynamic problems. Since its first introduction in 1989, USM3D has steadily evolved from an inviscid Euler solver into a full viscous Navier-Stokes code.”

As previously noted, this is a computational fluid dynamics test, and CFD tests are notoriously memory bandwidth sensitive. We’ve never tested USM3D at ExtremeTech and it isn’t an application that I’m familiar with, so we reached out to Hunter for some additional clarification on the test itself and how he compiled it for each platform. There has been some speculation online that the M1 Ultra hit these performance levels thanks to advanced matrix extensions or another, unspecified optimization that was not in play for the Intel platform."

Benchmark criteria

We should decide on criteria for what makes a good/suitable benchmark problem and how it can be implemented.

Himeno benchmark

Dr. Ryutaro Himeno, Director of the Advanced Center for Computing and Communication, developed this benchmark to evaluate the performance of incompressible fluid analysis code. The benchmark measures the speed of the major loops that solve Poisson's equation using the Jacobi iteration method.

Because the code is very simple and easy to compile and execute, users can measure actual speed (in MFLOPS) immediately.

Full link: https://i.riken.jp/en/supercom/documents/himenobmt/

The benchmark has been used in recent HPC papers and presentations.

The benchmark appears closely related to the current Poisson2d benchmark, but instead features a 3D Jacobi stencil.

I've seen similar (MPI-enabled) Jacobi benchmarks before in the book of Hager & Wellein, Introduction to High Performance Computing for Scientists and Engineers.
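
For reference, a minimal sketch of a single 3D Jacobi sweep (this is not the Himeno kernel, which uses a larger, coefficient-weighted stencil; the array names and coefficients here are simplified assumptions):

subroutine jacobi3d_sweep(p, pnew, n)
! One Jacobi iteration for the 3D Poisson problem on an n^3 grid.
! Simplified 7-point stencil; boundary values are left unchanged.
use, intrinsic :: iso_fortran_env, only: dp => real64
integer, intent(in) :: n
real(dp), intent(in) :: p(n,n,n)
real(dp), intent(out) :: pnew(n,n,n)
integer :: i, j, k
pnew = p
do k = 2, n-1
  do j = 2, n-1
    do i = 2, n-1
      ! Average of the six nearest neighbours
      pnew(i,j,k) = (p(i-1,j,k) + p(i+1,j,k) + p(i,j-1,k) &
                   + p(i,j+1,k) + p(i,j,k-1) + p(i,j,k+1)) / 6.0_dp
    end do
  end do
end do
end subroutine jacobi3d_sweep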

Should we rename the repository from `benchmarks` to `idioms`?

See #10 for the background discussion. @rouson and I brainstormed a better name, and we came up with idioms.

It seems idioms communicates better what we are trying to achieve:

  • Have mainly idiomatic code (in each language) showing how to solve a given problem
  • Document each problem: give the mathematical background, then several versions in each language showing how to solve it
  • We'll also have non-idiomatic code that tries to extract the best performance, but the idiomatic code could help compilers optimize better
  • What is "idiomatic" is subjective, so we can and should have several different versions
  • As a user, I would love to browse the approaches to solving a given problem, even just in Fortran, but also in C++, Julia, Python, and other languages, both to learn and to compare how "easy" it is to write something like this myself in a given language.
  • I would like to see timings for each version with various compilers, options, and platforms (but this is only part of the goal)

Where to run the benchmarks for the published results?

Users will be able to run the benchmarks locally. However, how do we choose what kind of machine to run the benchmarks on for the published results? Should it just be a reasonably recent server, with the machine specs documented alongside the benchmarks, as https://julialang.org/benchmarks/ did?

For multi-compiler benchmarks this may be tricky. With several compilers we can build for x86, but some will be specific to the vendor hardware (IBM, Cray, ARM). Should we consider these as well?

Considering the support from the community so far and the potential impact of fortran-lang, it shouldn't be difficult to get a dedicated cloud instance donated down the road.

In the meantime, perhaps it will be good enough to just run on a recent workstation that one of us owns.

Choosing License

While studying the stopwatch links @ivan-pi posted in #6, it occurred to me that each one was under a different license. While preparing benchmarks, we will need to use software written in different languages under different licenses. As I understand it, the benchmarks will be part of the fortran-lang website and therefore under the MIT license.
What happens in the following scenario? I am preparing an N-body benchmark and have already coded the Fortran version under MIT. To start comparisons, I copied the C++ version from rosettacode.org, which is under the GFDL (v1.2), and I will do the same for the Julia version. If I then choose to time it with, e.g., the second link in #6, which is under GPLv3, will there be a conflict/problem? Can I publish the codes and the results, and if so, under what license?

Benchmark driver

We need a framework for compiling and running benchmark cases across a test matrix and, if necessary, collecting and post-processing the results.

The main dimensions for the test matrix are:

  1. Different compilers (and eventually different languages)
  2. Different optimisation levels

and there may be others. The framework needs to be able to run locally and in CI for testing.

This can be achieved with a makefile, though this may not be as flexible as a custom solution.

TeaLeaf benchmark

TeaLeaf is a mini-app that solves the linear heat conduction equation on a spatially decomposed regular grid using a 5-point stencil with implicit solvers.

The GitHub repository can be found here: https://github.com/UK-MAC/TeaLeaf

If nothing else, it can be added to a list of third-party benchmarks.
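
For illustration, a minimal sketch of the kind of 5-point stencil operator such an implicit solver applies at each iteration (the names and coefficients are assumptions, not taken from TeaLeaf):

subroutine apply_5pt(u, au, nx, ny, rx, ry)
! Matrix-free application of (I - dt*L) for the 2D heat equation,
! discretized with the 5-point stencil; boundaries are left untouched here.
! rx and ry play the role of dt/dx**2 and dt/dy**2 (assumed names).
use, intrinsic :: iso_fortran_env, only: dp => real64
integer, intent(in) :: nx, ny
real(dp), intent(in) :: u(nx,ny), rx, ry
real(dp), intent(out) :: au(nx,ny)
integer :: i, j
au = u
do j = 2, ny-1
  do i = 2, nx-1
    au(i,j) = (1.0_dp + 2.0_dp*rx + 2.0_dp*ry) * u(i,j) &
            - rx*(u(i-1,j) + u(i+1,j)) - ry*(u(i,j-1) + u(i,j+1))
  end do
end do
end subroutine apply_5pt

An implicit solver (e.g. conjugate gradient, one of the solvers TeaLeaf provides) applies such an operator repeatedly, which is why these kernels stress memory bandwidth.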

add topics

I suggest adding the topics benchmark, benchmarks, fortran, python, c in the About section.

HPCG Benchmark

Another HPC benchmark: http://hpcg-benchmark.org/. The reference version is C++ with MPI and OpenMP parallelization.

A co-array Fortran version of the benchmark is discussed in the book by Robert W. Numrich, Parallel programming with co-arrays, Chapman & Hall / CRC Press, 2019. It also includes a very interesting performance analysis.

Write code as tests

It would be a really big time-saver for someone encountering these codes for the first time if every code is written as a test that reports success or failure so that a newcomer doesn't have to digest the entire algorithm and then read verbose output to determine whether a particular run succeeded or failed. End each program with something along the lines of

  block
    logical, parameter :: testing = .true.
    if (testing) then
      call verify(calculation_result) ! error terminate if the calculation failed
      print *, "Poisson test passed."
    end if
  end block

where making testing a compile-time constant allows an optimizing compiler to completely eliminate the verification code during a dead-code removal phase when you want to do runs to measure performance. You could use a preprocessor macro to switch the value to false when so desired.
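
A minimal sketch of what such a verify routine could look like (the expected value and tolerance below are placeholders, not from any benchmark; error stop gives CI a nonzero exit status on failure):

subroutine verify(result)
! Hypothetical checker: error-terminates when the result is out of tolerance.
use, intrinsic :: iso_fortran_env, only: dp => real64
real(dp), intent(in) :: result
real(dp), parameter :: expected = 0.0_dp, tol = 1.0e-10_dp
if (abs(result - expected) > tol) error stop "Poisson test FAILED"
end subroutine verify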

Even better would be to adopt a unit-testing framework that automates the execution of all the tests. I recommend Vegetables.

Timing routines

For benchmarking purposes it would be useful to have a set of timing routines.

Some prior art exists. In Julia, for example, two options are used:

  • the @time macro, which measures the time taken to execute an expression
  • the BenchmarkTools.jl package including the @btime macro which executes the expression multiple times and uses regression to reduce noise.

A week or two ago, I tried to build some timing macros using fypp:

#:def NTIC(n=1000)
  #:global BENCHMARK_NREPS
  #:set BENCHMARK_NREPS = n
  block
    use, intrinsic :: iso_fortran_env, only: int64, dp => real64
    integer(int64) :: benchmark_tic, benchmark_toc, benchmark_count_rate
    integer(int64) :: benchmark_i
    real(dp) :: benchmark_elapsed
    call system_clock(benchmark_tic,benchmark_count_rate)
    do benchmark_i = 1, ${BENCHMARK_NREPS}$
#:enddef

#:def NTOC(*args)
    #:global BENCHMARK_NREPS
    end do
    call system_clock(benchmark_toc)
    benchmark_elapsed = real(benchmark_toc - benchmark_tic, dp)/real(benchmark_count_rate, dp)
    benchmark_elapsed = benchmark_elapsed/${BENCHMARK_NREPS}$
  #:if len(args) > 0
    ${args[0]}$ = benchmark_elapsed
  #:else
    write(*,*) "Average time is ",benchmark_elapsed," seconds."
  #:endif
  end block
  #:del BENCHMARK_NREPS
#:enddef

These can be used then as follows:

  real :: x(1000), y(1000), avg_time
  call random_number(x)

  @:NTIC(100)
  y = sqrt(x)
  @:NTOC() ! print average time

  @:NTIC(100)
  y = sqrt(x)
  @:NTOC(avg_time) ! save average time to variable

Perhaps a combination of a StopWatch class and some fypp macros could enable us to do regression tests similar to those done in Julia.
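
As a starting point, a minimal sketch of what such a StopWatch class could look like (this is an assumed design, not an existing fortran-lang API):

module stopwatch_mod
use, intrinsic :: iso_fortran_env, only: int64, dp => real64
implicit none
private
public :: stopwatch

type :: stopwatch
  integer(int64), private :: tic_count = 0_int64
contains
  procedure :: tic
  procedure :: toc
end type stopwatch

contains

subroutine tic(self)
  ! Record the current clock count.
  class(stopwatch), intent(inout) :: self
  call system_clock(self%tic_count)
end subroutine tic

function toc(self) result(elapsed)
  ! Seconds elapsed since the last call to tic.
  class(stopwatch), intent(in) :: self
  real(dp) :: elapsed
  integer(int64) :: now, rate
  call system_clock(now, rate)
  elapsed = real(now - self%tic_count, dp) / real(rate, dp)
end function toc

end module stopwatch_mod

Usage would then be: declare type(stopwatch) :: sw, call sw%tic() before the kernel, and read sw%toc() after it; the fypp macros above could wrap the repetition loop around this.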

Ising model

I just found a nice example of idiomatic Fortran in the work

Reid, J. K. (1990). Fortran 8X features and the exploitation of parallelism. In Scientific Software Systems (pp. 102-111). Springer, Dordrecht. https://doi.org/10.1007/978-94-009-0841-3_7

It is based on an example by Alan Wilson showing a simple Ising model, a well-known Monte Carlo simulation, here in 3-dimensional space.


The paper with Wilson is the following:

Reid, J. K., & Wilson, A. (1985). The array features in FORTRAN 8x with examples of their use. Computer Physics Communications, 37(1-3), 125-132. https://doi.org/10.1016/0010-4655(85)90144-4
