Benchmarks

Basic Benchmarks

The basic benchmarks cover fundamental parallel algorithms: matrix and vector operations such as matrix transposition, matrix inversion, vector addition, and the dot product, as well as several parallel reduction and sorting algorithms. We also selected a number of programs from Polybench, which add slightly more complex linear algebra algorithms such as convolutions, various forms of matrix multiplication and addition, covariance calculations, and finite-difference calculations.

| Program | Type | Description | Supported |
| --- | --- | --- | --- |
| BITONIC_SORT | alphaTest | Bitonic sort | ✅ |
| DOT_PRODUCT | alphaTest | Dot product | ✅ |
| MATRIX_INVERSION | alphaTest | Matrix inversion | ✅ |
| MATRIX_TRANSPOSE | alphaTest | Matrix transpose | ✅ |
| MERGE_SORT | alphaTest | Merge sort | ✅ |
| NBODY | alphaTest | Simulation of N-body problems | ✅ |
| NQUEEN | alphaTest | N-Queens | ✅ |
| PREFIX_SUM | alphaTest | Prefix sum | ✅ |
| RADIX_SORT | alphaTest | Radix sort | ✅ |
| REDUCTION_MAX | alphaTest | Maximum reduction | ✅ |
| REDUCTION_SUM | alphaTest | Sum reduction | ✅ |
| TRIANGLE_AREA | alphaTest | Heron's formula to calculate the area of a triangle | ✅ |
| VECTORADD | alphaTest | Vector addition of multiple data types | ✅ |
| VECTORDIV | alphaTest | Single-precision floating-point vector division | ✅ |
| VECTORMMA | alphaTest | Vector multiply-add with constant memory | ✅ |
| 2DCONV | Polybench | 2D convolution | ✅ |
| 2MM | Polybench | 2 matrix multiplications (`alpha * A * B * C + beta * D`) | ✅ |
| 3DCONV | Polybench | 3D convolution | ✅ |
| 3MM | Polybench | 3 matrix multiplications (`(A * B) * (C * D)`) | ✅ |
| ATAX | Polybench | Matrix transpose and vector multiplication | ✅ |
| BICG | Polybench | Matrix transpose and vector multiplication | ✅ |
| CORR | Polybench | Correlation coefficient calculation | ✅ |
| COVAR | Polybench | Covariance calculation | ✅ |
| FDTD-2D | Polybench | 2D finite-difference time-domain calculation | ✅ |
| GEMM | Polybench | General matrix multiply-add (`C = alpha * A * B + beta * C`) | ✅ |
| GESUMMV | Polybench | Scalar, vector, and matrix multiplication | ✅ |
| GRAMSCHM | Polybench | Gram-Schmidt | ✅ |
| MVT | Polybench | Matrix-vector product and transpose | ✅ |
| SR2K | Polybench | Symmetric rank-2k update | ✅ |
| SRK | Polybench | Symmetric rank-k update | ✅ |
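
To make the GEMM and 2MM expressions from the table concrete, here is a minimal CPU reference sketch in plain Python. It is not the benchmark code itself: the function names `matmul`, `gemm`, and `two_mm` are hypothetical, matrices are assumed to be small row-major lists of lists, and the grouping `(A * B) * C` in `two_mm` is one possible evaluation order. A reference like this can be useful for checking the output of a parallel implementation on small inputs.

```python
def matmul(a, b):
    """Multiply two row-major matrices given as lists of lists."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def gemm(alpha, a, b, beta, c):
    """Return alpha * A * B + beta * C without modifying the inputs."""
    ab = matmul(a, b)
    return [
        [alpha * ab[i][j] + beta * c[i][j] for j in range(len(c[0]))]
        for i in range(len(c))
    ]


def two_mm(alpha, a, b, c, beta, d):
    """Return alpha * A * B * C + beta * D, evaluated as (A * B) * C."""
    return gemm(alpha, matmul(a, b), c, beta, d)
```

For example, `gemm(2, A, B, 3, C)` scales the product of `A` and `B` by 2 and adds three times `C`; the triple loop hidden in `matmul` is the part a parallel version would distribute across threads.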

Advanced Benchmarks

The advanced benchmarks cover complex algorithms from fields such as image processing, high-performance computing, and machine learning, including a number of Rodinia benchmarks that involve medical imaging, physics simulation, and similar workloads. They also include typical PyTorch operators such as Conv2d, ReLU, and MaxPool2d, and provide several complete neural network models, such as ResNet18, AlexNet, and Yolov3.

| Program | Type | Description | Supported |
| --- | --- | --- | --- |
| rgb2gray | image | Converts an RGB image to a grayscale image | ✅ |
| Img scale | image | Image scaling | ✅ |
| NOISEREMOVEV1 | image | Image denoising | ✅ |
| NOISEREMOVEV2 | image | Image denoising optimized with shared memory | 🔲 |
| SobelFilter | image | The Sobel operator, also known as the Sobel-Feldman operator or Sobel filter, is an image edge detection algorithm widely used in image processing and computer vision. | 🔲 |
| bilateralFilter | image | The bilateral filter is a non-linear filter that combines the spatial proximity of pixels with the similarity of their values, taking both spatial information and gray-level similarity into account in order to denoise while preserving edges. It is simple, non-iterative, and local. | 🔲 |
| gaussian | Rodinia/Linear Algebra | Gaussian elimination calculates the result row by row, solving for all variables in the linear system. | ✅ |
| lud | Rodinia/Linear Algebra | LU decomposition is an algorithm for calculating the solutions of a set of linear equations. The LUD kernel decomposes a matrix into the product of a lower triangular matrix and an upper triangular matrix. | ✅ |
| heartwall | Rodinia/Medical Imaging | The Heart Wall application tracks the movement of a mouse heart over a sequence of 104 ultrasound images of 609×590 pixels to record the response to stimuli. In its initial stage, the program performs image processing operations on the first image to detect the initial, partial shapes of the inner and outer heart walls. These operations include edge detection, SRAD despeckling, morphological transformation, and dilation. To reconstruct the approximate full shapes of the heart walls, the program generates ellipses that are superimposed on the image and sampled to mark points on the heart walls (Hough search). In its final stage (heart wall tracking, the part presented here), the program tracks the movement of the surfaces by detecting the movement of the image regions under the sample points as the shapes of the heart walls change throughout the image sequence. | 🔲 |
| particle_filter_naive | Rodinia/Medical Imaging | The particle filter (PF) is a statistical estimator of the location of a target object given noisy measurements of that target's location and an idea of the object's path in a Bayesian framework. | ✅ |
| leukocyte | Rodinia/Medical Imaging | The Leukocyte application detects and tracks rolling leukocytes (white blood cells) in in vivo video microscopy of blood vessels. The speed at which leukocytes roll provides important information about the inflammatory process, which can help biomedical researchers develop anti-inflammatory drugs. In this application, cells are detected in the first video frame and then tracked through subsequent frames. Detection is done by computing, for each pixel in the frame, the maximum gradient inverse coefficient of variation (GICOV) score over a range of possible ellipses. The GICOV score of an ellipse is the average gradient magnitude along the ellipse divided by the standard deviation of the gradient magnitude. The GICOV score matrix is then dilated to simplify the process of finding local maxima. For each local maximum, an active contour algorithm is used to determine the shape of the cell more accurately. Tracking is done by first computing a motion gradient vector flow (MGVF) matrix in the region around each cell. The MGVF is a gradient field biased along the direction of blood flow, computed using an iterative Jacobian solution procedure. After the MGVF is computed, the active contour is used again to refine the shape and determine the new location of each cell. | ✅ |
| particlefilter_double | Rodinia/Medical Imaging | A particle filter (PF) is a statistical estimator of the location of a target object given noisy measurements of the target's location and an idea of the target's path in a Bayesian framework. PF has many applications, ranging from video surveillance for tracking vehicles, cells, and faces to video compression. This particular implementation is optimized for tracking cells, in particular leukocytes and cardiomyocytes. After a target object is selected, PF starts tracking it by making a series of guesses for the current frame, given what is already known from the previous frame. PF then uses a predefined likelihood model to determine how likely each of these guesses is. It normalizes the guesses based on their likelihood and sums the normalized guesses to determine the object's current location. Finally, PF updates the guesses based on the object's current location before repeating the process for all remaining frames in the video. | 🔲 |
| hotspot | Rodinia/Physics Simulation | HotSpot is a widely used tool for estimating processor temperature based on an architectural floorplan and simulated power measurements. The thermal simulation iteratively solves a series of differential equations for block temperatures. | ✅ |
| hotspot3d | Rodinia/Physics Simulation | Thermal simulation in 3D space | ✅ |
| srad_v1 | Rodinia/Image Processing | SRAD (Speckle Reducing Anisotropic Diffusion) is a diffusion method for ultrasonic and radar imaging applications based on partial differential equations (PDEs). Version 1. | ✅ |
| srad_v2 | Rodinia/Image Processing | SRAD (Speckle Reducing Anisotropic Diffusion) is a diffusion method for ultrasonic and radar imaging applications based on partial differential equations (PDEs). Version 2. | ✅ |
| nn | Rodinia/Data Mining | NN (Nearest Neighbor) finds the k nearest neighbors in an unstructured data set. | ✅ |
| streamcluster | Rodinia/Data Mining | Stream Cluster (SC) solves the online clustering problem. The streamcluster kernel is adapted from the streamcluster benchmark in the Parsec suite developed by Princeton University. The Parsec technical report [1] describes stream clustering as follows: "For a stream of input points, it finds a predetermined number of medians so that each point is assigned to its nearest center. The quality of the clustering is measured by the sum of squared distances (SSQ) metric." | 🔲 |
| lavaMD | Rodinia/Molecular Dynamics | The code calculates particle potential and relocation due to mutual forces between particles within a large 3D space. | ✅ |
| myocyte | Rodinia/Biological Simulation | The Myocyte application models a cardiac myocyte (heart muscle cell) and simulates its behavior. The model combines cardiomyocyte electrical activity with the calcineurin pathway, a key aspect of the development of heart failure. The model spans a large number of timescales to reflect how changes in heart rate, as observed during exercise or stress, promote activation of the calcineurin pathway, which ultimately leads to the expression of many genes that remodel cardiac structure. It can be used to identify potential therapeutic targets for the treatment of heart failure. Biochemical reactions, ion transport, and electrical activity in the cell are modeled with 91 ordinary differential equations (ODEs) determined by more than 200 experimentally validated parameters. The model is simulated by solving this set of ODEs over a specified time interval. | 🔲 |
| nw/needle | Rodinia/Bioinformatics | Needleman-Wunsch is a nonlinear global optimization method for DNA sequence alignments. | ✅ |
| pathfinder | Rodinia/Grid Traversal | PathFinder uses dynamic programming to find a path on a 2-D grid from the bottom row to the top row with the smallest accumulated weight, where each step of the path moves straight ahead or diagonally ahead. | ✅ |
| histogram | image | 64-bin and 256-bin histograms | 🔲 |
| bucketsort | miscellaneous | The bucket sort algorithm distributes elements into multiple buckets and then sorts each bucket. | ✅ |
| RBF | miscellaneous | An RBF (Radial Basis Function) network is generally a feed-forward neural network with a single hidden layer. It uses radial basis functions as the activation functions of the hidden-layer neurons, and its output layer is a linear combination of the outputs of the hidden-layer neurons. | 🔲 |
| pcm | miscellaneous | Pulse Code Modulation (PCM) produces a digital signal by sampling, quantizing, and encoding a continuously varying analog signal. The advantage of PCM is good sound quality; the disadvantage is the large data size. PCM can provide users with dedicated digital data line services at rates from 2M to 155M, and can also carry other services such as voice, image transmission, and distance learning. PCM has two standards: E1 and T1. PCM is the simplest and most commonly used form of waveform coding: the values obtained by sampling and A/D conversion are quantized uniformly and then encoded directly. It is the basis of other encoding algorithms. | 🔲 |
| DCT 8x8 | miscellaneous | The discrete cosine transform (DCT) is a transform related to the Fourier transform. It is similar to the discrete Fourier transform (DFT) but uses only real numbers. A DCT is equivalent to a DFT of roughly twice the length operating on a real, even function (the Fourier transform of a real, even function is itself real and even); in some variants the input or output must be shifted by half a sample. | 🔲 |
| BlackSholes | miscellaneous | The Black-Scholes model (BS model), also known as the Black-Scholes-Merton model, is a mathematical model for pricing financial derivatives such as options and warrants. It was first proposed by the American economists Myron Scholes and Fischer Black and later refined by Robert C. Merton so that it can also be applied when dividends are paid. | 🔲 |
| eigenvalues | miscellaneous | Eigenvalues are an important concept in linear algebra. If A is a square matrix of order n and there exist a number m and a non-zero n-dimensional column vector x such that Ax = mx, then m is called an eigenvalue (or characteristic value) of A. | 🔲 |
| fastwalshTransform | miscellaneous | The Walsh-Hadamard transform (WHT) is a generalized Fourier transform. It is used for spectrum analysis in signal processing, integrated circuits, and image processing, where it can take the place of the discrete Fourier transform. The fast Walsh-Hadamard transform is a fast algorithm for computing the WHT, analogous to the FFT. | 🔲 |
| page rank | miscellaneous | PageRank, also known as web page ranking or Google page ranking, is a technique that ranks web pages based on the hyperlinks between them. It is named after Google co-founder Larry Page. Google uses it to reflect the relevance and importance of web pages, and it is one of the factors commonly used in search engine optimization to evaluate how well a page is optimized. Google founders Larry Page and Sergey Brin invented the technique at Stanford University in 1998. PageRank determines the rank of a page from the web's vast network of hyperlinks. Google interprets a link from page A to page B as page A voting for page B, and it assigns a new rank based on the source of the vote (and even the source of the source, that is, the pages that link to page A) and the rank of the voting target. Simply put, a high-ranked page can boost the rank of lower-ranked pages it links to. | 🔲 |
| mandelbortset | miscellaneous | The Mandelbrot set is a set of points that forms a fractal on the complex plane, named after the mathematician Benoit Mandelbrot. The Mandelbrot set is related to Julia sets; for example, both iterate the same complex quadratic polynomials. | 🔲 |
| torch.nn.Conv3d | PyTorch OP | Conv3d operator | 🔲 |
| torch.nn.BatchNorm3d | PyTorch OP | BatchNorm3d operator | 🔲 |
| torch.nn.Conv2d | PyTorch OP | Conv2d operator | ✅ |
| torch.nn.LeakyReLU | PyTorch OP | LeakyReLU operator | ✅ |
| torch.nn.ReLU | PyTorch OP | ReLU operator | ✅ |
| torch.nn.MaxPool2d | PyTorch OP | MaxPool2d operator | ✅ |
| torch.nn.AdaptiveAvgPool2d | PyTorch OP | AdaptiveAvgPool2d operator | ✅ |
| torch.nn.Dropout | PyTorch OP | Dropout operator | ✅ |
| torch.nn.Linear | PyTorch OP | Linear operator | ✅ |
| torch.nn.BatchNorm2d | PyTorch OP | BatchNorm2d operator | ✅ |
| torch.nn.Hardswish | PyTorch OP | Hardswish operator | ✅ |
| torch.nn.Hardsigmoid | PyTorch OP | Hardsigmoid operator | ✅ |
| torch.nn.SiLU | PyTorch OP | SiLU operator | ✅ |
| torch.nn.Sigmoid | PyTorch OP | Sigmoid operator | ✅ |
| torch.nn.Embedding | PyTorch OP | Embedding operator | ✅ |
| torch.nn.LayerNorm | PyTorch OP | LayerNorm operator | ✅ |
| torch.nn.GELU | PyTorch OP | GELU operator | ✅ |
| torch.nn.Tanh | PyTorch OP | Tanh operator | ✅ |
| torch.nn.Softmax | PyTorch OP | Softmax operator | ✅ |
| torch.nn.GRU | PyTorch OP | GRU operator | ✅ |
| torch.nn.LSTM | PyTorch OP | LSTM operator | ✅ |
| lenet | PyTorch NN | LeNet network | ✅ |
| AlexNet | PyTorch NN | AlexNet network | ✅ |
| GoogLeNet | PyTorch NN | GoogLeNet network | ✅ |
| VGG | PyTorch NN | VGG network | ✅ |
| ResNet18 | PyTorch NN | ResNet18 network | ✅ |
| Yolov3 | PyTorch NN | Yolov3 network | ✅ |
| Yolov5 | PyTorch NN | Yolov5 network | ✅ |
| Densenet | PyTorch NN | DenseNet network | ✅ |
| squeezenet | PyTorch NN | SqueezeNet network | ✅ |
| mobilenetv2 | PyTorch NN | MobileNetV2 network | ✅ |
| mobilenetv3 | PyTorch NN | MobileNetV3 network | ✅ |
| inception_v1 | PyTorch NN | Inception v1 network | ✅ |
| inception_v2 | PyTorch NN | Inception v2 network | ✅ |
| inception_v3 | PyTorch NN | Inception v3 network | ✅ |
| ShuffleNetV2 | PyTorch NN | ShuffleNetV2 network | ✅ |
| EfficientNet | PyTorch NN | EfficientNet network | ✅ |
| transformer | PyTorch NN | Transformer network | ✅ |
| Lenet train | PyTorch NN | LeNet training | ✅ |
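
The dynamic programming idea described for the pathfinder entry above can be sketched in a few lines of Python. This is an illustrative serial version under stated assumptions, not the Rodinia kernel: the function name `min_path_cost` is hypothetical, the grid is assumed to be a list of equal-length rows with the last row as the bottom row, and "diagonally ahead" is read as moving one column left or right while advancing one row.

```python
def min_path_cost(grid):
    """Smallest accumulated weight of a path from the bottom row to the top row."""
    if not grid or not grid[0]:
        raise ValueError("grid must be non-empty")
    width = len(grid[0])
    # best[j] holds the cheapest cost of any path that currently ends in column j.
    best = list(grid[-1])
    # Walk upward from the row above the bottom row to the top row.
    for row in reversed(grid[:-1]):
        updated = []
        for j in range(width):
            # A step arrives from straight below or from a diagonal neighbor.
            lo, hi = max(0, j - 1), min(width - 1, j + 1)
            updated.append(row[j] + min(best[lo:hi + 1]))
        best = updated
    return min(best)
```

Each row depends only on the row below it, so all columns of a row can be computed independently; that per-row independence is what makes this kind of traversal a natural fit for a parallel implementation.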