News

I tried running a basic matrix multiplication CUDA program (using the WMMA API) with two configuration files w_tc.config (exact same thing as gpgpusim.config) and wo_tc.config (modified). Here are the ...
As large language model (LLM) inference demands ever-greater resources, there is a rapidly growing trend of using low-bit weights to shrink memory usage and boost inference efficiency. However, these ...
CUDA and Tensor Cores are some of the most prominent specs on an NVIDIA GPU. These cores are the fundamental computational blocks that allow a GPU to perform a bunch of tasks such as video rendering, ...
PARAFAC2 tensor models can handle irregular/ragged tensors and have been shown to be especially useful for modelling dynamic data with unaligned or irregular time profiles. However, existing PARAFAC2-based ...
The paper compares multidimensional matrix algebra with tensor algebra. It shows that tensor algebra operations are realized more efficiently in multidimensional matrix algebra.
The BCG growth share matrix is a heuristic approach or mental shortcut developed by the Boston Consulting Group. It’s used to classify a firm’s project outlooks.
We investigate a novel approach to approximate tensor-network contraction via the exact, matrix-free decomposition of full tensor-networks. We study this method as a means to eliminate the propagat ...