I: pbuilder: network access will be disabled during build I: Current time: Wed Apr 15 13:30:49 -12 2026 I: pbuilder-time-stamp: 1776303049 I: Building the build Environment I: extracting base tarball [/var/cache/pbuilder/unstable-reproducible-base.tgz] I: copying local configuration W: --override-config is not set; not updating apt.conf Read the manpage for details. I: mounting /proc filesystem I: mounting /sys filesystem I: creating /{dev,run}/shm I: mounting /dev/pts filesystem I: redirecting /dev/ptmx to /dev/pts/ptmx I: policy-rc.d already exists I: Copying source file I: copying [gemmlowp_0.0~git20211220.e844ffd-1.dsc] I: copying [./gemmlowp_0.0~git20211220.e844ffd.orig.tar.xz] I: copying [./gemmlowp_0.0~git20211220.e844ffd-1.debian.tar.xz] I: Extracting source dpkg-source: warning: cannot verify inline signature for ./gemmlowp_0.0~git20211220.e844ffd-1.dsc: unsupported subcommand dpkg-source: info: extracting gemmlowp in gemmlowp-0.0~git20211220.e844ffd dpkg-source: info: unpacking gemmlowp_0.0~git20211220.e844ffd.orig.tar.xz dpkg-source: info: unpacking gemmlowp_0.0~git20211220.e844ffd-1.debian.tar.xz dpkg-source: info: using patch list from debian/patches/series dpkg-source: info: applying 0001-cmake-build-fix.patch I: using fakeroot in build. 
I: Installing the build-deps I: user script /srv/workspace/pbuilder/3713724/tmp/hooks/D02_print_environment starting I: set BUILDDIR='/build/reproducible-path' BUILDUSERGECOS='first user,first room,first work-phone,first home-phone,first other' BUILDUSERNAME='pbuilder1' BUILD_ARCH='amd64' DEBIAN_FRONTEND='noninteractive' DEB_BUILD_OPTIONS='buildinfo=+all reproducible=+all parallel=42 ' DISTRIBUTION='unstable' HOME='/root' HOST_ARCH='amd64' IFS=' ' INVOCATION_ID='59c8aa00abba412d972fe0a93a657a82' LANG='C' LANGUAGE='en_US:en' LC_ALL='C' MAIL='/var/mail/root' OPTIND='1' PATH='/usr/sbin:/usr/bin:/sbin:/bin:/usr/games' PBCURRENTCOMMANDLINEOPERATION='build' PBUILDER_OPERATION='build' PBUILDER_PKGDATADIR='/usr/share/pbuilder' PBUILDER_PKGLIBDIR='/usr/lib/pbuilder' PBUILDER_SYSCONFDIR='/etc' PPID='3713724' PS1='# ' PS2='> ' PS4='+ ' PWD='/' SHELL='/bin/bash' SHLVL='2' SUDO_COMMAND='/usr/bin/timeout -k 18.1h 18h /usr/bin/ionice -c 3 /usr/bin/nice /usr/sbin/pbuilder --build --configfile /srv/reproducible-results/rbuild-debian/r-b-build.IQ53BROc/pbuilderrc_BoLf --distribution unstable --hookdir /etc/pbuilder/first-build-hooks --debbuildopts -b --basetgz /var/cache/pbuilder/unstable-reproducible-base.tgz --buildresult /srv/reproducible-results/rbuild-debian/r-b-build.IQ53BROc/b1 --logfile b1/build.log gemmlowp_0.0~git20211220.e844ffd-1.dsc' SUDO_GID='110' SUDO_UID='105' SUDO_USER='jenkins' TERM='unknown' TZ='/usr/share/zoneinfo/Etc/GMT+12' USER='root' _='/usr/bin/systemd-run' http_proxy='http://213.165.73.152:3128' I: uname -a Linux ionos5-amd64 6.12.12+bpo-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.12-1~bpo12+1 (2025-02-23) x86_64 GNU/Linux I: ls -l /bin lrwxrwxrwx 1 root root 7 Mar 4 2025 /bin -> usr/bin I: user script /srv/workspace/pbuilder/3713724/tmp/hooks/D02_print_environment finished -> Attempting to satisfy build-dependencies -> Creating pbuilder-satisfydepends-dummy package Package: pbuilder-satisfydepends-dummy Version: 0.invalid.0 Architecture: amd64 Maintainer: 
Debian Pbuilder Team Description: Dummy package to satisfy dependencies with aptitude - created by pbuilder This package was created automatically by pbuilder to satisfy the build-dependencies of the package being currently built. Depends: debhelper-compat (= 13), cmake dpkg-deb: building package 'pbuilder-satisfydepends-dummy' in '/tmp/satisfydepends-aptitude/pbuilder-satisfydepends-dummy.deb'. Selecting previously unselected package pbuilder-satisfydepends-dummy. (Reading database ... 19783 files and directories currently installed.) Preparing to unpack .../pbuilder-satisfydepends-dummy.deb ... Unpacking pbuilder-satisfydepends-dummy (0.invalid.0) ... dpkg: pbuilder-satisfydepends-dummy: dependency problems, but configuring anyway as you requested: pbuilder-satisfydepends-dummy depends on debhelper-compat (= 13); however: Package debhelper-compat is not installed. pbuilder-satisfydepends-dummy depends on cmake; however: Package cmake is not installed. Setting up pbuilder-satisfydepends-dummy (0.invalid.0) ... Reading package lists... Building dependency tree... Reading state information... Initializing package states... Writing extended state information... Building tag database... 
pbuilder-satisfydepends-dummy is already installed at the requested version (0.invalid.0) pbuilder-satisfydepends-dummy is already installed at the requested version (0.invalid.0) The following NEW packages will be installed: autoconf{a} automake{a} autopoint{a} autotools-dev{a} bsdextrautils{a} cmake{a} cmake-data{a} debhelper{a} dh-autoreconf{a} dh-strip-nondeterminism{a} dwz{a} file{a} gettext{a} gettext-base{a} groff-base{a} intltool-debian{a} libarchive-zip-perl{a} libarchive13t64{a} libbrotli1{a} libcom-err2{a} libcurl4t64{a} libdebhelper-perl{a} libelf1t64{a} libexpat1{a} libffi8{a} libfile-stripnondeterminism-perl{a} libgnutls30t64{a} libgssapi-krb5-2{a} libicu72{a} libidn2-0{a} libjsoncpp26{a} libk5crypto3{a} libkeyutils1{a} libkrb5-3{a} libkrb5support0{a} libldap2{a} libmagic-mgc{a} libmagic1t64{a} libnghttp2-14{a} libnghttp3-9{a} libp11-kit0{a} libpipeline1{a} libproc2-0{a} libpsl5t64{a} librhash1{a} librtmp1{a} libsasl2-2{a} libsasl2-modules-db{a} libssh2-1t64{a} libtasn1-6{a} libtool{a} libuchardet0{a} libunistring5{a} libuv1t64{a} libxml2{a} m4{a} man-db{a} po-debconf{a} procps{a} sensible-utils{a} The following packages are RECOMMENDED but will NOT be installed: ca-certificates curl krb5-locales libarchive-cpio-perl libldap-common libltdl-dev libmail-sendmail-perl libsasl2-modules linux-sysctl-defaults lynx psmisc publicsuffix wget 0 packages upgraded, 60 newly installed, 0 to remove and 0 not upgraded. Need to get 40.7 MB of archives. After unpacking 149 MB will be used. Writing extended state information... 
Get: 1 http://deb.debian.org/debian unstable/main amd64 libproc2-0 amd64 2:4.0.4-7 [64.9 kB] Get: 2 http://deb.debian.org/debian unstable/main amd64 procps amd64 2:4.0.4-7 [878 kB] Get: 3 http://deb.debian.org/debian unstable/main amd64 sensible-utils all 0.0.24 [24.8 kB] Get: 4 http://deb.debian.org/debian unstable/main amd64 libmagic-mgc amd64 1:5.45-3+b1 [314 kB] Get: 5 http://deb.debian.org/debian unstable/main amd64 libmagic1t64 amd64 1:5.45-3+b1 [108 kB] Get: 6 http://deb.debian.org/debian unstable/main amd64 file amd64 1:5.45-3+b1 [43.3 kB] Get: 7 http://deb.debian.org/debian unstable/main amd64 gettext-base amd64 0.23.1-1 [243 kB] Get: 8 http://deb.debian.org/debian unstable/main amd64 libuchardet0 amd64 0.0.8-1+b2 [68.9 kB] Get: 9 http://deb.debian.org/debian unstable/main amd64 groff-base amd64 1.23.0-7 [1185 kB] Get: 10 http://deb.debian.org/debian unstable/main amd64 bsdextrautils amd64 2.40.4-5 [92.4 kB] Get: 11 http://deb.debian.org/debian unstable/main amd64 libpipeline1 amd64 1.5.8-1 [42.0 kB] Get: 12 http://deb.debian.org/debian unstable/main amd64 man-db amd64 2.13.0-1 [1420 kB] Get: 13 http://deb.debian.org/debian unstable/main amd64 m4 amd64 1.4.19-7 [294 kB] Get: 14 http://deb.debian.org/debian unstable/main amd64 autoconf all 2.72-3 [493 kB] Get: 15 http://deb.debian.org/debian unstable/main amd64 autotools-dev all 20220109.1 [51.6 kB] Get: 16 http://deb.debian.org/debian unstable/main amd64 automake all 1:1.17-3 [862 kB] Get: 17 http://deb.debian.org/debian unstable/main amd64 autopoint all 0.23.1-1 [770 kB] Get: 18 http://deb.debian.org/debian unstable/main amd64 cmake-data all 3.31.6-1 [2268 kB] Get: 19 http://deb.debian.org/debian unstable/main amd64 libicu72 amd64 72.1-6 [9421 kB] Get: 20 http://deb.debian.org/debian unstable/main amd64 libxml2 amd64 2.12.7+dfsg+really2.9.14-0.2+b2 [699 kB] Get: 21 http://deb.debian.org/debian unstable/main amd64 libarchive13t64 amd64 3.7.4-1.1 [349 kB] Get: 22 http://deb.debian.org/debian unstable/main 
amd64 libbrotli1 amd64 1.1.0-2+b7 [307 kB] Get: 23 http://deb.debian.org/debian unstable/main amd64 libkrb5support0 amd64 1.21.3-5 [33.0 kB] Get: 24 http://deb.debian.org/debian unstable/main amd64 libcom-err2 amd64 1.47.2-1 [24.0 kB] Get: 25 http://deb.debian.org/debian unstable/main amd64 libk5crypto3 amd64 1.21.3-5 [81.5 kB] Get: 26 http://deb.debian.org/debian unstable/main amd64 libkeyutils1 amd64 1.6.3-4 [9092 B] Get: 27 http://deb.debian.org/debian unstable/main amd64 libkrb5-3 amd64 1.21.3-5 [326 kB] Get: 28 http://deb.debian.org/debian unstable/main amd64 libgssapi-krb5-2 amd64 1.21.3-5 [138 kB] Get: 29 http://deb.debian.org/debian unstable/main amd64 libunistring5 amd64 1.3-1 [476 kB] Get: 30 http://deb.debian.org/debian unstable/main amd64 libidn2-0 amd64 2.3.8-1 [109 kB] Get: 31 http://deb.debian.org/debian unstable/main amd64 libsasl2-modules-db amd64 2.1.28+dfsg1-9 [19.8 kB] Get: 32 http://deb.debian.org/debian unstable/main amd64 libsasl2-2 amd64 2.1.28+dfsg1-9 [57.5 kB] Get: 33 http://deb.debian.org/debian unstable/main amd64 libldap2 amd64 2.6.9+dfsg-2 [194 kB] Get: 34 http://deb.debian.org/debian unstable/main amd64 libnghttp2-14 amd64 1.64.0-1 [75.5 kB] Get: 35 http://deb.debian.org/debian unstable/main amd64 libnghttp3-9 amd64 1.8.0-1 [67.7 kB] Get: 36 http://deb.debian.org/debian unstable/main amd64 libpsl5t64 amd64 0.21.2-1.1+b1 [57.2 kB] Get: 37 http://deb.debian.org/debian unstable/main amd64 libffi8 amd64 3.4.7-1 [23.9 kB] Get: 38 http://deb.debian.org/debian unstable/main amd64 libp11-kit0 amd64 0.25.5-3 [425 kB] Get: 39 http://deb.debian.org/debian unstable/main amd64 libtasn1-6 amd64 4.20.0-2 [49.9 kB] Get: 40 http://deb.debian.org/debian unstable/main amd64 libgnutls30t64 amd64 3.8.9-2 [1464 kB] Get: 41 http://deb.debian.org/debian unstable/main amd64 librtmp1 amd64 2.4+20151223.gitfa8646d.1-2+b5 [58.8 kB] Get: 42 http://deb.debian.org/debian unstable/main amd64 libssh2-1t64 amd64 1.11.1-1 [245 kB] Get: 43 http://deb.debian.org/debian 
unstable/main amd64 libcurl4t64 amd64 8.12.1-3 [369 kB] Get: 44 http://deb.debian.org/debian unstable/main amd64 libexpat1 amd64 2.6.4-1 [106 kB] Get: 45 http://deb.debian.org/debian unstable/main amd64 libjsoncpp26 amd64 1.9.6-3 [81.7 kB] Get: 46 http://deb.debian.org/debian unstable/main amd64 librhash1 amd64 1.4.5-1 [132 kB] Get: 47 http://deb.debian.org/debian unstable/main amd64 libuv1t64 amd64 1.50.0-2 [154 kB] Get: 48 http://deb.debian.org/debian unstable/main amd64 cmake amd64 3.31.6-1 [12.0 MB] Get: 49 http://deb.debian.org/debian unstable/main amd64 libdebhelper-perl all 13.24.1 [90.9 kB] Get: 50 http://deb.debian.org/debian unstable/main amd64 libtool all 2.5.4-4 [539 kB] Get: 51 http://deb.debian.org/debian unstable/main amd64 dh-autoreconf all 20 [17.1 kB] Get: 52 http://deb.debian.org/debian unstable/main amd64 libarchive-zip-perl all 1.68-1 [104 kB] Get: 53 http://deb.debian.org/debian unstable/main amd64 libfile-stripnondeterminism-perl all 1.14.1-2 [19.7 kB] Get: 54 http://deb.debian.org/debian unstable/main amd64 dh-strip-nondeterminism all 1.14.1-2 [8620 B] Get: 55 http://deb.debian.org/debian unstable/main amd64 libelf1t64 amd64 0.192-4 [189 kB] Get: 56 http://deb.debian.org/debian unstable/main amd64 dwz amd64 0.15-1+b1 [110 kB] Get: 57 http://deb.debian.org/debian unstable/main amd64 gettext amd64 0.23.1-1 [1680 kB] Get: 58 http://deb.debian.org/debian unstable/main amd64 intltool-debian all 0.35.0+20060710.6 [22.9 kB] Get: 59 http://deb.debian.org/debian unstable/main amd64 po-debconf all 1.0.21+nmu1 [248 kB] Get: 60 http://deb.debian.org/debian unstable/main amd64 debhelper all 13.24.1 [920 kB] Fetched 40.7 MB in 0s (101 MB/s) Preconfiguring packages ... Selecting previously unselected package libproc2-0:amd64. (Reading database ... (Reading database ... 5% (Reading database ... 10% (Reading database ... 15% (Reading database ... 20% (Reading database ... 25% (Reading database ... 30% (Reading database ... 35% (Reading database ... 
40% (Reading database ... 45% (Reading database ... 50% (Reading database ... 55% (Reading database ... 60% (Reading database ... 65% (Reading database ... 70% (Reading database ... 75% (Reading database ... 80% (Reading database ... 85% (Reading database ... 90% (Reading database ... 95% (Reading database ... 100% (Reading database ... 19783 files and directories currently installed.) Preparing to unpack .../00-libproc2-0_2%3a4.0.4-7_amd64.deb ... Unpacking libproc2-0:amd64 (2:4.0.4-7) ... Selecting previously unselected package procps. Preparing to unpack .../01-procps_2%3a4.0.4-7_amd64.deb ... Unpacking procps (2:4.0.4-7) ... Selecting previously unselected package sensible-utils. Preparing to unpack .../02-sensible-utils_0.0.24_all.deb ... Unpacking sensible-utils (0.0.24) ... Selecting previously unselected package libmagic-mgc. Preparing to unpack .../03-libmagic-mgc_1%3a5.45-3+b1_amd64.deb ... Unpacking libmagic-mgc (1:5.45-3+b1) ... Selecting previously unselected package libmagic1t64:amd64. Preparing to unpack .../04-libmagic1t64_1%3a5.45-3+b1_amd64.deb ... Unpacking libmagic1t64:amd64 (1:5.45-3+b1) ... Selecting previously unselected package file. Preparing to unpack .../05-file_1%3a5.45-3+b1_amd64.deb ... Unpacking file (1:5.45-3+b1) ... Selecting previously unselected package gettext-base. Preparing to unpack .../06-gettext-base_0.23.1-1_amd64.deb ... Unpacking gettext-base (0.23.1-1) ... Selecting previously unselected package libuchardet0:amd64. Preparing to unpack .../07-libuchardet0_0.0.8-1+b2_amd64.deb ... Unpacking libuchardet0:amd64 (0.0.8-1+b2) ... Selecting previously unselected package groff-base. Preparing to unpack .../08-groff-base_1.23.0-7_amd64.deb ... Unpacking groff-base (1.23.0-7) ... Selecting previously unselected package bsdextrautils. Preparing to unpack .../09-bsdextrautils_2.40.4-5_amd64.deb ... Unpacking bsdextrautils (2.40.4-5) ... Selecting previously unselected package libpipeline1:amd64. 
Preparing to unpack .../10-libpipeline1_1.5.8-1_amd64.deb ... Unpacking libpipeline1:amd64 (1.5.8-1) ... Selecting previously unselected package man-db. Preparing to unpack .../11-man-db_2.13.0-1_amd64.deb ... Unpacking man-db (2.13.0-1) ... Selecting previously unselected package m4. Preparing to unpack .../12-m4_1.4.19-7_amd64.deb ... Unpacking m4 (1.4.19-7) ... Selecting previously unselected package autoconf. Preparing to unpack .../13-autoconf_2.72-3_all.deb ... Unpacking autoconf (2.72-3) ... Selecting previously unselected package autotools-dev. Preparing to unpack .../14-autotools-dev_20220109.1_all.deb ... Unpacking autotools-dev (20220109.1) ... Selecting previously unselected package automake. Preparing to unpack .../15-automake_1%3a1.17-3_all.deb ... Unpacking automake (1:1.17-3) ... Selecting previously unselected package autopoint. Preparing to unpack .../16-autopoint_0.23.1-1_all.deb ... Unpacking autopoint (0.23.1-1) ... Selecting previously unselected package cmake-data. Preparing to unpack .../17-cmake-data_3.31.6-1_all.deb ... Unpacking cmake-data (3.31.6-1) ... Selecting previously unselected package libicu72:amd64. Preparing to unpack .../18-libicu72_72.1-6_amd64.deb ... Unpacking libicu72:amd64 (72.1-6) ... Selecting previously unselected package libxml2:amd64. Preparing to unpack .../19-libxml2_2.12.7+dfsg+really2.9.14-0.2+b2_amd64.deb ... Unpacking libxml2:amd64 (2.12.7+dfsg+really2.9.14-0.2+b2) ... Selecting previously unselected package libarchive13t64:amd64. Preparing to unpack .../20-libarchive13t64_3.7.4-1.1_amd64.deb ... Unpacking libarchive13t64:amd64 (3.7.4-1.1) ... Selecting previously unselected package libbrotli1:amd64. Preparing to unpack .../21-libbrotli1_1.1.0-2+b7_amd64.deb ... Unpacking libbrotli1:amd64 (1.1.0-2+b7) ... Selecting previously unselected package libkrb5support0:amd64. Preparing to unpack .../22-libkrb5support0_1.21.3-5_amd64.deb ... Unpacking libkrb5support0:amd64 (1.21.3-5) ... 
Selecting previously unselected package libcom-err2:amd64. Preparing to unpack .../23-libcom-err2_1.47.2-1_amd64.deb ... Unpacking libcom-err2:amd64 (1.47.2-1) ... Selecting previously unselected package libk5crypto3:amd64. Preparing to unpack .../24-libk5crypto3_1.21.3-5_amd64.deb ... Unpacking libk5crypto3:amd64 (1.21.3-5) ... Selecting previously unselected package libkeyutils1:amd64. Preparing to unpack .../25-libkeyutils1_1.6.3-4_amd64.deb ... Unpacking libkeyutils1:amd64 (1.6.3-4) ... Selecting previously unselected package libkrb5-3:amd64. Preparing to unpack .../26-libkrb5-3_1.21.3-5_amd64.deb ... Unpacking libkrb5-3:amd64 (1.21.3-5) ... Selecting previously unselected package libgssapi-krb5-2:amd64. Preparing to unpack .../27-libgssapi-krb5-2_1.21.3-5_amd64.deb ... Unpacking libgssapi-krb5-2:amd64 (1.21.3-5) ... Selecting previously unselected package libunistring5:amd64. Preparing to unpack .../28-libunistring5_1.3-1_amd64.deb ... Unpacking libunistring5:amd64 (1.3-1) ... Selecting previously unselected package libidn2-0:amd64. Preparing to unpack .../29-libidn2-0_2.3.8-1_amd64.deb ... Unpacking libidn2-0:amd64 (2.3.8-1) ... Selecting previously unselected package libsasl2-modules-db:amd64. Preparing to unpack .../30-libsasl2-modules-db_2.1.28+dfsg1-9_amd64.deb ... Unpacking libsasl2-modules-db:amd64 (2.1.28+dfsg1-9) ... Selecting previously unselected package libsasl2-2:amd64. Preparing to unpack .../31-libsasl2-2_2.1.28+dfsg1-9_amd64.deb ... Unpacking libsasl2-2:amd64 (2.1.28+dfsg1-9) ... Selecting previously unselected package libldap2:amd64. Preparing to unpack .../32-libldap2_2.6.9+dfsg-2_amd64.deb ... Unpacking libldap2:amd64 (2.6.9+dfsg-2) ... Selecting previously unselected package libnghttp2-14:amd64. Preparing to unpack .../33-libnghttp2-14_1.64.0-1_amd64.deb ... Unpacking libnghttp2-14:amd64 (1.64.0-1) ... Selecting previously unselected package libnghttp3-9:amd64. Preparing to unpack .../34-libnghttp3-9_1.8.0-1_amd64.deb ... 
Unpacking libnghttp3-9:amd64 (1.8.0-1) ... Selecting previously unselected package libpsl5t64:amd64. Preparing to unpack .../35-libpsl5t64_0.21.2-1.1+b1_amd64.deb ... Unpacking libpsl5t64:amd64 (0.21.2-1.1+b1) ... Selecting previously unselected package libffi8:amd64. Preparing to unpack .../36-libffi8_3.4.7-1_amd64.deb ... Unpacking libffi8:amd64 (3.4.7-1) ... Selecting previously unselected package libp11-kit0:amd64. Preparing to unpack .../37-libp11-kit0_0.25.5-3_amd64.deb ... Unpacking libp11-kit0:amd64 (0.25.5-3) ... Selecting previously unselected package libtasn1-6:amd64. Preparing to unpack .../38-libtasn1-6_4.20.0-2_amd64.deb ... Unpacking libtasn1-6:amd64 (4.20.0-2) ... Selecting previously unselected package libgnutls30t64:amd64. Preparing to unpack .../39-libgnutls30t64_3.8.9-2_amd64.deb ... Unpacking libgnutls30t64:amd64 (3.8.9-2) ... Selecting previously unselected package librtmp1:amd64. Preparing to unpack .../40-librtmp1_2.4+20151223.gitfa8646d.1-2+b5_amd64.deb ... Unpacking librtmp1:amd64 (2.4+20151223.gitfa8646d.1-2+b5) ... Selecting previously unselected package libssh2-1t64:amd64. Preparing to unpack .../41-libssh2-1t64_1.11.1-1_amd64.deb ... Unpacking libssh2-1t64:amd64 (1.11.1-1) ... Selecting previously unselected package libcurl4t64:amd64. Preparing to unpack .../42-libcurl4t64_8.12.1-3_amd64.deb ... Unpacking libcurl4t64:amd64 (8.12.1-3) ... Selecting previously unselected package libexpat1:amd64. Preparing to unpack .../43-libexpat1_2.6.4-1_amd64.deb ... Unpacking libexpat1:amd64 (2.6.4-1) ... Selecting previously unselected package libjsoncpp26:amd64. Preparing to unpack .../44-libjsoncpp26_1.9.6-3_amd64.deb ... Unpacking libjsoncpp26:amd64 (1.9.6-3) ... Selecting previously unselected package librhash1:amd64. Preparing to unpack .../45-librhash1_1.4.5-1_amd64.deb ... Unpacking librhash1:amd64 (1.4.5-1) ... Selecting previously unselected package libuv1t64:amd64. Preparing to unpack .../46-libuv1t64_1.50.0-2_amd64.deb ... 
Unpacking libuv1t64:amd64 (1.50.0-2) ... Selecting previously unselected package cmake. Preparing to unpack .../47-cmake_3.31.6-1_amd64.deb ... Unpacking cmake (3.31.6-1) ... Selecting previously unselected package libdebhelper-perl. Preparing to unpack .../48-libdebhelper-perl_13.24.1_all.deb ... Unpacking libdebhelper-perl (13.24.1) ... Selecting previously unselected package libtool. Preparing to unpack .../49-libtool_2.5.4-4_all.deb ... Unpacking libtool (2.5.4-4) ... Selecting previously unselected package dh-autoreconf. Preparing to unpack .../50-dh-autoreconf_20_all.deb ... Unpacking dh-autoreconf (20) ... Selecting previously unselected package libarchive-zip-perl. Preparing to unpack .../51-libarchive-zip-perl_1.68-1_all.deb ... Unpacking libarchive-zip-perl (1.68-1) ... Selecting previously unselected package libfile-stripnondeterminism-perl. Preparing to unpack .../52-libfile-stripnondeterminism-perl_1.14.1-2_all.deb ... Unpacking libfile-stripnondeterminism-perl (1.14.1-2) ... Selecting previously unselected package dh-strip-nondeterminism. Preparing to unpack .../53-dh-strip-nondeterminism_1.14.1-2_all.deb ... Unpacking dh-strip-nondeterminism (1.14.1-2) ... Selecting previously unselected package libelf1t64:amd64. Preparing to unpack .../54-libelf1t64_0.192-4_amd64.deb ... Unpacking libelf1t64:amd64 (0.192-4) ... Selecting previously unselected package dwz. Preparing to unpack .../55-dwz_0.15-1+b1_amd64.deb ... Unpacking dwz (0.15-1+b1) ... Selecting previously unselected package gettext. Preparing to unpack .../56-gettext_0.23.1-1_amd64.deb ... Unpacking gettext (0.23.1-1) ... Selecting previously unselected package intltool-debian. Preparing to unpack .../57-intltool-debian_0.35.0+20060710.6_all.deb ... Unpacking intltool-debian (0.35.0+20060710.6) ... Selecting previously unselected package po-debconf. Preparing to unpack .../58-po-debconf_1.0.21+nmu1_all.deb ... Unpacking po-debconf (1.0.21+nmu1) ... 
Selecting previously unselected package debhelper. Preparing to unpack .../59-debhelper_13.24.1_all.deb ... Unpacking debhelper (13.24.1) ... Setting up libexpat1:amd64 (2.6.4-1) ... Setting up libpipeline1:amd64 (1.5.8-1) ... Setting up libkeyutils1:amd64 (1.6.3-4) ... Setting up libicu72:amd64 (72.1-6) ... Setting up bsdextrautils (2.40.4-5) ... Setting up libmagic-mgc (1:5.45-3+b1) ... Setting up libarchive-zip-perl (1.68-1) ... Setting up libdebhelper-perl (13.24.1) ... Setting up libbrotli1:amd64 (1.1.0-2+b7) ... Setting up libuv1t64:amd64 (1.50.0-2) ... Setting up libmagic1t64:amd64 (1:5.45-3+b1) ... Setting up libnghttp2-14:amd64 (1.64.0-1) ... Setting up gettext-base (0.23.1-1) ... Setting up m4 (1.4.19-7) ... Setting up libcom-err2:amd64 (1.47.2-1) ... Setting up file (1:5.45-3+b1) ... Setting up libelf1t64:amd64 (0.192-4) ... Setting up libkrb5support0:amd64 (1.21.3-5) ... Setting up libsasl2-modules-db:amd64 (2.1.28+dfsg1-9) ... Setting up autotools-dev (20220109.1) ... Setting up libjsoncpp26:amd64 (1.9.6-3) ... Setting up libproc2-0:amd64 (2:4.0.4-7) ... Setting up libunistring5:amd64 (1.3-1) ... Setting up autopoint (0.23.1-1) ... Setting up libk5crypto3:amd64 (1.21.3-5) ... Setting up libsasl2-2:amd64 (2.1.28+dfsg1-9) ... Setting up autoconf (2.72-3) ... Setting up libnghttp3-9:amd64 (1.8.0-1) ... Setting up libffi8:amd64 (3.4.7-1) ... Setting up dwz (0.15-1+b1) ... Setting up sensible-utils (0.0.24) ... Setting up libuchardet0:amd64 (0.0.8-1+b2) ... Setting up procps (2:4.0.4-7) ... Setting up libtasn1-6:amd64 (4.20.0-2) ... Setting up cmake-data (3.31.6-1) ... Setting up librhash1:amd64 (1.4.5-1) ... Setting up libkrb5-3:amd64 (1.21.3-5) ... Setting up libssh2-1t64:amd64 (1.11.1-1) ... Setting up libxml2:amd64 (2.12.7+dfsg+really2.9.14-0.2+b2) ... Setting up libldap2:amd64 (2.6.9+dfsg-2) ... Setting up automake (1:1.17-3) ... 
update-alternatives: using /usr/bin/automake-1.17 to provide /usr/bin/automake (automake) in auto mode Setting up libfile-stripnondeterminism-perl (1.14.1-2) ... Setting up gettext (0.23.1-1) ... Setting up libtool (2.5.4-4) ... Setting up libidn2-0:amd64 (2.3.8-1) ... Setting up intltool-debian (0.35.0+20060710.6) ... Setting up dh-autoreconf (20) ... Setting up libp11-kit0:amd64 (0.25.5-3) ... Setting up libgssapi-krb5-2:amd64 (1.21.3-5) ... Setting up dh-strip-nondeterminism (1.14.1-2) ... Setting up groff-base (1.23.0-7) ... Setting up libarchive13t64:amd64 (3.7.4-1.1) ... Setting up libgnutls30t64:amd64 (3.8.9-2) ... Setting up po-debconf (1.0.21+nmu1) ... Setting up libpsl5t64:amd64 (0.21.2-1.1+b1) ... Setting up man-db (2.13.0-1) ... Not building database; man-db/auto-update is not 'true'. Setting up librtmp1:amd64 (2.4+20151223.gitfa8646d.1-2+b5) ... Setting up libcurl4t64:amd64 (8.12.1-3) ... Setting up debhelper (13.24.1) ... Setting up cmake (3.31.6-1) ... Processing triggers for libc-bin (2.41-4) ... Reading package lists... Building dependency tree... Reading state information... Reading extended state information... Initializing package states... Writing extended state information... Building tag database... -> Finished parsing the build-deps Reading package lists... Building dependency tree... Reading state information... fakeroot is already the newest version (1.37-1). 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. 
I: Building the package I: Running cd /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/ && env PATH="/usr/sbin:/usr/bin:/sbin:/bin:/usr/games" HOME="/nonexistent/first-build" dpkg-buildpackage -us -uc -b && env PATH="/usr/sbin:/usr/bin:/sbin:/bin:/usr/games" HOME="/nonexistent/first-build" dpkg-genchanges -S > ../gemmlowp_0.0~git20211220.e844ffd-1_source.changes dpkg-buildpackage: info: source package gemmlowp dpkg-buildpackage: info: source version 0.0~git20211220.e844ffd-1 dpkg-buildpackage: info: source distribution unstable dpkg-buildpackage: info: source changed by Mo Zhou dpkg-source --before-build . dpkg-buildpackage: info: host architecture amd64 debian/rules clean dh clean -Scmake debian/rules override_dh_auto_clean make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd' rm -f CMakeLists.txt dh_auto_clean make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd' dh_clean -O-Scmake debian/rules binary dh binary -Scmake dh_update_autotools_config -O-Scmake dh_autoreconf -O-Scmake debian/rules override_dh_auto_configure make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd' ln -s contrib/CMakeLists.txt . dh_auto_configure -- \ -DCMAKE_C_FLAGS="-g -O2 -Werror=implicit-function-declaration -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2" \ -DCMAKE_CXX_FLAGS="-g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2" cd obj-x86_64-linux-gnu && DEB_PYTHON_INSTALL_LAYOUT=deb PKG_CONFIG=/usr/bin/pkg-config cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=None -DCMAKE_INSTALL_SYSCONFDIR=/etc -DCMAKE_INSTALL_LOCALSTATEDIR=/var -DCMAKE_EXPORT_NO_PACKAGE_REGISTRY=ON -DCMAKE_FIND_USE_PACKAGE_REGISTRY=OFF -DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY=ON -DFETCHCONTENT_FULLY_DISCONNECTED=ON -DCMAKE_INSTALL_RUNSTATEDIR=/run -DCMAKE_SKIP_INSTALL_ALL_DEPENDENCY=ON "-GUnix Makefiles" -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_INSTALL_LIBDIR=lib/x86_64-linux-gnu "-DCMAKE_C_FLAGS=-g -O2 -Werror=implicit-function-declaration -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2" "-DCMAKE_CXX_FLAGS=-g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2" .. CMake Deprecation Warning at CMakeLists.txt:5 (cmake_minimum_required): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. 
-- The C compiler identification is GNU 14.2.0 -- The CXX compiler identification is GNU 14.2.0 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /usr/bin/cc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/c++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Configuring done (1.6s) -- Generating done (0.0s) CMake Warning: Manually-specified variables were not used by the project: CMAKE_EXPORT_NO_PACKAGE_REGISTRY CMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY CMAKE_FIND_USE_PACKAGE_REGISTRY FETCHCONTENT_FULLY_DISCONNECTED -- Build files have been written to: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd' dh_auto_build -O-Scmake cd obj-x86_64-linux-gnu && make -j42 "INSTALL=install --strip-program=true" VERBOSE=1 make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' /usr/bin/cmake -S"/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" -B"/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/cmake -E cmake_progress_start "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu//CMakeFiles/progress.marks" make -f CMakeFiles/Makefile2 all make[2]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/eight_bit_int_gemm.dir/build.make CMakeFiles/eight_bit_int_gemm.dir/depend make -f CMakeFiles/benchmark.dir/build.make CMakeFiles/benchmark.dir/depend make -f 
CMakeFiles/benchmark_all_sizes.dir/build.make CMakeFiles/benchmark_all_sizes.dir/depend make -f CMakeFiles/test_math_helpers.dir/build.make CMakeFiles/test_math_helpers.dir/depend make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/eight_bit_int_gemm.dir/DependInfo.cmake" "--color=" make -f CMakeFiles/test_blocking_counter.dir/build.make CMakeFiles/test_blocking_counter.dir/depend make -f CMakeFiles/test_allocator.dir/build.make CMakeFiles/test_allocator.dir/depend make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/benchmark.dir/DependInfo.cmake" "--color=" make -f CMakeFiles/test_fixedpoint.dir/build.make CMakeFiles/test_fixedpoint.dir/depend make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E 
cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/benchmark_all_sizes.dir/DependInfo.cmake" "--color=" make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_blocking_counter.dir/DependInfo.cmake" "--color=" make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_allocator.dir/DependInfo.cmake" "--color=" make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && 
/usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_fixedpoint.dir/DependInfo.cmake" "--color=" make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/benchmark.dir/build.make CMakeFiles/benchmark.dir/build make -f CMakeFiles/benchmark_all_sizes.dir/build.make CMakeFiles/benchmark_all_sizes.dir/build make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/test_allocator.dir/build.make CMakeFiles/test_allocator.dir/build make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/eight_bit_int_gemm.dir/build.make CMakeFiles/eight_bit_int_gemm.dir/build make -f CMakeFiles/test_blocking_counter.dir/build.make CMakeFiles/test_blocking_counter.dir/build make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory 
'/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_math_helpers.dir/DependInfo.cmake" "--color=" [ 29%] Building CXX object CMakeFiles/test_allocator.dir/test/test_allocator.cc.o [ 29%] Building CXX object CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o [ 17%] Building CXX object CMakeFiles/benchmark.dir/test/benchmark.cc.o [ 29%] Building CXX object CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o [ 29%] Building CXX object CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/benchmark.dir/test/benchmark.cc.o -MF CMakeFiles/benchmark.dir/test/benchmark.cc.o.d -o CMakeFiles/benchmark.dir/test/benchmark.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/benchmark.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -MF CMakeFiles/test_allocator.dir/test/test_allocator.cc.o.d -o CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test_allocator.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -MF CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o.d -o CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test_blocking_counter.cc" make -f CMakeFiles/test_fixedpoint.dir/build.make CMakeFiles/test_fixedpoint.dir/build /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -DBENCHMARK_8bit -DBENCHMARK_QUICK -MD -MT CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -MF CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o.d -o CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/benchmark_all_sizes.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o -MF CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o.d -o CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/eight_bit_int_gemm/eight_bit_int_gemm.cc" make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/test_math_helpers.dir/build.make CMakeFiles/test_math_helpers.dir/build [ 35%] Building CXX object CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -MF CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o.d -o CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test_fixedpoint.cc" make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 41%] Building CXX object CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -MF CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o.d -o CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test_math_helpers.cc" [ 52%] Linking CXX executable test_allocator [ 52%] Linking CXX executable test_blocking_counter /usr/bin/cmake -E cmake_link_script CMakeFiles/test_allocator.dir/link.txt --verbose=1 /usr/bin/cmake -E cmake_link_script CMakeFiles/test_blocking_counter.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -Wl,--dependency-file=CMakeFiles/test_blocking_counter.dir/link.d CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -o test_blocking_counter -lpthread /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -Wl,--dependency-file=CMakeFiles/test_allocator.dir/link.d CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -o test_allocator make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 52%] Built target test_blocking_counter [ 52%] Built target test_allocator [ 58%] Linking CXX executable test_math_helpers /usr/bin/cmake -E cmake_link_script CMakeFiles/test_math_helpers.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -Wl,--dependency-file=CMakeFiles/test_math_helpers.dir/link.d CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -o test_math_helpers make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 58%] Built target test_math_helpers [ 64%] Linking CXX executable benchmark_all_sizes /usr/bin/cmake -E cmake_link_script CMakeFiles/benchmark_all_sizes.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -Wl,--dependency-file=CMakeFiles/benchmark_all_sizes.dir/link.d CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -o benchmark_all_sizes -lpthread make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 64%] Built target benchmark_all_sizes [ 70%] Linking CXX executable benchmark /usr/bin/cmake -E cmake_link_script CMakeFiles/benchmark.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -Wl,--dependency-file=CMakeFiles/benchmark.dir/link.d CMakeFiles/benchmark.dir/test/benchmark.cc.o -o benchmark -lpthread make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 70%] Built target benchmark [ 76%] Linking CXX executable test_fixedpoint /usr/bin/cmake -E cmake_link_script CMakeFiles/test_fixedpoint.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -Wl,--dependency-file=CMakeFiles/test_fixedpoint.dir/link.d CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -o test_fixedpoint make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 76%] Built target test_fixedpoint [ 82%] Linking CXX static library libeight_bit_int_gemm.a /usr/bin/cmake -P CMakeFiles/eight_bit_int_gemm.dir/cmake_clean_target.cmake /usr/bin/cmake -E cmake_link_script CMakeFiles/eight_bit_int_gemm.dir/link.txt --verbose=1 /usr/bin/ar qc libeight_bit_int_gemm.a CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o /usr/bin/ranlib libeight_bit_int_gemm.a make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 82%] Built target eight_bit_int_gemm make -f CMakeFiles/test_gemmlowp.dir/build.make CMakeFiles/test_gemmlowp.dir/depend make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_gemmlowp.dir/DependInfo.cmake" "--color=" make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/test_gemmlowp.dir/build.make CMakeFiles/test_gemmlowp.dir/build make[3]: Entering directory 
'/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 88%] Building CXX object CMakeFiles/test_gemmlowp.dir/test/test.cc.o [ 94%] Building CXX object CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_gemmlowp.dir/test/test.cc.o -MF CMakeFiles/test_gemmlowp.dir/test/test.cc.o.d -o CMakeFiles/test_gemmlowp.dir/test/test.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -MF CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o.d -o CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test_data.cc" /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In file included from /usr/include/stdio.h:970, from /usr/include/c++/14/cstdio:42, from /usr/include/c++/14/ext/string_conversions.h:45, from 
/usr/include/c++/14/bits/basic_string.h:4154, from /usr/include/c++/14/string:54, from /usr/include/c++/14/bits/locale_classes.h:40, from /usr/include/c++/14/bits/ios_base.h:41, from /usr/include/c++/14/ios:44, from /usr/include/c++/14/ostream:40, from /usr/include/c++/14/iostream:41, from /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.h:26, from /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:15: In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* 
gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 4>, gemmlowp::KernelSideFormat, 5> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 4>, gemmlowp::KernelSideFormat, 5> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | 
~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, 
gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, 
sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:59: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 230 [-Wformat-truncation=] 123 | snprintf(buf, sizeof(buf), "SingleThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 27 and 282 
bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = 
gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:59: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 230 [-Wformat-truncation=] 123 | snprintf(buf, sizeof(buf), "SingleThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 27 and 282 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned 
char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:68:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 68 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 70 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ [100%] Linking CXX executable test_gemmlowp /usr/bin/cmake -E cmake_link_script CMakeFiles/test_gemmlowp.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -Wl,--dependency-file=CMakeFiles/test_gemmlowp.dir/link.d CMakeFiles/test_gemmlowp.dir/test/test.cc.o CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -o test_gemmlowp libeight_bit_int_gemm.a -lpthread make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [100%] Built target test_gemmlowp make[2]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' /usr/bin/cmake -E cmake_progress_start "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles" 0 make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' dh_auto_test -O-Scmake cd obj-x86_64-linux-gnu && make -j42 test ARGS\+=--verbose ARGS\+=-j42 make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' Running tests... 
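Note on the -Wformat-truncation warnings reported for test/test.cc above: the figures follow from the buffer arithmetic. The prefix "SingleThreadGemm, Kernel: " is 26 characters, which leaves 256 - 26 = 230 bytes for the '%s' argument, and the compiler assumes that argument may be up to 255 bytes long. The sketch below is a minimal illustration of detecting that truncation through the return value of snprintf. FormatWrapperName is a hypothetical helper written for this note; it is not part of gemmlowp, and it is not the fix applied by the package.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (an assumption for illustration, not gemmlowp code):
// writes "<prefix><kernel_name>" into a fixed-size buffer and returns true
// only if the whole string fit, i.e. snprintf did not truncate.
template <std::size_t N>
bool FormatWrapperName(char (&buf)[N], const char* prefix,
                       const char* kernel_name) {
  static_assert(N > 0, "destination buffer must not be empty");
  assert(prefix != nullptr && kernel_name != nullptr);
  // snprintf returns the number of characters the full output needs,
  // excluding the terminating NUL, even when it had to truncate.
  const int needed = std::snprintf(buf, N, "%s%s", prefix, kernel_name);
  return needed >= 0 && static_cast<std::size_t>(needed) < N;
}
```

With a 256-byte buffer, a 255-character kernel name would make the helper return false while still leaving a NUL-terminated, truncated string in the buffer; a short name such as "reference" fits and returns true.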
/usr/bin/ctest --force-new-ctest-process --verbose -j42 UpdateCTestConfiguration from :/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl Parse Config file:/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl UpdateCTestConfiguration from :/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl Parse Config file:/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl Test project /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu Constructing a list of tests Done constructing a list of tests Updating test list for fixtures Added 0 tests to meet fixture requirements Checking test dependency graph... Checking test dependency graph end Connected to MAKE jobserver test 1 Start 1: test_math_helpers 1: Test command: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_math_helpers 1: Working Directory: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 1: Test timeout computed to be: 1500 test 2 Start 2: test_blocking_counter 2: Test command: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_blocking_counter 2: Working Directory: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 2: Test timeout computed to be: 1500 test 3 Start 3: test_allocator 3: Test command: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_allocator 3: Working Directory: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 3: Test timeout computed to be: 1500 test 4 Start 4: test_fixedpoint 4: Test command: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_fixedpoint 4: Working Directory: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 4: Test 
timeout computed to be: 1500 test 5 Start 5: test_gemmlowp 5: Test command: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_gemmlowp 5: Working Directory: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 5: Test timeout computed to be: 1500 1/5 Test #1: test_math_helpers ................ Passed 0.01 sec 2/5 Test #3: test_allocator ................... Passed 0.00 sec 5: TestWithSmallData: PASS 5: number of matrix entries: 8 5: median value: 136 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, 
mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x 
ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 3/5 Test #2: test_blocking_counter ............ 
Passed 0.01 sec 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 
12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, 
shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
-75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 4: PASS (Scalar int32) 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 4: PASS (Scalar int16) 4/5 Test #4: test_fixedpoint .................. Passed 0.19 sec 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, 
shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 
RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: 
PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: 
PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 
ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 
RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 
ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 
64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 
64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
-75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> 
ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor 
x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 
RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, 
shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 
64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> 
ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public 
Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 
0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, 
mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: 
PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor 
x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 
70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 
70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, 
mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x 
ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor 
x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 
7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x 
ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor 
-> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 
RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 
70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public 
Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: 
PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, 
public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 
12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor 
-> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x 
RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 
64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: 
PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, 
mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 
8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor 
-> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: 
PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 
1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 
RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> 
RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor 
-> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> 
RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> 
RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 
RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 
109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
-75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 
ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, 
public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 
10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: 
PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor 
x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, 
public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, 
public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, 
public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, 
offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, 
public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 5x7x1 RowMajor 
x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, 
shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, 
mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 
10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: 
PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 
-75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 
ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, 
mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, 
mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 
8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x 
RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> 
RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x 
RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 
RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 
64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 
14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, 
shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: 
PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 
RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: 
PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 
ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x 
RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x 
RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x 
ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 
ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 
RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 
18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, 
shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, 
shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, 
mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, 
shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 
10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, 
public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x 
RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x 
ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: 
PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x 
ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, 
public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, 
mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 
37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 
RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor 
x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x 
ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor 
-> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, 
mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor 
-> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 
RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> 
RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 
32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 
ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 
0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 
70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, 
offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 
14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 
70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, 
mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 
-75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 
0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, 
mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 
1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, 
mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 
ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 
10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, 
EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 
RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, 
EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, 
offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 
64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, 
EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, 
mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 
1x1000x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: 
PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor 
x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 
300x400x500 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 
4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor 
x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 
ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 
10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> 
RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, 
offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 
321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, 
offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x 
RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, 
shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 
20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, 
shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: 
PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: 
PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 
3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 
8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 
ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 
RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x 
RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 
10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x 
RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, 
offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 
RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 
0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 
78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x 
ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, 
EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 
0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: 
PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), 
offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 
DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 
123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, 
mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 
DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 
3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor 
x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), 
offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), 
offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 
16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor 
x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 
-75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 
18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 24 5: TestWithRealData: PASS with Lhs: 8 bit, Rhs: 8 bit 5: number of matrix entries: 49152 5: median value: 104 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: 
Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestWithRealData: PASS with (legacy, no longer requantizing) Lhs: 7 bit, Rhs: 5 bit 5: number of matrix entries: 49152 5: median value: 104 5: median unsigned diff: 0 (tolerating 2) 5: max unsigned diff: 0 (tolerating 10) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0.2) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestWithSmallDataPerChannelQuantization: PASS 5: number of matrix entries: 18 5: median value: 127 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 
5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestWithLargeDataPerChannelQuantization: PASS 5: number of matrix entries: 550 5: median value: 7 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestMultithreadedPerChannelQuantization: PASS 5: number of matrix entries: 1280 5: median value: 0 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: All tests passed. 5/5 Test #5: test_gemmlowp .................... 
Passed 105.64 sec
100% tests passed, 0 tests failed out of 5
Total Test time (real) = 105.65 sec
make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu'
create-stamp debian/debhelper-build-stamp
dh_prep -O-Scmake
dh_auto_install --destdir=debian/libgemmlowp-dev/ -O-Scmake
cd obj-x86_64-linux-gnu && make -j42 install DESTDIR=/build/reproducible-path/gemmlowp-0.0\~git20211220.e844ffd/debian/libgemmlowp-dev AM_UPDATE_INFO_DIR=no "INSTALL=install --strip-program=true"
make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu'
/usr/bin/cmake -S"/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" -B"/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" --check-build-system CMakeFiles/Makefile.cmake 0
make -f CMakeFiles/Makefile2 preinstall
make[2]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu'
make[2]: Nothing to be done for 'preinstall'.
make[2]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu'
Install the project...
/usr/bin/cmake -P cmake_install.cmake -- Install configuration: "None" -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/eight_bit_int_gemm/eight_bit_int_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/base.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_common.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_gemv.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_operations_common.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_single_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_common.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_transform.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels_arm_32.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels_arm_64.h -- 
Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/single_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/single_thread_transform.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams_arm_32.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams_arm_64.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels_arm_32.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels_arm_64.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/bit_depth.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/gemmlowp.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/map.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/output_stages.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/instrumentation.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/profiler.h -- Installing: 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/pthread_everywhere.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/allocator.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/block_params.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/common.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/compute.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/detect_platform.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/dispatch_gemm_shape.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_avx.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_default.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_msa.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_neon.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_reference.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_sse.h -- Installing: 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/multi_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_avx.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_msa.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_neon.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_sse.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_avx.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_msa.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_neon.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_sse.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/platform.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_common_neon_sse.h -- Installing: 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_msa.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_neon.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_sse.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/single_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/unpack.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_avx.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_msa.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_neon.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_sse.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_wasmsimd.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/lib/x86_64-linux-gnu/libeight_bit_int_gemm.a -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/lib/x86_64-linux-gnu/cmake/gemmlowp/gemmlowp-config.cmake -- Installing: 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/lib/x86_64-linux-gnu/cmake/gemmlowp/gemmlowp-config-none.cmake
make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu'
dh_install -O-Scmake
debian/rules override_dh_installdocs
make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd'
mkdir -p debian/libgemmlowp-dev/usr/share/doc/libgemmlowp-dev/meta/
install meta/README debian/libgemmlowp-dev/usr/share/doc/libgemmlowp-dev/meta/
dh_installdocs
make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd'
dh_installchangelogs -O-Scmake
dh_installexamples -O-Scmake
dh_installinit -O-Scmake
dh_perl -O-Scmake
dh_link -O-Scmake
dh_strip_nondeterminism -O-Scmake
dh_compress -O-Scmake
dh_fixperms -O-Scmake
dh_missing -O-Scmake
dh_dwz -a -O-Scmake
dh_strip -a -O-Scmake
dh_makeshlibs -a -O-Scmake
dh_shlibdeps -a -O-Scmake
dh_installdeb -O-Scmake
dh_gencontrol -O-Scmake
dh_md5sums -O-Scmake
dh_builddeb -O-Scmake
dpkg-deb: building package 'libgemmlowp-dev' in '../libgemmlowp-dev_0.0~git20211220.e844ffd-1_amd64.deb'.
dpkg-genbuildinfo --build=binary -O../gemmlowp_0.0~git20211220.e844ffd-1_amd64.buildinfo
dpkg-genchanges --build=binary -O../gemmlowp_0.0~git20211220.e844ffd-1_amd64.changes
dpkg-genchanges: info: binary-only upload (no source code included)
dpkg-source --after-build .
dpkg-buildpackage: info: binary-only upload (no source included)
dpkg-genchanges: info: including full source code in upload
I: copying local configuration
I: unmounting dev/ptmx filesystem
I: unmounting dev/pts filesystem
I: unmounting dev/shm filesystem
I: unmounting proc filesystem
I: unmounting sys filesystem
I: cleaning the build env
I: removing directory /srv/workspace/pbuilder/3713724 and its subdirectories
I: Current time: Wed Apr 15 13:35:06 -12 2026
I: pbuilder-time-stamp: 1776303306