There are a number of ways you can use the fast Walsh-Hadamard Transform (WHT) for sparse systems.

https://en.wikipedia.org/wiki/Hadamard_transform

Since a change in any single input element alters every output element one way or another, it provides full connectivity at a very low cost.

You can use it to provide full initial connectivity before a sparse net. A slight problem with that is the transform produces a spectrum of the input data rather than a neutral mixing of it. This can be dealt with by applying a fixed, randomly chosen pattern of sign flips to the input data before calculating the transform. That results in an effectively random projection. Sub-random projections are also possible.
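Here is a minimal sketch of the sign-flip idea, not anyone's production code: a fixed random ±1 diagonal applied before the transform turns the WHT into an effectively random projection. It uses SciPy's dense Hadamard matrix purely for clarity; the fast n log2(n) butterfly further down does the same job far more cheaply.

import numpy as np
from scipy.linalg import hadamard

n = 8                                    # transform length, must be a power of 2
rng = np.random.default_rng(0)

H = hadamard(n) / np.sqrt(n)             # orthonormal Hadamard matrix
signs = rng.choice([-1.0, 1.0], size=n)  # fixed, randomly chosen sign flips

x = rng.normal(size=n)                   # example input vector
y = H @ (signs * x)                      # sign flips, then the transform

# Both the sign flips and H are orthogonal, so lengths are preserved and the
# result behaves like a random projection usable as cheap full connectivity
# in front of a sparse net.
print(np.linalg.norm(x), np.linalg.norm(y))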

https://www.kdnuggets.com/2021/07/wht-simpler-fast-fourier-transform-fft.html

You can also use the fast WHT directly to create neural networks. You can view the WHT as a fixed dense layer of weighted sums with a calculation cost of n log2(n) add-subtract operations. That is far cheaper than a conventional dense weighted-sum layer, which costs n squared fused multiply-add operations.
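A sketch of the fast WHT butterfly, assuming a power-of-2 input length. It gives the same result as multiplying by the dense Hadamard matrix, which is the "fixed dense layer" view, but with n log2(n) adds and subtracts instead of n squared multiply-adds.

import numpy as np
from scipy.linalg import hadamard

def fwht(x):
    """Unnormalised fast Walsh-Hadamard transform."""
    x = np.asarray(x, dtype=float).copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b   # butterfly: one add, one subtract
        h *= 2
    return x

x = np.random.default_rng(1).normal(size=8)
# Matches the dense Hadamard matrix product, i.e. a fixed dense weighted-sum layer.
assert np.allclose(fwht(x), hadamard(8) @ x)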

There is a big problem though…there is nothing to adjust. If you create a neural network using the WHT for the weighted sums, you end up with a frozen neural network that does something. Who knows what, but something.

The solution is to use individually adjustable parametric activation functions.

Then you can have a complete neural network layer for n log2(n) add-subtract operations and n multiplies, using 2n parameters, for example.
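A minimal sketch of one such layer, not a definitive implementation: a fast WHT followed by a per-element parametric activation with two adjustable slopes per output (one for the positive side, one for the negative side). That particular activation is just an assumption for illustration; any individually adjustable parametric activation works.

import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform (same butterfly as the sketch above)."""
    x = np.asarray(x, dtype=float).copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

class WHTLayer:
    def __init__(self, n, rng):
        # 2n adjustable parameters: one slope for the positive part and one
        # for the negative part of each of the n transform outputs.
        self.pos = rng.normal(size=n) * 0.1
        self.neg = rng.normal(size=n) * 0.1

    def forward(self, x):
        z = fwht(x)                                          # n log2(n) adds/subtracts
        return np.where(z > 0, self.pos * z, self.neg * z)   # n multiplies

rng = np.random.default_rng(2)
layer = WHTLayer(8, rng)
print(layer.forward(rng.normal(size=8)))

Stacking a few of these layers gives a trainable net whose per-layer cost stays at n log2(n) adds and n multiplies; a fixed sign-flip pattern, as described earlier, can be folded in front of each transform at no extra asymptotic cost.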

https://ai462qqq.blogspot.com/

I’m reluctantly on twitter too:

https://twitter.com/SeanOCo07854298