>>13364431
Nah his arguments are bullshit. For example, here's him trying to claim credit for resnets
https://people.idsia.ch/~juergen/highway-networks.html
Schmidhuber gives this definition of a "Highway Net"
>Let g, t, h denote non-linear differentiable functions. Each non-input layer of a Highway Net computes g(x)x + t(x)h(x), where x is the data from the previous layer.
He makes the claim that any model whose non-input layers have the form g(x)x + t(x)h(x) is a derivative of a Highway Net. He claims resnets are a "special case"
>If we open the gates by setting g(x)=t(x)=1 and keep them open, we obtain the so-called Residual Net or ResNet, a special case of our Highway Net
Following his own logic leads to the following:
The identity function y=x is just a special case of a Highway Net where g(x)=1, t(x)=0
Dropout is a special case of a Highway Net where t(x)=0 and g(x) is a binary mask
Any model where your layers can be written as y=t(x) is a special case of a Highway Net where g(x)=0 and h(x)=1
His arguments all boil down to basically claiming ownership over broad functional forms, which is complete horseshit
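To make the point concrete, here's a rough NumPy sketch of how each of those "special cases" falls out of the g(x)x + t(x)h(x) form just by plugging in constants. All the function names (`highway_layer`, `residual_layer`, etc.) are made up for illustration and aren't from Schmidhuber's page or any library. The mask case uses a plain binary mask like in the post, without the rescaling that real dropout implementations usually apply.

```python
import numpy as np

def highway_layer(x, g, t, h):
    """Generic form from the quote: g(x)*x + t(x)*h(x), applied elementwise."""
    return g(x) * x + t(x) * h(x)

# Constant "gates" (hypothetical helper names)
def ones(x):
    return np.ones_like(x)

def zeros(x):
    return np.zeros_like(x)

def residual_layer(x, h):
    # g = t = 1  ->  x + h(x), the resnet block
    return highway_layer(x, ones, ones, h)

def identity_layer(x):
    # g = 1, t = 0  ->  x (h is irrelevant, so any function works)
    return highway_layer(x, ones, zeros, ones)

def masked_layer(x, mask):
    # g = binary mask, t = 0  ->  mask * x (dropout-style masking, no rescaling)
    return highway_layer(x, lambda _: mask, zeros, ones)

def plain_layer(x, t):
    # g = 0, h = 1  ->  t(x), i.e. any ordinary layer
    return highway_layer(x, zeros, t, ones)

if __name__ == "__main__":
    x = np.array([1.0, -2.0, 3.0])
    print(residual_layer(x, np.tanh))
    print(identity_layer(x))
    print(masked_layer(x, np.array([1.0, 0.0, 1.0])))
    print(plain_layer(x, np.tanh))
```

Every one of these is the same one-line formula with different constants, which is why "X is a special case of my formula" doesn't say much on its own.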