>>12493564
First, transformers are indeed used. The reasons for this are:
- They are mapping protein sequence to structure.
- Their assumption (which is in fact true) is that the protein sequence (how the amino acids are arranged) contains evolutionary and structural information.
- For this mapping they have to associate the position of an amino acid with other ones at different places in the sequence.
- This kind of association is a classic ability of transformers, which is also why they are so useful in NLP.
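To make the "associate one position with all the others" point concrete, here is a rough sketch of plain scaled dot-product self-attention over a toy amino acid sequence. This is not DeepMind's actual architecture, just the generic mechanism. The sequence, the random embeddings, the `embed` and `self_attention` helpers, and the identity Q/K/V projections are all made up for illustration; a real model learns those projections.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def embed(sequence, dim=8, seed=0):
    """Toy embedding: random residue-type vector plus a sinusoidal position vector."""
    rng = random.Random(seed)
    # Hypothetical lookup table; a trained model would learn these vectors.
    table = {aa: [rng.gauss(0, 1) for _ in range(dim)] for aa in AMINO_ACIDS}
    vectors = []
    for pos, aa in enumerate(sequence):
        pos_vec = [
            math.sin(pos / 10000 ** (i / dim)) if i % 2 == 0
            else math.cos(pos / 10000 ** ((i - 1) / dim))
            for i in range(dim)
        ]
        # (type, position) information ends up in one vector per residue
        vectors.append([t + p for t, p in zip(table[aa], pos_vec)])
    return vectors


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity Q/K/V (an assumption)."""
    dim = len(x[0])
    weights = []
    for q in x:
        # Every residue is scored against every other residue in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in x]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[d] for w, v in zip(row, x)) for d in range(dim)]
        for row in weights
    ]
    return weights, outputs


sequence = "MKTAYIAKQR"  # made-up sequence, for illustration only
weights, outputs = self_attention(embed(sequence))
# weights[i][j] is how much residue i attends to residue j
print([round(w, 2) for w in weights[0]])
```

The thing to notice is that `weights` is a full length-by-length matrix, so residue 1 can attend to residue 10 as easily as to residue 2. That is what lets the model pick up on residues that are far apart in the sequence but could end up close in 3D.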
Other points
- I doubt they used a graph conv net, since mapping the (type, position) associations of the amino acids in the sequence to the 3D structure should suffice for training.
PROS
- Is it useful? Yes! First, because crystallography is a very expensive way to obtain 3D structures of proteins.
- Also useful because you can get structures for unknown sequences (similar to the training set); however, crystallography would still be needed to confirm the structure.
- The model can be iteratively improved by training it on more sequence --> 3D structure pairs.
CONS
- Won't be very reliable with exotic sequences (ones unlike anything in the training set).
- Disregards the process of folding --> it doesn't tell you why one structure is favored over another.
- Maybe something else that isn't coming to mind right now.
OVERALL
- Pretty good stuff, whether academia likes it or not. As a person who has had extensive experience with academia, I can say simple solutions are sometimes not fondly looked upon.
- I fear that funding for science projects might be taken over by companies like Google, as such models are useful for quick profits, instead of going to work that takes the time to obtain concrete science.
- This and the GPT-3 model are really beefy bois and need 100-200 GPUs to train. That is not easily possible in academia.
Honestly, if some grad student did this and showed it to their advisor, quite a few advisors would have rejected it completely, saying "this is not science" or "model fitting is not a solution to the protein folding problem".