/sci/ - Science & Math » Thread #13249929

2MiB, 1392x868, MLP-Mixer-Weights.png

MLP Mixer is weird

Anonymous Tue 08 Jun 2021 00:41:38 No.13249929

Quoted By: >>13250086

I'm messing around with MLP Mixer because I think everyone else is already looking at transformers for image stuff anyway. I want to try to improve the architecture a bit. I'm just playing with CIFAR10 and I'm not using a pretrained model, and after throwing the book at it to avoid overfitting (RandAugment, AdamW, dropout) I was finally able to get it up to about 75% accuracy in an hour. Is that the highest it's likely to go without pretraining, or should I fuck around with the parameters and try to get it higher before I start making big changes? This is the first network where I've really felt like I understand it well enough to try improving it (because it's so simple), so I don't know how worthwhile it is to find a good baseline for the base model's hyperparameters before I start screwing with it.

Screenshot at 19-42-39.png, 232KiB, 642x555

Anonymous Tue 08 Jun 2021 00:44:04 No.13249939

Pic related is what I had to do to the data to get it to stop overfitting around 60%, by the way. The authors weren't kidding when they said it was prone to that.

Anonymous Tue 08 Jun 2021 00:54:04 No.13249973

Quoted By: >>13249974 >>13250035 >>13250061 >>13250287

I have no idea what you're talking about but Rainbow Dash is best pony.

Anonymous Tue 08 Jun 2021 00:54:55 No.13249974

Quoted By: >>13250004 >>13250035

>>13249973
I hope you get banned worthless faggot

Anonymous Tue 08 Jun 2021 01:02:54 No.13250004

Quoted By: >>13250035

>>13249974
Rarity fan spotted

Anonymous Tue 08 Jun 2021 01:11:28 No.13250035

>>13249974
Next time just use "multi layer perceptron mixer"
>>13249973
>>13250004
Mentally ill subhuman degenerate faggot nigger

Lee bait.jpg, 126KiB, 1024x768

Anonymous Tue 08 Jun 2021 01:16:35 No.13250051

>MLP
Does Leebot roam around these parts?

Screenshot at 20-17-07.png, 33KiB, 315x514

Anonymous Tue 08 Jun 2021 01:19:03 No.13250061

Quoted By: >>13250258

I'm waiting on some results right now, so in case anyone cares, here's the gist of my first plan for an improvement: in the original paper, patches of only one fixed size were used. There have been some improvements over in ViT-based models lately where people incorporated different patch sizes, I think, so I'm trying that.

With CIFAR, the only valid patch sizes are 1, 2, 4, 8, 16, and 32, since the images are 32x32. At 1 and 32, it seems to me like it becomes a bit pointless, so I'm ignoring those for now. I've adjusted the model so that it takes multiple (valid) patch sizes, and has an MLP with 2 hidden layers for each size. The output of each of those is passed into the mixer layers, and from there everything's the same.
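
For anyone who wants to check the patch-size arithmetic, here's a rough pure-Python sketch. The function names (`valid_patch_sizes`, `patch_stats`) are made up for illustration and aren't from the paper or any library; the only facts used are that CIFAR images are 32x32 with 3 channels and that patches don't overlap.

```python
IMAGE_SIZE = 32  # CIFAR-10 images are 32x32 pixels
CHANNELS = 3     # RGB


def valid_patch_sizes(image_size):
    """Patch sizes that tile the image exactly (divisors of the side length)."""
    return [p for p in range(1, image_size + 1) if image_size % p == 0]


def patch_stats(image_size, patch_size, channels=CHANNELS):
    """Return (number of patches, values in one flattened patch)."""
    per_side = image_size // patch_size
    return per_side * per_side, patch_size * patch_size * channels


if __name__ == "__main__":
    for p in valid_patch_sizes(IMAGE_SIZE):
        n, d = patch_stats(IMAGE_SIZE, p)
        print(f"patch size {p:2d}: {n:4d} patches, {d:4d} values each")
```

The two extremes show why 1 and 32 look pointless: patch size 1 gives 1024 patches of only 3 values each, and patch size 32 gives a single patch of 3072 values, which leaves nothing to mix across patches.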

I'm finding that larger patch sizes seem to fuck things up in a bad way. patch_sizes [4,8] is what I'm running now, and it's training much, much faster than [8,16] did. However, I think it may also be overfitting - it reached 70% accuracy in fewer than half the epochs it took the larger patches, but it's stalled out there. I'm considering whether I want to try doing it more like CNNs, rather than just having each segment take patches from the original image, or keep testing out different patch size combinations and write some visualizer for the MLPs.
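
A minimal PyTorch sketch of the multi-patch-size front end, in case it helps picture it. This is not the actual code from the run above. The class name `MultiPatchEmbed`, the widths `hidden_dim` and `embed_dim`, the GELU activations, and especially the choice to concatenate the per-size outputs along the patch axis are all assumptions made for illustration.

```python
import torch
from torch import nn


class MultiPatchEmbed(nn.Module):
    """Hypothetical front end: one small MLP per patch size, outputs concatenated."""

    def __init__(self, image_size=32, channels=3, patch_sizes=(4, 8),
                 hidden_dim=256, embed_dim=128):
        super().__init__()
        for p in patch_sizes:
            if image_size % p != 0:
                raise ValueError(f"patch size {p} does not tile {image_size}")
        # nn.Unfold with stride == kernel_size extracts non-overlapping patches.
        self.unfolds = nn.ModuleList(
            [nn.Unfold(kernel_size=p, stride=p) for p in patch_sizes]
        )
        # One MLP with two hidden layers per patch size (widths are assumed).
        self.mlps = nn.ModuleList([
            nn.Sequential(
                nn.Linear(channels * p * p, hidden_dim), nn.GELU(),
                nn.Linear(hidden_dim, hidden_dim), nn.GELU(),
                nn.Linear(hidden_dim, embed_dim),
            )
            for p in patch_sizes
        ])
        # Total patch count the mixer layers would see after concatenation.
        self.num_tokens = sum((image_size // p) ** 2 for p in patch_sizes)

    def forward(self, x):
        tokens = []
        for unfold, mlp in zip(self.unfolds, self.mlps):
            # unfold: (B, C*p*p, N) -> transpose to (B, N, C*p*p)
            patches = unfold(x).transpose(1, 2)
            tokens.append(mlp(patches))
        # Assumption: stack all patch sizes along the patch axis.
        return torch.cat(tokens, dim=1)


if __name__ == "__main__":
    batch = torch.randn(2, 3, 32, 32)
    for sizes in [(4, 8), (8, 16)]:
        out = MultiPatchEmbed(patch_sizes=sizes)(batch)
        print(sizes, tuple(out.shape))
```

Under that concatenation assumption, [4,8] hands the mixer 64 + 16 = 80 patches while [8,16] hands it only 16 + 4 = 20, so the two configurations differ a lot in how much spatial detail the mixer layers get to work with.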

>>13249973
I don't know anything about it. I thought everyone on 4chan liked the purple one, though.

Anonymous Tue 08 Jun 2021 01:28:06 No.13250086

Quoted By: >>13250167

>>13249929
Hey man, I just wanted to ask, how did you get started fucking around with CIFAR10? Some anon recommended those datasets to me a while ago in some threads, but I don't even know where to start.
Thank you.

Anonymous Tue 08 Jun 2021 01:54:50 No.13250167

Quoted By: >>13250326

>>13250086
Just follow PyTorch's blitz tutorials; they'll get you set up with CIFAR and a basic CNN classifier. https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

Anonymous Tue 08 Jun 2021 02:26:28 No.13250258

>>13250061
The [4,8] model was definitely overfitting by the end, but it actually got to 80% before doing so. Adding more parameters might help since my model's only a couple GB, but I'm going to see what happens with [2,4,8] and [4,8,16] first. Also, I'm probably going to stop bumping this thread and just come back and make an ML general at some point (on here or on /g/) so I have some place to talk about this stuff.

Anonymous Tue 08 Jun 2021 02:33:50 No.13250287

>>13249973
Good taste, but Trixie is the high IQ answer.

Anonymous Tue 08 Jun 2021 02:37:30 No.13250303

Quoted By: >>13250382

FNet >>>>>>>> MLP Mixer

Anonymous Tue 08 Jun 2021 02:38:46 No.13250311

Quoted By: >>13250358 >>13250839

Also where are you fags getting access to GPUs right now? Both Google and Amazon wouldn't let me use them because I'm a non-corporate customer.

Anonymous Tue 08 Jun 2021 02:45:20 No.13250326

Quoted By: >>13250335 >>13250388

>>13250167
>CNN
I have no idea what you're talking about but Imma go ahead and say fuck off you lame-stream media news kike shill.

Anonymous Tue 08 Jun 2021 02:47:02 No.13250335

>>13250326
retard

Anonymous Tue 08 Jun 2021 02:51:37 No.13250358

Quoted By: >>13250432

>>13250311
Bought a prebuilt with one. Can't do any serious training on it because my fucking AC can't keep up, but it's enough to test out some ideas.

Anonymous Tue 08 Jun 2021 02:57:30 No.13250382

>>13250303
FNet is very cool too, but I'm still too new to this to focus on more than one at once, and MLPM is even simpler. I'll probably take a shot at it next though. Does this GitHub repo look like a decent base to start with? https://github.com/rishikksh20/FNet-pytorch

Anonymous Tue 08 Jun 2021 02:58:57 No.13250388

>>13250326
retard, not funny

Anonymous Tue 08 Jun 2021 03:04:16 No.13250405

Quoted By: >>13250430 >>13250432

>smelly pajeets ITT

Anonymous Tue 08 Jun 2021 03:14:29 No.13250430

>>13250405
I'm very white. I'll never understand the sentiment that machine learning isn't for white people.

Anonymous Tue 08 Jun 2021 03:14:58 No.13250432

Quoted By: >>13250468

>>13250358
What sort of setup? How much was it?

>>13250405
100% white

Anonymous Tue 08 Jun 2021 03:25:48 No.13250468

Quoted By: >>13250522

>>13250432
HP prebuilts with 3090s go for about $3000, but you can get more info from /g/ on that.

Anonymous Tue 08 Jun 2021 03:39:17 No.13250522

>>13250468
Oh yeah, I was kind of hoping for a quad GPU setup. I ended up getting an ultrabook instead, thinking I could use cloud GPUs, only for them to deny my quota requests.

Anonymous Tue 08 Jun 2021 05:11:51 No.13250839

Quoted By: >>13250864

>>13250311
Try Lambda Cloud: reasonable prices, and it works very well.

Anonymous Tue 08 Jun 2021 05:19:59 No.13250864

>>13250839
Thank you!!
