>>14315879
It's funny how many new hot-shot comp-sci people come to intern at our company and go
>oh yeah I built some models with that data and no matter how I tweaked the model it sucks, the data is useless
>wait what do you MEAN the data has errors? I looked, there are no NAs!
ML isn't about knowing how to model ideal data, a toddler could run machine learning on a good dataset.
It's about domain knowledge of datasets: what to expect from datasets in your field, and how to wrestle data clean (e.g., I just KNOW that if a collaborator sends us an SDF/SMILES file, there's a 50/50 chance they accidentally covalently bonded all the salts to the molecules, and it's probably full of mixtures and valence issues).
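To make that concrete: not my actual pipeline, just a minimal sketch of the triage I mean, using RDKit (the file name is a placeholder). Note none of these errors would ever show up as an NA:

```python
from rdkit import Chem
from rdkit.Chem.SaltRemover import SaltRemover

remover = SaltRemover()  # RDKit's default salt definitions

# Triage a collaborator's SDF before it gets anywhere near a model
# ("collaborator.sdf" is a placeholder path).
for i, mol in enumerate(Chem.SDMolSupplier("collaborator.sdf")):
    if mol is None:
        # RDKit sanitization failed -- usually a valence or parse error
        print(f"record {i}: failed sanitization")
        continue
    frags = Chem.GetMolFrags(mol)
    if len(frags) > 1:
        # disconnected fragments: a mixture, or a salt nobody stripped
        print(f"record {i}: {len(frags)} fragments: {Chem.MolToSmiles(mol)}")
        mol = remover.StripMol(mol)
    # NB: a salt that got covalently bonded to the parent parses as ONE
    # fragment, so this check never fires on it -- which is exactly why
    # you need eyes and domain knowledge, not just a script
```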
The modeling is a single button click for me that runs all of sklearn's algorithms + xgboost and a dozen or so pytorch models through an automatic nested 5-fold CV hyperparameter search. That took a week to set up and runs in a few hours on our biggest dataset, a few minutes on the smaller ones. There's literally no need to think about the models themselves, just grab the best ones (in our domain it's almost always SVC/random forest, probably because 99% of our feature vector is a sparse bit vector, and I'm talking <2% set bits on average across 1000+ features). All of the focus is on how you treat the data, and what data you CAN get. It's no secret that the only real correlate of model improvement is whether someone knows their domain; scramble datasets between experts in different fields and they will always build worse models than they do on the data they know intimately.
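For the curious, the skeleton of that button is nothing exotic. A sketch with sklearn/scipy, with made-up shapes and a truncated candidate list (my real grids and estimators differ):

```python
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Fake data shaped like ours: 1024-bit fingerprints at ~2% density
X = sparse_random(500, 1024, density=0.02, format="csr", random_state=0)
X.data[:] = 1.0  # binarize -> sparse bit vectors
y = np.random.default_rng(0).integers(0, 2, size=500)

candidates = {
    "svc": (SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", "auto"]}),
    "rf": (RandomForestClassifier(), {"n_estimators": [100, 500]}),
    # ...the rest of sklearn + xgboost get listed the same way
}

for name, (est, grid) in candidates.items():
    inner = GridSearchCV(est, grid, cv=StratifiedKFold(5))        # inner loop: hyperparameter search
    scores = cross_val_score(inner, X, y, cv=StratifiedKFold(5))  # outer loop: unbiased estimate
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Then you just sort by the outer-loop scores and grab the winner; the nesting is only there so the hyperparameter search can't flatter itself.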
It's easy to build models. It's easy to build good models if you have good data.
The real challenge is shit data and shit datasets.
It's funny how many of the companies we contract with have an army of "ML experts", yet I somehow build a better model than their team of 15 because I, you know, know what to do with data.
That said, it's an easy as fuck job.