>>13652572
Really, we could use any measure of dispersion - a nonnegative real number that is zero when all the data are identical and increases as the data become more diverse - to quantify how much our data "varies".
There are tons of measures of dispersion:
Standard deviation
Interquartile range
Range
Mean absolute difference (Gini mean absolute difference)
Median absolute deviation
Average absolute deviation (or simply average deviation)
Distance standard deviation
Coefficient of variation
Quartile coefficient of dispersion
Relative mean difference, equal to twice the Gini coefficient
Entropy: While the entropy of a discrete variable is location-invariant and scale-independent, and therefore not a measure of dispersion in the above sense, the entropy of a continuous variable is location-invariant and additive in scale: if H_z is the entropy of a continuous variable z and z = ax + b, then H_z = H_x + log|a|.
Variance
Variance-to-mean ratio
Berger–Parker index
Brillouin index of diversity
Hill's diversity numbers
Margalef's index
Menhinick's index
Q statistic
Shannon–Wiener index
Rényi entropy
McIntosh's D and E
Fisher's alpha
Strong's index
Simpson's E
Smith & Wilson's indices
Heip's index
Camargo's index
Smith and Wilson's B
Nee, Harvey, and Cotgreave's index
Bulla's E
Horn's information theory index
Rarefaction index
Caswell's V
Lloyd & Ghelardi's index
Average taxonomic distinctness index
Index of qualitative variation
and you can come up with infinitely many more of them.
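To make the point concrete, here is a minimal sketch (toy data, made-up numbers) computing a handful of the measures listed above on the same dataset - note that they all return different values, yet each is a perfectly valid "dispersion":

```python
import statistics as st

# Toy dataset, chosen only for illustration
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(data)
mean = sum(data) / n

# Population variance and standard deviation
var = sum((x - mean) ** 2 for x in data) / n
std = var ** 0.5

# Range
rng = max(data) - min(data)

# Average absolute deviation (from the mean)
aad = sum(abs(x - mean) for x in data) / n

# Median absolute deviation (from the median)
med = st.median(data)
mad = st.median(abs(x - med) for x in data)

# Coefficient of variation (scale-free)
cv = std / mean

print(std, rng, aad, mad, cv)  # 2.0 7.0 1.5 0.5 0.4
```

All five numbers are zero when the data are constant and grow as the data spread out, so each qualifies under the definition above.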
So why do we use variance as standardly calculated? It's pretty much because Var(X+Y) = Var(X) + Var(Y) when X and Y are uncorrelated. It just makes the math easier; I don't think there is any other reason.