No.12566378
Any Data Science/ML fags in here?

I'm working on the house prices project on Kaggle right now. Been searching the entire internet and can't find a clear answer to my question.

Which is: when building a regression model, do ordinal and discrete variables need to be normally distributed?

A few variables in question are:

OverallQual -> Overall quality of house. Rated on a scale of 1-10.

HalfBath -> Number of half bathrooms.

When I check for skew, these variables come out somewhat skewed.
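
For concreteness, a minimal sketch of the kind of skew check meant here, assuming pandas. The values are made up for illustration and are not the actual Kaggle data; the column names are assumed to match the dataset.

```python
import pandas as pd

# Made-up values for illustration only, NOT the real house prices data.
# Column names are an assumption about how the dataset labels these features.
df = pd.DataFrame({
    "OverallQual": [5, 6, 7, 5, 8, 6, 4, 9, 5, 7],
    "HalfBath": [0, 0, 1, 0, 1, 0, 0, 2, 0, 1],
})

# DataFrame.skew() returns the sample skewness of each numeric column.
# Values near 0 suggest a roughly symmetric distribution;
# positive values indicate a longer right tail.
skewness = df.skew()
print(skewness)
```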

A few public notebooks log-transform these variables, while others don't. Any help would be greatly appreciated.
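
To show what the log transform in those notebooks roughly looks like, here is a small sketch. It assumes they use something like `np.log1p` (I haven't verified which exact transform each notebook uses), and the values are again made up rather than taken from the dataset.

```python
import numpy as np
import pandas as pd

# Made-up half-bathroom counts for illustration only.
half_bath = pd.Series([0, 0, 1, 0, 1, 0, 0, 2, 0, 1], name="HalfBath")

# log1p computes log(1 + x), so zero counts stay defined (log1p(0) == 0).
half_bath_log = np.log1p(half_bath)

# Compare sample skewness before and after the transform.
skew_before = half_bath.skew()
skew_after = half_bath_log.skew()
print(f"skew before: {skew_before:.3f}, after log1p: {skew_after:.3f}")
```

Note that a discrete column with only a few distinct values still has only a few distinct values after the transform, so this only rescales the gaps between them.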

Also, if it matters, I'm going to be using an ensemble of Lasso, Ridge, XGBoost, LGBM, RandomForestRegressor, and maybe SVR too, depending on how well it performs.