Parametric and Non-parametric Methods in Mass Appraisal on Poorly Developed Real Estate Markets*
Purpose: The objective of the article is to identify machine learning methods that provide the most accurate real estate appraisals for small samples, particularly on poorly developed markets. The article tests the hypothesis that machine learning methods yield more accurate appraisals than multiple regression models, taking sample size into account.

Design/Methodology/Approach: Four types of regression were employed in the study: multiple regression, ridge regression, random forest regression and k-nearest neighbours regression. A sampling scheme was proposed that makes it possible to determine the impact of training set size on the accuracy of appraisals in test sets.

Findings: The research led to several conclusions. First, the larger the training set, the more accurate the appraisals in the test set. A reduction of the training set therefore causes modelling results to deteriorate, but the deterioration is not substantial. Second, the ridge regression model proved to be the best model and thus the one most robust to a small number of observations. In addition to being the most robust, this model has the advantage of being parametric, hence allowing inference.

Practical Implications: The presented considerations are important, for instance, for valuations conducted for fiscal purposes, when it becomes necessary to determine the value of every type of real property, even those whose states occur only sporadically.

Originality/Value: The study models values determined by property appraisers, not prices, which the majority of studies model. This decision made it possible to increase the diversity of property states, so that the modelling includes not only the real properties that are most typically traded.
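The idea of measuring how training set size affects test set accuracy can be illustrated with a minimal sketch. It is not the authors' procedure. The synthetic data, the feature names (`area`, `condition`), the error measure (mean absolute percentage error), the fixed hold-out split, the penalty `alpha=1.0` and `k=5` are all assumptions made for illustration. Random forest regression is omitted to keep the sketch short and dependency-free.

```python
import math
import random
import statistics


def make_synthetic_market(n, seed=0):
    """Hypothetical data: appraised value as a noisy linear function of two features."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        area = rng.uniform(30, 120)       # assumed feature: floor area in m^2
        condition = rng.randint(1, 5)     # assumed feature: ordinal state of the property
        value = 50_000 + 3_000 * area + 20_000 * condition + rng.gauss(0, 15_000)
        rows.append(([area, condition], value))
    return rows


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ridge(train, alpha):
    """Ridge regression on standardised features; alpha=0 reduces to multiple regression."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    p = len(xs[0])
    means = [statistics.fmean(col) for col in zip(*xs)]
    sds = [statistics.pstdev(col) or 1.0 for col in zip(*xs)]
    z = [[(v - mu) / sd for v, mu, sd in zip(x, means, sds)] for x in xs]
    y_mean = statistics.fmean(ys)
    # Normal equations (Z'Z + alpha I) beta = Z'(y - mean(y)); the intercept is not penalised.
    gram = [[sum(r[i] * r[j] for r in z) + (alpha if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * (y - y_mean) for r, y in zip(z, ys)) for i in range(p)]
    beta = solve(gram, rhs)

    def predict(x):
        return y_mean + sum(b * (v - mu) / sd for b, v, mu, sd in zip(beta, x, means, sds))
    return predict


def fit_knn(train, k):
    """k-nearest neighbours regression with Euclidean distance on the raw features."""
    def predict(x):
        nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
        return statistics.fmean(y for _, y in nearest)
    return predict


def mape(model, test):
    """Mean absolute percentage error of a fitted model on a test set."""
    return statistics.fmean(abs(model(x) - y) / y for x, y in test)


def evaluate_by_training_size(data, sizes, repeats=20, seed=1):
    """Hold out a fixed test set, then fit on random training subsets of each size."""
    rng = random.Random(seed)
    test, pool = data[:100], data[100:]
    results = {}
    for n in sizes:
        errors = {"multiple": [], "ridge": [], "knn": []}
        for _ in range(repeats):
            train = rng.sample(pool, n)
            errors["multiple"].append(mape(fit_ridge(train, 0.0), test))
            errors["ridge"].append(mape(fit_ridge(train, 1.0), test))
            errors["knn"].append(mape(fit_knn(train, 5), test))
        results[n] = {name: statistics.fmean(v) for name, v in errors.items()}
    return results


if __name__ == "__main__":
    data = make_synthetic_market(400)
    for n, scores in evaluate_by_training_size(data, sizes=[20, 50, 200]).items():
        print(n, {name: round(score, 4) for name, score in scores.items()})
```

Averaging over repeated random draws at each size separates the effect of sample size from the luck of a single draw. Because the synthetic data are linear by construction, the sketch says nothing about which method performs best on real appraisal data.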