Split Decisions: Practical Machine Learning for Empirical Legal Scholarship

46 Pages. Posted: 11 Dec 2020. Last revised: 6 Feb 2021.

James Ming Chen

Michigan State University - College of Law

Date Written: November 16, 2020

Abstract

Multivariable regression may be the most prevalent and useful task in social science. Empirical legal studies rely heavily on the ordinary least squares method. Conventional regression methods have attained credibility in court, but by no means do they dictate legal outcomes. Using the iconic Boston housing study as a source of price data, this Article introduces machine-learning regression methods. Although decision trees and forest ensembles lack the overt interpretability of linear regression, these methods reduce the opacity of black-box techniques by scoring the relative importance of dataset features. This Article will also address the theoretical tradeoff between bias and variance, as well as the importance of training, cross-validation, and reserving a holdout dataset for testing.
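As a rough illustration of the ideas named in the abstract, the following sketch fits a single-split regression tree (a "stump") to a training set and scores it on a reserved holdout set. It is not the Article's code and does not use the Boston housing data. The data are synthetic, and every name in it (`rooms`, `price`, `best_split`, the 80/20 split, the step at 6.5 rooms) is an assumption made for the example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def mse(values):
    """Mean squared deviation of values from their own mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def best_split(rows):
    """Return (threshold, left_mean, right_mean) minimizing weighted MSE.

    rows is a list of (x, y) pairs with a single numeric feature x.
    Candidate thresholds are midpoints between adjacent distinct x values.
    """
    xs = sorted({x for x, _ in rows})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in rows if x <= t]
        right = [y for x, y in rows if x > t]
        score = (len(left) * mse(left) + len(right) * mse(right)) / len(rows)
        if best is None or score < best[0]:
            best = (score, t, mean(left), mean(right))
    return best[1], best[2], best[3]

# Synthetic data (hypothetical): price steps up once rooms exceeds 6.5.
rng = random.Random(0)
data = []
for _ in range(200):
    rooms = rng.uniform(4.0, 9.0)
    price = (20.0 if rooms < 6.5 else 35.0) + rng.gauss(0.0, 1.0)
    data.append((rooms, price))

# Reserve the last 20% as a holdout set; fit only on the training portion.
rng.shuffle(data)
train, holdout = data[:160], data[160:]

threshold, left_mean, right_mean = best_split(train)

def predict(x):
    return left_mean if x <= threshold else right_mean

# Compare holdout error against a baseline that always predicts the training mean.
train_mean = mean([y for _, y in train])
holdout_mse = mean([(y - predict(x)) ** 2 for x, y in holdout])
baseline_mse = mean([(y - train_mean) ** 2 for _, y in holdout])

print(f"split at rooms <= {threshold:.2f}")
print(f"holdout MSE (stump): {holdout_mse:.2f}, baseline: {baseline_mse:.2f}")
```

A full decision tree repeats this split search recursively within each branch, and a forest averages many such trees. The holdout comparison is what guards against mistaking a fit to the training data for predictive accuracy.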

Suggested Citation

Chen, James Ming, Split Decisions: Practical Machine Learning for Empirical Legal Scholarship (November 16, 2020). Michigan State Law Review, 2021, Available at SSRN: https://ssrn.com/abstract=3731307 or http://dx.doi.org/10.2139/ssrn.3731307

James Ming Chen (Contact Author)

Michigan State University - College of Law

318 Law College Building
East Lansing, MI 48824-1300
United States
