Disclosure Sentiment: Machine Learning vs Dictionary Methods
Management Science, forthcoming
49 Pages. Posted: 14 May 2021
Date Written: May 1, 2021
We compare the ability of dictionary-based and machine-learning methods to capture disclosure sentiment at 10-K filing and conference-call dates. Like Loughran and McDonald (2011), we use returns to assess sentiment. We find that measures based on machine learning offer a significant improvement in explanatory power over dictionary-based measures. Specifically, machine-learning measures explain returns at 10-K filing dates, whereas measures based on the Loughran and McDonald dictionary explain returns at 10-K filing dates only during the time period of their study. Moreover, at conference-call dates, machine-learning methods improve on the Loughran and McDonald dictionary method by a greater magnitude than that dictionary improves on the Harvard Psychosociological dictionary. We further find that the random forest regression tree method captures disclosure sentiment better than alternative algorithms, simplifying the application of the machine-learning approach. Overall, our results suggest that machine-learning methods offer an easily implementable, more powerful, and more reliable measure of disclosure sentiment than dictionary-based methods.
Keywords: Textual Analysis, Machine Learning, Disclosure, Conference Calls
JEL Classification: M40, M41
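For readers unfamiliar with the two families of methods compared in the abstract, the following is a minimal sketch of the difference in inputs, not the paper's implementation. It assumes a dictionary-based measure is a net tone score (positive minus negative word counts, scaled by document length), and that a machine-learning measure instead starts from raw word counts, which a model such as a random forest would relate to filing-date returns. The word lists, the function names `dictionary_tone` and `word_counts`, and the sample sentence are all hypothetical; the toy lists are not the Loughran and McDonald (2011) dictionary.

```python
import re
from collections import Counter

# Hypothetical toy word lists for illustration only; these are NOT the
# actual Loughran and McDonald (2011) word lists.
POSITIVE = {"improve", "growth", "strong", "gain"}
NEGATIVE = {"loss", "decline", "impairment", "litigation"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def dictionary_tone(text):
    """Net tone: (positive count - negative count) / total tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def word_counts(text, vocabulary):
    """Count vector over a fixed vocabulary; the kind of input a
    machine-learning model would be trained on against returns."""
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocabulary]


# Hypothetical sentence standing in for a disclosure.
doc = "Strong growth offset the impairment loss and pending litigation."
tone = dictionary_tone(doc)
features = word_counts(doc, ["growth", "loss", "litigation", "dividend"])
```

In this sketch the dictionary score depends entirely on fixed word lists chosen in advance, whereas the count vector leaves the weighting of each word to be learned from returns data, which is the contrast the abstract draws.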