Predicting Innovative Firms Using Web Mining and Deep Learning

12 Pages. Posted: 23 Jan 2019. Last revised: 7 Feb 2020

Jan Kinne

Centre for European Economic Research (ZEW)

David Lenz

Justus-Liebig-University Giessen

Date Written: January 1, 2019

Abstract

Innovation is considered a main driver of economic growth. Promoting the development of innovation through science, technology and innovation (STI) policies requires accurate indicators of innovation. Traditional indicators often lack coverage, granularity, and timeliness, and they involve high data collection costs, especially when collected at a large scale. In this paper, we propose a novel approach to creating firm-level innovation indicators at the scale of millions of firms. We use traditional firm-level innovation indicators from the questionnaire-based Community Innovation Survey (CIS) to train an artificial neural network classification model on labelled (innovative/non-innovative) web texts of surveyed firms. Subsequently, we apply this classification model to the web texts of hundreds of thousands of firms in Germany to predict their innovation status. Our results show that this approach produces credible predictions and has the potential to be a valuable and highly cost-efficient addition to the existing set of innovation indicators, especially due to its coverage and regional granularity. The predicted firm-level probabilities can also be interpreted directly as a continuous measure of innovativeness, which offers additional advantages over traditional binary innovation indicators.
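
The workflow described in the abstract (train a classifier on labelled web texts, then read its predicted probabilities as a continuous innovativeness score) can be illustrated with a minimal sketch. This is not the authors' model: it swaps the artificial neural network for a plain bag-of-words logistic regression written with the Python standard library. The six training texts, their labels, and the names `TRAIN`, `train`, and `predict_proba` are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy data standing in for labelled web texts of surveyed firms:
# (text, label) with 1 = innovative, 0 = non-innovative.
TRAIN = [
    ("we develop new patented sensor technology and research prototypes", 1),
    ("our research lab launches novel software products every year", 1),
    ("innovative machine learning platform developed by our engineers", 1),
    ("family bakery offering fresh bread and cakes daily", 0),
    ("local plumbing repairs and heating maintenance service", 0),
    ("traditional restaurant with regional dishes and daily lunch", 0),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def featurize(text, vocab):
    """Turn a text into a bag-of-words count vector over a fixed vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=500, lr=0.1):
    """Fit logistic regression with stochastic gradient descent."""
    vocab = sorted({tok for text, _ in samples for tok in tokenize(text)})
    features = [featurize(text, vocab) for text, _ in samples]
    labels = [label for _, label in samples]
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log-loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return vocab, weights, bias


def predict_proba(text, model):
    """Return the predicted probability that a text belongs to class 1."""
    vocab, weights, bias = model
    x = featurize(text, vocab)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


MODEL = train(TRAIN)

if __name__ == "__main__":
    # Apply the trained model to unseen (again hypothetical) web texts.
    for text in ("new research on patented technology",
                 "fresh bread and daily lunch"):
        print(f"{predict_proba(text, MODEL):.2f}  {text}")
```

The value returned by `predict_proba` lies between 0 and 1, so it can be thresholded to obtain a binary innovative/non-innovative label or kept as a continuous score, which is the distinction the abstract draws against traditional binary indicators.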

Keywords: Web Mining, Web Scraping, R&D, R&I, STI, Innovation, Indicators, Text Mining, Natural Language Processing, NLP, Deep Learning

JEL Classification: O30, C81, C83

Suggested Citation

Kinne, Jan and Lenz, David, Predicting Innovative Firms Using Web Mining and Deep Learning (January 1, 2019). ZEW - Centre for European Economic Research Discussion Paper No. 19-01, Available at SSRN: https://ssrn.com/abstract=3321060 or http://dx.doi.org/10.2139/ssrn.3321060

Jan Kinne (Contact Author)

Centre for European Economic Research (ZEW)

P.O. Box 10 34 43
L 7,1
D-68034 Mannheim, 68034
Germany

David Lenz

Justus-Liebig-University Giessen

Licher Str. 64
Giessen, 35394
Germany

HOME PAGE: https://www.uni-giessen.de/fbz/fb02/fb/professuren/vwl/winker/kontakt/mitarbeiter/lenz
