Communitywide Database Designs for Tracking Innovation Impact: COMETS, STARS and Nanobank
Data availability is arguably the greatest impediment to advancing the science of science and innovation policy and practice (SciSIPP). This paper describes the contents, methodology, and use of three databases: the public online COMETS (Connecting Outcome Measures in Entrepreneurship Technology and Science) database, which spans all sciences, technologies, and high-tech industries; its parent COMETSandSTARS database, which adds more data at the organization and individual scientist-inventor-entrepreneur levels but is restricted by vendor licenses to on-site use at NBER and/or UCLA; and their prototype, Nanobank, which covers only nano-scale sciences and technologies. Some or all of these databases include or will include: US patents (granted patents and applications); NIH, NSF, SBIR, and STTR grants; Thomson Reuters Web of Knowledge; ISI Highly Cited; US doctoral dissertations; IPEDS/HEGIS universities; all firms and other organizations that have ever published in ISI-listed journals (beginning in 1981), been assigned US patents (from 1975), or been listed on a covered grant; and additional nanotechnology firms identified by web search. Ticker/CUSIP codes enable linking public firms to the major databases covering them. A major matching/disambiguation effort assigns a unique identifier to each organization or individual so that their appearances are linked within and across the constituent legacy databases. Extensive geographic coding enables analysis at the country, region, state, county, or city level. The databases provide very flexible sources of data for serious research on many issues in the study of organizations in innovation systems, the development and spread of knowledge, and the economics of science. By enabling the study of these topics, among others, COMETS contributes substantially to the science of science and technology.
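The idea of linking an organization's appearances across constituent sources through a shared identifier can be illustrated with a minimal sketch. This is not the matching procedure used for COMETS, which the abstract does not specify; the normalization rule, the suffix list, the field names (`source`, `org_name`, `org_id`), and the sample records are all hypothetical and chosen only for illustration.

```python
import re
from itertools import count

# Hypothetical legal-form suffixes to ignore when comparing names (assumption).
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc"}


def normalize_name(name: str) -> str:
    """Lowercase, strip punctuation, and drop trailing legal-form suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def assign_org_ids(records):
    """Return copies of the records with an 'org_id' shared by matching names."""
    ids = {}            # normalized name -> identifier
    counter = count(1)  # identifiers are issued in order of first appearance
    linked = []
    for rec in records:
        key = normalize_name(rec["org_name"])
        if key not in ids:
            ids[key] = next(counter)
        linked.append({**rec, "org_id": ids[key]})
    return linked


# Made-up records standing in for rows from different legacy sources.
records = [
    {"source": "patents", "org_name": "Acme Nanotech, Inc."},
    {"source": "publications", "org_name": "ACME NANOTECH INC"},
    {"source": "grants", "org_name": "Example University"},
]
linked = assign_org_ids(records)
```

Exact matching on a normalized name is the simplest possible rule; a real disambiguation effort would presumably also need to handle spelling variants, name changes, and distinct organizations that share a name.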