Notes
-
[*]
CORE and Department of Economics, Université catholique de Louvain. Email: luc.bauwens@uclouvain.be
-
[**]
Department of Geography and Environment, London School of Economics, CEPR, and CEP, UK. Email: g.mion@lse.ac.uk
-
[***]
CORE and Department of Economics, Université catholique de Louvain, and CEPR. Email: jacques.thisse@uclouvain.be
The authors thank one referee, Kristian Behrens, Jacques Drèze, Gilles Duranton, Michel Lubrano, Gianmarco Ottaviano, and Matthew Turner for their comments. They are also grateful to Rytis Bagdziunas for his assistance in collecting and preparing the data on highly cited researchers. The usual disclaimer applies.
-
[1]
Note that 5,597 people are associated with an institution. The difference comes from those who have changed affiliation too often to be associated with a particular institution or have passed away before 1999.
-
[2]
Admittedly, the number of patents is another important scientific output of universities. Yet, we believe that publications are the main criterion used in most academic institutions to evaluate the research activities of professors and researchers.
-
[3]
Additional arguments to those developed in this paper may be found in Aghion et al. (2007).
-
[4]
Note, however, that the inverse of the index of the Pareto distribution is the standard deviation of the logarithm of the Pareto variable. So this index retains some meaning as a measure of concentration: the lower the index of the Pareto distribution, the more uneven the distribution of data.
-
[5]
The 12 other member states of the EU27 have only 7 HCRs altogether.
-
[6]
Details and sources of data are reported in the Appendix.
-
[7]
We have checked this on the publications of HCRs who do not belong to English-speaking countries, using a random sample of 10 % of them extracted from the Thomson Scientific online database. In a few countries, such as Germany, Italy and France, HCRs have a small fraction of their publications in their native language. We have found a single case (a German psychiatrist) in which the publication record was approximately half in English and half in German. In all other cases, the most cited papers are written in English.
-
[8]
Israel never was a British colony. However, the governance of Israeli universities is close to the Anglo-Saxon model. Furthermore, the Hebrew University of Jerusalem was launched before the British left the area.
-
[9]
This is a weighted average of a number of variables that measure individuals’ perceptions of the effectiveness and predictability of the judiciary and the enforcement of contracts in each country between 1997 and 1998.
-
[10]
There are two reasons why we use TOEFL data rather than share of the population speaking English. First, TOEFL data are available for a large number of countries while the share data refer to EU countries only. Second, the TOEFL provides a better proxy of the English proficiency of scientists than data referring to the whole population, because the TOEFL test is undertaken by students who plan to pursue graduate studies.
-
[11]
The imputed TOEFL scores are: Australia (226), Canada (264), Ireland (267), New Zealand (270), United Kingdom (268), United States (268). As a comparison, countries with the best English proficiency, like Denmark and the Netherlands, score around 260. Good English proficiency countries like Germany and Switzerland score around 250, while medium performance countries like France and Spain score around 240. As long as imputed scores are below 285, the Col_UK dummy is still positive and significant. The maximum achievable score of the test is 300.
-
[12]
See, e.g. Wooldridge (2002, section 19.2.2).
-
[13]
These findings are not related to the fact that we use countries for which GDP in 1913 is available. Indeed, we have estimated (3) using the same sample as in (5) and have found almost the same results as in (3).
-
[14]
We regress the log of NSc on the log of our assumed conditional mean, which is a linear function in parameters. We further instrument PCGDP using the predicted value of the log of PCGDP coming from the regression on the log of per capita GDP in 1913, the log of RD, the log of HC, the UK colony dummy, the governance quality variable, and the log of English proficiency.
-
[15]
The residuals are the differences between the observed number of researchers and their estimated value using the production function.
-
[16]
It is also worth pointing out that, contrary to a widespread opinion, non-US Anglo-Saxon countries do not necessarily have higher expenditure per student than other countries. For example, Denmark, the Netherlands and Sweden spend much more than Ireland and the UK (Aghion et al., 2007).
1 – Introduction
The title of this paper is inspired by the famous play “The Resistible Rise of Arturo Ui” written (in German) by Bertolt Brecht in 1941. In choosing this title, Brecht intended to say that the rise of Fascism in Europe was not inevitable. We have the same view of the decline of European science. Is there really such a decline? This is what this paper is about.
To support our view about the unsatisfactory state of European science, we exploit a new data set made freely accessible by Thomson Scientific on the Web site ISIHighlyCited.com. This site gives the top research professionals working in a variety of occupations by name, category, country, and institutional affiliation for 21 disciplines listed in Table 1. In a nutshell, 5,790 researchers, 1,329 institutions and 41 countries are considered. [1] For each discipline, the 250 most highly cited researchers (in short, HCRs) have been selected from 1981 to 1999 (in fact, the actual number of HCRs varies from 1981 to 1999). To build the database from which HCRs are selected, Thomson Scientific considers all the papers belonging to its 21 scientific citation indices, and which have been both published and cited during the period 1981-1999. This data set spans a sufficiently long period of time to make this sample representative of the current state of scientific research in the whole world. Furthermore, we believe that the number of citations is a good proxy of the quality of research output in that it measures the long-run impact of publications on the scientific community. Note also that this data set is one of the main inputs used in building the Shanghai world ranking of universities. [2]
The sample used here might be biased for the following two reasons. First, the numbers of scientists in each of the 21 fields may differ widely. Second, citation habits may vary across fields as well as across countries. Regarding the first point, we do not have access to the whole list of cited scientists per field. However, we find it reasonable to believe that Thomson Scientific has chosen 250 as a benchmark number because it represents more or less the same share of scientists in each field. If numbers of cited scientists were to vary a lot across disciplines, Thomson Scientific could have chosen different numbers of HCRs for different fields. As for the second point, it is true that different fields have different citation habits. For example, on average geographers have many more citations than economists (Brakman et al., 2010). However, we do not see why this would bias the sample because habits are field-specific whereas the selection criterion of HCRs is the same across fields. Furthermore, once a discipline is widely spread, as are most of the scientific disciplines considered here, we may expect the citation habits to be fairly similar across countries. A last point, to conclude. As our data display a great deal of heterogeneity at the country level, the bias in our sample need not be too serious an issue for estimating the impact of explanatory variables on the number of HCRs. Recall that, in regression analysis, a consistent estimate of the impact obtains even if the sample is not representative, under correct model specification. A non-representative sample with more heterogeneity in the explanatory variables is preferable in terms of precision of the estimation to a more representative sample with less heterogeneity (see e.g. Stock and Watson, 2007, pp. 133-134).
In Section 2, we provide a synthetic account of the information available on the site ISIHighlyCited.com, using simple tools such as statistics, figures and tables. The most striking feature that emerges from this analysis is the massive dominance of American universities, which account for two thirds of the sample, whereas the European universities stand for only 22.3 %. [3] Within the European Union, national disparities appear to be huge, with a handful of countries doing much better than the others.
Quite naturally, this state of affairs leads us to raise the following question: how can it be explained? This is what we undertake in Section 3, where we develop an econometric study that aims to uncover the main explanatory variables for the very uneven distribution of top researchers. Using a knowledge production function whose inputs are R&D expenditure and human capital, we find, not surprisingly, that these two variables are significant. However, the fit is pretty unsatisfactory and calls for the introduction of a country-specific factor-augmenting productivity term. This is reminiscent of the total factor productivity term that must be used to explain world income disparities (Prescott, 1998). What to include in such a term is always somewhat arbitrary. In our setting, we find it natural to consider per capita GDP and the quality of public governance. Our other candidates are English proficiency and colonial ties with the UK. Together, these four variables also contribute to explaining the differences across countries. This was expected for per capita GDP and governance quality. English proficiency explains, at least partially, the good performance of English-speaking countries as well as that of a few other countries in which the population is known to have a good knowledge of English. Colonial ties with the UK have a different nature. As argued in the paper, this variable is likely to be related to the governance and organizational design that characterize (more or less) all Anglo-Saxon universities, and which have been duplicated in a few other countries. In this respect, our analysis agrees with recent contributions in economics that show how the design and quality of institutions matter for economic growth and development (Guiso et al., 2004; Bennedsen et al., 2005; Persson and Tabellini, 2006). Section 4 concludes the paper.
Before proceeding, the following comment is in order. Our approach vastly differs from that taken up by the Times Higher Education Supplement (THES) in its ranking of the top 2000 universities (Tulkens, 2007). THES gives a weight equal to 0.2 to the data used in this paper. The objective of THES is broader than ours as we do not focus on teaching. However, it is our contention that the approach followed here provides a sharper description of the research output of universities. This is confirmed by Van Raan (2005) who finds that the correlation between expert-based rankings, which have a weight equal to 0.4 in THES, and bibliometric outcomes is almost zero.
2 – Where do we stand?
The distribution of HCRs across institutions is very uneven. Figure 1 depicts the Lorenz representation of the cumulative distribution function, where numbers of HCRs are ordered from the largest to the smallest. The distribution is very well fitted by a Pareto law truncated at 1.
Figure 1. Lorenz curve of the number of highly cited researchers per institution
Taking the reverse perspective, we observe that one third of the HCRs are affiliated with only 30 institutions. Of these, 27 are universities and three are non-university research institutions, i.e. the National Institutes of Health (NIH), the Max Planck Institute (Germany) and the National Aeronautics and Space Administration (NASA). The NIH is an agency of the United States Department of Health and Human Services and is the primary agency of the American federal government responsible for biomedical research. The Max-Planck-Gesellschaft operates 80 research institutes all over Germany, which usually bear the name “Max Planck Institute (MPI) of …”. Finally, NASA is an agency of the United States federal government, responsible for the nation’s public space program.
Computing the normalized Herfindahl index over the set of institutions leads us to qualify our statement about the unevenness of the distribution of HCRs per institution. Denoting by x_i the share of HCRs affiliated with institution i, the Herfindahl index is given by
$$H = \sum_{i=1}^{N} x_i^2,$$
where N is the number of institutions. In order to control for this number, we use the normalized index defined by
$$H^* = \frac{H - 1/N}{1 - 1/N},$$
which varies within the range [0,1]: the higher H*, the more concentrated the distribution of data. Applying this index to the set of institutions, we find H* = 0.051. This value is not as high as what the foregoing discussion would suggest. This may be explained by the fact that a large majority of institutions have a fairly small number of HCRs (recall that the median is one), as can be checked on the Web site ISIHighlyCited.com.
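As an illustration of the index just discussed, here is a minimal Python sketch. It assumes the usual definitions (H is the sum of squared shares, and H* = (H - 1/N)/(1 - 1/N)); the function names and the HCR counts are hypothetical and are not taken from the ISIHighlyCited.com data.

```python
def herfindahl(counts):
    """Sum of squared shares, where each share is count / total."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


def normalized_herfindahl(counts):
    """Rescale H so that it lies in [0, 1] whatever the number of units N."""
    n = len(counts)
    h = herfindahl(counts)
    return (h - 1 / n) / (1 - 1 / n)


# Hypothetical counts of HCRs for six institutions (illustration only).
example_counts = [40, 10, 3, 1, 1, 1]
h_star = normalized_herfindahl(example_counts)
print(round(h_star, 3))
```

The two limiting cases are a useful check: equal counts give H* = 0, and a single institution holding every HCR gives H* = 1. A long tail of institutions with one HCR each pulls H* down, which is the mechanism invoked above to explain the low value found for institutions.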
Looking now at the geographical breakdown, the United States gets the lion’s share with 66 % of the total number of HCRs (3,829), while the EU17 (EU15 plus Norway and Switzerland) has 22.3 % (1,292). [5] It should be emphasized that the United Kingdom has 7.58 % of the total number of HCRs (439), that is, slightly more than one third of the EU17 share. In the top 25 institutions, 22 are located in the United States, two in the United Kingdom (Cambridge and Oxford) and one in Germany (the Max Planck Institute). In the top 50 institutions, 5 belong to the EU17 but only one is located in continental Europe, the Max Planck Institute. The second institution located in continental Europe (the ETH Zurich, Switzerland) is ranked 51st, the third (Karolinska Institutet, Sweden) 60th, the fourth (Leiden University, the Netherlands) 71st, and the fifth (Wageningen University, the Netherlands) 81st. In the 100 institutions with the largest numbers of HCRs, the EU accounts for only 15 % while continental Europe gets a mere 7 %. With such numbers in mind, we find it hard to think of European science as being in good shape.
Figure 2 gives the Lorenz representation of the cumulative distribution of the number of HCRs per country. Again, a Pareto distribution truncated at 1 provides a good fit. However, its index is equal to 0.5, which is extremely low. In other words, the distribution of HCRs per country is much more concentrated than the distribution per institution. This is confirmed by the value of the normalized Herfindahl index, which is now given by H* = 0.4357. This is much higher than the value obtained for the institutions, a result that reflects the dominance of the American institutions as a whole.
Figure 2. Lorenz curve of the number of highly cited researchers per country
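The Lorenz representation used in Figures 1 and 2 can be sketched as follows. This is only an illustration of the construction described in the text (units ordered from the largest to the smallest, then cumulated); the function name and the counts are hypothetical, not the paper's data.

```python
def concentration_curve(counts):
    """Return (share of units, cumulative share of HCRs) points.

    Units are ordered from the largest to the smallest count, as in the
    text, so the curve lies above the diagonal when the data are uneven.
    """
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for k, count in enumerate(ordered, start=1):
        running += count
        points.append((k / n, running / total))
    return points


# Hypothetical counts for four countries (illustration only).
curve = concentration_curve([1, 1, 2, 6])
for unit_share, hcr_share in curve:
    print(unit_share, hcr_share)
```

Reading a point off the curve gives statements of the kind made above, for instance the share of all HCRs held by the top quarter of units.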
Table 1 also provides a few aggregate statistics that common wisdom would relate to research performance. The EU17 has a larger population but a lower per capita GDP in purchasing power parity. However, the total GDPs over the period 1980-2000 are rather close. The US remarkably outperforms the EU17 in both total R&D expenditure and average years of schooling of population aged 25 and over. Nevertheless, the above-mentioned differences in the numbers of HCRs are so high that it is hard to believe that these variables are sufficient to explain the stark contrast of research performances.
It should be emphasized that the comparison between the US and the EU17 hides very strong disparities within the European Union. Table 4 provides the number of HCRs per million inhabitants. Switzerland does almost as well as the US, while Israel is not far from the top two countries. The performance of three “small” European countries, i.e. Sweden, the Netherlands and Denmark, is also worth pointing out. With a much smaller population and a native language that is not English, they outperform large European countries like Germany, France and Italy, or even Japan. Five English-speaking countries belong to the top 10, and it is fair to say that English is mastered by the large majority of the population in Sweden, the Netherlands and Denmark. As far as its scientific community is concerned, it is hard to think of Israel as being an outlier. The last member of the top 10, Switzerland, is a multilingual country in which English is not one of the four official languages.
Even though comparisons between institutions and countries may seem odd, it is worth stressing the fact that Harvard, which ranks first among institutions, has more HCRs than France, that the second and third American universities (Stanford and Berkeley) together have more HCRs than Germany, while the fourth American university (MIT) has more HCRs than Italy. Such performances for three of the largest and richest EU countries are shocking. To say the least, they suggest that the university system of these three countries works pretty poorly in terms of scientific research.
Table 5 highlights the specialization of the member countries of the G7 with a focus on their top 4 disciplines. Results probably agree with what we know about the visibility of these countries in some disciplines. The fact that the US dominates most in social sciences and economics/business is the mirror image of the bad results obtained by European universities in these two disciplines. They are the two disciplines where literacy matters the most. Thus, it is tempting to conclude that the US dominance drives the good performance of English-speaking countries. This might well be true, but this explanation does not seem to hold for the United Kingdom. Indeed, Table 6 shows that the US and the UK are specialized in very different fields. More precisely, the rank-correlation between all disciplines in these two countries is equal to -0.44, thus suggesting that knowledge spillovers from one country to the other are not as strong as is generally believed.
3 – Why is it so bad in Europe?
In view of the facts summarized in the foregoing, a natural question comes to mind: what factors might explain the tremendous heterogeneity of our measure of scientific performance of countries? This section aims at providing an answer to this puzzle.
We can think of the scientific output as resulting from the interaction of several types of inputs such as the quantity and quality of physical inputs (buildings, equipment, computers, libraries…) and of human inputs (number of researchers and support staff, their level of education and experience). Measuring the stock of these inputs precisely is very difficult, not to say impossible, at least for many countries and long time periods. We must, therefore, resort to approximations. For material inputs, the reported estimations use the research and development (R&D) outlays, denoted by RDc for country c, in 2000. This is clearly a flow measure, but we find it reasonable to assume that this measure is more or less the same fraction of the corresponding stock in every country. In this respect, our supplementary data on the R&D outlays of OECD and some partner countries over the period 1981-2000 suggest that R&D expenditure differences across countries are strong but very stable over time. We have used this alternative measure for robustness tests. Furthermore, we choose the year 2000 because it is the closest one to the period of analysis (1981-1999) for which the data coverage is best. Regarding human inputs, we follow the literature on economic growth and approximate the stock of human capital in country c (HCc) by the population size times the average number of years of schooling in 1980 (Benhabib and Spiegel, 1994; Barro and Sala-i-Martin, 1995). This year is selected because those who completed their education after 1980 are unlikely to be among the HCRs. [6]
We assume a Cobb-Douglas production function relating the number of HCRs in country c over the 1981-1999 period (NSc) to the above inputs:
$$NS_c = A_c \, RD_c^{\alpha} \, HC_c^{\beta}, \qquad (2)$$
where α and β are parameters to be estimated, while A_c is a factor-augmenting Hicks-neutral productivity term for country c. This factor is assumed to take the following form:
$$A_c = PCGDP_c^{\gamma} \, \exp\left(\delta_0 + \delta_1 \, Col\_UK_c + \delta_2 \, QG_c\right) \, Engl\_profic_c^{\theta}, \qquad (3)$$
where (i) γ, δ0, δ1, δ2 and θ are parameters to be estimated, (ii) PCGDPc is the average per capita GDP in purchasing power parity of country c over the period 1980-2000, (iii) Col_UKc is a dummy indicating whether a country has been a UK colony with substantial participation in its own governance during the colonial period (the UK is also included), (iv) QGc is an index varying in the interval [0,1], which measures the quality of a country’s governance, and (v) Engl_proficc stands for a country’s proficiency in English. This variable accounts for the fact that English is the dominant language of scientific communication. As a matter of fact, HCRs publish predominantly in English. [7] Note also that, unlike unconstrained continuous variables, the dummy Col_UKc and QGc enter the factor-augmenting productivity term exponentially. In this way, δ1 and δ2 can be given an easy interpretation, all the other parameters (α, β, γ and θ) having the nature of elasticities.
The dummy Col_UK, listed by country in Table 7, aims to capture the idea that universities in English-speaking countries have specificities related to the design and governance of universities that make them more efficient. [8] We acknowledge the fact that this variable encompasses other similarities between countries sharing colonial ties with the UK, which are not relevant for our purpose. In an attempt to disentangle differences in the quality of university institutions from the overall quality of a country’s governance and the advantage of a high English proficiency, we have introduced the variables QG and Engl_profic in A_c. These two variables should capture the specific impact that the quality of political institutions and the level of English proficiency are likely to have on the research output. We return to these issues below.
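The interpretation of the parameters may be easier to see after taking logarithms of (2) and (3). The derivation below is only a sketch: the Greek letters are notation assumed here (α and β for the input elasticities, γ and θ for the elasticities with respect to per capita GDP and English proficiency, δ0, δ1 and δ2 for the coefficients entering exponentially).

```latex
\begin{align*}
\ln NS_c &= \delta_0 + \alpha \ln RD_c + \beta \ln HC_c
            + \gamma \ln PCGDP_c + \theta \ln Engl\_profic_c
            + \delta_1\, Col\_UK_c + \delta_2\, QG_c, \\[4pt]
\frac{\partial \ln NS_c}{\partial \ln RD_c} &= \alpha
  \qquad \text{(an elasticity; likewise for } \beta,\ \gamma,\ \theta\text{)}, \\[4pt]
\frac{NS_c \mid Col\_UK_c = 1}{NS_c \mid Col\_UK_c = 0} &= e^{\delta_1}
  \qquad \text{(proportional premium } e^{\delta_1} - 1\text{)}, \\[4pt]
\frac{NS_c \mid QG_c + \Delta}{NS_c \mid QG_c} &= e^{\delta_2 \Delta}.
\end{align*}
```

Under this form, a 1 % change in a variable entering as a power changes the expected number of HCRs by approximately its exponent in percent, whereas a dummy or an index entering exponentially shifts it by a fixed proportion.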
For the variable QG, we use the “rule of law” index constructed by Kaufmann et al. (2003), which covers dimensions such as the quality of the judiciary and contract enforcement, and which may also be correlated with colonial ties. [9] The variable Engl_profic is measured by TOEFL test average scores by country of origin. [10] TOEFL data for the UK, US, Canada, New Zealand, Australia and Ireland were not available because English is the native language in those countries. TOEFL scores have thus been reconstructed by regressing TOEFL scores for available countries on data about the share of English-speaking population and average years of schooling in a given country. We stress the fact that this imputation has no effect on the significance of the Col_UK variable for a fairly large range of score values. [11] Detailed TOEFL scores are reported in Table 7 (see the data Appendix for further details).
It is standard in the growth and trade literature to consider per capita GDP in purchasing power parity as a proxy of a country’s overall productivity (Barro and Sala-i-Martin, 1995; Trefler, 1995). Restricting ourselves to this single variable would amount to assuming that productivity differences in the research sector mirror those in the rest of the economy. Yet, we expect other variables to influence research productivity. This is why we include Col_UK since the UK and several of its former colonies seem to perform better than other countries (see Section 1). Furthermore, in order to reduce the arguably strong impact of proficiency in English in some disciplines, we consider the hard sciences only to build NSc; i.e. we neglect those HCRs belonging to the “Economics-Business” and “Social Sciences, General” fields, where literacy matters the most.
Since NSc is a count variable, we estimate a Poisson model by quasi-maximum likelihood (QML). Specifically, we proceed as if NSc were to follow a Poisson distribution with conditional mean equal to the right-hand side of (2), with the productivity term given by (3), that is,
$$E\left[NS_c \mid \cdot\,\right] = \exp\left(\delta_0 + \delta_1 \, Col\_UK_c + \delta_2 \, QG_c\right) PCGDP_c^{\gamma} \, Engl\_profic_c^{\theta} \, RD_c^{\alpha} \, HC_c^{\beta},$$
where α, β, γ and θ are the elasticities and δ0, δ1 and δ2 the coefficients entering exponentially, and as if observations were independent. These assumptions determine the likelihood function of the observed sample. However, we depart from the Poisson distribution property, which states that the conditional variance equals the conditional mean, by estimating the parameters in (2) and (3) by QML, while providing robust standard errors for statistical inference. And indeed, over-dispersion tests strongly reject the hypothesis of equal conditional mean and variance. It should be stressed that the Poisson QML method has nice statistical properties with respect to alternative count models (like the negative binomial) and yields consistent estimates provided that the conditional mean is correctly specified. [12]
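To make the estimation strategy concrete, here is a deliberately simplified Python sketch of Poisson QML with robust (“sandwich”) standard errors. It assumes a single regressor plus a constant, fits the model by Newton steps, and uses made-up data; the function name, the data and the one-covariate restriction are assumptions of this illustration, not the specification estimated in the paper.

```python
import math


def poisson_qml(x, y, iterations=50):
    """Fit E[y | x] = exp(b0 + b1 * x) by Newton steps on the Poisson
    log-likelihood and return ((b0, b1), (se_b0, se_b1)), where the
    standard errors are robust to a variance different from the mean."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # intercept-only starting point
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix A = sum(mu * z z') with z = (1, x).
        a00 = sum(mu)
        a01 = sum(mi * xi for xi, mi in zip(x, mu))
        a11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = a00 * a11 - a01 * a01
        b0 += (a11 * g0 - a01 * g1) / det
        b1 += (a00 * g1 - a01 * g0) / det
    # Sandwich variance inv(A) B inv(A), with B = sum((y - mu)^2 * z z').
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    a00 = sum(mu)
    a01 = sum(mi * xi for xi, mi in zip(x, mu))
    a11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
    det = a00 * a11 - a01 * a01
    p, q, r = a11 / det, -a01 / det, a00 / det  # entries of inv(A)
    m00 = sum((yi - mi) ** 2 for yi, mi in zip(y, mu))
    m01 = sum((yi - mi) ** 2 * xi for xi, yi, mi in zip(x, y, mu))
    m11 = sum((yi - mi) ** 2 * xi * xi for xi, yi, mi in zip(x, y, mu))
    var_b0 = p * p * m00 + 2 * p * q * m01 + q * q * m11
    var_b1 = q * q * m00 + 2 * q * r * m01 + r * r * m11
    return (b0, b1), (math.sqrt(var_b0), math.sqrt(var_b1))


# Made-up data: x plays the role of a log input, y of a count with zeros.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y = [0, 1, 1, 3, 4, 9, 30]
(b0, b1), (se_b0, se_b1) = poisson_qml(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
print(b1, se_b1)
```

Two features of the approach show up even in this toy version: observations with a count of zero stay in the sample, and the point estimates depend only on the conditional mean, while the sandwich formula lets the variance differ from the mean.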
One may argue that an alternative estimation strategy is to estimate the log of (2) and (3) via OLS. However, this method would not account for the count nature of our dependent variable while forcing us to consider only countries with at least one HCR. Because of such strong drawbacks, as well as other shortcomings discussed in Santos Silva and Tenreyro (2006), the Poisson QML is our preferred estimation method. Nevertheless, in our last set of estimations, we will provide evidence that our main results still hold under OLS applied to the log-linearized model.
Our sample consists of 65 countries (see Table 7). It includes 38 of the 41 countries having at least one HCR (Algeria, Iran, and Taiwan are lost due to data availability) and 27 other countries that have a count of 0. The selection of these additional 27 countries was based on data availability. However, our results are not significantly affected by the introduction of such countries, thus suggesting that there is no strong selection bias in our analysis. Table 8 shows the correlations between our covariates. As one can see, although a few variables are highly correlated (PCGDP and QG, or RD and HC), overall multicollinearity does not seem to be an issue. More precisely, the Col_UK variable is weakly correlated with other covariates. In particular, its correlations with the two variables introduced to separate the role of governance quality and English proficiency (QG and Engl_profic) are only slightly positive, thus suggesting that Col_UK picks up other institutional features.
Several estimation results are reported in Table 9. In columns (1) and (2), in which Col_UK, Engl_profic and QG are not included, the model performs pretty badly in that PCGDP is the only significant variable besides the constant term, while the estimates are very sensitive to the exclusion of the US from the sample. In other words, neglecting English proficiency, the UK legacy, and the quality of a country’s governance implies that R&D outlays and human capital are not relevant for the production of HCRs, and makes the US a big outlier whose weight completely changes the point estimates. In contrast, adding Col_UK, Engl_profic and QG renders the estimates stable with respect to the exclusion of the US (compare columns (3) and (4)), while improving parameter significance.
One could argue that endogeneity is a likely issue in the foregoing estimations. While no one would deny that per capita GDP has an impact on the scientific output, one could similarly argue, as in modern growth theories, that there is a feedback effect in that a higher scientific output favors economic growth. In this case, per capita GDP cannot be treated as being exogenous in the estimation of the model parameters. Nevertheless, one may be tempted to say that the knowledge contained in scientific publications is a public good that is freely available to the world’s scientific community. We believe, however, that HCRs contribute disproportionately to the GDP of their host country for at least two reasons. The first one is that part of the knowledge produced by HCRs flows across space and time with frictions, thus providing a local advantage for a while (Jaffe et al., 1993; Peri, 2005). The second one is that HCRs have other activities that may have a direct impact on the national or local GDP, such as consulting activities for local firms and governments on a very large scale as in the US.
In column (5), our preferred specification, we report the estimates when we instrument PCGDP by the per capita GDP in 1913 (a few countries are lost because of the lack of 1913 data). By instrumenting, we mean that PCGDP (in level) is replaced by its predicted value estimated from a linear projection of the log of PCGDP on the log of per capita GDP in 1913, the log of RD, the log of HC, the UK colony dummy, the governance quality variable, and the log of English proficiency. There are two conditions for the log of per capita GDP in 1913 to be a valid instrument for the endogenous variable: it must be uncorrelated with the error term of the production function (a non-testable assumption) and it must be correlated with the log of PCGDP (the endogenous variable). The latter condition is clearly satisfied since the t-statistic for the coefficient of the log of per capita GDP in 1913 is equal to 6.81 in the linear projection. The non-testable assumption can be justified by saying that it is unlikely that the level of GDP in 1913 has been determined by the non-observable factors that determined GDP in 1980 and subsequent years (Ciccone and Hall, 1996). Moreover, the presence of structural breaks should provide the condition for a natural experiment. In this respect, almost 70 years separate the two periods, with two world wars in between, a strong modification in the composition of GDPs from agriculture to services through industry, the Great Depression and the post-war process of economic integration, which all seem to have the nature of structural breaks.
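The first step of this procedure can be sketched in Python as follows. The sketch is a simplification under stated assumptions: it projects the log of PCGDP on the log of per capita GDP in 1913 only (the paper's projection also includes the other exogenous variables), and the six data points and all names are hypothetical.

```python
import math


def first_stage(instrument, endogenous):
    """OLS of the endogenous variable on a constant and one instrument.

    Returns the slope, its t-statistic and the fitted values.
    """
    n = len(instrument)
    mean_z = sum(instrument) / n
    mean_w = sum(endogenous) / n
    szz = sum((z - mean_z) ** 2 for z in instrument)
    szw = sum((z - mean_z) * (w - mean_w)
              for z, w in zip(instrument, endogenous))
    slope = szw / szz
    intercept = mean_w - slope * mean_z
    fitted = [intercept + slope * z for z in instrument]
    residuals = [w - f for w, f in zip(endogenous, fitted)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)
    t_stat = slope / math.sqrt(sigma2 / szz)
    return slope, t_stat, fitted


# Hypothetical logs of per capita GDP in 1913 and over 1980-2000.
log_gdp_1913 = [6.5, 7.0, 7.4, 7.9, 8.3, 8.5]
log_pcgdp = [7.9, 8.6, 9.0, 9.3, 9.9, 10.0]

slope, t_stat, fitted_log = first_stage(log_gdp_1913, log_pcgdp)
# Predicted PCGDP in level, which replaces observed PCGDP in the second step.
predicted_pcgdp = [math.exp(f) for f in fitted_log]
print(slope, t_stat)
```

A large first-stage t-statistic is what the relevance condition asks for; the exogeneity condition cannot be checked in this way and has to be argued, as in the paragraph above.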
Taking care of the endogeneity problem, the coefficient of per capita GDP increases considerably from 1.13 in column (3) to 1.81 in column (5). The other parameter estimates are somewhat different from those provided in column (3). In particular, the coefficient of RD decreases sharply but remains highly significant, while the coefficient of HC is larger and becomes significant. The quality of the overall fit is high since the correlation between actual and predicted numbers of HCRs is now equal to 0.99 (the square root of the pseudo-R2 given in Table 9). In unreported results, we also found that excluding the US does not change the estimates.
All in all, the changes in estimates reveal that GDP endogeneity matters for some coefficients and for the significance of human capital. [13] Furthermore, we have estimated the production function on the sample of countries for which R&D spending is available for the entire period 1981-2000, using the reconstructed total R&D outlays over this period to get a better measure of the stock of physical inputs for HCRs production. These unreported results confirm our findings. Finally, we report in column (6) the result of standard IV estimations carried out on the log-linearized model. [14] The drop in the number of countries due to the additional requirement that NSc > 0 reduces significance, but the overall results are in line with our Poisson QML findings.
Figure 3. Residuals of the estimated production function (see column (5) of Table 9)
Figure 3 displays the residuals resulting from the estimation of the model in column (5). [15] Within the group of non-Anglo-Saxon countries, we can see that those which are characterized by some degree of flexibility in the management of universities, like Switzerland, the Netherlands and Sweden, have a number of HCRs that considerably exceeds the predicted one. At the other extreme, Germany and France have an actual number of HCRs that is much smaller than the predicted one. The fact that German and French universities lack flexibility, at least until recently, will come to the mind of those who are familiar with them. While sharing several distinct elements of flexibility that are unmatched in France, Germany and Italy, the successful non-US countries, and in particular Anglo-Saxon countries, display enough variability in their university systems for our dummy variable to cover a wide range of institutional features. For example, most Canadian and Swedish universities conduct their own admissions, whereas there is no selection in the Netherlands and Switzerland. There are high tuition fees in Australia and the UK, but they are low in the Netherlands, Sweden and Switzerland. Although the welfare state is more or less the same in the Netherlands and Sweden, Swedish universities have a high degree of wage flexibility, whereas Dutch universities have a much lower one (Aghion et al., 2007). Finally, Canada and the UK devote a large share of their higher education expenditure to their top universities, whereas Sweden and Switzerland do not have such a systematic policy. All of this suggests that the dummy Col_UK captures a bundle of flexibility parameters, which would be very hard to handle by means of a set of distinct variables. [16]
41Using the estimates of column (5) in Table 9, we see that the English proficiency effect is fairly strong. For example, if French scientists were to improve their English by 10 %, thus reaching the level of the Netherlands, the number of French HCRs would increase by 20 %. Furthermore, the quality of governance matters too. The UK and its former colonies have a higher level of governance quality than the other countries in the sample (0.68 vs. 0.62). If Italy were to improve its governance by 27 %, thus reaching the level of the UK, the number of Italian HCRs would increase by 54 %. We acknowledge that implementing such deep institutional changes is probably infeasible in the short run. However, these results are useful as they provide insights into the potential gains from efforts to match language and governance standards.
42Note also that, besides their linguistic and governance advantage, former UK colonies display a higher efficiency in producing HCRs. Specifically, Australia, Canada, Ireland, Israel, New Zealand, Singapore, the UK and the US have, ceteris paribus, 64 % (exp(0.494)-1) more HCRs than other countries. In order to match such an advantage, the EU countries would have to almost triple their research budget, or double their human capital stock, or increase their GDP by around 35 %.
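The arithmetic behind these magnitudes can be sketched as follows. The snippet converts the Col_UK coefficient quoted above (0.494) into a percentage premium and then back-solves the elasticities implied by the three equivalents mentioned in the text, assuming, purely for illustration, that R&D, human capital and GDP enter the conditional mean in logs with constant elasticities. The helper `implied_elasticity` and the dictionary names are hypothetical, and the back-solved values are rough consistency checks, not the estimates reported in Table 9.

```python
import math

# Coefficient on the Col_UK dummy, as quoted in the text (column (5) of Table 9).
BETA_COL_UK = 0.494

# In an exponential-mean (Poisson) model, a dummy with coefficient b multiplies
# the expected count by exp(b), i.e. a premium of exp(b) - 1.
premium = math.exp(BETA_COL_UK) - 1.0


def implied_elasticity(multiplier: float, beta: float = BETA_COL_UK) -> float:
    """Elasticity e such that multiplier ** e == exp(beta).

    Assumption (not from the paper): the input enters in logs with a
    constant elasticity, so scaling it by `multiplier` scales the expected
    count by multiplier ** e.
    """
    return beta / math.log(multiplier)


# Equivalents quoted in the text: "almost triple" R&D, "double" human capital,
# raise GDP "by around 35 %". The multipliers are therefore approximate.
equivalents = {"R&D budget": 3.0, "human capital": 2.0, "GDP": 1.35}
implied = {name: implied_elasticity(m) for name, m in equivalents.items()}

print(f"Col_UK premium: {premium:.1%}")
for name, e in implied.items():
    print(f"{name}: implied elasticity of about {e:.2f}")
```

Under these assumptions, the smaller the implied elasticity of an input, the larger the increase in that input needed to offset the Col_UK premium, which is why the R&D equivalent is the most demanding of the three.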
43These numbers give an idea of the strength of the UK legacy in achieving top-level research performance, since it matters more than R&D budget, GDP or human capital levels. It is our contention that the choice of US-like academic institutions made in those countries is a key element, although not the only one, in understanding those findings. Indeed, we have already washed out the effect of a country's development and governance quality, R&D outlays, human capital and English proficiency. What is left? The Anglo-Saxon organization of science, i.e. how to build and structure a good research environment, established itself a long time ago as the reference paradigm. It thus seems natural to speculate that, even with good English proficiency, non Anglo-Saxon countries still suffer from some structural disadvantages. Counterexamples abound, however. For example, a field like economics, where the dominance of US HCRs is very high, suggests that things are not that simple. Indeed, there is a substantial number of well-known European or Japanese economists who received their Ph.D. from a top US university and then returned to their country of origin. In addition, there are many academic exchanges between these countries and the US. Last, but not least, English has been the lingua franca of most renowned economists for quite a long time. Hence, it seems fair to say that there has been a deep integration process in the economics profession. Consequently, the importance of the UK legacy should be explained by non-cultural factors.
44Another way of looking at the strong impact of Col_UK is to appeal to network effects. In a dynamic perspective, citations are best interpreted as a sort of network. As long as the probability of citing a paper is increasing in physical and/or cultural proximity, as suggested by the citation of patents (Jaffe et al., 1993; Peri, 2005), then several equilibria are a priori possible but only one emerges (Farrell and Klemperer, 2007). Once a specific equilibrium is established, it is fairly hard to switch to another one. In other words, size and history matter. In this respect, World War II shifted a huge amount of intellectual resources to the US. Network externalities would then magnify the initial causal effect. This could explain why, despite the efforts made at the EU level to increase research funding and opportunities, we are still lagging behind. However appealing this explanation is, it is at odds with several facts. (i) The US and the UK are not specialized in the same fields, as shown by the strongly negative rank correlation across different disciplines. In this respect, it is worth mentioning that the significance of Col_UK is robust to excluding both the US and the UK. (ii) If this argument seems plausible for the US and the UK, it is hard to figure out why a paper written in New Zealand is more likely to end up on the desk of a US HCR than a paper written in Germany or Japan.
45We would be the last to claim that we pick up all the causal effects of better academic institutions and research incentives with Col_UK because there are probably several forces at work that we cannot disentangle here. Even if such a variable were a precise measure of the unobserved variation in the quality of university institutions and research incentives, the coefficient we get would provide a reduced-form magnitude of the effects sparked by better institutions and incentives. Given this proviso, it should be clear that an Anglo-Saxon country premium exists and is large, while the above discussion, together with the evidence coming from other successful countries, suggests that the quality of university design matters.
46Finally, we have used our model to simulate the implications of possible policies aimed at reaching a much higher research output. First, if the EU17 were to achieve the Lisbon objective of an R&D share of GDP equal to 3%, its share of HCRs would increase only slightly, from 24.3% to 27%, while the US would still account for 59.7% of HCRs. This sheds new light on the possible inappropriateness of the EU objectives and policies regarding European universities. Moreover, if the 3% objective in R&D were further accompanied by an increase of both the EU educational level and GDP per capita to their corresponding US counterparts, which seems both infeasible and costly in the short and medium run, the EU17 share of HCRs (36.1 %) would still lag behind the US share (52.3 %). Hence, money is not enough, which suggests that the EU must seek alternative solutions.
47In order to highlight further the importance of the Anglo-Saxon premium, we propose the following counterfactuals. If the 3 % objective in R&D were to be combined with a deep reform of the design and governance of EU research institutions that would bring them to the US level of efficiency, the EU share of HCRs would increase to 34.5 %, while the US share would be equal to 53.6 %. In addition, if the level of English proficiency were to be raised to the level of the Netherlands in non-native English speaking EU17 countries, the gap between the EU and the US would be further reduced (37% for the EU vs. 51.5% for the US). These last two results suggest new and less costly policies to remedy the resistible decline of European science. The results of the different policies are summarized in Table 10.
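To make the comparison across scenarios easier to read, the shares quoted in the two paragraphs above can be tabulated together with the remaining EU-US gap. The figures are those stated in the text; the scenario labels and variable names are shorthand introduced here, and Table 10 remains the authoritative summary.

```python
# (label, EU17 share of HCRs in %, US share of HCRs in %), as quoted in the text.
scenarios = [
    ("3% R&D target only", 27.0, 59.7),
    ("3% R&D + US schooling and per capita GDP", 36.1, 52.3),
    ("3% R&D + US-level institutional efficiency", 34.5, 53.6),
    ("previous scenario + Netherlands-level English", 37.0, 51.5),
]

# Remaining gap in percentage points, rounded to one decimal to avoid float noise.
gaps = {label: round(us - eu, 1) for label, eu, us in scenarios}

for label, eu, us in scenarios:
    print(f"{label:<48} EU {eu:5.1f}  US {us:5.1f}  gap {gaps[label]:5.1f}")
```

Read this way, the institutional-reform scenario narrows the gap to about 19 points without requiring US levels of schooling and income, and adding Netherlands-level English proficiency narrows it further to about 14.5 points.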
4 – What to do?
48Money matters in science as it often does in human affairs. Indisputably, a larger research budget would help the EU boost European science. However, money is not the only lever available to European universities for improving their research output.
49In this paper, we have documented the existence of a productivity advantage of Anglo-Saxon research institutions and universities, a fact that European researchers and public decision makers tend to dismiss far too often. Even though our econometric analysis relies on a "fuzzy" measure of the quality of research institutions, it is our contention that the governance and design of US-like universities are critical inputs in knowledge production. In order to get a definite answer, we need “experiments” in which European or other universities decide to adopt various institutional characteristics of American/English-style universities. We could then follow the change in the adopters’ outcomes. Obviously, similar experiments in which American universities would adopt the characteristics of European universities would also provide useful information. Needless to say, such experiments are almost impossible to implement.
50As said above, we would be the last to claim that university and research budgets do not matter in the performance of researchers (Aghion et al., 2007). However, it is worth stressing that, to a large extent, those budgets are themselves endogenous: outstanding universities attract big flows of money precisely because they are outstanding, and vice versa. We encounter here the well-known phenomenon of “cumulative causation” developed by Myrdal (1957) fifty years ago. Besides this observation, our analysis suggests that the way the money is used is probably as critical as the amount of money itself.
51At a time when the opportunity cost of public funds is likely to rise sharply, this is not necessarily bad news. The scientific community should become fully aware of the main weaknesses of research institutions in continental Europe. By promoting in-depth reforms, national governments and the European Commission would greatly contribute to the “irresistible” growth of their universities in the production of advanced and successful knowledge. Designing better research institutions, which does not necessarily mean copying Anglo-Saxon universities, and learning better English do not require much money. They do require, however, more openness to the rest of the world on the part of quite a few European researchers, as well as collective imagination and political will. The key question thus becomes: does Europe have them?
Appendix
- Historical R&D data for OECD and some partner countries comes from the OECD Main Science and Technology Indicators.
- Data on R&D in 2000 for a larger set of countries comes from the Science and Technology database provided by the UNESCO Institute of Statistics.
- Data on colonial ties comes from CEPII at http://www.cepii.fr/anglaisgraph/bdd/TradeProd.htm.
- Data on Population and GDP per capita in purchasing power parity comes from the World Economic Outlook Database, April 2007, provided by the International Monetary Fund.
- Data on GDP for 1913 is provided by Maddison, A. (2001) The World Economy. A Millennial Perspective. Paris: OECD.
- Data on Average Years of Schooling for total population aged 25 and over comes from Robert J. Barro and Jong-Wha Lee (2001) International data on educational attainment: updates and implications. Oxford Economic Papers 53, 541-563. Data are available at http://www.cid.harvard.edu/ciddata/ciddata.html.
- Data on TOEFL average scores of computer-based tests by country of origin for the examination period July 2004 to June 2005 comes from TOEFL Test and Score Data: Summary Data. TOEFL data for the UK, US, Canada, New Zealand, Australia and Ireland were not available because English is the native language in those countries and so there is no need to prove English proficiency with a test. TOEFL scores for these countries have thus been imputed by regressing TOEFL scores for available countries on the share of English-speaking population and average years of schooling in a given country. The imputed TOEFL scores are: Australia (266), Canada (264), Ireland (267), New Zealand (270), United Kingdom (270), United States (268). As a comparison, countries with the best English proficiency, like Denmark and the Netherlands, score around 260. Good English proficiency countries like Germany and Switzerland score around 250, while medium performance countries like France and Spain score around 240. We have also tried other TOEFL values for the above 6 missing countries. As long as these scores are below 285, the Col_UK dummy is still positive and significant. The maximum achievable score of the test is 300.
- The data on countries and their English-speaking population comes from different sources collected by Wikipedia at http://en.wikipedia.org/wiki/List_of_countries_by_English-speaking_population. In particular, for EU countries, data comes from a survey whose results are published in the Special Eurobarometer 243 (2006). See http://ec.europa.eu/public_opinion/archives/ebs/ebs_243_en.pdf.
- Data on the quality of a country's governance, and in particular the “rule of law”, are provided by Kaufmann et al. (2003) and refer to years 1997 and 1998. Data are available at http://info.worldbank.org/governance/wgi2007/.
Number of highly cited researchers by discipline in the US, EU17 (EU15 plus Norway and Switzerland) and EU17 without the UK
The average of total GDP in purchasing power parity (PPP) over the period 1980-2000 is measured in millions of current US dollars. The same unit is used for total R&D Expenditure in the year 2000. Average per capita GDP in PPP over the period 1980-2000 is measured in current US dollars while Population is measured in millions of inhabitants.
Number of highly cited researchers by discipline for all countries with at least 100 HCRs but the US
Top 25 institutions by number of highly cited researchers
Top 20 countries by number of highly cited researchers per million inhabitants
Top 4 disciplines by percentage of highly cited researchers for the G7 countries
Ranking of the UK and the US in the 21 disciplines according to percentage of highly cited researchers
List of countries included in the analysis
Bold TOEFL values refer to imputed figures.
Correlation between covariates
Poisson QML and OLS estimation results for the knowledge production function
The model is defined by equations (1) and (2) in the text. Dependent variable: Number of HCRs by country in all disciplines but Economics-Business and Social Sciences, General. QML standard errors in parentheses with ***, ** and * respectively denoting significance at the 1%, 5% and 10% levels.
Impact of policy scenarios on share of HCR in Europe and the USA
“current” means that the variable is set at the observed level for each EU17 country; “USA” means that it is set at the observed USA level for each EU17 country; “Netherlands” means that the English proficiency level of the EU17 countries is raised to the level of the Netherlands if lower; “years” is the average number of years of schooling of the population aged 25 and over.
- [1] Aghion, Ph., M. Dewatripont, C. Hoxby, A. Mas-Colell and A. Sapir (2007). Why Reform Europe’s Universities? Bruegel Policybrief, Issue 2007/04.
- [2] Barro, R.J. and X. Sala-i-Martin (1995). Economic Growth. New York: McGraw-Hill.
- [3] Benhabib, J. and M.M. Spiegel (1994). The Role of Human Capital in Economic Development: Evidence from Aggregate Cross-Country and Regional U.S. Data. Journal of Monetary Economics 34, 143-173.
- [4] Bennedsen, M., N. Malchow-Møller and F. Vinten (2005). Institutions and Growth – A Literature Survey. Centre for Economic and Business Research, Copenhagen Business School, Report 2005-1.
- [5] Brakman, S., H. Garretsen and Ch. van Marrewijk (2010). References Across the Fence: Measuring the Dialogue between Economists and Geographers. Journal of Economic Geography 10, 1-15.
- [6] Ciccone, A. and R. Hall (1996). Productivity and the Density of Economic Activity. American Economic Review 86, 54-70.
- [7] Farrell, J. and P. Klemperer (2007). Co-ordination and Lock-in: Competition with Switching Costs and Network Effects. In Handbook of Industrial Organization (ed. M. Armstrong and R. Porter), volume III, pp. 1967–2072. Amsterdam: North-Holland.
- [8] Guiso, L., P. Sapienza and L. Zingales (2004). The Role of Social Capital in Financial Development. American Economic Review 94, 526-556.
- [9] Jaffe, A.B., M. Trajtenberg and R. Henderson (1993). Geographic Localization of Knowledge Spillovers as Evidenced by Patent Citations. Quarterly Journal of Economics 108, 577-598.
- [10] Kaufmann, D., A. Kraay and M. Mastruzzi (2003). Governance Matters III: Governance Indicators for 1996-2002. Working Paper No. 3106, World Bank.
- [11] Myrdal, G. (1957). Economic Theory and Underdeveloped Regions. London: Duckworth.
- [12] Peri, G. (2005). Determinants of Knowledge Flows and their Effects on Innovation. Review of Economics and Statistics 87, 308-322.
- [13] Persson, T. and G. Tabellini (2006). Democratic Capital: The Nexus of Political and Economic Change. CEPR DP5654.
- [14] Prescott, E.C. (1998). Needed: A Theory of Total Factor Productivity. International Economic Review 39, 525-551.
- [15] Santos Silva, J. and S. Tenreyro (2006). The Log of Gravity. Review of Economics and Statistics 88, 641-658.
- [16] Stock, J.H. and M.W. Watson (2007). Introduction to Econometrics (2nd edition). Pearson International Edition.
- [17] Trefler, D. (1995). The Case of Missing Trade and Other Mysteries. American Economic Review 85, 1029-1046.
- [18] Tulkens, H. (2007). Ranking Universities: How to Take Better Account of Diversity. CORE Discussion Paper 2007/42.
- [19] van Raan, A.F.J. (2005). Challenges in Ranking of Universities. Invited paper for the First International Conference on World Class Universities, Shanghai Jiao Tong University, Shanghai, June 16-18, 2005.
- [20] Wooldridge, J.M. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: The MIT Press.
Publisher keywords: university governance, scientific research, citations, knowledge economy
Online publication date: 28/01/2012
https://doi.org/10.3917/rel.774.0005
Notes
-
[*]
CORE and Department of Economics, Université catholique de Louvain. Email: luc.bauwens@uclouvain.be
-
[**]
Department of Geography and Environment, London School of Economics, CEPR, and CEP, UK. Email: g.mion@lse.ac.uk
-
[***]
CORE and Department of Economics, Université Catholique de Louvain, and CEPR. Email: jacques.thisse@uclouvain.be
The authors thank one referee, Kristian Behrens, Jacques Drèze, Gilles Duranton, Michel Lubrano, Gianmarco Ottaviano, and Matthew Turner for their comments. They are also grateful to Rytis Bagdziunas for his assistance in collecting and preparing the data on highly cited researchers. The usual disclaimer applies.
-
[1]
Note that 5,597 people are associated with an institution. The difference comes from those who have changed affiliation too often to be associated with a particular institution or have passed away before 1999.
-
[2]
Admittedly, the number of patents is another important scientific output of universities. Yet, we believe that publications are the main criterion used in most academic institutions to evaluate the research activities of professors and researchers.
-
[3]
Additional arguments to those developed in this paper may be found in Aghion et al. (2007).
-
[4]
Note, however, that the inverse of the index of the Pareto distribution is the standard deviation of the logarithm of the Pareto variable. So this index retains some meaning as a measure of concentration: the lower the index of the Pareto distribution, the more uneven the distribution of data.
-
[5]
The 12 other member states of the EU 27 have only 7 HCRs altogether.
-
[6]
Details and sources of data are reported in the Appendix.
-
[7]
We have checked this on the publications of HCRs who do not belong to English-speaking countries, using a random sample of 10 % of them extracted from the Thomson Scientific on-line database. In a few countries, such as Germany, Italy and France, HCRs have a small fraction of their publications in their native language. We have found a single case (a German psychiatrist) in which the publication record was approximately half in English and half in German. In all other cases, the most cited papers are written in English.
-
[8]
Israel never was a British colony. However, the governance of Israeli universities is close to the Anglo-Saxon model. Furthermore, the Hebrew University of Jerusalem was launched before the British left the area.
-
[9]
This is a weighted average of a number of variables that measure individuals’ perceptions of the effectiveness and predictability of the judiciary and the enforcement of contracts in each country between 1997 and 1998.
-
[10]
There are two reasons why we use TOEFL data rather than the share of the population speaking English. First, TOEFL data are available for a large number of countries while the share data refer to EU countries only. Second, the TOEFL provides a better proxy of the English proficiency of scientists than data referring to the whole population, because the TOEFL test is taken by students who plan to pursue graduate studies.
-
[11]
The imputed TOEFL scores are: Australia (266), Canada (264), Ireland (267), New Zealand (270), United Kingdom (268), United States (268). As a comparison, countries with the best English proficiency, like Denmark and the Netherlands, score around 260. Good English proficiency countries like Germany and Switzerland score around 250, while medium performance countries like France and Spain score around 240. As long as imputed scores are below 285, the Col_UK dummy is still positive and significant. The maximum achievable score of the test is 300.
-
[12]
See, e.g. Wooldridge (2002, section 19.2.2).
-
[13]
These findings are not related to the fact that we use countries for which GDP in 1913 is available. Indeed, we have estimated (3) using the same sample as in (5) and have found almost the same results as in (3).
-
[14]
We regress the log of NSc on the log of our assumed conditional mean, which is a linear function in parameters. We further instrument PCGDP using the predicted value of the log of PCGDP coming from the regression on the log of per capita GDP in 1913, the log of RD, the log of HC, the UK colony dummy, the governance quality variable, and the log of English proficiency.
-
[15]
The residuals are the differences between the observed number of researchers and their estimated value using the production function.
-
[16]
It is also worth pointing out that, contrary to a widespread opinion, non US Anglo-Saxon countries do not necessarily have higher expenditure per student than other countries. For example, Denmark, the Netherlands and Sweden spend much more than Ireland and the UK (Aghion et al., 2007).