For the entry pattern, how should I pick the range of incorporation years (e.g. 1950-2018)?
Should I look up literature for supporting evidence (e.g. that Apple was founded in the 1970s)?
How should I do the map making for the entry/exit/churn patterns?
The churn pattern may be similar to the entry pattern, because only a small number of records in the dataset (fame_OC) have a dissolution date.
How should I define the number of all companies in the entry rate measurement?
Does it represent the number of tech companies in a specific cluster from 19xx to 2018?
Which regression method should I pick (multiple regression?)
D/ Regression analysis - I would suggest trying two things to start. Again, for cluster i, year t:
$Dynamics\_variable_{i,t} = F(\#firms_{i,t-n},\ industry\ mix_{i,t-n},\ X_{i,t-n},\ location_i,\ year_t)$ (1)
$Firm\_performance_{i,t} = F(\#firms_{i,t-n},\ dynamics_{i,t-n},\ industry\ mix_{i,t-n},\ X_{i,t-n},\ location_i,\ year_t)$ (2)
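A minimal sketch of how equation (1) could be estimated with statsmodels, assuming the data are reshaped to a long panel with one row per (cluster i, year t); all column names and the toy numbers below are hypothetical placeholders, not the final specification:
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
# hypothetical toy panel: one row per (cluster, year); replace with the real data
rng = np.random.default_rng(0)
clusters = ["London", "Manchester", "Bristol"]
years = list(range(2014, 2019))
panel = pd.DataFrame([(c, t) for c in clusters for t in years], columns=["cluster", "year"])
panel["n_firms_lag"] = rng.integers(200, 2000, len(panel))      # #firms_{i,t-n}
panel["industry_mix_lag"] = rng.uniform(0.2, 0.5, len(panel))   # industry mix_{i,t-n}
panel["entry_rate"] = rng.uniform(0.05, 0.20, len(panel))       # dynamics variable
# equation (1): dynamics variable on lagged covariates plus cluster and year fixed effects
model_1 = smf.ols("entry_rate ~ n_firms_lag + industry_mix_lag + C(cluster) + C(year)",
                  data=panel).fit()
model_1.summary()
Equation (2) would follow the same pattern, with firm performance as the outcome and the lagged dynamics variable added to the right-hand side.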
Attributes Question:
According to the previous EDA: https://zeqiang.fun/CASA0012-Dissertation/jupyterlab/EDA.html
The top 10 tech clusters can be identified by TTWA (Travel to Work Area); see the bar chart below.
The key findings of the bar chart:
The map of the top 10 tech clusters based on TTWA boundaries can be found below. The blue areas are the non-top-10 tech clusters.
The key findings of the map:
The entry rate dataset for the 10 tech clusters from 1900 to 2018 can be found below.
entry_rate.style
A simple linear regression for the London tech cluster is fitted below.
regressor_OLS.summary()
As shown above, the regression performance is not good. The exit rate and churn rate will be considered as additional variables in the next multiple regression.
The research plan for next week:
import pandas as pd
import sys
PATH = sys.path[0] # set the local file path
df = pd.read_csv(PATH + "/Dataset/fame_OC_tech_firm.csv")
# preprocess: drop the redundant index columns and the is_tech flag
df = df.drop(["Unnamed: 0", "Unnamed: 0.1", "Unnamed: 0.1.1", "is_tech"], axis=1)
df.head()
According to the previous EDA research, there are 88,803 missing values in the birth_year and incorperation_date columns, accounting for about 17% of the dataset.
# drop the records with a null birth_year or ttwa (travel to work area)
df_1 = df.dropna(subset=["birth_year", "ttwa"])
print(f"{df_1.shape[0] / df.shape[0] * 100:.2f}% of records remain\n")
# count the number of firms incorporated in each birth year
df_1.birth_year.value_counts()
# transform the value counts into a DataFrame of (birth_year, count)
df_birth_year = pd.DataFrame(df_1.birth_year.value_counts()).reset_index()
df_birth_year.rename({"index":"birth_year","birth_year":"count"},axis=1,inplace=True)
df_birth_year[df_birth_year["count"]>10]
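To visualise the entry pattern over time (one of the questions above), these yearly counts could be plotted; a minimal sketch using df_birth_year from the cell above:
import matplotlib.pyplot as plt
# bar chart of the number of firms incorporated in each birth year
ax = df_birth_year.sort_values("birth_year").plot(x="birth_year", y="count",
                                                  kind="bar", figsize=(12, 4), legend=False)
ax.set_xlabel("birth year")
ax.set_ylabel("number of incorporated firms")
plt.show()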
Entry rate for tech cluster $i$, year $t$ = firms incorporated in year $t$ / all firms in cluster $i$ at year $t$:
$Entry\ Rate_{i,t} = \frac{firms\ incorporated_{i,t}}{all\ companies_{i,t}}$
where $i$ denotes the tech cluster and $t$ the incorporation year (2009, 2014, ..., 2020).
Exit rate for tech cluster $i$, year $t$ = firms dissolved in year $t$ / all firms in cluster $i$ at year $t$:
$Exit\ Rate_{i,t} = \frac{firms\ dissolved_{i,t}}{all\ companies_{i,t}}$
Churn rate for tech cluster $i$, year $t$ = (firms incorporated - firms dissolved) in year $t$ / all firms in cluster $i$ at year $t$:
$Churn\ Rate_{i,t} = \frac{firms\ incorporated_{i,t}\ -\ firms\ dissolved_{i,t}}{all\ companies_{i,t}}$
*This index may be quite similar to the entry rate, because only a small number of firms have dissolution dates.
You could also try fitting these in first differences, e.g. change in entry / exit / churn between t and t-1.
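A minimal sketch of how these three rates (and their first differences) could be computed per cluster-year in pandas; it assumes a dissolution year column, here called dissolution_year as a placeholder, and reads "all companies" as the cumulative number of firms incorporated in cluster i up to year t, which is only one possible answer to the denominator question above:
# count incorporations and dissolutions per (cluster, year); dissolution_year is a placeholder name
entry = (df_1.groupby(["ttwa", "birth_year"]).size()
             .rename_axis(["ttwa", "year"]).rename("n_entry"))
exit_ = (df_1.groupby(["ttwa", "dissolution_year"]).size()
             .rename_axis(["ttwa", "year"]).rename("n_exit"))
rates = pd.concat([entry, exit_], axis=1).fillna(0).sort_index()
# denominator assumption: all firms incorporated in cluster i up to and including year t
rates["all_companies"] = rates.groupby(level="ttwa")["n_entry"].cumsum()
rates["entry_rate"] = rates["n_entry"] / rates["all_companies"]
rates["exit_rate"] = rates["n_exit"] / rates["all_companies"]
rates["churn_rate"] = (rates["n_entry"] - rates["n_exit"]) / rates["all_companies"]
# first differences between t and t-1 within each cluster
diffs = rates.groupby(level="ttwa")[["entry_rate", "exit_rate", "churn_rate"]].diff()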
The entry rate can reflect the level of tech cluster attraction; the measurement method is given below:
$Entry\ Rate_{i,t} = \frac{firms\ incorporated_{i,t}}{all\ companies_{i,t}}$
where $i$ denotes the tech cluster and $t$ the incorporation year (2009, 2014, ..., 2020).
Note: only a few companies (about 1%, according to the previous EDA research) have dissolution dates; the number of all companies includes the dissolved ones.
Refer to the previous EDA: https://zeqiang.fun/CASA0012-Dissertation/jupyterlab/EDA.html
The top 10 clusters based on Travel to Work Area (TTWA) are considered, as shown below.
FAQ:
1. Why did I choose TTWA?
This geographical division better reflects the relationship between population, companies and workplaces.
2. What is TTWA?
The 228 areas forming the 2011 TTWAs, covering the whole of the UK, were defined in 2015 using 2011 Census commuting flow data, indicating home and workplace address. The TTWAs are based on aggregations of Lower Layer Super Output Areas (LSOA) in England and Wales, Data Zones (DZ) in Scotland, and Super Output Areas (SOA) in Northern Ireland and in some cases span country borders. There are six cross-border TTWAs, 149 in England, 18 in Wales, 45 in Scotland and 10 in Northern Ireland.
If the ttwa is one of the 10 tech clusters, then a new column cluster_label will be set to the corresponding cluster (a sketch of this mapping is given after the filtering code below).
# generate the top 10 tech clusters table based on the ttwa
top_10_ttwa = df.ttwa.value_counts().head(10)
top_10_ttwa = pd.DataFrame(top_10_ttwa).reset_index()
top_10_ttwa.columns = ["ttwa_code","counts"]
df_ttwa = pd.read_csv(PATH + "/Dataset/ttwa.csv")
df_ttwa = pd.merge(top_10_ttwa, df_ttwa[["code","name"]], left_on="ttwa_code",right_on="code")
df_ttwa_10 = df_ttwa[["ttwa_code", "name","counts"]]
df_ttwa_10.style
Data filtering: keep only the data of the 10 tech clusters
Then the ttwa code can be treated as the cluster label.
# df_10_tc means df_10_tech_clusters
df_10_tc = df.copy()
print(df_10_tc.shape)
df_10_tc = df_10_tc[df_10_tc.ttwa.isin(df_ttwa_10.ttwa_code.to_list())]
print(df_10_tc.shape)
df_10_tc.birth_year.value_counts()
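To make the cluster_label described earlier explicit, the TTWA codes in df_10_tc could be mapped to the cluster names from df_ttwa_10; a minimal sketch:
# map each ttwa code to its TTWA name and store it as the cluster_label column
code_to_name = dict(zip(df_ttwa_10["ttwa_code"], df_ttwa_10["name"]))
df_10_tc["cluster_label"] = df_10_tc["ttwa"].map(code_to_name)
df_10_tc[["ttwa", "cluster_label"]].head()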
Data filtering: pick birth years from 2013 to 2018
Refer to the tech cluster of London, UK, from 2009 to 2018. The London technology cluster has been among the world leaders in the venture capital market and in start-up unicorn companies, and the technology cluster effect has played an important role over this period (Kerr & Robert-Nicoud, 2020).
Kerr, William R., and Frederic Robert-Nicoud. 2020. "Tech Clusters." Journal of Economic Perspectives, 34 (3): 50-76. https://www.aeaweb.org/articles?id=10.1257/jep.34.3.50
print(df_10_tc.shape)
# df_10_tc = df_10_tc[df_10_tc.birth_year>2008]
print(df_10_tc.shape)
Export the clipped CSV to Excel and run a pivot operation on:
- ttwa (considered here as the tech cluster label)
- birth_year
- registered_number
Then calculate the entry numbers in Excel.
df_10_tc_clipped = df_10_tc[["registered_number","ttwa","birth_year"]]
df_10_tc_clipped.to_csv(PATH + "/fame_OC_10_tc_clipped.csv")
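As an alternative to the Excel pivot, the same table could be produced directly in pandas; a minimal sketch using the clipped frame above (the Grand Total column mirrors the one produced by Excel):
# pivot: one row per ttwa (tech cluster), one column per birth year,
# counting registered numbers = number of firms incorporated in that year
entry_counts = pd.pivot_table(df_10_tc_clipped, index="ttwa", columns="birth_year",
                              values="registered_number", aggfunc="count", fill_value=0)
entry_counts["Grand Total"] = entry_counts.sum(axis=1)  # total firms per cluster, as in Excel
entry_counts.head()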
Import the processed dataset from the .xlsx file
# read the entry rate dataset from excel file
sheetName = "entry_rate"
df_entry_excel = pd.read_excel(PATH + "/Excel/fame_OC_10_tc_clipped.xlsx",sheet_name = sheetName)
df_entry_excel.head()
# rename the columns: ttwa -> tech_cluster, Grand Total -> all_companies
df_entry_rate = df_entry_excel.rename({"ttwa":"tech_cluster","Grand Total":"all_companies"},axis=1)
df_entry_rate.head()
Add the TTWA name to the entry rate dataset
entry_rate = pd.merge(df_entry_rate, df_ttwa_10,left_on = "tech_cluster", right_on = "ttwa_code")
entry_rate.drop({"counts"},axis=1,inplace = True)
entry_rate
Take London as the first example:
# calculate the entry rate of London
London_ER = entry_rate[entry_rate["tech_cluster"]=="E30000234"].reset_index()
London_ER.drop({"index","all_companies","tech_cluster","ttwa_code","name"},axis=1,inplace = True)
X_t = London_ER.columns.to_list() # x axis is t(year) for London
Y_entry_rate = London_ER.values.tolist()[0] # y axis is the entry rate for London
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression
import statsmodels.api as sm
pd.set_option('display.max_rows', 300) # specifies number of rows to show
pd.options.display.float_format = '{:40,.4f}'.format # specifies default number format to 4 decimal places
plt.style.use('ggplot') # specifies that graphs should use ggplot styling
%matplotlib inline
# fit a linear regression model using statsmodels
regressor_OLS = sm.OLS(endog=Y_entry_rate, exog=sm.add_constant(X_t)).fit()
regressor_OLS.summary()
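As planned above, the exit rate and churn rate could be added as regressors in a multiple regression; a minimal sketch, where exit_rate and churn_rate are placeholder arrays standing in for series that would be computed per year for London in the same way as Y_entry_rate:
# placeholder regressors; in the real analysis these would be the yearly exit and churn rates for London
rng = np.random.default_rng(0)
exit_rate = rng.uniform(0.00, 0.05, len(X_t))
churn_rate = rng.uniform(0.00, 0.15, len(X_t))
# multiple regression: entry rate on year, exit rate and churn rate
X_multi = np.column_stack([np.asarray(X_t, dtype=float), exit_rate, churn_rate])
multi_OLS = sm.OLS(endog=Y_entry_rate, exog=sm.add_constant(X_multi)).fit()
multi_OLS.summary()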
to do
to do
to do