
If you were not one of the lucky people who attended PyData 2024, no worries: you can find my talk on YouTube (see below).

The talk is about building and analyzing a time-varying, worldwide network of professional relationships among startups, and using network centrality measures on it to predict long-term economic performance. In my PyData talk, I provided an overview of how I built the worldwide startup network using CrunchBase data and the NetworkX library. I modeled employee flow and knowledge transfer as links between startups. By applying network centrality measures, I ranked early-stage (pre-seed) startups and evaluated how their rank correlated with their future success. I also touched on the implications of these findings for entrepreneurs, investors, and policymakers.

Drawing on large-scale online data, I modeled professional relationships and employee transitions among startup companies in a time-varying global network. In this network, companies were represented as nodes, while links indicated employee flows and the transfer of knowledge across firms.
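To make the construction concrete, here is a minimal sketch of how such a time-varying network could be assembled with NetworkX. The company names, the `transitions` records, and the `build_snapshot` helper are hypothetical, not the real CrunchBase schema or the code from the talk. The edge direction (from the company an employee leaves to the company they join) and the cumulative yearly snapshots are also assumptions made for illustration.

```python
import networkx as nx

# Hypothetical records: (employee_id, from_company, to_company, year of the move).
# These are made-up values, not CrunchBase data.
transitions = [
    ("e1", "AlphaCo", "BetaLabs", 2010),
    ("e2", "AlphaCo", "BetaLabs", 2011),
    ("e3", "BetaLabs", "GammaAI", 2011),
    ("e4", "GammaAI", "DeltaBio", 2013),
]


def build_snapshot(transitions, up_to_year):
    """Return a directed graph of all moves observed up to and including up_to_year.

    Nodes are companies; an edge source -> target means at least one employee
    moved from source to target (assumed direction). The edge weight counts
    how many such moves were observed.
    """
    graph = nx.DiGraph()
    for _employee, source, target, year in transitions:
        if year > up_to_year:
            continue  # this move has not happened yet in this snapshot
        if graph.has_edge(source, target):
            graph[source][target]["weight"] += 1
        else:
            graph.add_edge(source, target, weight=1, first_year=year)
    return graph


# One snapshot per cutoff year gives a simple time-varying view of the network.
snapshot_2011 = build_snapshot(transitions, 2011)
snapshot_2013 = build_snapshot(transitions, 2013)
```

Building one cumulative snapshot per cutoff year keeps later information out of earlier graphs, which matters when the network is used to predict outcomes that occur after the cutoff.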

I investigated whether a startup’s position and connectivity patterns within this network could predict its long-term economic performance and likelihood of success. Using network centrality measures like PageRank, I ranked startups within the global startup network. My analysis showed that this network provided valuable predictive signals, in some cases doubling the performance of traditional venture capital screening processes. These findings supported the idea that a startup’s position within its ecosystem plays a critical role in determining future success.
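As an illustration of the ranking step, the sketch below runs `nx.pagerank` on a small, made-up employee-flow graph and sorts companies by score. The edge list, the damping factor `alpha=0.85` (the NetworkX default), and the top-2 cutoff are assumptions for the example; the talk's actual parameters and data are not reproduced here.

```python
import networkx as nx

# Hypothetical weighted edges: (from_company, to_company, number of employee moves).
graph = nx.DiGraph()
graph.add_weighted_edges_from([
    ("AlphaCo", "BetaLabs", 2),
    ("BetaLabs", "GammaAI", 1),
    ("AlphaCo", "GammaAI", 1),
    ("DeltaBio", "GammaAI", 1),
])

# PageRank score per company, using the move counts as edge weights.
# Note: recent NetworkX versions rely on SciPy for this computation.
scores = nx.pagerank(graph, alpha=0.85, weight="weight")

# Rank companies from highest to lowest score and keep a short list,
# as one might when screening candidates.
ranking = sorted(scores, key=scores.get, reverse=True)
top_k = ranking[:2]
```

With edges pointing toward the company an employee joins, PageRank rewards startups that attract people from other well-connected firms. Reversing the edge direction would instead reward firms whose alumni spread widely, so the direction is a modeling choice worth stating explicitly.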

In the talk, I covered the methodology for network construction, the predictive modeling approach, key findings, and their implications. Entrepreneurs can learn how to optimize their startups’ ecosystem positioning, while venture capitalists and policymakers can gain insights into conducting more objective assessments and designing targeted interventions within innovation ecosystems.

This PyData talk was designed for data scientists, network scientists, investors, entrepreneurs, and anyone curious about combining network science and machine learning for empirical studies. I balanced technical modeling details with higher-level insights. A basic understanding of networks/graphs and machine learning concepts is helpful but not required; the talk was meant to be engaging for anyone with a strong curiosity about innovative approaches in venture capital.