Data science appeared to me suddenly in 2014, when I happened upon the Transportation Data Challenge issued by Code for Boston and MAPC (the Boston regional planning agency). The kick-off conference for the challenge included a presentation explaining the basics of data cleaning, exploratory data analysis, and modeling. I thought: This is what I have been doing throughout my career, but there are many new tools to learn. I teamed up with a data visualization specialist and designer; I did the data analysis and modeling. Our team submitted a project comparing land use and automobile travel (with a great-looking viz, thanks to my team members) that earned us the “Best in Analysis” award. My prize was a custom license plate, “VMTWIZ.” Not valid for use, unfortunately.
Since then, I have wanted to learn more of the tools of data science. In 2017 I completed Level Education’s Data Analytics program, which gave me a refresher in SQL and got me started using R and Tableau. I have since become proficient with the Tidyverse collection of R packages. But I need more tools. So this week I started General Assembly’s Data Science Immersive program to gain proficiency in Python and its key data science packages, plus Jupyter Notebooks, Git, and other basic tools.
It seems that everyone is looking for data scientists these days. My research and work have mostly been in urban transportation and city planning. The City of Boston, where I previously worked as a transportation planner, has been hiring data scientists for a new Citywide Analytics Team. The U.S. DOT Volpe Center, where I also used to work, recently posted an opening for a data scientist. The MBTA, Boston’s regional transit agency, has a Data Blog and a team of data analysts.
I recently took a deep dive into politics, specifically how to fix America’s defective democracy. Ranked choice voting (RCV) is an important reform that has started to gain traction thanks to the efforts of reformers in Maine and elsewhere, notably Voter Choice Massachusetts, which has mounted a spirited campaign for the Commonwealth to become the second state to adopt RCV statewide. I used R to analyze 20 years of election returns from Massachusetts legislative races to demonstrate that non-majority winners (candidates elected without a majority of the votes) are very common in open-seat elections. Because of the power of incumbency, open-seat winners tend to stay in office for a long time without significant competition. I extended the analysis to the 2016 U.S. House election results in a report for FairVote.
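The core of that calculation can be sketched in a few lines of Python. This is only an illustration, not the original R analysis: the `Race` class, the function names, and the vote counts below are all made up for the example, and a real analysis would read actual election returns instead.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Race:
    """One contest. Hypothetical structure, not a real data format."""
    district: str
    open_seat: bool          # True if no incumbent was running
    votes: Dict[str, int]    # candidate name -> vote count


def is_non_majority(race: Race) -> bool:
    """True if the top vote-getter did not win more than half of the votes."""
    total = sum(race.votes.values())
    top = max(race.votes.values())
    # Integer comparison avoids floating-point rounding at exactly 50%.
    return 2 * top <= total


def non_majority_rate(races: List[Race], open_seat: bool) -> float:
    """Share of races of the given type that were won without a majority."""
    subset = [r for r in races if r.open_seat == open_seat]
    if not subset:
        return 0.0
    return sum(is_non_majority(r) for r in subset) / len(subset)


# Invented vote counts, for illustration only.
RACES = [
    Race("District A", True, {"Adams": 400, "Baker": 350, "Cruz": 250}),
    Race("District B", True, {"Diaz": 600, "Evans": 400}),
    Race("District C", False, {"Incumbent": 700, "Challenger": 300}),
    Race("District D", True, {"Ford": 300, "Gray": 300, "Hall": 290, "Ito": 110}),
    Race("District E", False, {"Incumbent": 480, "Jones": 320, "Kim": 200}),
]

open_rate = non_majority_rate(RACES, open_seat=True)
incumbent_rate = non_majority_rate(RACES, open_seat=False)
print(f"Open-seat races won without a majority: {open_rate:.0%}")
print(f"Incumbent races won without a majority: {incumbent_rate:.0%}")
```

Comparing the two rates is the whole idea: if open-seat races are won without a majority far more often than races with an incumbent, that supports the case for a reform like RCV in exactly those contests.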
Data science has become important in political campaigns. During the 2018 campaign, Tech for Campaigns matched data scientists with the campaigns of 133 progressive candidates. MoveOn.org used data scientists to make their video messaging more effective. More campaigns are using data from smartphones, smart TVs, and smart speakers to micro-target voters, raising questions about privacy.
Data science has generated positive buzz for many years. Recent articles suggest there are still bullish signs for the field:
- TechTarget says “Demand for data scientists is booming and will increase.”
- TechHQ says “There is a data scientist shortage.”
- TechRepublic says “Data scientist is the most promising job of 2019.”
- Glassdoor rates data scientist as the “Best Job in America” for 2019.
Data scientists would seem to have no trouble finding a job. The problem is becoming a “data scientist.” On ne prête qu’aux riches (“one lends only to the rich”). With luck, and grit, I am on that path.