Advisor

The Datapreneurs: A Must-Read for CIOs on Their Data Technology Journey

Posted June 29, 2023 | Leadership |
Data journey

The recently published book The Datapreneurs is an unusual read. Author Bob Muglia, a former Microsoft executive, former President of Snowflake, and a venture investor, says his new book is “part memoir and part history of the people and technologies that made the data analytics era possible.” I would add that it is also part what futures researchers call a “reference projection,” in this case one for data, analytics, humanity, and artificial intelligence (AI).

Topics in The Datapreneurs include the foundations of digitization, the emergence of cloud and the data economy, and the role of automation in future economic and political structures. Importantly, Muglia suggests that we are only at the beginning of the data era and claims that data is the most critical business asset after employees. Given this, it is a must-read for CIOs trying to piece together where data technologies are today and where they are going.

Foundations for Digitization

Muglia starts by sharing how relational databases transformed the computer industry. Without question, they made it easier to manage and retrieve data and write and run business applications. As a reminder, Muglia says, “Relational databases organized data into tables with keys to identify each row. They eliminated the hard-coded links between data, providing more flexibility in application design.”

Interestingly, when I worked for Peregrine, an earlier version of what is today ServiceNow, I was amazed to learn that its software was built on a pre-relational database. This made Peregrine hard to upgrade and even harder to maintain. Complementing relational databases was SQL, which let people define tables, relate them to other tables, and query them using simple commands. Allen Taylor wrote in SQL for Dummies, “SQL is an industry-standard language designed to enable people to create databases, maintain data, and retrieve selected parts of data.”
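To make the idea of tables, keys, and simple SQL commands concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from the book: the table names (customers, orders), the column names, and the sample rows are hypothetical and chosen only for illustration. The point is that the two tables are linked by a key value rather than by a hard-coded pointer, so a query can combine them on demand.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Each row is identified by a primary key.
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        -- orders refers to customers by key value, not by a hard-coded link.
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            amount      REAL NOT NULL
        );
        INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
        """
    )
    return conn


def total_by_customer(conn: sqlite3.Connection) -> list:
    """Join the two tables on the shared key and total order amounts per customer."""
    return conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    for name, total in total_by_customer(build_demo_db()):
        print(name, total)
```

Because the relationship lives in the data (the customer_id column) rather than in the application code, a new question, such as orders per month, only needs a new query and not a redesign of how records are stored.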

The Move to Cloud Data

The big data era began with an explosion in the amount and variety of data, together with the realization that on-premises SQL databases could not scale to emerging workloads. However, big data’s Hadoop failed because it proved overly complex for all but the most tech-savvy organizations. Hadoop was brute-force and difficult to install, maintain, and manage.

The cloud has proved a game changer, with an explosion of options from vendors including Amazon Web Services, Microsoft Azure, and Google. Without question, it has become an essential platform for data management and analytics. Prior to the cloud, as organizations dealt with more diverse data (especially on premises), data chaos compounded. Muglia claims that cloud data lakes and data warehouses enable organizations to access unlimited compute and storage resources, support unlimited concurrent users, and pay for only the services that they use.

The Modern Data Stack

The origin of the modern data stack is a 2020 Andreessen Horowitz white paper. According to Muglia, the modern data stack helps organizations manage and analyze their growing data. In contrast to past waves of technology, the modern data stack will not be provided by one vendor; it is instead an ecosystem of technologies provided by many companies and open source projects. Built from a collection of products and services that work together in the cloud, it takes advantage of the cloud’s native capabilities, including the public cloud’s ability to scale, lower costs, and facilitate data modeling with SQL databases.

Muglia believes a modern data stack makes the gathering, managing, integrating, analyzing, and sharing of data easier and cheaper. The stack includes data pipelines, data lakes and warehouses, and predictive analytics. According to the Andreessen Horowitz white paper, the modern data stack enables data discovery, data governance, data observability, and entitlements and security.

A modern data stack enables people to organize, access, and integrate data more conveniently and affordably, which makes it easier to break down data silos and share data internally and with business partners. Muglia reviews many of the details of the modern data stack, including data pipelines, data visualization, and operational databases. He looks at the growing importance of unlocking unstructured data and coping with data complexity. He also shares interesting stories about start-ups, such as Pinecone, Docugami, and Collagia, and discusses what he calls a “model-driven world,” new computer languages such as Julia, and data applications that include relational knowledge graphs.

Emergence of Smart Machines

Muglia believes we are in the early days of machine learning (ML). That said, he believes that automation is going to take over many routine, human-driven processes. He shares details about the emergence of DALL-E 2, Stable Diffusion, GPT-3, ChatGPT, and OpenAI. He also digs into how deep learning involves 50 or more processing layers with feedback.

Muglia then discusses the timing for machines achieving general intelligence; to get to this point, they need to sense, learn, reason, plan, adapt, and act. When discussing self-awareness, he shares a discussion with Molham Aref, CEO of RelationalAI. In this discussion, Aref says, “There are two hard questions in the universe, how it all started, what was there before the Big Bang. And at what point do animals and machines become self-aware. I am not a philosopher, and I have no idea how to answer either question.”

Arc of Data Innovation

Muglia effectively proposes his version of an information hype cycle for digital computing. The hype cycle is a maturity lifecycle for technology and innovation that starts with the World Wide Web and evolves into semi-structured data, ML, foundational models, artificial general intelligence (AGI), intelligent robots, and the technology singularity. For me, AGI was a new collection of words. Muglia suggests that it will arrive in the next 10 years, at which point AI assistants will pass the threshold of median human intelligence. (My guess is that this is when the 2013 science fiction movie Her was set.) That said, Muglia believes the combination of human and machine intelligence will usher in an era of progress and prosperity.

In contrast to dystopian science fiction, Muglia paints a positive view of robots as essential tools for human expansion into space, the solar system, and eventually thousands of worlds in our galaxy. Muglia shares some definitions of intelligence drawn from his interactions with OpenAI CEO Sam Altman:

  • AGI is achieved when computing systems possess intelligence equivalent to that of the median human.

  • Artificial superintelligence occurs when a system possessing AGI is smarter than all of humanity put together.

  • Technology singularity occurs when AGIs working with people enable hundreds of years of progress in one year.

Muglia believes the 2030s will be the golden era of robotics.

The Social Contract

Having shared his vision of the future, Muglia goes back to the political philosopher Jean-Jacques Rousseau. He suggests that as AI advances, we will reach a point in human history where it is necessary to create a new kind of contract governing the relationship with intelligent machines, one that helps society and improves how individuals deal with the rapid changes and disruption that will likely come. In The Fourth Industrial Revolution, Klaus Schwab similarly worried about the impacts of automation and suggested the need for a living wage for everyone displaced from work.

Muglia suggests that the era of intelligent machines will have ethical and moral implications. He claims smart machines should be respected, not feared, and that we need a social contract that governs the relationship between people and emerging generations of AGI machines. It may now be time for Isaac Asimov’s three (or four) laws of robotics. Muglia goes further and suggests laws will be needed for humans as well.

Muglia claims robots will be our servants and partners and get stuff done. They will provide much more than the Jetsons’ robot housekeeper, “Rosie.” To accomplish this in the correct manner, Muglia invokes governmental action. Is there a budding Dr. Alfred Lanning out there to help? Hopefully, Muglia is right that we are not heading for Terminator and Skynet. I, however, am personally ready for Rosie to take over my cooking and cleaning chores!

Parting Words

If you are like me, you will find that this book goes places you didn’t expect. I expected a nice book on the data revolution that we are now experiencing. However, generative AI is getting a lot of attention these days, and we need to consider its potential impacts. Whatever the timeline, those impacts are worth considering now. So why should CIOs get this book? It does an amazing job of describing how we got to where we are today and where we are going. In one book, CIOs can dig in and learn about the implications for their transformational plans.

About The Author
Myles Suer
Myles Suer is Strategic Marketing Director at Privacera. He has been a data business leader at various companies, including Alation, Informatica, and HP Software. Mr. Suer is the facilitator for CIOChat, a platform that brings together worldwide executive-level participants from a mix of industries, including banking, insurance, energy, education, and government. He has been published in Computerworld, CIO Magazine, eWeek, CMS Wire, and COBIT…