The impact of the connected economy on data management
The number of connected things is projected to grow dramatically before the end of the decade. Depending on the source, projections for the number of connected objects by 2020 range from as low as 26 billion to as high as 50 billion. Even the low end of that range is quite large, so it is reasonable to expect connectedness to be commonplace within the next few years.
The connected economy impacts both consumers and businesses, with the overall market for IoT technology projected to expand to $883.55 billion by 2022. Businesses use sensors and connected devices to track inventory, survey the workplace, enforce policies, monitor clickstreams on web servers, and more. Consumers wear devices to track their health and will increasingly enable more of their devices with sensors – both in their homes and in their automobiles. We are at the advent of a society where just about every 'thing' (both living and inanimate) can have an attached or embedded sensor.
The impact of these connected devices will be more automation and autonomy, but a significant aspect of the IoT is the creation of more data. A lot more data. The Internet of Things is projected to generate 400 zettabytes of data a year as soon as 2018. IDC’s Digital Universe study predicts that by 2020, the amount of information produced by the IoT will account for about 10 percent of all data on Earth.
This enormous growth in data creation is causing many significant changes to IT and data management. These changes include increased usage of unstructured data, the rise of streaming data, and the adoption of different forms of database systems to handle the increased volume.
Structured data remains the bedrock of the information infrastructure at most organizations, but unstructured data is growing in importance. Unstructured data is non-traditional (that is, not numbers or short character strings). Instead, unstructured data can range from images and videos to large text documents and e-mail. Importantly, unstructured data accounts for about 90 percent of all digital information, according to International Data Corp. This means that new and different methods of manipulating and storing unstructured data are required, because the traditional methods used by legacy database systems were not designed for it.
Streaming data is another important aspect of the connected economy. As connected sensors and devices are turned on, the need to capture and read the generated data streams becomes important. But not every piece of data ever generated from a sensor may need to be stored for posterity. Instead, the stream of data needs to be ingested, filtered and analyzed looking for patterns and anomalies. This can be done without ever persisting the entire stream of data – which will be increasingly important as the IoT grows and generates more and more data.
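To make the ingest, filter, and analyze idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function name `detect_anomalies`, the rolling-window approach, and the three-standard-deviation threshold are all hypothetical choices, not a method prescribed by any particular streaming product. The point is that only a small window of recent readings is held in memory, never the whole stream.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Iterable, Iterator, Tuple


def detect_anomalies(
    readings: Iterable[float],
    window: int = 20,       # assumed window size, tune per sensor
    threshold: float = 3.0  # assumed cutoff, in standard deviations
) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) for readings that deviate sharply from recent history."""
    recent = deque(maxlen=window)  # bounded buffer: old readings fall off automatically
    for index, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value
                continue  # keep the outlier out of the baseline window
        recent.append(value)


# Hypothetical temperature feed: steady readings with one spike.
readings = [20.0 if i % 2 == 0 else 21.0 for i in range(30)] + [95.0] + [20.0, 21.0] * 5
for index, value in detect_anomalies(readings, window=10):
    print(f"anomaly at reading {index}: {value}")
```

Because the function is a generator that consumes any iterable, the same logic would work against a live feed as readily as against the in-memory list used here. Only the flagged readings would need to be stored.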
Additionally, new types of database systems are being used that are engineered for analytics and large data volumes. NoSQL databases, with their lightweight infrastructure and flexible schema capabilities, are growing rapidly. It is common for organizations to have multiple database systems, both relational/SQL and NoSQL. One specific type of NoSQL DBMS, the graph database system, focuses on relationships between values. In a graph database, data is stored using graph structures with nodes, edges and properties. With graph database systems, the relationships between data elements are at least as important as the data itself.
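As a rough illustration of the nodes, edges and properties model, the following Python sketch builds a tiny in-memory property graph. This is not how any real graph DBMS is implemented, and the class names (`Node`, `Edge`, `PropertyGraph`) and the sensor and room data are hypothetical, chosen only to show that a relationship is stored as a first-class record with its own properties.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Node:
    node_id: str
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    source: str    # node_id of the starting node
    target: str    # node_id of the ending node
    relation: str  # relationship type, e.g. "LOCATED_IN"
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory graph; a real graph database adds indexing, storage and a query language."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: List[Edge] = []

    def add_node(self, node_id: str, label: str, **properties: Any) -> Node:
        node = Node(node_id, label, properties)
        self.nodes[node_id] = node
        return node

    def add_edge(self, source: str, target: str, relation: str, **properties: Any) -> Edge:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        edge = Edge(source, target, relation, properties)
        self.edges.append(edge)
        return edge

    def neighbors(self, node_id: str, relation: Optional[str] = None) -> List[Node]:
        """Return nodes reachable from node_id by one outgoing edge, optionally filtered by type."""
        return [
            self.nodes[edge.target]
            for edge in self.edges
            if edge.source == node_id and (relation is None or edge.relation == relation)
        ]


# Hypothetical IoT example: a sensor installed in a room.
graph = PropertyGraph()
graph.add_node("sensor-1", "Sensor", kind="temperature")
graph.add_node("room-a", "Room", floor=2)
graph.add_edge("sensor-1", "room-a", "LOCATED_IN", installed="2017-01-01")
print([node.node_id for node in graph.neighbors("sensor-1", "LOCATED_IN")])
```

Note that the edge carries its own `installed` property, which is the kind of detail a relational design would typically push into a separate join table.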
Graphs are particularly useful when data elements are interconnected and there is an undetermined number of relationships between them. For example, consider maintaining a social network like Facebook or LinkedIn, where each member can have any number of connections. There are numerous other applications, such as routing and dispatching, public transportation links, road maps, and recommendation engines (such as those used by online retail sites).
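The social network case can be sketched in a few lines of Python. The example below suggests new connections by counting mutual friends, a simple friend-of-a-friend traversal. The function name `recommend_connections`, the ranking rule, and the sample network are all assumptions made for illustration. Real networks and retail recommendation engines use far more sophisticated approaches.

```python
from collections import Counter
from typing import Dict, List, Set


def recommend_connections(
    friends: Dict[str, Set[str]], person: str, limit: int = 3
) -> List[str]:
    """Rank people two hops away from `person` by number of mutual friends."""
    direct = friends.get(person, set())
    mutual_counts: Counter = Counter()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            # Skip the person themselves and anyone already connected.
            if candidate != person and candidate not in direct:
                mutual_counts[candidate] += 1
    # Most mutual friends first; break ties alphabetically for stable output.
    ranked = sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


# Hypothetical network stored as an adjacency map (person -> set of friends).
network = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "dev", "eli"},
    "cara": {"ana", "dev"},
    "dev": {"ben", "cara"},
    "eli": {"ben"},
}
print(recommend_connections(network, "ana"))
```

The traversal never needs to know in advance how many connections a person has, which is exactly the situation where a graph representation tends to be more natural than fixed relational columns.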
Of course, there are many additional aspects of connectedness and the data growth that accompanies it. Privacy issues, security and protection, data governance and compliance, and metadata management are examples of significant areas that are being impacted by the IoT. But we’ve covered more than enough change for one blog post, so we’ll have to discuss these issues later…