
The “Internet of Things” is a trendy term that probably makes you think about connected toasters and smart refrigerators. But for GE, it also includes jet engines and power generators.

You might not think of these giant artifacts of industrial culture as part of the Internet of Things, but they will be — and their impact is going to be far bigger than adding some superfluous level of convenience to your dishwasher.

Ibrahim Gokcen, a technology executive at GE, told me recently how GE jet engines include sensors to monitor temperature, pressure, fuel consumption, and more. The sensors themselves aren’t new, but the quantity of data they’re generating is.

Ten years ago, each sensor would generate about 30KB of data per flight, sampling conditions at takeoff, when the plane reached cruising altitude, and again at landing.

Today, the sensors embedded in a GE jet engine sample conditions continuously and generate 500GB of data — per engine — for each flight.

Airlines are interested in that data, of course, since it can help identify problems before they become critical. Whenever a jet engine is in the shop (“off wing” in industry parlance), it’s not helping the airline earn money, and if you’re paying billions of dollars for your jet engines, you need them to be on airplane wings, earning money.

So airlines save all the data, from each engine, for each flight. But that’s where the problems start.

Multiply 500GB by each airplane’s number of engines (two to four, in most modern passenger jets) and by the number of daily flights, and you’ve got a staggering amount of data being generated just by one component of an airline’s infrastructure.
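To get a feel for the scale, here is a rough back-of-the-envelope sketch of that multiplication. The 500GB figure comes from GE; the fleet numbers (twin-engine jets, 1,000 flights a day) are made up purely for illustration and do not describe any real airline.

```python
GB_PER_ENGINE_PER_FLIGHT = 500  # figure quoted above for a modern GE engine


def daily_sensor_data_gb(engines_per_plane: int, flights_per_day: int) -> int:
    """Return engine sensor data generated per day, in gigabytes."""
    return GB_PER_ENGINE_PER_FLIGHT * engines_per_plane * flights_per_day


# Hypothetical airline: twin-engine jets, 1,000 flights a day (assumed numbers).
total_gb = daily_sensor_data_gb(engines_per_plane=2, flights_per_day=1000)

# 1 PB = 1,000,000 GB in decimal units.
print(f"{total_gb:,} GB per day, or about {total_gb / 1_000_000:.1f} PB")
```

Under those assumptions, the engines alone would produce roughly a petabyte of data every day.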

For now, most industrial sensor data is simply dumped into a database, just like any other data generated by a corporation. There’s no guarantee that the database used for one component of a business (jet engines) will be compatible with the database used to track data from another component (airplane maintenance records, or human resources records). In fact, it’s a pretty safe bet that any large industrial corporation is going to have dozens if not hundreds of data repositories, or one giant “data lake,” an apt term in that it conjures up an image of a huge, stagnant pool of fetid data, just sitting there doing nothing.

And yet, using that data can confer significant advantages. If you can eke another few percentage points of efficiency out of a power-generating gas turbine, that translates directly into reduced costs and increased margin. Keeping jet engines in service longer is a win, even if it’s only a slight improvement. When you’re a billion-dollar business, every percent of efficiency improvement or cost reduction is worth $10 million, so it’s worth diving deep into the data lakes to pull out the gems.
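The arithmetic behind that $10 million figure is simple enough to sketch. The function name is hypothetical, and the sketch assumes the improvement is measured against a flat $1 billion annual base, which is a simplification.

```python
def value_of_improvement(annual_base_usd: float, improvement_pct: float) -> float:
    """Dollar value of a percentage improvement against an annual base."""
    return annual_base_usd * improvement_pct / 100


# Assumed base: a $1 billion business, as in the example above.
one_percent = value_of_improvement(1_000_000_000, 1)
print(f"${one_percent:,.0f}")  # $10,000,000
```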

This is where it gets tricky, because the number of sensors and the amount of data are about to get ridiculous.

Cisco estimated a few years ago that 8.7 billion “things” — including computers — were connected to the Internet. Looking just at devices (not computers or phones), Gartner estimated there would be 4.9 billion connected IoT “things” by 2015. But that number is set to explode: GE estimates there will be 50 billion devices attached to the Internet by 2020.

These devices will mostly be invisible. They won’t inspire page after page of glowing commentary from Internet pundits in high-profile tech news sites. But they will make an enormous difference to the way industry works in the modern world.

If companies are going to make use of all this data they’re generating, they’ll need to start pulling it together. That’s where big-data analysis tools like Apache Spark, data-management and storage tools like Hadoop, and search engines like Maana and Elasticsearch come in.

GE, for its part, has its own platform for managing data from the industrial Internet, called GE Predix, which currently generates about $1 billion in revenue. Gokcen said he expects that business to grow to $4 billion to $5 billion over the next few years.

GE is hedging its bets with investments in Maana, Predixion, Ayasdi, and others; it also owns 10 percent of big-data company Pivotal.

In a way, that turns an industrial manufacturing giant like GE into a software company. It’s still making turbines, but the software to analyze all that data is an increasingly important part of the sale, so GE has to make the software, too, and invest in tools that help it and its customers get the analyses they need.

It’s a transition many industrial companies will have to make as they learn how to wrestle with terabytes and petabytes of sensor data.

 

Originally published on VentureBeat » Dylan Tweney: http://ift.tt/1GkA049