By Alex Woodie
During last week’s Strata Data Conference, Datanami sat down with Cloudera Chief Strategy Officer Mike Olson to talk about the state of big data and where Cloudera’s offerings are headed next. Here’s a recap of that conversation.
When Olson co-founded Cloudera in 2008 and served as its first CEO, Apache Hadoop was just starting to register as a blip on the radar screens of tech executives, who were wondering how they were going to store all that data and what they could do with it. Now 10 years later, Olson is still a driving force for Cloudera, but the strategy has changed significantly for the company.
For the first phase of the business, Cloudera wrangled zoo animals, Olson says. “There were Pigs and Hives and Zookeepers and the yellow elephant, and we wrangled them all,” he says. “For about four years we were super evangelical in explaining what big data was and why all those open source projects mattered.”
In the 2013 time frame, as the big data technology boom got bigger, Cloudera introduced the enterprise data hub (EDH). “It was a reformulation of the earlier platform story,” Olson says. “All of those projects were in the box, but we didn’t call them out separately any longer because it was the comprehensive capabilities, knit together by security, governance, compliance — all the shared data experience services we’ve rolled out.”
Cloudera only recently departed from the EDH with a new strategy that widens the company’s focus into specific areas, including data warehousing, machine learning, and cloud computing. The company has assigned a general manager to each of the three areas: Anupam Singh for analytics, Hilary Mason for machine learning, and Vikram Makhija for cloud, while Fred Koopmans heads up the enterprise platform.
Data warehousing and analytics is the farthest along among the three business areas, Olson says. “The major drivers of growth on our platform right now are analytics databases, and especially data warehousing workloads,” he says. “Maybe I’m a Netezza customer and I have a few hundred terabytes in my cluster, but IoT is happening to me and I’d like to be able to keep a decade’s worth of data and not just a year or a quarter’s worth of data. So the data volumes are going from tens or hundreds of terabytes to petabytes, and the modern platform gets to handle those and the legacy architectures really can’t.”
That “modern platform” is an amalgamation of various open source projects that are collectively referred to as Hadoop, which is still important to Cloudera, Olson says. “So we’re continuing to innovate there,” he says. “But that’s really just table stakes. What’s interesting is what you do with the data once you get it in the data lake. That’s our focus now.”
Apache Impala is one of the fastest growing engines in the Hadoop ecosystem at the moment, according to Qubole’s recent survey, where it placed second only to Apache Flink, which grew more than 100%. Cloudera is keen to ride Impala’s blazing interactive query speed — as well as Kudu’s advantages in fast IoT data ingest — to fortunes in the massive data warehousing market.
“If you think about it, right now, there’s $16 billion per year spent on data warehousing,” he says. “That’s a rich field to plow, my friend. The machine learning installed base is vanishingly small but gonna absolutely explode, and it’s possible for a new vendor in the market to go to dramatic market dominance very quickly. So it’s a new business for us but it’s one we’re super, super bullish on.”
Olson says he spends much of his time these days ensuring that Cloudera’s solutions are working well with the Big Three of cloud vendors. “We’re exploding into cloud, so I’m spending a substantial part of my day working with Amazon, working with Microsoft, working with Google to be sure that we run well and that we have the go-to-market alignment with those guys and that we’re integrating with the right level of services.”
Cloudera has already done the work to ensure that its platform works with cloud technologies, specifically by supporting the object storage systems that underlie each of the public cloud platforms. One of the next big steps will be to support hybrid software delivery methods, where customers have the freedom to deploy the same application or applications to public clouds, on-premise clusters, or a combination of both. Kubernetes and Docker are seen as big answers to part of that question.
“The storage layer essentially becomes the data lake and you on-demand provision dedicated compute clusters just for the job you’re running, so that’s been really liberating in the cloud,” he says. “That was an impossibly hard thing before Docker and Kubernetes came around…. Docker was a super successful project, but until there was good orchestration, until Google made Kubernetes open source, there was no good answer there. And now I think we have the building blocks we need.”
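The pattern Olson describes — a long-lived data lake with compute clusters provisioned per job and torn down afterwards — can be sketched in a few lines of plain Python. This is a toy model only; the class and method names are illustrative, not Cloudera or Kubernetes APIs:

```python
class DataLake:
    """Long-lived shared storage layer (stands in for S3, ADLS, or HDFS)."""
    def __init__(self):
        self.objects = {}

    def put(self, key, value):
        self.objects[key] = value

    def get(self, key):
        return self.objects[key]


class ComputeCluster:
    """Short-lived compute, provisioned on demand for a single job."""
    def __init__(self, lake):
        self.lake = lake

    def run_job(self, key, fn):
        # Read from shared storage, compute, return the result.
        return fn(self.lake.get(key))


lake = DataLake()
lake.put("events", [1, 2, 3, 4])

# Fire up a cluster just for this job, then let it go away.
cluster = ComputeCluster(lake)
total = cluster.run_job("events", sum)
del cluster  # the compute is gone; the data lake survives

print(total)               # 10
print(lake.get("events"))  # [1, 2, 3, 4]
```

The point of the separation is visible in the last two lines: destroying the compute cluster costs nothing, because no state lives in it.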
But the other part of that story has to do with the object storage system, and the answers aren’t as clear there, Olson says.
“We’ve already done a big chunk of what we needed to do, which is when we moved to the cloud and embraced the object stores in the cloud, we separated compute and storage,” he says. “So we did that one time already. So we got a bunch of that plumbing. We need to move it on prem now, and that’s a little bit tricky, but Kubernetes and Docker make that easier.
“What we need is storage virtualization on prem,” he continues. “There is no widely adopted, reliable object store in the data center. That would make this a lot easier. You can create by the way a shared HDFS cluster, where you’ve got traditional HDFS and then you can fire up and fire down compute clusters on top of that, and that’s likely an interim step that we’ll take while the market for object store shakes out on prem.”
The object store market is quite diverse, with 21 different vendors, all of which have “minuscule market share,” Olson says. While many (if not most) of them are API-compatible with Amazon’s Simple Storage Service (S3) object store, and everybody agrees that S3 is the right path forward, that doesn’t necessarily make the job any easier, he says.
“S3 likely will be what everybody adopts, but somebody needs to be the variant of S3 on-prem that takes over the market so we can just go to that one,” he says. “We don’t expect to be a general purpose object store vendor. Cloudera won’t go into that business. But we’d very much like to see a market leader emerge and the sooner that happens, the better we think for everybody.”
While Cloudera and its employees who sit on Apache Software Foundation projects were instrumental in the development of the Hadoop Distributed File System (HDFS), that work was a matter of necessity. Today’s Cloudera does not want to be in the business of creating a next-gen object storage system to support the hybrid big data apps of the future, since it sees its future in developing apps that sit higher in the stack, like data warehousing and machine learning.
“You would not believe how expensive it is to support a new object storage system,” Olson says. “It touches every single component; all of the security stuff that we do has to be integrated with whatever the object store surfaces. Every single analytics or data processing engine needs to code to those APIs. And all the vendors will say, yeah, we’re S3 compatible. Dude, do you think so? You are not bug-compatible with S3. And your performance under duress is going to be different. And then you’ve got to design around the needs of those performance curves.”
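Olson’s “bug compatible” point is easy to demonstrate with a toy model (the classes and key names here are invented for illustration): two stores can expose an identical API yet differ in behavior that calling code silently depends on, such as the order in which keys are listed.

```python
class S3Like:
    """Minimal stand-in for an S3-style store."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def list_objects(self):
        return sorted(self._data)  # real S3 lists keys in UTF-8 order


class CloneStore(S3Like):
    """Same API signature, but keys come back in insertion order."""
    def list_objects(self):
        return list(self._data)


def latest_partition(store):
    # Silently assumes the last listed key is the newest partition --
    # true for date-named keys on a store that sorts, not on the clone.
    return store.list_objects()[-1]


for cls in (S3Like, CloneStore):
    s = cls()
    s.put("dt=2018-03-10/part-0", b"...")
    s.put("dt=2018-03-09/part-0", b"...")
    print(cls.__name__, latest_partition(s))
# S3Like     dt=2018-03-10/part-0
# CloneStore dt=2018-03-09/part-0
```

Both stores honor the same method signature, yet the same query returns different answers — exactly the kind of divergence that forces an engine vendor to test and tune against each store separately.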
Olson likes what he sees in Red Hat and the folks behind Ceph, whom he called “lights out good.” But until that common object store of the future arrives, Cloudera will just have to wait for a complete hybrid solution, just like everybody else.
The post Mike Olson on Zoo Animals, Object Stores, and the Future of Cloudera appeared first on Datanami.
Read more here: www.datanami.com/feed/
Global energy leader Chevron has selected WISER Systems Inc., developer of a precise real-time locating system, to participate in the Chevron Technology Ventures Catalyst Program. As a division of Chevron itself, Chevron Technology Ventures (CTV) is “a conduit for early adoption of emerging technology” according to the program’s website. CTV supports innovative companies with new […]
The post Chevron Technology Ventures Selects WISER Systems for Catalyst Program appeared first on IoT – Internet of Things.
Read more here: iot.do/feed
Organisations from the mining industry are struggling to take full advantage of the data gathered by their Industrial Internet of Things (IIoT) applications due to connectivity challenges.
This is according to a global study by Inmarsat, a provider of global mobile satellite communications services. It finds that 94% of mining organisations are facing significant challenges in extracting valuable insights from data to improve the productivity, efficiency and safety of their operations.
These challenges are due, in large part, to problems related to connectivity. Two-thirds (66%) of mining organisations reported that a lack of reliable connectivity is hampering the success of their IIoT deployments, further underlining the importance of robust communication networks. Meanwhile, almost half (46%) of mining organisations cited a lag between data being collected and becoming available for use as a reason why they are unable to generate full value from the data gathered by their IIoT solutions.
This issue highlights the need for mining companies to implement more reliable connectivity methods and data-processing strategies to collect, transfer and present mission critical data for analysis. Given the remote location of many mines and the vast quantity of data gathered by connected sensors, these capabilities are critical for mining companies seeking to capitalise on their IIoT solutions.
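The lag problem described above can be made concrete with a minimal staleness check in Python. This is a sketch under assumptions: the field names and the five-minute shelf life are invented for illustration, not drawn from the Inmarsat study.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(minutes=5)  # assumed shelf life for this metric


def stale(readings, now):
    """Return readings whose collection-to-availability lag exceeds MAX_AGE."""
    return [r for r in readings if now - r["collected_at"] > MAX_AGE]


now = datetime(2018, 6, 1, 12, 0)
readings = [
    {"sensor": "haul-truck-7", "collected_at": now - timedelta(minutes=2)},
    {"sensor": "crusher-1",    "collected_at": now - timedelta(minutes=45)},
]
print([r["sensor"] for r in stale(readings, now)])  # ['crusher-1']
```

A check like this is what lets an operations dashboard distinguish actionable readings from data that has already lost its value in transit.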
Commenting on the findings, Joe Carr, director of Mining, Inmarsat Enterprise, said: “Mining businesses increasingly rely on IIoT technology to extract, haul and process raw materials. The data produced by these systems often has a shelf life, meaning that if it is not where it needs to be, at the right time, it can become outdated and of little value. To secure the significant benefits that IIoT offers, businesses must ensure that they can view and analyse mission critical data in real-time, which requires a robust and reliable communications network.
“The remote location of most mining facilities, and the attached high cost of deploying terrestrial connectivity, means that satellite communications can play a critical role in transferring data back to control centres to provide a complete picture of mission critical metrics. Businesses must work with trusted IIoT satellite connectivity specialists and their partner eco-systems to ensure they can extract and analyse their data effectively, wherever their operations are located,” he concluded.
To view the research microsite and download the full report – ‘IIoT on Land and at Sea’ click here
Comment on this article below or via Twitter: @IoTNow_OR @jcIoTnow
Read more here: www.m2mnow.biz/feed/
Industrial automation vendors have evolved to provide proprietary digital platforms via the Product-as-a-Service (PaaS) business model, offering third-party-enabled Industrial Internet of Things (IIoT)-based products and solutions.
According to a new Frost & Sullivan report, the trend of digitalisation in end-user industries has prompted automation vendors to invest in IIoT technologies across diverse applications. They are now said to be looking to integrate these technologies to complement conventional automation systems and give end users better control over the systems’ functionality.
“The advent of Industry 4.0 is disrupting the partnership ecosystem in industrial automation, with start-ups and independent software vendors (ISVs) partnering with automation vendors to develop digital capabilities and solutions,” said Rohit Karthikeyan, senior research analyst, Industrial Automation & Process Control at Frost & Sullivan.
“Automation vendors will aim to standardise their portfolios through M&As and partnerships, and drive growth in their respective business segments. The consolidation of their IIoT portfolios will result in the upselling and cross-selling of automation solutions and create fresh revenue streams.”
Frost & Sullivan’s recent analysis, Global Industrial Automation Market Outlook, 2018, highlights the IIoT platform offerings of major automation companies and compares their products and services. It underlines the role of start-ups with niche capabilities in operational and information technologies in 2017. The analysis also details the market landscape of the key participants in process automation, hybrid automation, and discrete automation markets.
“As the North American and European markets are in the midst of a downturn, vendors are focusing on the developing economies of Asia-Pacific, Africa, and Latin America. Not only do these regions have a high number of greenfield projects, but they also enjoy significant government support,” noted Karthikeyan. “More than 40% of the end users in these markets are small- and medium-sized enterprises, encouraging vendors to deliver cost-effective products and educate them on the value of IIoT-ready solutions in process optimisation and control.”
In addition to maximising market expansion opportunities, proactive automation vendors will also explore opportunities in:
Forging strategic partnerships with pure-play IIoT providers to add value to their existing offerings and becoming a single point of contact for end users.
Diversifying into electrification products. Automation vendors have to focus on developing solutions that will enable traditional oil & gas companies to foray into the power business.
Promoting open-source controllers. The introduction of app logic controllers (ALC) has bolstered the market as the system is driven by open-source programming, wherein end users can download and use an app to control a specific application. This, in turn, has created revenue streams for app developers, hardware providers, and system integrators.
Shifting from hardware to software and services. Vendors can ease clients’ shift to digital technologies by minimising their investment risks by employing novel business models such as PaaS, pay-per-use, and licensing.
Global Industrial Automation Market Outlook, 2018 is part of Frost & Sullivan’s global Industrial Automation & Process Control Growth Partnership Service programme.
Comment on this article below or via Twitter: @IoTNow_OR @jcIoTnow
Read more here: www.m2mnow.biz/feed/
It was inevitable: connected cars are prone to diddling. In an experiment to identify connected car fraud, researchers in Germany collected data from 300,000 cars. They found that 23% of Minis and 27% of E60 BMW 5 Series cars had experienced some form of manipulation.
Part of me feels disgusted at the Germans, says Nick Booth, freelance technology writer, while another side of my psyche wonders why Britain’s online criminals are failing to compete!
There is a lot of mileage in fiddling BMWs. On average, 15% of all BMWs checked had been subjected to some form of tampering and, of that, 90% of the diddling was related to the number of kilometres shown on the clock.
The other 10% was related to manipulating the vehicle identity itself. In one outstanding example, a BMW 5 Series (E60 generation), which would have been built in 2000, showed only 18,703 miles on the odometer. The purchaser probably presumed the car had belonged to one careful lady owner who used it to go to church on Sunday.
In fact, according to researchers working for Carly Car Check, the car had been driven 120,564 miles. With a simple software hack, someone had added thousands of pounds to the value of the car.
Applications like Carly Car Check can reveal the secret life of cars to prospective buyers. They tap into the car’s multiple computers (ECUs) to check for correlation with the visible odometer reading and identify whether data has been manipulated.
Results from mainland Europe, where the app has been available for three years, suggest the issue is bigger than consumers may think.
“Unscrupulous motor traders and private owners can easily get access to the car’s data. With the right equipment they can adjust the electronic mileage clocks commonly found in modern cars. There are visual and paperwork checks but don’t be reassured by the fact that most car buyers have confidence in them – they are easily fooled. There is no safety in being one of ‘the crowd’ because fraud is so rife,” says a Carly spokesperson.
Interrogate your car
Carly claims to interrogate up to 50 control units in the car and compares their mileage data with what is visible on the dashboard. Often the data stored in the control units – be it related to mileage or types of usage – does not tally.
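The cross-check described above can be sketched in a few lines of Python. The heuristic here — flag the car when any control unit remembers more miles than the dashboard shows, beyond a small tolerance — is an assumption for illustration, not Carly’s actual algorithm.

```python
def mileage_suspicious(dashboard_miles, ecu_miles, tolerance=0.02):
    """Flag likely odometer tampering.

    ecu_miles: mileage values read from the car's control units.
    tolerance: allowed relative disagreement (clocks can drift slightly).
    """
    highest = max(ecu_miles)
    return dashboard_miles < highest * (1 - tolerance)


# The E60 from the article: 18,703 miles on the clock, 120,564 driven.
print(mileage_suspicious(18703, [120564, 119980, 120110]))  # True
print(mileage_suspicious(120564, [120564, 119980]))         # False
```

Because the mileage is duplicated across so many control units, a fraudster would have to rewrite all of them consistently for a check like this to pass.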
The route data reveals if the owner really was a ‘careful church-going lady driver’. The app displays fuel economy and average speed and the journey times that the car has been used for. You can look at the location and check if she really did drive to church and whether she was there for the service on Sunday or the Krav Maga combat classes the church hall hosts on Wednesday night.
Short journeys aren’t necessarily a good thing anyway. “Since many cars only do short journeys, they don’t get the right sort of use to regenerate the filters,” says Avid Avini, co-founder of Carly.
A new filter will cost you a thousand pounds! Carly also lists fault codes logged in the car, relating to developing issues […]
The post When smart cars fall ill you won’t be able to consult Dr Google appeared first on IoT Now – How to run an IoT enabled business.
Read more here: www.m2mnow.biz/feed/