TomTom launches AutoStream: A revolutionary map delivery service for autonomous driving

By Zenobia Hegde

TomTom (TOM2) announced the launch of TomTom AutoStream, an innovative map delivery service for autonomous driving and advanced driver assistance systems. Baidu and Zenuity are the first partners to use the technology.

TomTom AutoStream enables vehicles to build a horizon for the road ahead by streaming the latest map data from the TomTom cloud. By ensuring that the map used to power advanced driving functions is always the latest, TomTom AutoStream enhances driver comfort and safety.

Willem Strijbosch, TomTom’s head of Autonomous Driving, said: “The launch of TomTom AutoStream is a game-changer for OEMs and technology companies that are working on the future of driving. TomTom AutoStream allows vehicles to access the latest, most up-to-date TomTom map data for their driving automation functions.”

TomTom AutoStream is designed in a flexible way, allowing customers to customise the map data stream based on criteria such as sensor configuration and horizon length. It can stream a wide variety of map data including ADAS attributes such as gradient and curvature, and the TomTom HD Map with RoadDNA. This flexibility allows customers to use AutoStream to power a wide range of driving automation functions.

Strijbosch continues: “Our early investment in the TomTom advanced map-making platform means that we can continue to deliver revolutionary innovations like TomTom AutoStream. With TomTom AutoStream we can significantly simplify and shorten the development time for our customers, accelerating the future of driving.”

TomTom AutoStream ensures that the TomTom map data used to power advanced driving functions is the latest, most accurate available, enabling a safer and more comfortable experience.

“With AutoStream TomTom is offering an innovative map delivery system targeted at automated driving,” said Roger C. Lanctot, director, Automotive Connected Mobility for Strategy Analytics. “The development is targeted at helping automakers bring ADAS and autonomous driving functions to market faster.”

TomTom AutoStream will be available for production usage in 2018.

Comment on this article below or via Twitter: @IoTNow_OR @jcIoTnow

The post TomTom launches AutoStream: A revolutionary map delivery service for autonomous driving appeared first on IoT Now – How to run an IoT enabled business.

Read more here::

Pre-packaged solutions accelerate IoT adoption

By IoT Now Magazine

Cumulocity IoT is an innovative software platform that addresses the market demand for easy, fast, and scalable IoT solutions. It combines the power of Software AG’s Digital Business Platform with Cumulocity’s product portfolio. Functionality includes the ability to monitor and analyse streaming IoT data; cloud, on premise, edge and hybrid deployment; and a range of pre-packaged IoT solutions such as condition monitoring, predictive maintenance and track and trace. Andrew Brown, the executive director of Enterprise and IoT Research at Strategy Analytics, discusses this open, application-centric approach to Industrial IoT with chief executive, Bernd Gross.

Andrew Brown: Cumulocity has an impressive track record and the company has been a leading vendor of device and application management platforms since 2010. Why did you do the deal with Software AG?

Bernd Gross: We had been and continue to be very successful with our platforms, but we operate in a very dynamic market and by 2017 it was clear that we needed to scale our offer and become a global solution provider. Moreover we needed to do it quickly so that was one of the reasons why we did the deal. Software AG currently has offices in 70 countries around the world.

Bernd Gross, CEO of Cumulocity

The second reason is their rich portfolio of software products that complement our offer, one of which is WebMethods, an advanced integration engine that enables seamless interoperability between the operational technology and information technology domains. The former is the domain where data are generated and the latter is the domain where data are consumed. Another key complementary product is Apama, a platform that allows organisations to analyse and act on high-volume event stream data in real-time.

The third reason comes from the emerging need for IoT platform providers to be more open about the performance of their IoT specific business offer. Software AG is leading this approach and has created a separate business unit that has enabled developments such as prepackaged solutions. These solutions reflect the way that the IoT market is maturing and they are facilitating the growing trend away from expensive, time-consuming in-house or bespoke IoT solutions.

AB: How successful have you been with this approach and can you also indicate its relevance to the Industrial IoT sectors that you target?

BG: We have been very successful. For example, Siemens has selected our technology in order to complement MindSphere, which is a powerful IoT operating system that has data analytics, connectivity capabilities, plus tools for developers, applications and services. In addition, ADAMOS (ADAptive Manufacturing Open Solutions), a strategic alliance for machine and plant engineering, chose our IoT technology after an extensive evaluation process. Alliance partners include DMG MORI, Dürr, Homag, ZEISS as well as ASM PT.

The objective is to bundle knowledge in mechanical engineering, manufacturing and information technology. ADAMOS is set to become a global standard for the industry. It combines up-to-date IT technology and industry knowledge, thereby enabling engineering companies to offer tried and tested solutions for digitally networked production to their customers. These and other wins […]

The post Pre-packaged solutions accelerate IoT adoption appeared first on IoT Now – How to run an IoT enabled business.

Automotive consumers can now stream content in the car through a brought-in device, says Strategy Analytics

By Zenobia Hegde

While in the car, passengers and drivers are consuming infotainment in new ways. Rather than embedded devices and services, automotive consumers are increasingly reliant on a myriad of portable, connected, and streaming sources.

A new report from the In-vehicle UX (IVX) group at Strategy Analytics, “Strong Shift Towards Brought-In and Streamed Content for Rear Seat Entertainment”, surveying consumers in the US, Western Europe and China regarding their interest in and willingness to pay for rear-seat entertainment systems, has found that this trend has now fully extended to rear-seat entertainment.

In key demographics, the majority of consumers are no longer interested in disc players, but rather ports which will allow them to stream content in the car through a brought-in device or dedicated service.

Key report findings include:

  • Consumer interest in rear-seat entertainment has remained consistent in Western Europe and China since 2015.
  • Among interested consumers, there has been a dramatic shift in preference from DVD and/or Blu-Ray players to tablet docking stations and streaming video.
  • Willingness to pay for rear-seat entertainment systems is modest across all reasonable price points in all regions.

Monica Wong, report author, commented: “One key question product line managers must address in the short term is how this trend will affect desirability for dedicated screens. Although these systems were tremendous value-additions for many years, and will remain so for the immediate future, consumers’ expectations for rear-seat entertainment no longer require a dedicated seat-back screen.”

Added Chris Schreiner, director, Syndicated Research UXIP, “Though screens will remain desirable for the near term, the ability to stow or hide them will become increasingly important as well; particularly if smart surfaces become capable of accomplishing the same task.”

The post Automotive consumers can now stream content in the car through a brought-in device, says Strategy Analytics appeared first on IoT Now – How to run an IoT enabled business.

Cypress Brings Superior Infotainment Experience to Connected Cars

By IoT – Internet of Things

Cypress Semiconductor Corp. announced production availability of a combo solution that delivers robust 2×2 MIMO 802.11ac Wi-Fi® and Bluetooth® connectivity to vehicles, enabling multiple users to connect and stream unique content to their devices simultaneously. The new Cypress CYW89359 combo solution is the industry’s first to implement Real Simultaneous Dual Band (RSDB) technology, which enables […]

The post Cypress Brings Superior Infotainment Experience to Connected Cars appeared first on IoT – Internet of Things.

Samsung is Betting Big on the Internet of Things. What Does That Mean For You?

By Chris Morris

Samsung is going all in on the Internet of things, betting that connected appliances and faster Internet speeds will result in happier customers.

Executives for the electronics giant, speaking at the 2018 CES technology show in Las Vegas on Monday, reconfirmed a vow made two years ago that all of the company’s products will be IOT-compatible by 2020–adding that 90% already are as of today. And it plans to use its existing SmartThings app to ensure that those devices can all talk to each other–from the TV to the phone to the refrigerator to the washing machine.

Samsung promised that the initiative would debut in the spring.

What that means for users will depend on which Samsung appliances they own, of course. For example, people who buy a new Samsung TV will no longer have to worry about entering user names and passwords for services like Netflix, Hulu, and Spotify when they initially set up their TVs. That information will automatically be entered into the TV by checking other systems in which the customer is logged in, making it a more seamless experience.

TV sets will also have personalized recommendations for movies and shows, based on a user’s overall viewing habits on all their devices. TVs will also include Bixby, Samsung’s voice-controlled digital assistant, and will be able to double as a central hub for smart products around the home, letting users do everything from see who is at the front door to adjust the thermostat.

The goal of the push, says Tom Baxter, president and CEO of Samsung’s North American division, is to create an “eco-system of devices working together to produce unique experiences.”

As part of the initiative, Samsung will expand the number of smart refrigerator models it sells by introducing 14 new models that come with its integrated screen and newly expanded FamilyHub technology that lets owners stream music, leave notes to each other, and view the contents of the fridge in real time. Through Bixby, the FamilyHub will differentiate between users based on their voices and give custom information to them, such as their schedule for the day or commute times to school or work.

“IOT is still frustrating to a lot of people, but it doesn’t need to be,” said Yoon Lee, senior vice president at Samsung Electronics.

Samsung’s not stopping with its own products, either. The company is working closely with the Open Connectivity Foundation, the world’s largest IOT standardization body, to have all SmartThings-compatible products work with the app. And the company said it has signed an agreement with a “leading European auto manufacturer” to extend this IOT integration to vehicles (letting, for instance, people check if they’re out of milk as they drive by the store).

Samsung has previously tried and failed to make its ecosystem more connected. To ensure this effort is more successful, Baxter said, the company has invested $14 billion in research and development over the past year.

DTS Play-Fi announces wireless speakers to support Works with Amazon Alexa

By Zenobia Hegde

DTS, a global provider in high-definition audio solutions and a wholly owned subsidiary of Xperi Corporation, is pleased to announce the first DTS Play-Fi®-enabled wireless speakers to support the Works with Amazon Alexa functionality.

Initial products supported include the Pioneer Elite Smart Speaker F4, Onkyo Smart Speaker P3 and the Phorus PS10. Additional DTS Play-Fi-enabled products, including the Klipsch Stream wireless multi-room audio lineup, as well as McIntosh Laboratories, MartinLogan and THIEL Audio, will add the capability by the end of Q1 2018.

With a Works with Amazon Alexa over-the-air firmware update, consumers can now control audio playback on select DTS Play-Fi products from another room using an Amazon Echo, Dot or Show. This functionality allows users to verbally ask Alexa to play a song in a specific room, groups of rooms or the whole house, adjust volume, skip the track forward, mute, pause and stop the music.

“We continue to expand our range of Alexa voice control solutions available to licensees, making DTS Play-Fi the first open wireless multi-room audio platform to offer both integrated Alexa Voice Services (AVS) and Works with Amazon Alexa,” said Dannie Lau, general manager, DTS Play-Fi, at Xperi. “We look forward to continuing to forge a strong relationship with Amazon, the most widely recognized and adopted voice service on the market.”

“We’re thrilled to offer the Pioneer Elite Smart Speaker F4 and Onkyo Smart Speaker P3 as the first speakers in the DTS Play-Fi ecosystem that can be controlled via Amazon Alexa,” said Nobuaki Okuda, director and CTO, Onkyo Corporation and president, Onkyo and Pioneer Technology Corporation. “With this update, consumers can not only control their whole-home music system using their voice with our speakers, but using a third party voice control product as well.”

DTS Play-Fi technology enables lossless multi-room wireless audio streaming from the world’s most popular music services including Amazon Music, Deezer, iHeartRadio, Juke, KKBox, Napster, Pandora, Qobuz, QQ Music, SiriusXM, Spotify and TIDAL, thousands of Internet radio stations, as well as personal music libraries, on any supported product. In addition, DTS Play-Fi features advanced streaming functionality like wireless surround sound, stereo pairing, music station presets, and audio/video synchronisation.

The DTS Play-Fi ecosystem features the largest collection of products in the whole-home wireless audio space, with more than 200 interoperable speakers, sound bars, set-top boxes, and A/V receivers from the top names in premium audio including Aerix, Anthem, Arcam, Definitive Technology, DISH TV, Elite, Integra, Fusion Research, Klipsch, MartinLogan, McIntosh, Onkyo, Paradigm, Phorus, Pioneer, Polk Audio, Rotel, Sonus faber, Soundcast, SVS Sound, THIEL Audio and Wren Sound.

The post DTS Play-Fi announces wireless speakers to support Works with Amazon Alexa appeared first on IoT Now – How to run an IoT enabled business.

Dutch Blockchain company, LegalThings aims to update criminal justice via smartphone

By Zenobia Hegde

The justice system is known for many things, but efficiency is not one of them. Neither is being up-to-speed with technology. One joke goes that the unofficial IT slogan of the courts is, “Yesterday’s technology, tomorrow!”

Into this space comes LegalThings, an Amsterdam-based digital contracts company that’s aiming to update how those accused of a crime move through the justice system by making the law accessible while making judicial record-keeping more open and secure.

After winning a “blockathon” competition in September hosted by the Dutch Ministry of Justice and Security, LegalThings began a pilot project with the Public Prosecution Service of the Netherlands, known as the “Openbaar Ministerie,” or OM, in Dutch. The project aims to build a system to process low-level criminal offenders quickly and with more transparency. If successful, it could be a huge time- and money-saving enterprise for the government.

“What you see now [in the justice system] is there is a lot of procedures, and those procedures are important to create a fair legal system, but they’re also really labor-intensive,” said Arnold Daniels, a co-founder of LegalThings and its chief software engineer. “What we’re trying to do is create an alternative to that.”

How might that work in practice? Imagine someone nabbed for possession of a small amount of illicit drugs, a crime that, in the Netherlands, can carry a fine of a few hundred euros. There are a number of parties involved in processing such a law enforcement action: the police who catch the alleged offender, the forensics expert that examines the drugs, and the OM.

Depending on whether the forensic expert is on-site to test the drugs, processing such an enforcement action can take anywhere from several hours to a couple days, said Sanne Giphart, innovation manager at OM. While some record-keeping systems have been made digital, that’s an ongoing process, Giphart explained. Things can move slowly.

By contrast, with the LegalThings application, the accused can get an explanation of the relevant law, choose whether to be represented by counsel, and agree to pay the relevant fine—all on their smartphone. All told, the actual processing of the offender takes about 30 minutes, and every step of the exchange is recorded, time-stamped, and made unchangeable using cryptography to ensure records can’t be fudged.

So far, OM, which is comparable to a mashup of the Department of Justice and local district attorneys in the U.S., has only experimented with the technology on “dummy data” involving a drug offense and a domestic violence offense, Giphart said. “The next step is to let people get familiar with this type of technology within the [OM] and then hopefully we can implement on one stream of cases.”

The challenges to implementing such a system are not purely technological. It also will likely require some changes in both public and institutional attitudes toward judicial record-keeping, said Daniels. “With this system, there’s really no backsies,” he explained. “You can correct it, but you can always see your initial action.”
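The record-keeping property described above (every step time-stamped, tamper-evident, and correctable only by adding to the history) can be illustrated with a hash-chained, append-only log. This is a minimal sketch and not LegalThings' actual design, which the article does not detail: the class `HashChainedLog`, its field names, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json
import time


def _digest(body):
    """Hash a record body deterministically (sorted keys, UTF-8)."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class HashChainedLog:
    """Hypothetical append-only log; every entry commits to the one before it."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    def append(self, actor, action, corrects=None, timestamp=None):
        """Record a step. A correction is a new entry pointing at the old index."""
        body = {
            "index": len(self.entries),
            "timestamp": time.time() if timestamp is None else timestamp,
            "actor": actor,
            "action": action,
            "corrects": corrects,
            "prev_hash": self.entries[-1]["hash"] if self.entries else self.GENESIS,
        }
        entry = dict(body, hash=_digest(body))
        self.entries.append(entry)
        return entry

    def verify(self):
        """Return True only if every entry matches its hash and links to its predecessor."""
        prev_hash = self.GENESIS
        for index, entry in enumerate(self.entries):
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["index"] != index or body["prev_hash"] != prev_hash:
                return False
            if _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

Because a correction is appended rather than written over the original, the initial action stays visible, which matches the "no backsies" behaviour Daniels describes. A chain like this only detects edits to existing entries; detecting removal of the newest entries would need the latest hash to be anchored somewhere outside the log.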

Unlike other blockchain systems that use a publicly distributed ledger, the LegalThings project with OM allows […]

The post Dutch Blockchain company, LegalThings aims to update criminal justice via smartphone appeared first on IoT Now – How to run an IoT enabled business.

Wearable Data Analytics Bring Humans into the IoT

By IoT – Internet of Things

Data analytics provides companies, healthcare professionals, and consumers alike with further insight into the long stream of data that they receive from various sensors and devices, which increasingly includes wearable devices. Traditional wearables, such as fitness tracking devices, provide the user with raw information such as their heart rate or step count. By applying analytics […]

The post Wearable Data Analytics Bring Humans into the IoT appeared first on IoT – Internet of Things.

Embracing the Future – The Push for Data Modernization Today

By Steve Wilkes

There is growing recognition that businesses today need to be increasingly ‘data driven’ in order to succeed. Those businesses that can best utilize data are the ones that can better serve their customers, out-compete their competitors and increase their operational efficiency. However, to be data driven, you need to be able to access, manage, distribute and analyze all of your available data while it is still valuable; and to understand and harness new potential data sources.

Key to this is Data Modernization. Data Modernization starts with the recognition that existing systems, architectures and processes may not be sufficient to handle the requirements of a data-driven enterprise, and that new innovative technologies need to be adopted to succeed. While the replacement of legacy technology is not a new phenomenon, the particular sets of pressures, leading to the current wave of modernization, are.

In this article we will delve into the very real pressures pushing enterprises down the path of data modernization, and approaches to achieving this goal in realistic time frames.

Under Pressure

Business leaders world-wide have to balance a number of competing pressures to identify the most appropriate technologies, architectures and processes for their business. While cost is always an issue, this has to be measured against the rewards of innovation, and risks of failure versus the status quo.

This leads to cycles for technology, with early adopters potentially leap-frogging their more conservative counterparts who may not then be able to catch up if they wait for full technological maturity. In recent years, the length of these cycles has been dramatically reduced, and formerly solid business models have been disrupted by insightful competitors, or outright newcomers.

Data Management and Analytics are not immune to this trend, and the increasing importance of data has added to the risk of maintaining the status quo. Businesses are looking at Data Modernization to solve problems such as:

  • How do we move to scalable, cost-efficient infrastructures such as the cloud without disrupting our business processes?
  • How do we manage the expected or actual increase in data volume and velocity?
  • How do we work in an environment with changing regulatory requirements?
  • What will be the impact and use cases for potentially disruptive technologies like AI, Blockchain, Digital Labor, and IoT, and how do we incorporate them?
  • How can we reduce the latency of our analytics to provide business insights faster and drive real-time decision making?

It is clear to many that the prevalent and legacy Data Management technologies may not be up to the task of solving these problems, and a new direction is needed to move businesses forward. But the reality is that many existing systems cannot be just ripped out and replaced with shiny new things, without severely impacting operations.

How We Got Here

From the 1980s to the 2000s, databases were the predominant source of enterprise data. The majority of this data came from human entry within applications, web pages, etc. with some automation. Data from many applications was collected and analyzed in Data Warehouses, providing the business with analytics. However, in the last 10 years or so, it was recognized that machine data, logs produced by web servers, networking equipment and other systems, could also provide value. This new unstructured data, with a great amount of variety, needed newer Big Data systems to handle it, and different technologies for analytics.

Both of these waves were driven by the notion that storage was cheap and, with Big Data, almost infinite, whereas CPU and Memory were expensive. Outside of specific industries that required real-time actions – such as equipment automation and algorithmic trading – the notion of truly real-time processing was seen to be out of reach.

However, in the past few years, the industry has been driven to rethink this paradigm. IoT has arrived very rapidly. Connected devices have been around for some time, and industries like manufacturing have been utilizing sensors and automation for years. But it is the consumerization of devices, coupled with the promise of cloud processing, that has really driven the new wave of IoT. And with IoT comes the realization that storage is not infinite, and another processing paradigm is required.

As I outlined in this article, the predicted rate of future data generation – primarily, but not solely, driven by IoT – will massively outpace our ability to store it. And if we can’t store all the data, yet need to extract value from it, we are left to conclude it must be processed in-memory in a streaming fashion. Fortunately, CPU and memory have become much more affordable, and what was unthinkable 10 years ago is now possible.

A Streaming Future

Real-time in-memory stream processing of all data, not just IoT, can now be a reality, and should be part of any Data Modernization plans. This does not have to happen overnight, but can be applied use-case-by-use-case without necessitating a rip and replace of existing systems.

The most important step enterprise companies can make today is to move towards a ‘streaming first’ architecture. A Streaming First architecture is one in which at least the collection of all data is performed in a real-time, continuous fashion. Understanding that a company can’t modernize overnight, at least achieving the capability of continuous, real-time data collection enables organizations to integrate with legacy technologies, while reaping the benefits of a modern data infrastructure that can combat the ever-growing business and technology demands within the enterprise.

In practical terms, this means:

  • Using Change Data Capture to turn databases into streams of inserts, updates and deletes;
  • Reading from files as they are written to instead of shipping complete logs; and
  • Harnessing data from devices and message queues without storing it first.
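To make the first of these points concrete, the sketch below shows the shape of a change stream: a sequence of insert, update and delete events that a downstream replica can replay. It is an illustration under simplifying assumptions, not how production Change Data Capture works; real CDC tools typically read the database's transaction log, whereas this example derives the events by comparing two in-memory snapshots. The function names `diff_snapshots` and `apply_changes` are hypothetical.

```python
def diff_snapshots(before, after):
    """Yield (operation, key, row) change events between two table snapshots.

    Both snapshots map a primary key to a row dict. Comparing snapshots is a
    stand-in for reading a transaction log, used here only to keep the sketch small.
    """
    for key in sorted(before.keys() - after.keys()):
        yield ("delete", key, before[key])
    for key in sorted(after.keys() - before.keys()):
        yield ("insert", key, after[key])
    for key in sorted(before.keys() & after.keys()):
        if before[key] != after[key]:
            yield ("update", key, after[key])


def apply_changes(table, changes):
    """Replay a change stream onto a replica, for example a cloud copy of the table."""
    for operation, key, row in changes:
        if operation == "delete":
            table.pop(key, None)
        else:  # inserts and updates both write the latest row image
            table[key] = dict(row)
    return table
```

Replaying the events from `diff_snapshots(before, after)` onto a copy of `before` yields `after`, which is the property that lets a change stream keep a second database synchronized without shipping full copies.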

Once data is being streamed, the solutions to the problems stated previously become more manageable. Database change streams can help keep cloud databases synchronized with on-premise while moving to a hybrid cloud architecture. In-memory edge-processing and analytics can scale to huge data volumes, and be used to extract the information content from data, massively reducing its volume prior to storage. Streaming systems with self-service analytics can be instrumental in remaining nimble, and continuously monitoring systems to ensure regulatory compliance. And new technologies become much easier to integrate if, instead of separate silos and data stores, you have a flexible streaming data distribution mechanism that provides low latency capabilities for real-time insights.
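The idea of extracting the information content from data before it is stored can be sketched as a windowed aggregation over a stream of sensor readings. This is a minimal illustration, assuming readings arrive in time order as `(timestamp, value)` pairs; the function name `summarize_windows` and the choice of summary statistics are assumptions, not part of any specific product.

```python
def _finish(window):
    """Turn the running totals for one window into a compact summary record."""
    return {
        "start": window["start"],
        "count": window["count"],
        "mean": window["total"] / window["count"],
        "min": window["min"],
        "max": window["max"],
    }


def summarize_windows(readings, window_seconds):
    """Collapse a time-ordered stream of (timestamp, value) readings into one summary per window.

    Only the running count, sum, min and max of the current window are held in
    memory, so the raw readings never need to be stored.
    """
    current = None
    for timestamp, value in readings:
        window_start = int(timestamp // window_seconds) * window_seconds
        if current is not None and window_start != current["start"]:
            yield _finish(current)  # the previous window is complete
            current = None
        if current is None:
            current = {"start": window_start, "count": 0, "total": 0.0,
                       "min": value, "max": value}
        current["count"] += 1
        current["total"] += value
        current["min"] = min(current["min"], value)
        current["max"] = max(current["max"], value)
    if current is not None:
        yield _finish(current)  # flush the last, possibly partial, window
```

With a 60-second window, a device emitting one reading per second would produce one stored record for every 60 raw readings, while the mean, minimum and maximum remain available for later analysis.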

Data Modernization is becoming essential for businesses focused on operational efficiency, customer experience, and gaining a competitive edge. And a ‘streaming first’ architecture is a necessary component of Data Modernization. Collecting and analyzing data in a streaming fashion enables organizations to act on data while it has operational value, as well as storing only the most relevant data. With the data volumes predicted to grow exponentially, a streaming-first architecture is truly the next evolution in Data Management.

The world operates in real time; shouldn’t your business as well?

About the author: Steve Wilkes is co-founder and CTO of Striim. Prior to founding Striim, Steve was the senior director of the Advanced Technology Group at GoldenGate Software, focused on data integration. He continued in this role following the acquisition by Oracle, where he also took the lead for Oracle’s cloud data integration strategy. Earlier in his career, Steve served in senior technology and product roles at The Middleware Company, AltoWeb and Cap Gemini’s Advanced Technology Group. Steve holds a Master of Engineering degree in microelectronics and software engineering from the University of Newcastle-upon-Tyne in the UK.

The post Embracing the Future – The Push for Data Modernization Today appeared first on Datanami.

Backing Up Big Data? Chances Are You’re Doing It Wrong

By Peter Smails

The increasing pervasiveness of social networking, multi-cloud applications and Internet of Things (IoT) devices and services continues to drive exponential growth in big data solutions. As businesses become more data driven, larger and more current data sets become important to support online business processes, analytics, intelligence and decisions. Additionally, data availability and integrity become increasingly critical as more and more businesses and their partners rely on these (near) real-time analytics and insights to drive their business. These big data solutions typically are built upon a new class of hyper-scale, distributed, multi-cloud, data-centric applications.

While these NoSQL, semi-structured, highly distributed data stores are perfect for handling vast amounts of big data on a large number of systems, they can no longer be effectively supported by legacy data management and protection models. A different approach to backup and recovery is needed, not only because of the sheer data size and the vast number of storage and compute nodes, but also because of the built-in data replication, data distribution, and data versioning capabilities. Even though these next-generation data stores have integrated high availability and DR capabilities, events like logical data corruption, application defects, and/or simple user errors still require another level of recoverability.

To meet the requirements of these high-volume and real-time applications in scale-out, cloud-centric environments, a wave of new data stores and persistence models has emerged. Gone are the days of just files, objects and relational databases. The next-generation key-value stores, XML/JSON document stores, arbitrary width column stores and graph-databases (sometimes characterized as NoSQL stores) share several fundamental characteristics that enable big data driven IT. Almost without exception, all big data repositories are based on a cloud-enabled, scale-out, distributed data persistence model that leverages commodity infrastructure while providing some form of integrated data replication, multi-cloud distribution and high-availability. The big data challenges aren’t limited to data ingest, data storage, data processing, data queries, result set capturing and visualization; they also include increasing difficulties around data integrity, availability, recoverability, accessibility and mobility/movement. Let’s see how this plays out in a couple of example case studies.

A first case study revolves around an Identity and Access Management service provider that uses Cassandra as its core persistence technology. The IDaaS (Identity as a service) is a multi-tenant service with a mixture of large enterprise, SMB and development customers and partners. The Cassandra database provides them with a highly scalable, distributed, highly available data store that supports per tenant custom user and group profiles (in other words, dynamic extensible schemas). While the data set may not be very large in absolute storage size, the number of records definitely will be in the tens, if not hundreds, of millions.

What drives the unique requirements for recoverability is the multi-tenancy and the 100% availability targets of the service. Whether it is through user error, data integration defects and changes, or simply tenant migrations, it may be required to recover a single tenant’s data set without having to restore the whole Cassandra cluster (or replica thereof) in order to restore just one tenant instance. Similarly, the likelihood that the complete Cassandra cluster is corrupt is slim and in order to maintain (close to) 100% availability for most tenant service instances, partial recovery would be required. This drives the need for some level of application aware protection and recovery. In other words, the protection and recovery solution must establish and persist some application data semantic knowledge to be able to recover specific, consistent Cassandra table instances or points in time.
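The requirement to restore one tenant to a point in time without touching the others can be sketched with a small in-memory model. This is an illustration only, assuming the store is a dictionary keyed by tenant; it does not use Cassandra or any real backup product's API, and the class `TenantAwareBackup` and its methods are hypothetical. It also copies every tenant in full on each capture, which a real solution would avoid.

```python
import copy


class TenantAwareBackup:
    """Hypothetical backup that keeps timestamped copies of rows, grouped by tenant."""

    def __init__(self):
        self._versions = {}  # tenant_id -> list of (timestamp, rows) in capture order

    def capture(self, store, timestamp):
        """Record every tenant's rows as they are at `timestamp`."""
        for tenant_id, rows in store.items():
            self._versions.setdefault(tenant_id, []).append(
                (timestamp, copy.deepcopy(rows))
            )

    def restore_tenant(self, store, tenant_id, as_of):
        """Restore a single tenant to its latest capture at or before `as_of`.

        Other tenants in `store` are left untouched. Returns the timestamp of
        the capture that was restored.
        """
        candidates = [
            (ts, rows) for ts, rows in self._versions.get(tenant_id, []) if ts <= as_of
        ]
        if not candidates:
            raise LookupError(f"no capture of {tenant_id!r} at or before {as_of}")
        timestamp, rows = max(candidates, key=lambda item: item[0])
        store[tenant_id] = copy.deepcopy(rows)
        return timestamp
```

The point of the sketch is that the backup knows what a tenant is. A storage-level snapshot of the same data could only be restored as a whole, which is exactly the situation the case study wants to avoid.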

The second case study is centered around a Hadoop clustered storage solution, whereby the enterprise application-set persists its time-series data from devices and their end-user activities in the Hadoop filesystem. The Hadoop storage acts as a de facto “data lake” fed from multiple diverse data sources in different formats, whereby the enterprise can now apply various forms of data processing and analysis through map-reduce batch processing, real-time analytics, streaming and/or in-memory queries and transformations. Even though a map-reduce job creates ephemeral intermediate and end results that in principle could be recreated by running the job once more in case of failure or corruption, the data set can be too large (and therefore too expensive to reprocess) and may be undergoing constant updates.

Even though Hadoop provides replication and erasure-coded duplication (for high availability and scale-out), there really is no data versioning or snapshots for that matter (given the original ephemeral model of the map-reduce processing). Any logical error, application or service failure, or plain user error could result in data corruption or data loss. Data loss or corruption could occur to the original ingested data, any intermediate ephemeral data or data streams, as well as any resulting datasets or database instances and tables. Rather than creating a full copy of the Hadoop file system for backup and recovery of intermediate files and database tables (which would be cost-prohibitive and/or too time-consuming), a different approach is needed. That approach requires a better understanding of the application data sets and their schemas, semantics, dependencies and versioning.

Looking at both case studies, there is a common thread between them driving the need for a different approach to data management, and specifically to backup and recovery:

  • Both Cassandra and Hadoop provide integrated replication and high-availability support. Neither capability, however, provides sufficient protection, if any, against full or partial data corruption or data loss (human, software or system initiated). An actual application-data-centric or application-aware backup is needed to support data recovery of specific files, tables, tenant data, intermediate results and/or versions thereof.
  • However, a storage centric (file or object infrastructure) backup solution is not really feasible. The data set is either too large to repeatedly be copied in full, or a full data set takes too large an infrastructure to recover fully or to extract just specific granular application data items. In addition, storage centric backups (file system snapshot, object copies, volume image clones or otherwise) do not provide any insight into the actual data set or data objects that the application depends on. On top of the fully recovered storage repository, an additional layer of reverse engineered application knowledge would be required as well.
  • Application downtime is more critical now than ever. In both case studies, multiple consuming services or clients depend on the scale-out service and persistence. Whether it’s a true multi-tenant usage pattern or a multitude of diverse data processing and analytics applications, the dataset needs to be available close to 100% of the time. A full data-set recovery would simply take too long, and the end users or clients would incur too much downtime. Only specific, partial data recovery would support the required SLAs.

The requirement for an alternate data management and recovery solution is not limited to just the Cassandra and Hadoop case studies described above. Most big data production instances ultimately do require a data protection and recovery solution that supports incremental data backup and specific partial or granular data recovery. More importantly, the data copy and recovery must acquire semantic knowledge of the application data in order to capture consistent data copies with proper integrity and recoverable granularity. This would allow the big data DevOps and/or Production Operations teams to recover just the data items that are needed without having to do a full big data set recovery on an alternate infrastructure. For example, the data recovery service must be able to expose the data items in the appropriate format (e.g. Cassandra tables, Hadoop files, Hive tables, etc.) and within a specific application context. At the same time, it must be possible to distribute the protection copies across on-premises infrastructure as well as public cloud storage, both to leverage cost-effective protection storage tiering and scaling and to support recovery on alternate cloud infrastructure.
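To make the combination of incremental backup and granular recovery concrete, here is a minimal in-memory sketch in Python. It is purely illustrative and is not the design of any product mentioned in this article: the `BackupCatalog` class, its methods, and the `(table, tenant)` key layout are all hypothetical, item contents are plain strings, and deletions are not tracked.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class BackupCatalog:
    """Hypothetical catalog: each backup stores only items whose content changed."""

    # (table, tenant) -> list of (version, content), in ascending version order
    versions: dict = field(default_factory=dict)
    # (table, tenant) -> digest of the most recently stored content
    digests: dict = field(default_factory=dict)
    next_version: int = 1

    def backup(self, items):
        """Take an incremental backup; return (version, keys actually copied)."""
        version = self.next_version
        self.next_version += 1
        copied = []
        for key, content in items.items():
            digest = hashlib.sha256(content.encode()).hexdigest()
            # Copy the item only if it differs from the last stored copy.
            if self.digests.get(key) != digest:
                self.versions.setdefault(key, []).append((version, content))
                self.digests[key] = digest
                copied.append(key)
        return version, sorted(copied)

    def restore(self, version, table=None, tenant=None):
        """Return items as of `version`, optionally limited to one table/tenant."""
        result = {}
        for key, history in self.versions.items():
            key_table, key_tenant = key
            if table is not None and key_table != table:
                continue
            if tenant is not None and key_tenant != tenant:
                continue
            # Newest stored copy taken at or before the requested version.
            candidates = [content for v, content in history if v <= version]
            if candidates:
                result[key] = candidates[-1]
        return result


# Assumed example data: two tenants sharing one "users" table.
catalog = BackupCatalog()
v1, copied_v1 = catalog.backup({
    ("users", "acme"): "alice,bob",
    ("users", "globex"): "carol",
})
# Only the acme tenant changes before the second backup.
v2, copied_v2 = catalog.backup({
    ("users", "acme"): "alice,bob,mallory",
    ("users", "globex"): "carol",
})
# Roll back a single tenant to the first point in time.
acme_at_v1 = catalog.restore(v1, tenant="acme")
```

In this sketch the second backup copies only the changed tenant's data, and the restore touches one tenant at one point in time while leaving the rest of the data set alone. A real solution would additionally need consistent capture across cluster nodes, deletion tracking, and durable storage, none of which is modeled here.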

A solution that provides big data protection and recovery in a granular and semantically aware way not only addresses “Big Data Backup” in the appropriate fashion, but also creates opportunities to extract and use data copies for other purposes. For example, the ability to extract application-specific data copies or critical parts of the big data set enables other users to efficiently obtain downstream datasets for test and dev, data integrity tests, in-house analytics, third-party analytics or potential data market offerings. Combining this with multi-cloud data distribution, we then get closer to realizing a multi-cloud data management solution that starts to address today’s and tomorrow’s needs for application and data mobility, as well as their full monetization potential.

About the author: Peter Smails is vice president of marketing and business development at Datos IO, provider of a cloud-scale, application-centric data management platform that enables organizations to protect, mobilize, and monetize their application data across private cloud, hybrid cloud, and public cloud environments. A former Dell EMC veteran, Peter brings a wealth of experience in data storage, data protection and data management.

The post Backing Up Big Data? Chances Are You’re Doing It Wrong appeared first on Datanami.
