
"Trends and Information on AI, Big Data, Data Science, New Data Management Technologies, and Innovation."

This is the Industry Watch blog.

Apr 19 20

On Drones and Sociotechnical Thinking. Interview with Gordon Hoople and Austin Choi-Fitzpatrick

by Roberto V. Zicari

“Sociotechnical education is our way of talking about how to help students recognize the complex interconnection of the social and the technical. We bring students together from different majors, give them real problems to tackle, and then challenge them with reading and discussions that force them to face their own assumptions.” –Gordon Hoople

“As we developed the class, and later wrote a book together, we realized how much engineering wrestles with social issues (whether it recognizes this or not) and how much social change efforts are supporting or resisting changes that engineers dreamed up in the first place.” –Austin Choi-Fitzpatrick

I have interviewed Gordon Hoople and Austin Choi-Fitzpatrick. We talked about sociotechnical education, the mission of The Good Drone Lab, their forthcoming book “Drones for Good. How to Bring Sociotechnical Thinking into the Classroom,” and how to engage students in challenging conversations at the intersection of technology and society.


Q1. What is a sociotechnical education?

Gordon: Sociotechnical education is our way of talking about how to help students recognize the complex interconnection of the social and the technical. This is as true for classroom assignments as it is in real-world projects. Is the story of WikiLeaks and Russian interference in the United States’ 2016 election one about technology, one about politics, one about society, or a stunning admixture of all three? Students have a real 0-60 moment when they get their first real job – we want to give them a head start in that process!

Q2. You are co-directors of “The Good Drone Lab”. What is it?

Austin: The Good Drone Lab, which I started with Tautvydas Juškauskas in 2014, is focused on tinkering and experimenting with the potential drones have for promoting the greater good. We’re exclusively focused on applications that level the playing field between the powerful and the powerless. How can we democratize surveillance, and how can we hold authorities to account, even in protests? More recently we’re also interested in exploring how people from the technical arts (like engineering) can work alongside folks from the social sciences (like sociology or ethnic studies).

Q3. Why did a social scientist decide to collaborate with an engineer, and an engineer with a sociologist, on a book about drones and sociotechnical thinking in the classroom?

Gordon: For fun! We’d be lying if we didn’t say up front that we think drones are cool and that we like working with one another. We’d also be lying if we didn’t say that there was some money involved! In the fall of 2016 our colleagues received a National Science Foundation grant for “Revolutionizing Engineering Departments.” We thought this would be a cool effort to join, so we pitched a collaborative class and crossed our fingers.

Austin: As we developed the class, and later wrote a book together, we realized how much engineering wrestles with social issues (whether it recognizes this or not) and how much social change efforts are supporting or resisting changes that engineers dreamed up in the first place. So, we had a spark, and from there we’ve built some very interesting fires. I’m not sure about that analogy, though!

Q4. Why do disciplinary silos create so few opportunities for students to engage with others beyond their chosen major? And why do you think that engaging students in challenging conversations at the intersection of technology and society is useful?

Austin: Universities are fossils. They were dreamed up four hundred years ago, and have been ticking along with only minor modifications ever since. That’s not entirely true, and we’re fortunate to work in institutional spaces that welcome innovation, but for the most part academics are hived off into their disciplines, and do a pretty good job self-policing so that we steer clear of one another. That’s a good way to avoid accidents. The problem is that if I steer clear of Gordon’s area of expertise, then we might not bump into one another! So we organize to prevent happy accidents. We think that’s silly. The world is made up of both hidebound institutions and happy accidents. We want our students to see that.

Gordon: So our idea is to take hackathons and maker spaces one step further, and push students together from all these different academic silos. Engineers and social change students both have to leave the university to work with people very different from them. We’re just moving some of that engagement into the classroom and our class projects.

Austin: The real world is fundamentally sociotechnical. All the time international aid groups, for example, are launching new initiatives around clean water; we’re saying this is good, but engineers, nonprofits, and local communities should all be working together. The alternative is one actor setting off on their own, and this often has unintended consequences. I mean, you remember the One Laptop Per Child campaign? Later it turned out that the thing it taught every student to do was to download pornography. If we want stuff to stick, we have to think sociotechnically.

Q5. Can you please explain your sociotechnical approach to interdisciplinary education?

Gordon: We bring students together from different majors, give them real problems to tackle, and then challenge them with reading and discussions that force them to face their own assumptions. We pop into and out of small group discussions, ask all the engineering students to be quiet while they listen to peace studies students, then flip the roles. For a lot of our students it’s the first time they’ve done anything like this. It’s challenging, but they seem to like it.

Q6. Do you have evidence that your approach is working and is valuable?

Austin: Yes. First of all, students tell us it’s working. But we have also incorporated cutting-edge methods for measuring learning, and then published a bunch of that work in the usual academic outlets, like conferences and journals.

Measurement is central for us, because, even from the beginning, we were both very interested in figuring out whether our methods were translating to student learning in a way we could document. In an early iteration of the class we had the benefit of working closely with a post-doc, Dr. Beth Reddy, now a professor at Colorado School of Mines, who helped us by leading interviews, focus groups, and classroom observations to see what impacts we were having on the students. While we won’t rehash the full findings from those papers here, suffice to say we do think these methods are having a measurable impact.

Q7. What are the main obstacles for effective interdisciplinary teaching?

Gordon: Time! It takes time to do this right, to get on the same page, to communicate clearly to students. Students want to understand the material, and also want to know how to do well in a class. Fortunately, we both agree on those things, but it still takes time to plan the class, then to communicate everything to students in a way that adds more signal than noise.

Q8. In your book you write about The Ethics of Drones. Can you please elaborate on this?

Austin: We are very concerned that drone use will be reserved for the already-powerful. I’m a social movement scholar, and am focused on maintaining balances of power between the state and the people, and between the haves and the have-nots. What happens if only governments and big business have drones? We want to democratize access to important tools for holding the powerful to account. I wrote a whole different book about that (The Good Drone, MIT Press, link), and we wanted our students to wrestle with some of those broader questions, whether or not they agree with me.


Gordon Hoople is an assistant professor and a founding faculty member of the Integrated Engineering Department at the University of San Diego’s Shiley-Marcos School of Engineering. His work focuses on engineering education and design. He is the principal investigator on the National Science Foundation Grant “Reimagining Energy: Exploring Inclusive Practices for Teaching Energy Concepts to Undergraduate Engineering Majors.” His design work occurs at the intersection of STEM and Art (STEAM). He recently completed the sculpture Unfolding Humanity, a 12-foot-tall, two-ton dodecahedron that explores the relationship between technology and humanity. Featured at Burning Man and Maker Faire, this sculpture brought together a team of over 80 faculty, students, and community members.

Austin Choi-Fitzpatrick is an associate professor of political sociology at the Kroc School of Peace Studies at the University of San Diego, and is concurrent associate professor of social movements and human rights at the University of Nottingham’s Rights Lab and School of Sociology and Social Policy. His work focuses on politics, culture, technology, and social change. His recent books include The Good Drone (MIT Press, 2020) and What Slaveholders Think (Columbia, 2017) and shorter work has appeared in Slate, Al Jazeera, the Guardian, Aeon, and HuffPo as well as articles in the requisite pile of academic journals.


– Drones for Good. How to Bring Sociotechnical Thinking into the Classroom. Gordon Hoople (University of San Diego) and Austin Choi-Fitzpatrick (University of San Diego; University of Nottingham). Morgan & Claypool, © 2020, 111 pages. ISBN: 9781681737744 | PDF ISBN: 9781681737751 | Hardcover ISBN: 9781681737768.

– The Good Drone: How Social Movements Democratize Surveillance (Acting with Technology). Austin Choi-Fitzpatrick, The MIT Press (July 28, 2020)

Related Posts

– Embedded EthiCS @ Harvard: bringing ethical reasoning into the computer science curriculum. December 17, 2019

– On CorrelAid: Data Science for Social Good. Q&A with André. August 28, 2019

Follow us on Twitter: @odbmsorg



Mar 19 20

On Continuous Integration and Software Flight Recording Technology. Interview with Barry Morris

by Roberto V. Zicari

“The key challenge, however, is the cultural change required within software engineering teams to evolve to a state where any software failure, no matter how insignificant it may seem, is unacceptable. No single software engineer, or team, possesses all of the technical experience required to keep a CI pipeline functioning at this level. There must be a cross-disciplined commitment to work towards this goal throughout the development lifecycle in order to be effective.” –Barry Morris

I have interviewed Barry Morris, well-known serial entrepreneur and currently CEO at Undo. We talked about the challenges of delivering high-quality software productively, the cost of persistent failures in Continuous Integration (CI) pipelines, and how Software Flight Recording Technology could help.


Q1. What typical challenges do software engineering teams face in delivering high-quality software productively?

Barry Morris: Reproducibility is the fundamental problem plaguing software engineering teams. The inability to rapidly, and reliably, reproduce test failures is slowing teams down. It blocks their development pipeline and prevents them from delivering software on time, and with confidence.

Organizations that can solve the issue of reproducibility are able to confidently deliver quality software on a scheduled, repeatable, and automated basis by eliminating the guesswork associated with defect diagnosis. The best part is that it does not require a complete overhaul of existing tool sets – rather an augmentation to current practices.

The key challenge, however, is the cultural change required within software engineering teams to evolve to a state where any software failure, no matter how insignificant it may seem, is unacceptable. No single software engineer, or team, possesses all of the technical experience required to keep a CI pipeline functioning at this level. There must be a cross-disciplined commitment to work towards this goal throughout the development lifecycle in order to be effective.

Q2. Software failures are inevitable. Do you believe the adoption of Continuous Integration (CI) as a key contributor to agile development workflows is the solution?

Barry Morris: Despite the best efforts of software engineering teams, there are too many situational factors outside of their direct control that can cause the software to fail. As teams add new features, new processes, new microservices, and new threading to their code, the risk of unpredictable failures grows exponentially.

The adoption of CI as a key contributor to agile development workflows is on the rise. I believe it is the key to delivering software at velocity and offers radical gains in both productivity and quality. According to a recent survey conducted by Cambridge University, 88% of enterprise software companies have adopted CI practices.

Q3. It seems that the volume of tests being run as a result of CI leads to a growing backlog of failing tests. Is it possible to have a zero-tolerance approach to failing tests?

Barry Morris: Unfortunately, the volume of tests being run as a result of CI leads to a growing backlog of failing tests – ticking time bombs just waiting to go off – costing shareholders $1.2 trillion in enterprise value every year.

True CI requires a zero-tolerance approach to software failures. Tests must pass reliably, and any failure represents a new regression. Failures that only show up once every 300 runs, or only under extreme conditions, make this even more challenging. The same survey also found that 83% of software engineers cannot keep their test suite clear of failing tests.
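To make the one-in-300 case concrete, here is a back-of-the-envelope sketch (my illustration, not from the survey) of how easily such an intermittent failure hides from a CI pipeline:

```python
def prob_never_seen(fail_rate: float, runs: int) -> float:
    """Probability that an intermittent failure is never observed
    across a given number of independent CI runs."""
    return (1.0 - fail_rate) ** runs

# A bug that strikes once every 300 runs has roughly a 37% chance of
# hiding through 300 consecutive CI runs -- and still a few percent
# chance of surviving 1,000 runs unseen.
p300 = prob_never_seen(1 / 300, 300)     # ~0.367
p1000 = prob_never_seen(1 / 300, 1000)   # ~0.035
```

This is why a zero-tolerance policy has to treat every observed failure as signal: by the time a rare failure shows up at all, re-running the suite and hoping is statistically a losing game.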

Q4. You offer what you call Software Flight Recording Technology (SFRT). What is it and what is it useful for?

Barry Morris: SFRT enables software engineering teams to record and capture all the details of a program’s execution, as it runs. The recorded output allows the team to then wind back the tape to any instruction that executed and see the full program state at that point. Whereas static analysis provides a prediction of what a program might do, SFRT provides complete visibility into what a program actually did, line by line.

SFRT can speed up time-to-resolution by a factor of 10 by eliminating guesswork, using real, actionable data-driven insights to get to the crux of the issue, faster. But the beauty of this kind of approach is that it is not simply a last line of defense against the most challenging defects (e.g., intermittent bugs, concurrency defects, etc.). Rather, it can be used to improve the time-to-resolution of all software failures.
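As a rough illustration of the record-and-rewind idea, here is a toy Python sketch (this is not how LiveRecorder itself works – Undo records native Linux processes at a much lower level) that snapshots program state at every executed line so any past step can be inspected after the fact:

```python
import copy
import sys

def record(func, *args):
    """Toy 'flight recorder': snapshot the line number and local
    variables at every step of func's execution."""
    tape = []

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is func.__code__:
            tape.append((frame.f_lineno, copy.deepcopy(dict(frame.f_locals))))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, tape

def buggy_sum(xs):
    total = 0
    for x in xs:
        total += x * x   # suppose this squaring is the defect under investigation
    return total

result, tape = record(buggy_sum, [1, 2, 3])
# "Wind back the tape": the full program state at every executed line is
# available after the fact; tape[-1] is the state at the return statement.
```

Instead of predicting what the program might have done, the recording shows what it actually did, line by line, which is the essence of the approach Morris describes.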

Q5. Is SFRT the equivalent of a black box on an aircraft?

Barry Morris: Yes, absolutely.

Q6. When a plane crashes, one of the first things responders do is locate the black box on board. How does it relate to software failures?

Barry Morris: When a plane crashes, one of the first things responders do is locate the black box on board. This device tells them everything the plane did – its trajectory, position, velocity, etc. – right up until the moment it crashed. SFRT can do the same for software, allowing software engineering teams to view a recording of what a program was doing before, during, and after a defect occurs.

Q7. Who has already successfully used Software Flight Recording Technology to capture test failures?

Barry Morris: SAP HANA, a heavily multi-threaded, feature-rich, in-memory database, is built from millions of lines of highly-optimized Linux C++ code. To ensure the software is high-quality and reliable, the engineering team invested considerably in CI and employed rigorous testing methodologies, including fuzz-testing.

However, non-deterministic test failures could not be reliably reproduced for debugging. Logs from failed runs did not capture enough information to identify the root cause of specific failures, and reproducing complex failures on live systems was time-consuming. This was slowing development down.

LiveRecorder, Undo’s platform based on Software Flight Recording Technology, was implemented to capture test failures. Recording files of those failing runs were then replayed and analyzed. With LiveRecorder, engineers could see exactly what their program did before it failed and why – allowing them to quickly home in on the root cause of software defects.

As a result, SAP HANA was able to accelerate software defect resolution in development, by eliminating the guesswork in software failure diagnosis. On top of significantly reducing time-to-resolution of defects, SAP HANA engineers managed to capture and fix 7 high-priority defects[1] – including a couple of race conditions, and a number of sporadic memory leaks and memory corruption defects.

Q8. What are the key questions to consider when developing CI success metrics?

Barry Morris: Every organization judges success differently. To some, finding a single, hard-to-reproduce bug per month is enough to deem changes to their CI pipeline as effective. Others consider the reduction in the amount of aggregate developer hours spent finding and fixing software defects per quarter as their key performance indicator. Speed to delivery, decrease in backlog, and product reliability are also common metrics tracked.

Whatever the success criteria, it should reflect the overarching goals of the larger software engineering team, or even corporate objectives. To ensure that teams measure and monitor the success criteria that matters most to them, software engineering managers and team leads should establish their own KPIs.

Some questions to consider when developing CI success metrics:

  • Is code shipped earlier than previous deployments?
  • How many defects are currently in the backlog compared to last week/month?
  • Are developers spending less time debugging?
  • Are other teams waiting for updates?
  • How many developer hours does it take to find and fix a single bug?
  • How long does it take to reproduce a failure?
  • How long does it take to fix a failure once found?
  • What is the average cost to the organization of each failure?

These questions are designed as an initial starting point. As mentioned earlier, each organization is different and places value on certain aspects of CI depending on team dynamics and needs. What’s important is to establish a baseline to ensure agreement and commitment across teams, and to benchmark progress.
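As a concrete, purely hypothetical illustration of turning those questions into numbers, a team might compute a few such KPIs directly from its defect-tracker export (all fields and figures below are invented):

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Defect:
    hours_to_reproduce: float
    hours_to_fix: float
    resolved: bool

# A hypothetical defect-tracker export for one sprint.
defects = [
    Defect(6.0, 2.0, True),
    Defect(30.0, 4.0, True),   # a hard-to-reproduce intermittent failure
    Defect(1.5, 0.5, True),
    Defect(0.0, 0.0, False),   # still sitting in the backlog
]

resolved = [d for d in defects if d.resolved]
kpis = {
    "backlog_size": sum(1 for d in defects if not d.resolved),
    "mean_hours_to_reproduce": mean(d.hours_to_reproduce for d in resolved),
    "mean_hours_to_fix": mean(d.hours_to_fix for d in resolved),
}
```

Tracked sprint over sprint, even simple aggregates like these give the baseline Morris recommends for benchmarking progress across teams.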


Barry Morris

Barry Morris, CEO, Undo.

With over 25 years’ experience working in enterprise software and database systems, Barry is a prodigious company builder, scaling start-ups and publicly held companies alike. He was CEO of distributed service-oriented architecture (SOA) specialists IONA Technologies between 2000 and 2003 and built the company up to $180m in revenues and a $2bn valuation.

A serial entrepreneur, Barry founded NuoDB in 2008 and most recently served as its Executive Chairman. He was appointed CEO of Undo in September 2018 to lead the company’s high-growth phase.


– Research Report: The Business Value of Optimizing CI Pipelines. Judge Business School, University of Cambridge, in partnership with Undo (link to download the report – registration required)

–  3 Key Findings from our CI Research Report, Undo Blog post:

The research concluded three key findings:

  1. Adoption of CI best practices is on the rise. 88% of enterprise software companies say they have adopted CI practices, compared to 70% in 2015
  2. Reproducing software failures is impeding delivery speed. 41% of respondents say getting the bug to reproduce is the biggest barrier to finding and fixing bugs faster; and 56% say they could release software 1-2 days faster if reproducing failures wasn’t an issue
  3. Failing tests cost the enterprise software market $61 billion. This equals 620 million developer hours a year wasted on debugging software failures

[1] Improving Software Quality in SAP HANA, 2018

– Technical Paper: Software Flight Recording Technology, Undo (link: registration required to download the paper.)

Related Posts

– On Software Reliability. Interview with Barry Morris and Dale Vile. ODBMS Industry Watch, April 2, 2019

– Go Green Stay Green. Q&A with Greg Law. July 1, 2019

– On Software Quality. Q&A with Alexander Boehm and Greg Law. November 26, 2018

– Integrating Non-Volatile Memory into an Existing, Enterprise-Class In-Memory DBMS. By Alexander Böhm. July 18, 2017

Follow us on Twitter: @odbmsorg



Feb 13 20

On AI for Insurance and Risk Management. Interview with Sastry Durvasula

by Roberto V. Zicari

“AI in complex global industries is in a league of its own, with many opportunities, many risks and many rewards! We definitely see AI having a major impact on the entire risk and insurance industry value chain from improving customer experience to changing core insurance processes to creating next-gen risk products.” –Sastry Durvasula

I have interviewed Sastry Durvasula, Chief Digital Officer and Chief Data & Analytics Officer at Marsh, Inc.


Q1: You are Marsh’s Chief Digital Officer and Chief Data & Analytics Officer. What are your main priorities?

Sastry Durvasula: My primary focus is leading Marsh’s global digital, data and analytics strategy and transformation, while building new digital-native businesses and growth opportunities. This includes development of next-gen digital platforms and products; data science and modelling; client-facing technology; and digital experiences for clients, carrier partners and colleagues. We also launched Marsh Digital Labs to incubate emerging tech, InsurTech partnerships, and forge industry alliances. Another key aspect of the role is to drive digital culture transformation across the company.

Q2: Can you talk briefly about Marsh Digital Labs?

Sastry Durvasula: We established Marsh Digital Labs as an incubator for developing innovative insurance products, running select tech experiments and supporting strategic engagements with clients, insurance carriers and InsurTechs. The Labs has an innovation funnel process whereby we select and move ideas from concepts to actual market pilots before handing off to the product teams for full-scale development. This allows us to be agile, fail fast and demonstrate product viability, which is critical in today’s fast-changing tech landscape. Our most recent pilot was RiskExchange, a blockchain for trade credit insurance, which was actually the winning idea from our global colleague hackathon called #marshathon.

We’re currently focused on three emerging tech areas – AI/ML, Blockchain and IoT – and exploring a number of new insurance products and distribution channels in the small commercial and consumer sector, as well as in the sharing economy, cyber, autonomous vehicles, and worker safety areas. But ongoing R&D is a core component of the Labs, too, and we collaborate with a number of industry, academia and open-source initiatives. And we need to cut through all the hype and focus on use cases that create true business impact. For example, the Labs has a dedicated unit right now working on using AI and IoT to develop next-gen risk model capabilities that leverage new streams of real-time data, cloud-based platforms, and machine learning algorithms.

Q3: Can you talk a little bit about your overall data infrastructure and the new data streams you are exploring?

Sastry Durvasula: Yes, absolutely. We implemented the Marsh big data ecosystem leveraging multi-cloud platform and capabilities, advanced analytics and visualization tools, and API-based integrations. It has been built to support data in any format, source or velocity with dynamic scalability on processing and storage. Data privacy and governance are safeguarded with metadata and controls built-in.

Keep in mind that traditional risk management and insurance placement is mostly done using static exposure data that gets updated typically only during policy renewal. We are actively working on changing the game by bringing in a wide variety of newer data streams, including IoT data and other external sources, in order to quantify and manage risks better.

For example, in the marine and shipping industry this includes behavioral data such as vessel statistics, movements, machinery and weather information, combined with historical claims data. We can get a more accurate picture of risk and can price more accurately. To assist with these metrics, we recently launched a partnership with InsurTech firm Concirrus that specializes in marine analytics. Similarly, in property risk we are looking at factors such as building integrity as measured by vibrations or earthquake potential,  damage from water leakage as measured by sensors or actuators, and so on. In telematics, we can use real-time GPS and speed data, as well as driving behavioral data like braking, acceleration and so on.

We are also researching the overall risk profile of smaller enterprise clients by leveraging third-party external sources such as news, social, government and other regulatory or compliance filings. So, there is a wide variety of data and data types that we deal with or are actively exploring.

Q4: What is exciting about AI in the insurance and risk management space?

Sastry Durvasula: AI in complex global industries is in a league of its own, with many opportunities, many risks and many rewards. We definitely see AI having a major impact on the entire risk and insurance industry value chain from improving customer experience to changing core insurance processes to creating next-gen risk products.

Underwriting based on AI models working on dynamic data streams will result in usage-based and on-demand insurance offerings. We will also see systems that allow straight-through quoting, placement and binding of selected risks powered by AI. For insurance brokers and carriers, this will allow more intelligent risk selection methods.

Claims is another area where AI will have a major impact including automated claims management, claims fraud detection, and intelligent automation of the overall process.

Accelerating use of AI in many industries will have an impact on risk liability models. For example, as AI-powered autonomous vehicles become mainstream, liability shifts from personal auto coverage to a commercial product liability held by the manufacturer. So the insurance industry a decade from now may look quite different from today.

We have also been working on conversational AI and chatbots to support various client-facing and colleague-facing initiatives. AI will play a big role in intelligent automation – insurance is an industry with vast numbers of documents and is very manual and process-oriented. By providing AI-powered human-augmentation functions that improve and enhance the manual processes, we will see efficiencies in the overall industry.

Q5: What are some of the emerging risk and insurance products that Marsh is working on?

Sastry Durvasula: We have several new products targeting either different risks or different market segments. We recently launched Blue[i] next-gen analytics and AI suite, powered by Marsh’s big data ecosystem. Many of Marsh’s big enterprise customers retain significant risk in their portfolio. In fact, in many cases, the premium paid for risk transfer to the insurance markets is only a certain percentage of the Total Cost of Risk (TCOR) to that company. Worker’s compensation is one of the biggest risks in the US, costing employers nearly $100B annually. Our Blue[i] ML models powered by behavioral data and real-time insights help with the prediction of and reduction in claims, as well as reduction in insurance premiums.

Cyber Risk is definitely one of the fastest growing risk categories in the world. We launched market-leading solutions to understand, quantify and manage an enterprise’s cyber risk. These include several proprietary ways of quantifying cyber exposure, cyber business interruption and data breach impacts. These techniques will get more sophisticated as we build out our AI capabilities and increase our data sources.

Pandemic risk is another emerging risk category that we are building out solutions for. In partnership with a Silicon Valley based startup called Metabiota and re-insurer MunichRe, we have created an integrated pandemic risk quantification and insurance solution targeted at key industries in the travel, aviation, hospitality and educational sectors.

In addition to emerging risk products, we have also been innovating on digital solutions in the small commercial and consumer space. We launched Bluestream, a cloud-based digital broker platform for affinity clients, providing them with a new, streamlined way to offer insurance products and services to their customers, contractors, and employees.

Q6: Can you elaborate on how AI and IoT enable real-time risk management?

Sastry Durvasula: AI and new IoT data streams are making real-time risk management a possibility because enterprises have an up-to-the-minute view of changing risk exposures and can effectively take actions to mitigate them. It changes how risks are calculated – from traditional actuarial models based on historic events to AI-powered analytics that support dynamic views of risk, triggering mitigating actions.

For example, in the marine use case, cargo insurance policies can be repriced in real-time based on the operator behavior, value of cargo, sea and weather conditions, and many other dynamic variables. In addition to repriced risk, the operator can also be ‘nudged’ to take less risky actions in exchange for reduced insurance pricing.
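The repricing logic described above might be sketched as follows; every factor name and weight here is invented for illustration and bears no relation to Marsh’s actual models:

```python
def reprice_premium(base_premium: float, telemetry: dict) -> float:
    """Toy dynamic repricing for a marine cargo policy: scale a base
    premium by real-time risk factors (all names/weights illustrative)."""
    multiplier = 1.0
    if telemetry.get("storm_warning"):
        multiplier *= 1.25                       # adverse weather on route
    if telemetry.get("speed_knots", 0) > 20:
        multiplier *= 1.10                       # aggressive operation
    if telemetry.get("route_piracy_risk", "low") == "high":
        multiplier *= 1.40                       # high-risk waters
    return round(base_premium * multiplier, 2)

calm = reprice_premium(1000.0, {"speed_knots": 14})                           # 1000.0
stormy = reprice_premium(1000.0, {"storm_warning": True, "speed_knots": 24})  # 1375.0
```

The ‘nudge’ falls out of the same function: the operator can see exactly which behavior (here, slowing below 20 knots) would lower the multiplier and hence the premium.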

We are also actively leveraging wearables to drive reduction in workers compensation claims based on repetitive motion as well as to improve worker safety. By using data from wearables such as smart belts that measure an employee’s sitting, standing, bending, twisting, walking and other repetitive motion actions, dashboards are created to collect and show individual and aggregate movement and locations. Our models recommend ways to improve the client’s safety as well as ergonomic plans to reduce injury and claims likelihood.

Q7: There is a lot of concern around possible malicious use of AI as the technology progresses. Can you talk about some of the risks posed by AI?

Sastry Durvasula: Definitely, this is an important area for us going into the future. AI models are not perfect at all – in fact, far from it. AI models trained on data sets containing unintentional human biases will reflect that same prejudice in their predictions. We are also starting to see more and more cases where opaque AI models resulted in inscrutable errors that were only uncovered after lengthy lawsuits. As more and more complex AI algos and models make their way to the enterprise, it has become very urgent to incorporate accountability and trust criteria into different stages of model creation. This ranges from being on the lookout for bias in training data, to building ‘explainable and interpretable’ models, to having a meaningful appeals process. Thoughtful regulation definitely needs to be introduced in a way that does not impede technological progress, but pushes it in the right direction.
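One simple, concrete check in the spirit of “being on the lookout for bias” is a demographic-parity gap over a model’s positive predictions. The sketch below uses toy data; a real fairness audit would use many complementary metrics:

```python
def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between groups --
    one simple bias check among the many a real audit would use."""
    rates = {}
    for g in set(groups):
        picks = [p for p, gg in zip(predictions, groups) if gg == g]
        rates[g] = sum(picks) / len(picks)
    return max(rates.values()) - min(rates.values())

# Toy model outputs (1 = approve) for applicants from two groups.
preds  = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(preds, groups)   # 0.75 vs 0.25 -> gap of 0.5
```

A gap this large would flag the model for the kind of review and appeals process described above, before any talk of deployment.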

In addition to the above, AI is already causing major headaches by amplifying the ability of bad actors – whether it is automating hacking attempts that make corporate security even harder, or causing broader global harm with fake news and propaganda, or making existing weaponry more destructive. As mentioned earlier, cyber risk is the fastest growing risk category and AI will only add more fuel to the fire.

Not everything around AI is increasing risk, though. Apparently, 90% of auto accidents are caused by human errors. So in this case the rise of AI-powered autonomous vehicles may actually bring down overall driving risk as they become more mainstream!

Q8: How do you see AI governance evolving at the enterprise level?

Sastry Durvasula: AI governance is definitely an area that will get a lot of attention over the next 18-24 months and beyond as more and more AI models are implemented by firms across various industries. Operationalizing AI systems is a complex multi-step process that is also complicated by the fact that AI models can drift in performance, especially if they have feedback loops and are training continuously. In addition, AI models can vary in the degree of autonomy – for example, a low autonomy model that supports human augmentation may require less governance as opposed to completely autonomous systems that necessitate a very high degree of governance.

At the very least, we see the following issues as key for AI governance in enterprise systems: explainability, interpretability, and accountability.

The first refers to explainability standards – understanding why an AI system is behaving in a specific way, or whether the AI can be explained at all. This will be critical to improving overall trust in the accuracy and appropriateness of the predictions. The interpretability of AI algorithms and models will also be a key feature. Finally, accountability tools, such as the ability to audit a model or ways to contest a prediction, will be needed.

Other important issues are the ability to stop biases from creeping into models as well as incorporating appropriate safety controls into the overall system. Safety can be improved with continuous monitoring to check whether the AI system violated any safety constraints, and automatic failover or human override in the case of any suspected safety breach. The quandary about how to limit biases converges with the dilemma around AI ethics – should ethical AI be approached through self-regulation in the development of AI tech, or by creating ‘moral machines’ where ethics and values are built into the machine. In either case, ethics is generally open to interpretation and is not yet in the legal framework.

In addition, as a risk management company, we are always on the lookout for liability issues for our enterprise clients. As clients implement AI, it has to be noted that some person or organization is still ultimately responsible for the actions of the AI systems under their control – no matter how complex or sophisticated the AI model is. On top of that, most enterprise systems will typically rely on AI models developed by a tech company. In many such scenarios, it is not clear where the liability lies in the case of an incident. For example, if an autonomous vehicle has an accident due to an AI model failure, it is not clear whether the vehicle manufacturer is liable, or the AI software provider, or perhaps even the AI chip vendor. We are at the very early stages of such complex liability frameworks, and governments may need to step in with clear regulatory guidelines. These are early days, but we expect to see a flurry of activity in this area soon.

Q9: How are you attracting top talent in AI, analytics and other emerging tech areas?

Sastry Durvasula: Talent is a big focus area for us. We have been able to attract a number of engineering and product experts, and data science talent, with diverse industry backgrounds. In the US, we hired the head of Labs in Silicon Valley, the head of data science in New York, and built our digital hub in Phoenix. We recently launched global innovation centers in select locations to attract regional talent, and have been forging industry and academia alliances.

It is equally important to keep the team energized and provide cross-functional development opportunities. There are some very interesting and complex data, analytics and digital problems in the risk and insurance space as I discussed earlier. We focus on shedding light on them, building an agile culture, and fostering experimentation.

As an example, we launched a global colleague hackathon called #marshathon that had amazing response and participation. The winning teams get to partner with our Labs to incubate the idea and launch in-market pilots. We also launched the first-ever all-women hackathon in the industry called #ReWRITE, for Women, Risk, Insurance, Tech and Empowerment, in the US and Europe working with Girls in Tech and other industry partners. It was a great opportunity for women technologists from universities, startups and other corporations to network, learn and hack some innovative ideas utilizing AI, IoT, blockchain and other digital technologies.

Q10: Have you seen any significant or notable changes in the risk and insurance industry from when you started?

Sastry Durvasula: Where there is risk, there is opportunity. We are seeing increased momentum and significant investments in digital, data and analytics, and InsurTech is gaining speed. Digital has become a Board level topic in the industry. New collaborations and consortia are forming, especially leveraging the power of Blockchain and other emerging technologies.

There are as many opportunities as there are challenges both on the demand side and the supply side of the value chain. The rapidly changing cyber risk landscape, increased surface area with IoT devices, autonomous vehicles, sharing and gig economy, and other Industry 4.0 advancements are bringing new opportunities while adding new complexities in a tightly regulated environment.

Legacy operational systems are the main constraint holding the industry back from fully capitalizing on these opportunities and addressing the challenges, and companies need to make digital transformation a strategic and relentless priority. As Yoda would say, “Do. Or do not. There is no try.”

Sastry Durvasula
Chief Digital and Chief Data & Analytics Officer, Marsh

Sastry is CDO and CDAO of Marsh, the world’s leading insurance broker and risk adviser. He leads the company’s digital, data and analytics strategy and transformation, while building new digital-native businesses and growth opportunities. This includes development of innovative digital platforms and products, data science & modelling, client-facing technology, and digital experiences across global business units. In his previous role at American Express, Sastry led global data and digital transformation across the lifecycle of cardmembers and merchants, driving innovation in digital payments and commerce, big data, machine learning, and customer experience.

Sastry plays a leading role in industry consortia, CDO/CIO forums, FinTech/InsurTech partnerships, and building academia/research affiliations. He is a strong advocate for diversity & inclusion and is on the Board of Directors for Girls in Tech, the global non-profit that works to put an end to gender inequality. Sastry launched an industry-wide initiative called #ReWRITE focused on Women, Risk, Insurance, Technology & Empowerment. He holds a Master’s degree in Engineering, is credited with 20+ patents and has been the recipient of several industry awards for innovation and leadership.


The Ethics of Artificial Intelligence, Frankfurt Big Data Lab.

Related Posts

On The Global AI Index. Interview with Alexandra Mousavizadeh, ODBMS Industry Watch, 2020-01-18

On Innovation and Digital Technology. Interview with Rahmyn Kress, ODBMS Industry Watch, 2019-09-30

On Digital Transformation, Big Data, Advanced Analytics, AI for the Financial Sector. Interview with Kerem Tomak, ODBMS Industry Watch, 2019-07-08

Follow us on Twitter: @odbmsorg


Jan 18 20

On The Global AI Index. Interview with Alexandra Mousavizadeh

by Roberto V. Zicari
“The US is the undisputed leader in AI development, the Index shows. The western superpower scored almost twice as highly as second-placed China, thanks to the quality of its research, talent and private funding. America was ahead on the majority of key metrics – and by a significant margin. However, on current growth experts predict China will overtake the US in just five to 10 years.” –Alexandra Mousavizadeh.
I have interviewed Alexandra Mousavizadeh, Partner, Tortoise Media, Director, Tortoise Intelligence. We talked about “The Global AI Index”. 

Q1. On 3 December 2019 in London, you released “The Global AI Index”, ranking 54 countries. What was the prime motivation for producing such an index?

Alexandra Mousavizadeh: Artificial intelligence is an engine of change, for better or for worse. Increasingly, our daily lives are impacted by technologies using machine learning, and businesses are using them to support more and more of their processes.

Our motivation for producing the Index here at Tortoise was to monitor and help explain this change on a global scale. The initial request came from three governments seeking a comprehensive and detailed index that would help them set and track their national AI strategies. As a news company focused on understanding what forces are driving geopolitical, environmental and social change, we knew we needed to focus on artificial intelligence. At Tortoise Intelligence, our data and analytics team, the tool for doing this is the composite index.

Q2. How did you choose the 54 countries?

Alexandra Mousavizadeh: The 54 countries were chosen to represent those that had lifted artificial intelligence to the top of the national agenda in some way: publishing a national strategy, appointing a minister for artificial intelligence, or setting up public and private sector collaborations and institutes.
Ultimately, the list of 54 represents the countries in which data was beginning to be gathered on the relevant factors, and those that are stepping onto the world stage in terms of development.

Q3. Of the 150 indicators you have chosen, which one(s) are most relevant for the ranking?

Alexandra Mousavizadeh:  Our overall approach was to represent the fact that:

Artificial intelligence is still the product of human intelligence, and therefore talent is a priority: talented practitioners and developers who can innovate and implement new technologies. A leading indicator, and one that is very relevant to the ranking, is the number of data scientists active in a given country. For this indicator we drew data from GitHub, StackOverflow, Kaggle and LinkedIn.

Research into artificial intelligence is also a leading factor, making skilled researchers and the generation of new understanding and techniques another priority. A leading indicator, and another that impacts the rankings significantly, is the number of researchers published in top-rated journals in a given country.

Finally, money remains the primary catalyst of activity on artificial intelligence. Talent and research come at a premium to businesses and other institutions, so commercial funding is another leading indicator, with the total amount of investment into artificial intelligence companies being a particularly impactful one.

Q4. What criteria did you use to weight each indicator for importance?

Alexandra Mousavizadeh: Throughout the course of our consultations with the advisory boards, and the many ThinkIns held at Tortoise during the development of the Index, we put together a model for explaining the significance of each sub-pillar in terms of building capacity for artificial intelligence. As described, the leading factors were talent, research and investment; mostly expressing that financial and intellectual capital currently trump all other factors.

Our experts were consulted across the full range of indicators, and we reached a consensus on the importance. We recognised that this remains a subjectively constructed set of weightings, which is why we have conducted testing to demonstrate that the impact of the weightings is relatively insignificant compared to the impact of the actual values themselves.
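The interview does not publish the Index's actual formula, but the standard composite-index pipeline it alludes to (min-max normalisation of each indicator followed by a weighted sum, per the OECD handbook cited later) can be sketched as follows. The indicator names, values and weights here are purely illustrative, not the Index's own:

```python
# Illustrative composite-index construction: min-max normalise each
# indicator to a 0-100 scale, then combine with expert-chosen weights.
# All indicator names, values and weights below are hypothetical.

def min_max_scale(values):
    """Rescale a {country: raw_value} dict to the 0-100 range."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo or 1  # guard against a constant indicator
    return {c: 100 * (v - lo) / span for c, v in values.items()}

def composite_score(indicators, weights):
    """Weighted sum of normalised indicators for each country."""
    scaled = {name: min_max_scale(vals) for name, vals in indicators.items()}
    countries = next(iter(indicators.values())).keys()
    return {
        c: sum(weights[name] * scaled[name][c] for name in indicators)
        for c in countries
    }

indicators = {
    "talent":     {"US": 950, "China": 400, "UK": 300},
    "research":   {"US": 800, "China": 600, "UK": 350},
    "investment": {"US": 700, "China": 500, "UK": 200},
}
weights = {"talent": 0.4, "research": 0.35, "investment": 0.25}

scores = composite_score(indicators, weights)
# A country that tops every indicator scores 100 regardless of weights,
# which echoes the point that values matter more than the weightings.
```

The sensitivity testing mentioned above would amount to re-running `composite_score` with perturbed weights and checking that the ranking is stable.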

Q5. Why have you presented an index ranking on capacity?

Alexandra Mousavizadeh: At present the availability of information is growing rapidly, and the question of how to manage and interpret this information is growing more urgent. Composite indicators meet the need to consolidate – through aggregation – a large amount of data into a set of simplified numbers that encompass and reflect the underlying complexity of the information. All indices constructed from composite indicators should be interpreted with caution, and scrutinised carefully before important conclusions are drawn. In alignment with the OECD ‘Handbook on Constructing Composite Indicators’, ‘capacity’ is the multi-dimensional concept and the underlying model around which the individual indicators of The Global AI Index are compiled.

Capacity – the amount of something that a system can contain or produce – is the organising concept of The Global AI Index. It is an appropriate means of considering the relationship between the different relevant factors that exist within a given nation. Increased capacity, in this case, can be understood as an increased ability to generate and sustain artificial intelligence solutions, now and in the future. The Artificial Intelligence for Development Organisation talks about ‘capacity’ for exactly this reason; it speaks both to the current organisation of productive factors that contribute to technological development, as well as future potential for generating new innovations in their use, and in the design of the technologies themselves.

Q6. Is it reasonable to compare nations of vastly different sizes when considering capacity?

Alexandra Mousavizadeh: We have constructed our data set to demonstrate both gross capacity and proportional capacity – or intensity – with the intensity rankings being very different from the headline rankings for gross capacity. We believe that the answer to this question hinges on what you believe the purpose of a comparative index is; we think that such indices are an excellent tool for condensing a lot of complexity into a simpler conclusion that can be understood and tackled by experts and non-experts alike.
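The contrast between gross and intensity rankings can be made concrete with a toy calculation; the countries, scores and populations below are made up for illustration:

```python
# Illustrative contrast between gross-capacity and intensity (per-capita)
# rankings. All numbers are hypothetical, not the Index's actual data.

def ranking(scores):
    """Country names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

capacity = {"US": 100, "China": 55, "Israel": 8}        # gross score
population_m = {"US": 330, "China": 1400, "Israel": 9}  # population, millions

# Intensity: capacity normalised by population.
intensity = {c: capacity[c] / population_m[c] for c in capacity}

gross_rank = ranking(capacity)       # US first on raw capacity
intensity_rank = ranking(intensity)  # a small country can lead per capita
```

With these invented numbers, the US tops the gross ranking while Israel tops the intensity ranking, which is the kind of divergence the answer describes.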

By creating a number of clusters within the 54 countries we have tried to present the rankings in a more like-for-like way. For example, the UK can be considered in relation to the full set, and to its closest competitors, which we call the ‘Traditional Champions’ of higher education, research and governance. These nations, including Canada, France and Germany, are facing some of the same challenges when it comes to development and adoption. In future editions we may choose to dig more deeply into the question of intensity versus raw capacity.

Q7. What data sources did you use for The Global AI Index? How did you handle missing or incorrect values?

Alexandra Mousavizadeh: The vast majority of sources used for The Global AI Index are publicly available and open source; only one is proprietary: the Crunchbase API, which was drawn on for data in the ‘Commercial Ventures’ sub-pillar. A full list of the sources used in The Global AI Index is available in the indicator table. Some headline sources are Crunchbase, GLUE, IEEE, the GitHub API, LinkedIn and SCOPUS.

Missing values represent approximately 4.5% of the collected data-set for The Global AI Index. There was a limited amount of data available with which to train an imputation model – although this was strongly considered as an option – and as such there are a variety of imputation techniques employed.

Imputation by zero – used when data is not pre-defined but zero is the logical or necessary value; e.g., if the number of Kaggle Grandmasters is empty, it is most likely because a country has never had one.

Imputation by average value – used when the variable in question is independent of a country’s population size or GDP; placing the mean or median value in place of a missing value.

Imputation by last observation carried forward – used when alternative data sources show only values from previous years; in some cases previous values are taken as indicators of a country’s current state.

Imputation by model – used where there are obvious relationships with a country’s demographics (population, GDP, employment rates, etc.). In some cases it was necessary to build a generalised linear model to predict what value should be used.
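The first three of these imputation strategies are simple enough to sketch directly; model-based imputation (fitting a generalised linear model on demographics) is omitted here for brevity. The column names and values are hypothetical:

```python
# Illustrative sketch of three of the imputation strategies described
# above, applied to {country: value} columns with None for missing data.
# All data below is made up; this is not the Index's actual pipeline.

from statistics import median

def impute_by_zero(col):
    """Missing means 'none observed' (e.g. no Kaggle Grandmasters)."""
    return {c: (0 if v is None else v) for c, v in col.items()}

def impute_by_median(col):
    """Fill gaps with the median of observed values, for variables
    independent of population size or GDP."""
    m = median(v for v in col.values() if v is not None)
    return {c: (m if v is None else v) for c, v in col.items()}

def impute_by_last_observation(history):
    """Carry the most recent non-missing yearly value forward."""
    filled, last = [], None
    for v in history:
        last = v if v is not None else last
        filled.append(last)
    return filled

grandmasters = {"US": 30, "UK": 5, "Kenya": None}
papers = {"US": 800, "UK": 350, "Kenya": None}
uk_funding_by_year = [120, None, None, 180]

gm_filled = impute_by_zero(grandmasters)                  # Kenya -> 0
papers_filled = impute_by_median(papers)                  # Kenya -> 575
funding = impute_by_last_observation(uk_funding_by_year)  # [120, 120, 120, 180]
```

Each helper takes and returns plain dicts or lists, so the four strategies can be mixed per indicator, as the answer describes.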

Q8. What are the key findings?

Alexandra Mousavizadeh: We believe that the key findings of the Index to date are:
The US is the undisputed leader in AI development, the Index shows. The western superpower scored almost twice as highly as second-placed China, thanks to the quality of its research, talent and private funding. America was ahead on the majority of key metrics – and by a significant margin. However, on current growth experts predict China will overtake the US in just five to 10 years.

China is the fastest growing AI country, our Index finds, overtaking the UK on metrics ranging from code contributions to research papers in the past two years. Last year, 85 per cent of all facial recognition patents were filed in China, as the communist country tightened its grip on the controversial technology. Beijing has already been condemned for using facial recognition to track and profile ethnic Muslims in its western region.

Britain is in third place thanks to a vibrant AI talent pool and an excellent academic reputation. This country has spawned hugely successful AI companies such as DeepMind, a startup founded in 2010 which was bought by Google four years later for $500 million. Britain has been held back, however, by one of the slowest patent application processes of any of the 54 countries. Other countries are snapping at its heels.

Q9. What other findings did you find relevant or surprising?

Alexandra Mousavizadeh: Despite playing a starring role in the space race and the nuclear arms race, Russia is a small player in the AI revolution, our data suggests. The country only comes 30th out of 54 nations, pushed down by its failure to attract top talent, and by a lack of research. Anxious to catch up, President Vladimir Putin announced last year a new centre for artificial intelligence hosted at the Moscow Institute for Physics and Technologies.

Smaller countries – such as Israel, Ireland, New Zealand and Finland – have developed vibrant AI economies thanks to flexible visa requirements and positive government intervention. Israel’s Mobileye Vision Technology, which provides technology for autonomous vehicles, was purchased in 2017 by Intel for $15.3 billion.

More than $35 billion has been publicly earmarked by governments to spend on AI development over the next decade, with $22 billion promised by China alone. Many more billions may have been allocated secretly through defence departments which are not made public.

Countries are using AI in very different ways. Russia and Israel are among the countries focusing AI development on military applications. Japan, by contrast, is predominantly using the technology to cope with its ageing population.

Q10. What do we learn overall from this index?

Alexandra Mousavizadeh: We’ve learned more about the vast scale of activity on artificial intelligence and cut through some of the noise about how and why it is changing the world. We’ve been able to uncover a lot of information about collaboration between supposed rivals, informal learning of coding and machine learning skills, and a lot about the availability and competition for talent.

Q11. What were the main challenges in creating such an Index?

Alexandra Mousavizadeh: Building up a network of people who are sufficiently knowledgeable to scrutinise and comment on the process; dealing with a vast number of data points that need to be normalised and made comparable; and checking the provenance and robustness of the data points.

Q12. Where are the ethical considerations in this index?

Alexandra Mousavizadeh: Ethics have been a major focus in our conversation about artificial intelligence. We decided that The Global AI Index would solely measure capacity and activity. An index on AI Ethics is planned for this year.

Firstly, the most ethical model for developing and adopting artificial intelligence just hasn’t emerged yet, and perhaps it never will. This lack of consensus makes it more difficult to select variables that show better or worse ethical considerations.

However, we have significant plans throughout 2020 to build upon our work on ethics and artificial intelligence. We hope this work will amount to another product within the year, one that reflects the complexities of governance in relation to artificial intelligence and where in the world the most is being done to safeguard good outcomes for all.

Q13. The fast-changing processes of innovation and implementation in artificial intelligence require constant re-examination. How do you intend to keep up with such constant change, and how do you plan to improve the index in the future?

Alexandra Mousavizadeh: We have planned a bi-annual refresh of the Index, drawing in new values for a range of our indicators to keep the rankings dynamic.

Our series of ThinkIns and events at Tortoise will also continue throughout the year. These represent fantastic opportunities to build upon our methodology and move the conversation into new areas. We are currently hoping to improve the index by:

Adding more data on imports and exports of computing hardware and chip designs, and expanding our data reach on patents.

Including data capture statistics, in an attempt to show which nations are building the largest and most useful data-sets. This will also fit into our investigation of data privacy and governance. Our most recent ThinkIn – which you can watch here – on ‘data rules’ focused on the various models for using data and the risks associated with each.

Q14. Are you planning to release the data open source?

Alexandra Mousavizadeh: We’ve already shared the underlying data-set with a range of partners and interested parties. Ultimately we hope the Index will be a tool for developing better understanding, and we will look to share the data as part of the ongoing conversation.
Alexandra Mousavizadeh is a Partner at Tortoise Media, running the Intelligence team, which develops indices and data analytics. She is the creator of the recently released Responsibility100 Index and the new Global AI Index. She has 20 years’ experience in the ratings and index business and has worked extensively across the Middle East and Africa. Previously, she directed the expansion of the Legatum Institute’s flagship publication, The Prosperity Index, and all its bespoke metrics-based analysis & policy design for governments. Prior roles include CEO of ARC Ratings, a global emerging-markets ratings agency; Sovereign Analyst for Moody’s covering Africa; and head of Country Risk Management, EMEA, Morgan Stanley.


Related Posts
Follow us on Twitter: @odbmsorg
Jan 2 20

On Kubernetes, Hybrid and Multi-cloud. Interview with Jonathan Ellis

by Roberto V. Zicari

“Container and orchestration technologies have made a quantum leap in manageability for microservice architectures.  Kubernetes is the clear winner in this space.  It’s taken a little longer, but recently Kubernetes has turned a corner in its maturity and readiness to handle stateful workloads, so you’re going to see 2020 be the year of Kubernetes adoption in the database space in particular. “— Jonathan Ellis.

I have interviewed Jonathan Ellis, Co-Founder and CTO at DataStax. We talked about Kubernetes, Hybrid and Multi-cloud.  In addition, Jonathan tells us his 2020 predictions and thoughts around migrating from relational to NoSQL.

                                                                   Happy and Healthy New Year! RVZ

Q1. Hybrid cloud vs. multi-cloud: What’s the difference?

Jonathan Ellis: Both hybrid and multi-cloud involve spreading your data across more than one kind of infrastructure.  As most people use the terms, the difference is that hybrid cloud involves a mix of public cloud services and self-managed data center resources, while multi-cloud involves using multiple public cloud services together, like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).

Importantly, multi-cloud is more than using multiple regions within one cloud provider’s infrastructure. Multiple regions can provide resiliency and distribution of your data (although outages with a large enough blast radius can still affect multiple regions, like Azure’s global DNS outage earlier this year), but you’re still limited to the features of a single provider rather than a true multi-cloud environment.

Q2. What is your advice: When is it better to use on-prem, or hybrid, or multi-cloud?

Jonathan Ellis: There are three main areas to consider when evaluating the infrastructure options for an application.  The best approach will depend on what you want to optimize for.

The first thing to consider is agility: cloud services offer significant advantages in how quickly you can spin infrastructure up and down, allowing you to concentrate on creating value on the software and data side. But the flip side of this agility is our second factor, cost. The agility and convenience of cloud infrastructure come with a price premium that you pay over time, particularly for “higher level” services beyond raw compute and storage.

The third factor is control.  If you want full control over the hardware or network or security environment that your data lives in, then you will probably want to manage that on-premises.

A hybrid cloud strategy can let you take advantage of the agility of the cloud where speed is the most important factor, while optimizing for cost or for control where those are more critical.  This approach is popular for DataStax customers in the financial services sector, for instance.  They like the flexibility of cloud, but they also want to retain control over their on-premises data center environment. We have partnered with VMware on delivering the best experience for public/private cloud deployments here.

DataStax builds on Apache Cassandra technology to provide fine-grained control over data distribution in hybrid cloud deployments.  DataStax Enterprise (DSE) adds performance, security and operational management tools to help enterprises improve time-to-market and TCO.

Q3. IT departments are facing an uphill battle of managing hybrid, multi-cloud environments. Why does building scalable modern applications in the cloud remain a challenge?

Jonathan Ellis: Customers of modern, cloud-native applications expect quick response times and 100% availability, no matter where you are in the world.  This means your data layer needs the ability to scale both in a single location and across datacenters.  Relational databases and other systems built on master/slave architectures can’t deliver this combination of features.  That’s what Cassandra was created for.

Cloud vendors have started trying to tackle these market requirements, but by definition their products are single-cloud only.  DSE not only provides a data layer that can run anywhere, but it can actually run on a single cluster that spans machines on-premises and in the cloud, or across multiple public clouds.

Q4. Securing a multi-cloud strategy can be difficult due to a lack of visibility across hosts. What is your take on this?

Jonathan Ellis: Security for a multi-cloud architecture is more complex than security for a single cloud and has unique challenges. Security is required at multiple levels in the cloud and often involves compliance with regulatory standards. While security vendors are trying to solve this problem across clouds, the current tooling is limited and the feature sets vary, so the ability to have a cohesive view of the underlying IaaS across clouds is not optimal. This implies a need for IT teams to have skill sets for each cloud in their architecture, while relying on the AWS, GCP or Azure specific security, monitoring, alerting and analytics services to provide visibility. (As applications and databases move to managed Kubernetes platforms like GKE, EKS and AKS, some of the burden for host-level security shifts to the cloud providers, who manage and secure these instances at different levels.)

These challenges are not stopping companies from moving forward with a multi-cloud strategy, driven by the advantages of avoiding vendor lock in and improved efficiency from a common data layer across their infrastructure, as well as by non-technical factors such as acquisitions.

DataStax provides capabilities that enable companies to improve their security posture and help with these security challenges. At the data security level, DSE advanced security allows companies to minimize risk, achieve granular access control, and help with regulatory compliance. It does this with functionality like unified authentication, end-to-end encryption, and enhanced data auditing. We are also developing a next-generation cloud-based monitoring tool that will have a unified view across all of your Cassandra deployments in the cloud and will be able to provide visibility into the underlying instances running the cluster. Finally, DataStax managed services offerings like Apollo (see below) will also provide some relief for this problem.

Q5. You recently announced early access to the DataStax Change Data Capture (CDC) Connector for Apache Kafka®. What are the benefits of bridging Apache Kafka with Apache Cassandra?

Jonathan Ellis: Event streaming is a great approach for applications where you want to take action in real time. Apache Kafka was developed by the technology team at LinkedIn to manage streaming data and events for these scenarios.

Cassandra is the perfect fit for event streaming data because it was built for the same high ingest rates that are common for streaming platforms such as Kafka. DataStax makes it easier to bring these two technologies together so that you can do all of your real-time streaming operations in Kafka and then serve your application APIs with a highly available, globally distributed database. This defines a future-proof architecture that handles any needs that microservices and associated applications throw at it.

It’s important to recognise what Kafka does really well in streaming, and what Cassandra does well in data management. Bringing these two projects together allows you to do things that you can’t do with either by itself.

Q6. DataStax recently announced a production partnership with VMware in support of their VMware vSAN to include hybrid and multi-cloud configurations. Can you please elaborate on this?

Jonathan Ellis: We have worked with VMware for years on how to support hybrid cloud environments, and this partnership is the result. VMware and DataStax have a lot of customers in common, and for a lot of those customers, the smoothest path to cloud is to use VMware to provide a common substrate across their on-premises and cloud deployments.  Partnering with VMware allows DataStax to provide improved performance and operational experience for these enterprises.

Q7. What are your 2020 predictions and thoughts around migrating from relational to NoSQL?

Jonathan Ellis: Container and orchestration technologies have made a quantum leap in manageability for microservice architectures.  Kubernetes is the clear winner in this space.  It’s taken a little longer, but recently Kubernetes has turned a corner in its maturity and readiness to handle stateful workloads, so you’re going to see 2020 be the year of Kubernetes adoption in the database space in particular.  (Kubernetes support for DSE is available on our Labs site.)

In terms of moving from relational to NoSQL, there’s still a gap that exists in terms of awareness and understanding around how best to build and run applications that can really take advantage of what Cassandra can offer.  Our work in DataStax Academy for Cassandra training will continue in 2020, educating people on how to best make use of Cassandra and get started with their newest applications. This investment in education and skills development is essential to helping the Cassandra community develop, alongside the drivers and other contributions we make on the code side.

Q8. What is the road ahead for Apache Cassandra?

Jonathan Ellis: I was speaking to the director of applications at a French bank recently, and he said that while he thought the skill level for developers had gone up massively overall, he also thought that skills specifically around databases and data design have remained fairly static, if not down over time.  To address this skills gap, and to take advantage of cloud-based agility, we’ve created the Apollo database (now in open beta) as a cloud-native service based on Cassandra. This makes the operational complexities of managing a distributed system a complete non-problem.

Our goal is to continue supporting Cassandra as the leading platform for delivering modern applications across hybrid and multi-cloud environments.  For companies that want to run at scale, it’s the only choice that can deliver availability and performance together in the cloud.


Jonathan Ellis

Jonathan is a co-founder of DataStax. Before DataStax, Jonathan was Project Chair of Apache Cassandra for six years, where he built the Cassandra project and community into an open-source success. Previously, Jonathan built an object storage system based on Reed-Solomon encoding for data backup provider Mozy that scaled to petabytes of data and gigabits per second throughput.


– DataStax Enterprise (DSE)

– DataStax Academy

– Apollo database

Related Posts

–  The Global AI Index 2019, DEC. 17, 2019

–  Look ahead to 2020 in-memory DEC. 27, 2019

Follow us on Twitter: @odbmsorg

Follow us on: LinkedIn


Nov 25 19

On Patient-driven Innovation. Interview with Amy Tenderich

by Roberto V. Zicari

“We find ourselves in a new era of patient-driven innovation, which drives far better design and fosters collaboration between stakeholders.” — Amy Tenderich.

I have interviewed Amy Tenderich, journalist/blogger, well-known patient advocate, and founder and editor of DiabetesMine.


Q1. You are one of the leading advocates for the diabetic community. In 2007, you wrote an open letter to Steve Jobs that went viral, asking Apple to apply the same design skills to medical devices that Apple devoted to its consumer products. What happened since then?

Amy Tenderich: There has been a true Revolution in Diabetes Technology and the “consumerization” of medical devices in general… and I’m thrilled to be part of it! As I laid out in my “10 Years Later” post, the biggest milestones are:

  • Upsurge of patient involvement in innovation/design
  • Shift to data-driven disease care that increasingly prioritizes Interoperability of devices and data
  • US FDA forging a path for open, candid interaction between the regulatory agency and the patient community – which we had a hand in (very exciting!)
  • Consumer giants like Apple, Google, Microsoft, Samsung and others getting involved in healthcare, and diabetes specifically — which changes the landscape and mindset for products and services

Q2. At that time you wrote that the devices the diabetic community had to live with were “stuck in a bygone era”, created in an “engineering-driven, physician-centered bubble.”  How is the situation now?

Amy Tenderich: With the help of our prodding, medical products are now designed to be more compact, more comfortable, more aesthetic and more personalizable than ever before. In other words, they’re now keeping pace with consumer tech products.

For example, see the Tandem t:slim insulin pump and the One Drop glucose meter – which both resemble Apple products – the Quell pain relief solution, and the dynamic, fun-to-use MySugr diabetes data app.

Q3. Why is it so hard to bring the tech and pharma worlds together?

Amy Tenderich: Good question! Check out the 2012 Atlantic article titled, “The Reason Silicon Valley Hasn’t Built a Good Health App.” It basically outlines how technology companies tend to focus on the tech itself, without understanding the real-world use case.
Also, tech companies tend to develop and iterate at breakneck speed, whereas the healthcare world – especially big legacy pharma companies – is burdened by loads of regulations and has historically moved at a glacial pace.

The good thing is, these two worlds are inching closer together as:

  • Pharma companies are by necessity transforming themselves into digital organizations that deal in software and innovate more rapidly, and
  • Tech companies are “getting religion” on understanding the real-world aspects of people’s health and disease care.

Q4. Who are the key diabetes “stakeholders”?

Amy Tenderich: Patients and caregivers, first and foremost, as the people literally “living this illness.” Then of course: Pharma and Medtech companies, FDA regulators, clinicians, researchers, other healthcare providers (e.g., Certified Diabetes Educators), non-profit advocacy groups, health data platform and app developers, and healthcare designers.

Q5. Artificial Intelligence and Machine Learning (ML) are becoming widely discussed and employed in the diabetes tech world. What is your take on this?

Amy Tenderich: Indeed, AI/ML appear to be the wave of the future. All data-driven tools for diabetes care – including new Artificial Pancreas tech on the horizon – are based on these advanced computing techniques.

Q6. When using AI for diabetes: what are the main new regulatory and ethical issues that need to be faced?

Amy Tenderich: We were fortunate to have Bill Evans, Managing Director of Rock Health, present on this topic at our 2018 DiabetesMine Innovation Summit.

His slide on “Seven Threats to AI” laid out the following:

  • Over-focusing on “shiny objects” vs. the UX and business value.
  • Smart algorithms are being trained on dumb and dirty data.
  • Practitioners are building “black boxes” even they can’t understand.
  • Though they’re the key customers, most enterprise organizations don’t know where to begin.
  • Major incumbents possess—but fail to capitalize on—the most valuable commodity: Data.
  • Investors: Hype allows some companies to masquerade as “AI” companies.
  • Regulators: Regulation of AI/ML still needs to come into focus.

Evans and Rock Health have actually been instrumental in helping the US FDA decide how to approach regulation of AI and Machine Learning in Healthcare. Their work focuses on gaining consensus around “ground truth data.” You can read all about it and even weigh in here.

Q7. Which do you care more about: Accelerating medical advances or protecting data rights?

Amy Tenderich:  The hope is that these are not mutually exclusive. But if you ask people in the Diabetes Community, I believe they would almost always prioritize accelerating medical advances.

That’s because type 1 diabetes is a potentially deadly disease that requires 24/7 effort just to stay out of the hospital. Data privacy seems a small trade-off for many people to get better tools that aid in our survival and reduce the disease burden.

Q8. Many in the Diabetes Community are turning to DIY tech to create their own data-sharing tools and so-called Automated Insulin Delivery (or “closed loop”) systems.  Can you please explain what this means? Is it legal?

Amy Tenderich:  I’m proud to say that we at DiabetesMine helped launch the #WeAreNotWaiting community rallying around this DIY tech.

That is, the now-hashtag “We Are Not Waiting” was the result of a group discussion at the very first DiabetesMine D-Data ExChange technology forum in November 2013 at Stanford University. We gathered some of the early tech-savvy patient pioneers who were beginning to reverse-engineer existing products and develop their own platforms, apps and cloud-based solutions to help people with diabetes better utilize devices and health data for improved outcomes.

Today, there is a vibrant community of thousands of patients using (and iterating on) their own homemade “closed loop systems” around the world. These systems connect a continuous glucose monitor (CGM) with an insulin pump via a sophisticated algorithm that essentially automates insulin dosing. Current systems do still require some user intervention (so the loop is not completely “closed”), but they greatly improve overall glucose control and quality of life for patients.
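To make the control-loop idea concrete, here is a purely illustrative toy model. This is NOT a dosing algorithm and bears no resemblance to the safety-checked logic in real systems like Loop; the target and sensitivity constants are invented for illustration. It only shows the basic feedback shape: read a glucose value, compare it to a target, and compute a correction.

```python
# Hypothetical constants, chosen only for illustration.
TARGET_MG_DL = 110   # desired glucose level
SENSITIVITY = 50     # pretend 1 unit of insulin lowers glucose 50 mg/dL

def suggest_correction(cgm_reading_mg_dl: float) -> float:
    """Toy feedback step: units of insulin (>= 0) for a reading above target."""
    excess = cgm_reading_mg_dl - TARGET_MG_DL
    return max(0.0, excess / SENSITIVITY)

# The "loop": each CGM reading produces a (possibly zero) correction.
for reading in (100, 160, 235):
    print(reading, round(suggest_correction(reading), 2))
```

Real systems add prediction of future glucose, insulin-on-board tracking, safety limits, and user overrides on top of this basic feedback structure.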

These DIY systems have not been approved by FDA for safety and effectiveness, but they are by no means illegal. In fact, the results have been so powerful that no fewer than 6 companies are seeking FDA approval for commercial systems with the same functionality. And one popular DIY model called Loop has been taken up by an outfit called Tidepool for conversion into a commercial, FDA-scrutinized product.

Q9. Is it possible to use Social Media for real Health Impact?

Amy Tenderich:  Most certainly, yes. There is a growing body of evidence showing real-world impact on improved health outcomes. See for example this recent eVariant article that cites the benefits of patient-powered research networks, and states, “There’s no question that patients use the Internet to take control of their own health.”

See also, original research from our DiabetesMine team, published in the Journal of Diabetes Science and Technology (Nov 2018): “Findings indicate that social media provides a significant source not only of moral support and camaraderie, but also critical education on thriving with diabetes. Importantly, we observed strong evidence of peer influence on patients’ therapy and diabetes technology purchasing decisions.”

Q10. What is the FDA mHealth Pre-Certification Program, and what does it mean for Diabetes?

Amy Tenderich:  This is the FDA’s revolutionary move to change how it reviews mobile apps and digital health software to accelerate the regulatory process and get these products out there for people to start using ASAP.

The agency announced its Pre-Certification for Software Pilot Program in July 2017. Its role is to evaluate and dub certain companies as “trustworthy,” to fast-track their regulatory review process.

For the pilot, the FDA chose 9 companies out of more than 100 applicants, and notably for our Diabetes Community: seven of the nine companies have direct ties to diabetes!

See our coverage here for more details.

Qx. Anything else you wish to add?

Amy Tenderich:  We find ourselves in a new era of patient-driven innovation, which drives far better design and fosters collaboration between stakeholders. There are so many exciting examples of this – in telemedicine, at the Mayo Clinic, and at Novo Nordisk, to name just a few.


Amy Tenderich

Amy is the Founder and Editor of DiabetesMine, a leading online information destination that she launched after her diagnosis with type 1 diabetes in 2003. The site is now part of San Francisco-based Healthline Media, where Amy also serves as Editorial Director, Diabetes & Patient Advocacy.

Amy is a journalist / blogger and nationally known patient advocate who hosts her own series of thought leadership events (the annual DiabetesMine Innovation Summit and biannual DiabetesMine D-Data ExChange) that bring patient entrepreneurs together with the medical establishment to accelerate change.

She is an active advisor to the American Association of Diabetes Educators (AADE) and medtech consultant, along with a frequent speaker at policy and digital health events.

As a pioneer in the Diabetes Online Community (DOC), Amy has conducted numerous patient community research projects, and authored articles for Diabetes Spectrum, the American Journal of Managed Care and the Journal of Diabetes Science and Technology.

Amy is also the proud mom of three amazing young adult daughters. In her “free time,” she enjoys hiking, biking, leisure travel, good wine and food, and just about anything relaxing done under the California sun.



Related Posts

–  On gaining Knowledge of Diabetes using Graphs. Interview with Alexander Jarasch, ODBMS Industry Watch, February 4, 2019.

–  On using AI and Data Analytics in Pharmaceutical Research. Interview with Bryn Roberts, ODBMS Industry Watch, September 10, 2018

Follow us on Twitter: @odbmsorg


Nov 7 19

On Redis. Interview with Salvatore Sanfilippo

by Roberto V. Zicari

“I think Redis is entering a new stage where there are a number of persons that now actively daily contribute to the open source. It’s not just “mostly myself”, and that’s great.” –Salvatore Sanfilippo

I have interviewed Salvatore Sanfilippo, the original developer of Redis. Redis is an open source in-memory database that persists on disk.

Q1. What is new in the Redis 6 release?

Salvatore Sanfilippo: The main new features are ACLs, SSL, I/O threading, the new protocol called RESP3, assisted client side caching support, a ton of new modules capabilities, new cluster tools, diskless replication, and other things, a very long list indeed.

Q2. Can you tell us a bit more about the new version of the Redis protocol (RESP3). What is it, and why is it important?

Salvatore Sanfilippo: It’s just an incremental improvement over RESP2. The main goal is to make it more semantic. RESP2 can only represent aggregated data types as arrays; with RESP3 we also have sets, hashes, and so forth. This makes it simpler for client libraries to understand how to report the command reply back to the client, without needing a conversion table from the array to the library’s target language type.
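The difference is easiest to see on the wire. Below is a minimal, simplified sketch (a toy parser, not a complete RESP implementation) comparing how the same hash reply looks in RESP2, where it arrives as a flat array the client must pair up itself, and in RESP3, where the `%` map type carries the key-value semantics in the protocol:

```python
# Toy RESP parser supporting three type bytes:
#   $ = bulk string, * = array (RESP2/RESP3), % = map (RESP3 only).

def parse(data: bytes):
    item, _ = _parse(data, 0)
    return item

def _parse(data, i):
    kind = data[i:i + 1]
    end = data.index(b"\r\n", i)
    n = int(data[i + 1:end])      # length or element count
    i = end + 2
    if kind == b"$":              # bulk string of n bytes
        return data[i:i + n].decode(), i + n + 2
    if kind == b"*":              # array of n elements
        items = []
        for _ in range(n):
            item, i = _parse(data, i)
            items.append(item)
        return items, i
    if kind == b"%":              # map of n key-value pairs
        result = {}
        for _ in range(n):
            k, i = _parse(data, i)
            v, i = _parse(data, i)
            result[k] = v
        return result, i
    raise ValueError(f"unsupported type byte: {kind!r}")

resp2 = b"*4\r\n$4\r\nname\r\n$5\r\nredis\r\n$7\r\nversion\r\n$1\r\n6\r\n"
resp3 = b"%2\r\n$4\r\nname\r\n$5\r\nredis\r\n$7\r\nversion\r\n$1\r\n6\r\n"

print(parse(resp2))  # ['name', 'redis', 'version', '6'] -- flat array
print(parse(resp3))  # {'name': 'redis', 'version': '6'} -- a real map
```

With RESP2 the client library needs out-of-band knowledge that this particular array represents key-value pairs; with RESP3 the reply type says so directly.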

Q3. You have recently implemented client-side caching for Redis 6. What are the main benefits of this?

Salvatore Sanfilippo: Most big shops using Redis end up memorizing some information directly in the client. Imagine a social network that caches things in Redis, where the same post is displayed many times because it is about a very famous person. Fetching it every time from Redis is a lot of useless effort and cache traffic. So many inevitably end up creating protocols to retain very popular items directly in the memory of the front-end systems, inside the application memory space. To do that you need to handle the invalidation of the cached keys. Redis’s new client-side caching is a server-side “help” to accomplish this goal: the server is able to track what keys a given client memorized, and inform it when such keys get modified, so that the client can invalidate them.
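The tracking idea can be sketched as follows. This is a toy in-memory model, not the actual Redis implementation or wire protocol: the server remembers which client cached which key, and pushes an invalidation message when that key changes.

```python
class Server:
    """Stand-in for Redis with client-side caching tracking enabled."""

    def __init__(self):
        self.data = {}
        self.tracking = {}  # key -> set of clients that cached it

    def get(self, client, key):
        # Remember that this client now holds a copy of this key.
        self.tracking.setdefault(key, set()).add(client)
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        # Push an invalidation to every client that cached the old value.
        for client in self.tracking.pop(key, set()):
            client.invalidate(key)


class Client:
    """Application-side cache that trusts server invalidation messages."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def get(self, key):
        if key not in self.cache:  # miss: fetch and cache locally
            self.cache[key] = self.server.get(self, key)
        return self.cache[key]

    def invalidate(self, key):
        self.cache.pop(key, None)


server = Server()
server.data["user:1"] = "alice"
c = Client(server)
print(c.get("user:1"))       # alice -- fetched once, now served locally
server.set("user:1", "bob")  # write invalidates c's cached copy
print(c.get("user:1"))       # bob -- re-fetched after invalidation
```

The drawbacks mentioned in the next answer are visible even in this sketch: the server spends memory on the tracking table, and there is one more cache layer to keep coherent.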

Q4. Are there any drawbacks as well?

Salvatore Sanfilippo: Sure, more caching layers, more invalidation, more complexity. Also more memory used by Redis to track the client keys.

Q5. The “Streams” data structure was introduced in Redis 5. What is it? How does it differ from other open source streaming frameworks such as Apache Pulsar or Kafka?

Salvatore Sanfilippo: A Redis stream is basically a “log” of items, where each item is a small dictionary composed of keys and values. On top of that simple data structure, which is very very memory efficient, we do other things that are more messaging and less data structure: consume a stream via a consumer group, block for new messages, and so forth.
There are use cases that can be solved with both Redis Streams and Pulsar or Kafka, but I’m against product comparisons; it’s up to the users to understand what they need.
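The "log of small dictionaries plus consumer groups" idea can be sketched in miniature. This is a toy in-memory model only (the real Redis commands are XADD, XREADGROUP, and XACK, and real entry IDs are timestamp-based): an append-only log of field-value items, with a per-group cursor so each group only sees entries it has not yet consumed.

```python
import itertools

class Stream:
    """Toy model of a Redis stream with per-group read cursors."""

    def __init__(self):
        self.log = []        # append-only list of (entry_id, fields)
        self.cursors = {}    # group name -> index of next unread entry
        self._seq = itertools.count(1)

    def add(self, **fields):
        """Append an item (like XADD) and return its generated ID."""
        entry_id = f"{next(self._seq)}-0"
        self.log.append((entry_id, fields))
        return entry_id

    def read_group(self, group, count=10):
        """Return up to `count` unread entries for this group (like XREADGROUP)."""
        start = self.cursors.get(group, 0)
        batch = self.log[start:start + count]
        self.cursors[group] = start + len(batch)
        return batch


s = Stream()
s.add(sensor="s1", temp=20)
s.add(sensor="s2", temp=18)

print(s.read_group("workers"))  # both entries on the first read
print(s.read_group("workers"))  # [] -- nothing new for this group yet
```

Note that each item really is a small dictionary of keys and values, and that different groups keep independent positions in the same log, which is the messaging layer built on top of the data structure.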

Q6. What are you working on at present?

Salvatore Sanfilippo: I’m finalizing the Redis 6 release, adding many new module APIs, and also porting the Disque project as a Redis module.

Q7. What is your vision ahead for Redis?

Salvatore Sanfilippo: I think Redis is entering a new stage where there are a number of persons that now actively daily contribute to the open source. It’s not just “mostly myself”, and that’s great.
Redis modules are playing an interesting role, we see Redis Labs creating modules, but also from the bug reports in the Github repository, I think that there are people that are writing modules to specialize Redis for their own uses, which is great.


 Salvatore Sanfilippo


Salvatore started his career in 1997 as a security researcher, writing the hping security tool and inventing the Idle Scan. Later he worked on embedded systems, focusing on programming languages research and creating a small footprint Tcl interpreter, which is still in active development. With a colleague, Salvatore created the first two Italian social applications in partnership with Telecom Italia. After this experience, he decided to explore new ways to improve web app development, and ended up writing the first alpha version of Redis and open sourcing it. Since 2009, he has dedicated most of his time to developing Redis open source code.

Over the years, Salvatore has also created a number of other open source projects ranging from software defined radio, to line editing tools, to children development environments. He lives in Sicily, Italy.


Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker.

Related Posts

On Redis. Q&A with Yiftach Shoolman, Founder and CTO of Redis Labs.

Follow us on Twitter: @odbmsorg


Sep 30 19

On Innovation and Digital Technology. Interview with Rahmyn Kress

by Roberto V. Zicari

“Corporations can have the best technology, the best digital infrastructure, but if they cannot excite people to work with it and to see it not only as a tool, but a vehicle for innovation and development that can massively empower the greater vision of the company they are part of, technology will only reach half its potential.” –Rahmyn Kress

I have interviewed Rahmyn Kress, Chairman of the Digital Executive Committee at Henkel and founder of Henkel X, an open-innovation platform accelerating Henkel’s entrepreneurial transformation.


Q1. We are seeing a new wave of disruption through digital technology. What are the main challenges and opportunities?

Rahmyn Kress: I personally think the biggest challenge of digital disruption is not finding and implementing new technologies, but rather familiarizing employees with them so they will accept the change.
Corporations can have the best technology, the best digital infrastructure, but if they cannot excite people to work with it and to see it not only as a tool, but a vehicle for innovation and development that can massively empower the greater vision of the company they are part of, technology will only reach half its potential.
That is why it is so important to link the topic of digitization with something positive and make it come alive through dialogue and interaction.
We, at Henkel X, are doing this through various activities: Our CDO+1 lunch takes place every two weeks and gives employees the opportunity to ask questions about recent trends, disruptive technologies and Henkel X projects. We also introduced our Henkel X App, which is filled with curated content and offers an opportunity to chat with coworkers from around the world.
Furthermore, we launched our Digital BaseFit initiative to provide employees with the basic digital knowledge they need to know today. And there is also the opportunity to attend our Show & Tell events where startups pitch their ideas to Henkel – a total of 12,000 participants from various backgrounds in Henkel have dialled in or attended the events in person. All these initiatives make it much easier for us to address new technologies and issues.

Q2. You have founded ” Henkel X”. Can you please explain how it works?

Rahmyn Kress: When Marius (Swart) and I founded Henkel X in February 2018, we designed it not only to accelerate Henkel’s entrepreneurial transformation, but to provide an open innovation platform that could act as a catalyst of industrial change, drive innovations and create disruptive business models for the whole industry. In order to do that, we established and operate several impact driven programs for its members based on three pillars: The biggest value lies in our Ecosystem integrating a strong, diverse network of partners and experts sharing knowledge, views and ideas. On top of that we create Experiences, that means we organize and host events to foster collaboration and innovation, and finally we facilitate Experimentation to boost new ways of working and building minimum viable products (MVPs) fast, in an agile environment.

Q3. How do you create a curious fail fast culture in the enterprise?

Rahmyn Kress: Through the establishment of Henkel X we are trying to introduce a culture that enables employees to generate ideas, act quickly and thus achieve rapid success – or fail fast. We really try to create a vibrant field for experimentation and encourage our business units not to shy away from contact and cooperation with startups. This approach carries the risk of failure, but other paths can quickly be taken so that the teams can implement their projects successfully in the shortest possible time. As Steve Jobs once said: “Sometimes when you innovate, you make mistakes. It is best to admit them quickly, and get on with improving your other innovations.” Speed and the willingness to fail are key in order to drive digital innovation and stay competitive in the market.

Q4. Isn’t this culture dependent?

Rahmyn Kress: Yes, it totally is. And it is one of the most difficult points in the digital innovation process. In order for corporates to adapt to the new technologies we definitely need to cultivate a stronger trial and error mentality. In digital innovation, for example, Germany lags five years behind the United States – even though, among all European countries it is Germany in particular which has a huge amount of potential: outstanding tech know-how, financially powerful brands and corporations, and a large number of globally leading research institutes focused on new technologies and digitization. We should make use of these resources – and with Henkel X that’s precisely what we’re doing.

Q5. What are the main lessons you have learned while working at Accenture and Universal Music?

Rahmyn Kress: As the EVP of Physical and Digital Transformation, Supply Chain and Operations at Universal Music Group I quickly built a wealth of experience in digital transformation, just as the first wave of disruption hit the media industry. It was an early wake-up call that taught me how to pivot and adapt in an industry entering the early stages of change. Initially, digital only accounted for a small percentage of the music industry’s total revenue, but it suddenly became clear that if digital continued to prevail then manufacturing and logistics would become a commodity. I am not suggesting for one second that this is true for consumer goods, but we have so many examples of rapid change that the signs of digital transformation must be taken very seriously. This mainly affects how we handle products and the way we orient ourselves towards services instead of products. I saw this during my time at Accenture as well, where I created and essentially optimized digital supply chains and helped large corporates in their efforts to pivot their business towards the digital transformation.

Q6. What are your current projects as Chairman of the Digital Executive Committee at Henkel?

Rahmyn Kress: I see myself as a catalyst who strengthens the entrepreneurial spirit and questions existing structures and processes: To make our internal digital upskilling program as effective as possible, for example, we created a digital glossary that ensures we speak a common language. Also, my team put together the digital technology stack to help us communicate with our audience that is using the Henkel brands and products. By having a common architecture throughout the organisation we can move faster when it comes to adaptation and enhancements going forward. Most importantly, we have the opportunity to capture data that we can later extract value from – be it in supply chain optimisation or understanding emerging customer and consumer trends.
But our efforts in rolling out the digital transformation don’t stop here: As Henkel X also operates as an open innovation platform we initiated Henkel X Partners, a network forum during which we bring local entrepreneurs, industry partners, VC’s, influential family businesses, startups, and thought leaders together. As collaborating partners they form part of our ecosystem which we intend to grow and strengthen across Europe. Last month, for example, we launched Henkel X Partners in Barcelona to extend the Henkel X activities into Spain and build this regional extension. In October we are going to do the same in Milan in close cooperation with Bocconi.

Q7. You have set up a priority to accelerate digitisation in your company. What are the main stumbling blocks, since you are not Google?

Rahmyn Kress: The biggest challenge does not lie in digitisation itself, but in how we use it to change the way we work and the way we do business, and in what new business areas and models we are developing. We have to think long-term and like a visionary. This means asking ourselves, for example, “Will there be any washing powders and adhesives as we know them today at all in the future?” and “Will this still be our core business?”
In order to find the right answers and move forward in the right direction, I think in three different dimensions, which can be seen as three interconnected horizons: Horizon 1 focuses on the constant optimisation of the core business through digital technology to fund the growth of the incremental innovation. Horizon 2 is about transforming and finding new business models. Perhaps in the future we will sell washing powder like coffee capsules?
Nowadays, we are still talking about our ‘traditional products’, which may be consumed completely differently in the future. And this brings us to Horizon 3 which is about actual disruptive innovations – the so called ‘moon shots’. Here, completely new business models are conceivable. The most important thing is to engage in all three horizons at the same time. Therefore each organisation needs to decide for itself, how much it wants to invest in each of them by taking into account the challenges, opportunities and threats in the marketplace, as well as the respective digital maturity.

Q8. You are a member of the World Economic Forum for the “Platform Economy“. What are the key insights you have gained out of this activity?

Rahmyn Kress: We are moving from a product focused to a much more platform focused world. Platform business models have higher barriers to entry, but once they’re established and operating, they are very difficult to compete against. Many organizations struggle with the rate of external innovation, they feel they can’t keep up. That is why they need to start thinking more about ways to collaborate together than how to compete with each other. Business as usual is a thing of the past: It is no longer big versus small, but rather slow versus fast – economic platforms are a promising answer to this ongoing shift.

Q9. Artificial Intelligence is on the rise. Is AI part of your strategy, and if yes, can you please explain what do you expect out of AI?

Rahmyn Kress: We see AI as an essential part of our strategy. Just recently, we entered into a partnership with Cognition X to make AI adoption more efficient and to drive transformation. Henkel X will use Cognition X, a fascinating AI news and advice platform, to engage with the Henkel X community through information, and to provide a network of expertise and knowledge around the deployment of artificial intelligence. Furthermore, we will start to roll out the Enterprise Edition of CognitionX’s AI Advice Platform, to access a Wiki of AI products. AI is great and we should make use of it!

Qx. Anything else you wish to add.

Rahmyn Kress: It is definitely time that we start to consider our industrial peers as business partners instead of competitors. Of course, there are areas of rivalry, especially in relation to products. But when it comes to innovation, we should work, think, and develop together. Here we can also learn from the music industry which demonstrates how important common platforms are. Digital transformation is a joint responsibility and our goal should be to enhance future growth, build reliable ecosystems across our value chains and drive digital innovation forward. What we need are places, digital or physical, to exchange and discuss ideas to hit our targets before others do – that is exactly what Henkel X aims to achieve.


Dr. Rahmyn Kress is  Chairman of the Digital Executive Committee at Henkel and founder of Henkel X, an open-innovation platform accelerating Henkel’s entrepreneurial transformation.

Previously, he was President and CEO of DigiPlug, a tech company acquired by Accenture. Kress then joined ACCENTURE Ventures as their lead for Europe, Latin America and Africa.

Today, Kress is an active member in the venture capital and start-up community as mentor and angel investor and a member of several executive advisory boards, including the World Economic Forum Platform Economy advisory board.

Most recently, he founded an initiative that is uniting entrepreneurs, artists, business leaders, investors and strategists to create awareness and provide support around neurodiversity like ADHD and dyslexia to be recognized as unique skills in an entrepreneurial world.


Henkel X

World Economic Forum for the “Platform Economy“

Related Posts

On Digital Transformation, Big Data, Advanced Analytics, AI for the Financial Sector. Interview with Kerem Tomak, ODBMS Industry Watch, 2019-07-08

Follow us on Twitter: @odbmsorg


Sep 7 19

On Video Analytics for Smart Cities. Interview with Ganesh Ananthanarayanan

by Roberto V. Zicari

“Cameras are now everywhere. Large-scale video processing is a grand challenge representing an important frontier for analytics, what with videos from factory floors, traffic intersections, police vehicles, and retail shops. It’s the golden era for computer vision, AI, and machine learning – it’s a great time now to extract value from videos to impact science, society, and business!” — Ganesh Ananthanarayanan

I have interviewed Ganesh Ananthanarayanan. We talked about his projects at Microsoft Research.


Q1. What is your role at Microsoft Research?

Ganesh Ananthanarayanan: I am a Researcher at Microsoft Research. Microsoft Research is a research wing within Microsoft, and my role is to watch out for key technology trends and work on large scale networked-systems.

Q2. Your current research focus is to democratize video analytics. What is it?

Ganesh Ananthanarayanan:  Cameras are now everywhere. Large-scale video processing is a grand challenge representing an important frontier for analytics, what with videos from factory floors, traffic intersections, police vehicles, and retail shops. It’s the golden era for computer vision, AI, and machine learning – it’s a great time now to extract value from videos to impact science, society, and business!

Project Rocket‘s goal is to democratize video analytics: build a system for real-time, low-cost, accurate analysis of live videos. This system will work across a geo-distributed hierarchy of intelligent edges and large clouds, with the ultimate goal of making it easy and affordable for anyone with a camera stream to benefit from video analytics. Easy in the sense that any non-expert in AI should be able to use video analytics and derive value. Affordable because the latest advances in CV are still very resource intensive and expensive to use.

Q3. What are the main technical challenges of large-scale video processing?

Ganesh Ananthanarayanan: In the hotly growing “Internet of Things” domain, cameras are the most challenging of “things” in terms of data volume, (vision) processing algorithms, response latencies, and security sensitivities. They dwarf other sensors in data sizes and analytics costs, and analyzing videos will be a key workload in the IoT space. Consequently, we believe that large-scale video analytics is a grand challenge for the research community representing an important and exciting frontier for big data systems.

Unlike text or numeric processing, videos require high bandwidth (e.g., up to 5 Mbps for HD streams), need fast CPUs and GPUs, richer query semantics, and tight security guarantees. Our goal is to build and deploy a highly efficient distributed video analytics system. This will entail new research on (1) building a scalable, reliable and secure systems framework for capturing and processing video data from geographically distributed cameras; (2) efficient computer vision algorithms for detecting objects, performing analytics and issuing alerts on streaming video; and (3) efficient monitoring and management of computational and storage resources over a hybrid cloud computing infrastructure by reducing data movement, balancing loads over multiple cloud instances, and enhancing data-level parallelism.

Q4. What are the requirements posed by video analytics queries for systems such as IoT and edge computing?

Ganesh Ananthanarayanan: Live video analytics poses the following stringent requirements:

1) Latency: Applications require processing the video at very low latency because the output of the analytics is used to interact with humans (such as in augmented reality scenarios) or to actuate some other system (such as intersection traffic lights).

2) Bandwidth: High-definition video requires large bandwidth (5 Mbps or even 25 Mbps for 4K video), and streaming a large number of video feeds directly to the cloud might be infeasible. When cameras are connected wirelessly, such as inside a car, the available uplink bandwidth is very limited.

3) Provisioning: Using compute at the cameras allows for correspondingly lower provisioning (or usage) in the cloud. Also, uninteresting parts of the video can be filtered out, for example, using motion-detection techniques, thus dramatically reducing the bandwidth that needs to be provisioned.

Besides low latency and efficient bandwidth usage, another major consideration for continuous video analytics is the high compute cost of video processing. Because of the high data volumes, compute demands, and latency requirements, we believe that large-scale video analytics may well represent the killer application for edge computing.
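The bandwidth requirement above is easy to see with a back-of-the-envelope calculation (illustrative numbers only, using the per-stream rates quoted in the interview: up to 5 Mbps for an HD stream, 25 Mbps for 4K):

```python
# Back-of-the-envelope estimate of the uplink bandwidth needed to ship
# every camera feed directly to the cloud, using the per-stream rates
# quoted above (5 Mbps HD, 25 Mbps 4K).

def aggregate_uplink_mbps(num_streams: int, mbps_per_stream: float) -> float:
    """Total uplink bandwidth (Mbps) needed to stream all feeds to the cloud."""
    return num_streams * mbps_per_stream

# A modest deployment of 200 HD cameras already saturates a 1 Gbps uplink;
# the same deployment at 4K needs 5 Gbps.
hd_total = aggregate_uplink_mbps(200, 5)    # 1000 Mbps
uhd_total = aggregate_uplink_mbps(200, 25)  # 5000 Mbps
```

This is why filtering out uninteresting video at the edge (e.g., via motion detection) matters so much: it shrinks the bandwidth that must be provisioned to the cloud.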

Q5. Can you explain how Rocket allows programmers to plug-in vision algorithms while scaling across a hierarchy of intelligent edges and the cloud?

Ganesh Ananthanarayanan: Rocket is an extensible software stack for democratizing video analytics: making it easy and affordable for anyone with a camera stream to benefit from computer vision and machine learning algorithms. Rocket allows programmers to plug-in their favorite vision algorithms while scaling across a hierarchy of intelligent edges and the cloud.

The figure above shows our video analytics stack, Rocket, that supports multiple applications including traffic camera analytics for smart cities, retail store intelligence scenarios, and home assistants. The “queries” of these applications are converted into a pipeline of vision modules by the video pipeline optimizer to process live video streams. The video pipeline consists of multiple modules including the decoder, background subtractor, and deep neural network (DNN) models.

Rocket partitions the video pipeline across the edge and the cloud. For instance, it is preferable to run the heavier DNNs on the cloud where the resources are plentiful. Rocket’s edge-cloud partitioning ensures that: (i) the compute (CPU and GPU) on the edge device is not overloaded and only used for cheap filtering, and (ii) the data sent between the edge and the cloud does not overload the network link. Rocket also periodically checks the connectivity to the cloud and falls back to an “edge-only” mode when disconnected. This avoids any disruption to the video analytics but may produce outputs of lower accuracy due to relying only on lightweight models. Finally, Rocket piggybacks on the live video analytics to use its results as an index for after-the-fact interactive querying on stored videos.
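The edge-cloud split described above can be sketched in a few lines. This is an illustrative toy, not Rocket's actual API: the module names, cost fields, and placement rule are assumptions used only to show the idea that cheap filtering stays on the edge, heavy DNNs run in the cloud, and everything falls back to the edge when connectivity is lost.

```python
# Illustrative sketch (not Rocket's real interface) of edge/cloud placement:
# lightweight modules run on the edge; heavy DNNs run in the cloud when
# reachable, with an "edge-only" fallback on disconnection.
PIPELINE = [
    {"name": "decoder",       "heavy": False},
    {"name": "bg_subtractor", "heavy": False},  # cheap filtering on the edge
    {"name": "dnn_detector",  "heavy": True},   # resource-hungry DNN model
]

def place_modules(cloud_connected: bool) -> dict:
    """Return a {module_name: 'edge' | 'cloud'} placement for the pipeline."""
    placement = {}
    for mod in PIPELINE:
        if mod["heavy"] and cloud_connected:
            placement[mod["name"]] = "cloud"  # plentiful GPU resources
        else:
            placement[mod["name"]] = "edge"   # lightweight (lower-accuracy) path
    return placement
```

In edge-only mode every module lands on the edge, which keeps the analytics running at the cost of accuracy, matching the fallback behavior described above.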

More details can be found in our recent MobiSys 2019 work.

Q6. One of the verticals your project is focused on is video streams from cameras at traffic intersections. Can you please tell us more how this works in practice?

Ganesh Ananthanarayanan: As we embarked on this project, two key trends stood out: (i) cities were already equipped with a lot of cameras and had plans to deploy many more, and (ii) traffic-related fatalities were among the top-10 causes of death worldwide, which is terrible! So, in partnership with my colleague Franz Loewenherz at the City of Bellevue, we asked the question: can we use traffic video analytics to improve traffic safety, traffic efficiency, and traffic planning? We understood that most jurisdictions have little to no data on continuous trends in directional traffic volumes; accident near-misses; pedestrian, bike & multi-modal volumes, etc. Such data is usually obtained by commissioning an agency to count vehicles once or twice a year for a day.

We have built technology that analyzes traffic camera feeds 24X7 at low cost to power a dashboard of directional traffic volumes. The dashboard raises alerts on traffic congestion & conflicts. Such a capability can be vital for traffic planning (lane configuration), traffic efficiency (light durations), and safety (identifying unsafe intersections).
A key aspect is that we do our video analytics using existing cameras and consciously decided to shy away from installing our own cameras. Check out this project video on Video Analytics for Smart Cities.
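The congestion alerting the dashboard performs can be sketched as a simple threshold check against historical baselines. This is a hypothetical illustration, not the project's actual logic: the directional counts, baselines, and 1.5x factor are made-up values.

```python
# Hypothetical sketch of dashboard-style congestion alerting: flag an
# intersection approach when its vehicle count exceeds a multiple of its
# historical baseline for that direction.
def congestion_alerts(counts: dict, baseline: dict, factor: float = 1.5) -> list:
    """Return directions whose count exceeds factor x historical baseline."""
    # Directions with no baseline data never alert (inf threshold).
    return [d for d, c in counts.items()
            if c > factor * baseline.get(d, float("inf"))]

# Northbound traffic at 120 vehicles exceeds 1.5 x 60 = 90, so it alerts;
# southbound at 40 is below 1.5 x 50 = 75, so it does not.
alerts = congestion_alerts({"NB": 120, "SB": 40}, {"NB": 60, "SB": 50})
```

A real deployment would of course derive the baselines from the continuously accumulated volume data the dashboard collects.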

Q7. What are the lessons learned so far from your ongoing pilot in Bellevue (Washington) for active monitoring of traffic intersections live 24X7? Does it really help prevent traffic-related accidents? Does the use of your technology help your jurisdiction partners identify traffic details that impact traffic planning and safety?

Ganesh Ananthanarayanan: Our traffic analytics dashboard runs 24X7 and accumulates data non-stop that the officials didn't have access to before. It helps them understand instances of unexpectedly high traffic volumes in certain directions. It also generates alerts on traffic volumes to help dispatch personnel accordingly. We also used the technology for planning a bike corridor in Bellevue. The objective was to do a before/after study of the bike corridor to understand its impact on driver behavior. The City plans to use the results to decide on bike corridor designs.

Our goal is to make the roads considerably safer & more efficient with affordable video analytics. We expect that video analytics will be able to drive cities' decisions on how they manage their lights, lanes, and signs. We also believe that data on traffic volumes from a dense network of cameras will be able to power & augment routing applications for better navigation.

As more cities deploy the solution, the accuracy of the computer vision models will only improve with better training data, leading to a nice virtuous cycle.

Qx. Anything else you wish to add?

Ganesh Ananthanarayanan: So far I've described how our video analytics solution uses video cameras to continuously analyze footage and extract data. One thing I am particularly excited to make happen is to "complete the loop": that is, take the output from the video analytics and actuate it on the ground to users in real time. For instance, if we predict an unsafe interaction between a bicycle & car, send a notification to one or both of them. Pedestrian lights can be automatically activated and even extended for people with disabilities (e.g., in a wheelchair) to enable them to safely cross the road (see demo). I believe that the infrastructure will be sufficiently equipped for this kind of communication in a few years. Another example of this is warning approaching cars when they cannot spot pedestrians between parked cars on the road.
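"Completing the loop" amounts to mapping analytics predictions to on-the-ground actions. The sketch below is purely hypothetical (the event names, confidence threshold, and action strings are invented for illustration), but it captures the shape of the idea:

```python
# Hypothetical sketch of closing the loop: turn a prediction emitted by the
# video analytics into a real-time action on the ground.
def actuate(prediction: dict) -> list:
    """Map an analytics prediction to a list of actions (illustrative only)."""
    # Predicted bike/car conflict: notify the parties involved if confident.
    if prediction["event"] == "bike_car_conflict" and prediction["confidence"] > 0.8:
        return [f"notify:{actor}" for actor in prediction["actors"]]
    # Pedestrian detected waiting at the curb: trigger the crossing light.
    if prediction["event"] == "pedestrian_waiting":
        return ["activate_ped_light"]
    return []  # no action for low-confidence or unknown events
```

The hard parts in practice, of course, are the low-latency delivery path to vehicles and signals, which is the infrastructure gap the answer above anticipates closing in a few years.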

I am really excited about the prospect of the AI analytics interacting with the infrastructure and people on the ground and I believe we are well on track for it!


Ganesh Ananthanarayanan is a Researcher at Microsoft Research. His research interests are broadly in systems & networking, with recent focus on live video analytics, cloud computing & large scale data analytics systems, and Internet performance. His work on “Video Analytics for Vision Zero” on analyzing traffic camera feeds won the Institute of Transportation Engineers 2017 Achievement Award as well as the “Safer Cities, Safer People” US Department of Transportation Award. He has published over 30 papers in systems & networking conferences such as USENIX OSDI, ACM SIGCOMM and USENIX NSDI. He has collaborated with and shipped technology to Microsoft’s cloud and online products like the Azure Cloud, Cosmos (Microsoft’s big data system) and Skype. He is a member of the ACM Future of Computing Academy. Prior to joining Microsoft Research, he completed his Ph.D. at UC Berkeley in Dec 2013, where he was also a recipient of the UC Berkeley Regents Fellowship. For more details:



Related Posts

– On Amundsen. Q&A with Li Gao tech lead at Lyft, Expert Article, JUL 30, 2019

–  On IoT, edge computing and Couchbase Mobile. Q&A with Priya Rajagopal, Expert article, JUL 25, 2019


Aug 22 19

On MariaDB. Interview with Michael Widenius

by Roberto V. Zicari

“The best possible database migration is when you are able to move all your data and stored procedures unchanged(!) to the new system.” — Michael Widenius.

I have interviewed Michael "Monty" Widenius, Chief Technology Officer at MariaDB Corporation.

Monty is the “spiritual father” of MariaDB, a renowned advocate for the open source software movement and one of the original developers of MySQL.


Q1. What is adaptive scaling and why is it important for a database?

Michael Widenius: Adaptive scaling is provided to automatically change behavior in order to use available resources as efficiently as possible, as demand grows or shrinks. For a database, it means the ability to dynamically configure resources, adding or deleting data nodes and processing nodes according to demand. This provides both scale up and scale out in an easy manner.

Many databases can do part of this manually, a few can do this semi-automatically. When it comes to read scaling with replication, there are a few solutions, like Oracle RAC, but there are very few relational database systems that can handle true write scaling while preserving true ACID properties. This is a critical need for any company that wants to compete in the data space. That’s one of the reasons why MariaDB acquired ClustrixDB last year.

Q2. Technically speaking, how is it possible to adjust scaling so that you can run the database in the background in a desktop with very few resources, and up to a multi node cluster with petabytes of data with read and write scaling?

Michael Widenius: Traditionally databases are optimized for one particular setup. It’s very hard to be able to run efficiently both with a very small footprint, which is what desktop users are expecting, and yet provide extreme scale out.

The reason we can do that in MariaDB Platform is thanks to the unique separation between the query processing and data storage layers (storage engines). One can start by using a storage engine that requires a relatively small footprint (Aria or InnoDB) and, when demands grow, with a few commands move all or just part of the data to distributed storage with MariaDB ColumnStore, Spider, MyRocks or, in the future, ClustrixDB. One can also very easily move to a replication setup where you have one master for all writes and any number of read replicas. MariaDB Cluster can be used to provide a fully functional master-master network that can be replicated to remote data centers.

My belief is that MariaDB is the most advanced database in existence, when it comes to providing complex replication setups and very different ways to access and store data (providing OLTP, OLAP and hybrid OLTP/OLAP functionalities) while still providing one consistent interface to the end user.

Q3. How do you plan to use ClustrixDB distributed database technology for MariaDB?

Michael Widenius: We will add this as another storage engine for the user to choose from. What it means is that if one wants to switch a table called t1 from InnoDB to ClustrixDB, the only command the user needs to run is:

ALTER TABLE t1 ENGINE=ClustrixDB;
The interesting thing with ClustrixDB is not only that it’s distributed and can automatically scale up and down based on demands, but also that a table on ClustrixDB can be accessed by different MariaDB servers. If you create a ClustrixDB table on one MariaDB server, it’s at once visible to all other MariaDB servers that are attached to the same cluster.

Q3. Why is having Oracle compatibility in MariaDB a game changer for the database industry?

Michael Widenius: MariaDB Platform is the only enterprise open source database that supports a significant set of Oracle syntax. This makes it possible for the first time to easily move Oracle applications to an open source solution, get rid of single-vendor lock-in and leverage existing skill sets. MariaDB Corporation is also the best place to get migration help as well as enterprise features, consultative support and maintenance.

Q4. How does MariaDB manage to parse, depending on the case, approximately 80 percent of legacy Oracle PL/SQL without rewriting the code?

Michael Widenius: Oracle PL/SQL was originally based on the same standard from which SQL was created; however, Oracle decided to use different syntax than what's used in ANSI SQL. Fortunately, most of the logical language constructs are the same. This made it possible to provide a mapping from most of the PL/SQL constructs to ANSI.

What we did:

– Created a new parser, sql_yacc_ora.yy, which understands the PL/SQL constructs and maps the PL/SQL syntax to existing MariaDB internal structures.

– Added support for SQL_MODE=ORACLE mode, to allow the user to switch which parser to use. The mode is stored as part of SQL procedures to allow users to run a stored procedure without having to know if it’s written in ANSI SQL or PL/SQL.

– Extended MariaDB with new Oracle compatibility that we didn’t have before such as SEQUENCES, PACKAGES, ROW TYPE etc.

You can read all about the Oracle compatibility functionality that MariaDB supports here.

Q5. When embarking on a database migration, what are the best practices and technical solutions you recommend?

Michael Widenius: The best possible database migration is when you are able to move all your data and stored procedures unchanged(!) to the new system.

That is our goal when we are supporting a migration from Oracle to MariaDB. This usually means that we are working closely with the customer to analyze the difficulty of the migration and determine a migration plan. It also helps that MariaDB supports MariaDB SQL/PL, a compatible subset of Oracle PL/SQL language.

If MariaDB is fairly new to you, then it's best to start with something small that only uses a few stored procedures to give DBAs a chance to get to know MariaDB better. When you've succeeded in moving a couple of smaller installations, then it's time to start with the larger ones. Our expert migration team is standing by to assist you in any way possible.

Q6. Why did you combine your transactional and analytical databases into a single platform, MariaDB Platform X3?

Michael Widenius: Thanks to the storage engine interface, it's easy for MariaDB to provide both transactional and analytical storage with one interface. Today it's neither efficient nor desirable to have to move between databases just because your data needs grow. MariaDB can also provide the unique capability of using different storage engines on master and replicas. This allows you to have your master optimized for inserts while some of your replicas are optimized for analytical queries.

Q7. You also launched a managed service supporting public and hybrid cloud deployments. What are the benefits of such service to enterprises?

Michael Widenius: Some enterprises find it hard to find the right DBAs (these are still a scarce resource) and would rather focus on their core business instead of managing their databases. The managed service is there so these enterprises don't have to think about how to keep the database servers up and running. Maintenance, upgrading and optimization of the database will instead be done by people who are the definitive experts in this area.

Q8. What are the limitations of existing public cloud service offerings in helping companies succeed across their diverse cloud and on-prem environments?

Michael Widenius: Most of the existing cloud services for databases only ensure that the "database is up and running". They don't provide database maintenance, upgrading, optimization, consultative support or disaster management. More importantly, you're only getting a watered down version of MariaDB in the cloud rather than the full featured version you get with MariaDB Platform. If you encounter performance problems, serious bugs, crashes or data loss, you are on your own. You also don't have anyone to talk with if you need new features for your database that your business requires.

Q9. How does MariaDB Platform Managed Service differ from existing cloud offering such as Amazon RDS and Aurora?

Michael Widenius: In our benchmarks that we shared at our MariaDB OpenWorks conference earlier this year, we showed that MariaDB’s Managed Service offering beats Amazon RDS and Aurora when it comes to performance. Our managed service also unlocks capabilities such as columnar storage, data masking, database firewall and many more features that you can’t get in Amazon’s services. See the full list here for a comparison.

Q10. What are the main advantages of using a mix of cloud and on-prem?

Michael Widenius: There are many reasons why a company will use a mix of cloud and on-prem. Cloud is where all the growth is and many new applications will likely go to the cloud. At the same time, this will take time and we’ll see many applications stay on prem for a while. Companies may decide to keep applications on prem for compliance and regulatory reasons as well. In general, it’s not good for any company to have a vendor that totally locks them into one solution. By ensuring you can run the exact same database on both on-prem and cloud, including ensuring that you have all your data in both places, you can be sure your company will not have a single point of failure.

Michael “Monty” Widenius,  Chief Technology Officer, MariaDB.

Monty is the “spiritual father” of MariaDB, a renowned advocate for the open source software movement and one of the original developers of MySQL, the predecessor to MariaDB. In addition to serving as CTO for the MariaDB Corporation, he also serves as a board member of the MariaDB Foundation. He was a founder at SkySQL, and the CTO of MySQL AB until its sale to Sun Microsystems (now Oracle). Monty was also the founder of TCX DataKonsult AB, a Swedish data warehousing company. He is the co-author of the MySQL Reference Manual and was awarded in 2003 the Finnish Software Entrepreneur of the Year prize. In 2015, Monty was selected as one of the 100 most influential persons in the Finnish IT market. Monty studied at Helsinki University of Technology and lives in Finland.



MariaDB’s “Restful Nights” Release Brings Peace of Mind to Enterprise Customers

Related Posts

On MariaDB Kubernetes Operator. Q&A with Saravana Krishnamurthy.,  June 21, 2019

On the Database Market. Interview with Merv Adrian, ODBMS Industry Watch, April 23, 2019

Follow us on Twitter: @odbmsorg