Comments for ODBMS Industry Watch: Trends and Information on AI, Big Data, New Data Management Technologies, Data Science and Innovation.

Comment on Thirty Years C++. Interview with Bjarne Stroustrup, by Tom Welsh (Thu, 30 Jul 2020 10:08:53 +0000)

An excellent interview with one of the giants of software. Although he has been interviewed hundreds of times before (at least), this is full of fresh and interesting insights.

Congratulations and thanks!

Comment on On Databases and Non-Volatile Memory technologies. Interview with Joy Arulraj and Andrew Pavlo, by Joy Arulraj (Tue, 18 Jun 2019 08:07:11 +0000)

Steven, thanks for sharing the architecture of eXtremeDB!

The write-behind logging protocol is indeed inspired by several prior research efforts in both memory-centric and disk-centric database systems. We recently learned that a paper advocating a no-redo/undo recovery protocol was published in SIGMOD '75 [1]. The author states that they were developing a database system for the Puerto Rican DOT and that this design decision was geared towards handling the frequent power outages during the summer. Write-behind logging differs from prior no-redo/undo recovery protocols in that it is tightly integrated with the multi-version concurrency control protocol.

[1] Robert L. Rappaport, "File structure design to facilitate on-line instantaneous updating," SIGMOD 1975.

Comment on On Databases and Non-Volatile Memory technologies. Interview with Joy Arulraj and Andrew Pavlo, by Steven Graves (Fri, 14 Jun 2019 00:33:10 +0000)

It is unfortunate that the authors did not include eXtremeDB in their research. Had they done so, they would know that eXtremeDB has had these capabilities since 2006, when we began supporting Curtiss-Wright boards with battery-backed RAM. This was proven again in 2013, with published papers demonstrating the superior performance and durability of NVDIMMs from Agigatech and Micron (which replace the battery with a supercapacitor) versus PCIe SSDs, and again in 2017 with testing of eXtremeDB and Optane.

While the terminology is different, the approach described in this article as 'write-behind logging' is similar to eXtremeDB's, in which data is updated in place and only the information needed to undo the transaction, in case of explicit rollback or an application/system crash, is kept. Once the transaction is committed, the undo information is no longer needed and is overwritten by the next transaction. The philosophy of eXtremeDB has always been that commits are the norm and should be fast, while aborts are exceptional and can be slightly more time-consuming (but still blazingly fast).
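The update-in-place, undo-only scheme described above can be sketched in a few lines of Python. This is purely an illustration of the general technique (not eXtremeDB's actual implementation, and all names here are hypothetical): writes go straight to the live store, only prior values are retained, commit simply discards the undo records, and abort replays them in reverse.

```python
class UndoLogTransaction:
    """Toy sketch of update-in-place with an undo-only log.

    Assumes a plain dict as the 'database'; a stored value of None
    stands for 'key did not exist before this transaction'.
    """

    def __init__(self, db):
        self.db = db      # live data store, updated in place
        self.undo = []    # (key, old_value) pairs kept only until commit

    def write(self, key, value):
        # Record the prior value before overwriting in place.
        self.undo.append((key, self.db.get(key)))
        self.db[key] = value

    def commit(self):
        # Commit is cheap: the undo records are simply discarded
        # (in the comment's terms, overwritten by the next transaction).
        self.undo.clear()

    def rollback(self):
        # Abort is the exceptional, slightly costlier path:
        # replay the undo records in reverse order.
        for key, old in reversed(self.undo):
            if old is None:
                self.db.pop(key, None)
            else:
                self.db[key] = old
        self.undo.clear()
```

For example, writing two keys and rolling back restores the store exactly, while a committed transaction leaves nothing to replay, which is why the commit path stays fast.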

As in the authors' design, the index structures used in eXtremeDB are modified for the in-memory use case to be faster and smaller. (eXtremeDB was originally conceived, designed and implemented as an in-memory database in 2001.)

The authors are absolutely correct that a DBMS must be designed for this; not just any ol' in-memory database can get maximum leverage from persistent memory. That said, from the DBMS's perspective there is no difference between 2006's battery-backed RAM, 2013's NVDIMMs, and modern Optane or other persistent memory technologies.

Steven Graves, CEO, McObject

Comment on On European Data Protection. Interview with Giovanni Buttarelli, by Jonathan (Sat, 23 Mar 2019 03:15:31 +0000)

A very interesting read, and indeed a big topic given the rapid growth of AI/IoT over the past few years.

I would love to hear how we can create an environment in which startups (or lean corporate efforts), which are at the forefront of this new AI/IoT space, can navigate this complex regulatory landscape in a lean way. I feel many startups focus heavily on the "technological how" and often lack the know-how and resources for the regulatory side, which can hold them back from focusing on what they do best: innovating.

Comment on On European Data Protection. Interview with Giovanni Buttarelli, by Chinmayee (Mon, 18 Mar 2019 09:30:17 +0000)

A great article; Mr. Giovanni Buttarelli has rightly pointed out some of the biggest challenges that lie ahead of us as a digital society.

The pressing questions raised in the article, such as "How do we control bias in AI systems? How do we keep control of AI developments and algorithms? Who decides what is right or wrong?", have to be deliberated upon.

One of the critical challenges lies in understanding how an algorithm comes to a decision. Given the complexity of today's algorithms, how can we even solve this issue? We have the recent Ethiopian Airlines and Lion Air crashes as examples of why it is important to understand the "black box" and its decisions. As humans, we can at least justify our decisions and reflect on what was right and what was wrong. How can we achieve this with AI if we don't really understand how it arrived at a conclusion?

I feel that ‘Ethics by design’ might be the first step towards mitigating some of the risks associated with the data-driven world. It would be great to see more articles on this topic!

Comment on On gaining Knowledge of Diabetes using Graphs. Interview with Alexander Jarasch, by David McComb (Fri, 15 Feb 2019 20:20:02 +0000)

Yes, very right on. I think graphs are the way to go with complex disease states.

However, I think a bit of pruning needs to be done. I have done some research on diabetes myself. I am (was) pre-diabetic, with an A1c of 5.8; I am now at 5.3 without medication.

I would prune out animal studies on (mostly) herbivores (rabbits and mice), as the key hormonal pathways are different enough that the conclusions are going to be more of a distraction than useful.

We know with as much certainty as one gets in medicine that diabetes is based on a malfunction in insulin-related metabolism: its complete absence in type 1, and variations of insulin resistance in type 2. Insulin converts glucose to fat and at the same time inhibits fat metabolism. The Western diet gives us a glucose hit every few hours (from sugar, carbohydrates, snacks, etc.), and a healthy body responds by producing insulin. The insulin converts the glucose to fat. (That is the relationship between obesity and diabetes, by the way: it isn't that obesity causes diabetes; they are both caused at the same time by the same mechanism, which is why they co-occur.)

Over time, adipose fat cells become overstuffed and further deposition gets harder and harder; this is the beginning of insulin resistance. The other path is fat deposition in the liver and especially the pancreas. Large deposits in the pancreas inhibit the islets of Langerhans, and insulin production drops. Both paths cause glucose levels to remain elevated, which is classic type 2 diabetes.

The whole cycle is reversible by eliminating sugar and carbohydrates from one's diet.

We should start with what we know and use the graph databases to drill down into the deeper details, instead of getting lost trying to re-invent what is already known.

I know it's overstated, but let's not confuse correlation with causation, especially when we know some of the causation. Let's look for correlation to find the next level of causation.

(or use the graph database to see if what I just said is crap)

Comment on Big Data and AI– Ethical and Societal implications, by Linda Fisher Thornton (Wed, 06 Feb 2019 15:15:14 +0000)

That is the key question I'm in the process of exploring. I am working on a paper taking a multidimensional look at the ethics of the IoT, with links to resources, and plan to share it in the coming months. It includes aspects of AI.

Comment on Big Data and AI– Ethical and Societal implications, by Francesco (Mon, 04 Feb 2019 03:04:20 +0000)

Roberto,

Fantastic presentation, thanks for sharing it.

My most immediate comment is that ethics, especially in a deep-tech context, means several different things, so we could proceed by breaking the big problem down into more "manageable issues" and try to propose solutions to narrow problems first. In detail, we can potentially divide it into:

– biases;
– accountability;
– trust;
– usage;
– control;
– safety.

Just for reference, I explore this in more detail here:

That said, I also believe we should start approaching ethics as a technical problem rather than a merely qualitative one. It comes with costs (an implementation cost, an accuracy/performance cost, etc.) and with several benefits, and I am sure that bringing some quantitative measures to the table will help policymakers act more promptly.

Finally, I would love research on ethics to focus more on two aspects which are very relevant to me and on which I couldn't find much work:

– Algorithmic Aversion: humans often still decide they don't want to actively listen to something they know to be better;
– what I call Paradigm 37-78: by this I mean the degree to which we affect machines' values/ethics, and at the same time how much they impact us in turn.

Comment on Big Data and AI– Ethical and Societal implications, by Roberto V. Zicari (Mon, 28 Jan 2019 09:39:37 +0000)

Back in 2007, in a talk I gave at Google HQ, I introduced the concept of an "Ethical Social Code", to avoid that:

– Stickiness equals sickness,
– Repeated becomes addictive.

Link to video (43:36):

#Ethics #Data #UseProfile #MachineLearning #ML #AI #Citizens

Comment on Big Data and AI– Ethical and Societal implications, by Roberto V. Zicari (Mon, 28 Jan 2019 09:38:39 +0000)

Thank you Patrice, very useful!