ODBMS Industry Watch

Mar 29 24

On AI Factory and Generative AI. Interview with Ashok Reddy.

by Roberto V. Zicari

“The AI Factory is a revolutionary concept aimed at streamlining the creation and deployment of AI models within organizations. It serves as a centralized hub where data scientists, engineers, and domain experts collaborate to develop, train, and deploy AI solutions at scale.”

Q1. You joined KX as Chief Executive Officer in August 2022 (*). How has the database and the AI market developed since then?

Ashok Reddy: Since taking on the role of Chief Executive Officer at KX in August 2022, I have observed significant transformations within the database and AI markets, underscored by burgeoning investments and innovations across several key segments:

Non-relational Databases (NRDBMSs): These databases have experienced notable growth, driven by the demand for flexible, scalable data management systems that accommodate the complex needs of modern, data-intensive applications.

Analytics and Business Intelligence Platforms: This segment has continued to expand rapidly, fueled by the increasing need for sophisticated analytical tools that can provide deeper insights into vast datasets, enabling more informed decision-making processes.

Data Science and AI Platforms: The emergence and integration of advanced data science and Generative AI (GenAI) technologies have propelled this sector forward, with organizations seeking powerful platforms that can drive AI-driven innovation and operational efficiency.

These industry segments have showcased annual growth rates ranging from 20 to 25%, highlighting a substantial shift towards technologies that are not only agile and scalable but also capable of underpinning advanced analytics and AI-driven applications.

Q2. It seems that securing a tangible return on investment from artificial intelligence (AI) is still a challenge. Do you agree? How can you ensure your AI has an ROI?

Ashok Reddy: Yes, achieving a tangible return on investment from artificial intelligence (AI) poses a challenge, but it’s not unattainable. AI, fundamentally a prediction technology as highlighted in “Power and Prediction” by Ajay Agrawal, Avi Goldfarb, and Joshua Gans, and “Competing in the Age of AI” by Karim R. Lakhani and Marco Iansiti, offers organizations the ability to make informed predictions based on extensive datasets. This capability is key to gaining a competitive advantage.

To ensure AI initiatives yield a tangible ROI, a strategic focus on leveraging prediction technology is crucial. This involves minimizing prediction costs, which can be achieved by reducing the incidence and impact of prediction errors and continually refining AI models to enhance their accuracy. By strategically lowering these costs, companies can boost operational efficiency, foster innovation, and elevate revenues.

Moreover, viewing AI’s predictive prowess as a strategic asset allows for the alignment of AI endeavors with specific business goals. Whether the aim is to streamline operational processes, improve customer experiences, or explore new market opportunities, the predictive capacity of AI can be effectively utilized to achieve concrete business results.

Q3. What is the AI Factory concept, and what is its significance?

Ashok Reddy: The AI Factory is a revolutionary concept aimed at streamlining the creation and deployment of AI models within organizations. It serves as a centralized hub where data scientists, engineers, and domain experts collaborate to develop, train, and deploy AI solutions at scale. The AI Factory is essential for organizations looking to leverage AI effectively across various business functions, enabling them to drive innovation, enhance productivity, and gain a competitive edge.

Q4. How does an AI Factory facilitate the utilization of a single GenAI model across multiple functions and tasks?

Ashok Reddy: An AI Factory empowers organizations to leverage a single GenAI model for a multitude of functions and tasks through systematic processes and automation. By centralizing model development and deployment, the AI Factory ensures consistency, scalability, and reusability of AI solutions. This enables organizations to streamline workflows, optimize resource allocation, and extract maximum value from their AI investments.

Q5. What are your tips for scaling AI in practice?

Ashok Reddy: Adopting an AI Factory approach will enable businesses to scale AI. In practice, this means changing how we think about the typical tasks our workforce is asked to complete and how we apply [AI] technology to support not only the completion of that task but the entire job motion itself. For example, if we can support paralegals to do the document review and preparation that a lawyer typically handles, we’re enabling that lawyer to dedicate more of their time to complex cases thereby revolutionizing the job processes for both sides.

By rethinking how we scale AI within the human workforce, we can empower professionals to pursue roles of greater strategic value, reduce time to market for AI solutions, and ensure consistency and quality across AI initiatives.

Q6. Let’s now look at GenAI. Several specialized companies offer so-called GenAI “foundation models” – deep learning algorithms pre-trained on massive amounts of public data. How can enterprises take advantage of such foundation models?

Ashok Reddy: Foundation models provide a massive head start for enterprises, but their value lies in adaptation and thoughtful integration. Some examples of adoption of these models for organizations to consider are:

Reducing Development Time and Costs: Instead of building complex AI models from scratch, enterprises can fine-tune foundation models for specific tasks, saving significant resources.
Unlocking New Applications: The flexibility of foundation models (for text, image generation, etc.) enables creative applications, potentially disrupting existing workflows.
Democratizing AI Access: Smaller businesses and those without in-house AI expertise can leverage pre-trained models via APIs or user-friendly platforms.
The RAG Connection for Leveraging Private Data: Foundation models are the perfect ‘retrieval’ component of the retrieval-augmented generation (RAG) pipeline. Accessing knowledge at scale becomes easier.

However, to effectively deploy these strategies, enterprise organizations should consider putting a few guardrails in place, such as:

Domain-Specific Fine-Tuning and Data: Introduce your own high-quality, domain-specific datasets to combat potential irrelevance or bias in the pre-trained model.
Constraints and Sanity Checks: Embed rules derived from industry knowledge and common sense into your fine-tuned model to avoid unrealistic or undesirable outputs.
Continuous Evaluation: Don’t treat a foundation model as static. Test regularly against real-world scenarios and adjust as needed.
Human-AI Collaboration: Emphasize explainability, critical evaluation of outputs, and maintaining human oversight. This is vital for trust and responsible deployment.

Q7. Three issues are very sensitive when talking about GenAI: quality and relevance of public data, ethics, and accountability. What is your take on this?

Ashok Reddy: I think these issues are important to consider as we, as a collective industry, work to create expanded capabilities in the area of GenAI.

When it comes to the quality and relevance of public data, the two most important considerations are bias amplification and domain mismatch. First, public datasets often contain biases. Enterprises need rigorous pre-processing and bias mitigation techniques to avoid harmful outputs. This should include work to identify and mitigate biases within the models during development and testing to make sure we’re approaching GenAI creation with a sense of fairness and inclusivity.

With regard to domain mismatch, it’s important to know that data used to train the foundation model may not align with your specific enterprise needs. Fine-tuning and supplementing with your own data is essential.

On the topic of ethics, we’re seeing instances of misinformation and deepfakes in our everyday consumption of news and social media. Organizations should ensure foundation models are not misused for malicious purposes by implementing safeguards and internal policies to guide responsible use. As a society, we need to consume with a discerning eye and make sure we hold creators accountable.

In the area of accountability, I believe that transparency, governance, and continuous monitoring of models are all critical issues to be addressed.

Where possible, aim for some level of transparency in your foundation model’s decisions. This builds trust and helps debug potential issues. Establish clear ownership and protocols for using GenAI within your enterprise, defining acceptable use cases and limitations.

And, as foundation models evolve, so too must your risk assessment and ethical considerations. Therefore, establishing a method for continuous monitoring is key.

To help accomplish oversight in these areas, it’s important that we adopt grounding strategies.

The first way to do this is to use your knowledge base as an anchor, supplementing public data with your curated knowledge base of reliable sources, financial reports, or industry-specific data. This helps the model gain a better understanding of your domain’s facts and principles.

Secondly, you should use RAG for verification. If your foundation model generates text, use a RAG model to cross-check its outputs against trusted knowledge sources, reducing the chance of spreading misinformation.

And finally, make sure to include explainability as a requirement. Prioritize techniques that give insight into why the model generates certain outputs. This helps spot issues and maintain a strong link to reality.
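
The cross-check described above can be sketched in a few lines. This is a minimal illustration, not how any particular RAG product works: the trusted sources, the token-overlap scoring, and the 0.5 threshold are all assumptions made for the example, and a production system would more likely retrieve with vector embeddings and use a model to judge whether the evidence supports the claim.

```python
import re

# Hypothetical curated knowledge base; in practice this would be a document store.
TRUSTED_SOURCES = [
    "Ashok Reddy joined KX as Chief Executive Officer in August 2022.",
    "Retrieval-augmented generation pairs a retriever with a generative model.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(claim, sources, top_k=1):
    """Rank trusted sources by Jaccard token overlap with the claim."""
    claim_tokens = tokenize(claim)
    scored = []
    for source in sources:
        source_tokens = tokenize(source)
        union = claim_tokens | source_tokens
        score = len(claim_tokens & source_tokens) / len(union) if union else 0.0
        scored.append((score, source))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def check_claim(claim, sources, threshold=0.5):
    """Mark a generated claim as supported only if a trusted source overlaps enough.

    The threshold is an arbitrary illustrative value, not a recommended setting.
    """
    best_score, best_source = retrieve(claim, sources)[0]
    return {
        "claim": claim,
        "supported": best_score >= threshold,
        "evidence": best_source,
        "score": round(best_score, 2),
    }


supported = check_claim(
    "Ashok Reddy became Chief Executive Officer of KX in August 2022.",
    TRUSTED_SOURCES,
)
unsupported = check_claim("KX was founded on the moon.", TRUSTED_SOURCES)
```

Claims that come back unsupported would be routed to a human reviewer rather than published, which keeps the human oversight mentioned earlier in the loop.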

Q8. What staffing approach is required for operating an AI Factory? Is it about hiring new specialists or upskilling existing business and tech roles, or both?

Ashok Reddy: Operating a GenAI Factory requires a strategic approach to staffing that combines both hiring new specialists and upskilling existing talent. While recruiting individuals with expertise in AI, machine learning, and data science is crucial for driving innovation, it’s equally important to invest in upskilling existing business and tech roles. By fostering a culture of continuous learning and development, organizations can ensure that their workforce remains adaptable and proficient in leveraging advanced AI technologies effectively.

Q9. Harvard Business Review (**) reported that “Gartner research has identified five forces that will keep the pressure on executives to keep learning, testing, and investing in the GenAI technology: 1) Board and CEO expectations; 2) Customer expectations; 3) Employee expectations; 4) Regulatory expectations; and 5) Investor expectations.” What does it mean in practice?

Ashok Reddy: Boards and CEOs are looking to implement solutions that have clear data pipelines, experimentation processes and a focus on translating AI insights into actionable decisions, all attributes of a well-structured AI Factory for continuous innovation.

Customers want AI-enhanced experiences. Organizations can work to deploy AI from the Factory for personalized recommendations, intelligent chatbots, and streamlined processes. Constant evaluation and refinement are the keys to success.

When it comes to employees, using AI as a Co-pilot or Assistant can help upskill employees to collaborate with the AI tools developed in their AI Factory. This reduces mundane tasks and fosters a sense of empowerment.

When it comes to regulatory expectations, you can look to explainability and bias mitigation and work to design an AI Factory with transparency in mind. Incorporate tools and processes to explain AI outputs and proactively address potential biases in datasets and models.

Investors are going to look for an AI Factory to produce ROI. For this, organizations can demonstrate a clear link between their AI Factory investments and tangible business outcomes (cost savings, revenue growth, risk reduction), while also working to underscore their commitment to ethical and responsible AI use.

An AI Factory approach isn’t a magic solution but a strategic framework. Executives who develop a structured plan for building their AI Factory, one that is responsive to these five forces, will gain trust and secure sustained investment for AI initiatives.

Q10. How has KX developed since August 2022?

Ashok Reddy: Since August 2022, KX has been at the forefront of driving accelerated computing for data and AI-driven analytics, catering specifically to AI-first enterprises. Our focus on strategic partnerships has allowed us to deliver cutting-edge solutions tailored to meet the evolving needs of our customers.

Furthermore, our commitment to innovation and customer-centricity has solidified our position as a trusted partner in the AI-driven analytics space. By aligning our efforts with market demands, we continue to lead the way in delivering transformative solutions that drive success for our customers.

……………………………………….

Ashok Reddy, CEO, KX

One of the leading voices in vector databases, search & temporal LLMs, Ashok joined KX as Chief Executive Officer in August 2022. He has more than 20 years of experience leading teams and driving revenue for Fortune 500 and private equity-backed technology companies. He spent ten years at IBM as Group General Manager where he led the end-to-end delivery of enterprise products and platforms for a diverse portfolio of global customers. In addition, he has held leadership roles at CA Technologies and Broadcom, and worked as a special adviser to digital transformation company Digital.AI where he helped the senior leadership team devise the product and platform vision and strategy.

Resources:

(**) 5 Forces That Will Drive the Adoption of GenAI. Harvard Business Review, Dec 14, 2023.

On Generative AI. Interview with Maharaj Mukherjee. ODBMS Industry Watch, December 10, 2023.

On Generative AI. Interview with Philippe Kahn. ODBMS Industry Watch, June 19, 2023.

…………………

Follow us on X.

Follow us on LinkedIn.

Mar 28 24

On Digital Ethics. Interview with Jean Enno Charton.

by Roberto V. Zicari

“Digital ethics must be context specific. To bridge the operationalization gap, we must consider that digital ethics cannot be a one-size-fits-all approach.”

Q1. What are your responsibilities and current projects as Director Bioethics & Digital Ethics at Merck KGaA?

Jean Enno Charton: As the Director of Bioethics and Digital Ethics at Merck KGaA, I’m responsible for addressing ethical considerations and questions that arise from rapid advancements in science and technology, especially in areas of legal ambiguity and moral complexity. Merck, as a leading science and technology company, is perpetually at the forefront of pioneering new projects that redefine our offerings and customer interactions. Consequently, my responsibilities and projects are dynamic, evolving in line with Merck’s growth and the ever-changing scientific landscape. Over recent years I have built up a team of specialists that works as a sort of ‘in-house consultancy’ for ethics.

For example, we have to carefully evaluate which organizations we supply with our life science products, such as genome editing tools, to prevent potential misuse or abuse for unethical purposes. Other recent projects that I am involved in are related to stem cell research, human embryo models, fertility technologies, clinical research, donation programs, and others. My work also includes ensuring data and algorithms at Merck are used in ethically favourable ways that align with our values and principles. Here, we are frequently delving into ethical considerations surrounding AI deployment and data utilization in research and development, data analytics, human resource management and other areas.

In addition, we coordinate with two advisory panels that provide independent external guidance on how to use the scientific and digital innovations developed by Merck KGaA in a responsible manner. These are the Merck Ethics Advisory Panel for Science and Technology (MEAP) and the Digital Ethics Advisory Panel (DEAP). My team and I select the topics and experts for these panels and disseminate their recommendations.

Q2. Recent attempts to develop and apply digital ethics principles to address the challenges of the digital transformation leave organizations with an operationalisation gap. Why?

Jean Enno Charton: The operationalisation gap in digital ethics arises mainly due to the necessarily high level of abstraction of ethical principle frameworks vis-à-vis the granularity needed to answer day-to-day challenges that arise from data and AI use.

Ethical principles are typically formulated at a high level of abstraction. While these principles serve as essential compass points, they lack the specificity needed for practical application. Imagine a map with coordinates indicating the general direction but failing to provide street-level details. Similarly, high level ethical principles guide organizations but fall short when it comes to navigating the intricacies of real-world scenarios.

Moreover, digital ethics must be context specific. To bridge the operationalization gap, we must consider that digital ethics cannot be a one-size-fits-all approach. Each organizational context – whether it’s healthcare, finance, or human resources – presents unique challenges. In some instances, you may need geopolitical maps and in others you may want geographical maps.

Q3. What are the main challenges in translating high-level ethics frameworks into practical methods and tools that match the organization specific workflows and needs?

Jean Enno Charton: The primary challenge lies in the intellectual capacity required for such translation. In the realm of bioethics and digital health, ethical considerations are intricate and multifaceted, demanding a nuanced approach to each unique scenario.

Organizations must have the capability to create tools and methodologies that are fit for their specific operational needs. Careful planning is essential, including a strategy for integrating these tools into current systems with minimal interruption and optimal productivity. Furthermore, fostering interdepartmental cooperation is crucial to overcome the common challenge of compartmentalization within organizations. Often, the resources necessary for such endeavors are scarce or inadequately allocated.

Additionally, digital ethics presents unique challenges that necessitate a fundamental shift from the traditional model of one-time advisory consultations. The highly automated and dynamic nature of project oversight, coupled with the scale and velocity of data analytics projects, calls for ongoing ethical engagement and innovative approaches to responsibility assignment. In dispersed data analytics teams, attributing ethical accountability is particularly difficult because individual responsibilities can become unclear. Consequently, there is an imperative for developing new methodologies for ethical assessment. These methodologies must be adaptable to the changing landscape and sufficiently nuanced to reflect the complexities of data and AI utilization.

Q3. You helped develop a risk assessment tool called Principle-at-Risk Analysis (PaRA). What is it? And how does it work in practice at Merck KGaA?

Jean Enno Charton: The Principle-at-Risk Analysis (PaRA) is a standardized risk assessment tool developed to bridge high-level ethics frameworks with practical methods and tools that align with specific workflows and needs. At Merck KGaA, PaRA guides and harmonizes the work of the DEAP, ensuring alignment with the company’s Code of Digital Ethics (CoDE).

How does PaRA work?

Identification: The first step is to identify potential scenarios or decisions in the project seeking guidance from the DEAP where ethical principles could be compromised. These principles include privacy, transparency, fairness, and accountability. At the end of the process, the panel receives a list of potential conflicts between Merck’s CoDE and the project being investigated, enabling a comprehensive review of all relevant ethical concerns.
Assessment: Once potential risks are identified, they are assessed based on their potential impact on ethical principles. The assessment considers factors such as the severity of the potential harm, the likelihood of occurrence, and the company’s ability to mitigate the risk.
Mitigation: After assessing the risks, measures are implemented to mitigate or manage them effectively. This may involve adjusting processes, implementing safeguards, or providing additional training and guidance to employees involved in decision-making.
Monitoring and Review: The PaRA framework emphasizes ongoing monitoring and review of ethical risks to ensure that mitigation measures remain effective. This includes regular audits, feedback mechanisms, and adapting strategies as new risks emerge or circumstances change.
Integration with Decision-Making: Importantly, the PaRA framework is integrated into the company’s decision-making processes. This ensures that ethical considerations are taken into account when making business decisions, from strategic planning to day-to-day operations.
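
The identification and assessment steps above can be pictured as a small risk register. The sketch below is an assumption-laden illustration and not Merck’s actual PaRA implementation: the 1-to-5 scales, the `PrincipleRisk` class, the scoring formula, and the example scenarios are all hypothetical. Only the four principles and the three assessment factors (severity, likelihood, ability to mitigate) come from the description above.

```python
from dataclasses import dataclass

# Principles named in the interview; the real CoDE may define more.
PRINCIPLES = {"privacy", "transparency", "fairness", "accountability"}


@dataclass
class PrincipleRisk:
    """One potential conflict between a project and an ethical principle."""

    principle: str
    scenario: str
    severity: int      # 1 (minor harm) to 5 (severe harm); hypothetical scale
    likelihood: int    # 1 (rare) to 5 (almost certain); hypothetical scale
    mitigability: int  # 1 (hard to mitigate) to 5 (easy to mitigate); hypothetical scale

    def __post_init__(self):
        if self.principle not in PRINCIPLES:
            raise ValueError(f"unknown principle: {self.principle}")
        for name in ("severity", "likelihood", "mitigability"):
            if not 1 <= getattr(self, name) <= 5:
                raise ValueError(f"{name} must be between 1 and 5")

    @property
    def score(self):
        # Assumed formula: severity times likelihood, discounted when the
        # risk is easy to mitigate. Any real weighting would be a policy choice.
        return self.severity * self.likelihood * (6 - self.mitigability) / 5


def prioritize(risks):
    """Return risks ordered from highest to lowest score for panel review."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


register = [
    PrincipleRisk("privacy", "Consent form is hard to understand", 4, 3, 4),
    PrincipleRisk("fairness", "Training data under-represents a group", 5, 3, 2),
    PrincipleRisk("transparency", "Model rationale is not documented", 2, 2, 5),
]
ranked = prioritize(register)
```

A ranked list like this is only an input to discussion; the mitigation, monitoring, and decision-making steps still depend on human judgment.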

In practice, we have applied PaRA in various contexts, such as ensuring the comprehensibility of consent forms in data-sharing scenarios at Syntropy, a collaborative technology platform for clinical research. The tool can also be applied across various departments and functions, such as research and development or marketing.

Q4. How can ethics panels make an effective contribution to implementing digital ethics in a commercial organization?

Jean Enno Charton: Ethics panels can play a crucial role in implementing digital ethics in a commercial organization by providing expertise, guidance and oversight in navigating complex ethical quandaries associated with digital technologies.

In the corporate context of companies like Merck, internal ethics teams may be limited in size, hindering their ability to handle diverse ethical issues effectively. This is where external advisory panels come in by contributing diverse knowledge from technical fields, ethics, sociology, anthropology, and law. This diverse expertise is essential for addressing complex ethical conundrums in areas such as stem cell research, patient consent, and AI in HR.

Ethics panels also act impartially in balancing commercial interests and ethical considerations, ensuring fair outcomes.

They contribute to developing robust ethical policies and frameworks, drawing from their experience in policy formation, public consultation, or regulatory roles. DEAP, for instance, assisted in developing CoDE and PaRA, among other recent methodological advancements at Merck.

Panels are also well equipped to conduct risk and opportunity assessments to identify potential ethical concerns and prioritize appropriate countermeasures. This approach promotes a more humane technological landscape that aligns opportunities with ethics for a conscientious contribution.

Q5. Merck created a digital ethics panel composed of external experts in digital ethics, law, big data, digital health, data governance and patient advocacy. How do you handle conflicts of interest and ensure a neutral approach?

Jean Enno Charton: To address potential conflicts of interest and maintain neutrality, the panel members are required to disclose potential conflicts of interest, affiliations with other organizations, or any other factors that might affect their impartiality. This disclosure is part of an agreement that panel members make with Merck during the recruitment process, ensuring a commitment to good faith actions.

Additionally, Merck employs a transparent selection process for its panel members. Candidates are meticulously chosen through a comprehensive process that prioritizes independence, breadth of expertise that is pertinent to Merck’s diverse portfolio, academic merit, diversity of viewpoints, and the capacity to accurately reflect the views pertinent to specific geographic regions as well as the interests of minority groups that are often underrepresented in ethical assessments.

Furthermore, our Code of Ethics and PaRA standardize the panel’s activities, providing structured support in the decision-making processes, thereby ensuring the utmost neutrality and integrity in the panel’s operations.

Q6. How do you avoid the risk that an ethics panel will form an isolated entity within the company?

Jean Enno Charton: Avoiding the risk of an ethics panel becoming an isolated entity within a company involves a delicate balance of independence and integration. It is crucial for such a panel to maintain a certain level of detachment to offer unbiased independent opinions. Yet, it’s equally important to prevent the panel from becoming isolated. At Merck, we achieve this balance by fostering a sense of inclusion within both the Merck and scientific communities.

Our panel members, while not Merck employees, often have a long-standing relationship with the company. We ensure they are well informed about Merck’s activities and specific use cases, enabling them to offer valuable insights without internal biases.

To promote transparency, we communicate the panel’s activities, decisions, and recommendations within the company through minutes or other appropriate communication methods. This open communication helps other stakeholders understand and value the panel’s role, mitigating the risk of isolation. It also nurtures a sense of belonging and contribution to the broader scientific endeavor at Merck.

Q7. Can you share some details of a best practice you have developed using the PaRA tool?

Jean Enno Charton: One of the best practices we have developed using the PaRA tool involves integrating it with other ethical frameworks, such as CoDE, and collaborating with experts including DEAP and my own team to ensure a comprehensive approach to ethical considerations.

It’s important to note that the PaRA tool doesn’t encompass all ethical aspects. Therefore, revisiting and reassessing the tool’s output is sometimes necessary to identify any overlooked or missed elements. This practice has significantly enhanced how effectively we use the tool.

Q8. What are the main lessons you have learned?

Jean Enno Charton: One of the main lessons I have learned is the importance of scientific rigor in developing and implementing a risk analysis tool like PaRA. While a principled framework provides a foundation, it must be supported by rigorous scientific and ethical analysis to ensure its effectiveness and meaningfulness. This involves conducting thorough research, gathering relevant data, and engaging with experts in various fields to inform the development of the tool.

Another key lesson is the necessity of garnering support from senior leadership for the success of the tool. Without the backing of senior leadership, it would have been challenging – if not impossible – to develop and implement PaRA effectively.

Further, I have learned the need for flexibility and sensitivity to the specific needs of individual departments within the organization. While overarching ethical principles guide the development of the tool, it’s essential to recognize that different departments may face unique challenges and priorities. As such, the tool must be adaptable enough to accommodate these varying needs while upholding ethical standards consistently across the organization. This flexibility ensures that the tool remains relevant and applicable across diverse departments and scenarios.

Q9. How do you make sure that the recommendations developed on the basis of the Principle-at-Risk Analysis are really enforced and not ignored in practice?

Jean Enno Charton: From a governance perspective, our role is advisory rather than decision-making or enforcement. We aim to convince our collaborators to adopt the recommendations from the panels and implement the principles laid out in Merck’s Code of Digital Ethics.

Departments and individuals bear the primary responsibility for adhering to these principles. However, we support department leaders in creating mechanisms for effective enforcement.

For instance, we have created a handbook-style self-assessment tool for project managers working with generative AI to identify and mitigate ethical risks during project development. Additionally, we have embedded semi-automated ethics assessment processes into existing project management structures within data analytics.

We also actively engage with department leaders to encourage follow-up and implementation of recommendations to ensure accountability and transparency. When recommendations are not feasible due to various constraints, we communicate the constraints to the ethics panel, valuing their input while outlining implementation barriers.

Q10. What are the challenges and limitations of such an approach?

Jean Enno Charton: A significant challenge of this approach is the voluntary nature of compliance. Unlike traditional compliance departments that enforce rules and regulations, we rely on voluntary adherence to ethical principles. While the voluntary approach elevates ethics to a higher moral standard, it also means that there’s no direct mechanism for enforcement. This voluntary aspect can pose challenges in ensuring consistent adherence across all departments and levels of the organization. This also has to do with the culture and code of conduct we live by at Merck, which is still a largely family-owned company with a long-term, generational approach to doing business responsibly.

…………………………………..

Dr. Jean Enno Charton, Director Bioethics & Digital Ethics, Merck KGaA

Dr. Jean Enno Charton is Director Digital Ethics & Bioethics at Merck. After a brief stint in the biotech industry, he has been with Merck since 2014 – first in the Research & Development department of the Healthcare division (Medical Affairs), later as Chief of Staff of the Chief Medical Officer Healthcare. Since 2019, he has built up the independent Digital Ethics & Bioethics department and is responsible for the topic across all divisions.

Jean Enno Charton studied biochemistry at the University of Tübingen and obtained his doctorate in cancer research at the University of Lausanne; his research experience includes stays at the Canadian Science Center for Human and Animal Health and Harvard Medical School.

On Digital Transformation and Ethics. Interview with Eberhard Schnebel, ODBMS Industry Watch, November 23, 2020

On Responsible AI. Interview with Kay Firth-Butterfield, World Economic Forum. ODBMS Industry Watch, September 20, 2021

………………………………………..

Jan 17 24

On The Future of Vector Databases. Interview with Charles Xie

by Roberto V. Zicari

“Open source is reshaping the technological landscape, and this holds particularly true for AI applications. As we progress into AI, we will witness the proliferation of open-source systems, from large language models to advanced AI algorithms and improved database systems.”

Q1. What is your definition of a Vector Database?

Charles Xie: A vector database is a cutting-edge data infrastructure designed to manage unstructured data. When we refer to unstructured data, we specifically mean content like images, videos, and natural language. Using deep learning algorithms, this data can be transformed into a novel form that encapsulates its semantic representation. These representations, commonly known as vector embeddings or vectors, signify the semantic essence of the data. Once these vector embeddings are generated, we store them within a vector database, empowering us to perform semantic queries on the data. This capability is potent because, unlike traditional keyword-based searches, it allows us to delve into the semantics of unstructured data, such as images, videos, and textual content, offering a more nuanced and contextually rich search experience.
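
To make the idea of a semantic query concrete, here is a minimal sketch of similarity search over stored embeddings. It is a toy, not how Milvus or any other product is implemented: the `ToyVectorStore` class, the three-dimensional vectors, and the item names are invented for illustration, and real embeddings come from a deep learning model and have hundreds or thousands of dimensions. Production systems also typically rely on approximate nearest-neighbour indexes rather than the exhaustive scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store that scans every vector on each query."""

    def __init__(self):
        self._items = []

    def add(self, item_id, embedding):
        self._items.append((item_id, list(embedding)))

    def search(self, query_embedding, top_k=2):
        """Return the top_k (item_id, similarity) pairs, most similar first."""
        scored = [
            (cosine_similarity(query_embedding, embedding), item_id)
            for item_id, embedding in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(item_id, round(score, 3)) for score, item_id in scored[:top_k]]


store = ToyVectorStore()
# Hand-made stand-ins for model-generated embeddings.
store.add("photo_of_cat.jpg", [0.9, 0.1, 0.0])
store.add("video_of_dog.mp4", [0.7, 0.6, 0.1])
store.add("article_on_databases.txt", [0.0, 0.1, 0.95])

# A query embedding that points in roughly the same direction as the cat photo.
results = store.search([1.0, 0.0, 0.0], top_k=2)
```

The query never mentions a keyword; items are ranked purely by how close their vectors are to the query vector, which is what distinguishes this from keyword-based search.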

Q2. Currently, there are a multitude of vector databases on the market. Why do they come in so many versions?

Charles Xie: When examining vector database systems, disparities emerge. Some, like Chroma, adopt an embedded system approach akin to SQLite, offering simplicity but lacking essential functionalities like scalability. Conversely, systems like pgvector and Pinecone pursue a scale-up approach, excelling in single-node instances but offering limited scalability.

As a seasoned database engineer with over two decades of experience, I stress the complexity inherent in database systems. A systematic approach is vital when assessing these systems, encompassing components like storage layers, storage formats, data orchestration layers, query optimizers, and execution engines. Considering the rise of heterogeneous architectures, the latter must be adaptable across diverse hardware, from modern CPUs to GPUs.

From its inception, Milvus has embraced heterogeneous computing, efficiently running on various modern processors like Intel and AMD CPUs, ARM CPUs, and Nvidia GPUs. The integration extends to supporting vector processing AI processes. The challenge lies in tailoring algorithms and execution engines to each processor’s characteristics, ensuring optimal performance. Scalability, inevitable as data grows, is a crucial consideration addressed by Milvus, supporting both scale-up and scale-out scenarios.

As the vector database gains prominence, its appeal to vendors stems from its potential to reshape data management. Therefore, transitioning to a vector database necessitates evaluating its criticality to business functions and anticipating data volume growth. Milvus stands out for both scenarios, offering consistent, optimal performance for mission-critical services and remarkable cost-effectiveness as data scales.

Q3. In your opinion when does it make sense to transition to a pure vector database? And when not?

Charles Xie: Now, let’s delve into the considerations for transitioning to a pure vector database. It’s crucial to clarify that a pure vector database isn’t merely a traditional database with a vector plugin; it’s a purposefully designed solution for handling vector embeddings.

There are two key factors to weigh. Firstly, assess whether vector computing and similarity search are critical to your business. For instance, if you’re constructing a RAG solution integral to millions of users daily and forming the core of your business, the performance of vector computing becomes paramount. In such a situation, opting for a pure vector database system is advisable. It ensures consistent, optimal performance that aligns with your SLA requirements, especially for mission-critical services where performance is non-negotiable. Choosing a vector database system guarantees a robust foundation, shielding you from unforeseen surprises in your regular database services.

The second crucial consideration is the inevitable increase in data volume over time. As your service runs for an extended period, the likelihood of accumulating larger datasets grows. With the continuous expansion of data, cost optimization becomes an inevitable concern. Most pure vector database systems on the market, including Milvus, deliver superior performance while requiring fewer resources, making them highly cost-effective.

As your data volume escalates, optimizing costs becomes a priority. It’s common to observe that the bills for vector database services grow substantially with the expanding dataset. In this context, Milvus stands out, proving over 100 times more cost-effective than alternatives such as pgvector, OpenSearch, and other non-native web database solutions. The cost-effectiveness of Milvus becomes increasingly advantageous as your data scales, making it a strategic choice for sustainable and efficient operations.

Q4. What is the initial feedback from users of Vector Databases?

Charles Xie: Reflecting on our beginnings six years ago, we focused primarily on catering to enterprise users. At the time, we engaged with numerous users involved in recommendation systems, e-commerce, and image recognition. Collaborations with traditional AI companies working on natural language processing, especially when dealing with substantial datasets, provided valuable insights.

The predominant feedback we received emphasized the enterprise sector’s specific needs. These users, being enterprises, possessed extensive datasets and a cadre of proficient developers. They emphasized deploying a highly available and performant vector database system in a production environment, a requirement often seen in large enterprises where AI was gaining traction.

It’s important to note that independent AI developers were not as prevalent during that period. AI, being predominantly in the hands of hyper-scalers and large enterprises, meant that the cost of developing AI algorithms and applications was considerably high. Around six years ago, hyper-scalers and large enterprises were the primary users of vector database systems, given their capacity to afford dedicated teams of AI developers and engineers. This context laid the foundation for our initial focus and direction.

In the last two years, we’ve witnessed a remarkable shift in the landscape of AI, marked by the breakthrough of modern AI, particularly the prominence of large language models. Notably, there has been a significant surge in independent AI developers, with the majority comprising teams of fewer than five individuals. This contrasts starkly with the scenario six years ago, when the AI development scene was dominated by large enterprises capable of assembling teams of tens of engineers, often including a cadre of computer science PhDs, to drive AI application development.

The transformation is striking—what was once the exclusive realm of well-funded enterprises can now be undertaken by small teams or even individual developers. This democratization of AI applications marks a fundamental shift in accessibility and opportunities within the AI space.

Q5. Will semantic search be performed in the future by ChatGPT instead of using vectors and a K-nearest neighbor search?

Charles Xie: Indeed, foundation models such as ChatGPT and vector databases share a common theoretical underpinning: the embedding vector abstraction. Both ChatGPT and vector database systems leverage embedding vectors to encapsulate the semantic essence of the underlying unstructured data. This shared data abstraction allows them to make sense of the information and perform queries effectively. Across large language models, AI models, and vector database systems, a profound connection exists, rooted in the use of the same data abstraction: embedding vectors.

This connection extends further as they employ identical metrics, primarily relying on distance metrics like Euclidean or cosine distance. Whether within ChatGPT or other large language models, using consistent metrics facilitates the measurement of similarities among vector embeddings.
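To make the two metrics mentioned above concrete, here is a small Python sketch that computes Euclidean and cosine distance for a few hand-picked 2-dimensional vectors. The vectors and helper names are assumptions chosen for easy arithmetic, not values from any real model.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


a = [3.0, 4.0]
b = [6.0, 8.0]    # same direction as a, twice the length
c = [4.0, -3.0]   # perpendicular to a

# Euclidean distance is sensitive to vector length; cosine distance is not.
print(euclidean_distance(a, b), cosine_distance(a, b))

# For unit-length vectors the two metrics are tied together:
# squared Euclidean distance equals 2 * cosine distance.
ua, uc = normalize(a), normalize(c)
print(euclidean_distance(ua, uc) ** 2, 2 * cosine_distance(ua, uc))
```

One consequence of the unit-length relationship is that, once embeddings are normalized, ranking neighbors by Euclidean distance and ranking them by cosine distance give the same order.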

Theoretically, a profound connection exists between large language models like ChatGPT and various vector databases, stemming from their shared use of the embedding vector abstraction. The workload division between them becomes apparent: they both excel at performing semantic and k-nearest neighbor searches. However, the noteworthy distinction lies in the cost efficiency of these operations.

While large language models and vector databases tackle the same tasks, the cost disparity is significant. Executing semantic search and k-nearest neighbor search in a vector database system proves to be approximately 100 times more cost-effective than carrying out these operations within a large language model. This substantial cost difference prompts many leading AI companies, including OpenAI, to advocate for using vector databases in AI applications for semantic search and k-nearest neighbor search due to their superior cost-effectiveness.

Q6. There seems to be a need from enterprises to have a unified data management system that can support different workloads and different applications. Is this doable in practice? If not, is there a risk of fragmentations of various database offerings?

Charles Xie: No, I don’t think so. To illustrate my point, let’s consider the automobile industry. Can you envision a world where a single vehicle serves as an SUV, sedan, truck, and school bus all at once? This has yet to happen in the last 100 years of the automobile industry, and if anything, the industry will be even more diversified in the next 100 years.

It all started with the Model T; from this, we witnessed the birth of a great variety of automobiles commercialized for different purposes. On the road, we see lots of differences between SUVs, trucks, sports cars, and sedans, to name a few. A closer look at all these automobiles reveals that they are specialized and designed for specific situations.

For instance, SUVs and sedans are designed for family use, but their chassis and suspension systems are entirely different. SUVs typically have a higher chassis and a more advanced suspension system, allowing them to navigate obstacles more easily. On the other hand, sedans, designed for urban areas and high-speed driving on highways, have a lower chassis for a more comfortable driving experience. Each design serves a specific goal.

Looking at all these database systems, we see that many design goals contradict each other. It’s challenging, if not impossible, to optimize a design to meet all these diverse requirements. Therefore, the future of database systems lies in developing more purpose-built and specialized ones.

This trend is already evident in the past 20 years. Initially, we had traditional relational database systems. Still, over time, we witnessed the emergence of big data solutions, the rise of NoSQL databases, the development of time series database systems, graph database systems, document database systems, and now, the ascent of vector database systems.

On the other hand, certain vendors might have an opportunity to provide a unified interface or SDK to access various underlying database systems—from vector databases to traditional relational database systems. There could be a possibility of having a unified interface.

At Milvus, we are actively working on this concept. In the next stage, we aim to develop an SQL-like interface tailored for vector similarity search in vector databases. We aim to incorporate vector database functionality under the same interface as traditional SQL, providing a unified experience.

Q7. What does the future hold for Vector databases?

Charles Xie: Indeed, we are poised to witness an expansion in the functionalities offered by vector database systems. In the past few years, these systems primarily focused on providing a single functionality: approximate nearest neighbor search (ANN search). However, the landscape is evolving, and in the next two years, we will see a broader array of functionalities.

Traditionally, vector databases supported similarity-based search. Now, they are extending their capabilities to include exact search or matching. You can analyze your data through two lenses: a similarity search for a broader understanding and an exact search for detailed insights. By combining these two approaches, users can fine-tune the balance between obtaining a high-level overview and delving into specific details.

Obtaining a sketch of the data might be sufficient for certain situations, and a semantic-based search works well. On the other hand, in situations where minute differences matter, users can zoom in on the data and scrutinize each entry for subtle features.
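The combination of exact matching and similarity search described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `records` list, the `category` field, and the `hybrid_search` function are hypothetical names, and real vector databases expose this kind of filtered search through their own APIs and indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical records: an embedding plus a field used for exact matching.
records = [
    {"id": "r1", "category": "shoes", "embedding": [0.9, 0.1]},
    {"id": "r2", "category": "shoes", "embedding": [0.2, 0.8]},
    {"id": "r3", "category": "bags", "embedding": [1.0, 0.0]},
]


def hybrid_search(records, query_embedding, category, k=1):
    """Keep only exact matches on category, then rank them by similarity."""
    candidates = [r for r in records if r["category"] == category]
    candidates.sort(
        key=lambda r: cosine_similarity(query_embedding, r["embedding"]),
        reverse=True,
    )
    return [r["id"] for r in candidates[:k]]


# r3 is the closest vector overall, but the exact filter excludes it.
top = hybrid_search(records, [1.0, 0.0], category="shoes", k=1)
print(top)
```

Filtering first and ranking second is only one possible design; a system could equally rank first and filter afterwards, with different cost and recall trade-offs.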

Vector databases will likely support additional vector computing workloads, such as vector clustering and classification. These functionalities are particularly relevant in applications like fraud detection and anomaly detection, where unsupervised learning techniques can be applied to cluster or classify vector embeddings, identifying common patterns.
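As a hedged illustration of how clustering vector embeddings might support anomaly detection, the Python sketch below runs a plain k-means on a handful of made-up 2-dimensional points and flags any point that lies far from its cluster centroid. The data, the fixed starting centroids, and the distance threshold of 3.0 are all assumptions chosen for the example; this is not a description of any product feature.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]


def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda i: distance(p, centroids[i]))
            for p in points]


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means from given starting centroids; returns (centroids, labels)."""
    centroids = [list(c) for c in initial_centroids]
    for _ in range(iterations):
        labels = assign(points, centroids)
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = mean(members)
    return centroids, assign(points, centroids)


# Made-up embeddings: two tight groups plus one stray point.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2],
          [5.0, 5.0], [5.2, 4.8], [4.8, 5.2],
          [9.0, 0.0]]

centroids, labels = kmeans(points, initial_centroids=[points[0], points[3]])

# Flag points unusually far from their own centroid (threshold is an assumption).
THRESHOLD = 3.0
outliers = [p for p, lab in zip(points, labels)
            if distance(p, centroids[lab]) > THRESHOLD]
print(labels, outliers)
```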

Q8. And how do you believe the market for open source Vector databases will evolve?

Charles Xie: Open source is reshaping the technological landscape, and this holds particularly true for AI applications. As we progress into AI, we will witness the proliferation of open-source systems, from large language models to advanced AI algorithms and improved database systems. The significance of open source extends beyond mere technological innovation; it exerts a profound impact on our world’s social and economic fabric. In the era of modern AI, with the dominance of large language models, open-source models and open-source vector databases are positioned to emerge victorious, shaping the future of technology and its societal implications.

Q9. In conclusion, are Vector databases transforming the general landscape, not just AI?

Charles Xie: Indeed, vector databases represent a revolutionary technology poised to redefine how humanity perceives and processes data. They are the key to unlocking the vast troves of unstructured data that constitute over 80% of the world’s data. The promise of vector database technology lies in its ability to unleash the hidden value within unstructured data, paving the way for transformative advancements in our understanding and utilization of information.

………………………………………………..

Charles Xie is the founder and CEO of Zilliz, focusing on building next-generation databases and search technologies for AI and LLM applications. At Zilliz, he also invented Milvus, the world’s most popular open-source vector database for production-ready AI. He is currently a board member of LF AI & Data Foundation and served as the board’s chairperson in 2020 and 2021. Charles previously worked at Oracle as a founding engineer of the Oracle 12c cloud database project. Charles holds a master’s degree in computer science from the University of Wisconsin-Madison.

On Vector Databases and Gen AI. Q&A with Frank Liu. ODBMS.org, December 8, 2023

Jan 5 24

On the Future of AI. Interview with Raj Verma

by Roberto V. Zicari

“Five years from now, today’s AI systems will look archaic to us. In the same way that computers of the 60s look archaic to us today. What will happen with AI is that it will scale and therefore become simpler, and more intuitive. And if you think about it, scaling AI is the best way to make it more democratic, more accessible.”

Q1. What are the innovations that most surprised you in 2023?

Raj Verma: Generative AI is definitely the talk of the town right now. 2023 marked its breakthrough, and I think the hype around it is well founded. Few people knew what generative AI was before 2023. Now everyone’s talking about it and using it. So I was quite impressed by the uptake of this new technology.

But if we go deeper, we have to acknowledge that the rise of AI would not have been possible without significant advancements in how large amounts of data are stored and handled. Data is the core of AI and what is used to train LLMs. Without data, AI is useless. To have powerful generative AI that gives you answers, predictions and content right at the moment you need it, you need real-time data, or data that is fresh, in motion and delivered in a matter of milliseconds. The interpretation and categorization of data are therefore crucial in powering LLMs and AI systems.

In that sense, you will notice a lot of hype around Specialized Vector Databases (SVDB), which are independent systems that you plug into your data architecture designed to store, index and retrieve vectors, or multidimensional data points. These are popular because LLMs are increasingly relying on vector data. Think of vectors as an image or a text converted into a stored data point. When you prompt an AI system, it will look for similarities in those stored data points, or vectors, to give you an answer. So vectors are really important for AI systems and businesses often believe that a database focused on just storing and processing vector data is essential for AI systems.

However, you don’t really need SVDBs to power your AI applications. In fact, loads of companies have come to regret their use because, as an independent system, they result in redundant data, excessive data movement, increasing labor and licensing costs and limited query power.

The solution is to store all your data — structured data, semi-structured data based on JSON, time-series, full-text, spatial, key-value and vector data — in one database. And within this system have a powerful vector database functionality that you can leverage to conduct vector similarity search.

All this to say that I’ve been impressed by the speed at which we are developing ways to power generative AI. We’re experimenting based on its needs and quickly figuring out what works and doesn’t work.

Q2. What is real-time data and why is it essential for AI?

Raj Verma: Real time is about what we experience in the now. It is access to the information you need, at the moment you need it, delivered together with the exact context you need to make the best decision. To experience this now, you need real-time data — data that is fresh and in motion. And with AI, the need for real-time data — fast, updated and accurate data — is becoming more apparent. Because without data, AI is useless. And when AI models are trained on outdated or stale data, you get things like AI bias or hallucinations. So, in order to have AI that is powerful, and that can really help us make better choices, we need real time data.

With the use of generative AI expanding beyond the tech industry, the need for real-time data is more urgent than ever. This is why it is important to have databases that can handle storage, access and contextualization of information. At SingleStore, our vision is that databases should support both transactional (OLTP) and analytical (OLAP) workloads, so that you can transact without moving data and put it in the right context — all of which can be delivered in millisecond response times.

Q3. One of the biggest concerns around AI is bias, the idea that existing prejudices in the data used to train AI might creep into its decisions, content and predictions. What can we do to mitigate this risk?

Raj Verma: I believe humans should always be involved in the training process. With AI, we must be both student and teacher, allowing it to learn from us, and in that way continuously give it input so that it can give us the insight we need. There are many laudable efforts to develop Hybrid Human AI models, which basically incorporate human insight with machine learning. Examples of hybrid AI include systems in which humans monitor AI processes through auditing or verification. Hybrid models can help businesses in several ways. For example, while AI can analyze consumer data and preferences, humans can jump in to guide how it uses that insight to create relevant and engaging content.

As developers, we must also be very cognizant of where the data used to train LLMs comes from. And in this sense, being transparent about where it comes from helps, because the systems can be held accountable and challenged if biased data does creep into the training process. The important thing here is also to know that an AI system is only as good as the data it is trained on.

Q4. The popularity and accessibility of generative artificial intelligence (gen AI) has made it feel like the future we see in science fiction movies is finally at our doorstep. And those science fiction movies have sowed much worry about AI being dangerous. Is this science fiction vision of AI coming true?

Raj Verma: Don’t expect machines to take over the world, at least not any time soon. AI can process and analyze large amounts of data and generate content based on that, at a much faster pace than we humans can. But they are still very dependent on human input. The idea that human-like robots will come to rule the world makes for great fiction movies, but it’s far from becoming a reality.

That doesn’t mean that AI isn’t dangerous, and we have a responsibility to discern which threats are real.

AI poses an unprecedented risk in fueling the spread of disinformation because it has the capacity to create authentic looking content. Distinguishing between content generated by AI and that created by humans will become increasingly challenging. AI can also pose cybersecurity threats. You can trick ChatGPT into writing malicious code, or use other generative AI systems to enhance ransomware. And AI can worsen current malicious trends that have surfaced with social media. I personally worry that AI systems will exploit the attention economy and spur higher levels of social media addiction. This can have terrible consequences on teenagers’ mental health. As a father of two, I am deeply concerned about this.

These are the threats that we should worry about. And we humans are capable of mitigating these risks. We should always be involved in AI’s development, audit it and pay special attention to the data that we use to train it.

Q5. You are quoted as saying that “without data, AI wouldn’t exist, but with bad or incorrect data, it can be dangerous.” How dangerous can AI be?

Raj Verma: Generative AI is like a superhuman who reads an entire library of thousands of books to answer your question, all in a matter of seconds. If it doesn’t have access to that library, and if that library doesn’t have the latest books, magazines and newspapers, then it cannot give you the most relevant information you need to make the best decision possible. This is a very simple explanation of why, without data, AI is useless. Now imagine that library is full of outdated books that were written by white supremacists during the civil war. The information you are going to get from this AI system is going to guide your decisions, and you are going to make some very bad decisions. You are going to make biased decisions, and you’re going to perpetuate biases that already exist in society. That’s how AI can be dangerous, and that is why we need AI systems to have access to the most updated, accurate data out there.

Q6. Should AI be Regulated? And if yes, what kind of regulation?

Raj Verma: The issue is, it’s hard to regulate something that is still developing. We just don’t know what AI will look like, in its entirety, in the future. So we want to avoid regulation hampering the development of this technology. That doesn’t mean that there aren’t standards that can be applied globally. Data regulation is key, since data is the backbone of AI. Data regulation can be based on the principle of transparency, which is key to generating trust in AI and to our ability to hold this technology and its developers accountable should something go wrong. To achieve transparency you need to know where the data in the AI system is coming from. So, proper documentation of the data used to train LLMs is something we can regulate. You also must be able to explain the reasoning behind an AI system’s solutions or decisions. These must be understandable by humans. And there’s also transparency in how you present the AI system to users. Do users know that they are talking to an AI robot and not a human? We can regulate data transparency without imposing excessive measures that could hamper AI’s development.

Q7. There is no global approach to AI regulation. Several countries around the world are at various stages of evolving their approach to regulating AI. What are the practical consequences of this?

Raj Verma: Regulating AI on a global scale is incredibly challenging. Each country’s social values will be reflected in the way it approaches regulating this new technology. The EU has a very strong approach to consumer protection and privacy, which is probably why it authored the first significant, widespread attempt to regulate AI in the world. I don’t believe we will see such sweeping legislation in the US, a country that values innovation and market dynamics. In the US, we will see a decentralized approach to regulation, with maybe some specific decrees that seek to regulate its use in specific industries, like healthcare or finance.

Many worry that the EU’s new AI Act will become another poster child of the Brussels effect, where firms end up adopting the EU’s regulation, in the absence of any other, because it saves costs. Yet the Brussels effect might not exactly happen with the AI Act, particularly because firms might want to use different algorithms in the first place. For example, marketing companies will want to use different algorithms for different geographic areas because consumers behave differently depending on where they live. It won’t be hard then for firms to have their different algorithms comply with different rules in different regions.

All this to say that we should expect different AI regimes around the world. Companies should prepare for that. AI trade friction with Europe is likely to emerge, and private companies will advance their own “responsible AI” initiatives as they face a fragmented global AI regulatory landscape.

Q8. How can we improve the way we gather data to feed LLMs?

Raj Verma: We need to make sure LLMs are up to date. Open source LLMs that are trained on large, publicly available data are prone to hallucinate because at least part of their data is outdated and probably biased. There are ways to fix this problem, including Retrieval Augmented Generation (RAG), which is a technique that uses a program to retrieve contextual information from outside the model, immediately feeding it to the AI system. Think of it as an open book test where the AI model, with the help of a program (the book), can look up information specific to the question it is being asked about. This is a very cost effective way of updating LLMs because you don’t need to retrain it all the time and can use it in case-specific prompts.
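The “open book test” idea described above can be sketched in a few lines of Python. This is a toy illustration only: the `documents` list, the word-overlap scoring in `retrieve` (a stand-in for embedding similarity), and the prompt wording in `build_prompt` are all assumptions, and the final call to a language model is deliberately left out.

```python
def tokenize(text):
    """Lowercase the text and split it into a set of words (toy tokenizer)."""
    for mark in "?.,":
        text = text.replace(mark, " ")
    return set(text.lower().split())


# Hypothetical knowledge base that lives outside the model and can be
# updated at any time without retraining anything.
documents = [
    "The 2024 pricing plan starts at 20 dollars per month.",
    "Support is available by email on weekdays.",
    "The 2021 pricing plan started at 10 dollars per month.",
]


def retrieve(question, documents, k=1):
    """Rank documents by word overlap with the question (stand-in for vector search)."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_docs):
    """Put the retrieved text in front of the question: the 'open book'."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


question = "What does the 2024 pricing plan cost per month?"
context = retrieve(question, documents, k=1)
prompt = build_prompt(question, context)
print(prompt)  # this prompt would then be sent to an LLM of your choice
```

Because fresh facts arrive through the prompt rather than through the model weights, updating the knowledge base is enough to change what the model is asked to rely on.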

RAG is central to how we at SingleStore are bringing LLMs up to date. To curate data in real time, it needs to be stored as vectors, which SingleStore allows users to do. That way you can join all kinds of data and deliver the specific data you need in a matter of milliseconds.

Q9. What is the evolutionary path you think AI will go through? When we look back 5-10 years from now, how will we look at genAI systems like ChatGPT?

Raj Verma: Five years from now, today’s AI systems will look archaic to us. In the same way that computers of the 60s look archaic to us today. What will happen with AI is that it will scale and therefore become simpler, and more intuitive. And if you think about it, scaling AI is the best way to make it more democratic, more accessible. That is the challenge we have in front of us, scaling AI, so that it works seamlessly in giving us the exact insight we need to improve our choices. I believe this scaling process should revolve around information, context and choice, what I call the trinity of intelligence. These are the three tenets that differentiate AI from previous groundbreaking technologies. They are also what help us experience the now in a way that we are empowered to make the best choices. Because this is our vision at SingleStore, we focus on developing a multi-generational platform which you can use to transact and reason with data in millisecond response times. We believe this is the way to make AI more powerful because with more precise databases that can deliver information in real time, we can power the AI systems that will really help us make the best choices as humans.

………………………………………..

Raj Verma is the CEO of SingleStore.

He brings more than 25 years of global experience in enterprise software and operating at scale. Raj was instrumental in the growth of TIBCO software to over $1 billion in revenue, serving as CMO, EVP Global Sales, and COO. He was also formerly COO at Apttus Software and Hortonworks. Raj earned his bachelor’s degree in Computer Science from BMS College of Engineering in Bangalore, India.

How will the GenAI/LLM database market evolve in 2024. Q&A with Madhukar Kumar. ODBMS.org, December 9, 2023

On Generative AI. Interview with Maharaj Mukherjee. ODBMS Industry Watch, December 10, 2023

On Generative AI and Databases. Interview with Adam Prout. ODBMS Industry Watch, October 9, 2023

__________________


Dec 10 23

On Generative AI. Interview with Maharaj Mukherjee

by Roberto V. Zicari

“Managing changes is one of the dimensions that an organization may adapt to in order to reap the full benefits of Generative AI. It needs to make the change, adaptation and redeployment easy on its workforce so that people do not feel threatened by this new technology.”

Q1. Generative AI applications like ChatGPT, DALL-E, Stable Diffusion and others are said to be rapidly democratizing the technology in business and society. Is this really happening?

Mukherjee: Democratization in the AI/ML area has been happening for some time. It is a slow evolutionary process. It started with the advent of AutoML tools, whereby model building moved very quickly from the expertise of data scientists to the purview of anyone with the touch of a button. But except for some areas such as face recognition, it has been out of reach for most people. Now, with the coming of generative AI with large foundational models and large language models, the doors have been opened to the public to experiment with AI in all different ways. But I do not think the ball will stop here, and we are yet to see all the many ways people can make use of these new sets of tools that have suddenly become available to them. These use cases will now drive the development of newer types of research and innovation in the AI field.

Q2. What industries do you believe will be most impacted by LLMs and Generative AI? Why?

Mukherjee: In my humble opinion, the arts and entertainment industry as well as the advertising industry will be the first adopters of this technology, and that is already happening. It will happen more slowly in areas that require more specialized knowledge, such as science, technology, and engineering, and in those that are often more regulated, such as the healthcare and pharmaceutical industries.

Q3. What kind of infrastructure will be essential for deploying generative AI?

Mukherjee: The current barriers to early adoption for any small industry are twofold. One is the availability of specialized hardware such as GPUs, and the other is the availability of quality data scientists and programmers who can make use of the best-known algorithms and best available hardware to make it work. These two limitations are keeping these technologies out of reach for most companies except for a few very large organizations. But I would think that it is a matter of time, and with improvements of scale things will be within the reach of almost every organization.

Q4. How will companies prevent the breach of third-party copyright in using pre-trained foundation models?

Mukherjee: That is a major concern for existing LLM and Gen-AI models. However, many organizations are already in the process of building models based on Retrieval Augmented Generation (RAG), which can take care of copyright violation and other ethical and legal issues. Another way to handle such violations is to tag the generation and retrieval to the original source by keeping a record of all intermediate steps using methodologies such as blockchain.

Q5. What about trust in generative AI? How can you ensure the accuracy of generative AI outputs and maintain user confidence?

Mukherjee: A model is always as good as the results it generates. The problem of errors is not new to the area of AI and ML, and describing it with anthropomorphic terms such as “hallucinations” does not make it any different. Often errors are introduced as a safety measure as a bias in the model. As more and more people get used to these fundamental limitations of AI, people will adjust their expectations and find the best way to use these models.

Q6. LLMs may generate algorithmic bias due to imperfect training data or decisions implicitly or explicitly made by the engineers developing the models. What is your take on this?

Mukherjee: Quality of data has always been an issue with any AI/ML model, and it is not very different in the age of generative AI and LLMs. It is always a case of “Garbage In – Garbage Out”. In traditional AI/ML model development, engineers have been culling and engineering data to suit their goals and needs. But often engineering bias seeps into how the data is selected, and consequently some human biases are introduced in the model. In the realm of Generative AI the philosophy is slightly different from traditional AI. Since it is built on any and all kinds of data, the initial data bias may not be an issue here. However, we still need to make sure that the output follows our societal norms and principles and is not harmful in general.

Q7. Will change management be critical to implementing generative AI?

Mukherjee: Managing change is one of the dimensions an organization may need to address to reap the complete benefits of Generative AI. It needs to make the change, adaptation, and redeployment easy on its workforce so that people do not feel threatened by this new technology.

Q8. How is AI regulation going to have an impact when it comes to harnessing the opportunities of generative AI?

Mukherjee: As with any human technology, Generative AI needs to conform to our accepted societal principles, norms, and moralities. If technologists and the industry cannot regulate themselves, there is a risk that regulations may be imposed upon them by outsiders who may not have as much knowledge and understanding of the technology. It is better, therefore, that developers of Gen AI step back and spend some time figuring out how to do that and impose certain standard checks and balances upon themselves.

Q10. If it were up to you, would you use generative AI in mission-critical applications?

Mukherjee: I would repeat my earlier thought that Gen AI is not fundamentally different from traditional AI. Any area where people have used traditional AI in the past may consider adopting Generative AI, or at least explore it as a possibility.

……………………………………….

Maharaj Mukherjee, Senior Vice President and Senior Architect Lead, Bank of America

Well recognized expert in cutting edge technologies including Edge and Massively Distributed Computing, Artificial Intelligence and Machine Learning, Cognitive Deep Learning, Blockchain, and Internet-of-Things. Currently working in the Bank of America as Senior Vice President and Senior Architect Lead in the Technology Infrastructure Organization. Previously worked as Senior AI/ML Architect and SVP at the Employee Experience Technology within Bank of America. Before Bank of America Maharaj Mukherjee worked for twenty years in the IBM Research in various leading-edge technologies including Shape Processing Engine, Computational Lithography, Design for Manufacturing, Deep and Cognitive Machine Learning, and Watson Internet of Things. Maharaj Mukherjee is an IBM Master Inventor Emeritus and holds 162 US patents and 160 International Patents to his credit. He is also recognized as a top inventor in the Bank of America in 2020 and 2021. He received the Platinum Award from the Bank of America for being one of the top three inventors in 2021. He was also recognized by IBM for the “Twenty Patents from the Past Twenty years” in 2015. He was inducted in IBM’s Inventor Wall of Fame in 2011.

He holds a PhD from Rensselaer Polytechnic Institute, an MS from SUNY Stony Brook, and B-Tech (Hons.) from Indian Institute of Technology, Kharagpur.

He currently serves as a member of the Institute of Electrical and Electronics Engineers (IEEE) USA Awards Committee as well as a member of the IEEE Region 1 Awards Committee. He is also the current chair of Central Area of IEEE USA.

Related Posts

On Generative AI and Databases. Interview with Adam Prout, ODBMS Industry Watch, October 9, 2023

On Generative AI. Interview with Philippe Kahn, ODBMS Industry Watch, June 19, 2023

Follow us on X: @odbmsorg

Oct 9 23

On Generative AI and Databases. Interview with Adam Prout

by Roberto V. Zicari

” With GenAI also requiring massive amounts of training data, the need for greater storage capacity is crucial. Databases are designed to scale as data volumes grow, ensuring generative AI projects can handle larger datasets as they become available. This means databases can help support the growing demand for AI capabilities across the business world. “

Q1. How is Generative AI transforming the way we store, structure, and query data?

Adam Prout: The focus of generative AI is to create new data, such as text and images. At its core, GenAI is made of neural networks, a subset of machine learning that handles unstructured data like text, audio, images, and videos. These networks consist of connected layers that learn from training data and identify patterns to make new instances. But they are not creating copies of existing instances in the data set. Instead, these networks develop unique data points based on the training data. Increased computational power and the massive amounts of data produced in recent years have paved the way for generative AI.

Due to advancements in GenAI, many organizations are exploring the ways that the technology can increase efficiencies in their operations. For example, generative AI can help data analysts find hidden patterns in data sets, deriving actionable insights faster than a human could. In other instances, data augmentation helps organizations generate more data to train neural networks. Models like generative adversarial networks (GANs) can learn the distribution of original data, augment it, and create synthetic data to diversify training datasets for machine learning models. Likewise, content creation is a significant use case for generative AI as organizations can create reports, summaries, and other deliverables using proprietary data at a rapid speed.

As for query data, we can ask questions of our data in natural language, creating efficiency over writing an SQL query or doing a full text search. More data is being stored in vector embeddings in databases, and looked up via Approximate Nearest Neighbor (ANN) vector searches, as a result of GenAI.
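
To make the embedding lookup concrete, here is a minimal Python sketch of an exact nearest-neighbor search by cosine similarity. It is only an illustration: the three-dimensional vectors and document names are made up, and a real ANN index trades a little accuracy for speed by avoiding this full scan over every stored vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, embeddings, k=2):
    """Return the ids of the k stored vectors most similar to the query (exact scan)."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in embeddings.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings; real ones have hundreds or thousands of dimensions.
embeddings = {
    "invoice_faq": [0.9, 0.1, 0.0],
    "refund_policy": [0.8, 0.3, 0.1],
    "office_map": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]
top = nearest(query, embeddings, k=2)
```

Here `top` lists the two documents whose vectors point in nearly the same direction as the query, with the unrelated `office_map` entry ranked last and dropped.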

There are many more ways that generative AI helps organizations better leverage their existing data while generating original instances. We’ll continue to discover how generative AI can transform the way we store, structure, and query data for years to come.

Q2. Generative AI relies on large amounts of data to generate human-like answers. Among the challenges faced by generative AI are Data Quality and Quantity. How can a database help here?

Adam Prout: Databases provide a structured framework for data storage, allowing organizations to implement routine data quality checks and validation rules to ensure models are only trained on high-quality information. Another advantage of using a database is the consistent maintenance of data through cleansing and enrichment tools. These processes remove inconsistencies, duplicates, and errors from the data, leading to better model training and improved generative AI outputs.

With GenAI also requiring massive amounts of training data, the need for greater storage capacity is crucial. Databases are designed to scale as data volumes grow, ensuring generative AI projects can handle larger datasets as they become available. This means databases can help support the growing demand for AI capabilities across the business world.
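
The validation rules and duplicate removal mentioned in this answer can be sketched with Python's built-in `sqlite3` module. This is a toy example under stated assumptions: the table name `training_docs`, its columns, and the allowed language codes are hypothetical, and SQLite stands in for whatever database an organization actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE training_docs (
        doc_id TEXT PRIMARY KEY,                                 -- rejects duplicate ids
        body   TEXT NOT NULL CHECK (length(trim(body)) > 0),     -- rejects empty text
        lang   TEXT NOT NULL CHECK (lang IN ('en', 'de', 'fr'))  -- rejects unknown languages
    )
    """
)

# Hypothetical candidate rows; three of them violate a rule.
candidates = [
    ("a1", "How to reset a password", "en"),
    ("a1", "How to reset a password", "en"),  # duplicate id
    ("a2", "   ", "en"),                      # whitespace-only body
    ("a3", "Rechnung herunterladen", "de"),
    ("a4", "Unknown language row", "xx"),     # language not in the allowed set
]

accepted, rejected = [], []
for row in candidates:
    try:
        conn.execute("INSERT INTO training_docs VALUES (?, ?, ?)", row)
        accepted.append(row[0])
    except sqlite3.IntegrityError:
        # The database itself enforces the rule, so bad rows never reach training.
        rejected.append(row[0])
conn.commit()

stored = conn.execute("SELECT COUNT(*) FROM training_docs").fetchone()[0]
```

Declaring the rules in the schema means every writer is held to the same checks, rather than relying on each ingestion script to repeat them.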

Q3. Unlike traditional AI workloads that require additional specialized skills, new Generative AI workloads are available to a larger segment of the developer community. What does it mean in practice?

Adam Prout: This is great news for the practice. More software developers are able to leverage generative AI tools to increase efficiency and solve simple, clearly defined problems. And with a growing number of advanced AI code-generation tools on the market, developers can experiment with these technologies to create artificial data and test their code.

It’s no surprise that developers will play a key role in the GenAI revolution. Their expertise and skill sets are vital to improving the performance of AI and machine learning models. They’ll be able to successfully pivot to focusing on AI development as the need for AI/ML skills skyrockets.

Q4. Generative AI: How to Choose the Optimal Database?

Adam Prout: When selecting the right database for AI and machine learning models, organizations need to take into account several considerations:

Speed of data processing: The ability to handle large volumes of data while processing information quickly can help organizations gain real-time insights to drive decision-making. This is especially true when working with streaming data or developing applications that require quick response times such as fraud detection or recommendation systems. A database built on a distributed architecture and an in-memory data store can enable data processing at lightning-fast speed, helping organizations make fast and informed decisions.

Vector search: The way vector searches handle high-dimensional data and provide advanced search and similarity capabilities helps organizations simplify data management processes. A vector search categorizes data based on multiple features, allowing organizations to store and search high-dimensional vectors efficiently. This capability helps organizations build more accurate and effective machine learning models as it filters comprehensive datasets into the systems.

Scalability and integration: As AI requires more computing power and training data, selecting a database becomes even more important to help organizations build out their capabilities. Massive AI projects need a database that can handle complex queries at scale while helping extract and transform data to train AI/ML platforms. A highly scalable database can help companies meet increasing demands for AI-powered workloads. General purpose databases are flexible enough to handle a wide swath of data.

Real-time analytics capabilities: Databases with built-in analytics capabilities can help organizations quickly identify trends and patterns in their data to make more informed and instantaneous decisions. The ability to run analytical queries paired with transactional ones in the same database system, known as hybrid transactional/analytical processing (HTAP), can eliminate the need for separate systems to complete tasks — simplifying the data architecture and reducing costs. This also offers greater flexibility as organizations look to adopt more AI capabilities into their operations.

Q5. Are NoSQL databases better suited for Generative AI than SQL databases?

Adam Prout: NoSQL and SQL databases each have their own strengths and weaknesses, and which one works best for Generative AI depends on what your project needs. NoSQL databases inherently come with more flexibility when it comes to handling unstructured or semi-structured data, which can be beneficial for certain types of data used in Generative AI – think text, images, and sensor data. As for SQL databases, they provide powerful query capabilities, enabling IT leaders to perform complex data retrieval and analysis.

To put it simply, many GenAI projects use a combination of both types of databases, leveraging the strengths of each. When choosing which database to utilize, it’s critical to evaluate the needs and constraints of your project.

Q6. Some SQL databases do have some features that make them compatible with Generative AI, such as supporting JSON data and functions. Are they suited for Generative AI?

Adam Prout: SQL databases that support features like JSON can be well suited for certain aspects of Generative AI, largely when dealing with flexible or semi-structured data formats. The benefits these features provide include schema flexibility, data integration, complex querying, and scalability.

However, depending on the nature of one’s data, its volume and complexity, a combination of SQL and NoSQL databases may also be a suitable solution. There isn’t a “one-size-fits-all” approach, and to ensure you’re aligning with your project’s needs and constraints, it’s important to evaluate the end goal the project is meant to achieve.

Q7. Are databases with vector support the bridge between LLMs and enterprise gen AI apps? Why?

Adam Prout: Databases that include vector support can most definitely play a crucial role when it comes to bridging the gap between LLMs and enterprise Generative AI applications for many reasons:

Easier storage and retrieval of embeddings: LLMs, like ChatGPT, generate word embeddings or vector representations of text data. A database with vector support is designed not only to store these embeddings efficiently but also to retrieve them, making them easier to manage and query.
Quick and accurate similarity searches: Vector searches reign supreme when it comes to performing similarity searches, and in the context of Generative AI this is very valuable, as it enables applications to find similar documents or content quickly.
Scalability: Scalability is crucial for enterprise applications that need to process vast amounts of data, especially as LLMs continue to produce substantial volumes of vector data. Vector search capabilities are purpose-built to efficiently manage large-scale vector data, making them a vital component in handling such demands.
Real-time applications: Various enterprise Generative AI applications, like chatbots, sentiment analysis, and content generation, require real-time processing. Vector search enables real-time retrieval and analysis of vector data, providing the responsiveness these applications need.

Q8. Will vector databases be the essential infrastructure in bringing about the societal and economic changes promised by AI?

Adam Prout: Firstly, I want to clarify my thoughts on the term “vector database.” To SingleStore, vector search is a capability of a database, not a new category of database. That being said, databases that support vector indexing are suited for storing and querying high-dimensional vectors, meaning that they are well-equipped for tasks related to machine learning, recommendation systems, natural language processing, and more.

So, will vector searches be the essential infrastructure in bringing about the societal and economic changes promised by AI? They most definitely play a significant role. However, it’s important to understand that they are just one piece of a very large puzzle that includes algorithms, hardware, ethical considerations, and much more. Whether or not they become “the essential infrastructure” depends on various factors, such as specific applications and use cases of AI. In addition, good results from GenAI prompts often require more than a vector search: there is often a need for more traditional filters on other attributes of the data and the like.
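
The point about pairing vector search with traditional filters can be sketched in a few lines of Python. This is an illustrative toy, not SingleStore's implementation: the `hybrid_search` function, the `department` attribute, and the two-dimensional embeddings are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def hybrid_search(rows, query_vec, department, k=1):
    """Apply an ordinary attribute filter first, then rank the survivors by similarity."""
    candidates = [row for row in rows if row["department"] == department]
    candidates.sort(key=lambda row: cosine(query_vec, row["embedding"]), reverse=True)
    return [row["title"] for row in candidates[:k]]


# Hypothetical rows: each carries a regular attribute alongside its embedding.
rows = [
    {"title": "Travel policy", "department": "hr", "embedding": [0.9, 0.1]},
    {"title": "Expense limits", "department": "finance", "embedding": [0.95, 0.05]},
    {"title": "Quarterly close", "department": "finance", "embedding": [0.1, 0.9]},
]
result = hybrid_search(rows, [1.0, 0.0], department="finance")
```

Without the filter, a document from another department could outrank the right one on similarity alone; combining both conditions in one query is the case the answer describes.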

Q9. Who is already using Generative AI in the enterprise world?

Adam Prout: A recent report explored how companies are utilizing generative AI and found: 46% for content generation, 43% for developing analytics insights summaries, 32% for analytics insight generation, 32% for code development, and 27% for process documentation. On top of this, most companies are curious about AI but don’t use it as part of their everyday processes, with a majority (53%) saying they are “exploring” or “experimenting” with the tech.

All of this is to say that the use of generative AI in the enterprise landscape continues to evolve rapidly – whether that be organizations fully implementing the tech in their day-to-day operations, or employees utilizing it to complete specific tasks.

Q10. SingleStoreDB has evolved over the past 10 years from its early days as MemSQL (in-memory OLTP) to become a more general purpose distributed SQL Database. How do you manage AI and Generative AI?

Adam Prout: When we first founded MemSQL, people were saying SQL couldn’t scale; we knew that wasn’t true. We knew we could build something scalable, but similar enough to a traditional, single-host database that customers wouldn’t have to learn a whole new database.
That took us to real-time analytics and progressing to a general purpose database. We expanded to a broad set of workloads and analytics, with performance similar to or even better than specialized systems. We’re giving customers the flexibility that comes with general purpose databases, as well.
As for AI, SingleStore has supported basic exact-match vector search capabilities for many years and we are adding improved vector indexes for ANN search for larger data sets. We believe vector searches combined with general purpose SQL databases capable of filtering, full text search, JSON and the like are crucial capabilities to unlock the most value from GenAI.
……………………………………………..

Adam Prout, CTO and Co-Founder, SingleStore

Adam Prout is the CTO at SingleStore and oversees product architecture and development. He joined SingleStore in 2011 as a co-founding engineer. Previously, Adam led engineering efforts on kernel development at Microsoft SQL Server. He holds bachelor’s degrees in Computer Science and Mathematics, and a master’s degree in Mathematics from the University of Waterloo.

Related Posts

On Generative AI. Interview with Philippe Kahn, ODBMS Industry Watch, June 19, 2023

On Generative AI. Q&A with Bill Franks, ODBMS.org, June 26, 2023

Resources

ODBMS.org EXPERT ARTICLES


Jun 19 23

On Generative AI. Interview with Philippe Kahn

by Roberto V. Zicari

“ AI will neither save nor doom the world. It’s people that will. “

Q1. OpenAI CEO Sam Altman says AI will reshape society, acknowledges risks: ‘A little bit scared of this’ (*) What is your take on this?

Philippe: At Fullpower-AI, we build a domain-specific generative AI platform, so I may have an insider perspective. The AI platforms we build are the world’s most rapidly developing technologies. They aren’t just generating text, images, videos, and sounds. They are creating a combination of excitement and anxiety among people and governments across the globe. It’s important to stay ahead of the curve.

Here are some advantages of generative AI:

Assistance and Automation: ChatGPT, Bard, and similar models can provide valuable assistance and automation in various tasks. They can answer questions, provide recommendations, assist with research, and even automate certain processes, saving time and effort.
One-on-one personalization for learning, researching, and automating.
Creativity: Generative AI can create new and original content, including text, images, and music. It can develop ideas and solutions that some humans might not have considered.
Democratization of knowledge and content: Generative AI can make complex information and technologies more accessible to a broader audience. It can simplify and explain complex concepts more understandably, allowing people to engage with information that might otherwise be challenging to understand.

Here are some disadvantages of generative AI:

Bias and Misinformation: Generative AI models like ChatGPT can inadvertently reflect and amplify biases present in the training data. The model’s outputs may also contain similar biases or misinformation if the training data contains biased or inaccurate information. And, of course, there is a potential for a compounding effect.
Potential nonsense generation: Although AI models can generate coherent responses, they often lack a true understanding of the content. They rely heavily on patterns, brute force data manipulation, and statistical correlations in the data rather than true comprehension, which can lead to incorrect or nonsensical answers with a potentially compounding effect.
Ethical Concerns: There are ethical concerns surrounding generative AI, particularly when it comes to deep fakes and the potential for malicious use. These models can be misused to create convincing fake content, significantly affecting opinion, attitude, policy, privacy, security, and trust. It’s the old perverse adage: “The bigger the lie, the more will believe it.”
Overreliance and Dependency: As generative AI becomes more prevalent, there is a risk of overreliance and dependency on these systems. People might rely too heavily on AI-generated content without critically evaluating its accuracy or considering alternative perspectives, leading to intellectual laziness and bigotry.
Unintended Consequences: It’s easy with these systems to produce “credible” spam, phishing, or propaganda with very negative societal impacts.

Recognizing and addressing these negatives is important to ensure the responsible development and deployment of generative AI technologies. We are fully aware of that at Fullpower-AI and think about it continuously.

Q2. What do you think is the Social Impact of Chat GPT-4?

Philippe: Per your prior question, ChatGPT, Bard, and others already have profound positive and negative impacts.

For example, more advanced generative AI could increase automation in various industries and job sectors. While this can improve efficiency and productivity, it may also result in job displacement or changes in the job market, requiring individuals to adapt to new roles and acquire new skills.

It’s important to remember that these are all speculative impacts based on the general trajectory of AI advancement. The specific social impact of the AI systems would depend on various factors, including design, deployment, and the actions taken by developers, policymakers, and society to shape its use.

AI will neither save nor doom the world. It’s people that will.

Q3. Do you use Chat GPT-4? What do you think about it?

Philippe: At Fullpower-AI, we build domain-specific generative AI systems targeting sleep management, breathing anomalies, skincare, industrial automation, etc. As general-purpose systems, ChatGPT and Bard have proven their usefulness. It’s important to remember the safeguards per your question 1.

Q4. In the interview above, it is mentioned that, “GPT-4 is just one step toward OpenAI’s goal to eventually build Artificial General Intelligence, which is when AI crosses a powerful threshold which could be described as AI systems that are generally smarter than humans.” Is this Science Fiction, or is it something that may happen?

Philippe: Building Artificial General Intelligence is challenging. It remains controversial. This is way past the Turing test because human behavior and intelligent behavior are not the same things.

Regarding feasibility, there are varying opinions regarding the potential for achieving AGI. I believe we will achieve 90% of AGI in the next decade. The last 10% may take a very long time.

It is important to approach AGI development cautiously, ensuring responsible and ethical practices are in place to address potential risks and consequences. Continued research, collaboration, and discussions are necessary to advance our understanding of AGI and its societal implications.

Qx. Anything else you wish to add here?

Philippe: While there is no definitive answer, it is crucial to consider the potential risks and take precautions to ensure AGI’s safe development and deployment. To mitigate risks, we advocate for AGI’s development with safety and ethics. Part of the challenge is the geopolitical competition. We must ensure that nations on the fringe don’t exploit loopholes. Personally, I believe in progress and that AI technology can have a deep positive impact on our future.

Resources:

(*) Source: ABC News. OpenAI CEO, CTO on risks and how AI will reshape society, March 16, 2023.

………………………………………

Philippe Kahn is a highly successful serial entrepreneur who founded a number of leading companies, including Fullpower-AI, LightSurf, Starfish Technologies, and Borland.

Feb 6 23

On Innovation. A Conversation with Philippe Kahn

by Roberto V. Zicari

“ I always think about a graduate class called “Invent.” Innovation has to be based more on spark than process. “

I asked ten questions on innovation to Philippe Kahn back in February 2006. Now this is a new revision…

RVZ

Q1. What is Innovation for you?

Philippe Kahn: Innovation is a key success ingredient for science, business, and personal growth. It is all about bringing something new: New ideas, new devices, and new methods.

Q2. What pivotal role did your parents play in your personal development?

Philippe Kahn: Yes, I grew up with a single Mom, my hero. Here Wikipedia speaks for itself: Claire Monis.

Q3. Besides a master’s in mathematics, you also received a master’s in musicology composition and classical flute performance. Did music influence your career as an entrepreneur? How?

Philippe Kahn: Playing music is part of my daily practice. My Mom, a concert violinist, would make me practice 30 minutes before going to school. This has become a daily life discipline like meditation. I play both Jazz and classical music daily.

Q4. You are credited with creating the first camera phone. You had a vision, but this did not materialize at that time. What is the main lesson you learned from this?

Philippe Kahn: Pioneering visions never materialize instantly. We created the first working prototype in 1997, launched it in Japan toward the end of 1999, then in the US in 2002. In 2007 Steve Jobs and Apple launched the iPhone, and the market grew. Here is a helpful link

Q5. You are also credited with being a pioneer in wearable technology. This developed into Fullpower-AI: AI-modeled biosensing algorithms and embedded AI Machine Learning solutions and generative AI for Synthetic trial augmentation. What obstacles did you have to overcome, to make this vision reality? Who helped you to make it a reality?

Philippe Kahn: Our team at Fullpower-AI created the first iteration of our IoT biosensing platform. We thought that the first application was for wearables. We built complete solutions for Nike and launched Nike Running solutions. We also licensed our technology to Jawbone in 2011. It is all internal development: Device, sensing, firmware, security, cloud, and actionable insights. Now we are focused on digital transformation with our IoT/AIoT biosensing platform. Our goal is to help transform sleep, cosmetics, wellness, and medicine by leveraging our platform.

Q6. What do you consider are the current most promising innovations that will have an impact in the near future?

Philippe Kahn: We all know of the impact of generative AI on creating content such as text, music, and graphics. It’s helpful to many, but there may be a few hints of plagiarism. I think that generative AI could help people develop better writing, musical, and graphic skills. However, the most promising applications of AI are in wellness, health, and medicine. We may finally make significant progress in tackling challenges such as Alzheimer’s, Cancer, etc. All this is possible because of the combination of IoT, AIoT, biosensing, and deep learning.

Q7. In 2006 you mentioned that Vision, Leadership, and Perseverance were in your opinion the top 3 criteria for successful Innovation. Did you change your mind in the meanwhile?

Philippe Kahn: Yes, Vision, Leadership, and Perseverance are key. Let’s sprinkle a bit of luck too. With Fullpower-AI and our IoT/AIoT platform, we were early in 2010, now we look like an “overnight success!”

Q8. What is a culture that supports and sustains Innovation?

Philippe Kahn: No matter what size, visionary leadership is key. It’s necessary but sometimes not sufficient. Augmenting the teams with the best talent is key while setting up non-invasive disciplined processes.

Q9. What should be taught in universities to help Innovation that is currently missing in your opinion?

Philippe Kahn: I always think about a graduate class called “Invent.” Innovation has to be based more on spark than process.

Q10. You and your wife Sonia run the Lee-Kahn Foundation. Tell us a bit about it.

Philippe Kahn: Yes, we like to focus on the environment, in particular wildlife, animal welfare, and conservation. Our founding vision is utopian, yet something we can get behind. It reads like this: “May our children and our children’s children enjoy better health and be able to hear the howl of a Wolf Pack in the wild, experience the magic of Dolphins playing with the ocean waves, drink pure water from every stream… “

………………………………………….

Philippe Kahn is a highly successful serial entrepreneur who founded a number of leading companies, including Fullpower-AI, LightSurf, Starfish Technologies, and Borland.

Resources:

On Innovation. Archive of interviews (2006-now)

Jan 13 23

On Cloud Database Management Systems. Interview with Rahul Pathak.

by Roberto V. Zicari

IT teams no longer want to be consumed by undifferentiated heavy lifting so that they can focus on strategic business goals and innovation. This is very liberating, and we believe that this is a major growth driver.

Q1: In your opinion what is the status of the database market today and in the next years to come?

Rahul: The broader database market trend is more of a question for analysts. Our unwavering focus is to continue innovating on behalf of customers to make advanced database features more approachable while reducing the costs and complexities of maintaining databases. IT teams no longer want to be consumed by undifferentiated heavy lifting so that they can focus on strategic business goals and innovation. This is very liberating, and we believe that this is a major growth driver.

Q2: You just wrapped up re:Invent 2022. Is re:Invent the high point of the year in terms of your database announcements?

Rahul: re:Invent is always an exciting and energizing event. That said, we actually release new innovations throughout the year, when they are ready. For example, we released some big innovations earlier in 2022, like Amazon Aurora Serverless v2, Amazon RDS Multi-AZ with two readable standbys, and a whole lot more. We also have announcements at re:Invent in addition to providing attendees a hands-on learning experience of our services.

Q3: Can you share some details on these more notable launches prior to re:Invent?

Rahul: Absolutely. We launched Amazon Aurora Serverless v2 (ASv2), which provides customers the ability to instantly scale up and down in fine-grained increments based on their application’s needs. ASv2 is particularly useful for spiky, intermittent, or unpredictable workloads. Manually managing database capacity can take up valuable time and can lead to inefficient use of database resources. With ASv2, customers only pay on a per-second basis for the database capacity that they use when the database is active. ASv2 has become the fastest adopted feature in the history of Aurora. Customers, like Liberty Mutual, S&P Global, and AltPlus, have used ASv2 to reduce their costs while achieving improved database performance.

Another feature launch that has proven compelling to customers is the release of Amazon RDS Multi-AZ with two readable standbys in different AZs, improving both performance and availability. As you may know, we launched Multi-AZ deployment back in 2020, in which we automatically create a primary database (DB) instance and synchronously replicate the data to an instance in a different AZ. When it detects a failure, Amazon RDS automatically fails over to a standby instance without manual intervention. Now, the launch of Multi-AZ with two standbys adds another layer of protection and significant performance benefits. With this feature, failovers typically occur in under 35 seconds with zero data loss and no manual intervention. Customers can gain read scalability by distributing traffic across two readable standby instances, and up to 2x improved write latency compared to Multi-AZ with one standby.

Q4: During re:Invent, it was mentioned that AWS also recently launched serverless and global database for your graph database, Amazon Neptune. Can you share some details on this?

Rahul: Yes, Amazon Neptune is now our sixth database to be serverless and our fifth database with the ability to scale reads globally across regions. Both of these capabilities are important for modern-day applications with global performance requirements at scale. I should also mention that for our first ever serverless database, Amazon DynamoDB, we recently announced the capability to import data from S3. This further underscores our focus on increasing interoperability and integration across our services to minimize effort by customers in moving their data to where they need it.

Q5: On the heels of re:Invent, AWS became the new Leader of Leaders in the Gartner MQ for Cloud Database Management Systems 2022. That’s a remarkable achievement. How is AWS thinking about this recognition? What are the main strengths that Gartner found in your offering? Are there any weaknesses?

Rahul: While AWS has been named as a leader for the eighth consecutive year, we were elated and humbled to be positioned highest in execution and placed furthest in vision among the top 20 data and analytics companies in the world. We think listening to our customers and solving their most challenging problems is key. We engage closely with customers on product roadmaps and work diligently to deliver on our commitments as promised. Our own experience in operating our e-commerce business has been, and continues to be, a wellspring of learnings for what it takes to build massive, modern, internet-scale applications serving customers on a global scale.

In their 2022 report, Gartner called out the breadth of our services as a major strength. Our best-fit philosophy, targeted to specific use cases as needed by various applications and microservices, is really paying off. No vendor ever gets a perfect score, and Gartner also noted that there is still upside from better integration between our services. Gartner gave us credit for progress towards an integration roadmap, and this continues to be a major roadmap theme for us. At re:Invent, we announced Amazon Aurora zero-ETL integration with Amazon Redshift, and we’re eager to continue delivering on our integration roadmap. You can read the report here.

Q6. What were the overarching themes around your announcements at re:Invent 2022?

Rahul: Our database business tracks several themes that we deliver against. Of these themes, there were three that were at the center of our announcements. These themes were interoperability across services, advancing performance and scale, and operational excellence by making security and advanced operational techniques more approachable.

Q7: Why are these themes important?

Rahul: Interoperability across our services is important because it improves productivity across development and operations teams. Integration between services is needed as part of building modern applications. It’s a question of where the integration occurs. Application developers often have to include this integration as part of their application code or solution architects must take extra measures to include additional integration components which increases complexity. If the integration is built in under the covers, then that’s one big area developers and architects don’t need to worry about.

Performance and scale are important because of the deluge of data and types of data organizations are experiencing and will continue to experience. For almost every organization this deluge of data is a clear and present day-to-day reality. Customers need reassurance that they can scale-up and scale-out with real-time performance.

Finally, the approachability of security and advanced operational techniques removes big hurdles that get in the way of organizations that don’t want to make massive investments in IT operations and specialized skills. It levels the playing field for the undifferentiated heavy lifting – things that are not core to the business but necessary for advancing the mission of the business. The definition of undifferentiated heavy lifting is expanding. Years ago, we started by removing the resources associated with hardware provisioning, database setup, patching, backups, and more. This is expanding to scaling up/down and scaling in/out based on an application’s needs, and removing the highly specialized skill sets and extensive resources otherwise required.

Q8: What did AWS announce in support of interoperability across services?

Rahul: We announced the preview of interoperability between Amazon Aurora and Amazon Redshift. Each of these services leads in their categories – Amazon Aurora as an operational database and Amazon Redshift as an analytical database.

The traditional approach to integration between operational and analytical databases is to use generalized ETL or ELT. This is beset with problems in so many ways. It’s complex and heavy, often requiring manual coding of SQL to optimize query performance. It’s harder to set up, maintain, and use. Maintenance and lifecycle management of this type of data integration are worsened by the inherent fragility of the approach: the integration breaks when there is a change to the source or target schema, which requires extensive testing after every change. What you get after taking on all these burdens is usually a low-performance, non-elastic solution that doesn’t adapt well to changing workloads.

We announced the preview of a purpose-built, point-to-point, fully managed integration that doesn’t suffer from these issues. Our Amazon Aurora zero-ETL integration with Amazon Redshift can consolidate data from multiple Aurora databases to a single Redshift database, giving you the benefit of near-real-time analytics on unified data. This opens up an entire category of use cases for time sensitive analytics on fresh data.

The integration is easy to set up: creating a Redshift integration target, whether it’s a new or existing endpoint, is straightforward. Furthermore, we designed this zero-ETL integration for easy maintenance, adapting to Aurora-side schema changes. Database or table additions and deletions are handled transparently. If a transient error is encountered, the integration automatically resynchronizes after recovery from the error.

Data is replicated in parallel, within seconds. So large data volumes are not a problem. On the Amazon Redshift side, you can transform data with materialized views for improving query performance.

Q9: Now shifting to performance and scale, what are the highlights?

Rahul: We announced three key new features, starting with Amazon DocumentDB Elastic Clusters, which will horizontally scale writes with automated operations. As you may know, we can already horizontally scale reads across all our popular databases using read replicas. For Amazon DocumentDB, our customers needed the ability to horizontally scale writes beyond the limits of a single node. Amazon DocumentDB Elastic Clusters uses sharding, a form of partitioning data across multiple nodes in a cluster, so that each node can support both reads and writes in a multi-active approach. When data is written to a node it is immediately replicated to the other nodes. This has the added benefit of supporting massive volumes of data. What’s exciting is Amazon DocumentDB can scale to handle millions of writes (and reads) per second with petabytes of storage capacity.
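To illustrate the general idea of sharding described above, here is a minimal Python sketch of hash-based routing. It is not how Amazon DocumentDB Elastic Clusters is implemented. The hashing scheme, the `shard_for` helper, and the `ShardedStore` class are all hypothetical names chosen for illustration, and replication between nodes is left out.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a shard key to a shard index using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy in-memory store: each shard is a dict standing in for one node."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def write(self, key: str, doc: dict) -> None:
        # Every shard accepts writes for the keys it owns.
        self.shards[shard_for(key, len(self.shards))][key] = doc

    def read(self, key: str):
        # Reads are routed to the same shard that owns the key.
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

Because each key is owned by exactly one shard, writes for different keys can be spread across nodes, which is what allows write throughput to grow beyond a single node.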

In addition to horizontal scaling, we also invested in optimizing the performance of a single database instance. Our announcement of Amazon RDS Optimized Writes and Amazon RDS Optimized Reads for MySQL are examples of this. Both of these enhancements improve our internal implementation to improve performance.

Prior to RDS Optimized Writes, atomicity of writes was handled by writing pages twice. Smaller chunks of a page were first written to a “doublewrite buffer” and then written to storage. This protects against data loss in case of failure, but two writes take longer and consume more I/O bandwidth reducing database throughput and performance. For use cases with a high volume of concurrent transactions, to solve for durability customers also need to provision additional IOPS to meet their performance requirements. Optimized writes work by atomically writing more data to the database for each I/O operation. So, this means that the pages are written to table storage durably as a single atomic operation in one step. With Optimized Writes, customers can now gain up to 2x improvement in write transaction throughput at no additional cost and with zero data loss.
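The arithmetic behind the "up to 2x" figure can be sketched in a few lines of Python. This is a deliberately simplified model, not a description of the RDS or MySQL implementation: it assumes every page flush costs exactly two physical writes with a doublewrite buffer and one without, and it ignores batching and any other I/O. The function names and the I/O budget are hypothetical.

```python
def physical_writes(pages: int, doublewrite: bool) -> int:
    """Count physical page writes needed to flush `pages` dirty pages."""
    writes_per_page = 2 if doublewrite else 1
    return pages * writes_per_page


def max_page_flushes(io_budget: int, doublewrite: bool) -> int:
    """Pages that can be flushed per second under a fixed write-I/O budget."""
    writes_per_page = 2 if doublewrite else 1
    return io_budget // writes_per_page
```

Under this model, a fixed budget of write I/O flushes twice as many pages when each page is written once. In practice the gain depends on the workload, which is why the figure is stated as an upper bound.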

With RDS Optimized Reads, read performance is improved by leveraging data proximity. A MySQL server creates internal temporary tables while processing complex or unoptimized queries like analytical queries that require grouping, sorting etc. When these temporary tables cannot fit into memory, the server defaults to disk storage. With Optimized Reads, RDS places these temporary tables on the instance’s local storage instead of an Elastic Block Storage volume, which is shared network storage. It’s the local availability of temporary data that makes queries up to 50% faster.

Q10: How about security and operational excellence, what did AWS announce for this theme?

Rahul: Security is of utmost importance and an area of sustained investment for us. We announced the preview of Amazon GuardDuty RDS Protection, which protects Amazon Aurora databases from suspicious login attempts that can lead to data exfiltration and ransomware attacks. It does this by identifying anomalies, sending intrusion alerts, managing stolen credentials, and more. Our goal with GuardDuty was to create a tool that’s easy to enable and produces timely, actionable results. We use machine learning to accurately detect highly suspicious activities like access attacks using evasion techniques. Security findings are enriched with contextual data so you can quickly answer questions such as what database was accessed, what was anomalous about the activity, has the user previously accessed the database, and more. Aurora is the starting point. We’ll also extend this capability to other RDS engines.

We also announced Trusted Language Extensions for PostgreSQL, an open-source development kit and project, available for Amazon Aurora and Amazon RDS. This project is focused on increasing the security posture for extensions, starting with PostgreSQL.

Developers love PostgreSQL for many reasons including the thousands of available extensions, but adding extensions can be risky. This makes certification of extensions very important. Our customers asked us for an easier way to use their extensions of choice and also build their own extensions. It’s impractical for AWS to certify the long tail of extensions, so we worked with the open-source community to come up with a more scalable model.

Trusted Language Extensions for PostgreSQL is a framework that empowers developers and operators to more safely test and certify extensions. Now, as soon as a developer determines an existing extension meets their needs or is ready to implement a custom extension, they can safely test and deploy it in production. Developers no longer need to wait for AWS to certify an extension to begin implementation because Trusted Language Extensions are considered to be part of your application. It provides a safe approach because the impact of any defects in an extension’s code is limited to a single database connection. Trusted Language Extensions supports popular programming languages that developers love including JavaScript, Perl, and PL/pgSQL. We do plan to support other programming languages so stay tuned for announcements in 2023.

Q11: What else did AWS launch for making advanced operational techniques more approachable?

Rahul: I am also excited about Amazon RDS Blue/Green Deployments, which automates an advanced DevOps technique – and this is available for MySQL in both Amazon RDS and Amazon Aurora. In the current atmosphere of 24/7 operations, downtime for updates (security patches, major version upgrades, schema changes, and more) or disruptions or data loss due to failed attempts at updates are not acceptable.

In this DevOps technique, the production environment is the ‘blue’ environment and the staging environment is the ‘green’ environment. Organizations with advanced DevOps skills test new versions of software in a ‘green’ environment under a production load before actually putting them into production. But this requires advanced operational knowledge, careful planning, and time. With RDS Blue/Green Deployments, we provide a fully managed staging environment. When an upgrade is deemed to be ready, the database can be updated in less than a minute with zero data loss – a much simpler, safer, and faster approach to database updates.
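The blue/green pattern itself can be sketched as a small Python model. This is only a conceptual illustration and not the RDS Blue/Green Deployments API: the `Env` and `BlueGreen` classes, the version strings, and the list-based "replication" are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Env:
    name: str
    version: str
    log: list = field(default_factory=list)  # writes applied so far


class BlueGreen:
    """Toy model: 'live' serves traffic, 'staging' follows it."""

    def __init__(self, blue: Env, green: Env) -> None:
        self.live, self.staging = blue, green

    def write(self, record: str) -> None:
        # Application writes always go to the live environment.
        self.live.log.append(record)

    def replicate(self) -> None:
        # Copy any writes the staging environment has not seen yet.
        missing = self.live.log[len(self.staging.log):]
        self.staging.log.extend(missing)

    def switchover(self) -> None:
        # Catch up first, then refuse to switch if any write is missing.
        self.replicate()
        if self.staging.log != self.live.log:
            raise RuntimeError("staging is not caught up; aborting switchover")
        self.live, self.staging = self.staging, self.live
```

The key design point is that the switch happens only after the staging copy has every write, so promoting it loses no data. A real system would also have to block or queue writes during the switch, which this single-threaded sketch does not model.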

Another launch is AWS Database Migration Service (DMS) Schema Conversion, which makes heterogeneous migrations operationally easier. Previously, a separate schema conversion tool was needed for mapping the data at the source database to the target database. Now the schema conversion is integrated with DMS, making schema assessments and conversions much simpler. Heterogeneous schema conversion can now be initiated with a few simple steps, reducing setup time from hours to minutes.

Q12: Would you like to add anything else?

Rahul: A good way to come up to speed with the latest from AWS and the art of the possible is to watch recordings from re:Invent. We showcased product announcements and a breadth of sessions that cover our product roadmap and best practices. You can also learn more from our database category page, and database blog. We’re energized and focused on innovating for our customers! Feedback is always welcome and I encourage all customers to reach out so we can help no matter where they may be on their journey to the cloud – simply complete our Contact Us form.

………………………………..

Rahul Pathak is Vice President, Relational Database Engines at AWS, where he leads Amazon Aurora, Amazon Redshift, and Amazon QLDB, AWS’ core relational database engine technologies. Prior to his current role, he was VP, Analytics at AWS where he led Amazon EMR, Amazon Redshift, AWS Lake Formation, AWS Glue, Amazon Athena, and Amazon OpenSearch Service. During his 11+ years at AWS, Rahul has focused on managed database and analytics services with previous roles leading Emerging Databases, Blockchain, RDS Commercial Databases, and more. Rahul has over twenty-five years of experience in technology and has co-founded two companies, one focused on digital media analytics and the other on IP-geolocation. He holds a degree in Computer Science from MIT and an Executive MBA from University of Washington.

Resources

AWS positioned highest in execution and furthest in vision

Gartner has recognized Amazon Web Services (AWS) as a Leader and positioned it highest in execution and furthest in vision in the 2022 Magic Quadrant for Cloud Database Management Systems among 20 vendors evaluated. This Magic Quadrant report provides cloud data and analytics buyers with vendor insights based on Gartner research criteria. AWS has been a Leader in the report for eight consecutive years.

Magic Quadrant for Cloud Database Management Systems

Published 13 December 2022 – ID G00763557 – 71 min read

Figure 1: Magic Quadrant for Cloud Database Management Systems (source: Gartner, December 2022)

Read the Gartner Report

Related Posts

EXPERT ARTICLES DECEMBER 16, 2022

Deep Dive Amazon DocumentDB Elastic Clusters. Q&A with Vin Yu

https://www.odbms.org/2022/12/deep-dive-amazon-documentdb-elastic-clusters-qa-with-vin-yu/

Follow us on Twitter: @odbmsorg

Mar 21 22

On Using in Memory Database. Interview with Jonah H. Harris

by Roberto V. Zicari

“Whether it’s adding features, fixing bugs, or improving performance, it all comes down to the quality of the code.” – Jonah H. Harris.

Q1. You are the director of Artificial Intelligence & Machine Learning at The Meet Group. What are your current responsibilities?

Jonah H. Harris: AI and ML research is rapidly growing. Staying on top of those advancements to identify key strategic opportunities and improvements that deliver novel and strategic solutions, which solidify our position as leaders in personal connection, is paramount. While setting direction is important, my primary goal is to shape, grow, and lead an exceptional team of Machine Learning Engineers to research, design, develop, and implement innovative solutions and advance our company’s capabilities across multiple business units. Our focus areas primarily include deep learning, natural language processing, computer vision, recommendation, ranking, and anomaly detection. It’s quite a bit to remain current on these days.

Q2. What do you use Artificial Intelligence & Machine Learning for?

Jonah H. Harris: At The Meet Group, we provide multiple brands and platforms which enable members to identify potential partners for romantic, platonic, and entertainment purposes. While traditional recommendation systems match items (e.g., books, videos, etc.) with a user’s interests, we aim to match people who are mutually interested in and likely to communicate with each other. While recommendation is a critical component of our business, additional work is required to perform abuse prevention and improve monetization – all of which are enhanced using a combination of data science, machine learning, and artificial intelligence. Our team employs many different techniques and technologies to accomplish each area mentioned above as quickly and efficiently as possible.

Q3. You have been working previously as the VP of Architecture and Lead DBA, overseeing high performance data access. What were your most important projects?

Jonah H. Harris: Now paired with Parship, The Meet Group is a worldwide leader in personal connection with a globally distributed workforce. When I joined as the Lead DBA in 2008, however, it was a small social network named myYearbook based in New Hope, Pennsylvania. Through multiple acquisitions and stages of the company, from private to NASDAQ-listed and private once again, I’ve been fortunate enough to grow with the organization and hold various positions from individual technologist to Chief Technology Officer. I’ve always enjoyed challenging work and my current position, overseeing AI/ML, is no different.

When I think of all the projects I’ve architected or developed over the years, one of the most fun and architecturally challenging was the reciprocal matchmaking system designed for a game called BlindDate.

BlindDate was a questionnaire-based matchmaking system that allowed members to select questions about themselves, supply their own answers, and identify their desired partner’s answers. To be “matched,” other members would need to answer the same questions along with the desired answers bi-directionally. One important implementation caveat was that we did not want to precompute these matches – they had to be done in (soft) real-time. We found many members would submit hundreds or even thousands of questions. While we did our best to partition this problem into an optimal search space, performing this reciprocal match was a performance challenge.

For our MVP, we initially designed this to use a relational database. Early on, however, we found this began to take around eight hundred milliseconds per request. As the game scaled, this would never work as initially designed. This led us to look at eXtremeDB.

Coupled with its new (at the time) multi-version concurrency control (MVCC) transaction manager and ability to control the low-level data structure format, we were able to design a bitwise-optimized matching algorithm. As a result, the eXtremeDB-based implementation dropped the response time of a single request down to seventy-six microseconds on the exact same hardware; it also reduced memory usage by two-thirds.
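To give a feel for what a bitwise reciprocal match can look like, here is a minimal Python sketch. It is not the BlindDate or eXtremeDB implementation, whose details are not given here. It assumes a fixed number of answer options per question (`ANSWERS_PER_QUESTION`), and the `Profile`, `satisfies`, and `reciprocal_match` names are hypothetical.

```python
ANSWERS_PER_QUESTION = 4  # assumed fixed number of options per question


def slot(question: int, answer: int) -> int:
    """Bit representing one (question, answer) pair."""
    return 1 << (question * ANSWERS_PER_QUESTION + answer)


class Profile:
    def __init__(self, own: dict, wanted: dict) -> None:
        # own: question -> the member's own answer
        # wanted: question -> set of answers accepted from a partner
        self.own_mask = 0
        for question, answer in own.items():
            self.own_mask |= slot(question, answer)
        self.want_mask = 0
        for question, answers in wanted.items():
            for answer in answers:
                self.want_mask |= slot(question, answer)
        self.num_wanted = len(wanted)


def satisfies(candidate: Profile, seeker: Profile) -> bool:
    """True if the candidate gave an accepted answer to every question the seeker asked."""
    # own_mask has one bit per answered question, so each set bit in
    # `hits` is one of the seeker's questions answered acceptably.
    hits = candidate.own_mask & seeker.want_mask
    return bin(hits).count("1") == seeker.num_wanted


def reciprocal_match(a: Profile, b: Profile) -> bool:
    """Both members must satisfy each other's desired answers."""
    return satisfies(a, b) and satisfies(b, a)
```

For example, if Alice answers `{0: 1, 1: 2}` and wants `{0: {0, 1}, 1: {3}}`, while Bob answers `{0: 0, 1: 3}` and wants `{0: {1}, 1: {2, 3}}`, the two match in both directions. Each direction is decided by one AND and one bit count rather than a join over question rows.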

Q4. What are the main challenges you have encountered to achieve high performance data access?

Jonah H. Harris: Largely, a primary challenge is defining the appropriate structure to store and query data. Relational databases are great for general-purpose data management. On the other hand, NoSQL-oriented systems are great for flexibility. Similarly, systems such as Redis provide a unique ability to perform tasks that can’t easily be done with great performance in a traditional database management system. When designing an application, you have to choose the best tool for the job and make trade-offs where necessary. In some cases, this requires utilizing multiple data management technologies or sacrificing performance on one task in favor of another. It’s hard to find a system that’s as flexible as it is fast: eXtremeDB is really the only contender in that category I’ve found.

Q5. Can you tell us about some of the work you have done with eXtremeDB?

Jonah H. Harris: In addition to the BlindDate case mentioned above, we experimented with storing a graph database structure in eXtremeDB – it was highly performant and gave us the ability to store the graph in an optimal form while also making it queryable via SQL.

eXtremeDB is so good that I have personally licensed it to develop and test out my own ideas and implementations of various systems. I’ve built everything from a Redis-compatible service to real-time recommender systems based on eXtremeDB.

I’m actually in the process of writing a book for Apress, Realtime Recommendation Systems: Building Responsive Recommenders from the Ground Up, and testing out several of those algorithms with eXtremeDB as well. Compared to several well-known open-source recommenders, my eXtremeDB-based versions consistently demonstrate several hundred percent improvements in performance. This is due to eXtremeDB’s highly-optimized in-memory implementation, which doesn’t force me to sacrifice on-disk capabilities as other systems do. Additionally, I’ve always licensed the eXtremeDB source code, which is rare for a company to offer. With that, I’ve been able to gain a solid understanding of internals and compile-time optimizations, enabling me to make even better performance gains. The code is immaculate, and McObject is equally great about accepting patches for additional functionality.

Q6. Why choose eXtremeDB?

Jonah H. Harris: If my earlier answers haven’t already praised its modularity and flexibility enough, I’ll state it more clearly: in over twenty years of professional experience not only administering and developing against databases but also working on their internals, eXtremeDB is the only system I’ve found that gives developers the ability to build almost anything with very few constraints.
Likewise, McObject’s support is exceptional. You can ask as detailed a question as you can imagine and get a solid answer, in many cases from the engineers themselves.

Q7. You have implemented a number of features for commercial and open-source databases. What are the main lessons you have learned?

Jonah H. Harris: Whether it’s adding features, fixing bugs, or improving performance, it all comes down to the quality of the code. Unfortunately, most open-source database code is abysmal. Postgres, InnoDB (proper), and Redis are exceptions. That said, you’d expect commercial implementations to be so much better – but they’re usually not. It’s sad, really.

While I didn’t know it initially, part of the team behind eXtremeDB was also behind the old Raima Database Manager (RDM). In the late nineties, I used RDM quite a bit and had a source code license for its code as well. Aside from the MASM-based NetBIOS lock manager implementation, which I believe they acquired from a third-party developer, it was an extremely well-written system with great documentation. So, when I found out eXtremeDB was a brand new, from the ground-up, in-memory-optimized system with very similar developer-friendly embedded database design goals, I was sold!
Sure, I’ve worked on the internals of many different database systems. But, I have no problem understanding the code to eXtremeDB at all. It’s all well-organized and straightforward, which is hard to do for a system that supports multiple transaction managers and is optimized for both in-memory and on-disk operations.

Q8. You are an active open-source contributor. Which open-source projects do you currently contribute to?

Jonah H. Harris: As of late, I haven’t had a great deal of time to do much open-source work. Database-wise, my latest contributions are to Redis, adding a few useful commands and performance optimizations. The rest are generally bug fixes or feature additions in libraries I frequently use.

Q9. What is your experience of using open source software for mission critical applications?

Jonah H. Harris: I’ve always been a big advocate of open-source. I remember first using FreeBSD and Linux in the mid-90s when I was in middle school. That said, I’m huge on choosing the best tool for the job at hand. Sometimes that’s open-source, and sometimes it’s not.

In the early 2000s, I was hired to lead the development of a Johnson & Johnson brand’s rewrite of their CFR Part 11 quality system ERP module from PowerBuilder to Apache+PHP. We used a good amount of open-source, but it still ran on top of HP-UX and Oracle. Did it need to? No. But that’s what they were comfortable with and, to be honest, those were a better choice stability-wise at the time.

These days, when I’m building a general back-end web-based API, I default to Node.js+NGINX, Postgres, and Redis. As most things are containerized on top of a Linux distribution these days, it’s hard to beat that stack. Language-wise, I like TypeScript, though I do see cases for Rust and Go in the future.

That said, when I’m building a performance-optimized system, I still prefer C with libuv for networking. For data management, I’ll use eXtremeDB when I need MVCC or dual in-memory/on-disk functionality. There’s no need to reinvent that, and nothing is nearly as fast. Otherwise, I’ll use klib data structures for simple single-threaded apps.

Open source is great, and it’s come a long way. But, there are still valid cases for using commercial systems.

Qx. Anything else you wish to add?

Jonah H. Harris: For the most part, IMDB systems have always been considered a niche: you either know about them or you don’t. eXtremeDB is an IMDB-optimized system, but its functionality far surpasses its competitors in every aspect. It can be used locally or distributed, with and without SQL, in-memory only or as an on-disk hybrid, in-process and as a server, with high availability, vector-optimized operations, real-time embeddability, source code, and many compile-time optimizations. More people really should know about it; it’s a genuinely fantastic system.

……………………………………..

Jonah H. Harris, Director of Artificial Intelligence & Machine Learning, The Meet Group.

Leader. Entrepreneur. Technologist. NEXTGRES Founder. Former CTO at The Meet Group. OakTable Member. Open Source Contributor. Founding Member of the Forbes Technology Council.

Resources

–McObject and Siemens Embedded Announce Immediate Availability of eXtremeDB/rt for Nucleus RTOS

Follow us on Twitter: @odbmsorg
