ODBMS.org

On Generative AI Investments. Q&A with René Haag

Roberto Zicari — Tue, 23 Apr 2024 17:17:39 +0000

Q1. You published recently a study developed with HFS Research to help organizations understand the value they can derive from their generative AI (GenAI) investments. What are the main take aways from this study?

Like all new technologies or business processes, the Clients need to make sure to put the right foundations in place. With AI – as stated in the Study- the adage “garbage in, garbage out” never becomes truer as with this technology. AI learns and evolves from data input and uses this to build the outcomes. So one of the key takeaways from this study is the importance of Data for a successful AI Implementation and furthermore, the expected outcome of AI driven processes will only appear with the right Data Foundation.

Q2. One-third of executives in this study believe less than half of their organization’s data is actually consumable. What does this mean in practice?

The good news here is that the value of data has become a key role in the executive’s mindset and is part of the planning for a successful business. In practice it means that companies who are considering data as part of their value chain, gain more and better out codes of their business results. When it comes to digital transformation and of course the implementation of AI, there is still a way to go for the companies but as stated in the study, with the right approach and the change in mindset on all levels, it will bring a lot of value.

Q3. What is your definition of a proper data foundation that can help real-world business to generate good data quality outputs?

Understand your data! At Syniti we have a simple approach behind our successful data initiatives with our Clients. “Let the data tell the story”. As every other project, you first of all need to understand where you are in your journey. After that, using the right tools, like our Syniti platform, and more importantly a change management initiative to understand and focus on data, will generate the output required to support business transformation. .

Q3. Is Syniti able to help overcoming the challenge of data bias? If so, how?

Absolutely. That’s part of what we do on a long term basis for our clients. It all starts with the definition of a common understanding of data on a global basis. For example salutations: Given the diversity in our society, we need to define rules on how we want to interact with employees, clients, vendors…. This rule needs to be defined and executed on the data level, so that the AI process (for example, a Chatbot can use the right salutation based on different data inputs). Of course we can add local requirements to that but the principles behind need to be defined and enforced.

Q4. Syniti’s CEO is advocating a Data-First Approach To Digital Transformation. What does this mean in practice? Why does it matter?

Usually, the legacy data will be touched the first time, when it comes to the Data Migration. At this point in time the companies have already gone through a lot of effort and decisionmaking regarding the new Template & Processes. Very often the companies need to invest additional time and effort as soon as they try to transform the legacy data into the new standard, one key reason for project delays and overrun of budgets. With Data First, we include the legacy Data from Day one and can reduce effort, ease decision making and secure the transformation within Time & Budget.

Q5. To quote Syniti’s CEO: “Digital transformation requires data transformation.” Can you explain why?

First of all I would like to clarify the term Transformation. As already mentioned, the lift and shift approach is in our view not a transformation. It is a technical approach to move the legacy World into the new digital World. This approach will bring only very little improvements to the business, as the processes and the Data will stay nearly the same. If the aim of the company is to transform their Business, improve their business processes and improve their business outcome, this -in our view- is a Transformation. And to secure the expected outcome of new Business processes and gain the improvements of a Digital Transformation (and of course of AI), it is required to transform your Data to a “Business Ready” state, which requires Data Quality, Data Harmonization and more.

Q6. Can you give a short explanation of Syniti’s silo-free enterprise data management approach?

The Syniti Knowledge Platform (SKP) or maybe better known in the Region as SAP Advanced Data Migration & Management (ADMM), is a single Platform to support the whole Data Lifecycle. It contains Solutions like Data Rightsizing, Data Assessment, Data Migration, Master Data Management and more, just in a single Platform. Each part can be used in a modular approach but all modules are integrated. That means, wherever our clients are in their Data Journey, we can support their actual initiatives and re-use the knowledge for each upcoming step in the Data Journey. We call it Knowledge Capture & re-use. One reason why we can improve and streamline Data Initiatives and Transformation Projects at any scale.

Q7. Anything else you wish to add?

It doesn’t matter if you consider Transformations, AI Implementation, Reporting, Business Process Improvements or any kind of Business Initiatives. All of these topics come back to Data. Only if the Data foundation is ready to support these, the companies will receive the expected outcomes. Data First, will help to design and improve these initiatives, reduce the efforts to implement and most important, will secure the outcomes.

……………………………………………….

René Haag is VP of Sales MEE (Central and Eastern Europe) at Syniti. His focus is on data migration, data integration and data governance. Before joining Syniti, Haag was an independent consultant in the SAP environment and held management positions at Qlik, Magnitude Software and CDQ in Switzerland, among others.

Understanding How Your Data is Stored, Compressed, and Accessed Can Accelerate Data Query Performance

admin — Sat, 20 Apr 2024 13:10:07 +0000

By Gavin Halliday, HPCC Systems

Businesses are using big data platforms to glean valuable business intelligence from the enormous amounts of data they generate. The size of that business data grows exponentially, making it harder for companies to quickly analyze and apply it. Accordingly, many businesses are turning to AI algorithms and data lakes to analyze and store their data.

Data lake developers focus on ways to reduce the time required to access stored data. The primary source of query delays in a data lake is often the need to access data stored on disk, so developers should consider what kind of storage media they’re using and how it affects query times.

Let’s assume data is stored on disks distributed over many compressed blocks or nodes, and the nodes need to be decompressed before data can be read into memory and searched. The process would look something like this:

Next, let’s add some typical timings for hard disk and SSD data storage to find the expected time needed to return a match from a node when all nodes are stored in memory and different percentages of the nodes require retrieval from a disk.

	Percentage of nodes in memory
100%	90%	50%	0%
HDD	5 µs	525 µs	2605 µs	5205 µs
SSD	5 µs	40 µs	180 µs	355 µs

The search is obviously going to be much faster if all nodes are loaded into memory. Note retrieval time rises sharply whenever access to nodes outside of memory is needed, even with 90 percent of the nodes already in memory.

Because of the enormous difference in time between searching items in memory versus on disk, data lakes often use an internal LRU cache of nodes to read from and index to reduce the search times. This table would seem to show that getting the best performance possible involves making that internal LRU cache as large as possible. But when a query requests data from a file, the operating system doesn’t necessarily read it from the disk. Most data lakes run on Linux systems, and Linux has a page cache that holds all recent data read from a disk. Linux takes advantage of any unused memory for usage in the page cache and evicts items from that page cache if the memory is needed elsewhere. So, the memory used for the Linux page cache competes with the memory for the node cache.

This complicates time/data calculations because while the page cache holds compressed nodes, a data lake’s internal cache can hold decompressed nodes. More compressed nodes fit in the page cache than uncompressed nodes can fit in internal cache. Assuming a file has a typical compression ratio of 1:5, the system has enough memory to store all the compressed data in the page cache, and the page cache has been fully populated, how does that effect retrieval times?

If there is no internal cache and all memory is devoted to the page cache, each search will take 10+200+5 µs = 215 µs (as seen in chart below).

On the other hand, if all the memory is allocated to the internal cache it will only be able to store 20 percent of the data, so 80 percent will need to be read from the disk. The chart below illustrates this.

	Memory	%	Page	%	Disk	%	Total ms
Cached	5	0%	215	100%	n/a	0%	215
HDD	5	20%	215	0%	5205	80%	4165
SSD	5	20%	215	0%	285	80%	285

Interesting to note that query performance is faster when using the Linux page cache for storing data instead of internal cache.

How about 40 percent internal, 60 percent Linux page cache? Because the nodes are expanded by a factor of five, only eight percent will be in the internal cache and 32 percent will need to be read from the disk.

	Memory	%	Page	%	Disk	%	Total µs
HHD 40%	5	8%	215	60%	5205	32%	1792
SSD 40%	5	8%	215	60%	355	32%	240

What if the node only expanded by a factor of two in memory so more pages could be held in the internal cache?

	Memory	%	Page	%	Disk	%	Total µs
HHD 1:2	5	20%	215	60%	5205	20%	1168
SSD 1:2	5	20%	215	60%	355	20%	198

This comparison reveals a use case where using the internal cache is worthwhile – when the data is not highly compressed. But it only provides a benefit if data is stored on SSDs, not slower performing HDDs. This doesn’t mean that the lower the compression the better performance will be. Rather, poor compression means an index will be even bigger, so a smaller part of it will fit in the Linux page cache. What it does shows is that if you can avoid decompressing data, using an internal cache is faster.

In real-world use cases, data is not accessed uniformly. Some data will be accessed more often. What happens if half of the searches only access one-fifth of the data and all the memory is used for the internal cache?

	Memory	%	Page	%	Disk	%	Total µs
HHD	5	50%	215	0%	5205	50%	2605
SSD	5	50%	215	0%	355	50%	178

This reveals another case where the internal cache is now providing a benefit.

Finally, as a last thought experiment, what effect would reducing the compression time to 10 µs have?

	Memory	%	Page	%	Disk	%	Total µs
Cached	5	0%	25	100%	n/a	0%	25
HHD 100%	5	20%	25	0%	5015	80%	4013
SSD 100%	5	20%	25	0%	165	80%	133

The chart shows that when reading from SSD, search times are roughly halved, but when reading from the page cache the search time is reduced by a factor of 10. That kind of change would significantly speed up the searches but note the effect on the internal cache performance. The average search time when all memory is allocated to the page cache is now less than a fifth of the time when it is all used for the internal cache, previously it was three quarters.

These comparisons show how optimal internal cache sizes depend on the access patterns within the index, the compression ratio, how fast the storage disk is, and the time needed to decompress a node.

By considering the types of storage media they use, which data sets are most often accessed in queries, and using Linux page cache, developers can significantly reduce query times and to get faster access to business intelligence from their data.

…………………………………………

Gavin Halliday

Gavin’s primary focus is on the code generator, which converts ECL into the queries which run on the platform. Gavin enjoys working on problems together with the development team and the varied nature of the work keeps him engaged. Gavin shares how the platform compares with competitive platforms, including scalability and coding simplicity. He enjoys working on the platform and the elegant solutions the development team is able to implement. Gavin encourages people to give it a try!

Sponsored by HPCC Systems

Industry veteran Peter Brennan to lead Scality operations as CEO of US subsidiary, Scality, Inc.

admin — Fri, 19 Apr 2024 10:40:58 +0000

Expanded US role based on successful first year as global Chief Revenue Officer, Scality founder, Jerome Lecat remains as global CEO and Chairman

London, UK – April 18, 2024: Scality, a global leader in cyber-resilient storage for the AI era, today announced the appointment of Peter Brennan as Chief Executive Officer (CEO) of Scality Inc. Brennan, currently serving as the global Chief Revenue Officer (CRO) of Scality, will assume the additional role of CEO of US subsidiary, Scality, Inc. effective immediately. This strategic role expansion provides geographic focus as the company continues to grow rapidly across key regions with customers in nearly 70 countries worldwide.

Since joining Scality as CRO in March 2023, under Peter’s leadership, Scality has grown its US and worldwide sales team, established a two-tier sales channel with the addition of new distribution partners Ingram Micro, TD SYNNEX and a burgeoning Value Added Reseller (VAR) programme. Jerome Lecat, who co-founded Scality in 2009, remains in his current role as global CEO and chairman of Scality and continues to lead the company’s operations globally with a specific focus on EMEA, Asia Pacific and Japan.

In his new role, Brennan will lead Scality Inc.’s overall strategic direction, operations, and execution to grow the company’s footprint across the US. He will continue to play a vital role in driving revenue growth opportunities globally. With over two decades of experience in the technology sector, including leadership roles in sales, marketing, and business development, Brennan brings a wealth of knowledge and expertise to his new position.

“It’s been exciting to help lead Scality through an exceptional year of global growth. Now, I look forward to guiding the US operations as the chief executive to make a significant market impact as more organisations move to modernise their cyber-resilient and AI-ready infrastructures,” said Peter Brennan, CEO, Scality, Inc. “I anticipate that our newly expanded sales team and channel organisation will continue to deliver, and I will apply my own knowledge gained from over 20 years working in leading US sales organisations to support the team in achieving our goals.”

“Peter has surpassed our key business goals in his first year as Scality’s chief revenue officer which has been a contributing factor in gaining the confidence of Scality’s board of directors in recommending this promotion,” said Jerome Lecat, CEO and founder of Scality. “Peter’s extensive knowledge and expertise in the US market, coupled with his familiarity of the culture in this region, positions him as the right person to lead Scality to even greater heights in the coming year with this new role as CEO for our US operations.”

About Scality

Scality solves organisations’ biggest data storage challenges — security, performance, and cost. Designed to provide the strongest form of immutability plus end-to-end cyber resilience, Scality solutions safeguard data at five core levels for unbreakable ransomware protection. Delivering utmost resilience, Scality makes storage infrastructures limitlessly scalable in all critical dimensions. The world’s most discerning companies trust Scality so they can grow faster and execute AI data-driven ideas quicker — while increasing efficiency and avoiding lock-in. Recognised as a leader by Gartner, Scality S3 object storage software is reliable, secure and sustainable. Follow us on Twitter and LinkedIn. Visit www.scality.com and our blog.

Scality and TD SYNNEX partner to meet ransomware protection requirements

admin — Fri, 19 Apr 2024 10:38:29 +0000

Collaboration advances enterprise cyber-resiliency footprint across
hybrid environments with immutable object storage software

SAN FRANCISCO, USA – April 9, 2024: Scality, a global leader in cyber-resilient storage for the AI era, today announced a strategic partnership with TD SYNNEX to distribute Scality’s secure, simple object-storage software solutions to resellers serving enterprises in the DACH region. The partnership extends the reach of Scality’s advanced object-storage solutions to organizations seeking affordable, enterprise-grade ransomware backup solutions to strengthen their cyber security posture across hybrid-cloud environments.

For decades, TD SYNNEX has helped maximize the value of technology investments, demonstrate business outcomes, and unlock growth opportunities for its diverse customer base. As a premier distribution partner for Scality in the DACH region, TD SYNNEX will give enterprises across Germany, Austria and Switzerland an easy on-ramp to unbreakable ransomware protection with Scality’s cyber-resilient object storage software.

Christoph Storzum, Scality Vice President of Sales, European Region. “It’s exciting to have TD SYNNEX as part of our distribution partner network. It reinforces our channel-first strategy of building a diversified partner ecosystem that allows us to leverage the strengths of industry experts in key regions. We share a common goal — ensuring that organizations have the best defense against ransomware attacks, which are only getting more sophisticated and evasive due to AI. This is only the beginning of a partnership that has enormous growth potential for an extended network of VARs, and we are committed to providing all the necessary resources to support their success and ability to address their customers’ most pressing cyber resiliency and ransomware protection challenges.”

The partnership gives channel partners easy access to Scality’s industry-leading solutions for ransomware protection, hybrid cloud and AI. ARTESCA is a cyber-resilient S3 object storage software that delivers five levels of protection for immutable, ransomware-proof backups. It provides enterprise-grade capabilities at a low price-point to ensure organizations have a trusted last line of defense backup solution to strengthen their cybersecurity posture. Additionally, enterprises and service providers creating unbreakable cloud data centers can choose Scality’s RING software to support any combination of use cases thanks to its cloud-style economics and limitless, independent scale-out of capacity and throughput performance.

Michael Görner, TD SYNNEX Vice President Advanced Solutions & Maverick DACH: “Partnering with Scality supports our ability to ensure our solution provider network can adequately address the real-world ransomware protection needs of organizations that are now facing more menacing threats than ever. TD SYNNEX has a specialized team of experts who are eager to assist our solution providers in learning about Scality’s ransomware backup solutions so they can seize new market opportunities and, ultimately, help their clients become unbreakable against ransomware.”

Scality is the only independent and 100% software-defined storage company leading the Gartner Magic Quadrant for distributed file systems and object storagefor eight consecutive years. This market validation, coupled with Scality’s disruptive product innovation and partner-first growth strategy, has accelerated Scality solutions’ deployment across industries, including banking, healthcare and government entities to name a few.

Designed according to zero-trust principles, the company’s latest solution, ARTESCA, includes over 12 security innovations that set a new standard for cyber-resilient storage. ARTESCA is the only S3 object storage software that offers the strongest form of data immutability plus end-to-end cyber resilience to protect data at five core levels, ensuring zero data loss. Its ability to easily scale to multi-petabytes while offering impressive performance at low TCO sets it apart from both traditional solutions and other object stores. This unrivaled combination of benefits makes it an essential component of an enterprise’s cyber-resiliency toolkit.

About Scality

Scality solves organizations’ biggest data storage challenges — security, performance, and cost. Designed to provide the strongest form of immutability plus end-to-end cyber resilience, Scality solutions safeguard data at five core levels for unbreakable ransomware protection. Delivering utmost resilience, Scality makes storage infrastructures limitlessly scalable in all critical dimensions. The world’s most discerning companies trust Scality so they can grow faster and execute AI data-driven ideas quicker — while increasing efficiency and avoiding lock-in. Recognized as a leader by Gartner, Scality S3 object storage software is reliable, secure and sustainable. Follow us on Twitter and LinkedIn. Visitwww.scality.com and our blog.

“Time is Now: A Journey Into Demystifying AI.” Q&A with Raj Verma

Roberto Zicari — Wed, 17 Apr 2024 08:32:21 +0000

“Part of why I wrote Time is Now was to demystify this technology. Steve Jobs once compared the invention of the PC to a bicycle. I’ve always found that fascinating. With AI, we also have a new bicycle.”

Q1. You just published your first book, “Time is Now: A Journey Into Demystifying AI.” What made you decide that “now” was the best time to write this book?

We are at a crucial point in the development of AI. We’ve been talking about this technology since the 50s, when we saw the development of the first AI programs. Some of these were capable of playing checkers and solving algebra problems. In the following decades, AI development stalled because computers lacked key capabilities needed to power this technology, including the processing power to perform complicated calculations and access to vast amounts of data to generate answers.

That has all changed today. Computer processing is more powerful than most of us could’ve dreamed in the 50s, and the internet and cloud computing have made it much easier to access and store much more data.

Time is Now arrives at this crossroads moment, in which generative AI is finally a reality, but there is still much of it to be defined. We need to make important decisions and actions about how we develop it and how we want it to impact our world. We need to start looking at AI not necessarily as artificial intelligence but as amplified intelligence, because when used correctly, it can transform our lives for the better. One of the main reasons I chose to write this book is because I think we need to be aware that we are the ones steering this technology. We choose what path it will take.

We have had a narrow view of many technologies before, and that has led to serious consequences. For example, we dropped the ball with social media, and allowed these platforms to be developed in a way that creates addiction and other mental health issues among users – and concerningly in teenagers. We can’t afford to do that with AI. Technology isn’t the problem. What’s critical is the way we develop and use it. AI is so powerful, we must approach it differently, more responsibly, because we cannot afford to ignore its repercussions. We need to get it right, and to get it right, we need to act now.

Q2. Why was it important for you to take readers on “A Journey into Demystifying AI”? What do you think most people misunderstand about AI? Do you think experts need to change how they explain AI to the general public?

There’s a lot of uncertainty around AI and what it will bring. While some are racing to figure out a way to integrate it into businesses and become more productive, others are terrified that it will one day come to dominate humans. Before we jump into quickly adopting or rejecting it, we need to truly understand AI. Because if we truly understand it, what it is, and how it works, we have a much better chance at shaping AI in a way that benefits us. In other words, the only way we as a society can actually take control of the way we develop this powerful technology is for each of us to understand it.

Many fears about AI are not grounded in facts. AI is not likely to take control over humanity, as it does in the movies. But there are some serious risks involved, like AI bias, which is very serious and can perpetuate social ills like racism, or data leaks, which can put people’s identities and even lives in danger.

I am optimistic about AI’s future and our potential to use this technology responsibly and ethically. But those of us who dwell in the world of AI development need to do a better job at explaining it to society, to our leaders, so we can have a healthy conversation about how we develop it. After all, it’s hard to be excited about something you don’t understand. Confusion leads to anxiety and knee-jerk responses that do more harm than good.

We need to communicate that AI has been around for decades, and all the ways it benefits people’s lives already. We need to plainly explain how it works – that it relies on the data we feed it. Most of all, we need to emphasize what the responsible development and use of this technology looks like and how it will benefit society overall. Education is the path through which the misinformed can become more informed and the only way to mitigate people’s fears.

Q3. In your book, you talk about the “Three Pillars of Now”: information, context, and choice. Explain these concepts and why they are so important in the context of the book.

“Now” refers to the present, how we experience it, and how we react to it. Information refers to data – the wealth of knowledge available to us. Context is about understanding the circumstances and factors that are relevant to the information, and filtering that data through this lens. Choice is about what we do with this information, funneled through the right context.

I love golf, and it’s a helpful analogy to explain these concepts. Golfers have to make decisions in real time, based on the direction of the wind, the distance to the green, and the position of the water and sand traps. They have to remember how many points they need to get ahead in a competition, and consider all of this before deciding how to swing. The golfer is experiencing the “now” by recalling information within a specific context and making a choice.

We may not all golf, but this is what every single person does every moment of the day. Consequently, information, context and choice, are the three concepts that form the foundation of “now.”

These concepts are so important to Time Is Now because they are both critical to my personal journey and because they also form the “trinity of intelligence ” (TOI) framework I use to describe AI throughout the book. Through information, context, and choice, I’ve been able to bounce back from personal and professional setbacks, make challenging leadership decisions, and achieve my dream role as the CEO of a major Silicon Valley company. At the same time, I believe these capabilities are what distinguish the AI revolution from previous technological innovations – and they are key to scaling AI going forward. AI depends on information, or vast amounts of data, and the right context to make the choices necessary to provide us with the best answers and insight we humans need to generate the most impactful decisions.

Q4. You say we should view AI not as artificial intelligence, but as amplified intelligence. What do you mean by that?

AI is not sentient. It relies on data to function. I love to say that with AI, “the easy will become automated, the hard will become easy, and the impossible will become possible.” By using AI’s transformative capabilities, we can make breakthroughs and decisions faster and make more informed choices; all of which will amplify, or enhance, our existing intelligence. For example, if physicians can use AI to help comb through thousands of pages of medical research, then they can spend more time treating and supporting patients.

Furthermore, by focusing on the “amplified” instead of the “artificial,” we reinforce that humans are the ones steering the ship of this technology, and ultimately who have control over how it is used. By using AI responsibly, we can use it to help us make our lives better and to be more productive, so we can spend more time supporting and taking care of each other – the things that make us human.

Q5. In one year, we went from clearly fake, AI generated videos of an alien-looking Will Smith shoving pasta into his mouth to crystal-clear, 60-second Sora videos from OpenAI. What do you say to people who are apprehensive or scared about the rapid development of AI?

I understand why people are apprehensive. Fear of change and fear of risk are natural. However that is expressed to you – your gut, your inner voice, or your intuition – I feel it too.

But life is about navigating risks. If we don’t take risks, we can’t innovate. I like to say that we can’t let the fear of discovery prevent us from experiencing the joy of discovery. The good news about AI is that we are still in the driver’s seat, and thus we decide which path to pursue with this technology and how it is developed.

I also emphasize to people that generative AI relies on the data we feed it. Without data AI is nothing, and so by being intentional about the types of data we use for AI development, we can successfully mitigate many of the risks that so many people are concerned about.

We have a responsibility as leaders in the field to take action now, in the training process of AI, and carefully define its relationship with data, to shape the kind of future we want to live in with this new technology.

Q6. The tech ecosystem is rapidly adjusting to the growth of AI. Tell us what it is like to be the CEO of a company deeply involved in that change. What excites you, and what challenges do you face?

For SingleStore specifically, it is a very exciting time because, as you know, data is what fuels AI. As I said before, without data, AI is useless. And with the wrong data, AI can be dangerous. So data technology companies have a very big responsibility in the age of AI. The way we store, process and analyze data will determine AI’s potential to transform our lives.

Leading a data technology company during this moment in history is exciting, but also comes with a lot of uncertainty. We have made bets to stay ahead of the game in powering AI. For example, when we chose to migrate our data platform to the cloud. That was a huge decision, with many costs involved, but it paid off because the cloud helps us scale and adapt data to efficiently power AI. Today it is obvious that AI benefits from a cloud based data platform, because the algorithm has access to a limitless pool of data and resources on which AI models can be trained.

We also made the decision to make our data platform unique in supporting both analytical and transactional workloads. These capabilities make it a real time data platform, which, again, is crucial for AI to deliver insights that will help us make better decisions. If AI models aren’t fed the most up to date and accurate data, they are not going to be much better than the latest model of a computer. This is a groundbreaking technology because it can analyze data and deliver unprecedented insights. That is how it is going to transform humanity. And we, the data technology companies out there, have a huge responsibility in unleashing that potential.

Q7. You talk about your early life and career path throughout the book. What experience has had the biggest impact on your leadership style? How will leadership change in the age of AI?

It’s hard to narrow it down just to one, but I do have to say that the process of getting my first job was a pretty defining moment, both in terms of my life and my leadership style. I grew up in India, which is where I got my Bachelor’s degree in computer engineering. As I was about to graduate from college, I wanted to work at WIPRO, just like most of my classmates. The process of getting a job there involved taking a test. Which we all failed. I was pretty disheartened, as I felt my chances of working at WIPRO were ruined. But after processing this sadness, I refused to give up. One day I put on the only pair of professional looking pants I had and knocked on WIPRO’s door. I won’t tell you many details about what exactly happened next (you’ll have to read Time is Now to find out how things evolved!). But, you can deduce that in the end, things worked out for my career, because today I have the privilege and honor of leading one of the most exciting data technology companies in Silicon Valley.

And since then I have been bold in many leadership moments. For example, when the pandemic started, all our financial advisors were telling us we should take on significant amounts of debt and layoff staff. But my instinct told me otherwise: I wanted to avoid an interest rate problem down the road and, critically, because I believe people are at the heart of any organization. So we kept our staff positions and took only a minimal amount of debt. And today, our financial standing is resting on solid ground and our workforce feels supported.

This experience taught me that, while being informed on trends and what is happening with the economy is important, it is also crucial to listen to your instincts, be bold and have the courage to go against the tide.

Q8. If there is one thing you want readers to take away from your book, what would it be?

That knowledge is power when it comes to AI. Read, watch, and stay updated. AI is a technology that will revolutionize society, probably more so than how factories and the industrial revolution revolutionized production, cars and airplanes revolutionized transportation, and the internet revolutionized information and communications.

Part of why I wrote Time is Now was to demystify this technology. Steve Jobs once compared the invention of the PC to a bicycle. I’ve always found that fascinating. With AI, we also have a new bicycle. Like a bike, AI can help us go faster and farther, and be more agile. But like a bike, we must also pedal AI in the right direction as it develops. It’s up to us. Now is the time for us to take the reins and steer AI onto the path that can revolutionize humanity for good.

Resource

Time is Now: A Journey into Demystifying AI,” Raj Verma. Forbes Book. Publication Date: April 16, 2024

……………………………………………………..

Raj Verma is the CEO of SingleStore.

He brings more than 25 years of global experience in enterprise software and operating at scale. Raj was instrumental in the growth of TIBCO software to over $1 billion in revenue, serving as CMO, EVP Global Sales, and COO. He was also formerly COO at Apttus Software and Hortonworks. Raj earned his bachelor’s degree in Computer Science from BMS College of Engineering in Bangalore, India.

On Real-time data management. Q&A with Steinar Sande.

Roberto Zicari — Tue, 09 Apr 2024 18:57:25 +0000

Q1. Where do you position Raima in the overall database market? 

A major benefit of Raima ́s edge computing data management solution is that it improves time to action and reduces response time down to milliseconds, while also conserving network resources. The Raima concept of edge computing is not expected to replace cloud computing, but rather compliment the total solution. It fills a niche that a cloud-only solution can’t completely fulfill. Despite Raima ́s ability to reduce latency and network bottlenecks, edge computing can pose significant security risks that Raima is well-positioned to address.

While a solid and complete IIoT (Industrial Internet of Things) to Cloud solution can address issues related to security, cognition, agility, latency, and efficiency, Raima plays a key role in providing data management directly on the edge embedded devices. Raima ́s compact linkable database library can enable users to gather data in real-time, near the data generating source, and allows for summarization of that data so that critical actions and decisions can be made much sooner. Raima ́s database on the IoT (Internet of Things) edge helps safeguard against latency issues that come into play if these applications must wait for data to be directly moved to a central control center where a decision is not made in time.

Q2. What are the key features of a real-time data management?

Real-time data management is critical in various domains, including finance, telecommunications, and healthcare, where timely data processing and decision-making are crucial. When considering RaimaDB, a database known for its real-time capabilities, here are the key features of real-time data management it supports:

Low Latency: Real-time data management systems like RaimaDB are designed to offer very low latency in data processing. This means that data is captured, processed, and available for decision-making in milliseconds or microseconds, which is essential for applications that rely on up-to-the-moment data.

Concurrency and Multitasking: RaimaDB supports concurrent data access and multitasking. This means multiple transactions can occur simultaneously without compromising data integrity, which is vital for systems that require real-time responsiveness.

Real-Time Analytics: RaimaDB enables real-time analytics, allowing organizations to analyze data as it comes in. This is crucial for making immediate decisions based on the latest information, such as fraud detection in banking or patient monitoring in healthcare.

High Availability and Reliability: In real-time data management, system uptime is crucial. RaimaDB ensures high availability and reliability so that data is always accessible when needed, minimizing the risk of downtime.

Data Integrity and Consistency: Ensuring data integrity and consistency is crucial in real-time systems. RaimaDB maintains strict data integrity and consistency, ensuring that the data is accurate and reliable for real-time decision-making.

Scalability: Real-time data management systems must scale to handle increasing volumes of data. RaimaDB is designed to scale efficiently, accommodating growing data needs without compromising performance.

Event-Driven Processing: RaimaDB supports event-driven processing, allowing the system to react and process data as events occur. This is essential for applications where actions need to be triggered based on real-time data insights.

Distributed Data Management: RaimaDB can manage data distributed across various locations, ensuring efficient data synchronization and access in real-time applications, even in geographically dispersed environments.

Security: In real-time data management, securing data against unauthorized access and breaches is crucial. RaimaDB provides robust security features to protect sensitive data while it is being processed and stored.

Interoperability: RaimaDB supports interoperability with other systems and technologies, which is vital for integrating real-time data management into broader IT ecosystems and enabling seamless data flow between different platforms and applications.

These features make RaimaDB suitable for applications where real-time data processing and management are critical, allowing organizations to act swiftly based on the latest data insights.

Q3. Who is using RaimaDB and for what kind of applications? 

A&D companies like Boeing use Raima aboard the Airborne Warning & Control System to manage data to locate potential threats in the area. Industrial automation companies such as GE Power embed Raima’s database solution inside their mission-critical ALSPA ControPlant system for power plants managing historical power data. Automotive applications use Raima at companies like CARIAD for deployment in next gen autonomous driving vehicles such as Audi, VW and more. The database is deeply embedded inside the camera and ECU of the next generation electric vehicles. The database collects data from all sensors around the vehicle, combines the results with map data, stores it into the RaimaDB engine and then uses Raima’s rapid data retrieval and processing API’s to make nano-second decisions for the car’s advanced ADAS and visualization system. Healthcare companies like Infor use RaimaDB in an interoperability platform designed to assist with the receipt and sending of messages which move from one applicable medical software system to another. These are just some examples from a multitude of companies trusting RaimaDB.

Q4. In the industry-standard TPC-B test, your disk-based database was able to process over 2,000+ more transactions than other competitors on an X86 platform and over 20x more transactions than SQLite on an ARM platform. Can you please explain how this benchmark was performed?

We ran the TPC-B tests on two devices, one IoT Embedded Arm chip and one Desktop x86 platform. We benchmarked SQLite 3.37, PostgreSQL 12 and RaimaDB using the same test on the same specific hardware. We saw that SQLite and PostgreSQL were not able to process as many transactions per second as RaimaDB regardless of the number of clients.

Q5. RaimaDB vs. SQLite: what are the difference and similarities?

RaimaDB is similar to SQLite in that it is a small footprint database that can run in-memory or on disk for deeply embedded applications. The major differences include Raima’s ability to handle multiple users through an intuitive locking mechanism and to handle massive scalability when additional hardware is added. SQLite does not have any out-of-the box locking and scalability is limited when the hardware is increased. Raima is also platform independent, uniquely optimized to run in any environment and supports nearly every hardware combination and operating system. The Raima db is also highly optimized for flash media. This allows customers to prolong the lifetime of the application through less writes to the storage medium. Finally, Raima is extremely easy to use on whatever operating system target you choose with pre-packaged project files and integrations for Visual Studio, XCode, Makefiles, CMake, Wind River Workbench, & Green Hills MULTI.

Q6. What is your vision ahead for Raima?

The vision ahead for Raima is to continue to be a leading database provider but laser focused on solving the data management needs within the immerging & rapidly growing edge computing embedded market-space. Raima wants to provide customers secure access to the most up-to-date and relevant information, as close to the source of origin as possible, to enable the most accurate and informed decisions to be made much quicker than relying on a cloud-only based technology implementation.

Qx Anything else you wish to add?

In the ever-evolving IoT and Edge environment we see vast amounts of data flowing through. I believe it is important to compute and manage that data close to the source of origin to leverage all the computing power that is out there, reducing the stress on networks and the cloud. Making decisions right there and then leaving the devices more autonomous. I believe we are well positioned for that. Go to Raima.com to see for your self.

Resources

TPC Benchmarks Overview

Raima Unveils RaimaDB 16.0: Elevating Database Performance

………………………………………………….

Steinar Sande, CEO of Raima.
Steinar Sande has worldwide experience with sales of products to enterprice clients. Sande also holds extensive experience in engineering and P&L responsibility for AR/AP, software industry and Internet of Things . That includes Database technology, edge computing, Industrial Internet Of Things and challenges around edge computing.

On Databases and Data platforms for the Edge/Iot. Q&A with David Rolfe

Roberto Zicari — Tue, 09 Apr 2024 16:19:43 +0000

Q1. The database/ data platform market is undergoing a consolidation. What is your take on this?

A number of factors are working together. At one point the 451 Group was tracking around 130 separate database / data platform products. We’re now seeing steady shrinkage. Some of this is the natural operation of the free market. Literally every possible idea was tried; some worked and some didn’t. The market simply isn’t big enough to support 130 platforms, be they open source or otherwise.

But other factors are in play as well. The hyperscalers have also derailed the plans of companies that used VC funding to ‘launch’ an open source NoSQL platform. The original assumption was that enterprise support contracts from rich Fortune 500 companies would allow a commercial operation to succeed, while simultaneously nurturing an open source version. That changed the day hyperscalers started charging money for hosted versions of open source products. This, in turn, led to the Server Side Public Licence, which aims to prevent hosting plus reselling to the public. The recent shift to DBaaS is the latest move in this chess game.

As to where it’s all going? On-prem isn’t going away. Beyond that, it’s hard to predict with certainty.

Q2. Considering high cloud bills and that many Edge/Iot use cases have what might be called ‘finely balanced’ economics. Are we approaching ‘peak cloud’?

I think we’re at ‘peak cloud’. We’re getting a lot of interest as the long-term economics of cloud ownership become visible to the corner office. Take one example: One of the big advantages of cloud is that if you want to turn a three-node cluster into a five-node cluster you can just click your fingers and add two new instances. Or can you? Because unless you reserve them in advance there is no guarantee that when you ask for them you’ll actually get them. Then there’s the perverse incentives at play. If I own hardware and my workload starts exceeding what I can do, I can use my in-house DB skills to optimize the application so it fits. If I’m using DBaaS, the vendor will provide support, but their incentive is to get me to use more by-the-hour resources, not less.

Q3. ARM seems the favorite architecture for Edge/IoT. What is it? And why is it important?

ARM chips have been around for years in things like mobile phones but are now entering the server market. As a rule of thumb, an ARM chip is 40% faster than an x86 one, as they usually don’t have hardware multithreading. Edge likes ARM because it uses a lot less power than Intel and can be very affordable.

Q4. Why did you start doing TCO benchmarks on AWS Graviton, Intel C7 and even Raspberry Pi?

As a company, we’ve invested significant ‘blood and treasure’ on porting to ARM because our customers were asking us for that. Because Volt is a ‘close-to-the-metal’ product, this wasn’t a simple task, but we did it and got results. Pi was picked as a defined ‘minimum configuration’. Note that we don’t approve it formally for production as the memory cards can wear out, but a lot would depend on the use case, so that might change.

Q5. What did you discover in doing such TCO benchmarks?

Using our charging benchmark, AWS Graviton is 40% faster than AWS c5 and 20% faster than c7. Pricing is also very competitive, but since AWS could change it at any time it’s not what I’d ‘hang my hat on’ on for this one.

As for the Raspberry Pi benchmark, when we started testing ARM I asked our engineers what the smallest hardware we could run on would be, because a lot of Edge hardware is … austere. Nobody actually knew, so I made it my business to find out. For under Euro 1000 I was able to build a five-node cluster and get around 2,500 TPS on our benchmark. Assuming the hardware was written off after a year, the cost per transaction was, bizarrely enough, about the same as AWS.

Fifteen years ago, standing up a system that could support a workload of 2,000 TPS for an OLTP application would be a big deal. Now we can do it with the kind of hardware people do school projects on. The world is changing, and ARM is part of that change.

Q6. You recently published the Volt’s latest Yahoo! Cloud Serving Benchmark (YCSB) numbers. What did you discover this time?

This is the first time we’ve started to closely track TCO as well as actual results. Getting to 1,000,000 TPS in YCSB isn’t helpful if it costs you US$1,000,000 to do it. What we’re seeing is that we can get 500K TPS for about US$2.50 an hour. We think we can go much higher, but at 500K TPS we’re starting to saturate the network available to us in AWS. We could go bigger if we set up a custom network, which we may try later this year.

Q7. You announced your latest release, Volt Active Data 13.1. What is special about it?

Many individual things:

In-service upgrades: ie, the ability to upgrade the software on a Volt cluster while it’s running.
ARM64 CPU support, which we’ve discussed above.
Elastic reduction of Kubernetes cluster sizes. Elastic shrink is very, very hard to do in Kubernetes.

The big promise of ‘cloud native’ is that a lot of the stuff we used to regard as scary and time-consuming is automated and not a problem anymore. But when it comes to elastic add and shrink, many aspects of how K8tes works make it very hard to implement. Both of these features were hard to implement, but we did it.

Q8. What’s next?

We’re continuing research into an Edge-specific offering and an API offering.

There are some interesting ideas out there about having a local database that’s mirrored to a cloud, the same way Apple’s iCloud does for your photos. This makes Edge activity visible centrally and also makes it really easy to replace the Edge system if it fails.

On the API front, we’re looking at a way to make Volt available as a web service with zero client coding.

Resources

VOLT APPROACHES 500K TPS FOR <$2.50 PER HOUR ON YCSB

WHY V13.1 IS OUR MOST IMPORTANT RELEASE YET January 16, 2024

………………………..

David Rolfe brings 20+ years of experience managing data in the telecom industry. David helps telecom software vendors meet the scale and latency requirements imposed by 5G data utilizing Volt Active Data. He helps companies take the steps they need to deploy mass-scale, ultra-low latency
transactional applications in cloud-native environments. He has over 25 years of experience with high-performance databases and telco systems and demonstrated expertise with charging and policy systems. He has authored multiple patents relating to geo-replicated conflict resolution.

Establishing Total Cost of Ownership for a Big Data Platform

admin — Tue, 09 Apr 2024 09:07:35 +0000

By Hugo Watanuki, HPCC Systems

Word count: 1,011

In today’s enterprise, a successful big data strategy can mean the difference between success and failure. For example, Netflix reports the company is able to save $1 billion a year from customer retention thanks to its use of big data analytics, and enterprises in every other vertical market are following suit. But adopting a big data strategy is a big undertaking for enterprises, and there are a host of questions an IT team must answer before they can decide on the best big data platform for their needs.

Should the enterprise use an on-premises or cloud-based infrastructure for data analytics? Which data analytics software best fits the organization’s use case? Does the IT team have the requisite experience and expertise to implement a big data solution? Will the chosen big data platform still meet an enterprise’s business needs in 12 months? What about in 5 years?

In addition to confirming a potential big data platform has the performance and features needed, IT teams should also consider other factors if they are to gain a true understanding of the capital and operational expenses required to keep a big data platform operational and scalable.

Here’s a list of criteria IT teams should consider when choosing a big data platform.

Fees for seat licenses – Proprietary big data platforms require enterprises to purchase licenses, often sold on a per seat basis. As the platform grows and matures, additional IT staff and seat licenses may be necessary, so license fees could increase over the long term. To reduce license costs, enterprises should also consider free, open-source big data platforms like HPCC Systems or Spark.
Technical support – If the platform isn’t performing as required, enterprises need access to technical support resources that can quickly and effectively solve the problem. But identifying which vendor is responsible for providing a technical fix in mixed-vendor IT environments is difficult, and it’s common for customers to be stuck without a fix for their problem because their vendors can’t agree which component of the environment is at fault. Technical support is particularly important for security. If a platform is orphaned by its vendor, security patch development ceases, leaving the platform’s users and data at risk.
Third-party software – Many big data platforms are sourced from multiple vendors; each vendor providing a solution for a specific stage in the big data pipeline. Enterprises need to be sure they understand all the software they’ll need to install and support to get the functionality they require.
Compute resources – The amount of compute processing and storage a big data platform needs will vary over time. Many businesses have seasonal data workloads that rise and fall throughout the year. This can lead to overprovisioning: allocating extra capacity to manage spikes in workloads only to watch that same capacity go unused during periods of low activity. Furthermore, does the platform selected require high-end computing and storage components or can it perform well using lower-tier options as well?
Cloud support – Cloud-based infrastructures allow enterprises to scale their compute and storage capacity up or down in real time to keep overprovisioning under control. Is the platform selected well adapted to this paradigm? Cloud computing also has security risks that may require specific security capabilities to meet regulations and SLAs around data privacy, sovereignty, and security, and IT teams will need to make sure any cloud-based big data solutions comply with those requirements.
Staffing – IT managers need to determine if their choice of big data platform will require adding additional staff to address any skill gaps or increase code output. What skills will those new hires need? As big data grows in popularity, potential hires with expertise in big data and cloud computing will be in high demand and their salaries will reflect this. IT managers must keep staff payrolls in mind when considering a big data platform’s total cost of ownership.
Implementation time – After a platform is selected, how long will it take to get the platform up and running? Sourcing software and hardware from different sources can cause compatibility issues that must be addressed before the platform goes live, potentially delaying the platform’s launch date.
Ongoing maintenance – Once a big data platform is operational, how much ongoing opex will it cost to keep it running? How frequently is the platform updated and how simple it is to install those upgrades? If the platform’s processing and storage capacity need to expand in the future, how long will that expansion take and how much will it cost?
Flexibility – If an IT team requires its big data platform to support specific features, what resources are available to provide that feature if the platform’s vendor is unable or unwilling to build it?
Developer ecosystem – Is there a robust, global network of developers working on value-added projects for the platform? Does an enterprise need their big data platform to support a specific vertical industry? Or a particular application? The larger a big data platform’s developer community, the more likely software for specific industries or use cases is already available.
Reliability/maturity – Is the platform’s technology new and without extensive real-world testing? Is the vendor a startup who may not be around to support their technology, or currently not able to scale to meet demand for support? Do they have good technology AND good customer service? Can they provide localized support resources for different regions?
Data support – Does the platform process data in any format? Does data in different formats work well together? Data in different formats often end up siloed in separate components that don’t communicate with one another, which can lead to inaccurate or incomplete data analysis.

Big data can transform a business. But before rushing into selecting and implementing a big data platform, be sure you take the considerations outlined above into account. By doing so, you can spare your organization from making poor, ill-informed decisions that can lead to significant problems and costs down the road. To learn more about Total Cost of Ownership for a big data platform, please consult our white paper.

…………………………………………

Hugo Watanuki, Information Systems Engineer, HPCC Systems

Predictable Real-time Data Management Over Persistent Storage Media

admin — Fri, 05 Apr 2024 08:53:47 +0000

3 April 2024, Seattle, USA: McObject®, a leading provider of embedded database management systems, is set to present an intriguing topic at the Embedded World 2024 conference. The presentation, titled “Predictable Real-time Data Management Over Persistent Storage Media,” will be delivered by Steve Graves, McObject’s CEO, and Thom Denholm, Technical Product Manager of Tuxera.

Here are the details:

Date: April 11, 2024

Time: 10:00 – 10:30 (Session 5.7 Memory)

Topic: Predictable real-time data management over persistent storage media

Abstract: Embedded designs with predictable performance requirements face timing constraints. Their response to events must fit within known time frames. As hard real-time systems become more complex, managing data becomes increasingly challenging. Additionally, the need for data retention may require the introduction of persistent media such as NAND flash or SSD. However, these storage mediums introduce unpredictable latency. How can timing constraints be met in such scenarios?

This presentation explores predictability aspects of real-time databases and introduces deadline management algorithms for transient and persistent storage devices. It discusses traditional transaction processing policies and their deficiencies in achieving transaction predictability. The ShadowCopy (copy-on-write) technique is introduced as a better approach for hard real-time transaction processing. The presentation demonstrates the applicability of the Shadow Copy-based parametric I/O model to NAND flash and SSD using actual storage stacks.

The Embedded World conference in Nuremberg, Germany, is a gold standard for exchanging information about embedded systems, and McObject’s participation is highly anticipated. If you’re attending the conference, make sure to catch this insightful session!

For more information, please visit McObject’s events page: www.mcobject.com/events.

McObject, founded in 2001, strives to anticipate and meet the requirements of the market. A market leader, its database has been employed in virtually every industry including aerospace and defense, financial systems, transportation, telecoms and industrial automation.

# # #

Editorial contacts

Shannon Levy

email: press@mcobject.com

Tel: +1-425-888-8505

About McObject

Founded by database and real-time systems experts, McObject offers proven, ultra-fast data management technology, used across a wide range of industries and market segments. The company’s background and expertise in defense and aviation sectors means that its technology is exceptionally reliable and robust.

McObject counts among its customers industry leaders such as BAE Systems, TradeStation, Siemens, Philips, EADS, JVC, Pentair, F5 Networks, CA, Motorola and Boeing. eXtremeDB uses powerful, industry-standard tools and languages, such as SQL, Python, C/C++, Java and C#. For more information please visit www.mcobject.com/ and follow McObject on and LinkedIn and Twitter.

McObject and eXtremeDB are registered trademarks of McObject LLC. All other company or product names mentioned herein are trademarks or registered trademarks of their respective owners.

On Modernising the Data Architecture Stack. Q&A with Matt Newman.

Roberto Zicari — Wed, 03 Apr 2024 12:55:52 +0000

” It can be difficult to work with legacy architectures and multiple multi-year migration projects that are in-flight at the same time.”

Q1. What are you currently working on at Adobe?

I am involved with integrating Adobe Workfront data and systems into various other Adobe products. I am helping with Adobe GenStudio, which is an end-to-end solution to accelerate and simplify your content supply chain with generative AI and intelligent automation. I am also working on a customer-facing, ad-hoc reporting solution, replacing the legacy version of this with a new modern one that will democratize reporting for all users.

Q2. What are the main challenges you face when modernising the data architecture stack at Adobe?

It can be difficult to work with legacy architectures and multiple multi-year migration projects that are in-flight at the same time. We migrated from Oracle to PostgreSQL, changed messaging technology from NATS to Kafka, disaggregated a monolith into microservices and onboarded onto the Adobe Business Platform (for authentication) all at the same time. It is also a challenge being an acquired company and sorting out which technologies we will adopt broadly, and which ones we will continue to use that we are using before the acquisition. The timing of all of these projects can be overwhelming to sort out.

Q3. How do you handle legacy systems?

We generally start communicating that a technology or system is deprecated, and then encourage everyone to build any new systems on the new technology. We accept that we will likely need to run parallel systems for quarters or even years, and we plan that in. Usually at the end, when we are 80% there, we start aggressively pushing to the new solution and decommissioning the old systems. The projects are particularly difficult when they require customer interaction.

Q4. What are the benefits for your customers?

As we modernize our architecture and technology, our customers get a better, faster experience. They are able to more efficiently plan and execute large-scale projects across their enterprises, and then analyze the results of their work. We are helping companies manage all the complexities and interdependencies of their work.

Q5. What are the lessons you learned so far in using generative AI?

We have learned a lot with generative AI, both in terms of training our own models (e.g., Firefly), and also using pre-trained LLMs from OpenAI, Anthropic and Google. We also have been playing with many open-source projects like Llama and Mistral.

There has been a lot of hype, and sometimes it takes a while for the real business value to come in, but it does come in slow and steady. It will take a while for us to find the ways that generative AI really benefits customers, and then monetize that, but once we do, it will be very strategic for our future. We have all kinds of uses for generative AI on the creative side with generating images, videos, 3D models, etc. Our customers have loved generating content within Photoshop and other tools, and as we enable them to generate content for marketing campaigns and automate even more of their marketing workflows, they will be delighted.

Q6. Why SingleStore’s latest release (SingleStore Pro Max) is of interest for you?

I am excited about SingleStore Pro Max. Features like better database branching and Smart DR will enable us to efficiently scale our deployments. Kai will be interesting for us as we are heavy users of MongoDB, and it will be really nice to do analytical queries using the same MongoDB queries we already run. I also love the free shared tier because it will help other Adobe groups to get exposure to using SingleStore without having to commit a lot upfront.

Q7. Specifically, which of new SingleStore Pro Max features could help you and your customers to take full advantage of generative Al capabilities? Why?

The improved vector search will enable us to use SingleStore in more cases for Retrieval Augmented Generation (RAG). We also use it for similarity and semantic search use cases.

Q8. Could you give some examples of how your customers are currently using AI?

Our customers use Firefly to generate images for marketing campaigns. They also use generative fill and expand features in Photoshop. On the Adobe Experience Cloud side, we have features to generate content (images, marketing copy, etc.) for tasks, projects or even entire marketing campaigns. In Workfront, when a user is assigning a task to someone in our application, we now have a “Smart Assignment” feature that recommends users most likely to be assigned based on similarity search on vectors.

Q9. Anything else you want to add?

I think SingleStore is a great data platform for building real-time analytics. The fast columnar storage enables us to ingest data and query it all in under five seconds. It also has all the enterprise features that we need as a larger company.

…………………………………

Matt Newman, Principal Data Architect at Adobe

I currently am a Principal Data Architect at Adobe. We are changing the world through digital experiences. I personally am working on defining architectural patterns for source databases (MongoDB and PostgreSQL), eventing (Apache Kafka), and destinations (Elasticsearch, Snowflake, Azure Data Lake, etc.). This enables our customers to operate projects and campaigns more efficiently, and gain insights from joining Workfront and Adobe together.

Previously, I have built data pipelines at Cox Automotive (Dealertrack), Dsco (supply chain), and Taulia (fintech company that does e-invoicing and supply chain finance). I have used technologies including MySQL, Amazon Redshift, Amazon Kinesis, Amazon Lambda, Python, etc. Prior to that, I built out the data and analytics platform at Kuali, an edtech startup based in Lehi, Utah. I also managed the database team at the LDS Church, which provides database solutions that power most of the applications of the Church.

Resources

SingleStore Announces Real-time Data Platform to Further Accelerate AI, Analytics and Application Development. SAN FRANCISCO — January 24, 20