What’s New in Vertica 9.0?
This blog post was authored by Soniya Shah.
In Vertica 9.0, we introduce new functionality including:
• Eon Mode Beta
• Supported Platform Updates
• Machine Learning Enhancements
• Apache Hadoop Integration Updates
• Partition Grouping and Hierarchical Partitioning
• Browsing S3 Data Using External Tables
• Support for the UUID Data Type
Eon Mode Beta
Vertica 9.0 allows you to operate your database in Eon Mode Beta, using Amazon Web Services to capitalize on cloud economics while still enjoying the fast query processing Vertica is known for. Running Vertica in Eon Mode Beta separates the computational processes from the storage layer of your database. This new architecture enables Vertica to scale elastically, adapting to various workloads.
Eon Mode Beta is not intended for production environments, and Vertica does not provide technical support for Eon Mode Beta users. You can, however, use the VerticaBeta Forum to ask and answer questions about Eon Mode Beta. For more information, see Using Eon Mode Beta in the Vertica documentation. Stay tuned for our blog post about Eon Mode Beta!
Supported Platform Updates
In each Vertica release, we try to expand our list of supported platforms. In this release, we have added support for the following:
• Hortonworks Data Platform (HDP) 2.6
• Cloudera (CDH) 5.11
• Oracle Enterprise Linux (OEL) 6.9
• Logical Volume Manager (LVM) on all supported platforms
Support for the following is deprecated in Vertica 9.0 and will be removed in a future version:
• Spark 1.6
• Scala 2.10
Vertica no longer supports Ubuntu 12.04. For a more comprehensive list of changes, see Vertica 9.0.x Supported Platforms.
Machine Learning Enhancements
In this release, we introduce seven new functions to help make machine learning in Vertica even faster and easier to use! You can now perform cross validation in Vertica using the new CROSS_VALIDATE function, which evaluates a model on several held-out portions of your data set rather than on a single split. We’ve also extended support for one hot encoding with two new functions that allow you to convert categorical columns to binary ones. You can now also import and export models to other Vertica clusters.
We’ve also added two new summary functions: the GET_MODEL_SUMMARY function replaces the SUMMARIZE_MODEL function, and the SUMMARIZE_NUMCOL function provides a statistical summary of each numerical feature in a data set. For more information, see Machine Learning for Predictive Analytics in the Vertica documentation. And look out for our blog post about machine learning enhancements in this release!
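To make the idea behind cross validation concrete, here is a minimal Python sketch of k-fold evaluation. It is not how CROSS_VALIDATE is implemented or called; the function names, the contiguous fold layout, and the trivial "predict the training mean" model are all assumptions chosen for illustration.

```python
from statistics import mean


def k_fold_indices(n_rows, k):
    """Split row indices 0..n_rows-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_rows, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds receive one additional row.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(values, k):
    """Return the mean squared error of a mean-predictor on each held-out fold."""
    scores = []
    for fold in k_fold_indices(len(values), k):
        held_out = set(fold)
        train = [v for i, v in enumerate(values) if i not in held_out]
        test = [values[i] for i in fold]
        prediction = mean(train)  # hypothetical "model": predict the training mean
        scores.append(mean((v - prediction) ** 2 for v in test))
    return scores


if __name__ == "__main__":
    # One error score per fold; averaging them gives the overall estimate.
    print(cross_validate([2.0, 4.0, 6.0, 8.0, 10.0, 12.0], k=3))
```

Each row is used for testing exactly once and for training in every other round, which is why the averaged score tends to be a steadier estimate than a single train/test split.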
Apache Hadoop Integration Updates
In this release, there are four major updates to Apache Hadoop integration. You can now specify HDFS storage locations using URLs in the hdfs scheme, just as you can for other Vertica Hadoop interfaces. Using hdfs instead of webhdfs improves performance.
The HCatalog Connector now integrates with Sentry to manage authorization for Hive data. You can use the ALTER HCATALOG SCHEMA statement to modify many of the parameters of an HCatalog schema. You can also load data or create external tables from Parquet and ORC data stored in S3 buckets.
Partition Grouping and Hierarchical Partitioning
You can now consolidate partitions into groups that minimize use of ROS storage. Reducing the number of ROS containers used to store partitioned data helps facilitate DML operations such as DELETE and UPDATE, and helps avoid ROS pushback. For example, you can group date partitions by year. When you do so, the Tuple Mover allocates ROS containers for each year group and merges individual partitions into these ROS containers accordingly.
The new meta-function CALENDAR_HIERARCHY_DAY leverages partition grouping. This function organizes a table’s date partitions into a hierarchy of groups: the oldest date partitions are grouped by year, more recent partitions are grouped by month, and the most recent date partitions remain ungrouped. Grouping is dynamic: as recent data ages, the Tuple Mover merges its partitions into month groups, and eventually into year groups.
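The year/month/day hierarchy described above can be sketched in Python as a function that maps each date to the key of the group it would fall into. This is an illustration of the grouping rule only, not Vertica's implementation: the parameter names active_months and active_years, their default values, and the choice to measure age in whole calendar months and years are assumptions.

```python
from datetime import date


def calendar_hierarchy_day(day, today, active_months=2, active_years=2):
    """Return the date that identifies the partition group for `day`.

    Assumed rule: dates at least `active_years` calendar years old share a
    per-year group, dates at least `active_months` calendar months old share
    a per-month group, and newer dates keep their own daily partition.
    """
    years_old = today.year - day.year
    months_old = years_old * 12 + (today.month - day.month)
    if years_old >= active_years:
        return date(day.year, 1, 1)          # grouped by year
    if months_old >= active_months:
        return date(day.year, day.month, 1)  # grouped by month
    return day                               # recent: left ungrouped


if __name__ == "__main__":
    today = date(2017, 10, 15)
    for d in (date(2014, 3, 9), date(2017, 5, 20), date(2017, 10, 1)):
        print(d, "->", calendar_hierarchy_day(d, today))
```

Because the result depends on today's date, the same row moves from a day key to a month key and later to a year key as it ages, which mirrors the dynamic regrouping described above.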
Browsing S3 Data Using External Tables
You can create external tables to query Parquet, ORC, text, and delimited data stored in S3 buckets. External tables work the same way in Eon Mode Beta and Enterprise mode.
Support for the UUID Data Type
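As general background on the format (independent of how Vertica stores or validates values), a UUID is a 128-bit identifier usually written as 32 hexadecimal digits in five hyphen-separated groups. The short Python sketch below uses the standard uuid module to show that different spellings of the same value normalize to one canonical form; the helper name normalize_uuid is hypothetical.

```python
import uuid


def normalize_uuid(text):
    """Parse a UUID string and return its canonical lowercase hyphenated form.

    Raises ValueError if `text` is not a valid UUID.
    """
    return str(uuid.UUID(text))


if __name__ == "__main__":
    # With or without hyphens, upper or lower case: the same 128-bit value.
    print(normalize_uuid("12345678123456781234567812345678"))
    print(normalize_uuid("ABCDEF01-2345-6789-ABCD-EF0123456789"))
    # A freshly generated random (version 4) UUID.
    print(uuid.uuid4())
```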
For a full list of new features, see the New Features Guide. And be on the lookout for our What’s New series, where we’ll give you an in-depth view of our newest features.