Release Update 23.6 Features

AI Vector Search: New Vector Distance Metric

AI Vector Search was extended in 23.6 to include a new vector distance metric called Jaccard distance.

This feature allows users the option to use another distance metric between vectors. 

View Documentation

Hybrid Vector Index

The Hybrid Vector Index (HVI) is a new index that allows users to easily index and query their documents using a combination of full text search and semantic vector search to achieve higher quality search results.

The Hybrid Vector Index (HVI) simplifies the process of transforming documents into forms that are amenable both for vector similarity search and for textual search through a single index DDL.

The HVI provides a unified query API that allows users to run textual queries, vector similarity queries, or hybrid queries that leverage both of these approaches.  This lets users easily customize the search experience and enhances search results.

View Documentation

Partition-Local Neighbor Partition Vector Index

This feature enables LOCAL indexing for Neighbor Partition Vector Indexes, optimizing search performance for partitioned tables. This feature conceptually creates a dedicated vector index for each partition, allowing queries with partition key filters to search only the relevant index partitions. As a result, vector searches are more efficient, leading to significantly lower response times when querying large partitioned datasets.

Large enterprise datasets are frequently partitioned by relational attributes to optimize performance. By enabling LOCAL Neighbor Partition Vector Indexes, users benefit from enhanced scalability and accelerated query performance through partition pruning. This approach also ensures more efficient data lifecycle management, making it ideal for handling large-scale enterprise workloads.

View Documentation

Persistent Neighbor Graph Vector Indexes

The HNSW vector index is an inmemory resident multi-layered graph index. The time taken to recreate the inmemory graph on a restart can be improved by having a disk checkpoint image of the graph. This feature adds the checkpoint format as well as the framework to take a disk checkpoint and then use it to recreate the inmemory resident graph structure.

Getting an index access plan after a restart can take a long time. A higher priority disk checkpoint based reload execution improves the time taken to get an index access plan after a restart.

View Documentation

Sparse Vectors

Sparse Vectors are vectors that typically have large number of dimensions, but only a few dimensions have non-zero values. These vectors are often produced by sparse encoding models such as SPLADE and BM25. Conceptually, each dimension in a sparse vector represents a keyword from a specific vocabulary. For a given document, the non-zero dimension values in the vector correspond to the keywords (and their variations) that appear in that document.

Sparse vectors, such as those generated by models like SPLADE and BM25, often outperform dense vectors from models such as BERT, in terms of keyword sensitivity and effectiveness in out-of-domain searches. This superior performance is especially valuable in applications where precise keyword matching is crucial, like in legal or academic research. Additionally, sparse vectors are often used for Hybrid Vector Search, where they can be used alongside dense vectors to combine semantic searches and keyword searches, and provide more relevant search results.

View Documentation

Transactional Support for Neighbor Graph Vector Indexes

HNSW Index is an in-memory hierarchical graph index for vector data. In 23.4, and 23.5, DMLs were not allowed on tables that have HNSW index built on their vector column(s). This feature enables transactions to be executed on such tables. Moreover, vector search queries that use the HNSW Index will see transactionally consistent results, based on their read snapshot. Transactional consistency is guaranteed even on Oracle RAC where the HNSW Index is duplicated on all instances in the Cluster, DMLs occur on one or more instances in the Cluster, and search queries can be executed on any instance in the Cluster.

HNSW Index is the fastest vector search index that Oracle offers in 23ai. Thus, customers want to use HNSW Index for search queries, while also issuing DML modifications on relational or vector columns in the underlying table. Since DMLs may render the in-memory HNSW Index structures stale, special protocols are added in this project to guarantee transactionally consistent results for customers.

View Documentation

Vector Format Output for Feature Extraction Algorithm

Feature extraction algorithms produce a set of features that represent projections in a lower dimensional latent space. The output is typically numerical and dense. VECTOR type representation is the natural choice.

Feature extraction algorithms represent a principled approach to vectorizing relational data. The vectorized representation can be used for similarity search.

View Documentation

GoldenGate Replication of JSON-Relational Duality Views

This feature allows developers to use Oracle GoldenGate technology to replicate JSON-relational duality view data as JSON documents, instead of relational tables, from an Oracle Database to a target Oracle or non-Oracle database.

Replication of JSON data is an important feature for high availability, fail-over, and real-time migration from a non-Oracle database to Oracle Database.

Oracle GoldenGate Replication to non-Oracle databases such as MongoDB (a document database) or Redis (a NoSQL key/value store) is simple and performant with the ability to replicate JSON documents from JSON-relational duality views.

Developers need not write complex JSON and SQL transformations to shape data for relational and document-based updates, or pay the cost of reconstructing application objects on the target database.

See Replicating Business Objects with Oracle JSON Relation Duality and GoldenGate Data Streams.

View Documentation

Hidden Fields in JSON-Relational Duality Views

A column in a table underlying a JSON-relational duality view can be mapped to a hidden field; that is, a field that's not present in the documents supported by the view.

A generated field in a JSON document supported by a JSON-relational duality view can use the value of hidden fields.

A JSON document supported by a duality view can be simpler if it need not have a field for every underlying column, in particular for columns needed only for calculating.

View Documentation

Inline Augmentation in JSON-Relational Duality Views

JSON-relational duality views generate JSON documents with objects whose fields map to columns of underlying relational tables. 

You can augment the objects by adding calculated fields. For example, you can declare a JSON field totalCompensation whose value is calculated from a SQL expression that adds the values from columns BONUS and SALARY. Another example would be to add up the prices of all line items in an order. Calculated fields are read-only.

Users of duality views can convey more information to an application using augmentation. For example, if you know the size of a nested array then you can allocate appropriate space before iterating over the array. Field calculations (e.g. a purchase-order Total) can be done in the database instead of separately in multiple client applications. Because calculated fields can use masking operators, you can hide some information in an object - a feature similar to redaction. 

View Documentation

JSON Collection Views

JSON collection views are special, read-only database views that expose JSON objects in a JSON-type column named DATA.

JSON collection views are conceptually close to JSON-relational duality views, but they have fewer restrictions because they are read only.

View Documentation

JSON Replication

The logical_replication_clause of the CREATE/ALTER TABLE statement is extended to allow disabling and enabling of partial JSON updating under supplemental logging.

Partial JSON updating makes replication more efficient because less data needs to be replicated or modified. The change can be replicated on the remote side, instead of sending all updated data. Using this functionality, you will experience better performance on the same hardware, thus reducing hardware costs.

View Documentation

JSON Search Index Path Subsetting

When creating a JSON search index you can specify the fields to include or exclude from indexing: path subsetting.

Path subsetting can reduce the size of a search index and improve its performance.

View Documentation

Replication Support for JSON Collection Tables

JSON Collection Tables can be enabled for logical replication using GoldenGate. Replication is supported to and from JSON Relational Duality Views as well to and from third-party products, such as MongoDB.

Replication is a basic database functionality that works between Oracle Databases. It is also used to facilitate online migration to Oracle Database from third-party databases, such as MongoDB.

View Documentation

Enhancements to Oracle Data Redaction

This release includes many enhancements to Oracle Data Redaction such as, the optimization of existing capabilities and the removal of previous limitations.

These enhancements to Oracle Data Redaction improve the overall experience.

View Documentation

Sessionless Transactions

Managing a transaction requires the connection and session resources to be tied to the transaction throughout its lifecycle. Therefore, the session or connection can be released only after the transaction has ended. This often results in underutilization of sessions/connections. In Sessionless Transactions, after you start a transaction, you have the flexibility to suspend and resume the transaction during its lifecycle. The session or connection can be released back to the pool and can be reused by other transactions, therefore effectively being able to multiplex transactions and sessions/connections.

Sessionless Transactions provide ability for applications to suspend and resume transactions across sessions/connections (single instance or RAC) without the need for an external transaction manager, and without the application having to coordinate the commit and recovery protocols. The database manages transaction lifecycle, including commit and recovery. Application performance, and throughput, benefit from reduced commit latency since fewer client-server roundtrips are needed. Since external coordination is not required, using Sessionless Transactions results in vastly simplified mid-tier or app-tier infrastructure, and significantly decreases downtimes when compared with externally coordinating transactions (such as with XA).

View Documentation