-
Notifications
You must be signed in to change notification settings - Fork 594
HDDS-8655. Optimize listKeys by eliminating redundant seek. #9542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
HDDS-8655. Optimize listKeys by eliminating redundant seek. #9542
Conversation
3ed3c4a to
2277672
Compare
|
Thanks @armstrong0704 for the patch. Please enable the build-branch workflow in your fork. |
ab850ad to
2a6cdd0
Compare
2a6cdd0 to
2d244f6
Compare
|
@ivandika3 thanks for the input. Linters are making additional formatting. Any Guidance there? |
2d244f6 to
29cb2c7
Compare
|
@armstrong0704 Could you disable the auto format in the linter? Afterwards, if you are using IntelliJ IDEA, you can go to "Code Style" and import the https://github.com/apache/ozone/blob/master/hadoop-hdds/dev-support/checkstyle/checkstyle.xml |
|
Please revert all the whitespace and comment formatting changes. Its making this diff a lot bigger than it needs to be. |
29cb2c7 to
09129df
Compare
|
@sodonnel Done. |
|
@vyalamar do you wanna review this patch? |
adoroszlai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @armstrong0704 for the patch.
|
|
||
| RDBStoreByteArrayIterator(ManagedRocksIterator iterator, | ||
| RDBTable table, byte[] prefix, IteratorType type) { | ||
| RDBTable table, byte[] prefix, IteratorType type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| RDBTable table, byte[] prefix, IteratorType type) { | |
| RDBTable table, byte[] prefix, IteratorType type) { |
nit: Please disable Align when multiline in settings (if using IDEA).
| RDBStoreByteArrayIterator(ManagedRocksIterator iterator, | ||
| RDBTable table, byte[] prefix, IteratorType type, byte[] seekKey) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Please do not format method signature like this. Whenever visibility / return type / method name / other modifiers are changed, we would have to reindent all parameters.
| * | ||
| * @param beginKey start metadata key | ||
| * @param endKey end metadata key | ||
| * @param endKey end metadata key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * @param endKey end metadata key | |
| * @param endKey end metadata key |
| /** | ||
| * Covert the given key to the {@link RAW} type. | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's not change this unnecessarily
| /** | |
| * Covert the given key to the {@link RAW} type. | |
| */ | |
| /** Covert the given key to the {@link RAW} type. */ |
| keyIter = getKeyTable(getBucketLayout()).iterator(null, seekKey)) { | ||
| readFromRDbStartNs = Time.monotonicNowNanos(); | ||
| KeyValue< String, OmKeyInfo > kv; | ||
| keyIter.seek(seekKey); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this seek now unnecessary?
| assertTrue(iter.hasNext()); | ||
| assertArrayEquals(getBytesUtf16("a3"), iter.next().getKey()); | ||
| assertTrue(iter.hasNext()); | ||
| assertArrayEquals(getBytesUtf16("a5"), iter.next().getKey()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to suggest adding a helper method to verify the list of keys iterated. Something like:
private static void assertIterator(String prefix, String seek, Table<byte[], ?> table, List<String> expected)
throws RocksDatabaseException, CodecException {
final byte[] prefixBytes = prefix != null ? getBytesUtf16(prefix) : null;
final byte[] seekBytes = seek != null ? getBytesUtf16(seek) : null;
try (TableIterator<byte[], ? extends Table.KeyValue<byte[], ?>> iter = table.iterator(prefixBytes, seekBytes)) {
for (String e : expected) {
assertTrue(iter.hasNext());
assertArrayEquals(getBytesUtf16(e), iter.next().getKey());
}
assertFalse(iter.hasNext());
}
}Then we can both simplify cases and make sure the iterator stops (like already done in two of the five).
// Case 1: Seek to existing key, no prefix
assertIterator(null, "a3", table, asList("a3", "a5", "b2", "b4"));
// Case 2: Seek to non-existent key (should land on next greater), no prefix
assertIterator(null, "a2", table, asList("a3", "a5", "b2", "b4"));
// Case 3: Seek past all keys, no prefix
assertIterator(null, "z9", table, emptyList());
// Case 4: Seek with prefix
assertIterator("b", "b3", table, singletonList("b4"));
// Case 5: Seek with prefix to start of prefix
assertIterator("b", "b2", table, asList("b2", "b4"));It can be reused in testIteratorSeekEdgeCases, too.
What changes were proposed in this pull request?
Fixes HDDS-8655 - Improve listKey performance for OBS buckets
Description
This change optimizes the listKeys operation in Ozone Manager by removing a redundant seek operation during iterator initialization. Previously, the listKeys functionality would create an iterator which implicitly sought to the beginning of the table (seekToFirst), and then immediately sought to the specific start key. This caused a double-seek, which is inefficient for RocksDB, particularly when the initial seek triggers unnecessary block loads.
Changes
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8655
How was this patch tested?
I added a new regression test, testIteratorWithSeek, to TestRDBStore to confirm that the iterator starts correctly at the designated key. I also verified thread safety with a new concurrency test, testConcurrentIteratorWithWrites, which runs a writer and a reader in parallel to ensure stability. Existing tests in TestOmMetadataManager also pass.