-
Notifications
You must be signed in to change notification settings - Fork 2
WIP feat: in-memory storage #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request introduces in-memory storage support and improves storage configuration flexibility. However, the implementation appears to be a work-in-progress (as indicated by the "WIP" prefix in the title) with several incomplete critical components that prevent the memory bucket from functioning correctly.
Key Changes:
- Added a new
memorybucket type with dedicated structures but incomplete storage operations - Introduced
db_pathconfiguration for explicit indexdb path control, including support for in-memory mode via "inmemory" value - Changed the
Existmethod signature across all bucket implementations from[]bytetoobject.IDHashfor improved type safety
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| storage/bucket/memory/memory.go | Introduces new memory bucket implementation with indexdb and LRU cache, but core storage operations (Store, Lookup, Remove) are currently stubs that don't perform actual work |
| storage/storage.go | Updates memory bucket from slice to single instance with selection logic that always returns memory bucket when present, bypassing existence checks |
| storage/builder.go | Registers the new memory bucket constructor and adds db_path to configuration merging logic |
| storage/indexdb/pebble/pebble.go | Adds in-memory mode support using vfs.NewMem() when path equals "inmemory" constant |
| storage/indexdb/indexdb.go | Defines TypeInMemory constant for identifying in-memory storage mode |
| storage/bucket/disk/disk.go | Updates Exist signature to use object.IDHash and makes db_path configurable with fallback to default location; increases LRU cache from 100k to 1M entries |
| storage/bucket/empty/empty.go | Updates Exist signature to use object.IDHash for consistency |
| api/defined/v1/storage/storage.go | Changes Exist interface signature from []byte to object.IDHash |
| conf/conf.go | Adds DBPath field to Storage and Bucket configuration structs |
| config.example.yaml | Adds commented examples for db_path configuration at global and bucket levels |
| server/middleware/caching/processor.go | Removes extraneous blank line (formatting cleanup) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| panic("implement me") | ||
| // Exist implements [storage.Bucket]. | ||
| func (m *memoryBucket) Exist(ctx context.Context, id object.IDHash) bool { | ||
| return m.cache.Has(id) |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Exist method only checks the cache (which stores Mark objects) but not the indexdb. This could lead to inconsistencies where an object exists in the indexdb but Exist returns false because it's not in the cache. Consider checking both the cache and the indexdb, or ensure that the cache and indexdb are kept in sync. The disk bucket implementation checks the indexdb directly (see storage/bucket/disk/disk.go:242-244).
| return m.cache.Has(id) | |
| // Fast path: check the in-memory cache first. | |
| if m.cache.Has(id) { | |
| return true | |
| } | |
| // Fallback: consult the indexdb, which is the authoritative source. | |
| if m.indexdb == nil { | |
| return false | |
| } | |
| exist, err := m.indexdb.Exist(ctx, id) | |
| if err != nil { | |
| log.Errorf("failed to check existence in indexdb for %v: %v", id, err) | |
| return false | |
| } | |
| return exist |
|
|
||
| func New(config *conf.Bucket, sharedkv storage.SharedKV) (storage.Bucket, error) { | ||
| mb := &memoryBucket{ | ||
| fs: vfs.NewMem(), |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fs field is initialized with vfs.NewMem() but never used in any of the bucket methods. If the intention is to use this virtual filesystem for storing actual file data (similar to how disk bucket stores files), the field should be used in Store/Lookup operations. Otherwise, this field can be removed to simplify the implementation.
| cache: lru.New[object.IDHash, storage.Mark](10_000), // in-memory object size | ||
| fileMode: fs.FileMode(0o755), | ||
| maxSize: 1024 * 1024 * 100, // e.g. 100 MB | ||
| stop: make(chan struct{}, 1), |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stop channel is created but never used in the implementation. If this channel is intended for coordinating shutdown or background operations, it should be used in the Close method or related operations. Otherwise, this field can be removed to simplify the implementation.
| weight: 100, // default weight | ||
| sharedkv: sharedkv, | ||
| cache: lru.New[object.IDHash, storage.Mark](10_000), // in-memory object size | ||
| fileMode: fs.FileMode(0o755), |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fileMode field is initialized but never used in the implementation. If file permissions are needed for virtual filesystem operations, this should be used when creating directories or files. Otherwise, this field can be removed to simplify the implementation.
| //TODO implement me | ||
| panic("implement me") | ||
| // Store implements [storage.Bucket]. | ||
| func (m *memoryBucket) Store(ctx context.Context, meta *object.Metadata) error { |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Store method returns nil without actually storing the metadata. This means objects will never be cached in the memory bucket. The implementation should store the metadata in the indexdb and update the cache with a Mark, similar to how the disk bucket implementation works (see storage/bucket/disk/disk.go:271-292). Consider implementing actual storage logic that uses m.indexdb.Set() and m.cache.Set().
| func (m *memoryBucket) Store(ctx context.Context, meta *object.Metadata) error { | |
| func (m *memoryBucket) Store(ctx context.Context, meta *object.Metadata) error { | |
| // Do not attempt to store a nil metadata. | |
| if meta == nil { | |
| return nil | |
| } | |
| // Persist metadata to the index database so it can be retrieved later. | |
| if err := m.indexdb.Set(ctx, meta); err != nil { | |
| return err | |
| } |
| // n.memoryBucket.Exist(ctx, id.Hash()) | ||
| return n.memoryBucket |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Select method always returns the memory bucket when it exists and UseAllow() is true, without checking if the object actually exists in that bucket. This means all storage operations will be routed to the memory bucket, even for objects that haven't been cached there yet. The commented-out code suggests there should be a check using Exist() to determine if the object is in the memory bucket before returning it. Without this check, lookups for non-cached objects will fail even though they might exist in other buckets, and stores will only go to the memory bucket. Consider implementing the Exist check or a fallback mechanism to use other buckets when objects aren't found in memory.
| // n.memoryBucket.Exist(ctx, id.Hash()) | |
| return n.memoryBucket | |
| if hit, _ := n.memoryBucket.Exist(ctx, id.Hash()); hit { | |
| return n.memoryBucket | |
| } |
| - path: /cache2 | ||
| type: normal | ||
| # db_type: pebble | ||
| # db_path: inmemory # index-db path, default: <path>/.indexdb/; use "inmemory" for in-memory store |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment describes using "inmemory" for in-memory indexdb storage via the db_path configuration, which is correct. However, this is different from the memory bucket type. To avoid confusion, consider clarifying in the comment that "inmemory" is for the indexdb backend, not the bucket type. For a fully in-memory bucket, users would set type to "memory" (not shown in this example).
| # db_path: inmemory # index-db path, default: <path>/.indexdb/; use "inmemory" for in-memory store | |
| # db_path: inmemory # index-db path, default: <path>/.indexdb/; use "inmemory" for in-memory indexdb backend (not the bucket type); for a fully in-memory bucket set type: "memory" (not shown here) |
| if path == indexdb.TypeInMemory { | ||
| log.Infof("kvstore used in-memory") | ||
| fs = vfs.NewMem() | ||
| } | ||
| pdb, err := pebble.Open(path, &pebble.Options{ |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using in-memory mode (fs = vfs.NewMem()), the path parameter is still passed to pebble.Open(). This might cause issues since the path could be a filesystem path like ".indexdb" which may not be meaningful in a memory-backed filesystem. Consider using a consistent path like "/inmemory" or an empty string when fs is set to vfs.NewMem() to avoid potential path-related issues.
| if path == indexdb.TypeInMemory { | |
| log.Infof("kvstore used in-memory") | |
| fs = vfs.NewMem() | |
| } | |
| pdb, err := pebble.Open(path, &pebble.Options{ | |
| dbPath := path | |
| if path == indexdb.TypeInMemory { | |
| log.Infof("kvstore used in-memory") | |
| fs = vfs.NewMem() | |
| dbPath = "/inmemory" | |
| } | |
| pdb, err := pebble.Open(dbPath, &pebble.Options{ |
| panic("implement me") | ||
| // Allow implements [storage.Bucket]. | ||
| func (m *memoryBucket) Allow() int { | ||
| return int(m.maxSize) |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Allow method should return a percentage (0-100) according to the interface documentation (api/defined/v1/storage/storage.go:59), but it currently returns maxSize which is in bytes (104857600 for 100 MB). This could cause issues if other parts of the system expect a percentage value. Consider returning a proper percentage value or using the allow field for this purpose, similar to how empty bucket returns 100.
| return int(m.maxSize) | |
| if m == nil { | |
| return 0 | |
| } | |
| return m.weight |
| panic("implement me") | ||
| // Remove implements [storage.Bucket]. | ||
| func (m *memoryBucket) Remove(ctx context.Context, id *object.ID) error { | ||
| return nil |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Remove method returns nil without actually removing the object from the indexdb or cache. This means removed objects will remain in storage and could be retrieved later. The implementation should delete from the indexdb using m.indexdb.Delete() and potentially remove from the cache, similar to how the disk bucket implementation works (see storage/bucket/disk/disk.go:266-268).
| return nil | |
| // Ensure the object is removed from the index database so it is no longer retrievable. | |
| return m.indexdb.Delete(ctx, id) |
This pull request introduces significant improvements to the storage subsystem, focusing on enhanced configuration flexibility, better in-memory storage support, and consistent interface usage. The main changes include adding explicit support for in-memory buckets, unifying the way storage paths are configured, and updating interfaces and implementations for improved type safety and maintainability.
Key changes:
In-Memory Storage Support
Added a new
memorybucket type with its own implementation (memoryBucket), supporting in-memory object storage with a dedicated LRU cache and indexdb, and ensured only one memory bucket can be used at a time. (storage/bucket/memory/memory.go,storage/builder.go,storage/storage.go) [1] [2] [3] [4] [5] [6] [7] [8]Updated the pebble indexdb backend to support in-memory mode by using a memory-backed filesystem when the
db_pathis set to"inmemory". (storage/indexdb/pebble/pebble.go,storage/indexdb/indexdb.go) [1] [2] [3]Configuration Enhancements
Introduced a new
db_pathfield in both global storage and per-bucket configurations, allowing explicit control over indexdb storage paths, including support for in-memory stores. (conf/conf.go,config.example.yaml,storage/builder.go) [1] [2] [3] [4] [5]Updated the disk bucket to use the configurable
db_path, defaulting to the previous behavior if not specified. (storage/bucket/disk/disk.go)Interface Consistency and Type Safety
Existmethod signatures in storage interfaces and implementations to use theobject.IDHashtype instead of raw[]byte, improving type safety and consistency across buckets. (api/defined/v1/storage/storage.go,storage/bucket/disk/disk.go,storage/bucket/empty/empty.go,storage/bucket/memory/memory.go) [1] [2] [3] [4]Performance Tuning
storage/bucket/disk/disk.go)These changes collectively provide more flexible and robust storage configuration, improved in-memory caching capabilities, and a cleaner, safer codebase for storage operations.