feat(memory): add SQLite memory service #4116
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary of Changes
Hello @summerx96, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a robust, persistent local memory solution for the ADK by integrating a new SQLite-based memory service. This enhancement allows for reliable storage and retrieval of session data, improving the framework's ability to maintain context across interactions. The service is designed for efficiency with idempotent updates and intelligent search capabilities, making it a valuable addition for applications requiring durable memory without external database dependencies.
Response from ADK Triaging Agent
Hello @summerx96, thank you for creating this PR! Before we can review your contribution, could you please sign the Contributor License Agreement (CLA)? It seems the check is currently failing. Also, for a new feature like this, could you please create and associate a GitHub issue with this PR? This helps us track new features and discussions around them. Additionally, it would be very helpful for the reviewers if you could provide a screenshot or some logs demonstrating the new SQLite memory service in action. This information will help reviewers review your PR more efficiently. Thanks!
Code Review
This pull request introduces a new SqliteMemoryService, providing a persistent, local memory option for ADK. The implementation is well-structured, featuring optional FTS5 full-text search with a LIKE fallback, idempotent session upserts, and a configurable extraction mechanism. The changes are accompanied by clear documentation and a good set of unit tests covering key functionalities like idempotency, updates, persistence, and FTS usage. My feedback focuses on improving the robustness of data parsing and enhancing code conciseness in a helper function.
parts = []
for part in event.content.parts:
    if not part.text:
        continue
    if getattr(part, "thought", False):
        continue
    text = part.text.replace("\n", " ").strip()
    if text:
        parts.append(text)
The logic for extracting text parts from an event can be made more concise by using a list comprehension with an assignment expression (walrus operator :=). This improves readability by reducing nesting and making the intent clearer.
parts = [
    text
    for part in event.content.parts
    if part.text and not getattr(part, "thought", False)
    if (text := part.text.replace("\n", " ").strip())
]

if row["metadata_json"]:
    metadata["metadata"] = json.loads(row["metadata_json"])
if row["extracted_json"]:
    metadata["extracted"] = json.loads(row["extracted_json"])
The json.loads calls could fail if the data in metadata_json or extracted_json is corrupted, which would cause the entire search request to fail. To make this more robust, consider wrapping them in try...except json.JSONDecodeError blocks to gracefully handle potential data corruption.
if row["metadata_json"]:
    try:
        metadata["metadata"] = json.loads(row["metadata_json"])
    except json.JSONDecodeError:
        # Consider logging a warning about data corruption.
        pass
if row["extracted_json"]:
    try:
        metadata["extracted"] = json.loads(row["extracted_json"])
    except json.JSONDecodeError:
        # Consider logging a warning about data corruption.
        pass
Addressed this: wrapped the metadata_json/extracted_json loads in try/except json.JSONDecodeError with warning logs, so corrupted rows don't break search.
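For readers following along, the fix described above could look roughly like the sketch below. The helper name load_json_field, the logger setup, and the sample row are assumptions for illustration, not the code in the PR.

```python
import json
import logging

logger = logging.getLogger(__name__)


def load_json_field(row, column, session_id):
    """Hypothetical helper: parse a JSON column, returning None if it is corrupted."""
    raw = row[column]
    if not raw:
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Log and skip the field instead of failing the whole search request.
        logger.warning("Skipping corrupted %s for session %s", column, session_id)
        return None


# Stand-in for a database row: one valid column, one corrupted column.
row = {"metadata_json": '{"source": "chat"}', "extracted_json": "{not json"}

metadata = {}
for key, column in (("metadata", "metadata_json"), ("extracted", "extracted_json")):
    value = load_json_field(row, column, session_id="s1")
    if value is not None:
        metadata[key] = value
```

With this shape, the corrupted extracted_json column is dropped from the result while the valid metadata_json column is still returned.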
d09e104 to 7b8fe1a
/gemini review
Code Review
This pull request introduces a new SQLite memory service, enhancing the ADK with persistent local memory capabilities. The implementation includes robust features such as idempotency for session updates, optional FTS5 support for efficient searching, and a fallback to LIKE queries. The service_registry has been updated to allow easy integration of the SQLite service via URI, and comprehensive unit tests have been added to validate its core functionalities, including persistence and search. The README.md for the memory sample has also been updated with clear usage instructions. Overall, this is a well-implemented and valuable addition to the ADK.
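As an illustration of what idempotent session updates can mean in SQLite, here is a minimal sketch built on INSERT ... ON CONFLICT DO UPDATE. The table layout and the upsert_session helper are assumptions; only the column names are borrowed from the query quoted later in this review.

```python
import sqlite3

# Hypothetical schema: the composite primary key is an assumption made
# for this example, not the schema used in the PR.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    app_name      TEXT NOT NULL,
    user_id       TEXT NOT NULL,
    session_id    TEXT NOT NULL,
    search_text   TEXT NOT NULL,
    updated_at_ms INTEGER NOT NULL,
    PRIMARY KEY (app_name, user_id, session_id)
)
"""

UPSERT = """
INSERT INTO sessions (app_name, user_id, session_id, search_text, updated_at_ms)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (app_name, user_id, session_id) DO UPDATE SET
    search_text = excluded.search_text,
    updated_at_ms = excluded.updated_at_ms
"""


def upsert_session(conn, app_name, user_id, session_id, search_text, now_ms):
    """Insert the session, or overwrite the existing row with the same key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, (app_name, user_id, session_id, search_text, now_ms))


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# Adding the same session twice leaves a single, updated row.
upsert_session(conn, "demo_app", "user_1", "s1", "likes tea", 1000)
upsert_session(conn, "demo_app", "user_1", "s1", "likes green tea", 2000)
rows = conn.execute(
    "SELECT session_id, search_text, updated_at_ms FROM sessions"
).fetchall()
```

The upsert syntax requires SQLite 3.24 or newer.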
elif db_path.startswith("/"):
    db_path = db_path[1:]
When db_path starts with /, this branch removes the leading slash, which implies that absolute paths are expected to reach SqliteMemoryService without it. While this works, it might be more intuitive to handle sqlite:///path/to/db (whose path urlparse returns as /path/to/db) and sqlite:////abs/path/to/db (whose path urlparse returns as //abs/path/to/db) consistently, interpreting db_path as absolute whenever it still starts with / after parsing rather than stripping it unconditionally. That said, SqliteMemoryService accepts db_path as a str or Path and handles Path objects correctly, so stripping the leading slash may be fine if relative paths are resolved against the current working directory, or if db_path is always expected to be relative unless the caller explicitly passes an absolute path.
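To make the URI handling concrete, the sketch below shows how urllib.parse.urlparse splits the two forms. It assumes the SQLAlchemy-style convention (three slashes for a relative path, four for an absolute one); sqlite_uri_to_path is a hypothetical helper, not the function in the PR.

```python
from pathlib import Path
from urllib.parse import urlparse


def sqlite_uri_to_path(uri: str) -> Path:
    """Hypothetical helper: map a sqlite: URI to a filesystem path.

    Assumed convention: sqlite:///rel.db is relative,
    sqlite:////abs.db is absolute.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "sqlite":
        raise ValueError(f"Unsupported scheme: {parsed.scheme!r}")
    path = parsed.path
    # urlparse keeps every slash after the empty netloc, so
    # sqlite:///rel.db yields "/rel.db" and sqlite:////abs.db yields "//abs.db".
    # Dropping exactly one slash therefore covers both forms.
    if path.startswith("/"):
        path = path[1:]
    if not path:
        raise ValueError("Missing database path")
    return Path(path)


relative = sqlite_uri_to_path("sqlite:///data/memory.db")
absolute = sqlite_uri_to_path("sqlite:////tmp/memory.db")
```

Under that convention, stripping a single leading slash yields data/memory.db for the three-slash form and /tmp/memory.db for the four-slash form.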
if not search_text and not session_json:
    return
This check if not search_text and not session_json: return means that if a session has no searchable text and no full session JSON to store, it won't be added to memory. This seems reasonable, but it's worth confirming if there are any edge cases where a session might be considered meaningful even without these two components (e.g., only metadata or other specific event types). If such cases exist, they would be silently ignored.
like_query = f"%{query}%"
cursor = conn.execute(
    """
    SELECT id, session_id, search_text, extracted_json,
           metadata_json, updated_at_ms
    FROM sessions
    WHERE app_name=? AND user_id=? AND search_text LIKE ?
    ORDER BY updated_at_ms DESC
    LIMIT ?
    """,
    (app_name, user_id, like_query, self._max_results),
)
The LIKE query uses an f-string f"%{query}%" to construct the pattern. While sqlite3 parameter binding prevents direct SQL injection for the query value itself, using LIKE with wildcards at both ends can be inefficient for large tables as it often prevents the use of indexes. For better performance on large datasets, consider alternatives like trigram indexes or ensuring that queries are structured to allow index usage where possible, or at least document this performance characteristic.
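The performance concern is one reason to prefer a full-text index when the SQLite build provides one. The following is a minimal sketch of an FTS5 search with a LIKE fallback; the notes table and the build_index and search helpers are hypothetical and much simpler than the service under review.

```python
import sqlite3


def build_index(conn: sqlite3.Connection) -> bool:
    """Create the search table; return True if FTS5 is available."""
    try:
        conn.execute("CREATE VIRTUAL TABLE notes USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        # FTS5 is not compiled into this SQLite build: use a plain table.
        conn.execute("CREATE TABLE notes (body TEXT)")
        return False


def search(conn, query, use_fts, limit=5):
    """Return matching bodies via FTS5 MATCH, or LIKE when FTS5 is missing."""
    if use_fts:
        # Quote the input so it is treated as a phrase, not as FTS5 syntax.
        phrase = '"' + query.replace('"', '""') + '"'
        sql = "SELECT body FROM notes WHERE notes MATCH ? LIMIT ?"
        params = (phrase, limit)
    else:
        # Wildcards on both ends force a full scan of the table.
        sql = "SELECT body FROM notes WHERE body LIKE ? LIMIT ?"
        params = (f"%{query}%", limit)
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
use_fts = build_index(conn)
conn.executemany(
    "INSERT INTO notes (body) VALUES (?)",
    [
        ("The user prefers SQLite for local storage",),
        ("Weather in Paris is sunny",),
        ("sqlite FTS5 is optional",),
    ],
)
hits = search(conn, "sqlite", use_fts)
```

The two paths are not strictly equivalent: FTS5 matches whole tokens, while LIKE matches any substring, so results can differ for partial words.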
if total_bytes > max_bytes:
    raise ValueError(f"Session payload is too large ({total_bytes} bytes).")
The error message Session payload is too large ({total_bytes} bytes). is clear. However, it might be helpful to also include the max_bytes limit in the error message for better context, e.g., Session payload is too large ({total_bytes} bytes), exceeding the limit of {max_bytes} bytes.
  if total_bytes > max_bytes:
-     raise ValueError(f"Session payload is too large ({total_bytes} bytes).")
+     raise ValueError(f"Session payload is too large ({total_bytes} bytes), exceeding the limit of {max_bytes} bytes.")
Hi @summerx96, Thank you for your contribution! We appreciate you taking the time to submit this pull request. Your PR has been received by the team and is currently under review. We will provide feedback as soon as we have an update to share.
Hi @ankursharmas, can you please review this?
Summary
Testing