feat: [Multi-Modal] Incomplete Processing Pipeline Across Modalities Causes Missing Memory Extraction #590

@CaralHsi

Description

Pre-submission checklist

  • I have searched existing issues and this hasn't been mentioned before
  • I have read the project documentation and confirmed this issue doesn't already exist
  • This issue is specific to MemOS and not a general software issue

Problem Statement

Issue Description

  1. Problem Summary

MultiModalStructMemReader introduces a framework for multimodal inputs, but key parts of the pipeline remain incomplete. As a result, memories are not extracted from files, images, and mixed-content messages.
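To make the failure mode concrete, here is a minimal sketch of how a mixed-content message can lose memories when some modalities have no working parser. Everything in it (`Part`, `PARSERS`, `parse_text`, `extract`) is hypothetical and written for illustration only; it is not the MemOS implementation, and the actual cause in MultiModalStructMemReader may differ.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """One piece of a mixed-content message (hypothetical shape)."""
    kind: str      # e.g. "text", "image", "file"
    payload: str


def parse_text(part):
    # Stand-in for a real extraction step.
    return [f"memory from text: {part.payload}"]


# Hypothetical registry: only the text modality has a parser here.
PARSERS = {"text": parse_text}


def extract(parts):
    """Return extracted memories plus the kinds that produced nothing."""
    memories, skipped = [], []
    for part in parts:
        parser = PARSERS.get(part.kind)
        if parser is None:
            # Without this bookkeeping the part would vanish silently.
            skipped.append(part.kind)
            continue
        memories.extend(parser(part))
    return memories, skipped


message = [
    Part("text", "Trip notes"),
    Part("image", "https://example.com/a.png"),
    Part("file", "report.pdf"),
]
memories, skipped = extract(message)
print(memories)
print(skipped)
```

Recording the skipped kinds, rather than dropping them, is one way to make this kind of gap visible in logs or tests.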

  2. Expected Improvements
    • Implement fine mode for all modalities (image, file, tool).
    • Ensure parsers generate valid TextualMemoryItem objects.
    • Implement SourceMessage reconstruction.
    • Validate the changes through the multimodal test suite.
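The improvements listed above could be checked with something like the following sketch. It assumes a per-modality parser registry and uses stand-in dataclasses (`MemoryItem`, `Source`) in place of MemOS's TextualMemoryItem and SourceMessage, whose real fields are not shown in this issue. The helper names `validate_item` and `check_coverage` are also hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Hypothetical stand-in, not MemOS's SourceMessage."""
    kind: str
    content: str


@dataclass
class MemoryItem:
    """Hypothetical stand-in, not MemOS's TextualMemoryItem."""
    memory: str
    sources: list = field(default_factory=list)


def validate_item(item):
    """Return a list of problems; an empty list means the item looks usable."""
    problems = []
    if not item.memory.strip():
        problems.append("empty memory text")
    if not item.sources:
        problems.append("no source attached")
    return problems


def check_coverage(parsers, samples):
    """Run each modality's parser on one sample and collect failures by kind."""
    failures = {}
    for kind, sample in samples.items():
        parser = parsers.get(kind)
        if parser is None:
            failures[kind] = ["no parser registered"]
            continue
        items = parser(sample)
        if not items:
            failures[kind] = ["parser returned no items"]
            continue
        problems = [p for item in items for p in validate_item(item)]
        if problems:
            failures[kind] = problems
    return failures


# Toy parsers: the image parser behaves, the file parser returns a bad item,
# and the tool modality has no parser at all.
def parse_image(sample):
    return [MemoryItem("A chart of Q3 revenue", [Source("image", sample)])]


def parse_file(sample):
    return [MemoryItem("", [])]


parsers = {"image": parse_image, "file": parse_file}
samples = {"image": "chart.png", "file": "report.pdf", "tool": "search(...)"}
failures = check_coverage(parsers, samples)
print(failures)
```

A check of this shape fits naturally into a parametrized test, with one sample per modality, so that a newly added modality fails loudly until its parser returns valid items with sources attached.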

Willingness to Implement

  • I'm willing to implement this myself
  • I would like someone else to implement this

Labels: enhancement (New feature or request), pending (Pending items to be addressed)
