-
Notifications
You must be signed in to change notification settings - Fork 347
feat: Add OCR text selection with Ctrl+T shortcut and copy functionality #685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughAdds frontend OCR: tesseract.js dependency, a singleton OCRService (worker lifecycle + recognize/terminate), ImageViewer integration with Ctrl+T toggle, image scale tracking and OCR invocation, and a new TextOverlay component rendering selectable, copyable OCR overlays. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant ImageViewer
participant OCRService
participant TesseractWorker as Tesseract.js Worker
participant TextOverlay
User->>ImageViewer: Press Ctrl+T (toggle OCR)
ImageViewer->>ImageViewer: set isOCRActive=true, isOCRLoading=true
ImageViewer->>OCRService: recognize(imagePath)
Note right of ImageViewer: show OCR processing overlay
OCRService->>OCRService: getWorker() (lazy init)
OCRService->>TesseractWorker: initialize worker, load language
TesseractWorker-->>OCRService: worker ready
OCRService->>TesseractWorker: run recognize(image)
TesseractWorker-->>OCRService: return OCR Page result
OCRService-->>ImageViewer: return ocrData
ImageViewer->>ImageViewer: set isOCRLoading=false, store ocrData
ImageViewer->>TextOverlay: render with ocrData + scale
User->>TextOverlay: Select text + Ctrl+C
TextOverlay->>TextOverlay: capture selection, write to clipboard
TextOverlay-->>User: show "Copied!" feedback
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (2)
frontend/src/components/Media/ImageViewer.tsx (1)
76-107: Consider worker cleanup for memory efficiency.The OCR worker persists for the app's lifetime. For a Tauri desktop app, this is likely acceptable. However, if memory efficiency is a concern for long-running sessions, consider terminating the worker after a period of inactivity or when the ImageViewer unmounts.
// Optional: Add cleanup on unmount useEffect(() => { return () => { // Optionally terminate worker when viewer unmounts // ocrService.terminate(); }; }, []);frontend/src/components/Media/TextOverlay.tsx (1)
105-125: Consider extracting inline styles to a CSS file.The inline
<style>tag is injected on every render. While functional, this could cause style duplication if multiple overlays exist. For better maintainability, extract these styles to a CSS module or the project's stylesheet.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
frontend/package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (4)
frontend/package.json(1 hunks)frontend/src/components/Media/ImageViewer.tsx(3 hunks)frontend/src/components/Media/TextOverlay.tsx(1 hunks)frontend/src/services/OCRService.ts(1 hunks)
🔇 Additional comments (11)
frontend/package.json (1)
69-69: LGTM! Appropriate dependency for OCR functionality.The tesseract.js library is a well-established client-side OCR solution. Version 5.1.0 uses web workers by default, which aligns with the PR's goal of non-blocking OCR processing.
frontend/src/services/OCRService.ts (2)
7-32: Solid lazy initialization pattern with proper error recovery.The worker initialization correctly handles the race condition where multiple calls to
getWorker()occur simultaneously by caching the promise. The error handling appropriately clearsworkerPromiseto allow retry on failure.
54-54: Singleton pattern is appropriate; verify worker cleanup on app lifecycle.The singleton ensures a single worker instance is reused across the app. Ensure
ocrService.terminate()is called when the application unmounts or when the feature is no longer needed to free up memory.frontend/src/components/Media/ImageViewer.tsx (4)
1-7: Clean imports for OCR integration.The imports are well-organized and appropriate for the new OCR functionality.
39-44: Good practice to reset OCR state on image change.Clearing OCR state when
imagePathchanges prevents stale overlay data from appearing on a different image.
46-74: Well-implemented scale tracking with proper cleanup.The ResizeObserver pattern correctly tracks image dimensions for overlay alignment. The cleanup properly removes the event listener and disconnects the observer.
109-183: Well-structured OCR UI with appropriate status indicators.The loading and active state overlays provide clear user feedback. Disabling panning during OCR mode correctly prioritizes text selection UX.
Minor:
convertFileSrc(imagePath)is called in multiple places. Consider memoizing it if performance becomes a concern, though the impact is likely negligible.frontend/src/components/Media/TextOverlay.tsx (4)
5-8: Clean interface definition.The props interface is well-defined with appropriate optionality for
scale.
54-70: Good event propagation management for text selection.The
onMouseDownstopPropagation correctly prevents the zoom/pan wrapper from capturing mouse events during text selection.
71-104: Clever transparent text overlay approach for selection.The transparent text with visible hover states allows native text selection while maintaining the original image visibility. The title attribute provides accessibility via tooltips.
38-39: Avoidas anytype assertion; use proper typing.The
Pagetype from tesseract.js v5 includes alinesproperty asLine[], so theas anyassertion is unnecessary and bypasses type safety. UseocrData.lines ?? []instead. If TypeScript raises errors after removing the assertion, verify that thetesseract.jstype definitions are correctly installed and thatocrDatais properly typed asPage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
frontend/src/components/Media/ImageViewer.tsx(3 hunks)frontend/src/components/Media/TextOverlay.tsx(1 hunks)frontend/src/services/OCRService.ts(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- frontend/src/components/Media/TextOverlay.tsx
🧰 Additional context used
🧬 Code graph analysis (1)
frontend/src/components/Media/ImageViewer.tsx (2)
frontend/src/services/OCRService.ts (1)
ocrService(59-59)frontend/src/components/Media/TextOverlay.tsx (1)
TextOverlay(10-134)
🔇 Additional comments (5)
frontend/src/services/OCRService.ts (1)
1-59: Well-structured singleton OCR service with proper lifecycle management.The implementation correctly handles:
- Lazy worker initialization with caching
- Race condition in
terminate()by awaiting pendingworkerPromise(previous review issue addressed)- Error recovery by clearing
workerPromiseon initialization failure, allowing retryOne consideration: the worker persists until explicitly terminated. If OCR is used infrequently, you may want to add an idle timeout to auto-terminate the worker and free resources.
frontend/src/components/Media/ImageViewer.tsx (4)
1-29: Clean state management for OCR integration.The separation of
isOCRActive,ocrData, andisOCRLoadingprovides clear control over the different aspects of the OCR feature. UsingimgReffor scale calculation is the right approach.
46-74: Robust scale calculation with proper cleanup.Good defensive check for
naturalWidth > 0to avoid division issues. TheResizeObserverensures the overlay stays aligned when the image container resizes.
82-117: Stale data race condition properly addressed.The
imagePathRefpattern correctly prevents OCR results from being applied to a different image (previous review issue addressed).Minor consideration: if the user rapidly toggles OCR off/on while loading, the single
ocrDatacheck at line 93 might cause the second activation to skip OCR sinceocrDatais still being populated from the first request. This is edge-case behavior that's unlikely to cause real problems.
138-192: Good integration of OCR overlay with zoom/pan controls.The
panning={{ disabled: isOCRActive }}correctly locks image movement during text selection. PlacingTextOverlayinside the rotated container ensures proper alignment with rotated images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
frontend/src/components/Media/ImageViewer.tsx (4)
25-29: TightentransformReftyping; OCR state additions look fine.
imgRef,isOCRActive,ocrData,isOCRLoading, andimageScaleare well-structured for the OCR flow. The only nit istransformRefbeingany, which loses type safety around zoom/pan methods. Ifreact-zoom-pan-pinchexposes a ref type (e.g.,ReactZoomPanPinchRef), prefer:-import { TransformWrapper, TransformComponent } from 'react-zoom-pan-pinch'; +import { TransformWrapper, TransformComponent, ReactZoomPanPinchRef } from 'react-zoom-pan-pinch'; -const transformRef = useRef<any>(null); +const transformRef = useRef<ReactZoomPanPinchRef | null>(null);
47-75: Scale calculation is reasonable; verify alignment for extreme aspect ratios.Using
width / naturalWidthwith aResizeObserveron the<img>is a solid approach and should keep overlays aligned under zoom/pan (since both image and overlay sit inside the same transformed container). The only potential edge case is very tall/wide images whereobjectFit: 'contain'introduces letterboxing; in those cases, you may need to factor in height and potential offsets (e.g., viagetBoundingClientRect()plus container size) to keep OCR lines perfectly aligned.
83-118: Ctrl+T OCR toggle and stale-result guard look solid; consider global hotkey scope.The Ctrl+T handler correctly:
- Toggles
isOCRActiveand respects existingocrData/isOCRLoading.- Prevents duplicate OCR runs.
- Uses
imagePathRefso late results don’t overwrite state after an image change.- Cleans up the event listener on unmount.
One thing to double‑check: attaching the handler to
windowmeans Ctrl+T is intercepted anywhere in the app while this viewer is mounted. If there are (now or later) other views that need Ctrl+T or browser‑style shortcuts, you may want to scope this to when the viewer (or its parent panel) is focused instead of globally.
162-191: Image container + TextOverlay integration looks correct; watch alignment if layout changes.Placing the
<img>and<TextOverlay>inside a relatively positioned, rotated container keeps OCR rectangles and image in the same transform context, anddraggable={false}avoids conflicts with selection. Passingscale={imageScale}intoTextOverlaylines up with the earlier scale calculation; as long as the image sizing rules remainmaxWidth/maxHeight + objectFit: 'contain', overlays should stay in sync under zoom/rotation.If future layout tweaks change how the image is sized/positioned, remember this coupling between
imageScaleand the<img>’s rendered box so the overlay doesn’t drift.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
frontend/src/components/Media/ImageViewer.tsx(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
frontend/src/components/Media/ImageViewer.tsx (2)
frontend/src/services/OCRService.ts (1)
ocrService(59-59)frontend/src/components/Media/TextOverlay.tsx (1)
TextOverlay(10-134)
🔇 Additional comments (6)
frontend/src/components/Media/ImageViewer.tsx (6)
1-7: OCR-related imports look good; confirmPagetype matchesocrServiceoutput.The added imports (OCRService singleton, TextOverlay, tesseract.js
Page,Loader2) are cohesive and keep OCR concerns localized to this component. Just ensureocrService.recognizeactually returns aPage(or a compatible shape) so thePage | nulltyping remains accurate end‑to‑end.
39-45: Resetting OCR state on image/reset change is correct.Resetting
isOCRActive,ocrData, andisOCRLoadingtogether whenimagePath/resetSignalchanges avoids stale overlays and loading indicators carrying over between images. This aligns well with the intended flow.
77-82:imagePathRefpattern is a good guard against stale OCR updates.Keeping the latest
imagePathin a ref and syncing it viauseEffectgives you a simple way to ignore late OCR results from previous images. The setup here works cleanly with the reset effect above.
121-137: Status badges cleanly communicate OCR state.The “Processing Text…” and “Text Selection Active” glassmorphism badges are clearly conditioned on
isOCRLoadingvs.isOCRActive && ocrDataand are layered with appropriate z‑indices. This gives good feedback without interfering with the image content.
139-147: Disabling panning while OCR is active matches the selection UX requirement.Using
panning={{ disabled: isOCRActive }}onTransformWrapperis a straightforward way to lock drag‑panning during OCR mode while still allowing zoom. This matches the “lock image panning while text selection is active” objective.
148-161: Wrapper/content styles are appropriate for overlays and zoom.Setting
wrapperStyleto full width/height withoverflow: 'visible'and centering content ensures the image and overlays remain visible under zoom/pan without clipping. This integrates well with the absolute‑positioned overlay approach.
rohan-pandeyy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, kindly go through these reviews.
One more and one of the most suggestion is that you should first get these features discussed and approved with @rahulharpal1603, for the implementation to be merged into PictoPy.
| }} | ||
| <div className="relative w-full h-full"> | ||
| {isOCRLoading && ( | ||
| <div className="absolute top-6 right-6 z-[60] bg-zinc-900/80 text-white px-4 py-2 rounded-full flex items-center gap-3 backdrop-blur-md border border-white/10 shadow-xl animate-in fade-in slide-in-from-top-2 duration-300"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tailwind is warning that the class z-[60] can be written as z-60. Since it’s a standard value, we don’t need the arbitrary syntax here.
| )} | ||
|
|
||
| {isOCRActive && !isOCRLoading && ocrData && ( | ||
| <div className="absolute top-6 right-6 z-[60] bg-blue-600/90 text-white px-4 py-2 rounded-full flex items-center gap-2 backdrop-blur-md border border-white/10 shadow-xl animate-in fade-in slide-in-from-top-2 duration-300"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same tailwind warning here, kindly refer to the above review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the "Processing Text" and "Text Selection Active" visuals, I'd suggest you move these two divs' position. Currently they're covering the buttons on top right, making them inaccessible.
| <> | ||
| {showCopyFeedback && ( | ||
| <div | ||
| className="fixed bottom-8 left-1/2 transform -translate-x-1/2 z-[100] bg-zinc-900/90 text-white px-4 py-2.5 rounded-full flex items-center gap-2.5 backdrop-blur-md shadow-xl border border-white/10 animate-in fade-in slide-in-from-bottom-4 duration-300" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar tailwind warning can be seen here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another addition required would be that you should add a button for text detection along side the shortcut which most users might never discover.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
frontend/src/components/Media/ImageViewer.tsx (2)
158-161: Use standard Tailwindz-60instead of arbitraryz-[60]}syntax.Per previous review feedback, Tailwind warns that
z-[60]can be simplified toz-60since it's a standard value.- className={`absolute top-6 left-6 z-60 flex items-center gap-2 ... + className={`absolute top-6 left-6 z-60 flex items-center gap-2 ...Apply the same fix to lines 171 and 178 if they currently use
z-[60].Also applies to: 171-171, 178-178
218-235: Consider addingloading="lazy"for performance.For a gallery/viewer application, lazy loading can improve initial render performance when multiple images might be in the DOM.
<img ref={imgRef} src={convertFileSrc(imagePath) || '/placeholder.svg'} alt={alt} draggable={false} + loading="lazy" className="select-none block max-w-full max-h-full"frontend/src/components/Media/TextOverlay.tsx (3)
43-44: Type safety: Avoidas anycast; define or extend the type properly.Casting
ocrData as anyand usingline: anyloses type safety and IDE assistance. ThePagetype from tesseract.js should includelines, but if it doesn't match your usage, define a proper interface.+interface OcrLine { + bbox: { x0: number; y0: number; x1: number; y1: number }; + text: string; +} + +interface OcrPageWithLines extends Page { + lines?: OcrLine[]; +} interface TextOverlayProps { - ocrData: Page | null; + ocrData: OcrPageWithLines | null; scale?: number; }Then update the map:
- const lines = (ocrData as any).lines || []; + const lines: OcrLine[] = ocrData.lines ?? []; // ... - {lines.map((line: any, index: number) => { + {lines.map((line, index) => {Also applies to: 76-77
110-130: Consider moving inline styles to a CSS module or styled-components.Inline
<style>tags within component render can cause:
- Style duplication if multiple instances render
- Potential conflicts with global styles
- No CSS scoping guarantees
For this PR, it's acceptable since
TextOverlayis typically a singleton, but for maintainability, consider extracting to a CSS file.
16-31: Copy handler intercepts all Ctrl+C events globally.The handler captures Ctrl+C on
window, which could interfere with copy operations in other parts of the app. Consider scoping this to only when the overlay or its content is focused/selected.Add a check to ensure the selection is within the overlay:
const handleKeyDown = async (e: KeyboardEvent) => { if (e.ctrlKey && e.key.toLowerCase() === 'c') { const selection = window.getSelection(); const text = selection?.toString().trim(); + + // Only handle if selection is within the OCR overlay + const anchorNode = selection?.anchorNode; + const isWithinOverlay = anchorNode && + document.querySelector('.ocr-overlay-container')?.contains(anchorNode); + if (!isWithinOverlay) return; if (text && text.length > 0) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
frontend/src/components/Media/ImageViewer.tsx(3 hunks)frontend/src/components/Media/TextOverlay.tsx(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
frontend/src/components/Media/ImageViewer.tsx (2)
frontend/src/services/OCRService.ts (1)
ocrService(59-59)frontend/src/components/Media/TextOverlay.tsx (1)
TextOverlay(10-134)
🔇 Additional comments (6)
frontend/src/components/Media/ImageViewer.tsx (3)
47-75: ResizeObserver setup is solid.The effect correctly handles initial load, load events, and resize observation with proper cleanup. The dependency on
imagePathensures re-subscription when the image changes.
194-194: Disabling panning during OCR mode is a good UX choice.This correctly prevents accidental panning while the user is trying to select text, aligning with the PR objective to lock image panning during text selection.
220-220: Fallback to placeholder won't trigger ifconvertFileSrcreturns an empty string.The expression
convertFileSrc(imagePath) || '/placeholder.svg'only falls back ifconvertFileSrcreturns a falsy value. If it returns an empty string"", the fallback works, but if it throws or returns an invalid URL, theonErrorhandler is the actual safety net. The current approach is acceptable sinceonErrorhandles the failure case.frontend/src/components/Media/TextOverlay.tsx (3)
13-39: Timeout cleanup is properly implemented.The feedback timeout is correctly cleared on unmount, addressing the previous memory leak concern. The event listener cleanup is also correct.
83-108: Line rendering logic is well-implemented.The positioning calculations correctly apply the scale factor to bbox coordinates. Using
title={text}provides accessibility for the transparent text. The styling choices (transparent color, text cursor) align with the UX goal of invisible-but-selectable text.
50-50:z-100is not a default Tailwind utility class.Tailwind's default z-index scale includes z-0, z-10, z-20, z-30, z-40, z-50, and z-auto only. The class
z-100will be ignored unless it's added to your theme configuration or you use the arbitrary value syntaxz-[100].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
frontend/src/components/Media/ImageViewer.tsx (1)
83-105: Consider using refs to prevent unnecessary callback recreation.The
triggerOCRcallback includesocrDataandisOCRLoadingin its dependency array, causing it to be recreated every time these values change. This triggers the keyboard handler effect (line 125) to re-run and re-register the event listener unnecessarily.Since these values are only used for a guard check (line 84), consider using refs instead:
+ const ocrDataRef = useRef(ocrData); + const isOCRLoadingRef = useRef(isOCRLoading); + useEffect(() => { + ocrDataRef.current = ocrData; + isOCRLoadingRef.current = isOCRLoading; + }, [ocrData, isOCRLoading]); const triggerOCR = useCallback(async () => { - if (ocrData || isOCRLoading) return; + if (ocrDataRef.current || isOCRLoadingRef.current) return; setIsOCRLoading(true); const currentPath = imagePathRef.current; try { const src = convertFileSrc(currentPath); const data = await ocrService.recognize(src); if (currentPath === imagePathRef.current) { setOcrData(data); } } catch (error) { console.error('Failed to perform OCR', error); setIsOCRActive(false); } finally { if (currentPath === imagePathRef.current) { setIsOCRLoading(false); } } - }, [ocrData, isOCRLoading]); + }, []);This prevents the callback from being recreated and stabilizes the keyboard handler registration.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
frontend/src/components/Media/ImageViewer.tsx(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
frontend/src/components/Media/ImageViewer.tsx (2)
frontend/src/services/OCRService.ts (1)
ocrService(59-59)frontend/src/components/Media/TextOverlay.tsx (1)
TextOverlay(10-134)
🔇 Additional comments (8)
frontend/src/components/Media/ImageViewer.tsx (8)
1-7: LGTM! Clean import organization.All new imports are properly used throughout the component. The addition of useCallback improves performance optimization for the OCR trigger logic.
25-29: Well-structured state management.The OCR-related state variables are properly typed and initialized. The use of Page type from tesseract.js ensures type safety for OCR results.
39-45: Proper cleanup on image change.The effect correctly resets all OCR-related state when the image changes, preventing stale OCR data from being displayed on a new image. Previous reviewer's concern about missing
isOCRLoadingreset has been properly addressed.
47-75: Comprehensive scale tracking implementation.The effect handles multiple scenarios correctly:
- Initial render when image is already loaded (
img.complete)- Async image loading via the
loadevent- Container/image resize via
ResizeObserverThe cleanup properly removes listeners and disconnects the observer. The scale calculation is accurate for aligning OCR overlays with zoomed/panned images.
107-125: Keyboard handler correctly implements Ctrl+T toggle.The event handler properly prevents the default browser behavior and toggles OCR mode as expected. The cleanup ensures no memory leaks from event listeners.
130-150: Excellent UI affordance for OCR feature.The button provides good visual feedback with:
- Clear icon and label
- Active state styling with pulse animation
- Keyboard shortcut hint for discoverability
- Proper event propagation handling
This addresses the previous reviewer's request for a button alongside the keyboard shortcut.
152-167: Verify indicator positioning doesn't interfere with other UI elements.A previous reviewer noted that these indicators might cover buttons on the top right. While the indicators are positioned on the left (
left-6), on smaller viewports or with certain zoom levels, they could potentially overlap with other UI controls.Please test the following scenarios to ensure the indicators don't interfere with other UI elements:
- Small viewport sizes (mobile, tablet)
- Different zoom levels in the TransformWrapper
- Verify accessibility of any top-right buttons (e.g., close, menu) while indicators are visible
Consider adding responsive positioning or adjusting the z-index hierarchy if overlap occurs.
Based on learnings, previous reviewer flagged this as a positioning concern.
169-223: Well-integrated OCR overlay with existing zoom/pan functionality.The implementation correctly:
- Disables panning when OCR is active (line 176) to allow text selection
- Applies rotation at the container level (line 195) so overlay aligns with rotated images
- Passes accurate
imageScaletoTextOverlay(line 219) for proper overlay positioning- Maintains existing zoom/pan behavior when OCR is inactive
The conditional rendering ensures the overlay only appears when OCR data is available, and the overflow styling allows the overlay to be fully visible.
|
Hi @rohan-pandeyy I have gone through your comments and implemented the required changes. I'll ask @rahulharpal1603 to review it. Thanks. |
Fixes #675
Description
This PR introduces a powerful OCR text selection feature that allows users to extract and copy text from images directly within PictoPy. The implementation uses Tesseract.js on the frontend for a smooth, responsive experience without heavy backend dependencies.
Key Features
Ctrl+Tto activate text recognition on any imageCtrl+CTechnical Implementation
New Dependencies
[email protected]- Lightweight OCR libraryUser Experience
Ctrl+Tto activate OCR text selectionCtrl+Cto copy selected textCtrl+Tagain to exit selection modeNotes
Testing
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.