
feat: Add floating recorder overlay#151

Open: lrq3000 wants to merge 14 commits into Dimowner:master from lrq3000:pr-merge


@lrq3000 lrq3000 commented Jun 18, 2026


This PR adds an optional floating recorder overlay, so that recordings can be started and stopped (with saving) while using other apps, for example overlaid on a GPS map or a book reader.

This overlay is disabled by default.

When enabled, it follows the existing setting for whether to ask to rename after stopping a recording. If that setting is on, another transparent overlay appears to rename the file; this rename dialog overlay can also be moved, and its position is remembered between instantiations.

On tapping, files are saved anyway (so if you tap in quick succession without renaming, the file is still saved with the default name).

When tapping to stop a recording, the overlay displays a color wheel animation for about 3 seconds to signal clearly that the file was saved, so this can be confirmed at a glance without having to study the screen.
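As a rough illustration of the save feedback described above (a later commit message calls it a "rainbow-to-grey" animation lasting three seconds), here is a minimal sketch of mapping elapsed time to a color. The class name, the mid-grey end color, and the use of `java.awt.Color` are assumptions made so the example runs on a plain JVM; the PR's actual code may differ, and on Android something like `android.graphics.Color.HSVToColor` would be the natural counterpart.

```java
import java.awt.Color;

/** Hypothetical helper (not from the PR): maps elapsed time to a rainbow-to-grey color. */
final class SaveFeedbackColorSketch {
    /** "About 3 seconds" per the PR description. */
    static final long DURATION_MS = 3_000L;

    private SaveFeedbackColorSketch() {}

    /** Returns an ARGB color for the elapsed time, clamped to the animation window. */
    static int colorAt(long elapsedMs) {
        float progress = Math.max(0f, Math.min(1f, elapsedMs / (float) DURATION_MS));
        float hue = progress;                    // one full turn around the color wheel
        float saturation = 1f - progress;        // fade the rainbow toward grey
        float brightness = 1f - 0.5f * progress; // assumed: settle on a mid grey
        return Color.HSBtoRGB(hue, saturation, brightness);
    }
}
```

An animation loop would call `colorAt` with the time since the stop tap on every frame and tint the button with the result; once the duration has passed, the value stays at the final grey.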

This PR was AI-assisted with OpenCode using ChatGPT-5.5 (High reasoning) and with custom agentic coding instructions (minimize changes, literal programming, avoid redundant and bad patterns, etc).

It's a feature I have wanted for a long time, and I could finally make it using the latest models, although it burned through a lot of credits! Hopefully it can be useful to others.

Thank you so much for this incredible audio recording app. I still use it all the time for many purposes, but especially to record my own thoughts and convert them later using VibeTranscribe.

AI-generated summary of the code changes

User-Facing Changes

  • Added an optional floating recorder overlay so users can start and stop recordings from outside the app after opening the app once.
  • Kept the overlay opt-in: it does not auto-start on boot and remains disabled unless enabled by the user.
  • Added draggable floating recorder controls designed for safer quick access over other apps such as navigation apps.
  • Added pinch-to-resize support for the floating recorder button, with the chosen size persisted across sessions.
  • Added post-recording rename support directly inside the floating overlay when the existing “ask to rename after recording” setting is enabled.
  • Added speech input for the overlay rename flow, with append and replace modes.
  • Switched rename speech input to Android’s standard speech-recognition intent flow, avoiding fragile service-level recognizer lifecycle handling.
  • Polished the overlay rename actions with a prominent green Save button and a lightweight Reset action.
  • Prevented the soft keyboard from opening automatically when the overlay rename panel appears.
  • Kept Reset from focusing the text field or opening the keyboard.
  • Fixed duplicate rename prompts: recordings stopped from the in-app UI show only the in-app rename dialog, while recordings stopped from the floating overlay show only the floating overlay rename panel.
  • Fixed overlay rename text theming/readability.

Code Changes

  • Added FloatingRecorderOverlayService to render and manage the floating recorder overlay window.
  • Wired the overlay to AudioRecordingService so it can start and stop recordings while staying synchronized with recording state.
  • Added overlay geometry/state helpers for drag bounds, positioning, resize behavior, no-keyboard rename behavior, reset behavior, and speech filename normalization.
  • Added persisted overlay preferences through PrefsV2 / PrefsV2Impl, including overlay size and rename speech mode.
  • Added RenameSpeechMode to model append versus replace behavior for speech-based rename input.
  • Added FloatingRenameSpeechRecognitionActivity as a transparent proxy activity for RecognizerIntent.ACTION_RECOGNIZE_SPEECH.
  • Updated AndroidManifest.xml with overlay service/activity declarations and speech recognition query support.
  • Added resources for overlay rename UI labels and icons, including mic and check-circle drawables.
  • Updated AudioRecordingService to track whether a recording was started from the overlay and whether the stop request came from the overlay.
  • Added RecordingStoppedRenamePolicy to centralize post-stop rename surface selection.
  • Updated HomeViewModel to use the shared rename policy so in-app rename dialogs are suppressed for overlay stop events.
  • Updated FloatingRecorderOverlayService to use the shared rename policy so overlay rename panels are shown only for overlay stop events.
  • Added unit coverage for overlay geometry, persisted preferences, speech rename helpers, recognizer configuration, and source-aware rename routing.

Implementation Motivation

  • The overlay is implemented as a foreground-aware Android service because it must remain available above other apps while recording state lives in the existing recording service.
  • Overlay behavior is opt-in and not boot-started to avoid surprising background behavior and unnecessary privacy risk.
  • Pinch resize is persisted through preferences so the overlay adapts to user ergonomics without adding complex UI settings.
  • Rename speech input uses RecognizerIntent through a transparent activity because Android’s speech UI is better handled by the platform activity contract than by a long-lived, service-level SpeechRecognizer. AudioRecorder therefore does not have to manage the technical intricacies of speech recognition: it delegates to another app and simply gets the text transcription back.
  • Rename UI avoids auto-focusing the text field because the overlay is intended for quick use and the soft keyboard would obscure context, especially over navigation apps.
  • The source-aware rename policy exists because recording start source is not enough: a recording can be started in one place and stopped in another. Rename UI must depend on where the stop happened.
  • The shared policy keeps HomeViewModel and FloatingRecorderOverlayService consistent, preventing duplicate rename surfaces and making the expected behavior testable.
  • Regression tests cover the routing matrix so future changes preserve these guarantees: rename disabled shows no prompt, invalid records show no prompt, in-app stops show in-app rename only, and overlay stops show overlay rename only.
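The routing matrix in the last bullet can be sketched as a small pure function. This is only an illustration under assumptions: the class, enum, and method names below are hypothetical and are not taken from the PR's `RecordingStoppedRenamePolicy`.

```java
/** Hypothetical sketch of a source-aware rename policy (names are illustrative). */
final class RenamePolicySketch {
    /** Where the stop request came from. */
    enum StopSource { IN_APP, OVERLAY }

    /** Which rename surface, if any, should be shown after a recording stops. */
    enum RenameSurface { NONE, IN_APP_DIALOG, OVERLAY_PANEL }

    private RenamePolicySketch() {}

    static RenameSurface surfaceFor(boolean askToRename, boolean recordValid, StopSource stopSource) {
        // Rename disabled or an invalid record: never prompt.
        if (!askToRename || !recordValid) {
            return RenameSurface.NONE;
        }
        // Otherwise exactly one surface is chosen, based on where the stop happened.
        return stopSource == StopSource.OVERLAY
                ? RenameSurface.OVERLAY_PANEL
                : RenameSurface.IN_APP_DIALOG;
    }
}
```

With a single decision point like this, the in-app view model and the overlay service can each ask whether their own surface was selected, which is one way to avoid showing two prompts for the same stop event.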

Unit tests run

  • ./gradlew testDebugConfigDebugUnitTest
  • ./gradlew assembleDebugConfigDebug

All tests passed.

lrq3000 added 2 commits June 18, 2026 23:42
Add a default-off V2 floating recorder overlay that starts after the app has opened, controls recording through the existing recording service, and supports overlay-based rename flow when the rename-after-recording setting is enabled.

Persist overlay and rename dialog positions, localize all overlay strings, and add disc-style save feedback with a continuous three-second rainbow-to-grey confirmation animation.

Add focused tests for overlay permissions, settings decisions, geometry, save feedback colors, and preference persistence.

Agentic harness: OpenCode with OpenAI GPT-5.5 (openai/gpt-5.5).

Style the floating rename overlay from the V2 dark-theme preference so dark mode uses a dark panel with white bold filename text, and light mode uses a white panel with black bold filename text.

Add focused tests for rename overlay dark and light style selection.

Agentic harness: OpenCode with OpenAI GPT-5.5 (openai/gpt-5.5).
@lrq3000

lrq3000 commented Jun 18, 2026

Author

I personally tested all the changes, and everything works for me. I will be using it extensively, so I will see if there are any rough edges, but I think I have already done most of the polishing (the first commit is in fact a squash merge of 5 or 6 different commits done locally).

Let me know if you have any feedback, I'll try to update asap!

lrq3000 added 3 commits June 19, 2026 19:51
Persist the floating recorder overlay diameter and restore it with display-aware clamping so the button keeps a user-selected size across app restarts.

Add tested geometry helpers for size bounds, saved-size clamping, and proportional record-disc scaling, then wire two-finger pinch handling into the overlay service while suppressing tap and drag during resize.

Agentic harness: OpenCode with OpenAI gpt-5.5.
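The size clamping and proportional scaling described in this commit message could look roughly like the sketch below. All constants (minimum diameter, maximum screen fraction, disc ratio) and all names are assumptions chosen for illustration, not values from the PR.

```java
/** Hypothetical geometry helpers for a resizable overlay button (values are assumed). */
final class OverlayGeometrySketch {
    static final int MIN_DIAMETER_PX = 96;          // assumed lower bound
    static final float MAX_SCREEN_FRACTION = 0.5f;  // assumed: at most half the shorter side
    static final float DISC_RATIO = 0.6f;           // assumed record-disc to button ratio

    private OverlayGeometrySketch() {}

    /** Clamps a requested or saved diameter to bounds derived from the current display. */
    static int clampDiameter(int requestedPx, int displayWidthPx, int displayHeightPx) {
        int shortestSide = Math.min(displayWidthPx, displayHeightPx);
        int maxPx = Math.max(MIN_DIAMETER_PX, Math.round(shortestSide * MAX_SCREEN_FRACTION));
        return Math.max(MIN_DIAMETER_PX, Math.min(maxPx, requestedPx));
    }

    /** Scales the diameter by the ratio of the current to the initial two-finger span. */
    static int pinchedDiameter(int startDiameterPx, float startSpanPx, float currentSpanPx,
                               int displayWidthPx, int displayHeightPx) {
        if (startSpanPx <= 0f) {
            // Degenerate gesture: keep the starting size, still clamped.
            return clampDiameter(startDiameterPx, displayWidthPx, displayHeightPx);
        }
        int scaled = Math.round(startDiameterPx * (currentSpanPx / startSpanPx));
        return clampDiameter(scaled, displayWidthPx, displayHeightPx);
    }

    /** Keeps the inner record disc proportional to the button. */
    static int discDiameter(int buttonDiameterPx) {
        return Math.round(buttonDiameterPx * DISC_RATIO);
    }
}
```

Re-clamping a saved size against the current display on restore means a value persisted on one screen configuration cannot produce an oversized or vanishing button on another.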
Add system SpeechRecognizer dictation to the floating rename overlay, with a large mic button that shows the persisted append/replace mode and a long-press popup to change it.

Persist the rename speech mode, sanitize and normalize recognized text, and cap the visible filename to 251 characters so the hidden extension stays within the filename budget.

Add focused tests for filename composition and preference persistence, plus Android package visibility and mic resources for recognition providers such as FUTO Voice Input when exposed as a RecognitionService.

Agentic harness: OpenCode with OpenAI gpt-5.5.
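To make the append/replace behavior and the 251-character cap concrete, here is a minimal sketch. The names are hypothetical, and the whitespace normalization is an assumed stand-in for the PR's own sanitizing. The cap is taken from the commit message; presumably it leaves room for a short extension within a 255-character filename limit, but that reading is an assumption.

```java
/** Hypothetical sketch of composing a filename draft from recognized speech. */
final class RenameSpeechSketch {
    enum Mode { APPEND, REPLACE }

    /** Visible-name cap from the commit message; the extension is stored separately. */
    static final int MAX_VISIBLE_NAME_CHARS = 251;

    private RenameSpeechSketch() {}

    static String compose(String currentName, String recognizedText, Mode mode) {
        // Assumed normalization: trim and collapse runs of whitespace.
        String speech = recognizedText.trim().replaceAll("\\s+", " ");
        if (speech.isEmpty()) {
            return currentName; // nothing recognized: leave the draft untouched
        }
        String combined = (mode == Mode.REPLACE || currentName.isEmpty())
                ? speech
                : currentName + " " + speech;
        if (combined.length() <= MAX_VISIBLE_NAME_CHARS) {
            return combined;
        }
        int end = MAX_VISIBLE_NAME_CHARS;
        // Avoid cutting a surrogate pair in half at the truncation point.
        if (Character.isHighSurrogate(combined.charAt(end - 1))) {
            end--;
        }
        return combined.substring(0, end);
    }
}
```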
Replace the embedded SpeechRecognizer backend with a transparent RecognizerIntent proxy Activity so activity-based providers such as FUTO Voice Input can handle rename dictation like Chromium.

Keep the existing large mic button, persisted long-press append/replace mode, and filename truncation while delegating recognition UI, permissions, and lifecycle to the recognizer app instead of managing low-level callbacks.

Agentic harness: OpenCode with OpenAI gpt-5.5.
@lrq3000

lrq3000 commented Jun 19, 2026

Author

I added a few more quality-of-life features, such as being able to resize the overlay record button, and a big microphone button to rename files by voice. It uses the system-defined voice transcriber; an open-source one that is multilingual and works well for this purpose, even without Google Play Services, is FUTO Voice Input using the Whisper models.

I think I won't add any more features, because I am pretty satisfied with how this turned out.

But I will test the feature extensively in the coming days, so maybe I will go back on my word ;-)

@lrq3000

lrq3000 commented Jun 19, 2026

Author

Ah, I just saw #152 where you add the possibility to add notes in audio files, that is an amazing coincidence! I thought about doing this too, to give the voice transcription more leeway rather than adding it to the filename with its limited characters. But I did not want to introduce too many changes in the core with my PR; I conceived it more as a companion set of functions plugged over the core to provide overlay functionality. That's why I tried to separate the overlay features into distinct files as much as possible (but it's not possible for everything, because we have to set permissions, add a new option in the settings, translate the displayed text, etc).

So once you merge #152, if you agree that's a better idea, I can make the voice transcription save the result as a comment inside the audio file instead of renaming the file (or allow the user to choose).

Make the floating rename overlay save action more prominent and turn the previous default-name shortcut into a non-saving reset action.

The reset action now restores the original record name, clears inline feedback, and leaves the cursor at the end so mistakes can be corrected without closing the overlay.

Agentic harness: OpenCode with OpenAI gpt-5.5.
@lrq3000

lrq3000 commented Jun 19, 2026

Author

Here is a video of the feature (sorry for the French in the settings menu):

SVID_20260619_203305_1-high_3.mp4

And here is a debug build for those who want to try it already (I provide no guarantee whatsoever, only that there are no viruses to my knowledge, that my account is associated with my real identity, and that I have worked in OSS for a long time): https://github.com/lrq3000/AudioRecorder/releases/tag/v0.0.2

@lrq3000

lrq3000 commented Jun 19, 2026

Author

Ah, and I forgot to say that if the option to ask for filenames after recording is disabled in the settings, the record button overlay also won't ask to rename the file. So if you want the most minimalistic setup, you get an overlay button that you tap once to record and once more to stop recording (and you can always rename later).

lrq3000 changed the title from "Add floating recorder overlay" to "feat: Add floating recorder overlay" Jun 19, 2026

Avoid focusing the rename text field or opening the soft keyboard when the floating rename overlay appears.

Keep speech append and replace working by updating the filename field directly, and keep reset from focusing the text field so GPS/navigation views remain unobstructed.

Agentic harness: OpenCode with OpenAI gpt-5.5.
@Dimowner

Owner

Thanks for adding the floating overlay. I'll review and test it later. I don't mind adding this feature.

@Dimowner

Owner

> Ah I just saw #152 where you add the possibility to add notes in audio files, that is an amazing coincidence! I thought about doing this too to allow for the voice transcription to have more leeway rather than being added to the filename with limited characters. But I did not want to introduce too many changes in the core with my PR, I conceived it more like a companion set of functions that are plugged over the core to provide overlay functionalities. So that's why I tried to separate the overlay features into distinct files as much as possible (but it's not possible for everything because we have to set permissions, a new option in the settings, translations of the displayed text, etc).
>
> So once you merge #152, if you agree that's a better idea, I can make the voice transcription save the result as a comment inside the audio file instead of renaming the file (or allow the user to choose).

I was thinking about the voice transcription feature, but it looks complicated to me. Implementing voice recognition might require using an API (which I don't want to do for now) or a library that might significantly increase the app size. Anyway, if you feel that you can implement the Voice Transcription feature, feel free to create a PR. I believe it is possible to combine the Record Notes feature and Voice Transcription.

I tested Google's Recorder app with the voice transcription feature; it works, but with very few languages🤷‍♂️

@lrq3000

lrq3000 commented Jun 20, 2026

Author

For the voice transcription, don't worry, I was mindful of this. It delegates to the app set as the default voice transcriber by calling a RecognizerIntent, just like Chromium for Android does when you tap the microphone icon in the URL bar. AudioRecorder does not handle anything about speech recognition itself; it's just an Intent call.

I tested it extensively today while driving and it worked wonderfully well; the voice transcription was particularly helpful for tagging the filenames so they are easy to find later.

I just noticed a minor issue when stopping a recording in-app with the overlay enabled (it spawns both the in-app rename dialog and the overlay dialog). I will fix this tomorrow when I have access to my computer.

Most of the code is there to add the overlay, manage asking for the permission to display it, and guide the user to where the permission is granted in the Android settings.

@lrq3000

lrq3000 commented Jun 20, 2026

Author

Btw I used FUTO Voice Input. It works for all languages supported by Whisper, so that's a lot; I personally tested English and French, and even combining both, and it works very well. Just make sure to go into the settings to download the biggest, multilingual "slow" model.

I have been using Whisper-based transcription since it came out. It is still not perfect, but it is accurate much more often than it is wrong.

A big common factor that drastically reduces the transcription accuracy of Whisper is background noise. There are new models that work incredibly well despite heavy background noise, but to my knowledge they have not been made into an easy Android app yet. Meanwhile, I have found that bringing the phone microphone closer to the mouth and articulating well results in a big accuracy jump. I did specific tests in front of various rollercoasters and, to my surprise, got about 90% accuracy even with complex jargon words from my scientific discipline!

Anyway, all that is to say that Whisper multilingual via FUTO Voice Input already works well enough to be perfectly usable for practical everyday voice transcription imho, even though it runs locally. I use it every day to transcribe notes because I don't have Google Play Services on my old Huawei phone. It's just a bit slow, so it's good for a paragraph but not for a 10-page brainstorming session, which is what I use AudioRecorder for: I record and transcribe offline later :-)

lrq3000 and others added 7 commits June 22, 2026 01:54
Prevent the floating rename overlay from opening after recordings stopped from the in-app record controls.

Track whether the current stop request came from the overlay, and route post-recording rename UI through a shared policy so in-app stops show the app dialog while overlay stops show the overlay dialog.

Agentic harness: OpenCode with OpenAI gpt-5.5.

Add a third floating rename mic mode that appends recognized speech to the record description as a new line while leaving the rename dialog open.

Persist the new mode through the existing rename speech mode preference and save notes through updateRecordDescription so the existing saveDescriptionToFile setting controls COMMENT tag embedding.

Agentic harness: OpenCode with OpenAI gpt-5.5.

Add a compact one-line visible description field to the floating rename overlay, update it when speech is captured in description mode, and save pending filename and description changes together.

Rename speech mode labels so filename and description targets are explicit in the mic long-press menu.

Agentic harness: OpenCode with OpenAI gpt-5.5.

Use an overlay-specific short description hint and clear the default EditText minimum height, vertical padding, font padding, and scrollbar so the description draft field stays one visible line high.

Add localized short hint strings for all supported resource folders and cover the compact field configuration with a focused unit test.

Agentic harness: OpenCode with OpenAI gpt-5.5.

Speech transcripts used for overlay filename append or replace could still contain characters that are invalid on common desktop filesystems, causing non-portable generated filenames.

Sanitize filename speech against cross-platform reserved characters, trailing Windows-invalid dots or spaces, and reserved Windows device names while preserving punctuation for audio-note transcripts.

Agentic harness: OpenCode, OpenAI GPT-5.5.

Manual in-app and floating-overlay rename saves could still pass unsafe filename characters through different paths, even after speech input was sanitized.

Add a shared V2 filename cleaner and call it from overlay save, overlay speech filename drafts, home active-record rename, and records-list rename so invalid characters are silently removed at save time.

Agentic harness: OpenCode, OpenAI GPT-5.5.
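The cross-platform cleaning described in the last two commit messages could be sketched as follows. This is an illustration only: the class name is hypothetical, the reserved-character set is the usual Windows one, and prefixing reserved device names with an underscore is an assumed policy (the commit messages only say invalid characters are silently removed).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of a shared filename cleaner (not the PR's implementation). */
final class FilenameCleanerSketch {
    /** Device names that Windows reserves regardless of extension. */
    private static final Set<String> WINDOWS_DEVICE_NAMES = new HashSet<>(Arrays.asList(
            "CON", "PRN", "AUX", "NUL",
            "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
            "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"));

    private FilenameCleanerSketch() {}

    static String clean(String name) {
        // Remove characters reserved on Windows, plus ASCII control characters.
        String cleaned = name.replaceAll("[<>:\"/\\\\|?*\\x00-\\x1F]", "");
        // Windows rejects names that end with a dot or a space.
        cleaned = cleaned.replaceAll("[. ]+$", "");
        // Compare the part before the first dot against the reserved device names.
        int dot = cleaned.indexOf('.');
        String stem = (dot >= 0 ? cleaned.substring(0, dot) : cleaned)
                .trim().toUpperCase(Locale.ROOT);
        if (WINDOWS_DEVICE_NAMES.contains(stem)) {
            cleaned = "_" + cleaned; // assumed policy: prefix rather than reject
        }
        return cleaned;
    }
}
```

Calling one cleaner from every save path (overlay save, speech drafts, in-app rename) means manual and dictated names go through the same rules. Callers would still need to handle a result that ends up empty, for example by falling back to the default name.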
@lrq3000

lrq3000 commented Jun 24, 2026

Author

OK, I think this PR is now complete. I added support for entering a description by voice in the overlay instead of renaming the file (either option can be selected by holding the microphone button), and I added filename sanitization to ensure filenames are compatible across OSes.

Please let me know if you would like me to implement anything else @Dimowner, and thank you once again for this amazing piece of software you have made and maintained!

Latest APK: https://github.com/lrq3000/AudioRecorder/releases/tag/v0.0.3

