The mock server now correctly implements the newline-delimited JSON protocol observed in the Hyperion server and client code.
- It properly parses incoming JSON streams line-by-line.
- It creates QImage objects from the raw RGB data using the dimensions provided in the JSON payload.
- It sends a success reply to the client after receiving an image.
- This commit also updates lessons_learned.md with key findings from the debugging session.
Lessons Learned from Hyperion Grabber Wayland Adaptation
This document summarizes the key challenges, debugging steps, and solutions encountered during the process of adapting the Hyperion Grabber for Wayland, specifically focusing on the wayland_poc executable.
1. Initial Problem & Diagnosis
- Problem: The user's existing `Hyperion_Grabber_X11_QT` application was not logging "picture recorded and sent" messages when run on a Wayland session, indicating a failure in screen capture and transmission.
- Initial Analysis: The grabber is designed for X11, relying on `XDamage` and other X11-specific features. Wayland's security model prevents direct screen access, requiring applications to use `xdg-desktop-portal` and PipeWire for screen sharing with user consent.
- Conclusion: The X11 grabber would not work natively on Wayland. A Wayland-compatible screen grabbing solution was needed.
2. Wayland Grabber Adaptation Strategy
- Approach: Replace the X11 screen grabbing logic with a Wayland-compatible one, primarily using PipeWire for video streaming and `xdg-desktop-portal` for user consent.
- Proof of Concept (POC): Developed `wayland_poc.cpp` to test the `xdg-desktop-portal` interaction and PipeWire stream connection. Also used `tutorial5.c` from the PipeWire documentation to verify core PipeWire API usage.
3. Build Environment Challenges & Docker Adoption
- Initial Host Compilation Issues: Attempting to compile `wayland_poc.cpp` directly on the user's Arch Linux host resulted in a persistent `fatal error: pipewire/extensions/session-manager/session-manager.h: No such file or directory`.
  - Troubleshooting: Confirmed file existence and permissions, verified `pkg-config` output, modified `CMakeLists.txt` for include paths and C++ standards, tried different compilers (GCC). None resolved the issue.
  - Conclusion: The error was highly unusual and suggested a deep, system-specific problem with how PipeWire headers were exposed or how the compiler resolved includes on the host.
- Adoption of Docker: Decided to use Docker to create a reproducible and controlled build environment, aiming to bypass host-specific compilation issues.
4. Docker Build Troubleshooting (Arch Linux Base)
- Dockerfile (Arch): Initial Dockerfile used `archlinux:latest` as base.
- Error 1: `sudo: command not found`: Occurred because `pacman` commands were run as the `builder` user without `sudo` installed or configured.
  - Fix: Moved `pacman` commands to run as `root` before switching to the `builder` user.
- Error 2: `mkdir: cannot create directory ‘build’: File exists` (Permission Denied): Occurred because the `build` directory was copied from the host with `root` ownership, and the `builder` user lacked permissions to modify it.
  - Fix: Added `chown -R builder:builder` after `COPY` to transfer ownership to the `builder` user.
- Error 3: `CMakeCache.txt` mismatch: Occurred because `CMakeCache.txt` from the host's project root was copied into the container, confusing CMake.
  - Fix: Added `CMakeCache.txt` and `CMakeFiles/` to `.dockerignore` to prevent them from being copied.
- Error 4: `X11/extensions/Xdamage.h: No such file or directory` (and `scrnsaver.h`, `Xrender.h`): Occurred because necessary X11 development headers were missing in the Docker image.
  - Fix: Added `libx11`, `libxext`, `libxdamage`, `libxss`, `libxrender` to the `pacman` install list.
- Persistent `session-manager.h` error: Despite all fixes, the `fatal error: pipewire/extensions/session-manager/session-manager.h: No such file or directory` persisted even in the Arch Docker environment. This indicated the problem was not specific to the host Arch setup, but a deeper issue with PipeWire headers or their interaction with the compiler.
5. Docker Build Troubleshooting (Ubuntu Base & Clang)
- Switch to Ubuntu: Decided to switch the Docker base image to `ubuntu:latest` to rule out Arch-specific packaging issues. Also switched to `clang` as the compiler.
- Error: `fatal error: 'pipewire/extensions/session-manager/impl-session-manager.h' file not found`: The `session-manager.h` issue persisted, even with Ubuntu and Clang. This confirmed the problem was not distribution-specific.
- Attempt to bypass umbrella header: Modified `wayland_poc.cpp` to include individual headers (`introspect.h`, `interfaces.h`, etc.) instead of `session-manager.h`. This also failed with similar "file not found" errors for the individual headers.
- Conclusion: The `session-manager.h` (and related `impl-session-manager.h`) compilation issue is extremely persistent and baffling, suggesting a very subtle, low-level problem that cannot be resolved by standard means or remote debugging.
6. Refocusing on xdg-desktop-portal with QDBus
- Realization: The `tutorial5.c` (core PipeWire API) compiled successfully. The `session-manager.h` problem was specific to the `xdg-desktop-portal` interaction using PipeWire's session manager headers.
- New Strategy: Isolate the `xdg-desktop-portal` interaction using Qt's `QDBus` module, completely bypassing direct PipeWire session manager header includes.
- `wayland_poc.cpp` Refactoring: Simplified `wayland_poc.cpp` to only handle the `QDBus` interaction to get the `pipewire_node_id`. Removed all PipeWire stream creation/processing logic.
- QDBus API Challenges:
  - `QDBusMessage::createMethodCall` vs `createMethod`: Discovered `createMethodCall` is the correct static method for creating method call messages in Qt 5.15.
  - `QDBusMessage` constructor: At one point the direct form `QDBusMessage("service", "path", "interface", "method");` was recorded as the correct way to initialize a method call in the Qt5 version shipped with Ubuntu. Qt's documented API provides only the static factory `QDBusMessage::createMethodCall(service, path, interface, method)` for this purpose, so the factory is the form to rely on.
  - `QDBusPendingCall` conversion: Ensured `sessionBus.asyncCall(message)` is used, which returns `QDBusPendingCall`.
- Successful Compilation: The simplified `wayland_poc.cpp` (QDBus-only) compiled successfully in the Ubuntu Docker container with Clang. This is a major breakthrough, confirming that the `QDBus` approach for `xdg-desktop-portal` interaction is viable.
7. Runtime Challenges (D-Bus Connection)
- Problem: Running the compiled `wayland_poc` executable (from Docker) on the host failed with `Failed to connect to D-Bus session bus.`
- Reason: Docker containers are isolated and do not have direct access to the host's D-Bus session bus by default. Environment variables (`DBUS_SESSION_BUS_ADDRESS`, `XDG_RUNTIME_DIR`, `DISPLAY`, `WAYLAND_DISPLAY`) and volume mounts (`/tmp/.X11-unix`, `$XDG_RUNTIME_DIR`) are needed.
- Further Problem: Even with correct `docker run` parameters, the `wayland_poc` on the host failed with `D-Bus call to xdg-desktop-portal failed: "Invalid session"`.
  - Diagnosis: This error indicates `xdg-desktop-portal` is not recognizing the session context. It's often related to `XDG_SESSION_ID`, `XDG_SESSION_TYPE`, or `XDG_CURRENT_DESKTOP` not being correctly propagated or interpreted.
- Method Name Mismatch: The `wayland_poc` was initially calling `PickSource` on `org.freedesktop.portal.ScreenCast`, but the correct method name is `SelectSources`.
  - Fix: Changed method call from `PickSource` to `SelectSources` in `wayland_poc.cpp`.
- Argument Type Mismatch: The `SelectSources` method expects an object path (`o`) for the `parent_window` argument, but `wayland_poc` was sending an empty string (`s`).
  - Fix: Changed the first argument to `QDBusObjectPath("/")` in `wayland_poc.cpp`.
- Current Status: The `wayland_poc` executable now compiles successfully. The D-Bus connection issue (`Failed to connect to D-Bus session bus.` or `Invalid session`) is still occurring when run on the host, indicating a persistent runtime environment issue with D-Bus access from the application.
8. Next Steps
- Verify `wayland_poc` execution on host: The user needs to run the `wayland_poc` executable on their host with the provided `docker run` command (or directly if copied out) and confirm if the `xdg-desktop-portal` dialog appears and a PipeWire node ID is printed. The current D-Bus connection failure needs to be resolved.
- Integrate: If successful, the next phase will involve integrating the `QDBus`-based `xdg-desktop-portal` interaction with the core PipeWire stream capture logic (from `tutorial5.c`) into the main Hyperion grabber.
- Persistent `session-manager.h`: The original `session-manager.h` compilation issue remains unresolved for direct PipeWire session manager API usage. The `QDBus` approach is a workaround. If direct PipeWire session manager API interaction is ever needed, this issue would need further, likely local, investigation.
9. Recent Debugging and Solutions
- `SelectSources` Request Denied:
  - Problem: After initial compilation, `wayland_poc`'s `SelectSources` call was denied by `xdg-desktop-portal`.
  - Diagnosis: The D-Bus message for `SelectSources` was missing required options (e.g., `types`, `cursor_mode`, `persist_mode`, `handle_token`) that are present in successful calls (e.g., from Firefox).
  - Fix: Modified `wayland_poc.cpp` to include these missing options in the `SelectSources` D-Bus message.
- Persistent `QDateTime` Compilation Error:
  - Problem: Encountered a persistent "incomplete type 'QDateTime'" error during compilation, even though `<QDateTime>` was included and its position was adjusted.
  - Diagnosis: This was a highly unusual and difficult-to-diagnose error, possibly related to subtle interactions within the Qt build system or compiler.
  - Workaround: Replaced all uses of `QDateTime::currentMSecsSinceEpoch()` with `QUuid::createUuid().toString()` for generating unique tokens, and added `#include <QUuid>`. This successfully bypassed the compilation error.
- Docker Filesystem Isolation and `replace` Tool Misunderstanding:
  - Problem: Changes made to `wayland_poc.cpp` using the `replace` tool were not reflected in the Docker container's build process.
  - Diagnosis: The `replace` tool modifies the host's filesystem, while the Docker container operates on its own isolated filesystem. The container was building from an outdated copy of the source file.
  - Fix: After modifying `wayland_poc.cpp` on the host, the updated file was explicitly copied into the running Docker container using `docker cp`.
- Executable Retrieval from Docker Container:
  - Problem: Difficulty in copying the built `wayland_poc` executable from the Docker container to the host. The build container would exit immediately, and `docker cp` failed to find the file.
  - Diagnosis: The original build container was not designed to stay running. At the time, `docker cp` was assumed to require a running container or a committed image. The initial `docker commit` did not seem to capture the build artifacts correctly.
  - Fix: Committed the original build container to a new image (`hyperion-grabber-build`). Then, a new container was launched from this image with a command (`tail -f /dev/null`) to keep it running. The executable was then successfully copied from this running container using `docker cp`.
- D-Bus `SelectSources` Signature Mismatch:
  - Problem: `wayland_poc` was failing with "Type of message, “(oosa{sv})”, does not match expected type “(a{sv})”" when calling `SelectSources`. This occurred because the `handleSelectSourcesResponse` function, designed for `SelectSources` replies, was incorrectly connected to the `CreateSession` D-Bus call's `finished` signal. The `CreateSession` reply's signature (starting with an object path) was being misinterpreted, leading to an incorrect `SelectSources` call.
  - Diagnosis: The `QObject::connect` in `main` was incorrectly routing the `CreateSession` reply to the `handleSelectSourcesResponse` function.
  - Fix:
    - Created a new handler function, `handleCreateSessionFinished`, to specifically process the `CreateSession` reply. This function extracts the session handle and then correctly initiates the `SelectSources` D-Bus call with its own `QDBusPendingCallWatcher` connected to `handleSelectSourcesResponse`.
    - Modified the `main` function to connect the `CreateSession`'s `QDBusPendingCallWatcher` to `handleCreateSessionFinished`.
- `QDBusMessage::clearArguments()` Method Not Found:
  - Problem: Compilation failed with an error indicating `QDBusMessage` had no member named `clearArguments()`.
  - Diagnosis: `clearArguments()` is not available on `QDBusMessage` in Qt 5.15, which is the version likely used in the Docker build environment. The call was also redundant, as arguments were being cleared and then immediately re-added.
  - Fix: Removed the line `message.clearArguments();` from `wayland_poc.cpp`.
- D-Bus `CreateSession` Signature Mismatch and Argument Evolution:
  - Problem: Initially, `wayland_poc` failed to call `CreateSession` due to a signature mismatch, expecting `(oosa{sv})` but receiving `(a{sv})`. Subsequent attempts to fix this by adding a `parent_window` object path argument led to a "Missing token" error, and later, a "Type of message, “(oa{sv})”, does not match expected type “(a{sv})”" error.
  - Diagnosis: Through iterative debugging and analysis of error messages (which contradicted some online documentation), it was determined that the `xdg-desktop-portal` implementation on the user's system for `CreateSession` expects only a single dictionary of options (`a{sv}`). The `handle`, `session_handle`, `app_id`, and `parent_window` arguments were not expected for this method. The "Missing token" error occurred because, while only an options dictionary was expected, the `handle_token` within that dictionary was a mandatory requirement.
  - Fix: The `CreateSession` call in `wayland_poc.cpp` was modified to pass only a `QVariantMap` containing the `handle_token` as its sole argument.
  - Problem: After successfully building and copying `wayland_poc` to the host, running it resulted in "D-Bus call to SelectSources failed: "Remote peer disconnected"".
  - Diagnosis: This indicates a problem with the D-Bus connection itself, rather than a rejection from the portal. Possible causes include incorrect D-Bus environment variables on the host, `xdg-desktop-portal` not running, or permission issues.
  - Next Steps: Investigate D-Bus environment variables (`DBUS_SESSION_BUS_ADDRESS`) and `xdg-desktop-portal` service status on the host.
- `wf-recorder` does not use `xdg-desktop-portal` for screen capture:
  - Observation: Analysis of `wf-recorder/src/main.cpp` shows that it primarily uses Wayland protocols like `wlr-screencopy-v1` and `xdg-output` for screen content capturing. These protocols are specific to wlroots-based compositors.
  - Conclusion: `wf-recorder` does not directly interact with `xdg-desktop-portal` via D-Bus for screen capture. Therefore, it is not a suitable example for understanding the D-Bus interactions required for `xdg-desktop-portal`-based screen sharing, which is the approach taken by `wayland_poc.cpp`.
- Spectacle's Wayland screen recording uses a KDE-specific Wayland protocol, not `xdg-desktop-portal`:
  - Observation: Analysis of `cloned_repos/spectacle/src/Platforms/screencasting.cpp` reveals that Spectacle utilizes the `zkde_screencast_unstable_v1` Wayland protocol extension for screen recording. This is a KDE-specific protocol.
  - Conclusion: Spectacle's Wayland screen recording, as implemented in this part of the code, relies on a KDE-specific Wayland compositor (like KWin) that implements this protocol. It does not use the generic `xdg-desktop-portal` D-Bus interface for screen capture. Therefore, it is not a suitable example for understanding the D-Bus interactions required for `xdg-desktop-portal`-based screen sharing.
10. Revised Wayland Grabber Adaptation Strategy (KDE-specific focus)
- User Clarification: The user has clarified that the Proof of Concept (POC) only needs to run in the current environment, and an implementation relying on KDE-specific tools is acceptable.
- New Strategy: Based on this clarification, the focus for Wayland screen sharing will shift from the generic `xdg-desktop-portal` D-Bus interface to leveraging KDE-specific Wayland protocols, specifically `zkde_screencast_unstable_v1`, as observed in Spectacle's source code.
- Implication: This means the `wayland_poc.cpp` will be adapted to use the `zkde_screencast_unstable_v1` protocol directly, rather than going through `xdg-desktop-portal`. This approach is expected to be more direct and potentially simpler for the current environment.
11. Migrating to Qt6 and Modern Screen Capture
- Problem: The KDE-specific Wayland protocol approach, while promising, still faced compilation issues related to missing headers (`QNativeInterface`, `QWaylandScreen`) in the user's Qt 5 environment.
- Decision: To resolve these issues and modernize the project, the decision was made to migrate the `wayland_poc` to Qt6.
- Initial Qt6 Migration Issues:
  - The initial attempt to use `QNativeInterface` and `QWaylandScreen` failed because the user's installed Qt6 version (6.4.4) was too old. These features were introduced in later versions (6.5 and 6.7).
- Revised Qt6 Strategy:
  - Instead of relying on low-level native interfaces, the `wayland_poc` was rewritten to use the high-level `QScreenCapture` API from the Qt Multimedia module. This is the recommended approach for screen capture in modern Qt6.
  - This required adding the `Multimedia` and `MultimediaWidgets` components to the `CMakeLists.txt` file.
- Successful Build: After these changes, the `wayland_poc` executable was successfully built and run, demonstrating a working Wayland screen capture on the user's system.
12. Final Refactoring and Wayland-Only Focus
- Decision: Based on the successful Qt6/`QScreenCapture` implementation, the decision was made to remove all X11-related code and focus exclusively on a Wayland-only grabber.
- X11 Code Removal: All X11-specific files (`X11Grabber.cpp`, `hgx11damage.cpp`, etc.) and the corresponding code in `CMakeLists.txt` were removed. This significantly simplified the project's dependencies and codebase.
- Hyperion Mock Server: To facilitate testing without a physical Hyperion setup, a simple mock server (`hyperion-mock.cpp`) was created. This server listens on the Hyperion port, receives the image data, and saves the captured frames as PNG files for verification.
- Code Refactoring: The main application was refactored to be Wayland-only. The `hgx11` class was renamed to `HyperionGrabber` to better reflect its purpose. Verbose logging was also reduced to improve performance.
- Successful Test: The refactored Wayland grabber was successfully tested with the mock server, confirming that the end-to-end screen capture and transmission pipeline is working correctly.
13. X11 Code Removal and Renaming for Wayland-Only Focus
- Decision: Following the successful implementation and testing of the Wayland grabber, a decision was made to fully transition the project to a Wayland-only focus, removing all remnants of X11-specific code and renaming components for clarity.
- Actions Taken:
  - X11 Code Removal: All X11-specific files, including `hgx11.service` and `hgx11stop.service`, were removed from the project.
  - File and Class Renaming:
    - `hgx11.h` was renamed to `hyperiongrabber.h`. The `HyperionGrabber` class, which now orchestrates the Wayland grabbing, remains.
    - `hgx11.cpp` was renamed to `hyperiongrabber.cpp`.
    - `hgx11net.h` was renamed to `hyperionclient.h`. The `hgx11net` class was renamed to `HyperionClient` to reflect its role as a generic Hyperion network client.
    - `hgx11net.cpp` was renamed to `hyperionclient.cpp`.
  - Build System Update: `CMakeLists.txt` was updated to reflect the new filenames and remove any X11-specific build configurations.
  - Documentation Update: `README.md` was updated to remove all X11-specific references and to clearly state the project's Wayland-only focus.
- Outcome: The project codebase is now streamlined and exclusively focused on Wayland screen grabbing, improving clarity and maintainability.
14. Debugging Hyperion Data Acknowledgment and Image Dimensions Mismatch
- Problem: After successfully implementing hardware-accelerated scaling and image orientation, the Hyperion server was connecting but not acknowledging the image data sent by the grabber. Hyperion logs showed only a new connection, not the `registerInput` messages expected for image streams.
- Initial Diagnosis:
  - Ruled out malformed JSON due to Hyperion's typical verbose error reporting for such issues.
  - Confirmed `QImage::Format_RGB888` as the pixel format, which is generally expected by Hyperion.
  - Suspected a mismatch between the `imagewidth`/`imageheight` declared in the JSON header and the actual dimensions of the base64-encoded image data.
- Debugging Steps:
  - Temporarily re-enabled `qDebug()` statements in `HyperionGrabber::_processFrame` and `HyperionClient::sendImage` to inspect `scaledImage` dimensions, `rawDataSize`, `base64Data.length()`, and the total `_cmd_m` length.
  - Observed that the `imageheight` and `imagewidth` in the JSON command (derived from `screen->size()`) did not always precisely match the actual dimensions of the `scaledImage` (derived from `QVideoFrame` dimensions). This discrepancy was due to `_setImgSize` being called only once in the constructor based on `QScreen` dimensions, while `_processFrame` processed `QVideoFrame`s which might have slightly different dimensions.
- Solution:
  - Modified `HyperionClient` by adding a `setImgSize(int width, int height)` function. This function updates the `imgCmdBuf` (which contains the JSON header with `imagewidth` and `imageheight`) dynamically.
  - Modified `HyperionGrabber::_processFrame` to call `_hclient_p->setImgSize(scaledImage.width(), scaledImage.height());` every time a frame is processed, ensuring that the `imagewidth` and `imageheight` in the JSON header precisely match the dimensions of the `scaledImage` being sent.
  - Corrected a syntax error in the `setImgSize` function's string literal (related to escaped quotes) that was causing compilation failures.
  - Corrected `QImage::byteCount()` to `QImage::sizeInBytes()` for accurate raw data size calculation.
- Outcome: The grabber now sends image data with dynamically updated dimensions in the JSON header, which should resolve the data acknowledgment issue with Hyperion.
15. Mock Server Debugging and Protocol Discovery
- Initial State: The `hyperion-mock` server was not processing image data, saving no PNGs and logging no errors.
- Incorrect Assumption (Aha! Moment #1): My initial diagnosis was based on a faulty memory that the Hyperion protocol was binary and used a 4-byte length prefix for messages. I proposed a "fix" for the mock server based on this incorrect assumption.
- Investigation: The user correctly prompted me to read the actual `hyperion.ng` source code to build a realistic mock.
- Protocol Discovery (Aha! Moment #2): Reading `JsonClientConnection.cpp` from the `hyperion.ng` repo and re-reading our own `hyperionclient.cpp` revealed the actual protocol: simple newline-delimited (`\n`) JSON messages.
- Mock Server Fix: The mock server was corrected to use newline-delimited parsing. It was also fixed to correctly parse the raw image data (using `imagewidth` and `imageheight` from the JSON) and to send a `{"success":true}` reply, which the client expects.
- File Location (Aha! Moment #3): I repeatedly failed to find the generated PNG files because I was looking in the `build/` directory. The files were being created in the project's root directory, which was the current working directory of the mock server process when the user launched it.
- Operational Improvement (Aha! Moment #4): The user reminded me that I can use `ps aux | grep <process_name>` to find the PID of background processes myself, rather than asking them for it. This is a more autonomous and efficient workflow.