Packet timestamp support #3

Open: trondn wants to merge 2 commits into couchbasedeps:release-2.1.11-stable-couchbase from trondn:packet_timestamp_support

trondn commented Jul 1, 2026

Add support for bufferevents to register for receiving the network packet timestamp (on platforms supporting it).

When enabled, it uses recvmsg() to receive the full "packet" from the kernel into an evbuffer and stores the timestamp alongside it. Because the memcached protocol doesn't necessarily map 1:1 to network packets, evbuffers will be squashed together; in such cases the oldest timestamp is used to represent the MCBP packet.
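
The "oldest timestamp wins" rule can be illustrated with a minimal sketch. The `ts_chunk` struct and both function names below are hypothetical stand-ins invented for illustration; they are not the types or functions this PR adds to libevent.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for one received chunk; not libevent's evbuffer_chain. */
struct ts_chunk {
    struct timespec ts; /* kernel receive time of this chunk */
    int ts_valid;       /* non-zero when ts was actually filled in */
};

/* Return non-zero when a is strictly earlier than b. */
static int timespec_before(const struct timespec *a, const struct timespec *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec < b->tv_sec;
    return a->tv_nsec < b->tv_nsec;
}

/* Store the earliest valid timestamp in *out; return 1 if one was found, else 0. */
static int oldest_timestamp(const struct ts_chunk *chunks, size_t n,
                            struct timespec *out)
{
    int found = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!chunks[i].ts_valid)
            continue; /* chunk carried no timestamp; skip it */
        if (!found || timespec_before(&chunks[i].ts, out)) {
            *out = chunks[i].ts;
            found = 1;
        }
    }
    return found;
}
```

Chunks without a valid timestamp are skipped rather than treated as time zero, so a missing timestamp never masquerades as the oldest one.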

trondn added 2 commits June 9, 2026 16:13
Under OpenSSL 3.0+, unexpected socket closures before a clean SSL/TLS
shutdown alert is exchanged are classified as protocol errors rather than
clean EOFs. This behavior change causes various regression tests to fail.

To resolve these compatibility issues:
1. bufferevent_openssl.c: Identify the SSL_R_UNEXPECTED_EOF_WHILE_READING
   protocol error and treat it as a dirty shutdown (clean TCP closure) to
   ensure backward compatibility.
2. test/regress_http.c (https_bev): Enable 'allow_dirty_shutdown = 1'
   on server bufferevents inside the mock HTTPS server to cleanly handle
   abrupt client socket closures.
3. test/regress_http.c (http_incomplete_errorcb): Recognize SSL protocol
   errors arising from raw socket shutdowns on OpenSSL 3.0 as successful
   terminations during the incomplete HTTP request test.

Implement kernel-measured socket receive timestamps with up to nanosecond
precision on supported platforms via the recvmsg() syscall. Each
recvmsg() call writes into a freshly allocated evbuffer chain so that
every call gets an independent timestamp; timestamps are never silently
discarded due to chain reuse.

Infrastructure changes:
- evbuffer-internal.h: Add timestamp storage and validity flag
  to evbuffer_chain structure.
- evbuffer_read_with_timestamp(): New internal function for reading
  data via recvmsg(), parsing control messages for timestamps, and
  safeguarding against MSG_CTRUNC truncation. Always allocates a fresh
  chain per call so per-call timestamps are preserved independently.
- bufferevent-internal.h: Add recv_timestamps_enabled flag to the
  bufferevent_private structure.
- bufferevent_sock.c: Enable receive timestamps and swap standard
  reads with evbuffer_read_with_timestamp() when requested.
- bufferevent_openssl.c: Add a custom socket BIO to extract kernel
  timestamps during direct socket reads, and propagate timestamps
  from underlying bufferevents in filtered TLS mode.
- Build system: Add CMake and Autotools socket timestamp checks
  for SO_TIMESTAMP and SO_TIMESTAMPNS option availability.
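
For readers unfamiliar with the underlying mechanism, here is a rough standalone sketch of reading one message together with its kernel receive timestamp. It uses only the standard socket calls (setsockopt(), recvmsg(), and the CMSG_* macros); `enable_rx_timestamps()` and `read_with_timestamp()` are hypothetical names and are not the internal evbuffer_read_with_timestamp() from this commit, whose code is not shown here.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/uio.h>
#include <time.h>

/* Hypothetical helper: ask the kernel to attach receive timestamps.
 * Prefers nanosecond precision and falls back to microseconds. */
static int enable_rx_timestamps(int fd)
{
    int on = 1;
#ifdef SO_TIMESTAMPNS
    if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &on, sizeof(on)) == 0)
        return 0;
#endif
#ifdef SO_TIMESTAMP
    if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMP, &on, sizeof(on)) == 0)
        return 0;
#endif
    (void)on;
    return -1;
}

/* Hypothetical helper: one recvmsg() call; *have_ts is set to 1 only when a
 * timestamp control message arrived and the control data was not truncated. */
static ssize_t read_with_timestamp(int fd, void *buf, size_t len,
                                   struct timespec *ts, int *have_ts)
{
    union { /* union forces cmsghdr alignment on the control buffer */
        char bytes[CMSG_SPACE(sizeof(struct timespec)) +
                   CMSG_SPACE(sizeof(struct timeval))];
        struct cmsghdr align;
    } ctrl;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    struct cmsghdr *c;
    ssize_t n;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.bytes;
    msg.msg_controllen = sizeof(ctrl.bytes);
    *have_ts = 0;

    n = recvmsg(fd, &msg, 0);
    if (n < 0 || (msg.msg_flags & MSG_CTRUNC))
        return n; /* error, or control data truncated: report no timestamp */

    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level != SOL_SOCKET)
            continue;
#ifdef SCM_TIMESTAMPNS
        if (c->cmsg_type == SCM_TIMESTAMPNS) {
            memcpy(ts, CMSG_DATA(c), sizeof(*ts));
            *have_ts = 1;
            break;
        }
#endif
#ifdef SCM_TIMESTAMP
        if (c->cmsg_type == SCM_TIMESTAMP) {
            struct timeval tv;
            memcpy(&tv, CMSG_DATA(c), sizeof(tv));
            ts->tv_sec = tv.tv_sec;
            ts->tv_nsec = (long)tv.tv_usec * 1000; /* microseconds to nanoseconds */
            *have_ts = 1;
            break;
        }
#endif
    }
    return n;
}
```

Callers should treat `*have_ts == 0` as "no timestamp available" rather than as an error, since whether the kernel attaches one depends on the platform and socket type.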

Read sizing:
- On Linux, use recv(MSG_PEEK|MSG_TRUNC) with a zero-length buffer to
  determine the exact datagram size before allocating the chain. This
  avoids truncating UDP datagrams larger than EVBUFFER_MAX_READ and
  sizes the chain precisely. If the peek returns nothing useful (TCP
  stream), fall back to EVBUFFER_MAX_READ directly to prevent FIONREAD
  from inflating howmuch and bypassing rate limiting on TCP connections.
- On macOS and Windows, MSG_PEEK|MSG_TRUNC is skipped entirely (macOS
  treats MSG_TRUNC as an output-only flag) and FIONREAD is used for
  sizing. The howmuch override is also guarded to Linux-only to prevent
  FIONREAD from inflating reads on TCP sockets on those platforms.
- EVBUFFER_MAX_READ is no longer applied as an upper cap when
  use_recvmsg is set, allowing large UDP datagrams to be received
  without truncation.
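
The Linux sizing probe described above might look roughly like the following sketch. `next_read_size()` and its `fallback` parameter are hypothetical; in the commit the fallback role is played by EVBUFFER_MAX_READ or FIONREAD depending on platform. MSG_DONTWAIT is added here only so the standalone example never blocks on an empty socket.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: how many bytes should the next read reserve?
 * On Linux, a zero-length peek with MSG_TRUNC reports the full size of the
 * next datagram without consuming it. When the probe yields nothing useful
 * (for example on a TCP stream or an empty socket), use the caller's fallback. */
static size_t next_read_size(int fd, size_t fallback)
{
#ifdef __linux__
    ssize_t n = recv(fd, NULL, 0, MSG_PEEK | MSG_TRUNC | MSG_DONTWAIT);
    if (n > 0)
        return (size_t)n;
#else
    (void)fd; /* probe skipped on other platforms */
#endif
    return fallback;
}
```

Because the probe uses MSG_PEEK, the datagram stays queued and the subsequent recvmsg() call still receives it in full.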

Public API additions:
- BEV_OPT_RECV_TIMESTAMPS option flag for socket bufferevents.
- evbuffer_get_timestamp() to fetch the receipt timestamp of the
  oldest data currently in the buffer.
- evbuffer_commit_space_with_timespec() to commit reserved space and
  manually attach a timestamp to the committed chains.

Platform support:
- Linux 2.6.22+: Nanosecond (SO_TIMESTAMPNS) and microsecond
  (SO_TIMESTAMP) precision.
- macOS 10.12+: Microsecond (SO_TIMESTAMP) precision.

Testing:
- Added unit tests for evbuffer receive timestamping.
- Added direct socket and filtered OpenSSL bufferevent timestamp regress tests.
- Verified all 344 unit tests pass successfully.