Information Theory

arXiv:cs.IT

Covers theoretical and experimental aspects of information theory and coding. Includes source coding, channel coding, data compression, and cryptographic protocols.

Trending in Information Theory

A Dual Approach for Hierarchical Information-Theoretic Tree Abstractions

In this paper, we consider establishing a formal connection between two distinct tree-abstraction problems inspired by the information-bottleneck (IB) method. Specifically, we consider the hard- and soft-constrained formulations that have recently appeared in the literature to determine the conditions for which the two approaches are equivalent. Our analysis leverages concepts from Lagrangian relaxation and duality theory to relate the dual function of the hard-constrained problem to the Q-function employed in Q-tree search and shows the connection between tree phase transitions and solutions to the dual problem obtained by exploiting the problem structure. An algorithm is proposed that employs knowledge of the tree phase transitions to find a setting of the dual variable that solves the dual problem. Furthermore, we present an alternative approach to select the dual variable that leverages the integer programming formulation of the hard-constrained problem and the strong duality of linear programming. To obtain a linear program, we establish that a relaxation of the integer programming formulation of the hard-constrained tree-search problem has the integrality property by showing that the program constraint matrix is totally unimodular. Empirical results that corroborate the theoretical developments are presented and discussed throughout.

2512.01985

Dec 2025Information Theory

Forget BIT, It is All about TOKEN: Towards Semantic Information Theory for LLMs

Large language models (LLMs) have demonstrated remarkable capabilities in numerous real-world applications. While the vast majority of research conducted from an experimental perspective is progressing rapidly, it demands substantial computational power, data, and other resources. Therefore, how to open the black-box of LLMs from a theoretical standpoint has become a critical challenge. This paper takes the theory of rate-distortion function, directed information, and Granger causality as its starting point to investigate the information-theoretic principles behind LLMs, leading to the development of semantic information theory for LLMs, where the fundamental unit is token, rather than bits that lacks any semantic meaning. By defining the probabilistic model of LLMs, we discuss structure-agnostic information-theoretic measures, such as the directed rate-distortion function in pre-training, the directed rate-reward function in post-training, and the semantic information flow in inference phase. This paper also delves deeply into the theory of token-level semantic embedding and the information-theoretically optimal vectorization method. Thereafter, we propose a general definition of autoregression LLM, where the Transformer architecture and its performance such as ELBO, generalization error bound, memory capacity, and semantic information measures can be derived theoretically. Other architectures, such as Mamba/Mamba2 and LLaDA, are also discussed in our framework. Consequently, this paper provides a theoretical framework for understanding LLMs from the perspective of semantic information theory, which also offers the necessary theoretical tools for further in-depth research.

2511.012021

Nov 2025Information Theory

Related categories:

Information Theory Networking and Internet Architecture Cryptography and Security Statistical Learning Machine Learning

805 papers

Information-Theoretic Limits on Exact Subgraph Alignment Problem

The graph alignment problem aims to identify the vertex correspondence between two correlated graphs. Most existing studies focus on the scenario in which the two graphs share the same vertex set. However, in many real-world applications, such as computer vision, social network analysis, and bioinformatics, the task often involves locating a small graph pattern within a larger graph. Existing graph alignment algorithms and analysis cannot directly address these scenarios because they are not designed to identify the specific subset of vertices where the small graph pattern resides within the larger graph. Motivated by this limitation, we introduce the subgraph alignment problem, which seeks to recover both the vertex set and/or the vertex correspondence of a small graph pattern embedded in a larger graph. In the special case where the small graph pattern is an induced subgraph of the larger graph and both the vertex set and correspondence are to be recovered, the problem reduces to the subgraph isomorphism problem, which is NP-complete in the worst case. In this paper, we formally formulate the subgraph alignment problem by proposing the Erdos-Renyi subgraph pair model together with some appropriate recovery criterion. We then establish almost-tight information-theoretic results for the subgraph alignment problem and present some novel approaches for the analysis.

2601.05173Jan 2026

View

Fundamental Tradeoffs for ISAC Multiple Access in Finite-Blocklength Regime

This paper investigates the fundamental communication--sensing tradeoffs of uplink dual-functional integrated sensing and communication (ISAC) multiple access under finite blocklength (FBL) constraints. Unlike conventional asymptotic analyses, we explicitly account for the limitations under FBL constraints imposed by short packets and low-latency transmission. By examining the unbiased channel state sensing estimator, we establish a geometric decomposition of the sensing error, indicating that it is jointly determined by the signal-to-noise ratio and the correlation structure of the information codebook. This insight reveals how cross-correlation among active users in the codebook geometry fundamentally constrains dual-functional ISAC performance. Consequently, we derive achievability and converse bounds that characterize the tradeoff between communication code rate and sensing accuracy in the FBL regime, with the converse further bounded by Shannon capacity. Moreover, by treating channel state sensing as a high-level sensing objective, a universal Cramér--Rao bound is derived to link channel estimation accuracy to practical sensing parameters. Examples of parameter sensing are also provided based on 3GPP standard. Numerical results validate the theoretical analysis and demonstrate the impact of blocklength, antenna dimensions, and sensing requirements.

2601.05165Jan 2026

View

Precoding Matrix Indicator in the 5G NR Protocol: A Tutorial on 3GPP Beamforming Codebooks

This paper bridges this critical gap by providing a systematic examination of the beamforming codebook technology, i.e., precoding matrix indicator (PMI), in the 5G NR from theoretical, standardization, and implementation perspectives. We begin by introducing the background of beamforming in multiple-input multiple-output (MIMO) systems and the signaling procedures for codebook-based beamforming in practical 5G systems. Then, we establish the fundamentals of regular codebooks and port-selection codebooks in 3GPP standards. Next, we provide rigorous technical analysis of 3GPP codebook evolution spanning Releases 15-18, with particular focus on: 1) We elucidate the core principles underlying codebook design, 2) provide clear physical interpretations for each symbolic variable in the codebook formulas, summarized in tabular form, and 3) offer intuitive visual illustrations to explain how codebook parameters convey information. These essential pedagogical elements are almost entirely absent in the often-obscure standardization documents. Through mathematical modeling, performance benchmarking, feedback comparisons, and scenario-dependent applicability analysis, we provide researchers and engineers with a unified understanding of beamforming codebooks in real-world systems. Furthermore, we identify future directions and other beamforming scenarios for ongoing research and development efforts. This work serves as both an informative tutorial and a guidance for future research, facilitating more effective collaboration between academia and industry in advancing wireless communication technologies.

2601.05092Jan 2026

View

2601.05030

Refinements of Jensen's Inequality for Twice-Differentiable Convex Functions with Bounded Hessian

Jensen's inequality, attributed to Johan Jensen -- a Danish mathematician and engineer noted for his contributions to the theory of functions -- is a ubiquitous result in convex analysis, providing a fundamental lower bound for the expectation of a convex function. In this paper, we establish rigorous refinements of this inequality specifically for twice-differentiable functions with bounded Hessians. By utilizing Taylor expansions with integral remainders, we tried to bridge the gap between classical variance-based bounds and higher-precision estimates. We also discover explicit error terms governed by Gruss-type inequalities, allowing for the incorporation of skewness and kurtosis into the bound. Using these new theoretical tools, we improve upon existing estimates for the Shannon entropy of continuous distributions and the ergodic capacity of Rayleigh fading channels, demonstrating the practical efficacy of our refinements.

2601.05030Jan 2026

View

Learning Sparsifying Transforms for mmWave Communication via $\ell^4$-Norm Maximization

The high directionality of wave propagation at millimeter-wave (mmWave) carrier frequencies results in only a small number of significant transmission paths between user equipments and the basestation (BS). This sparse nature of wave propagation is revealed in the beamspace domain, which is traditionally obtained by taking the spatial discrete Fourier transform (DFT) across a uniform linear antenna array at the BS, where each DFT output is associated with a distinct beam. In recent years, beamspace processing has emerged as a promising technique to reduce baseband complexity and power consumption in all-digital massive multiuser (MU) multiple-input multiple-output (MIMO) systems operating at mmWave frequencies. However, it remains unclear whether the DFT is the optimal sparsifying transform for finite-dimensional antenna arrays. In this paper, we extend the framework of Zhai et al. for complete dictionary learning via $\ell^4$-norm maximization to the complex case in order to learn new sparsifying transforms. We provide a theoretical foundation for $\ell^4$-norm maximization and propose two suitable learning algorithms. We then utilize these algorithms (i) to assess the optimality of the DFT for sparsifying channel vectors theoretically and via simulations and (ii) to learn improved sparsifying transforms for real-world and synthetically generated channel vectors.

2601.04980Jan 2026

View

Wireless Communication with Cross-Linked Rotatable Antenna Array: Architecture Design and Rotation Optimization

Rotatable antenna (RA) technology can harness additional spatial degrees of freedom by enabling the dynamic three-dimensional orientation control of each antenna. Unfortunately, the hardware cost and control complexity of traditional RA systems is proportional to the number of RAs. To address the issue, we consider a cross-linked (CL) RA structure, which enables the coordinated rotation of multiple antennas, thereby offering a cost-effective solution. To evaluate the performance of the CL-RA array, we investigate a CL-RA-aided uplink system. Specifically, we first establish system models for both antenna element-level and antenna panel-level rotation. Then, we formulate a sum rate maximization problem by jointly optimizing the receive beamforming at the base station and the rotation angles. For the antenna element-level rotation, we derive the optimal solution of the CL-RA array under the single-user case. Subsequently, for two rotation schemes, we propose an alternating optimization algorithm to solve the formulated problem in the multi-user case, where the receive beamforming and the antenna rotation angles are obtained by applying the minimum mean square error method and feasible direction method, respectively. In addition, considering the hardware limitations, we apply the genetic algorithm to address the discrete rotation angles selection problem. Simulation results show that by carefully designing the row-column partition scheme, the performance of the CL-RA architecture is quite close to that of the flexible antenna orientation scheme. Moreover, the CL antenna element-level scheme surpasses the CL antenna panel-level scheme by 25% and delivers a 128% performance improvement over conventional fixed-direction antennas.

2601.04862Jan 2026

View

2601.04849

Stability of Constrained Optimization Models for Structured Signal Recovery

Recovering an unknown but structured signal from its measurements is a challenging problem with significant applications in fields such as imaging restoration, wireless communications, and signal processing. In this paper, we consider the inherent problem stems from the prior knowledge about the signal's structure, such as sparsity which is critical for signal recovery models. We investigate three constrained optimization models that effectively address this challenge, each leveraging distinct forms of structural priors to regularize the solution space. Our theoretical analysis demonstrates that these models exhibit robustness to noise while maintaining stability with respect to tuning parameters that is a crucial property for practical applications, when the parameter selection is often nontrivial. By providing theoretical foundations, our work supports their practical use in scenarios where measurement imperfections and model uncertainties are unavoidable. Furthermore, under mild conditions, we establish tradeoff between the sample complexity and the mismatch error.

2601.04849Jan 2026

View

Privacy-Utility Trade-offs Under Multi-Level Point-Wise Leakage Constraints

An information-theoretic privacy mechanism design is studied, where an agent observes useful data $Y$ which is correlated with the private data $X$. The agent wants to reveal the information to a user, hence, the agent utilizes a privacy mechanism to produce disclosed data $U$ that can be revealed. We assume that the agent has no direct access to $X$, i.e., the private data is hidden. We study privacy mechanism design that maximizes the disclosed information about $Y$, measured by the mutual information between $Y$ and $U$, while satisfying a point-wise constraint with different privacy leakage budgets. We introduce a new measure, called the \emph{multi-level point-wise leakage}, which allows us to impose different leakage levels for different realizations of $U$. In contrast to previous studies on point-wise measures, which use the same leakage level for each realization, we consider a more general scenario in which each data point can leak information up to a different threshold. As a result, this concept also covers cases in which some data points should not leak any information about the private data, i.e., they must satisfy perfect privacy. In other words, a combination of perfect privacy and non-zero leakage can be considered. When the leakage is sufficiently small, concepts from information geometry allow us to locally approximate the mutual information. We show that when the leakage matrix $P_{X|Y}$ is invertible, utilizing this approximation leads to a quadratic optimization problem that has closed-form solution under some constraints. In particular, we show that it is sufficient to consider only binary $U$ to attain the optimal utility. This leads to simple privacy designs with low complexity which are based on finding the maximum singular value and singular vector of a matrix.

2601.04815Jan 2026

View

Feasibility Study Regarding Self-sustainable Reconfigurable Intelligent Surfaces

Without requiring operational costs such as cabling and powering while maintaining reconfigurable phase-shift capability, self-sustainable reconfigurable intelligent surfaces (ssRISs) can be deployed in locations inaccessible to conventional relays or base stations, offering a novel approach to enhance wireless coverage. This study assesses the feasibility of ssRIS deployment by analyzing two harvest-and-reflect (HaR) schemes: element-splitting (ES) and time-splitting (TS). We examine how element requirements scale with key system parameters, transmit power, data rate demands, and outage constraints under both line-of-sight (LOS) and non-line-of-sight (NLOS) ssRIS-to-user equipment (UE) channels. Analytical and numerical results reveal distinct feasibility characteristics. The TS scheme demonstrates better channel hardening gain, maintaining stable element requirements across varying outage margins, making it advantageous for indoor deployments with favorable harvesting conditions and moderate data rates. However, TS exhibits an element requirement that exponentially scales to harvesting difficulty and data rate. Conversely, the ES scheme shows only linear growth with harvesting difficulty, providing better feasibility under challenging outdoor scenarios. These findings establish that TS excels in benign environments, prioritizing reliability, while ES is preferable for demanding conditions requiring operational robustness.

2601.04723Jan 2026

View

Air-to-Ground Communications for Internet of Things: UAV-based Coverage Hole Detection and Recovery

Uncrewed aerial vehicles (UAVs) play a pivotal role in ensuring seamless connectivity for Internet of Things (IoT) devices, particularly in scenarios where conventional terrestrial networks are constrained or temporarily unavailable. However, traditional coverage-hole detection approaches, such as minimizing drive tests, are costly, time-consuming, and reliant on outdated radio-environment data, making them unsuitable for real-time applications. To address these limitations, this paper proposes a UAV-assisted framework for real-time detection and recovery of coverage holes in IoT networks. In the proposed scheme, a patrol UAV is first dispatched to identify coverage holes in regions where the operational status of terrestrial base stations (BSs) is uncertain. Once a coverage hole is detected, one or more UAVs acting as aerial BSs are deployed by a satellite or nearby operational BSs to restore connectivity. The UAV swarm is organized based on Delaunay triangulation, enabling scalable deployment and tractable analytical characterization using stochastic geometry. Moreover, a collision-avoidance mechanism grounded in multi-agent system theory ensures safe and coordinated motion among multiple UAVs. Simulation results demonstrate that the proposed framework achieves high efficiency in both coverage-hole detection and on-demand connectivity restoration while significantly reducing operational cost and time.

2601.04665Jan 2026

View

Bridging Distance and Spectral Positional Encodings via Anchor-Based Diffusion Geometry Approximation

Molecular graph learning benefits from positional signals that capture both local neighborhoods and global topology. Two widely used families are spectral encodings derived from Laplacian or diffusion operators and anchor-based distance encodings built from shortest-path information, yet their precise relationship is poorly understood. We interpret distance encodings as a low-rank surrogate of diffusion geometry and derive an explicit trilateration map that reconstructs truncated diffusion coordinates from transformed anchor distances and anchor spectral positions, with pointwise and Frobenius-gap guarantees on random regular graphs. On DrugBank molecular graphs using a shared GNP-based DDI prediction backbone, a distance-driven Nyström scheme closely recovers diffusion geometry, and both Laplacian and distance encodings substantially outperform a no-encoding baseline.

2601.04517Jan 2026

View

Achievable Rate and Coding Principle for MIMO Multicarrier Systems With Cross-Domain MAMP Receiver Over Doubly Selective Channels

The integration of multicarrier modulation and multiple-input-multiple-output (MIMO) is critical for reliable transmission of wireless signals in complex environments, which significantly improve spectrum efficiency. Existing studies have shown that popular orthogonal time frequency space (OTFS) and affine frequency division multiplexing (AFDM) offer significant advantages over orthogonal frequency division multiplexing (OFDM) in uncoded doubly selective channels. However, it remains uncertain whether these benefits extend to coded systems. Meanwhile, the information-theoretic limit analysis of coded MIMO multicarrier systems and the corresponding low-complexity receiver design remain unclear. To overcome these challenges, this paper proposes a multi-slot cross-domain memory approximate message passing (MS-CD-MAMP) receiver as well as develops its information-theoretic (i.e., achievable rate) limit and optimal coding principle for MIMO-multicarrier modulation (e.g., OFDM, OTFS, and AFDM) systems. The proposed MS-CD-MAMP receiver can exploit not only the time domain channel sparsity for low complexity but also the corresponding symbol domain constellation constraints for performance enhancement. Meanwhile, limited by the high-dimensional complex state evolution (SE), a simplified single-input single-output variational SE is proposed to derive the achievable rate of MS-CD-MAMP and the optimal coding principle with the goal of maximizing the achievable rate. Numerical results show that coded MIMO-OFDM/OTFS/AFDM with MS-CD-MAMP achieve the same maximum achievable rate in doubly selective channels, whose finite-length performance with practical optimized low-density parity-check (LDPC) codes is only 0.5 $\sim$ 1.8 dB away from the associated theoretical limit, and has 0.8 $\sim$ 4.4 dB gain over the well-designed point-to-point LDPC codes.

2601.04433Jan 2026

View

2601.04193

A discrete Benamou-Brenier formulation of Optimal Transport on graphs

We propose a discrete transport equation on graphs which connects distributions on both vertices and edges. We then derive a discrete analogue of the Benamou-Brenier formulation for Wasserstein-$1$ distance on a graph and as a result classify all $W_1$ geodesics on graphs.

2601.04193Jan 2026

View

Expectation Propagation for Distributed Inference in Grant-Free Cell-Free Massive MIMO

Grant-free cell-free massive multiple-input multiple-output (GF-CF-MaMIMO) systems are anticipated to be a key enabling technology for next-generation Internet-of-Things (IoT) networks, as they support massive connectivity without explicit scheduling. However, the large amount of connected devices prevents the use of orthogonal pilot sequences, resulting in severe pilot contamination (PC) that degrades channel estimation and data detection performance. Furthermore, scalable GF-CF-MaMIMO networks inherently rely on distributed signal processing. In this work, we consider the uplink of a GF-CF-MaMIMO system and propose two novel distributed algorithms for joint activity detection, channel estimation, and data detection (JACD) based on expectation propagation (EP). The first algorithm, denoted as JACD-EP, uses Gaussian approximations for the channel variables, whereas the second, referred to as JACD-EP-BG, models them as Bernoulli-Gaussian (BG) random variables. To integrate the BG distribution into the EP framework, we derive its exponential family representation and develop the two algorithms as efficient message passing over a factor graph constructed from the a posteriori probability (APP) distribution. The proposed framework is inherently scalable with respect to both the number of access points (APs) and user equipments (UEs). Simulation results show the efficient mitigation of PC by the proposed distributed algorithms and their superior detection accuracy compared to (genie-aided) centralized linear detectors.

2601.04166Jan 2026

View

2601.04041

Serving Every Symbol: All-Symbol PIR and Batch Codes

A $t$-all-symbol PIR code and a $t$-all-symbol batch code of dimension $k$ consist of $n$ servers storing linear combinations of $k$ linearly independent information symbols with the following recovery property: any symbol stored by a server can be recovered from $t$ pairwise disjoint subsets of servers. In the batch setting, we further require that any multiset of size $t$ of stored symbols can be recovered from $t$ disjoint subsets of servers. This framework unifies and extends several well-known code families, including one-step majority-logic decodable codes, (functional) PIR codes, and (functional) batch codes. In this paper, we determine the minimum code length for some small values of $k$ and $t$, characterize structural properties of codes attaining this optimum, and derive bounds that show the trade-offs between length, dimension, minimum distance, and $t$. In addition, we study MDS codes and the simplex code, demonstrating how these classical families fit within our framework, and establish new cases of an open conjecture from \cite{YAAKOBI2020} concerning the minimal $t$ for which the simplex code is a $t$-functional batch code.

2601.04041Jan 2026

View

Flexible-Duplex Cell-Free Architecture for Secure Uplink Communications in Low-Altitude Wireless Networks

Low-altitude wireless networks (LAWNs) are expected to play a central role in future 6G infrastructures, yet uplink transmissions of uncrewed aerial vehicles (UAVs) remain vulnerable to eavesdropping due to their limited transmit power, constrained antenna resources, and highly exposed air-ground propagation conditions. To address this fundamental bottleneck, we propose a flexible-duplex cell-free (CF) architecture in which each distributed access point (AP) can dynamically operate either as a receive AP for UAV uplink collection or as a transmit AP that generates cooperative artificial noise (AN) for secrecy enhancement. Such AP-level duplex flexibility introduces an additional spatial degree of freedom that enables distributed and adaptive protection against wiretapping in LAWNs. Building upon this architecture, we formulate a max-min secrecy-rate problem that jointly optimizes AP mode selection, receive combining, and AN covariance design. This tightly coupled and nonconvex optimization is tackled by first deriving the optimal receive combiners in closed form, followed by developing a penalty dual decomposition (PDD) algorithm with guaranteed convergence to a stationary solution. To further reduce computational burden, we propose a low-complexity sequential scheme that determines AP modes via a heuristic metric and then updates the AN covariance matrices through closed-form iterations embedded in the PDD framework. Simulation results show that the proposed flexible-duplex architecture yields substantial secrecy-rate gains over CF systems with fixed AP roles. The joint optimization method attains the highest secrecy performance, while the low-complexity approach achieves over 90% of the optimal performance with an order-of-magnitude lower computational complexity, offering a practical solution for secure uplink communications in LAWNs.

2601.04011Jan 2026

View

2601.03982

Unique Decoding of Hyperderivative Reed-Solomon Codes

Error-correcting codes are combinatorial objects designed to cope with the problem of reliable transmission of information on a noisy channel. A fundamental problem in coding theory and practice is to efficiently decode the received word with errors to obtain the transmitted codeword. In this paper, we consider the decoding problem of Hyperderivative Reed-Solomon (HRS) codes with respect to the NRT metric. Specifically, we propose a Welch-Berlekamp algorithm for the unique decoding of NRT HRS codes.

2601.03982Jan 2026

View

Low-Complexity Planar Beyond-Diagonal RIS Architecture Design Using Graph Theory

Reconfigurable intelligent surfaces (RISs) enable programmable control of the wireless propagation environment and are key enablers for future networks. Beyond-diagonal RIS (BD-RIS) architectures enhance conventional RIS by interconnecting elements through tunable impedance components, offering greater flexibility with higher circuit complexity. However, excessive interconnections between BD-RIS elements require multi-layer printed circuit board (PCB) designs, increasing fabrication difficulty. In this letter, we use graph theory to characterize the BD-RIS architectures that can be realized on double-layer PCBs, denoted as planar-connected RISs. Among the possible planar-connected RISs, we identify the ones with the most degrees of freedom, expected to achieve the best performance under practical constraints.

2601.03831Jan 2026

View

2601.03492

Hermitian LCD $2$-Quasi Abelian Codes over Finite Chain Rings

This paper introduces a class of Hermitian LCD $2$-quasi-abelian codes over finite fields and presents a comprehensive enumeration of these codes in which relative minimum weights are small. We show that such codes are asymptotically good over finite fields. Furthermore, we extend our analysis to finite chain rings by characterizing $2$-quasi-abelian codes in this setting and proving the existence of asymptotically good Hermitian LCD $2$-quasi-abelian codes over finite chain rings as well.

2601.03492Jan 2026

View

2601.03489

LCPs of Subspace Codes

A subspace code is a nonempty collection of subspaces of the vector space $\mathbb{F}_q^{n}$. A pair of linear codes is called a linear complementary pair (in short LCP) of codes if their intersection is trivial and the sum of their dimensions equals the dimension of the ambient space. Equivalently, the two codes form an LCP if the direct sum of these two codes is equal to the entire space. In this paper, we introduce the concept of LCPs of subspace codes. We first provide a characterization of subspace codes that form an LCP. Furthermore, we present a sufficient condition for the existence of an LCP of subspace codes based on a complement function on a subspace code. In addition, we give several constructions of LCPs for subspace codes using various techniques and provide an application to insertion error correction.

2601.03489Jan 2026

View

Page 1 of 41