Social and Information Networks
Covers the design, analysis, and modeling of social and information networks.
Looking for a broader view? This category is part of:
Covers the design, analysis, and modeling of social and information networks.
Looking for a broader view? This category is part of:
Online political divisions, such as fragmentation or polarization, are a growing global concern that can foster radicalization and hinder democratic cooperation; however, not all divisions are detrimental, some reflect pluralism and healthy diversity of opinion in a democracy. While prior research has predominantly focused on polarization in the United States, there remains a limited body of research on political divides in multiparty systems, and no universal method for comparing fragmentation across countries. Moreover, cross-country comparison is rare. This study first develops a novel measure of structural political fragmentation built on multi-scale community detection and the effective branching factor. Using a dataset of 18,325 political influencers from Brazil, Spain, and the United States, we assess online fragmentation in their Twitter/X co-following networks. We compare the fragmentation of the three countries, as well as the ideological groups within each. We further investigate factors associated with the level of fragmentation in each country. We find that political fragmentation differs across countries and is asymmetric between ideological groups. Brazil is the most fragmented, with higher fragmentation among the left-wing group, while Spain and the United States exhibit similar overall levels, with the left more fragmented in Spain and the right more fragmented in the United States. Additionally, we find that social identity plays a central role in political fragmentation. A strong alignment between ideological and social identities, with minimal overlap between ideologies, tends to promote greater integration and reduce fragmentation. Our findings provide explanations for cross-national and ideological differences in political fragmentation.
A key challenge in network science is the detection of communities, which are sets of nodes in a network that are densely connected internally but sparsely connected to the rest of the network. A fundamental result in community detection is the existence of a nontrivial threshold for community detectability on sparse graphs that are generated by the planted partition model (PPM). Below this so-called ``detectability limit'', no community-detection method can perform better than random chance. Spectral methods for community detection fail before this detectability limit because the eigenvalues corresponding to the eigenvectors that are relevant for community detection can be absorbed by the bulk of the spectrum. One can bypass the detectability problem by using special matrices, like the non-backtracking matrix, but this requires one to consider higher-dimensional matrices. In this paper, we show that the difference in graph energy between a PPM and an Erdős--Rényi (ER) network has a distinct transition at the detectability threshold even for the adjacency matrices of the underlying networks. The graph energy is based on the full spectrum of an adjacency matrix, so our result suggests that standard graph matrices still allow one to separate the parameter regions with detectable and undetectable communities.
Community detection is crucial for applications like targeted marketing and recommendation systems. Traditional methods rely on network structure, and embedding-based models integrate semantic information. However, there is a challenge when a model leverages local and global information from complex structures like social networks. Graph Neural Networks (GNNs) and Transformers have shown superior performance in capturing local and global relationships. In this paper, We propose Graph Integrated Transformer for Community Detection (GIT-CD), a hybrid model combining GNNs and Transformer-based attention mechanisms to enhance community detection in social networks. Specifically, the GNN module captures local graph structures, while the Transformer module models long-range dependencies. A self-optimizing clustering module refines community assignments using K-Means, silhouette loss, and KL divergence minimization. Experimental results on benchmark datasets show that GIT-CD outperforms state-of-the-art models, making it a robust approach for detecting meaningful communities in complex social networks.
Online hate spreads rapidly, yet little is known about whether preventive and scalable strategies can curb it. We conducted the largest randomized controlled trial of hate speech prevention to date: a 20-week messaging campaign on X in Nigeria targeting ethnic hate. 73,136 users who had previously engaged with hate speech were randomly assigned to receive prosocial video messages from Nigerian celebrities. The campaign reduced hate content by 2.5% to 5.5% during treatment, with about 75% of the reduction persisting over the following four months. Reaching a larger share of a user's audience reduced amplification of that user's hate posts among both treated and untreated users, cutting hate reposts by over 50% for the most exposed accounts. Scalable messaging can limit online hate without removing content.
Ways in which people's opinions change are, without a doubt, subject to a rich tapestry of differing influences. Factors that affect how one arrives at an opinion reflect how they have been shaped by their environment throughout their lives, education, material status, what belief systems are they subscribed to, and what socio-economic minorities are they a part of. This already complex system is further expanded by the ever-changing nature of one's social network. It is therefore no surprise that many models have a tendency to perform best for the majority of the population and discriminating those people who are members of various marginalized groups . This bias and the study of how to counter it are subject to a rapidly developing field of Fairness in Social Network Analysis (SNA). The focus of this work is to look into how a state-of-the-art model discriminates certain minority groups and whether it is possible to reliably predict for whom it will perform worse. Moreover, is such prediction possible based solely on one's demographic or topological features? To this end, the NetSense dataset, together with a state-of-the-art CoDiNG model for opinion prediction have been employed. Our work explores how three classifier models (Demography-Based, Topology-Based, and Hybrid) perform when assessing for whom this algorithm will provide inaccurate predictions. Finally, through a comprehensive analysis of these experimental results, we identify four key patterns of algorithmic bias. Our findings suggest that no single paradigm provides the best results and that there is a real need for context-aware strategies in fairness-oriented social network analysis. We conclude that a multi-faceted approach, incorporating both individual attributes and network structures, is essential for reducing algorithmic bias and promoting inclusive decision-making.
Link weight prediction extends classical link prediction by estimating the strength of interactions rather than merely their existence, and it underpins a wide range of applications such as traffic engineering, social recommendation, and scientific collaboration analysis. However, the robustness of link weight prediction against adversarial perturbations remains largely unexplored.In this paper, we formalize the link weight prediction attack problem as an optimization task that aims to maximize the prediction error on a set of target links by adversarially manipulating the weight values of a limited number of links. Based on this formulation, we propose an iterative gradient-based attack framework for link weight prediction, termed IGA-LWP. By employing a self-attention-enhanced graph autoencoder as a surrogate predictor, IGA-LWP leverages backpropagated gradients to iteratively identify and perturb a small subset of links. Extensive experiments on four real-world weighted networks demonstrate that IGA-LWP significantly degrades prediction accuracy on target links compared with baseline methods. Moreover, the adversarial networks generated by IGA-LWP exhibit strong transferability across several representative link weight prediction models. These findings expose a fundamental vulnerability in weighted network inference and highlight the need for developing robust link weight prediction methods.
We build, deploy, and evaluate Paper Skygest, a custom personalized social feed for scientific content posted by a user's network on Bluesky and the AT Protocol. We leverage a new capability on emerging decentralized social media platforms: the ability for anyone to build and deploy feeds for other users, to use just as they would a native platform-built feed. To our knowledge, Paper Skygest is the first and largest such continuously deployed personalized social media feed by academics, with over 50,000 weekly uses by over 1,000 daily active users, all organically acquired. First, we quantitatively and qualitatively evaluate Paper Skygest usage, showing that it has sustained usage and satisfies users; we further show adoption of Paper Skygest increases a user's interactions with posts about research, and how interaction rates change as a function of post order. Second, we share our full code and describe our system architecture, to support other academics in building and deploying such feeds sustainably. Third, we overview the potential of custom feeds such as Paper Skygest for studying algorithm designs, building for user agency, and running recommender system experiments with organic users without partnering with a centralized platform.
This study examines the relationship between online buzz and local election outcomes in Taiwan, with a focus on Taitung County. As social media becomes a major channel for public discourse, online buzz is increasingly seen as a factor influencing elections. However, its impact on local elections in Taiwan remains underexplored. This research addresses that gap through a comparative analysis of social media data and actual vote shares during the election period. A review of existing literature establishes the study's framework and highlights the need for empirical investigation in this area. The findings aim to reveal whether online discussions align with electoral results and to what extent digital sentiment reflects voter behavior. The study also discusses methodological and data limitations that may affect interpretation. Beyond its academic value, the research offers practical insights into how online buzz can inform campaign strategies and enhance election predictions. By analyzing the Taitung County case, this study contributes to a deeper understanding of the role of online discourse in Taiwan's local elections and offers a foundation for future research in the field.
Taiwan Cultural Memory Bank 2.0 is an online curation platform that invites the public to become curators, fostering diverse perspectives on Taiwan's society, humanities, natural landscapes, and daily life. Built on a material bank concept, the platform encourages users to co-create and curate their own works using shared resources or self-uploaded materials. At its core, the system follows a collect, store, access, and reuse model, supporting dynamic engagement with over three million cultural memory items from Taiwan. Users can search, browse, explore stories, and engage in creative applications and collaborative productions. Understanding user profiles is crucial for enhancing website service quality, particularly within the framework of the Visitor Relationship Management model. This study conducts an empirical analysis of user profiles on the platform, examining demographic characteristics, browsing behaviors, and engagement patterns. Additionally, the research evaluates the platform's SEO performance, search visibility, and organic traffic effectiveness. Based on the findings, this study provides strategic recommendations for optimizing website management, improving user experience, and leveraging social media for enhanced digital outreach. The insights gained contribute to the broader discussion on digital cultural platforms and their role in audience engagement, online visibility, and networked communication.
Health and poverty in Thailand exhibit pronounced geographic structuring, yet the extent to which they operate as interconnected regional systems remains insufficiently understood. This study analyzes ICD-10 chapter-level morbidity and multidimensional poverty as outcomes embedded in a spatial interaction network. Interpreting Thailand's 76 provinces as nodes within a fixed-degree regional graph, we apply tools from spatial econometrics and social network analysis, including Moran's I, Local Indicators of Spatial Association (LISA), and Spatial Durbin Models (SDM), to assess spatial dependence and cross-provincial spillovers. Our findings reveal strong spatial clustering across multiple ICD-10 chapters, with persistent high-high morbidity zones, particularly for digestive, respiratory, musculoskeletal, and symptom-based diseases, emerging in well-defined regional belts. SDM estimates demonstrate that spillover effects from neighboring provinces frequently exceed the influence of local deprivation, especially for living-condition, health-access, accessibility, and poor-household indicators. These patterns are consistent with contagion and contextual influence processes well established in social network theory. By framing morbidity and poverty as interdependent attributes on a spatial network, this study contributes to the growing literature on structural diffusion, health inequality, and regional vulnerability. The results highlight the importance of coordinated policy interventions across provincial boundaries and demonstrate how network-based modeling can uncover the spatial dynamics of health and deprivation.
Inferring a network's evolutionary history from a single final snapshot with limited temporal annotations is fundamental yet challenging. Existing approaches predominantly rely on topology alone, which often provides insufficient and noisy cues. This paper leverages network steady-state dynamics -- converged node states under a given dynamical process -- as an additional and widely accessible observation for network evolution history inference. We propose CS$^2$, which explicitly models structure-state coupling to capture how topology modulates steady states and how the two signals jointly improve edge discrimination for formation-order recovery. Experiments on six real temporal networks, evaluated under multiple dynamical processes, show that CS$^2$ consistently outperforms strong baselines, improving pairwise edge precedence accuracy by 4.0% on average and global ordering consistency (Spearman-$ρ$) by 7.7% on average. CS$^2$ also more faithfully recovers macroscopic evolution trajectories such as clustering formation, degree heterogeneity, and hub growth. Moreover, a steady-state-only variant remains competitive when reliable topology is limited, highlighting steady states as an independent signal for evolution inference.
In this article we propose a generalization of two known invariants of real networks: degree and ksi-centrality. More precisely, we found a series of centralities based on Laplacian matrix, that have exponential distributions (power-law for the case $j = 0$) for real networks and different distributions for artificial ones.
Community search aims to identify a refined set of nodes that are most relevant to a given query, supporting tasks ranging from fraud detection to recommendation. Unlike homophilic graphs, many real-world networks are heterophilic, where edges predominantly connect dissimilar nodes. Therefore, structural signals that once reflected smooth, low-frequency similarity now appear as sharp, high-frequency contrasts. However, both classical algorithms (e.g., k-core, k-truss) and recent ML-based models struggle to achieve effective community search on heterophilic graphs, where edge signs or semantics are generally unknown. Algorithm-based methods often return communities with mixed class labels, while GNNs, built on homophily, smooth away meaningful signals and blur community boundaries. Therefore, we propose Adaptive Community Search (AdaptCS), a unified framework featuring three key designs: (i) an AdaptCS Encoder that disentangles multi-hop and multi-frequency signals, enabling the model to capture both smooth (homophilic) and contrastive (heterophilic) relations; (ii) a memory-efficient low-rank optimization that removes the main computational bottleneck and ensures model scalability; and (iii) an Adaptive Community Score (ACS) that guides online search by balancing embedding similarity and topological relations. Extensive experiments on both heterophilic and homophilic benchmarks demonstrate that AdaptCS outperforms the best-performing baseline by an average of 11% in F1-score, retains robustness across heterophily levels, and achieves up to 2 orders of magnitude speedup.
AI-companionship platforms are rapidly reshaping how people form emotional, romantic, and parasocial bonds with non-human agents, raising new questions about how these relationships intersect with gendered online behavior and exposure to harmful content. Focusing on the MyBoyfriendIsAI (MBIA) subreddit, we reconstruct the Reddit activity histories of more than 3,000 highly engaged users over two years, yielding over 67,000 historical submissions. We then situate MBIA within a broader ecosystem by building a historical interaction network spanning more than 2,000 subreddits, which enables us to trace cross-community pathways and measure how toxicity and emotional expression vary across these trajectories. We find that MBIA users primarily traverse four surrounding community spheres (AI-companionship, porn-related, forum-like, and gaming) and that participation across the ecosystem exhibits a distinct gendered structure, with substantial engagement by female users. While toxicity is generally low across most pathways, we observe localized spikes concentrated in a small subset of AI-porn and gender-oriented communities. Nearly 16% of users engage with gender-focused subreddits, and their trajectories display systematically different patterns of emotional expression and elevated toxicity, suggesting that a minority of gendered pathways may act as toxicity amplifiers within the broader AI-companionship ecosystem. These results characterize the gendered structure of cross-community participation around AI companionship on Reddit and highlight where risks concentrate, informing measurement, moderation, and design practices for human-AI relationship platforms.
Understanding affective polarization in online discourse is crucial for evaluating the societal impact of social media interactions. This study presents a novel framework that leverages large language models (LLMs) and domain-informed heuristics to systematically analyze and quantify affective polarization in discussions on divisive topics such as climate change and gun control. Unlike most prior approaches that relied on sentiment analysis or predefined classifiers, our method integrates LLMs to extract stance, affective tone, and agreement patterns from large-scale social media discussions. We then apply a rule-based scoring system capable of quantifying affective polarization even in small conversations consisting of single interactions, based on stance alignment, emotional content, and interaction dynamics. Our analysis reveals distinct polarization patterns that are event dependent: (i) anticipation-driven polarization, where extreme polarization escalates before well-publicized events, and (ii) reactive polarization, where intense affective polarization spikes immediately after sudden, high-impact events. By combining AI-driven content annotation with domain-informed scoring, our framework offers a scalable and interpretable approach to measuring affective polarization. The source code is publicly available at: https://github.com/hasanjawad001/llm-social-media-polarization.
Influence Maximization (IM) seeks to identify a small set of seed nodes in a social network to maximize expected information spread under a diffusion model. While community-based approaches improve scalability by exploiting modular structure, they typically assume independence between communities, overlooking inter-community influence$\unicode{x2014}$a limitation that reduces effectiveness in real-world networks. We introduce Community-IM++, a scalable framework that explicitly models cross-community diffusion through a principled heuristic based on community-based diffusion degree (CDD) and a progressive budgeting strategy. The algorithm partitions the network, computes CDD to prioritize bridging nodes, and allocates seeds adaptively across communities using lazy evaluation to minimize redundant computations. Experiments on large real-world social networks under different edge weight models show that Community-IM++ achieves near-greedy influence spread at up to 100 times lower runtime, while outperforming Community-IM and degree heuristics across budgets and structural conditions. These results demonstrate the practicality of Community-IM++ for large-scale applications such as viral marketing, misinformation control, and public health campaigns, where efficiency and cross-community reach are critical.
A beyond-diagonal reconfigurable intelligent surface (BD-RIS) is an innovative type of reconfigurable intelligent surface (RIS) that has recently been proposed and is considered a revolutionary advancement in wave manipulation. Unlike the mutually disconnected arrangement of elements in traditional RISs, BD-RIS creates cost-effective and simple inter-element connections, allowing for greater freedom in configuring the amplitude and phase of impinging waves. However, there are numerous underlying challenges in realizing the advantages associated with BD-RIS, prompting the research community to actively investigate cutting-edge schemes and algorithms in this direction. Particularly, the passive beamforming design for BD-RIS under specific environmental conditions has become a major focus in this research area. In this article, we provide a systematic introduction to BD-RIS, elaborating on its functional principles concerning architectural design, promising advantages, and classification. Subsequently, we present recent advances and identify a series of challenges and opportunities. Additionally, we consider a specific case study where beamforming is designed using four different algorithms, and we analyze their performance with respect to sum rate and computation cost. To augment the beamforming capabilities in 6G BD-RIS with quantum enhancement, we analyze various hybrid quantum-classical machine learning (ML) models to improve beam prediction performance, employing real-world communication Scenario 8 from the DeepSense 6G dataset. Consequently, we derive useful insights about the practical implications of BD-RIS.
When opinion spread is studied, peer pressure is often modeled by interactions of more than two individuals (higher-order interactions). In our work, we introduce a two-layer random hypergraph model, in which hyperedges represent households and workplaces. Within this overlapping, adaptive structure, individuals react if their opinion is in majority in their groups. The process evolves through random steps: individuals can either change their opinion, or quit their workplace and join another one in which their opinion belongs to the majority. Based on computer simulations, our first goal is to describe the effect of the parameters responsible for the probability of changing opinion and quitting workplace on the homophily and speed of polarization. We also analyze the model as a Markov chain, and study the frequency of the absorbing states. Then, we quantitatively compare how different statistical and machine learning methods, in particular, linear regression, xgboost and a convolutional neural network perform for estimating these probabilities, based on partial information from the process, for example, the distribution of opinion configurations within households and workplaces. Among other observations, we conclude that all methods can achieve the best results under appropriate circumstances, and that the amount of information that is necessary to provide good results depends on the strength of the peer pressure effect.
Although Graph Neural Networks (GNNs) have become the dominant approach for graph representation learning, their performance on link prediction tasks does not always surpass that of traditional heuristic methods such as Common Neighbors and Jaccard Coefficient. This is mainly because existing GNNs tend to focus on learning local node representations, making it difficult to effectively capture structural relationships between node pairs. Furthermore, excessive reliance on local neighborhood information can lead to over-smoothing. Prior studies have shown that introducing global structural encoding can partially alleviate this issue. To address these limitations, we propose a Community-Enhanced Link Prediction (CELP) framework that incorporates community structure to jointly model local and global graph topology. Specifically, CELP enhances the graph via community-aware, confidence-guided edge completion and pruning, while integrating multi-scale structural features to achieve more accurate link prediction. Experimental results across multiple benchmark datasets demonstrate that CELP achieves superior performance, validating the crucial role of community structure in improving link prediction accuracy.
Urban mobility data are indispensable for urban planning, transportation demand forecasting, pandemic modeling, and many other applications; however, individual mobile phone-derived Global Positioning System traces cannot generally be shared with third parties owing to severe re-identification risks. Aggregated records, such as origin-destination (OD) matrices, offer partial insights but fail to capture the key behavioral properties of daily human movement, limiting realistic city-scale analyses. This study presents a privacy-preserving synthetic mobility dataset that reconstructs daily trajectories from aggregated inputs. The proposed method integrates OD flows with two complementary behavioral constraints: (1) dwell-travel time quantiles that are available only as coarse summary statistics and (2) the universal law for the daily distribution of the number of visited locations. Embedding these elements in a multi-objective optimization framework enables the reproduction of realistic distributions of human mobility while ensuring that no personal identifiers are required. The proposed framework is validated in two contrasting regions of Japan: (1) the 23 special wards of Tokyo, representing a dense metropolitan environment; and (2) Fukuoka Prefecture, where urban and suburban mobility patterns coexist. The resulting synthetic mobility data reproduce dwell-travel time and visit frequency distributions with high fidelity, while deviations in OD consistency remain within the natural range of daily fluctuations. The results of this study establish a practical synthesis pathway under real-world constraints, providing governments, urban planners, and industries with scalable access to high-resolution mobility data for reliable analytics without the need for sensitive personal records, and supporting practical deployments in policy and commercial domains.