International Journal of Medical Informatics 111 (2018) 172–181
Research on gender differences in online health communities
Xuan Liu, Min Sun, Jia Li⁎
School of Business, 130 Meilong Rd., East China University of Science and Technology, Shanghai 200237, China
ARTICLE INFO
Keywords: Gender difference; OHC; Topic modeling analysis; Sentiment analysis; Friendship network analysis
ABSTRACT
With growing concern about health issues and the emergence of online communities based on user-generated content (UGC), more and more people are participating in online health communities (OHCs) to exchange opinions and health information. This paper examines whether and how male and female users behave differently in OHCs. Using data from a leading diabetes community in China (Tianmijiayuan), we combine three techniques (topic modeling analysis, sentiment analysis and friendship network analysis) to investigate gender differences in online health communities for chronic disease. The results indicate that (1) male users’ posts were usually more professional and included more medical terms, whereas female users were more inclined to seek emotional support; (2) female users expressed more negative emotions than male users did, especially anxiety and sadness; and (3) male users were more central and influential in the friendship network than female users. Through these analyses, our research reveals the behavioral characteristics and needs of male and female users in online health communities. A deeper understanding of gender differences in OHCs can serve as guidance to better meet the information, emotional and relationship needs of male and female patients.
1. Introduction
With the boom in user-generated content (UGC) on the Web, the Internet has become an increasingly popular platform on which the general public can actively contribute opinions and interact with others in a convenient and ubiquitous manner [1]. For example, people can not only acquire information but also exchange views and share personal experiences on a multitude of platforms, such as online communities, blogs, forums and many other social networking sites. Groups of users with common interests or goals participate in online communities to exchange information [2], express opinions [3], seek emotional support [4] and establish social relationships with others [5]. By the end of June 2015, the number of Internet users using Internet forums and BBS (Bulletin Board System) in China was 12.07 million, which accounted for 18.0% of the total number of Internet users. As all online forums and knowledge sharing sites collaborate with search engines, the usage of online communities in China has even reached 80.3% [6].
For health communities, the increasing popularity of Health 2.0 technologies in the last decade has offered patients improved access to therapeutic information, medical knowledge, and emotional comfort [7]. A 2012 survey showed that 72% of American Internet users reported that they had looked up others’ experiences of health or health-related issues on the Internet [8]. Patient engagement is now widely recognized as a key factor of a high-quality healthcare system. Such engagement indicates that patients have enough knowledge, ability and willingness to participate in their own health management and disease treatment, and it helps patients attain better health outcomes, better treatment experiences, and lower costs in their health management [9]. Online health communities are effective tools to engage and uplift patients because they are patient-focused, create perceived self-efficacy, allow for meaningful community engagement, and empower patients [10]. Not surprisingly, people are taking an active role in managing their health outside of clinical settings. In addition to basic disease diagnosis and treatment, community members discuss other topics such as problems related to daily life, drug side effects, and related symptom descriptions [11]. As an important health information resource, OHCs can assist patients in making health management decisions, meeting their health information needs, and offering and gaining peer support more effectively [12,13]. Moreover, OHCs help empower patients through personal participation and by providing access to both information and emotional support [14].
Gathering people with similar characteristics into a group and studying the behavioral differences between different groups is helpful to gain a deeper understanding of the various groups and obtain
Corresponding author. Tel.: +86-21-64253177; fax: +86-21-64253177. E-mail addresses: [email protected] (X. Liu), [email protected] (M. Sun), [email protected] (J. Li). Tel.: +86-21-64253177; fax: +86-21-64253177.
https://doi.org/10.1016/j.ijmedinf.2017.12.019 Received 21 June 2017; Received in revised form 23 December 2017; Accepted 27 December 2017 1386-5056/ © 2018 Elsevier B.V. All rights reserved.
topics such as government and business in online civic participation [27]. In terms of communication styles, women have been found to use more psychological and social process related words, whereas men tend to use more words that refer to specific objects, attributes, and objective topics [28].
commercially meaningful guidance. Given people’s biological and social attributes, and from a practical standpoint, dividing people by gender is the most natural way to maximize the differences between groups. Customer segmentation by gender is an efficient, audience-oriented, and commercially viable method to differentiate markets and services [15]. A better understanding of gender differences in OHCs is important for understanding patient needs related to information, emotional support and relationship building. It also helps meet the demands of male and female patients for the topics that concern them and helps optimize the human-computer interface of health-related websites to achieve systematic community order [16].
This paper is organized as follows. Section 2 reviews studies of online gender differences and of gender differences in OHCs. Section 3 presents our research framework and methodology. Section 4 reports our secondary data set and experimental results. Finally, Section 5 concludes the paper by summarizing the contributions of this study and discussing suggestions for future research.
2.1.3. Emotional expression
Emotion-rich posts containing user-generated opinions have a great influence on consumer attitudes and behaviors [29], and sentiment analysis has been widely applied in studies of gender differences in emotion. Previous studies confirmed that males were more concerned with information seeking on the Internet, whereas females tended to express more and stronger emotions online than males [30]. Moreover, Zhang et al. [31] developed an algorithm to examine gender-specific emotional differences and found that females were more likely than males to express both positive and negative emotions in web forum communication. Clipson et al. [32] found that women of all ages showed a greater interest in a wider range of online communities, and that they placed greater emphasis than males on emotional, spiritual, and social communication in online social networks.
2. Literature review
2.1.4. Relationship establishment
A previous study found that community members with similar experiences and compassion were more likely to form strong relationships with each other [33]. Male users built new friendships more easily, whereas female users were more oriented towards relationship maintenance behaviors in online communities [20,34–36]. In a study of gender differences in social networks, Thelwall et al. [37] found that women usually played contributor roles because they showed a strong preference for helping others and publishing positive comments on social networking sites. Wright and Scanlon [38] found that friendships between women were more powerful and valuable than friendships between men and women or between men.
We conducted an extensive literature review to identify prior studies related to online gender differences and gender differences in online communities. To cover not only the medical literature but also the social sciences literature, searches were conducted using one-stop sites such as Google Scholar (scholar.google.com) and Baidu Scholar (xueshu.baidu.com). For studies related to online gender differences, we combined three search concepts: “online gender differences” OR “gender differences in social media” OR “gender differences on the Internet”. For studies related to gender differences in online communities, we combined two additional search concepts: “gender and online health” OR “gender and online support groups”. We included studies in English published in peer-reviewed journals that met the following criteria. First, intervention studies were required to focus on an Internet-based artifact such as a portal website, a search engine, an online community, an e-commerce website, or a social media website. Second, we required that the measured outcomes be behavior observed online, such as website visit behavior, contribution behavior, linkage behavior, purchase behavior, or attachment behavior. Third, the study had to present a comparison between male and female users, and the corresponding outcome measures had to be reported.
2.2. Gender differences in online health communities
The health care demands of men and women are often quite different. Gender differences in health care have been examined in various aspects such as perspectives on quality of care, care-seeking behaviors, and resource utilization [11,39,40]. With the growth of OHCs, more and more people acquire health knowledge and talk about health issues online to manage their own health, which enables researchers to investigate the different demands of each gender. Gender differences in online health settings are determined by human physiological characteristics, social characteristics and behavioral differences as well as the interactions among these factors. Previous studies indicated that males and females behave differently when seeking information or help for health-related concerns [23,41]. Lieberman [30] investigated an online cancer discussion group and found that men were less likely than women to participate in online cancer groups, and he argued that the silence of men in psychosocial support groups could be explained by disrespect for self-disclosure. Furthermore, the information needs and emotional needs of men and women in health communities also differ. Seale et al. [26] conducted a keyword analysis in online cancer forums to identify gender-specific participation topics, and found that posts from women were oriented much more towards the exchange of emotional support than were those of men, and that women’s use of superlatives related to feelings indicated greater emotional expressivity. Mo et al. [42] divided communication content into four topics: information provision or seeking, emotional encouragement or support, personal opinion and personal experience. They revealed that gender differences in communications in single-sex online health
2.1. Online gender differences
From the literature, we observed that gender differences in online environments have generally been studied in four contexts: Internet use, information needs, emotional expression and relationship establishment.
2.1.1. Internet use
As of June 2016, the ratio of male to female Internet users in China was 53:47 [17]. In the Web 2.0 environment, the gender differences among Internet users are more important since they reflect the social roles men and women play in not only real life but also virtual life. According to their own interests and material and emotional needs, men and women have different patterns in using the Internet and expressing online ideas [18–22].
2.1.2. Information needs
Depending on Internet user interests, personalities and information needs, men and women discuss different topics and they might have different communication styles when expressing their opinions in online communication [23–26]. For example, a previous study found that women often talked about their private lives, such as family and close friends, whereas men appeared to prefer discussing public life related to
poster’s information and the replies to his/her post. During the data cleaning step, we removed users whose gender was not visible and labeled the posts and friendship ties of every male and female user. After processing, the dataset contained 16,137 users with 7550 friendship ties and 19,976 posts.
support groups, such as breast cancer and prostate cancer online support groups, were evident, whereas the gender differences were not significant in mixed-sex health-related support groups, such as diabetes and lung cancer online support groups [42].
2.3. Research gap and research motivation
Few previous studies have explicitly identified gender differences in OHCs from original community posts across different aspects. To empirically examine gender differences in OHCs, we mined the posts published by users and investigated the relationship networks among users. Using data from a leading diabetes community in China (Tianmijiayuan), we combined three techniques (topic modeling analysis, sentiment analysis and friendship network analysis) to investigate gender differences in chronic-disease, mixed-sex online health communities from three aspects: information needs, emotional needs and relationship needs.
3.3. Data analysis
We conducted our data analysis from the following three aspects: topic modeling analysis, sentiment analysis and friendship network analysis.
3.3.1. Topic modeling analysis
To further understand the content of the posts in Tianmijiayuan, we captured the topics users discussed by means of LDA (Latent Dirichlet Allocation), run from an R language script, which enabled topics to be detected automatically and objectively. LDA is an approach to the automatic extraction of text topics that addresses the problem of semantic association among words, topics and documents. It is an unsupervised machine learning technique, also known as a three-layer Bayesian probability model, with a three-layer structure of words, topics and documents. It assumes that each document in a corpus has a probability distribution over “topics”, and that each topic is associated with a distribution over words. To put it simply, LDA models a document as a mixture of latent topics, each of which defines a probability for every word in the dictionary. Each topic can then be represented by a set of keywords. As a result, a document can belong to multiple topics [46]. The generative process assumes a fixed set of topics, and each document is sampled from a mixture of those topics. When a corpus of documents is fitted with LDA, the resulting topics often reveal a great deal about the relationships and shared structure between documents [47].
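The mixture-of-topics idea can be illustrated with a small collapsed Gibbs sampler. This is only a sketch under stated assumptions: the study used an R script, whereas this example is in plain Python; the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the toy English tokens, and the rule of labeling each document with its most frequent topic are all hypothetical choices made for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Returns (top_words, dominant): the highest-count words per topic and the
    most frequent topic of each document. All names here are illustrative.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # count of word v in topic k
    topic_total = [0] * n_topics                           # tokens assigned to topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    top_words = [
        [vocab[v] for v in sorted(range(n_words), key=lambda v: -topic_word[t][v])[:5]]
        for t in range(n_topics)
    ]
    dominant = [
        max(range(n_topics), key=lambda t: doc_topic[d][t]) for d in range(len(docs))
    ]
    return top_words, dominant


# Hypothetical, already segmented and stop-word-filtered toy posts.
toy_docs = [
    ["insulin", "dosage", "doctor", "insulin"],
    ["doctor", "dosage", "prescription"],
    ["family", "birthday", "happy", "family"],
    ["happy", "travel", "family"],
]
top_words, dominant = lda_gibbs(toy_docs, n_topics=2)
```

Because every token carries its own topic assignment, a single document can mix several topics; reducing that mixture to one label per document, as done for `dominant` here, is a simplification.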
3. Methodology
3.1. Research design
We designed a general framework to examine gender differences based on secondary data from OHCs, consisting of the following steps:
(1) Data acquisition: This step includes data capturing and data cleaning.
(2) Data analysis: We adopted topic modeling analysis, sentiment analysis and friendship network analysis to identify gender differences in users’ information needs, emotional needs and relationship needs in OHCs, respectively. Furthermore, we conducted tests of statistical significance in each section to examine gender differences.
(3) Gender comparisons: We present the results of the gender comparison in Section 4.
3.3.2. Sentiment analysis
Sentiment analysis, also known as opinion mining, is a method for automatically detecting opinions and emotions in free text [37]. Sentiment analysis can be used for three different kinds of tasks: subjective and objective classification of texts, emotional polarity discrimination (e.g., positive, negative and neutral sentiments) and emotional intensity discrimination [48]. In addition, many different types of emotion can also be measured. For example, Ekman [49] investigated distinct basic emotions such as fear, anger, sadness and happiness, and found that these emotions differed not only in expression but probably also in other important aspects, such as signals, appraisal, antecedent events, coherence, probable behavioral response, duration and physiology. In this study, we compared positive and negative emotions between males and females. A negative effect is more likely to influence a user’s psychological state than a neutral or positive effect [50]. We therefore further measured the expression of three typical negative emotions: anxiety, anger, and sadness. These three negative emotions are the most common and instinctive among patient emotional states. Anxiety is a type of painful emotional experience that is not consistent with the situation. Anger is the excitement caused by extreme dissatisfaction. Sadness is the mood caused by separation, loss and failure. In the present work, we employed a text analysis program called TextMind to assess all emotional expressions posted in Tianmijiayuan. TextMind is a Chinese language psychological analysis platform developed by the Computational Cyber-Psychology Lab, Institute of Psychology, Chinese Academy of Sciences, based on LIWC (Linguistic Inquiry and Word Count).
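The word-counting idea behind LIWC-style tools can be sketched in a few lines. This is not TextMind’s implementation or lexicon: the function `emotion_proportions`, the tiny English lexicon and the sample tokens are hypothetical, and the sketch assumes the text has already been segmented into words.

```python
def emotion_proportions(tokens, lexicon):
    """Return, for each category, the share of tokens found in its word list."""
    total = len(tokens)
    if total == 0:
        return {category: 0.0 for category in lexicon}
    return {
        category: sum(token in words for token in tokens) / total
        for category, words in lexicon.items()
    }


# Hypothetical mini-lexicon; a real LIWC-style dictionary is far larger.
toy_lexicon = {
    "positive": {"happy", "hopeful"},
    "anxiety": {"worried", "nervous"},
    "anger": {"angry", "hate"},
    "sadness": {"sad", "cry"},
}
toy_post = ["i", "feel", "worried", "and", "sad", "today", "but", "hopeful", "again", "now"]
scores = emotion_proportions(toy_post, toy_lexicon)
```

Scores of this kind are proportions of all words in a post, which is one reason the values for any single emotion category tend to be small.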
As a transparent text analysis program, LIWC has been used in many studies to explore the relationship between psychological processes and word categories in the words people used in daily life over the past 20 years. It has been demonstrated that the linguistic features included in LIWC could reflect the user’s attention
In the remainder of this section, we provide a detailed description of the data acquisition and analysis process.
3.2. Data acquisition
Diabetes is a metabolic disease characterized by sustained hyperglycemia. The prevalence of diabetes in people over the age of 18 in China has reached as high as 11.6%, and 50.1% of Chinese people have pre-diabetes [43]. Self-management is essential to diabetes care. A large number of studies have confirmed that the quality of life of patients with diabetes is closely related to their self-management ability [44,45]. Patient self-management mainly relies on drugs, treatment and the corresponding diet control and exercise, so more uncertainties are involved, which makes it more meaningful to investigate patient activities in online communities. We chose Tianmijiayuan (http://bbs.tnbz.com/) as our source of data. Tianmijiayuan is the largest and most active online diabetes community in China. It is a nonprofit online social media website that aims to help individuals with diabetes share knowledge or information about diabetes, exchange experiences of diabetes treatment, seek emotional comfort and make friends facing similar diabetic conditions. Since its launch in September 2005, the user base of Tianmijiayuan has grown exponentially. To obtain the secondary data, we crawled the basic information of every user, all users’ homepages, and all the posts in Tianmijiayuan from September 2005 to June 2015 using a Java WebCrawler script. Institutional review board approval was obtained for this study. Each user’s homepage includes the user’s gender and the IDs of his/her friends. We also crawled all posts in Tianmijiayuan, including the
female users in the Tianmijiayuan community from 2005 to 2015. The number of male users in the community clearly exceeds the number of female users. To compare the activities of male and female users, we divided all users into four categories according to their posting and replying behaviors in Tianmijiayuan: both post and reply, only post but do not reply, only reply but do not post, and neither post nor reply. Among these categories, inactive users are those who neither posted nor replied, and the remaining users are active users. Figs. 2 and 3 show the proportions of the various types of male and female users, respectively. As shown, the proportion of active male users remains stable, and the share of male users who both post and reply is on the rise, whereas the proportion of active female users decreases significantly. Overall, male users in the community are more active in posting than female users. We then compared the average numbers of posts and replies between males and females. As shown in Fig. 4, among active users, males reply more frequently, whereas females post more frequently. Thus, females are more likely to express their views through posting, whereas males are more likely to share information through replying. Independent sample t-tests were performed to test differences between male and female users in the number of posts and replies. As shown in Table 1, female users post more initial messages than male users (Male = 1.73, Female = 2.51, p < 0.001), while male users post more reply messages than female users (Male = 131.17, Female = 94.42, p < 0.01).
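For readers who want to see how such a comparison is computed, the following is a minimal sketch of an independent two-sample t statistic. The text does not state which variant of the test was used, so Welch’s unequal-variance form is an assumption here, and the function name `welch_t` and the toy counts are hypothetical. The sketch stops at the statistic and its degrees of freedom; converting them to a p-value requires a t-distribution table or a statistics library.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_value = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_value, df


# Hypothetical per-user post counts for two small groups.
group_a_posts = [1, 2, 3, 4, 5]
group_b_posts = [2, 4, 6, 8, 10]
t_value, df = welch_t(group_a_posts, group_b_posts)
```

With large and unequal standard deviations in the two groups, as in Table 1, an unequal-variance test is a natural choice, but that is a judgment made for this sketch rather than something the text confirms.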
focus, emotionality, social relationships, thinking styles and individual differences in language use [51]. TextMind provides automatic word segmentation for simplified Chinese together with a psychological language analysis package. Its lexicon, text and punctuation processing methods are designed specifically for the simplified Chinese context. Additionally, the lexical classification system of TextMind is compatible with that of LIWC.
3.3.3. Friendship network analysis
A typical network is composed of many nodes and edges, where nodes represent individuals in the system, and edges represent associations between nodes [52]. The friendship network is an undirected network formed by OHC users who make friends with each other. It represents strong ties between users in the communities compared with other associations such as posting and replying relationships. Thus, we constructed a friendship network to identify the central users who play important roles in online communities. In our work, we used three measures (degree centrality, betweenness centrality and Bonacich power) to identify these influential users. The degree centrality of a user is the number of direct contacts, and it captures the connectedness or popularity of a node in a social network [53]. An online community user has a higher degree centrality if he or she has more friendship relationships with others. Betweenness centrality is defined as the frequency with which a user falls on the shortest paths between pairs of other members in the friendship network [54]. It focuses on the extent to which a member serves as an intermediary in a social network. Bonacich power is a modified degree measure in which a node’s centrality is its summed connections to others, weighted by their centralities. Bonacich [55] used the “power index” to study the influence of the midpoint of the network.
It is an extended measure of degree centrality that considers not only a node’s connections to its direct neighbors but also its neighbors’ connectedness [56]. If a node is connected to a node with high centrality, its own centrality increases and, accordingly, the centrality of the other nodes connected to it also increases.
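The three measures can be sketched on a toy undirected graph. This is an illustration only: the study computed them in UCINET, the edge list and user labels below are hypothetical, the attenuation value `beta=0.1` is an assumption (the text does not report one), and the Bonacich scores are left unnormalized. Betweenness uses Brandes’ breadth-first accumulation, and Bonacich power is obtained by iterating c(v) = sum over neighbors u of (1 + beta * c(u)), which converges when beta is below the reciprocal of the largest eigenvalue of the adjacency matrix.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) friendship ties."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Number of direct friends of each user."""
    return {v: len(neighbors) for v, neighbors in adj.items()}


def betweenness_centrality(adj):
    """Shortest-path betweenness for an unweighted, undirected graph (Brandes)."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        order = []
        preds = {v: [] for v in adj}
        n_paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        n_paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate from the farthest nodes back
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}


def bonacich_power(adj, beta=0.1, n_iter=100):
    """Unnormalized Bonacich power via fixed-point iteration."""
    power = {v: 0.0 for v in adj}
    for _ in range(n_iter):
        power = {v: sum(1 + beta * power[u] for u in adj[v]) for v in adj}
    return power


# Hypothetical friendship ties: "a" is a hub, "d" bridges to "e".
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
toy_adj = build_adjacency(toy_edges)
degree = degree_centrality(toy_adj)
betweenness = betweenness_centrality(toy_adj)
power = bonacich_power(toy_adj)
```

In this toy graph, "d" has fewer friends than "a" but still scores on betweenness because every path to "e" passes through it, and "b" outranks "e" on Bonacich power because its only friend is the better-connected hub.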
4.2. Topic modeling analysis
In the data we crawled from Tianmijiayuan, there were 10,146 posts from male users and 9830 posts from female users. First, we segmented the Chinese texts into words and filtered stop words, handling male and female posts separately. We then carried out LDA topic analysis using an R language script, with Gibbs sampling to approximate the parameters of the LDA model. After running the LDA program several times, we found that with 6 topics the posts were clustered best and the words of each topic were the easiest to summarize. The posts aggregated into these 6 topics constitute the major components of discussion for male and female users, respectively. As a result, each post is assigned to a topic, and the most frequently occurring words in each topic that males and females talk about are shown in Tables 2 and 3, respectively.
4. Research results
4.1. Descriptive statistics analysis
By the end of June 2015, there were a total of 9646 male users and 6491 female users registered in Tianmijiayuan whose gender was visible. Fig. 1 shows a comparison of the number of male and
Fig. 1. The number of male and female users from September 2005 to June 2015.
Fig. 2. The proportion of male users.
Fig. 3. The proportion of female users.
Fig. 4. The average amount of posts and replies of active users.
[Fig. 4 axes: Year (2005–2015) on the x-axis; average replies and average posts on the y-axes.]
The name of each topic was determined according to the following strategy. First, words with similar meanings were grouped to identify possible entities. For example, we identified three entities from Topic 1 for male users as friends (friends, children), doctors (doctor, experts) and advice (prescription, advice, condition, insulin, appearance, health). Second, possible relationships among the entities were identified. For example, the friends and doctors are both advice givers (actor), and they give advice to users. Last, we gave a simple name to
posts from 2005 to 2015. The proportions are very small because only a small number of posts contain emotional content. As seen in Fig. 6, women’s anxiety was significantly higher than that of men before 2010; after 2010, the two alternated. The results of the independent sample t-tests comparing male and female users on negative emotions (i.e., anxiety, anger and sadness) are shown in Table 5. Female users expressed more anxiety and sadness than male users. However, we did not observe a significant difference in the expression of anger.
Table 1 The differences of male and female users on posts and replies.
| Measure | Gender | Mean | Std. Dev. | t value | Sig. |
|---|---|---|---|---|---|
| Posts | Male | 1.73 | 13.711 | 3.571 | 0.000*** |
| Posts | Female | 2.51 | 11.897 | | |
| Replies | Male | 131.17 | 995.050 | 2.657 | 0.008** |
| Replies | Female | 94.42 | 661.063 | | |
** p < 0.01. *** p < 0.001.
4.4. Friendship network analysis
describe the identified entities and their relationships. For example, we named Topic 1 “friends’ and doctors’ advice”. Of the 10,146 posts published by male users, 2471 were assigned to Topic 1, 2136 to Topic 2, 1698 to Topic 3, 1423 to Topic 4, 1287 to Topic 5, and 1131 to Topic 6. We summarized these six topics as follows: friends’ and doctors’ advice, exercise and diet, hypoglycemia and treatment, diabetes study and research, saccharification risk, and instrument use. Of the 9830 posts published by female users, 2383 were assigned to Topic 1, 1969 to Topic 2, 1690 to Topic 3, 1419 to Topic 4, 1255 to Topic 5, and 1114 to Topic 6. We summarized these six topics as follows: treatment and dating, family life, instrument use, diet and exercise, emotional expression, and blood sugar control. Tables 2 and 3 show that several topics are clearly distinctive. Men and women share some common topics, such as medication, diabetic therapy, diet and doctors’ advice. By comparison, female users prefer talking about life, hobbies, family and emotional expression, as reflected in Topics 2 and 5, whereas men pay more attention to specific topics such as diabetes learning and research, as reflected in Topic 4 and the words in the output list. In addition, male users are more likely to use diabetes terminology in their posts.
The topology of the friendship network in Tianmijiayuan was extracted using UCINET (see Fig. 7). Each blue triangle represents a single male user; likewise, each red circle represents a single female user. Isolated nodes are not included in the network; therefore, there are 694 male users and 551 female users in this network. Table 6 shows the distribution of male and female users’ friendship ties: the majority (54.82%) of male users choose to make friends with male users, and most (61.39%) female users also build friendships with male users. Concerning network centrality, we adopted three different measures. Table 7 shows the average degree centrality, betweenness centrality and Bonacich power for the 694 male users and 551 female users in this friendship network. Although male users’ centrality is on average higher than that of female users in terms of degree, betweenness and Bonacich power, the differences between male and female users are not statistically significant. Therefore, we further explored differences between active and inactive users. Users who never posted or replied to messages were considered inactive users. In contrast, users who posted or replied to at least one message were considered active users. After distinguishing inactive users from active users, we obtained different social network patterns for active male and female users. We used ANOVA with two factors (gender and activity) to analyze each social network measure separately. For betweenness centrality, there is a significant interaction between gender and activity (F (1, 1243) = 4.037, p < 0.05). However, the interactions between gender and activity are not significant for degree centrality (F (1, 1243) = 2.184, p > 0.05) and Bonacich power (F (1, 1243) = 2.737, p > 0.05).
Since active users are the ones who matter to the virtual community, we nevertheless compared active male users with active female users. As shown in Table 8, active male users exhibited higher values for degree centrality (Male = 10.045, Female = 7.796, p < 0.05), betweenness centrality (Male = 3141.157, Female = 1911.668, p < 0.05), and Bonacich power (Male = 25.290, Female = 19.292, p < 0.05) than active female users. Tables 9–11 show the most influential users for the different centralities. Table 9 includes six males and four females, the users with the most friends. Table 10 shows the list of users with the highest betweenness centrality. Six males and four females are
4.3. Sentiment analysis
Fig. 5 shows the trend in the positive and negative emotions that males and females expressed in their posts from 2005 to 2015. Positive emotions are expressed more frequently than negative emotions by both male and female users. However, female users are more sentimental than male users, expressing a higher proportion of both positive and negative emotions. The results of independent sample t-tests on male and female users’ positive and negative emotions are shown in Table 4, indicating that female users express more negative feelings than male users (Male = 0.002, Female = 0.003, p < 0.05). Fig. 6 shows the average frequency of words associated with three distinct negative emotions, anxiety, anger and sadness, in the context of
Table 2 Top 10 words and topic names associated with male users for each topic.
1 2 3 4 5 6 7 8 9 10 Topic name
Topic 1
Topic 2
Topic 3
Topic 4
Topic 5
Topic 6
prescription friends doctor advice condition experts children insulin appearance health friends’ and doctors’ advice
glycemic index postprandial exercise control diet nutrition weight night medicine breakfast exercise and diet
doctor treatment lower glucose diabetics medication happy feeling Metformin America dosage hypoglycemia and treatment
diabetes learn guideline injection hospital report surgery biochemistry normality research diabetes study and research
Saccharification side effect Risk Protein morning Method depressed diagnose Check Diabetic saccharificati-on risk
Glucometers Limosis test paper Result Roche Severity Time Test Sinocare Complication instrument use
Table 3 Top 10 words and topic names associated with female users for each topic.
Topic (topic name): words ranked 1 to 10
Topic 1 (treatment and dating): treatment, exercise, friends, patients, diabetics, discussion, analysis, influence, symptom, catch a cold
Topic 2 (family life): daughter, children, injection, insulin pump, Medtronic, mom, appearance, thank, consultation, travel
Topic 3 (instrument use): glucometers, result, test strip, Roche, mediation, Sinocare, health, share, condition, vitality
Topic 4 (diet and exercise): limosis, exercise, postprandial, dinner, night, breakfast, time, make over, report, feeling
Topic 5 (emotional expression): happy, sugar control, doctor, emotion, life, depressed, notice, Bayer, tired, birthday
Topic 6 (blood sugar control): control, glycemic index, lower glucose, diet, complication, lower glucose, method, hope, hospital, syringe needle
included, which indicates that they are good mediators and have stronger social skills in this friendship network. In Table 11, we can see the six male and four female users with the highest Bonacich power. Both these 10 users and the users they befriend occupy highly central positions. Taken together, the three rankings suggest that male users are more influential in the friendship network. Comparing the top ten users with the highest degree centrality, betweenness centrality and Bonacich power, we find that five males and two females overlap (marked in bold in the three tables).
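The comparison of the three top-ten lists described above amounts to ranking users on each measure and intersecting the results. A minimal sketch follows; the user identifiers and scores are made up for illustration and are not the values in Tables 9–11, and the helper names are hypothetical.

```python
def top_k(scores, k):
    """Return the k user ids with the highest score, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [user for user, _ in ranked[:k]]


def overlap(*rankings):
    """Users that appear in every one of the given rankings."""
    common = set(rankings[0])
    for ranking in rankings[1:]:
        common &= set(ranking)
    return common


# Hypothetical centrality scores for four users
degree = {"u1": 9, "u2": 7, "u3": 4, "u4": 1}
betweenness = {"u1": 120.0, "u2": 15.0, "u3": 80.0, "u4": 0.0}
power = {"u1": 30.5, "u2": 22.0, "u3": 9.0, "u4": 2.5}

shared = overlap(top_k(degree, 2), top_k(betweenness, 2), top_k(power, 2))
```

Users in the intersection are central on every measure at once, which is the sense in which the overlapping users are singled out in the text.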
Table 4 The differences of male and female users on positive and negative emotions.
Emotions            Gender   Mean    Std. Dev   T value   Sig.
Positive emotions   Male     0.003   0.022      −1.953    0.051
                    Female   0.005   0.027
Negative emotions   Male     0.002   0.020      −2.032    0.042*
                    Female   0.003   0.022
* p < 0.05.
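The per-gender means in Table 4 are averages of emotion-word frequencies. As a rough illustration of how such a quantity can be computed, the sketch below assumes that the frequency of a post is the share of its tokens found in an emotion word list; this definition, the tiny lexicon and the sample posts are all assumptions made for the example and are not the study's dictionary or data.

```python
from statistics import mean

# Hypothetical mini-lexicon; the study's actual word lists are not reproduced here.
LEXICON = {
    "positive": {"happy", "hope", "thank"},
    "negative": {"depressed", "tired", "worried"},
}


def emotion_frequency(tokens, category):
    """Share of a post's tokens that belong to one emotion category."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token.lower() in LEXICON[category])
    return hits / len(tokens)


def mean_frequency_by_gender(posts, category):
    """posts is an iterable of (gender, tokens); returns {gender: mean frequency}."""
    by_gender = {}
    for gender, tokens in posts:
        by_gender.setdefault(gender, []).append(emotion_frequency(tokens, category))
    return {gender: mean(values) for gender, values in by_gender.items()}


# Made-up, already tokenized posts
posts = [
    ("male", ["insulin", "dosage", "report", "happy"]),
    ("male", ["glucose", "test", "result", "normal"]),
    ("female", ["tired", "and", "depressed", "today"]),
    ("female", ["thank", "you", "doctor", "hope"]),
]

negative = mean_frequency_by_gender(posts, "negative")
positive = mean_frequency_by_gender(posts, "positive")
```

Because most tokens in a health post are not emotion words, frequencies defined this way are small, which is consistent with the order of magnitude of the means reported in Table 4.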
5. Discussion and conclusion In this study, we explored gender differences by mining the user-generated content (UGC) published by users and investigating the relationship networks among users in a mixed-sex online health community focused on a chronic disease. Using data from a leading diabetes community in China (Tianmijiayuan), we incorporated three techniques (topic modeling analysis, sentiment analysis and friendship network analysis) to investigate gender differences in three respects: information needs, emotional needs and relationship needs. Our study enriches the knowledge regarding gender differences in the online health community literature, since it provides a comprehensive and systematic research framework for mining UGC and relationship networks in OHCs, and it offers a new perspective for gender-related research compared with previous survey and interview studies. Overall, our analysis indicates that gender differences do exist in mixed-sex online health communities for chronic disease: (1) The descriptive analysis shows that male users outnumber female users in the community. Meanwhile, males are more likely to share information through replying, whereas females are more likely to express their views through posting. (2) The topic modeling analysis shows that male users' postings are usually more professional and contain more medical terms than female users' postings. In addition to topics that are commonly discussed in health communities, such as diet, medicine, and treatment, women are more likely to seek emotional support. (3) The sentiment analysis shows that female users express more negative emotions than male users do, especially anxiety and sadness. This indicates that female users are more likely to express sentiments in their posts and replies. This information can help doctors and family members attend to patients' emotions in a more effective and personalized way. (4) In terms of the friendship network, we investigated the friendship network pattern in Tianmijiayuan and listed the most prestigious users in the network. The results show that both male and female users are more oriented towards making friends with male users. The centrality measures reveal that active male users are more influential in the OHC. The current study also has several limitations that could be addressed in future research. First, we collected data from only one OHC, which is related to diabetes, a typical chronic disease; future research could consider data from various sources and incorporate acute diseases, such as cardiovascular disease. Second, when conducting the topic modeling analysis, we did not take time into consideration. Future work could use time series analysis to investigate how topics evolve for each gender. Third, the friendship network of users provides a limited data source for identifying influential OHC users. Future studies can incorporate other relationships, such as the replying network [57], to evaluate users' roles in communities more comprehensively. Lastly, this study was based on a leading online diabetes community in China, and there is no comparative analysis
Fig. 5. Average frequency of words associated with positive and negative emotions. (Line chart: average frequency by year, 2005–2015, with separate series for male and female users' positive and negative emotions.)
Fig. 6. Average frequency of words associated with anxiety, anger and sadness. (Line chart: average frequency by year, 2005–2015, with separate series for male and female users' anxiety, anger and sadness.)
between Eastern and Western cultures. Since Western cultures differ from Eastern cultures in many ways, the findings from this study should be applied to Western countries with caution. For example, Western cultures tend to emphasize individualism, whereas Chinese culture tends to emphasize collectivism. Social relationships are more important in China, so the findings about the friendship network in this study may not apply directly to Western cultures. Our research results have numerous implications that can facilitate

Table 5 The differences of male and female users on negative emotions.
Negative emotions   Gender   Mean      Std. Dev   T value   Sig.
Anxiety             Male     0.00018   0.0037     −3.060    0.002**
                    Female   0.00055   0.0091
Anger               Male     0.00013   0.0033     −1.172    0.241
                    Female   0.00020   0.0045
Sadness             Male     0.00023   0.0050     −2.779    0.005**
                    Female   0.00047   0.0059
** p < 0.01.

Table 6 The distribution of male and female users' friendship ties.
Gender   Male     Female
Male     54.82%   45.18%
Female   61.39%   38.61%

Table 7 The differences of male and female users on social network measures.
Network measures   Gender   Mean       Std. Dev   T value   Sig.
Degree             Male     6.267      16.004     0.663     0.507
                   Female   5.804      7.632
Betweenness        Male     1826.160   8846.656   1.093     0.190
                   Female   1386.139   3719.331
Bonacich power     Male     15.439     39.427     0.660     0.500
                   Female   14.313     18.803
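The T values in Tables 4, 5 and 7 come from independent-sample t-tests that compare male and female means. The source does not say whether a pooled-variance or an unequal-variance test was used, so the sketch below simply assumes Welch's unequal-variance form; the function name and the sample values are hypothetical, and the sketch is not meant to reproduce the reported statistics.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    var_a = variance(sample_a)  # sample variance, n - 1 in the denominator
    var_b = variance(sample_b)
    standard_error = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Made-up degree values for two small groups of users
male_degree = [1, 2, 3]
female_degree = [2, 3, 4]
t_value = welch_t(male_degree, female_degree)
```

The sign of the statistic follows the order of the arguments: a negative value means the first group has the lower mean, as with the negative T values in Tables 4 and 5, where female users score higher.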
Fig. 7. The topology construction of friendship network. (For interpretation of the references to colour in the text, the reader is referred to the web version of this article.)
their needs and shed light on how to better serve those patients. First, our topic modeling analysis can be used to identify and extract OHC topics, which helps doctors tailor their guidance to patients of different genders. Second, through the emotional analysis of the posts in the forum, we can understand the emotionality of male and female OHC members. During the treatment process, doctors and patients' family members can then interact with patients in a more effective and personalized way to help them maintain positive attitudes. Third, through the analysis of the friendship network, we can find influential OHC users and analyze their value and contribution to the community, which offers users guidance on how to find and build friendships with powerful members in OHCs. Lastly, our results can also help communities better manage the platform by recommending influential users and appropriate posts to patients, reinforcing users' self-empowerment and strengthening the community's prosperity. The current study also provides insights into ergonomic implementation that takes gender into account. First, the system should recommend posts on different topics to male and female users and request replies accordingly. Such gender-specific recommendations would be useful because male users are more interested in answering professional or technical posts, while female users are better at providing social and emotional support. In terms of ergonomic implementation, providing information of interest to the targeted user benefits both the replier and the questioner. Second, this study provides strategies for negative emotion management. Forum managers should pay more attention to female users' discussions to limit the potential spread of negative emotions. These design considerations will help optimize human-computer interaction and improve services in the online health community.
Table 8 The differences of active and inactive users on social network measures.
Network measures         Activity level   Gender   Mean       Std. Dev    T value   Sig.
Degree centrality        Active users     Male     10.045     19.921      1.990     0.047*
                                          Female   7.796      8.364
                         Inactive users   Male     1.419      1.473       0.467     0.641
                                          Female   1.351      1.623
Betweenness centrality   Active users     Male     3141.157   11261.073   1.979     0.048*
                                          Female   1911.668   3590.583
                         Inactive users   Male     20.525     205.804     0.523     0.601
                                          Female   17.683     115.896
Bonacich power           Active users     Male     25.290     48.700      2.164     0.031*
                                          Female   19.292     20.940
                         Inactive users   Male     3.325      3.948       0.028     0.976
                                          Female   3.315      3.510
* p < 0.05.

Table 9 Top 10 users with the highest degree centrality in the friendship network.
User_id   Gender   Degree
82598     male     228
93272     male     226
82256     male     192
85531     male     103
84638     male     82
21308     female   72
494       female   56
47319     female   51
126222    female   50
93454     male     49
Authors’ contributions Xuan Liu: Research Idea, Writing. Min Sun: Data Analysis, Writing. Jia Li: Data Collection, Research Design.
Table 10 Top 10 users with the highest betweenness centrality in the friendship network.
User_id   Gender   Betweenness
93272     male     128937.10
82598     male     125575.66
84638     male     90875.77
82256     male     86051.40
61280     male     43004.91
494       female   39978.12
21237     female   35837.87
126222    female   34272.57
85531     male     32666.15
21308     female   27079.86
Funding statement This research was supported by the National Natural Science Foundation of China grant numbers 71471064, 71371005, and 91646205, and also supported by the Fundamental Research Funds for the Central Universities. Competing interests statement The authors have no competing interests to declare.
Table 11 Top 10 users with highest Bonacich power in the friendship network.
User_id   Gender   Bonacich power
82598     male     561.71
93272     male     556.79
82256     male     473.02
85531     male     253.76
84638     male     202.02
21308     female   177.38
494       female   137.96
47319     female   125.65
126222    female   123.18
61280     male     120.72

Summary points
The results indicated that (1) Male users' posting content was usually more professional and included more medical terms. Comparatively speaking, female users were more inclined to seek emotional support in the health communities. (2) Female users expressed more negative emotions than male users did, especially anxiety and sadness. (3) In addition, male users were more centered and influential in the friendship network than were women.
interactions between doctors and patients as well as the construction of healthy community information systems. Information needs, emotional support and relationship needs are important requirements for patients with diabetes and their family members. Those differences investigated in our research for male and female patients can help better understand

References
[1] G.J. Johnson, P.J. Ambrose, Neo-tribes: the power and potential of online communities in health care, Commun. ACM 49 (1) (2006) 107–113.
[2] H. Park, S.P. Min, Cancer information-seeking behaviors and information needs among Korean Americans in the online community, J. Commun. Health 39 (2) (2014) 213–220.
[3] Y. Lu, K. Jerath, P.V. Singh, The emergence of opinion leaders in a networked online community: a dyadic model with time dynamics and a heuristic for fast estimation, Manage. Sci. 59 (8) (2013) 1783–1799.
[4] A. Chmiel, J. Sienkiewicz, M. Thelwall, et al., Collective emotions online and their influence on community life, PLoS One 6 (7) (2011) 1–8 (e22207).
[5] A. Tommasetti, O. Troisi, S. Cosimato, Patient Empowerment and Health online Community: two ways to give the new viability doctor-patient relationship, J. Mol. Struct. 193 (2) (2014) 295–300.
[6] CNNIC, Research Report of Chinese Internet Users Search Behavior [EB/OL], (2015) http://news.xinhuanet.com/politics/2015-07/23/c_128051995.htm.
[7] M. Lieberman, The role of insightful disclosure in outcomes for women in peer-directed breast cancer groups: a replication study, Psycho-oncology 16 (10) (2007) 961.
[8] H. Yang, C.C. Yang, Using health-consumer-contributed data to detect adverse drug reactions by association mining with temporal analysis, ACM Trans. Intell. Syst. Technol. 6 (4) (2015) 1–30.
[9] S. Barello, G. Graffigna, E. Vegni, Patient engagement as an emerging challenge for healthcare services: mapping the literature, Nurs. Res. Pract. 2012 (2090–1429) (2012) 905934.
[10] M. Househ, E. Borycki, A. Kushniruk, Empowering patients through social media: the benefits and challenges, Health Inf. J. 20 (1) (2014) 50.
[11] A.E. Thompson, Y. Anisimowicz, B. Miedema, et al., The influence of gender and other patient characteristics on health care-seeking behavior: a QUALICOPC study, BMC Fam. Pract. 17 (1) (2016) 1–7.
[12] L.M. Hoey, S.C. Ieropoli, V.M. White, et al., Systematic review of peer-support programs for people with cancer, Patient Educ. Couns. 70 (3) (2008) 315–337.
[13] C. Nath, J. Huh, A.K. Adupa, et al., Website sharing in online health communities: a descriptive analysis, J. Med. Internet Res. 18 (1) (2016) e11.
[14] V.D.E. Martijn, J.F. Marjan, W.M.A.V.D.E. Johanna, M.J. Faber, J.W.M. Aarts, et al., Using online health communities to deliver patient-centered care to people with chronic conditions, J. Med. Internet Res. 15 (6) (2013) 190–200.
[15] H. Kraft, J.M. Weber, A look at gender differences and marketing implications, Int. J. Bus. Soc. Sci. 3 (21) (2012) 247–253.
[16] Y. Lu, P. Zhang, J. Liu, et al., Health-related hot topic detection in online communities using text clustering, PLoS One 8 (2) (2013) 1–9 (e56221).
[17] CNNIC, The 38th Statistical Report on Internet Development in China [EB/OL], http://it.people.com.cn/GB/119390/118340/406323/.
[18] E. Garbarino, M. Strahilevitz, Gender differences in the perceived risk of buying online and the effects of receiving a site recommendation, J. Bus. Res. 57 (7) (2004) 768–775.
[19] S.C. Herring, Posting in a Different Voice: Gender and Ethics in Computer-Mediated Communication, State University of New York, 1996.
[20] A.M. Kimbrough, R.E. Guadagno, N.L. Muscanell, et al., Gender differences in mediated communication: women connect more than do men, Comput. Hum. Behav. 29 (3) (2013) 896–900.
[21] C. Ogan, F. Cicek, M. Ozakca, Letters to Sarah: analysis of email responses to an online editorial, New Media Soc. 7 (4) (2005) 533–557.
[22] L.H. Shaw, L.M. Gant, Users divided? Exploring the gender gap in internet use, Cyberpsychol. Behav. 5 (6) (2003) 517–527.
[23] S. Bidmon, R. Terlutter, Gender differences in searching for health information on the internet and the virtual patient-physician relationship in Germany: exploratory results on how men and women differ and why, J. Med. Internet Res. 17 (6) (2015) 1–19 (e156).
[24] C.S. Mackenzie, W.L. Gekoski, V.J. Knox, Age, gender, and the underutilization of mental health services: the influence of help-seeking attitudes, Aging Ment. Health 10 (6) (2006) 574–582.
[25] M. Mora, J.E. Shell, C.S. Thomas, et al., Gender differences in questions asked in an online preoperative patient education program, Gend. Med. 9 (6) (2012) 457–462.
[26] C. Seale, S. Ziebland, B.J. Charteris, Gender, cancer experience and internet use: a comparative keyword analysis of interviews and online cancer support groups, Soc. Sci. Med. 62 (10) (2006) 2577–2590.
[27] J.E. Fuller, Equality in cyber democracy? Gauging gender gaps in on-line civic participation, Soc. Sci. Q. 85 (4) (2004) 938–957.
[28] L.N. Matthew, J.G. Carla, D.H. Lori, et al., Gender differences in language use: an analysis of 14,000 text samples, Discourse Processes 45 (3) (2007) 211–236.
[29] I. Kareklas, D.D. Muehling, T.J. Weber, Reexamining health messages in the digital age: a fresh look at source credibility effects, J. Advert. 44 (2) (2015) 88–104.
[30] M.A. Lieberman, Gender and online cancer support groups: issues facing male cancer patients, J. Cancer Educ. 23 (3) (2008) 167–171.
[31] Y. Zhang, Y. Dang, H. Chen, Research note: examining gender emotional differences in web forum communication, Decis. Support Syst. 55 (55) (2013) 851–860.
[32] T.W. Clipson, S.A. Wilson, D.D. Dufrene, The social networking arena: battle of the sexes, Bus. Commun. Q. 75 (1) (2012) 64–67.
[33] K.B. Wright, S.B. Bell, K.B. Wright, et al., Health-related support groups on the internet: linking empirical findings to social support and computer-mediated communication theory, J. Health Psychol.: Interdiscip. Int. J. 8 (1) (2003) 39–54.
[34] C.S. Ang, Internet habit strength and online communication: exploring gender differences, Comput. Hum. Behav. 66 (2017) 1–6.
[35] S.G. Mazman, Y.K. Usluel, Gender differences in using social networks, Turk. Online J. Educ. Technol. 10 (2) (2011) 133–139.
[36] N.L. Muscanell, R.E. Guadagno, Make new friends or keep the old: gender and personality differences in social networking use, Comput. Hum. Behav. 28 (1) (2012) 107–112.
[37] M. Thelwall, D. Wilkinson, S. Uppal, Data mining emotion in social network communication: gender differences in MySpace, J. Assoc. Inf. Sci. Technol. 61 (1) (2010) 190–199.
[38] P.H. Wright, M.B. Scanlon, Gender role orientations and friendship: some attenuation, but gender differences abound, Sex Roles 24 (9) (1991) 551–566.
[39] M.T.G.W. Cecile, P.V. Katja, M.K. Annemarie, Gender perspectives and quality of care: towards appropriate and adequate health care for women, Soc. Sci. Med. 43 (5) (1996) 707–720.
[40] G.M. Owens, Gender differences in health care expenditures, resource utilization, and quality of care, J. Manage. Care Pharm. 14 (Suppl. (3)) (2008) 2–6.
[41] J. Rowley, F. Johnson, L. Sbaffi, Gender as an influencer of online health information-seeking and evaluation behavior, J. Assoc. Inf. Sci. Technol. 68 (1) (2017) 36–47.
[42] P.K. Mo, S.H. Malik, N.S. Coulson, Gender differences in computer-mediated communication: a systematic literature review of online health-related support groups, Patient Educ. Couns. 75 (1) (2009) 16–24.
[43] J.C. Chan, Y. Zhang, G. Ning, Diabetes in China: a societal solution for a personal challenge, Lancet Diabetes Endocrinol. 2 (12) (2014) 969–979.
[44] K. Lorig, P.L. Ritter, M.G. Ory, et al., Effectiveness of a generic chronic disease self-management program for people with type 2 diabetes: a translation study, Diabetes Educ. 39 (5) (2013) 655–663.
[45] C.H. Yu, J.A. Parsons, M. Mamdani, et al., A web-based intervention to support self-management of patients with type 2 diabetes mellitus: effect on self-efficacy, self-care and diabetes distress, BMC Med. Inf. Decis. Mak. 14 (1) (2014) 117.
[46] D.M. Blei, A.Y. Ng, M.I. Jordan, Latent dirichlet allocation, J. Mach. Learn. Res. 3 (2003) 993–1022.
[47] D. Weinshall, D. Hanukaev, G. Levi, LDA topic model with soft assignment of descriptors to words, International Conference on Machine Learning (2013) 711–719.
[48] A. Abbasi, H. Chen, A. Salem, Sentiment analysis in multiple languages: feature selection for opinion classification in Web forums, ACM Trans. Inf. Syst. 26 (3) (2008) 1–34 (12).
[49] P. Ekman, An argument for basic emotions, Cognit. Emotion 6 (3–4) (1992) 169–200.
[50] P. Rozin, E.B. Royzman, Negativity bias, negativity dominance, and contagion, Pers. Soc. Psychol. Rev. 5 (4) (2001) 296–320.
[51] Y.R. Tausczik, J.W. Pennebaker, The psychological meaning of words: LIWC and computerized text analysis methods, J. Lang. Soc. Psychol. 29 (1) (2010) 24–54.
[52] A.W. Wolfe, Social network analysis: methods and applications, Am. Ethnol. 24 (1) (1997) 136–137.
[53] J. Nieminen, On the centrality in a graph, Scand. J. Psychol. 15 (1) (1974) 332–336.
[54] L.C. Freeman, Centrality in social networks conceptual clarification, Soc. Netw. 1 (3) (1979) 215–239.
[55] P. Bonacich, Power and centrality: a family of measures, Am. J. Sociol. 92 (5) (1987) 1170–1182.
[56] X. Liu, S. Jiang, H. Chen, et al., Modeling knowledge diffusion in scientific innovation networks: an institutional comparison between China and US with illustration for nanotechnology, Scientometrics 105 (3) (2015) 1953–1984.
[57] M. Chau, J. Xu, Business intelligence in blogs: understanding consumer interactions and communities, Manage. Inf. Syst. Q. 36 (4) (2012) 1189–1216.