Literature Review: Preventing Suicide Using Machine Learning

Written for Lauren Brown’s Writing 316 class.



  The use of machine learning (ML) has exploded in the past decade, proving an exceptional tool for finding meaning in vast, complex data. It enables computers to perform traditionally human tasks, such as speech and image recognition, and it can also perform better than any human at many tasks, such as diagnosing early-stage lung cancer (Gould, Huang, Tammemagi, Kinar, & Shiff, 2021). All of these tasks require processing large amounts of complex data and coming to a conclusion about what it means or what action should be taken.
  What differentiates a machine learning algorithm from a traditional computer algorithm is that ML algorithms are mathematical frameworks for discovering patterns, whereas traditional algorithms have the patterns built in. For example, one common type of machine learning is supervised learning, in which the program is given a large set of questions and the correct answers. It runs through the questions many times, making a prediction about the answer each time. After each prediction, it “shows itself” the correct answer and adjusts its answer-picking algorithm based on that feedback. Another type of ML is reinforcement learning, in which the program is given an objective and allowed to practice, attempting the task thousands or millions of times. It learns over repeated attempts which actions better accomplish its objective, and it gradually becomes better at accomplishing it. Supervised learning, reinforcement learning, and all other types of machine learning have one thing in common: they learn patterns from data and use them to make predictions or perform tasks.
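  To make the supervised-learning loop described above concrete, here is a minimal sketch using a perceptron, one of the simplest supervised algorithms. It is an illustration only and is not drawn from any of the studies reviewed here; the toy "questions" (pairs of 0s and 1s), the function names, and the parameter values are all assumptions made for this example.

```python
# Toy supervised learning: a perceptron learns the rule "answer 1 only when
# both inputs are 1" from labeled examples. All data here is made up.

def predict(weights, bias, features):
    """Make a prediction (0 or 1) from the current weights."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0

def train(examples, epochs=20, learning_rate=1):
    """Run through the questions many times, adjusting after each guess."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        for features, correct_answer in examples:
            guess = predict(weights, bias, features)
            # "Show itself" the correct answer: error is 0 when the guess was right.
            error = correct_answer - guess
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias

# Each example pairs a "question" (two features) with its correct answer.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(examples)
predictions = [predict(weights, bias, features) for features, _ in examples]
print(predictions)
```

  Nothing in the code states the rule being learned; the program arrives at it only through repeated feedback, which is the sense in which the pattern is discovered rather than built in.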
  This data-processing capacity makes the potential for a marriage between machine learning and suicidology obvious to anyone familiar with both fields. Human beings are certainly vast and complex, so complex that despite decades of suicide research, the causes of suicide remain largely a mystery. Although awareness and prevention efforts have increased, deaths by suicide remain high and unpredictable. One hope is that machine learning could be used in our suicide-prevention efforts and yield better results than we have been able to achieve without it.
  Seeking evidence for or against this hope, I surveyed the published literature for articles related to this question: Can machine learning be used to prevent suicide? The research community consensus is definitely yes, but much remains to be seen about exactly how useful and effective it can be, largely because suicide prevention is such a big process and most studies explore only one part of it. This literature review is divided into four sections, each focusing on a different part of the suicide-prevention process: identifying and predicting suicidal thoughts, long-term prediction of suicidal behaviors, short-term prediction of the same, and actively preventing suicide. Most studies in the first two categories report success and optimism, but results in the latter two are far fewer and less conclusive.

Identifying and Predicting Suicidal Thoughts
  In fact, each part of the suicide-prevention process could be further broken down, approached from many angles. Conversing face-to-face with people might allow you to identify whether they are having suicidal thoughts, but so might looking at their text messages, examining their medical history, asking them some well-chosen questions, or a number of other approaches. Among these possibilities, medical history and questionnaires based on psychological research have historically been the primary means by which humans have predicted suicidal thoughts and behaviors (STB). This information also lends itself well to STB prediction using ML, as many researchers have shown.
  Consider, for example, a 2021 meta-analysis by Schafer, Kennedy, Gallyer, & Resnik, which compared the results of all studies published before May 1, 2020, about predicting suicidal ideation (SI) using either a theoretical or a machine learning approach. They found that ML predictions of SI are on average many times more accurate than human theory-based predictions. One valuable example is the study by Lin et al. (2020), which compares six different ML algorithms for predicting the presence and severity of SI in military personnel based on their answers to a psychological survey. All six algorithms achieved over 98% accuracy. This is impressive and is a valuable first step, but the goal isn’t only to have a computer do what a human can already do; it’s to find what computers can do that humans can’t.
  For progress on that goal, we turn to three related studies by Burnap, Colombo, Amery, Hodorog, & Scourfield (2017), Lekkas, Klein, & Jacobson (2021), and Jung, Kim, Nam, & Zhu (2021). All three groups take advantage of social media data to identify or predict suicidal thoughts. Burnap et al.’s study is the simplest of the three. After collecting tens of thousands of tweets relating to suicide, they had human volunteers manually classify them by purpose (to express suicidal intent, raise awareness, report news of suicide, etc.) and used those labels to train a classification algorithm. In later testing, the trained algorithm correctly identified 85% of tweets expressing suicidal intent, distinguishing them from other suicide-related tweets. Though this doesn’t directly prevent suicide, it’s easy to imagine how it could be used to do so, by reading tweets far faster than we ever could and raising the alarm to a person’s loved ones or medical workers if that person expresses suicidal intent.
  Lekkas et al.’s (2021) study differs from Burnap et al.’s (2017) in the platform (Instagram rather than Twitter) and in the data it utilizes. Burnap et al. made their predictions using only the text from the tweets, but Lekkas et al. dive deep into users’ Instagram data, including their follower count, engagement on the app, average number of likes, number of pictures posted in the last month, and much more, as well as some text from an interview held over Instagram direct message. The ML model predicted whether a user was experiencing acute SI. Its sensitivity (i.e., the proportion of true cases successfully identified) was just under 77%, although the sample size was small (42 people). These results, like those of Burnap and his collaborators, demonstrate the versatility of ML and the potential of using social media data to predict suicide.
  A third study contributing to this conversation is by Jung et al. (2021), who like Burnap et al. (2017) collected data from Twitter, but unlike Burnap et al. factored metadata (data about the data, such as the time of day, date, and location at which the tweet was posted) into their algorithm. This resulted in insights the other two could not have provided, such as the pattern that among tweets about suicide, the time of day at which tweets are most likely to express suicidal intent is the afternoon. These three studies taken together form a picture of one way machine learning may be uniquely suited for predicting suicidal thoughts: using social media data. The path forward for researchers is to combine and expand the ideas found in these studies, utilizing more types of data from more people and more social networks to get even more accurate predictions of SI.

Predicting Suicidal Behaviors in the Long Term
  Beyond predicting SI, an essential next step is predicting whether and when someone will attempt suicide, also referred to as suicidal behavior. This is commonly investigated retrospectively, meaning the researchers use a dataset of dozens or hundreds of pieces of information about people, one of which is whether they attempted suicide in a given time period. An ML model is then trained to look at all the information about a person aside from whether he attempted suicide and predict whether he did.
  One research group used a questionnaire-based approach similar to the one employed by Lin et al. (2020), described above. The questionnaire included questions about participants’ health, family life, lifestyle, and other related information. One year after the questionnaire, a follow-up email was sent asking participants about their suicidal thoughts and behaviors in the past year, and the researchers used the participants’ responses to train an ML model. The model’s results, like Burnap et al.’s (2017) and Lekkas et al.’s (2021), were good but not extraordinary, with a sensitivity of 80% (Macalli et al., 2021). Lin et al.’s experimental design and Macalli et al.’s differ only slightly, but the differences are worth noting. First, Lin et al. predicted only suicidal thoughts, but Macalli et al. predicted both suicidal thoughts and behaviors. Second, Lin et al.’s population of interest was military personnel, and Macalli et al.’s was college students. The second difference is especially significant; it provides evidence that ML can be used to predict suicidal thoughts and behaviors for a variety of types of people in a variety of settings.
  More evidence of that is found in a study by Weller et al. (2021), who investigate a third population of interest: adolescents. Their model included data from 179,000 Utah high school students, including their responses to hundreds of survey questions, and was able to predict the presence of STB with 91% accuracy. It also provided valuable insight into exactly what factors correlate most strongly with STB; the top three (for adolescents in Utah) are being bullied, being cyberbullied, and having serious arguments at home. This result further confirms that ML models can use survey responses to predict STB within many different populations, although so far no single model has achieved good predictive power in a heterogeneous population (i.e. one including people from all ages and walks of life). All studies have focused on a narrow subset of people—understandably, because collecting consistent data becomes more difficult the larger and more heterogeneous the population.
  In addition to survey responses, medical records can be a good way to predict STB. Walsh, Ribeiro, & Franklin found in a 2018 study that they could use longitudinal medical records of adolescents to train a model which predicted STB with sensitivity above 80%, and sometimes much better, depending on the population of interest and the time frame. It is important to note, however, that the precision (proportion of positive predictions which are correct) of their predictions dropped significantly when the population of interest comprised entirely depressed adolescents; that is, distinguishing suicidal adolescents from depressed non-suicidal adolescents is much more difficult than distinguishing suicidal adolescents from healthy ones. Further research into that distinction would be valuable, because in practice, the more common task is the more difficult one; all of a psychiatrist’s patients are mentally ill, but he would like to be able to distinguish those who are at risk of suicide from those who aren’t.
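  Because sensitivity and precision recur throughout this review, a short worked example may help keep them apart. The sketch below computes both from a list of true outcomes and a list of model predictions. The ten data points are invented purely for illustration and do not come from Walsh et al. or any other study cited here.

```python
# Sensitivity and precision computed from made-up outcomes and predictions.
# 1 = case present (or predicted present), 0 = absent.

def sensitivity(actual, predicted):
    """Proportion of true cases that the model successfully identified."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return true_pos / (true_pos + false_neg)

def precision(actual, predicted):
    """Proportion of positive predictions that were correct."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    false_pos = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    return true_pos / (true_pos + false_pos)

# Hypothetical data: 5 true cases and 5 non-cases.
actual    = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]

print(sensitivity(actual, predicted))  # 4 of 5 true cases found
print(precision(actual, predicted))    # 4 of 6 positive predictions correct
```

  In this invented example the model finds four of the five true cases (sensitivity of 80%) but also raises two false alarms, so only four of its six positive predictions are correct (precision of about 67%). A model can therefore score well on one measure and poorly on the other.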
  At this point, I will mention some common concerns in the research community about limitations on our ability to predict suicidal behaviors. One of the most prevalent is that many methodologies, both theoretical and using ML, rely upon self-reported suicidal ideation as one of the main predictors of suicidal behaviors. But often people don’t say when they are experiencing SI because of social stigma, the desire to avoid more medical appointments, or other reasons. One study has found over half of people who die by suicide deny having suicidal thoughts the last time they are asked before they die (Busch, Fawcett, & Jacobs, 2003). Another concern is that many predictive models rely too much upon previous suicide attempts as an indicator of future attempts, which is ineffective because two-thirds of suicide deaths are first-time attempts (Tsui et al., 2021). In short, ideally we would find ways to predict suicidal behavior without taking into account whether a person is having suicidal thoughts or has attempted suicide before.
  Two studies seek to address these concerns. In the first, Horvath, Dras, Lai, & Boag (2021) created and compared eight questionnaire-based ML models for predicting suicidal behavior, some of which included the answers to the questions about SI and some of which excluded them. They found that one of their models that did not have access to the information about SI was able to predict suicidal behavior better than any of the others, with sensitivity 78.6% and precision over 90%. This provides great hope for the future usefulness of ML algorithms that do not depend on SI to make their predictions. A few researchers have also sought to address the second concern, that predicting suicide attempts relies too much upon previous attempts. By combining information from medical data and from doctors’ written notes, they were able to predict first-time suicide attempts (Tsui et al., 2021). These two studies taken in conjunction make machine learning a far more promising tool for predicting suicidal behaviors, because they show that it is versatile enough to make accurate predictions even when available information is limited, as it often is in real-world settings.

Predicting Suicidal Behaviors in the Short Term
  Although the results of Horvath et al.’s (2021) and Tsui et al.’s (2021) studies are valuable, exactly how valuable depends on the question you ask. On one hand, the studies show that machine learning can predict suicidal behaviors not only accurately but robustly. With enough data, results can be good even without key pieces of information. But on the other hand, to translate the research into real-world usefulness, a significant question remains to be answered: Once we have a prediction, what do we do with it? Because all of the studies described in the previous section were retrospective, their predictive ability has either no specific timeframe or one too long to be useful. Generally, Macalli et al. (2021) and Weller et al. (2021) address the question “Is this person likely to be having suicidal thoughts or behaviors right now or to have them sometime in the future?” Generally, Walsh et al. (2018), Horvath et al. (2021), and Tsui et al. (2021) answer the question “Is this person likely to attempt suicide sometime in the future?” Notice how vague both questions are. None of the studies can provide information about the urgency of someone’s situation, such as how likely he is to attempt suicide in the next week or what the most effective way to help him is. Less research has been done seeking to answer these questions than the less specific ones, but they are the questions we need to answer to prevent suicide.
  Walsh et al., in 2021, narrowed the gap by designing a machine learning model to produce an estimated risk of suicide attempt in the next 30 days for each patient seen at one medical center. The first difference between this and previous studies is that this one has a specific time window, 30 days. The second is that the predictions were updated in real time. By continuously syncing new electronic health record (EHR) data input by nurses and doctors at the medical facility, the model was able to respond instantly to new information and provide an up-to-date assessment of each patient’s suicide risk. This type of model could be built into any large medical system to act as a continual protective screen, acting faster, more thoroughly, and potentially more accurately than traditional face-to-face screening for STB.
  Another study, like Walsh et al.’s (2021), relied upon EHR data to make short-term predictions, in this case of SI. However, these researchers also utilized a tool called ecological momentary assessments (EMAs). EMAs are short questionnaires about a person’s mood and behaviors that are given multiple times a day. They have significant advantages over regular backwards-looking questionnaires, because people rarely remember clearly everything they did in past days and how they felt as they did it. By asking about how you feel right now instead of how you felt sometime in the past, EMAs yield more accurate and more timely information.
  A significant result of the combination of EHR and EMA data was a 19.65% increase in sensitivity while predicting suicidal ideation compared to state-of-the-art techniques based solely on EHR data (Peis et al., 2019). The ability to ask questions throughout the day of those at risk of suicide and use ML to interpret their answers is a huge step toward effective ML-based prevention efforts. Because of this and similar results, and because the advent of smartphones makes EMA administration easy, EMAs are making their way into the research conversation in various fields. Full coverage of their uses is beyond the scope of this paper, but readers are encouraged to become familiar with the research on EMAs so as to incorporate them more expertly into suicide prevention research.
  For example, Mikus et al. (2017) explored one creative way to use EMA data for identifying short-term changes in mood of depressed patients. Responses to an EMA may be taken at face value, which is the approach Peis et al. (2019) took, but all data also produces metadata, like the Twitter metadata utilized by Jung et al. (2021). Mikus et al. built an ML model using both EMA responses and EMA metadata, including adherence, or how long the patient takes to respond to the EMA after being prompted, and usage data, or how long the patient spends on the EMA app. They found that neither adherence nor usage data had a significant effect on the model’s predictions, but there is ample room for further exploration of similar techniques for predicting suicidal behavior in the short term.

Preventing Suicide
  From short-term suicide prediction, we are one small step away from the crux of the issue: not identification, nor long- nor short-term prediction, but prevention. Through the lens of machine learning, it appears that we are nearing the limits of its role in the problem, since machine learning excels at finding patterns but certainly not at intervening in the real world with love, patience, and emotional intelligence. However, even in this, the prevention itself, a few researchers have envisioned a part to play for machine learning.
  I’ll begin with the only real-world experiment I found incorporating both machine learning predictions and systematic attempts to intervene when someone at risk was identified. I provide more details about this study than others, because many aspects of it are unique and therefore insightful. Similar to Lekkas et al. (2021), Burnap et al. (2017), and Jung et al. (2021), the researchers use social media to find and identify people at risk. Using text from a suicide-related microblog group, they trained several ML models, the best of which they then used to identify expressions of STB on Weibo, China’s Twitter equivalent. From there, counselors working with the researchers direct messaged the at-risk users and asked them to complete a psychological survey. If the survey confirmed they were experiencing STB, the counselors offered online therapy. In total, 12,486 people were sent messages, 5,542 responded, and 1,403 completed the survey correctly, but only 27 interacted with the counselors for more than 10 days. The biggest limitation of the study was a questionable metric; to gauge success, they tracked the difference in death-related words (positively correlated with STB) and future-related words (negatively correlated with STB) in the Weibo accounts of the people who replied to the direct message before and after the counselors contacted them. The change was statistically significant, but how effective they truly were in preventing suicide remains unclear (Liu et al., 2019).
  Insights from this study are plentiful. It reconfirms that ML can process much more data much more quickly than a human can, which is critical in a problem as widespread as suicide. It shows one way collaboration between machine predictors and human helpers can yield positive results, providing a model for similar future studies. It reveals the need for more thorough data collection in the follow-up, so that we can more accurately measure and compare results. And it suggests that the best suicide prevention programs may begin with online data but end with in-person interventions, because only a minute percentage of the initial participants retained interest in an online counselor for more than a few days. There are numerous available paths for future research. It has been well established that machine learning is great for predicting suicidal thoughts and behaviors, and Liu et al.’s study (2019) provides a springboard for other researchers, past prediction and into prevention.
  Others have proposed more ways to use machine learning to prevent suicide, though these papers merely lay out the theory of their ideas. Kelly et al. (2012) envision intelligent real-time therapy, which would go one step beyond Peis et al.’s (2019) and Mikus et al.’s (2017) EMA-based prediction models by using machine learning to automatically determine the best way to help people when their EMA responses indicate that their mood has dropped. The ML model, built into each person’s phone, could open a funny video, offer a piece of therapeutic advice, or suggest that the person call a loved one and then measure the effect of each intervention on the patient’s mood, learning over time what helps each person most in any given situation.
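  The description above does not specify how such a system would learn which intervention helps most. As a purely hypothetical illustration, one simple reinforcement-learning technique that fits the description is an epsilon-greedy strategy: usually offer the intervention with the best average effect so far, but occasionally try another one to keep learning. The sketch below is not Kelly et al.’s method; the intervention names, the simulated mood effects, and every parameter value are assumptions invented for this example.

```python
import random

# Hypothetical average mood improvement for one simulated person.
# These numbers are invented; a real system would measure mood via EMAs.
SIMULATED_EFFECT = {
    "funny_video": 0.2,
    "therapeutic_advice": 0.5,
    "call_loved_one": 0.8,
}

def choose(estimates, rng, epsilon=0.1):
    """Usually pick the best-looking intervention; sometimes explore."""
    if rng.random() < epsilon:
        return rng.choice(list(estimates))
    return max(estimates, key=estimates.get)

def update(estimates, counts, name, mood_change):
    """Fold the newly observed mood change into a running average."""
    counts[name] += 1
    estimates[name] += (mood_change - estimates[name]) / counts[name]

def simulate(rounds=2000, seed=0):
    rng = random.Random(seed)
    estimates = {name: 0.0 for name in SIMULATED_EFFECT}
    counts = {name: 0 for name in SIMULATED_EFFECT}
    for _ in range(rounds):
        name = choose(estimates, rng)
        # Simulated, noisy measurement of how much the person's mood changed.
        mood_change = rng.gauss(SIMULATED_EFFECT[name], 0.1)
        update(estimates, counts, name, mood_change)
    return estimates

estimates = simulate()
best = max(estimates, key=estimates.get)
print(best)
```

  In this simulation the program is never told which intervention works best; it comes to favor the most helpful one only by trying each and measuring the result, which is the core of the idea described above.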
  In another non-experimental paper by Reale et al. (2021), the researchers spent time in a military primary care setting interviewing physicians and nurses about how they would feel using computers to predict suicidality. In a way, Lin et al.’s (2020) study is one side of the coin, and Reale et al.’s is the other; the former makes predictions of SI in military personnel using ML, and the latter finds out what military doctors would do with those predictions. This type of research bridges the gap between technical, theoretical research and its real-world, domain-specific implementation, and it is recommended prior to any attempt to put machine learning models into practice. The ideal situation would be for researchers to organize a feedback loop between computer scientists and medical workers. Medical workers tell computer scientists what they need, computer scientists make an ML model to meet the need, medical workers put it into practice and in time suggest improvements, and the cycle repeats, little by little improving the model’s predictions and improving how it is used to save lives.

Conclusion
  There is little doubt that machine learning has a role in the future of suicide prevention. What is less certain is how large a role it can play. As a first step, and on a large scale, multiple researchers have built models capable of identifying, with a high degree of certainty, likely indicators of STB from among thousands of social media posts or survey answers. Predicting future suicidal behavior is more complex, and models’ predictive abilities vary with the type of data input, the population of interest, and the narrowness of the time frame considered. In spite of the complexity, many researchers have built good models for retrospectively predicting STB based on other health data and surveys, though few have put their models to the test and made predictions about the future. As the goal becomes loftier, the research becomes more scant; few people have tried using ML to predict suicide attempts in the near future and to intervene in people’s lives when needed.
  For the reader, I suggest a single most important takeaway: the merger of suicidology and machine learning is far from over. Plenty of research shows that machine learning techniques can accurately predict suicidal thoughts and attempts, and yet there have been few attempts to use that information in a meaningful way. Almost all the research has been from a bird’s-eye view, distant and theoretical. What is needed is more cooperation with those on the front lines of the fight against suicide, medical and psychological professionals, and studies designed to directly measure the number of people helped, not an indirect representative statistic. There’s also a lot of potential in using smartphones, including EMAs and social media, for real-time prediction, and, perhaps most importantly, for connecting those at risk to in-person help as quickly as possible.
  I believe a group of people with the right resources and the determination to save lives could build something that would make a real dent in the suicide pandemic, because the research has laid a solid foundation for them. We know machine learning is a powerful tool for predicting suicide; now we need to get our heads out of our datasets, into the real world, and use it.

References
Burnap, P., Colombo, G., Amery, R., Hodorog, A., & Scourfield, J. (2017). Multi-class machine classification of suicide-related communication on Twitter. Online Social Networks and Media, 2, 32-44. doi: 10.1016/j.osnem.2017.08.001

Busch, K. A., Fawcett, J., & Jacobs, D. G. (2003). Clinical correlates of inpatient suicide. The Journal of Clinical Psychiatry, 64(1), 14–19. https://doi.org/10.4088/jcp.v64n0105

Gould, M. K., Huang, B. Z., Tammemagi, M. C., Kinar, Y., & Shiff, R. (2021). Machine learning for early lung cancer identification using routine clinical and laboratory data. American Journal of Respiratory and Critical Care Medicine, 204(4), 445–453. https://doi.org/10.1164/rccm.202007-2791OC

Horvath, A., Dras, M., Lai, C. C. W., & Boag, S. (2021). Predicting suicidal behavior without asking about suicidal ideation: Machine learning and the role of borderline personality disorder criteria. Suicide and Life-Threatening Behavior, 51(3), 455-466. doi: 10.1111/sltb.12719

Jung, W., Kim, D., Nam, S., & Zhu, Y. (2021). Suicidality detection on social media using metadata and text feature extraction and machine learning. Archives of Suicide Research. doi: 10.1080/13811118.2021.1955783

Kelly, J., Gooding, P., Pratt, D., Ainsworth, J., Welford, M., & Tarrier, N. (2012). Intelligent real-time therapy: Harnessing the power of machine learning to optimise the delivery of momentary cognitive-behavioural interventions. Journal of Mental Health, 21, 404-414. doi: 10.3109/09638237.2011.638001

Lekkas, D., Klein, R. J., & Jacobson, N. C. (2021). Predicting acute suicidal ideation on Instagram using ensemble machine learning models. Internet Interventions, 25. doi: 10.1016/j.invent.2021.100424

Lin, G., Nagamine, M., Yang, S., Tai, Y., Lin, C., & Sato, H. (2020). Machine learning based suicide ideation prediction for military personnel. IEEE Journal of Biomedical and Health Informatics, 24, 1907-1916. doi: 10.1109/JBHI.2020.2988393

Liu, X., Liu, X., Sun, J., Yu, N. X., Sun, B., Li, Q., & Zhu, T. (2019). Proactive suicide prevention online (PSPO): Machine identification and crisis management for Chinese social media users with suicidal thoughts and behaviors. Journal of Medical Internet Research, 21. doi: 10.2196/11705

Macalli, M., Navarro, M., Orri, M., Tournier, M., Thiébaut, R., Côté, S. M., & Tzourio, C. (2021). A machine learning approach for predicting suicidal thoughts and behaviours among college students. Scientific Reports, 11, 1-8. doi: 10.1038/s41598-021-90728-z

Mikus, A., Hoogendoorn, M., Rocha, A., Gama, J., Ruwaard, J., & Riper, H. (2017). Predicting short term mood developments among depressed patients using adherence and ecological momentary assessment data. Internet Interventions, 12, 105–110. https://doi.org/10.1016/j.invent.2017.10.001

Peis, I., Olmos, P. M., Vera-Varela, C., Barrigon, M. L., Courtet, P., Baca-Garcia, E., & Artes-Rodriguez, A. (2019). Deep sequential models for suicidal ideation from multiple source data. IEEE Journal of Biomedical and Health Informatics, 23(6), 2286–2293. https://doi.org/10.1109/JBHI.2019.2919270

Reale, C., Novak, L. L., Robinson, K., Simpson, C. L., Ribeiro, J. D., Franklin, J. C., Ripperger, M., & Walsh, C. G. (2021). User-centered design of a machine learning intervention for suicide risk prediction in a military setting. AMIA Annual Symposium Proceedings, 2020, 1050–1058.

Schafer, K. M., Kennedy, G., Gallyer, A., & Resnik, P. (2021). A direct comparison of theory-driven and machine learning prediction of suicide: A meta-analysis. PLOS ONE, 16, 1-23. doi: 10.1371/journal.pone.0249833

Tsui, F. R., Shi, L., Ruiz, V., Ryan, N. D., Biernesser, C., Iyengar, S., Walsh, C. G., & Brent, D. A. (2021). Natural language processing and machine learning of electronic health records for prediction of first-time suicide attempts. JAMIA Open, 4(1). https://doi.org/10.1093/jamiaopen/ooab011

Walsh, C. G., Johnson, K. B., Ripperger, M., Sperry, S., Harris, J., Clark, N., Fielstein, E., Novak, L., Robinson, K., & Stead, W. W. (2021). Prospective validation of an electronic health record-based, real-time suicide risk model. JAMA Network Open, 4(3), e211428. https://doi.org/10.1001/jamanetworkopen.2021.1428

Walsh, C. G., Ribeiro, J. D., & Franklin, J. C. (2018). Predicting suicide attempts in adolescents with longitudinal clinical data and machine learning. Journal of Child Psychology & Psychiatry, 59, 1261-1270. doi: 10.1111/jcpp.12916

Weller, O., Sagers, L., Hanson, C., Barnes, M., Snell, Q., & Tass, E. (2021). Predicting suicidal thoughts and behavior among adolescents using the risk and protective factor framework: A large-scale machine learning approach. PLOS ONE, 16(11). https://doi.org/10.1371/journal.pone.0258535