Author(s): Claudia Simone Dorchain
The revealing connection between language and character has been known since antiquity. Nevertheless, the attempt to measure not only the ordinary, but especially the criminal mind by means of language has not yet been significantly successful in psychological-forensic research. Several questionnaires are already in practical use for measuring the tendency to commit crimes on the basis of antisocial personality disorder, as are various attempts at speech pattern analysis. Both are prone to error: the questionnaires because of the symptomatic tendency of many criminal subjects to manipulate the answers, and the speech analysis models for reasons inherent in the method. CKM (Cluster Category Model) is a new test method that synthesises humanistic knowledge about personality psychology in order to arrive at precise statements about potentially deviant character tendencies by means of a content analysis of texts organised according to seven clusters.
There has always been a revealing connection between human language and character: language characterises character, character shapes language. This fundamental insight into language as an indicator of character was already known to ancient philosophers. According to legend, Socrates once said to a young man, ‘Speak, that I may see you!’ (in Latin: loquere, ut te videam), the idea being that the individual speech patterns would reveal the character and motivation of the student [1]. The fundamental connection between individual speech patterns and personality was thus already known to the ancient Greeks, who were not only great philosophers but also good educators and judges of character. But systematic research in this area has not been handed down to us from antiquity.
In the early modern period, many writers such as Giovanni Boccaccio (1313-1375), Jean de la Bruyère (1645-1696) and Friedrich Schiller (1759-1805) dealt with language and its relationship to character, long before psychology existed as a science. In Schiller's plays, such as Don Carlos, exalted speech patterns can be found that characterise neurotic, histrionic or criminal individuals. Georg Christoph Lichtenberg (1742-1799), a naturalist, aphorist and contemporary of Johann Wolfgang von Goethe (1749-1832), was one of the first to deal with the complex psychological interplay of verbal and non-verbal expressions, albeit sporadically. Among classical literary scholars, research on language and character was generally limited to specific individual characterisations and was still without systematic value. The first scientist to recognise and name systematic connections here was the influential linguist Wilhelm von Humboldt (1767-1835). For Humboldt, language always revealed a complex ‘world view’, both collectively, i.e. for the community of all speakers of a particular language, and individually, for the individual [2].
Humboldt distinguished two components of language: the ‘ergon’ and the ‘energeia’.
His thesis was that the ‘ergon’ (the linguistic expression) only changes when the ‘energeia’ (the attitude) has changed first. Particularly interesting in the Humboldtian context of ‘ergon-energeia’ would, of course, be the question of the extent to which an individual's language changes when that individual has psychopathic traits. But the systematic study of individual deviant character and language remained an unfulfilled aspiration of scholars until the end of the 19th century.
The first systematic impulses in language analysis only came with Sigmund Freud (1856-1939) in psychoanalytic research. The basic assumption here was that the unconscious would become visible in random slips of the tongue or in a joke [3]. Freud himself did not develop any psychometric programmes for analysing speech patterns, but the fundamental psychoanalytical assumption, based on Humboldt's idea, that the unconscious shapes the ‘energeia’ of speech patterns as their recognisable ‘ergon’, remained decisive for procedural research in the following decades [4,5].
Psychometric tests always attempt to generate information about the state of mind of the subjects on the basis of a specific database. For example, potential disruptions may be conceptualised and incorporated into the test procedure, operationalised by items that suggest specific deviations from the personality norm [6]. Recently, interesting correlations were obtained between cases of depression in companies and linguistic patterns in employee surveys on the basis of the Occupational Depression Inventory (ODI), and some of these were also transferred to computer-based review procedures [7]. But depression, as important as it may be for private and economic contexts due to its increased frequency in the general population in Western industrialised nations, is rarely a criminogenic factor. Grievance, on the other hand, is different, as it can indeed have a criminogenic effect: a dictionary of common threats and hate speech caused by grievance has recently been published [8]. Specifically with regard to the state of mind of the criminal, test-theoretical research in psychology today offers, among others, the following recognised (mostly language-based) approaches, which are used individually or in combination in practice:
The clinical-symptomatic approach of the Dark Triad/Dark Tetrad can be exemplified by individual language patterns (‘typical expression of a narcissist’), whereas the significantly sociological approach of the CVS generally considers class-specific language patterns rather than individual ones (‘characteristic speech patterns of a white-collar offender from the lower middle class’). The individual-pathological-linguistically oriented approaches of HPC or PCL, APQ and PICTS represent a more or less good balance between the individual and specific-collective language patterns in their test-theoretical representations.
The Dark Triad/The Dark Tetrad
Narcissism, machiavellianism and psychopathy (in subclinical form) represent the three components of the ‘Dark Triad’, which has become one of the most popular models in deviance research [9].
Narcissism is understood in the Dark Triad, analogous to the ICD-10 that was valid at the time, as a massive personality disorder accompanied by fantasies of grandiosity; machiavellianism is considered the basis of ruthless manipulative behaviour (‘the end justifies the means’); and (subclinical) psychopathy primarily means egocentricity, disinhibition and fearlessness, which significantly increases the likelihood of criminal behaviour. Applications of Paulhus & Williams' model to subpopulations show significant correlations between parental styles, Dark Triad traits, violence, and crime [16]. There is also some research on the use of certain words, phrases and speech patterns by individuals with Dark Triad character traits [17]. A further development of the diagnostic model of the Dark Triad is the Dark Tetrad, which additionally includes the symptom of sadism and understands it as a form of escalation [10]. However, there is hardly any empirical research on the possible connection between sadism and language, possibly because sadism is implicitly understood as ‘violently silent’, in line with Hannah Arendt's famous dictum on the power of violence.
Disadvantage: The classification of the Dark Triad (two clinical symptoms, one literary reference) is imprecise and therefore does not allow for the exact derivation of test methods; in addition, the elements of the Dark Triad/Tetrad are comorbid.
In contemporary test theory, the Crime and Violence Scale (CVS) is also widespread as a psychometric procedure that is often used in empirical research and is also widely discussed. It establishes systematic and supra-individual relationships between violence (readiness to use violence), various crimes and demographic factors, for example by linking the factors of ethnicity, age and gender [11-13].
Correlations arise between male gender, gun ownership and propensity to commit crimes. Interestingly, according to the studies, perpetrators often behave in a ‘role-specific’ manner or fulfil conservative role expectations, for example, in that women with a propensity to commit crimes primarily commit acts in their immediate environment or generally commit acts related to relationships more often. The CVS can be used interculturally and is able to provide significant insights into subpopulations. However, linguistic aspects as indicators play a subordinate role in the CVS.
Disadvantage: The Crime and Violence Scale is less suitable for measuring the individual offence history of a single offender, as it is designed to depict collective, population-dependent patterns.
In the 1970s, the Canadian psychologist Robert Hare developed the Hare test and the Hare Psychopathy Checklist (HPC or PCL) with 22 items. The Hare test uses a semi-structured interview and additional linguistic information (e.g. court transcripts, assessments by probation officers) to examine the psychopathic state and the likelihood of recidivism of criminals. Although the Hare test chronologically predates the APQ in the history of psychometrics, i.e. it is the older instrument, it is the more differentiated of the two: it measures ‘psychopathy’ in its various degrees of occurrence, not merely antisocial behaviour. Although designed as a diagnostic tool, the PCL also has a high predictive value for recidivism [18].
Disadvantages: The PCL depends entirely on the quantity and quality of the case file and can lead to a high degree of bias if the data density is insufficient.
In the Antisocial Personality Questionnaire (APQ), the subjects are asked questions that suggest antisocial behaviour, such as: ‘Would you go to the cinema without paying if no one noticed?’, ‘Do you often feel like breaking things?’ or ‘Do you become suspicious when people are friendly to you?’ The scientific (and correct) basic assumption of APQ is that antisocial personality disorder as defined by ICD-10 is a predictor of criminal behaviour. In addition, the test correlates with three factors of the ‘Big Five’ model, namely agreeableness, neuroticism and extraversion, which increases comparability between test models and makes APQ attractive for comparative research at the model level [19]. However, APQ does not treat language as an indicator, but rather latent or evident behavioural tendencies; language is considered, if at all, only in direct connection with these.
Disadvantage: Not all criminal tendencies can be explained by antisocial personality disorder; tendencies with other causes therefore remain undetected by APQ.
PICTS (Psychological Inventory of Criminal Thinking Styles) is a more elaborate personality test that includes the findings of Hare and Blackburn & Fawcett and goes beyond them, using 80 items to capture the thought processes of criminals (and thus indirectly their speech patterns). The PICTS test is innovative in that it goes beyond mere behaviour and takes the cognitive component into account, which in turn gives it a strong conceptual similarity to Humboldt's fundamental psycholinguistic research. There are eight main factors of ‘criminal thinking’: mollification, cutoff, entitlement, power orientation, superoptimism, sentimentality, cognitive indolence and discontinuity. These factors prove to be quite conclusive indicators in practice, as there is an overlap with the thought patterns of the Dark Triad personality types. The PICTS test has been critically reviewed many times, and it has been found to provide a weak but statistically significant predictive probability of future offences based on habitual thought patterns [14,15].
Disadvantage: PICTS is quite abstract due to its cognitive point of view and less oriented towards concrete speech patterns, which is why respondents can deceive the test by giving ‘desired’ or manipulative answers.
A core problem of previous psychometric methods for criminality, regardless of the possible sophistication of their items, remains that these tests are mainly designed as questionnaires. In practice, this often results in inadequate reliability, since criminal subjects often lie and deliberately engage in ‘perception management’ or have limited self-awareness due to symptomatic perception disorders [20]. Assessments by third parties, including professional specialists such as psychologists and social workers, which can usually be included in the data for evaluation, are rarely free of bias.
Furthermore, all the tests mentioned (PICTS to a lesser extent) are based on ex post facto analyses, i.e. they are applied after the actions and evaluate those actions, and therefore do not help much in assessing criminal or pathological potential. The research question remains: how could one, on the one hand, assess future risks or recidivism of (repeat) offenders at an early stage, exclude as much bias as possible and, in addition, test subjects discreetly without them perceiving the test condition as such and deliberately laying false trails through wrong answers?
An innovative possibility would be to return to Socrates and Humboldt in the forensic-linguistic analysis of ordinary speech patterns, according to the model:
Cognitive patterns -> Speech patterns -> Behavioural tendencies
This chain of causation was already known to early psychological researchers and can now be operationalised using, for example, the recognised LIWC-22 method, which analyses language in spoken and written texts by combining a dictionary with a software program [21]. LIWC-22 provides a dictionary that classifies emotional states into 10 scale scores: affect, tone_pos, emotion, emo_neg, emo_sad, verbs, focuspast, communication, linguistic and cognition. One disadvantage here, however, is that the assignment of emotional states specifically to a potentially criminogenic profile is left to the discretion of the investigator.
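The dictionary-based principle behind such methods can be illustrated with a minimal sketch. It is not the LIWC-22 software or its dictionary: the word lists below are invented placeholders, only three of the category names mentioned above are reused as labels, and the function names `tokenize` and `category_rates` are assumptions made for this illustration. The sketch counts how many tokens of a text fall into each category and reports the result as a percentage of all tokens.

```python
import re
from collections import Counter

# Hypothetical toy dictionary (category -> word list).
# These entries are illustrative placeholders, NOT LIWC-22 content.
TOY_DICTIONARY = {
    "emo_neg": {"hate", "angry", "afraid", "worthless"},
    "tone_pos": {"good", "happy", "great"},
    "focuspast": {"was", "were", "had", "did"},
}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_rates(text, dictionary=TOY_DICTIONARY):
    """Return, per category, the share of tokens (in percent) that match it."""
    tokens = tokenize(text)
    total = len(tokens)
    counts = Counter()
    for token in tokens:
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    # An empty text yields 0.0 for every category instead of dividing by zero.
    return {
        category: (100.0 * counts[category] / total if total else 0.0)
        for category in dictionary
    }


if __name__ == "__main__":
    print(category_rates("I was angry and I had a good day"))
```

The sketch also makes the limitation named above visible: the output is a set of category rates, and any step from these rates to a criminogenic profile still has to be defined by the investigator.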
For the reasons mentioned above, previous psychometric methods and dictionaries are often insufficient or insufficiently precise in practice to determine a person's criminal potential or actual behaviour. It is also problematic that intentions, which are always significant in criminal law, can only be determined retrospectively, if at all [22]. This is where CKM, the Cluster-Category-Model, comes in. CKM, like every test model, is based on three components: database, concept and method. The CKM database consists (in basic research) of authored texts; the concept is that of the deviant personality, defined in depth by 7 clusters; the method is a qualitative content analysis with defined categories that correspond to the 7 clusters through linguistically associated items.
CKM was developed in collaboration with the Profiler's Institute in Frankfurt, which primarily works in the field of white-collar crime. Today, this area increasingly includes cybercrime, which corresponds to a social trend of relocating crime to the Internet [26]. Cybercrime includes offences such as cyberbullying, cyberstalking or even targeted smear campaigns on the Internet against individuals or companies. The perpetrators or clients are often individuals seeking revenge, such as ex-partners after an emotionally disturbing separation; statistically, however, they are much more often companies that want to increase their market share by deliberately spreading false information about their competitors in order to reduce demand for their products or services. In this case, the offender themselves spreads untrue negative factual claims on social media, publishes sensitive personal data, defames character, or even commissions authors (collectives) to falsify company ratings and blog entries. These highly criminal practices, which violate data protection, personal rights and possibly also competition laws, have the advantage for the test theory that there is sufficient data material available for evaluation.
The profiler collects the above-mentioned texts by the suspect and analyses their content in the following forensic-linguistic procedure. In a qualitative content analysis, the language patterns are assigned to the individual clusters, each of which forms its own category of syntactic and semantic patterns, seven in total (there is a dictionary for this purpose). The patterns are marked in colour and evaluated in frequency and amplitude. The result is a numerically abstract profile, such as C1-C2-C4, which can provide information about motives, affinity, offender characteristics and their genesis, as well as the risk of recidivism. The profile can also show at which stage of childhood or adolescence an offender was disturbed and which complexes he acts out through criminal behaviour, since each cluster corresponds to a stage of development. This also increases the profile validity and the prediction probability. In addition, CKM includes graphological evidence by associating clusters and handwriting samples. Ideally, handwriting samples from the suspect can be included in the database, or the subject is asked to provide a handwriting sample.
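How cluster assignments could be condensed into a profile string such as C1-C2-C4 can be sketched as follows. This is an illustration under stated assumptions, not the CKM procedure itself: the CKM dictionary is not reproduced here, so the marker words are artificial placeholders without diagnostic meaning; the selection rule (a cluster enters the profile if it accounts for at least a given share of all hits) and the names `cluster_counts`, `profile` and `min_share` are hypothetical; and the evaluation of amplitude is not modelled.

```python
import re
from collections import Counter

# Hypothetical placeholder markers for the seven clusters C1..C7.
# The actual CKM dictionary is not given in the text; these tokens
# are artificial and carry no diagnostic meaning.
TOY_MARKERS = {
    "C1": {"markera"},
    "C2": {"markerb"},
    "C3": {"markerc"},
    "C4": {"markerd"},
    "C5": {"markere"},
    "C6": {"markerf"},
    "C7": {"markerg"},
}


def cluster_counts(text, markers=TOY_MARKERS):
    """Count how often the marker words of each cluster occur in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for token in tokens:
        for cluster, words in markers.items():
            if token in words:
                counts[cluster] += 1
    return counts


def profile(counts, min_share=0.2):
    """Join all clusters whose share of the hits reaches min_share (assumed rule)."""
    total = sum(counts.values())
    if total == 0:
        return ""
    selected = [c for c in sorted(counts) if counts[c] / total >= min_share]
    return "-".join(selected)


if __name__ == "__main__":
    sample = "markera markera markera markerb markerb markerd markerd markerc"
    print(profile(cluster_counts(sample)))
```

In this toy input, C3 occurs only once among eight hits and therefore falls below the assumed threshold, while C1, C2 and C4 remain in the profile.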
In several case studies of intensive offenders, a convincing detailed character profile with a high probability of prediction was possible. In addition, the differentiated stylometry-based approach makes it possible to identify authors in the case of several suspects or perpetrator collectives using CKM, which has been a central question in psychological-criminological research for a decade [27-28]. CKM also provides a good assessment of the risk of recidivism in imprisoned offenders and serial offenders, as well as a risk assessment of first-time offenders or offenders on probation and antisocial individuals with a tendency to commit crimes who have not yet been convicted. CKM can also provide additional help for therapists in the penal system, as well as being used in the future as further training for social workers, lawyers and judges.
CKM was first presented at the international ATINER conference on psychology in Athens in May 2023. The Department of Clinical Psychology and Neuropsychology at Lund University, Sweden, subsequently criticised the psychoanalytic elements of the test and pointed out that psychometric methods in Scandinavia, but increasingly also in continental Europe, are primarily structured in terms of behavioural therapy. The counter-criticism to this is that only parts of the CKM concept are psychoanalytically based and, apart from that, psychoanalysis is still considered a valid theoretical construct that should be covered by statutory health insurance. Furthermore, therapy should not be confused with diagnostics: as true as it is that therapy in the prison system in Europe, if not worldwide, is behaviourally oriented, an aetiopathology of possible criminogenic causes from the theoretical arsenal of behaviourism itself can only be derived in rare cases.
CKM is conceptually a postmodern continuation of the ancient attempt to decode human character through language. However, since the individual case-related empirical research on which CKM is based comes from the area of white-collar crime/cybercrime and is therefore highly specific (even though this type of crime is currently and will probably remain ‘en vogue’), and is always tied to sufficient data as a connecting factor, in practice there is currently still limited reliability [29]. In terms of test theory, this issue is similar to that of the recognised HPC or PCL, the informative and predictive quality of which is always positively correlated with the amount and quality of data, which can vary greatly from case to case [18]. The GESIS Leibniz Centre for Empirical Social Research in Cologne has therefore advised that a basic test should also be developed to enable speech forensic analysis in the future even if the data basis is intersubjectively different and/or insufficient.
In addition to the development and integration of a basic test, experts from the business world advised that the group of subjects be expanded from cybercrime suspects to include the general population and that a psychometric test be developed that can correlate the speech patterns of each citizen with character traits. In terms of test theory, the results of a subject's character traits could be compared with the average values of a standard sample according to CKM. In terms of application, this would automatically open up a wide range of possibilities. Such tests would also be suitable for widespread use, for example by human resources recruiters who want to fill a specific management position. For use in Germany in the field of human resources, the test would also have to be linked to the DIN 33430 standard [30]. However, CKM applications are also conceivable for non-professional purposes, such as online dating platforms that want to exclude candidates with antisocial and/or violent tendencies, which is a relevant preventive topic in times of increasing domestic violence [31].
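The comparison of an individual result with the average values of a standard sample can be illustrated with a standard score (z-score) per cluster. The following is a minimal sketch under assumptions: the text does not specify how CKM scores are scaled or normed, so the function name `z_scores`, the data layout (one dictionary of cluster scores per person) and all numbers are hypothetical.

```python
from statistics import mean, stdev


def z_scores(subject, norm_sample):
    """Express each cluster score of the subject in standard deviations
    from the mean of the norm sample (needs at least two persons)."""
    result = {}
    for cluster, value in subject.items():
        reference = [person[cluster] for person in norm_sample]
        mu = mean(reference)
        sigma = stdev(reference)  # sample standard deviation
        # If the norm sample shows no variation, no deviation can be expressed.
        result[cluster] = (value - mu) / sigma if sigma > 0 else 0.0
    return result


if __name__ == "__main__":
    # Invented example values, for illustration only.
    norm_sample = [
        {"C1": 1.0, "C2": 4.0},
        {"C1": 2.0, "C2": 5.0},
        {"C1": 3.0, "C2": 6.0},
    ]
    subject = {"C1": 4.0, "C2": 5.0}
    print(z_scores(subject, norm_sample))
```

A standardised value of this kind only states how unusual a score is relative to the norm sample; which deviation would count as conspicuous would have to be established in validation studies.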
A basic test is currently being developed in the form of a (partially) standardised questionnaire with items that have no recognisable connection to antisocial behavioural tendencies. This would enable discreet research and would be used in the next step of the CKM method. In this respect, CKM would deviate from its original intention of completely indirect testing, but it could gain significantly in construct validity. Data material going beyond the basic test, such as forum and blog entries, social media postings, court and interrogation protocols, can still be integrated in individual cases. The aim is to conduct large-scale research with the basic test and/or the basic test plus individual data material to increase validity.
After positive validation, CKM could also be fed into self-learning speech recognition software, which corresponds to the increasing relevance of AI in test theory [32]. Depending on the legal situation at the place of use, this would open up further areas of application for private individuals and institutions.