Risk-benefit considerations of male circumcision

Last week, the Norwegian Socialist Left Party (SV) added to their program a proposed age limit of 15 years for male circumcision. The party is pretty small and the major parties do not agree, so there will be no change in Norwegian legislation on this. But one thing has been bothering me, and it does not really depend on one’s view of circumcision: the evidence on the risks and benefits of circumcision.

A couple of years ago, male circumcision was much debated in Norway, and I considered writing something about it then. I never got around to it, though. I suppose similar discussions appear elsewhere from time to time, so this might be of some general interest. In the debates I have seen, those defending the practice frequently state that circumcision does little or no harm and actually has health benefits. The main scientific source referred to (at least in the Norwegian debate) was the review of research published in Pediatrics in 2012. It was an official policy statement issued by the American Academy of Pediatrics (AAP), accompanied by a technical report reviewing the current evidence. As far as I am aware, this is still the AAP’s official stand on the matter. Importantly, the AAP does not generally recommend circumcision, and it is not considered a medical necessity. The main conclusion of the report is that the benefits outweigh the risks. My concern here is what this means. The report discusses many potential risks and benefits, of which some seem rather trivial to me while others seem important. So how did the authors weight the risks and benefits against each other? I have no idea, because that is described in neither the technical report nor the policy statement.

The report lists the evidence for and against a range of outcomes (most evidence is rated “fair”, and I think none is rated “excellent”), but there is no judgment of how important each outcome is compared to the others. Clearly, it is hard to compare risks for very different diseases and complications, and I have no idea how this should be done. Both risks and benefits can be anything from trivial to important. How do you weight a trivial harm with high probability against a serious benefit with extremely low probability? How do you weight a catastrophic complication of the procedure with extremely low probability against a certain but trivial benefit? How do you weight heterogeneous effects? These are very important considerations, and the panel has apparently made them, since they say the outcomes were weighted against each other. Still, the only argument given is a long list of evidence on various outcomes. The consideration of trade-offs is highly informal, and I have no idea how they weighted the unknowns.
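
To make the point concrete, here is a minimal sketch of what an explicit weighting could look like, in the spirit of a simple expected-utility calculation. Every probability and severity weight below is invented purely for illustration; the point is only that once such numbers are stated, the conclusion can be inspected, debated, and stress-tested against the weights.

```python
# Hypothetical illustration of explicit risk-benefit weighting.
# All probabilities and utility weights are invented for the example.

outcomes = {
    # name: (probability, utility weight; negative = harm)
    "minor local infection (harm)":             (0.02,   -1),
    "serious surgical complication (harm)":     (0.0001, -1000),
    "reduced UTI risk in infancy (benefit)":    (0.01,    5),
    "reduced HIV risk, heterosexual (benefit)": (0.005,   100),
}

expected_value = sum(p * u for p, u in outcomes.values())
print(f"Net expected value: {expected_value:+.3f}")

# Sensitivity check: double the severity of the rare serious complication
# and see whether the conclusion survives.
outcomes["serious surgical complication (harm)"] = (0.0001, -2000)
expected_value = sum(p * u for p, u in outcomes.values())
print(f"After doubling one severity weight: {expected_value:+.3f}")
```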

I suggest that when writing a review and giving policy recommendations (on any topic) in which you say something like “the benefits outweigh the risks”, you at least need to get the following straight:

  1. If you say you have weighted benefits against risks, this had better be clear and explicit; just saying so is not good enough. Not all risks and benefits are equally important, and they should be given different weights accordingly. Some risks should perhaps be avoided at all costs, some are quite manageable, and others are of no concern. The same applies to benefits: some are important, others are not. This needs to be made clear. Even doing it badly would be better than nothing, as it would allow the important trade-offs to be discussed by others. If the weighting is not explicit, I doubt it has been done at all.
  2. Moreover, some benefits can be achieved by other means without risk. Without such considerations, the weighing of pros and cons has not really been done.
  3. Risks that are possible but for which no estimates exist need to be considered explicitly anyway. For example, the AAP states that “there are no adequate studies of late complications”. You need to make clear how such lack of knowledge is given weight.

My main point is just that the AAP lacks a clear argument for their policy recommendation. It seems pretty opaque.

The post Risk-benefit considerations of male circumcision appeared on The Grumpy Criminologist 2018-04-23 10:00:26 by Torbjørn.

Norwegian politics: Sorry seems to be the hardest word

There is political turmoil in Norway right now. The New York Times gives a brief account in English at this link. It all started when our government suggested that citizens with dual citizenship who pose a threat to Norway could lose their citizenship, a proposal almost all political parties agreed with. Seems reasonable. The debate was then whether this should be decided by a court or administratively by the Ministry of Justice. The majority of the Parliament (including the Labour Party) said it should be done by a court.

That disagreement over the procedure for revoking citizenship prompted our former Minister of Justice to post on Facebook that the Norwegian Labour Party is a threat to our nation and that they prioritize terrorists’ rights over national interests, illustrating the point with a photo of some scary-looking jihadists. This might appear pretty silly, and it is also clearly a false accusation, given that the actual political disagreement in this case was rather small. However, right-wing extremist conspiracy theories have been making similar claims for many years, portraying the Labour Party as an enemy. This kind of conspiracy theory was an explicit motivation for Breivik’s terror attack on Utøya, a camp for the Labour Party’s youth organization, as well as the bomb outside the government building. Understandably, many were upset that the Minister of Justice fuelled such thinking, willingly or unwillingly.

Everybody can do silly things. Most of us would apologize when we realize it, often straight away, and at least when we are told. And certainly if the Prime Minister asks it of us. Our former Minister of Justice did no such thing. It took six days until she was pretty much forced to apologize in the Parliament. As the apology was not at all convincing, she had to return to the podium four times to apologize, and still hardly anyone found it convincing. Maybe because she did not apologize for the content of her statement. And maybe because she also refused to remove the post until it was discovered that the picture was owned by the Associated Press, which had not permitted its use in political campaigns.

Much can be said about this. But I will rather take a visual approach, making use of Microsoft’s API for analyzing facial expressions. What emotions do you express when you apologize but do not really mean it?

Data

All debates in the Parliament are filmed and available online here. I downloaded the video from the Parliament’s site and used Windows Movie Maker to cut out only the section where the Minister of Justice entered the podium to apologize the first time. I then used ffmpeg to split the clip into snapshots, two per second, resulting in about 350 photos. I then submitted each frame to the Azure Cognitive Services API for face detection and analysis, and collected the results to make some time series graphics.
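
The frame extraction step looks roughly like this (a sketch, not my exact script; the input filename is a placeholder):

```python
import os
import subprocess

os.makedirs("frames", exist_ok=True)

# Split the clip into two snapshots per second ("fps=2").
subprocess.run(
    ["ffmpeg", "-i", "apology_speech.mp4", "-vf", "fps=2",
     "frames/frame_%04d.png"],
    check=True,
)
```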

Importantly: the Face API is a machine learning service that can also analyze facial expressions. It scores eight emotions on a scale from 0 to 1, and the scores sum to 1, so each score can be interpreted as a probability. I could put some caveats here, but I won’t. Let’s just say it gives a pretty good idea of what emotions are expressed. The emotions are: anger, contempt, disgust, fear, happiness, neutral, sadness and surprise. My expectation for an apology would be a fairly high level of sadness. At least more sad than happy. Let’s see.
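
The scoring step might look roughly like this (again a sketch rather than my exact script; the endpoint region and subscription key are placeholders, and the response shape is the one the Face API documented at the time):

```python
import glob
import requests

# Placeholders: your Azure region and subscription key.
ENDPOINT = "https://westeurope.api.cognitive.microsoft.com/face/v1.0/detect"
KEY = "YOUR_SUBSCRIPTION_KEY"

rows = []
for path in sorted(glob.glob("frames/frame_*.png")):
    with open(path, "rb") as f:
        resp = requests.post(
            ENDPOINT,
            params={"returnFaceAttributes": "emotion"},
            headers={
                "Ocp-Apim-Subscription-Key": KEY,
                "Content-Type": "application/octet-stream",
            },
            data=f.read(),
        )
    resp.raise_for_status()
    faces = resp.json()
    if faces:  # one face expected per frame
        rows.append({"frame": path, **faces[0]["faceAttributes"]["emotion"]})
```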

Results

The plot shows the composition of emotional expressions for each picture frame (two frames per second). The exact measures vary quite a lot, so what is shown is the smoothed trend. There is perhaps less interesting development than I had hoped for, but on the other hand, I had no reason to expect a volatile emotional life on her part. Stable, but with a slight shift from neutral to happiness (she was smiling a bit).
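
The smoothing itself can be done with a simple rolling mean over the collected scores, for example (a sketch, assuming the rows collected by the snippet above are loaded into a pandas DataFrame):

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(rows)  # "rows" as collected in the sketch above
emotions = ["anger", "contempt", "disgust", "fear",
            "happiness", "neutral", "sadness", "surprise"]

# Centered rolling mean over ~5 seconds (10 frames at 2 frames/second)
smoothed = df[emotions].rolling(window=10, center=True, min_periods=1).mean()
smoothed.plot.area(title="Smoothed emotion scores over the speech")
plt.show()
```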

The facial expression is dominated by neutrality, at between 70 and 80%. Happiness accounted for the bulk of the remainder, between 10 and 20%. Neutral and happiness together accounted for about 90% throughout her speech. She scores pretty low on sadness, which might be one reason why nobody believed her apology.

The next plot shows the trends for the less dominant expressions during the same speech. I think it is worth pointing out that her score on sadness went down towards the end while surprise increased. Maybe she sensed that this did not go down as well as intended.

I will get back to this later. It might be interesting, for example, to compare with others who have made apologies recently. How do you do this convincingly?

I could not make up my mind whether this post should be a tribute to Elton John or Edith Piaf. So Elton John got the title and Edith Piaf the coda.

The post Norwegian politics: Sorry seems to be the hardest word appeared on The Grumpy Criminologist 2018-03-21 15:08:20 by Torbjørn.

Moffitt reviews her own theory

Two days ago, Nature published a review article by Terrie Moffitt that “recaps the 25-year history of the developmental taxonomy of antisocial behaviour, concluding that it is standing the test of time…”

It should also be mentioned that the taxonomy has received quite a bit of criticism (none of which is mentioned in Moffitt’s review), and I feel that much of this critique is also standing the test of time. It would have been a good thing if the 25th anniversary review had taken the time to clear up some misunderstandings and controversies and to make some clarifications. However, Moffitt refrains from doing so, and I am not so sure the debate has moved forward. I have made some contributions to this debate, and I think my points are as relevant as ever. See here and here. It feels a bit wrong that they too stand the test of time. Importantly, so does the critique made by others.

In her recent review, she repeats what she also claimed in her 2006 review of the evidence: a very large and important part of the empirical evidence supporting her theory comes from studies using latent trajectory models. A key piece of evidence seems to be the identification of the hypothesized groups, as she states: “Since the advent of group-based trajectory modelling methods, the existence of trajectory groups fitting the LCP and AL taxonomy has now been confirmed by reviews of more than 100 longitudinal studies”. The method is a particular kind of latent class model for panel data. I would say this evidence is pretty weak. First of all, my discussion of trajectory models makes it clear that seemingly distinct groups can be detected in data where there are none. Since the further tests of hypotheses rely on the identification of groups, these hypotheses do not provide reliable evidence either. The empirical evidence for the taxonomy is thus equally consistent with competing theories, and therefore at best very weak evidence for any of them. Others have made similar points as well.

In her new article, on page 4, she claims that group-based trajectory methods are capable of detecting hypothesized groups. The method does no such thing. It is a data reduction technique, which might be convenient for some purposes, but it does not detect distinct groups. It creates clusters, but these could equally well reflect an underlying continuous reality. Moreover, the existence of these groups is “confirmed” across studies only if one accepts pretty much any evidence of heterogeneity in individual trajectories. As I pointed out in an article from 2009, the findings across studies are so divergent, apart from the existence of some kind of high-rate and low-rate groups, that it is hard to imagine any result from trajectory modelling that would not be taken as support for the taxonomy.
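
The point that apparent groups can emerge from group-free data is easy to demonstrate. Here is a minimal sketch of my own (not from Moffitt’s review): generate a skewed but perfectly continuous distribution with no subpopulations, fit finite mixture models, and watch the fit statistics prefer several “groups”. This is essentially the phenomenon Bauer and Curran documented.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)

# Continuous heterogeneity, no groups: a lognormal "propensity" score.
x = rng.lognormal(mean=0.0, sigma=0.8, size=(2000, 1))

# Fit mixtures with 1-4 components and compare BIC (lower is better).
for k in range(1, 5):
    gm = GaussianMixture(n_components=k, random_state=0).fit(x)
    print(f"{k} components: BIC = {gm.bic(x):.1f}")

# BIC will typically favour 2+ components here, "detecting" groups
# in data generated without any.
```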

In short: at best, the empirical evidence is consistent with the taxonomy. But this is largely uninformative as long as it is also consistent with pretty much every competing theory that acknowledges that different people behave differently. The bottom line is that there is no evidence of qualitative differences between the “groups” (at least no such evidence is presented in Moffitt’s recent review). There might be quantitative differences, though.

The other risk factors she discusses and their relation to the groups could just as well be interpreted as differences in degree. However, on page 5, she dismisses the possibility that the differences might be quantitative rather than qualitative! (This is the closest we get to a clarification of whether she actually means literally distinct groups or not.) Now, the evidence I have seen so far shows that there are indeed differences between the average scores in the two groups, but most theories of criminal behaviour would expect higher scores on all risk factors for the highest-offending persons. While it sounds great that she proposed hypotheses in her 1993 article that have later proved correct, these hypotheses are also very general and consistent with other perspectives.

The key point here is that the empirical evidence is consistent with the taxonomy, and with pretty much all other theories. It seems that the theory has not been put to a strict test in these 25 years. In a previous post, I made the following argument, which holds generally:

I think (but I am not entirely sure) that in this context “testing a theory” only means finding results that are consistent with a given theory. I think this is a generous use of the term “test”. I prefer to reserve the word “test” for situations where something is ruled out – or when using methods that at least in principle would be able to rule something out. In other words: if the findings are consistent with a theory but also consistent with one or several competing (or non-competing) theories, this is at best weak evidence for either theory. (This holds regardless of the methods used.) It is good that a theory is consistent with the empirical findings, but that is far from enough.

Second, in 2009 I wrote a more theoretical paper assessing the arguments of the taxonomic theory. A major point was that no arguments are presented that there are distinct groups in the first place. However, one might argue that I interpreted the theory too literally regarding distinctness, so in that article I also discussed this possibility explicitly. In 2009, I argued that since there is clearly some confusion regarding this issue, it would be reasonable if someone (preferably Moffitt, of course) clarified whether she really means distinct groups or not. I am not aware of any such clarification to date. But, as mentioned, she now goes a long way towards dismissing the differences-in-degree interpretation (see her new article, page 5). I think the argument made by Sampson and Laub still holds: if LCP is just another term for high-rate, then the theory brings nothing new to the table. Indeed, all the mechanisms and risk factors discussed are relevant and sound, but they do not at all rely on a taxonomy as such.

In my view, the review should have concluded something like this: First, while much empirical evidence is consistent with the taxonomy, there is a lack of good evidence for the existence of groups. Second, there are still theoretical arguments that are unclear and need specification to allow for strict empirical tests. Nevertheless, the taxonomy has helped focus attention on some important risk factors and mechanisms. (Although these factors were also known in 1993, according to Moffitt.) Whether the taxonomy itself is needed to do so is less clear. Important work remains to be done.

What I am saying in a somewhat elaborate way is that the standard for what counts as empirical evidence in support of a hypothesis is too low. So is the precision level for “theories”. I know it is hard, but we should be able to do better.

PS Moffitt also refers to one of my articles on her first page when stating that “Chronic offenders are a subset of the 30–40% of males convicted of non-traffic crimes in developed nations”. My article says nothing of the kind; it tries to estimate how many will be convicted during their lifetime. It is simply the wrong reference, but I would of course recommend reading it 🙂

PPS I take the opportunity to also point out that while Nagin has previously claimed that my critique is simply based on a fundamental misunderstanding of his argument (see my comment on Nagin here), I have always argued, regardless of his position, that my methodological arguments are important because of how others – as Moffitt has just demonstrated – misunderstand the methods and the empirical results. Nagin, too, has a responsibility to clarify such prevalent misunderstandings.

The post Moffitt reviews her own theory appeared on The Grumpy Criminologist 2018-02-26 20:47:05 by Torbjørn.

The very best academic insult

I was re-reading the fantastic debate collected in the book “The Positivist Dispute in German Sociology”, in which the contributions by Karl Popper really shine. Beyond the substantive points, the insults are among the very best! The following was directed at Habermas, but you can substitute the name with whomever you like:

It is for reasons such as this that I find it so difficult to discuss any serious problem with Professor <add a relevant name here>. I am sure he is perfectly sincere. But I think that he does not know how to put things simply, clearly and modestly, rather than impressively. Most of what he says seems to me trivial; the rest seems to me mistaken.

I wish Karl Popper had a blog.

 

The post The very best academic insult appeared on The Grumpy Criminologist 2016-08-12 13:09:09 by Torbjørn.

Best practice of group-based modelling

I had initially decided not to pick on group-based modelling any more, but here we go:

In his recent essay on group-based modelling in the Journal of Research in Crime and Delinquency (see earlier posts here and here), Nagin discusses two examples of the use of group-based modelling in developmental criminology. It is not clear whether these are mentioned because they are particularly good examples, as they are presented as “early examples”. Maybe they are of historical interest, or maybe they are mentioned because they are much cited. Since Nagin’s article is basically promoting GBTM, I assume they are meant as good examples of what can be achieved using this method. In any case, I would have liked to see examples where GBTM really made a difference.

The first example is the article by Nagin, Farrington and Moffitt, which, according to Nagin, made an important contribution using GBTM for the following reason:

…what was new from our analysis was the finding that low IQ also distinguished those following the low chronic trajectory from those following the adolescent limited and high chronic trajectories. This finding was made possible by the application of GBTM to identify the latent strata of offending trajectories present in the Cambridge data set.

The article by Nagin, Farrington and Moffitt shows that the relationship between IQ and delinquency varies over groups. (One could of course also say that the relationship is non-linear, but they stick to discussing groups.) Which is fine, but the contribution hinges on how the groups are created, although some elaborate testing of equal parameters is involved. Unless other ways of summarizing the criminal careers (as either continuous measures or groups) are tried, it remains unclear what is a methodological artefact and what is not. Here, only one methodological approach was used, so it is hard to assess whether GBTM actually was the only way of discovering this kind of relationship. Maybe alternative methods (e.g. subjective classifications or a continuous index) would have found the exact same thing in these data? Could very well be.

The second example mentioned by Nagin is a paper he wrote together with Laub and Sampson. This is a very influential paper on the influence of marriage on crime, but it has a major flaw stemming from how GBTM is used. I have recently written a review article together with Savolainen, Aase and Lyngstad on the marriage-crime literature, published in Crime and Justice. We commented on this paper as follows:

…they estimated group-based trajectories (Nagin and Land 1993) for the entire observational period from age 7 to 32 and then assigned each person to a group on the basis of posterior probabilities. In the second stage, they regressed the number of arrests in each 2-year interval from age 17 to 32 on changes in marital status and quality, controlling for group membership and other characteristics. In addition, they conducted separate regression analyses by trajectory group membership.

We are somewhat hesitant to endorse this conclusion for methodological reasons. Because the trajectories were estimated over the entire period—including the marital period—controlling for group membership implies controlling for post-marriage offending outcomes as well. We expect this aspect of the analytic strategy to bias the results, but further efforts are needed to assess the substantive implications of this methodological approach.

I think the final sentence of this quote is very mild. They were partly conditioning on the outcome variable, and that is bound to lead to trouble. Frankly, I do not know how to interpret these estimates. In this case, GBTM made a real difference, but for the worse. This was probably hard to see at the time, since GBTM had not yet been subject to much methodological scrutiny. It is easier to see now.
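
The problem is easy to illustrate with a toy simulation (my own sketch, not from our review): assign a “trajectory group” based on the full outcome series, including the post-marriage period, and the estimated marriage effect is biased because the group label partly encodes the outcome itself.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 20_000
true_effect = -1.0  # marriage reduces offending by 1 unit

u = rng.normal(size=n)                # stable individual propensity
married = rng.integers(0, 2, size=n)  # randomized here, for simplicity
y_pre = u + rng.normal(size=n)        # offending before marriage
y_post = u + true_effect * married + rng.normal(size=n)

# "Trajectory group" assigned from the WHOLE series, post-marriage included
group = ((y_pre + y_post) / 2 > 0).astype(float)

def marriage_coef(controls):
    X = sm.add_constant(np.column_stack([married] + controls))
    return sm.OLS(y_post, X).fit().params[1]

print(f"No group control:  {marriage_coef([]):+.2f}")       # ~ -1.0
print(f"Controlling group: {marriage_coef([group]):+.2f}")  # biased toward 0
```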

In sum, although these two studies have other qualities, they are not examples of real success stories for GBTM. My advice would be to come up with some really good examples. But perhaps the only real success story of GBTM is Nagin and Land’s 1993 article (for reasons given here).

P.S. Actually, I know of far better examples of the use of group-based modelling. Neither of them depends on GBTM, but it adds a nice touch. For example, Haviland et al use GBTM to improve propensity score matching. Another example is my own work with Jukka Savolainen, where offending in the pre-job-entry period is summarized using GBTM. In both these studies, other techniques could have been used, but GBTM works very well. Other sound applications exist as well.

The post Best practice of group-based modelling appeared on The Grumpy Criminologist 2016-08-11 12:36:42 by Torbjørn.

Is there a significance filter in criminology?

Is there publication bias in criminology? There could be a bias towards publishing only significant results, which is sometimes referred to as a “significance filter”. If so, researchers have incentives to do a bit of data snooping and p-hacking to make sure they get significant results, without necessarily reporting all the massaging they have done.

Although not a study of the significance filter as such, a recent study of published experimental studies in criminology nevertheless includes some relevant information. It reports that 68% of the effect estimates were not significant. This might be interpreted as evidence that there is not much of a significance filter in criminology. However, the finding is based on 402 effect estimates from 66 publications, which implies an average of six effect estimates per study. The probability of getting at least one of six estimates with p<0.05 is surely much larger than 0.05.
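
To put a number on that: if all six estimates in a study tested true null effects and were independent (a simplifying assumption), the chance of at least one significant result is already above one in four:

```python
# Probability of at least one p < 0.05 among six independent tests
# of true null effects (a simplifying assumption).
p_any = 1 - 0.95 ** 6
print(f"P(at least one significant out of 6) = {p_any:.3f}")  # ~0.265
```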

Perhaps it takes just one significant result to get published? If so, under a significance filter we would expect at least 66 of the 402 estimates (16%) to be statistically significant. It would have been interesting to know how many of the studies did not find any significant results, and how many reported significant findings only for subgroup analyses or alternative specifications, as well as how many subgroups and alternative specifications were actually tried out.

Maybe the XKCD cartoon got it uncomfortably right?

[Image: the XKCD cartoon on p-values]

And perhaps also: “Hey, look at this interesting alternative outcome measure”.

 

P.S. I have no reason to think the significance filter is more or less prevalent in criminology than in other fields. It is probably similar to other social sciences.

The post Is there a significance filter in criminology? appeared on The Grumpy Criminologist 2016-07-25 12:38:33 by Torbjørn.

About the Weisburd paradox

The “Weisburd paradox” refers to the finding by Weisburd, Petrosino and Mason, who reviewed the literature on experimental studies in criminology and found that increasing the sample size did not lead to increased statistical power. While this paradox has perhaps not received great attention in the literature so far, the study was replicated last year by Nelson, Wooditch and Dario in the Journal of Experimental Criminology, confirming the phenomenon.

The empirical finding that larger sample size does not increase power is based on calculating “achieved power”. This is supposed to shed light on what the present study can and cannot achieve (see e.g. here). “Achieved power” is calculated in the same way as a conventional power calculation, but instead of using an assumed effect size, one plugs in the effect estimated in the same study.

Statistical power is the probability of correctly rejecting the null hypothesis, based on assumptions about the size of the effect (usually grounded in previous studies or other substantive reasons). By increasing the sample size, the standard error gets smaller, which increases the probability of rejecting the null hypothesis if there is a true effect. Usually, power calculations are used to determine the necessary sample size, as there is no point in carrying out a study if one cannot detect anything anyway. So one needs to ensure sufficient statistical power when planning a study.

But using the estimated effect size in the power calculation gives a rather different interpretation. “Achieved power” is the probability of rejecting the null hypothesis under the assumption that the population effect is exactly equal to the observed sample effect. I would say this is rarely a quantity of interest, since one has already either rejected or kept the null hypothesis. Without any reference to external information about true effect sizes, post-hoc power calculations bring nothing to the table beyond what the point estimate and standard error already provide.

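To see why, note that “achieved power” is just a deterministic transformation of the estimate and its standard error, i.e. of the z-statistic (and hence of the p-value). A sketch for a two-sided z-test:

```python
from scipy.stats import norm

def achieved_power(estimate, se, alpha=0.05):
    """Post-hoc power of a two-sided z-test, pretending the true
    effect equals the observed estimate."""
    z = abs(estimate / se)
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.cdf(z - z_crit) + norm.cdf(-z - z_crit)

# A function of the z-statistic only: same z, same "achieved power".
print(achieved_power(0.5, 0.25))  # z = 2.0
print(achieved_power(2.0, 1.00))  # z = 2.0, identical power
```
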
Larger “achieved power” implies a larger estimated effect size, so let’s talk about that. The Weisburd paradox is that smaller studies tend to have larger estimated effects than larger studies. While Nelson et al. discuss several reasons why that might be, they did not put much weight on what I would consider the prime suspect: a lot of noise combined with the “significance filter” for getting published. If a small study shows a significant effect, the point estimate has to be large. If significant findings are easier to publish, then published findings from small studies will be larger on average. (In addition, researchers have incentives to find significant effects to get published and might be tempted to do a bit of p-hacking, which makes things worse.) So the Weisburd paradox might be explained by exaggerated effect sizes.

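A quick simulation makes the mechanism concrete (my own sketch; the numbers are arbitrary). Draw many noisy estimates of the same modest true effect at two sample sizes, keep only the significant ones, and compare the averages of what survives the filter:

```python
import numpy as np

rng = np.random.default_rng(7)
true_effect = 0.2
n_sims = 100_000

for se in (0.25, 0.05):  # small study (noisy) vs large study (precise)
    estimates = rng.normal(true_effect, se, size=n_sims)
    significant = estimates[np.abs(estimates / se) > 1.96]
    print(f"SE={se}: mean published (significant) estimate "
          f"= {significant.mean():.2f} (truth = {true_effect})")
```

The small-study mean among significant results ends up far above the true effect, while the large-study mean is close to it: exaggeration, not a paradox.
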
But why care? First, I believe the danger is that such reasoning might mislead researchers into justifying studies that are too small, ending up chasing noise rather than making scientific progress. Second, researchers might give the impression that their findings are more reliable than they really are by showing that they have high post-hoc statistical power.

Just to be clear: I do not mind small studies as such, but I would like to see the findings from small studies replicated a few times before giving them much weight.

Mikko Aaltonen and I wrote a commentary on the paper by Nelson et al. and submitted it to the Journal of Experimental Criminology, pointing out such problems and arguing that the Weisburd paradox is not even a paradox. We were rejected. There are both good and bad reasons for this. One of the reviewers pointed out a number of points to be improved and corrected. The second reviewer was even grumpier than me and did not want to understand our points at all. Re-reading our commentary, I can see much that could be improved, and I also see that we might have come across as more confrontational than intended. (I also noticed a couple of other minor errors.) Maybe we should have put more work into it. You can read our manuscript here (no corrections made). We decided not to rewrite our commentary for a more general audience, so it will not appear elsewhere.

When writing this post, I did an internet search and found this paper by Andrew Gelman prepared for the Journal of Quantitative Criminology. His commentary on the Weisburd paradox is clearly much better written than ours and more interesting for a broader audience. Less grumpy as well, but with many similar substantive points. I guess Gelman’s commentary should pretty much settle this issue. Kudos to Gelman, but also to JQC for publishing it. EDIT: An updated version of Gelman’s piece is here – apparently not(!) accepted for publication yet.
The post About the Weisburd paradox appeared on The Grumpy Criminologist 2016-07-14 10:00:39 by Torbjørn.

Criminological progress!

I recently came across this article by David Greenberg in the Journal of Developmental and Life-Course Criminology. I had previously seen an early draft, and I am glad to see it finally published! (It should have been published a long time ago, as the version I saw was pretty good, but I have no idea why it was not.) Greenberg shows how to use standard multilevel modeling with normally distributed parameters to test typological theories. The procedure is actually not very complicated: estimate a random effects model, use empirical Bayes to get point estimates of each person’s intercept and slope(s), and explore the distributions of those point estimates using e.g. histograms. And no: those empirical Bayes estimates do not have to be normally distributed! You need to decide for yourself (preferably up front) what it would take for these distributions to support your favourite typology, so it requires a bit of thinking. This can all be done in standard statistical software, and only requires knowing a little bit about what you are doing. It would be really nice to see previous publications using group-based models reanalyzed in this way.
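
As a rough illustration of the recipe (my sketch of the general procedure, not Greenberg’s own code), in Python with statsmodels it could look like this, assuming a long-format panel with columns id, age and crime:

```python
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.formula.api as smf

# Assumed input: long-format panel with columns "id", "age", "crime"
df = pd.read_csv("panel.csv")

# Random intercept and random age slope for each person
model = smf.mixedlm("crime ~ age", df, groups=df["id"], re_formula="~age")
result = model.fit()

# Empirical Bayes (BLUP) estimates of each person's deviations
re = pd.DataFrame(result.random_effects).T  # one row per person

# Inspect the distributions: distinct groups should show up as
# multimodality; a continuous reality as smooth unimodal histograms.
re.hist(bins=50)
plt.show()
```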

The article also discusses a number of related modeling choices, which is highly informative. So far, I have only read the published version very quickly, and I need to read it more carefully before I fully embrace all the arguments, but I might very well end up embracing them all.

I have noticed claims in the literature that models assuming normally distributed random effects cannot test for the existence of subpopulations. Well, it is the other way around.

The post Criminological progress! appeared on The Grumpy Criminologist 2016-07-04 12:00:12 by Torbjørn.

Testing typological theories using GBTM?

As I mentioned in yesterday’s post, I think the debates about group-based trajectory modeling have some unresolved issues. For this reason, I submitted a commentary to the Journal of Research in Crime and Delinquency. I had two reasons for doing so. First, I think Nagin mischaracterized his critics, and I believe his essay was a willful attempt to avoid serious criticism by ignoring serious arguments. (Maybe I could have been less outspoken about that.) But after all, he has not addressed the actual arguments I (and others) have put forward. I can only interpret this as an attempt to avoid discussing the substantive matter by keeping silent, and now by subtly dismissing the whole thing. If Nagin finds it worthwhile to say that his critics have misunderstood, he should also bother to point out how. So far, he has done no such thing.

Second, I actually think there is a need to clarify whether GBTM can test for the presence of groups or not. If the advocates of GBTM had been clear about this, no clarification would have been needed. There is no doubt that Nagin and others have been clear that GBTM can – or maybe even should – be interpreted as an approximation to a continuous distribution. There is no disagreement on that point. But they have also given the impression that one can identify meaningful, real groups in the data by way of GBTM. They have not been clear on what this really means or under what conditions it can be done. A clarification is in order, since findings from GBTM analyses have clearly been interpreted in the literature as giving very strong support to a certain typological theory (see e.g. here). I have claimed this empirical evidence is weak and largely based on overinterpretation of empirical studies using GBTM (see here and here). It would be helpful if Nagin could clarify the strength of this evidence.

So I wrote a commentary and submitted it to the Journal of Research in Crime and Delinquency. (See the full commentary here.) According to the letter from the editor, it was rejected because:

Language at the top of page 2 in your comment underscores a fundamental misunderstanding and misreading of Nagin’s work.
(See the full rejection letter here).

Well, maybe I should have put things more politely, but I still believe my arguments are right. I can understand that there might be good editorial reasons for not having yet another debate about GBTM in the journal, but I am not impressed by the reason given. My fundamental misunderstanding is revealed (at the top of page 2) where I point out that Nagin himself is responsible for some of the confusion regarding the interpretation of the groups. I do so with clear references, so you can decide for yourself whether these are misreadings or not.

Even in his recent essay, Nagin presents one of the main motivations for using GBTM by first arguing that other methods are not capable of testing for the presence of groups, and then suggesting that GBTM can indeed solve this problem:

To test such taxonomical theories, researchers had commonly resorted to using assignment rules based on subjective categorization criteria to construct categories of developmental trajectories. While such assignment rules are generally reasonable, there are limitations and pitfalls attendant to their use. One is that the existence of distinct developmental trajectories must be assumed a priori. Thus, the analysis cannot test for their presence, a fundamental shortcoming. (…) The trajectories reported in Figure 2 provide an example of how GBTM models have been applied to empirically test predictions stemming from Moffitt’s (1993) taxonomic theory of antisocial behavior.
(My emphasis).

It might not say straight out whether the groups from GBTM are interpretable as real or not in this setting, nor what can be concluded from such “tests”. But given the previous debates and misconceptions, this is hardly a clarification.

My point is simply this: it has been claimed that GBTM can be used to test for the presence of distinct groups, and more generally to test typological theories. (I have discussed this in more detail here and here.) However, it is hard to see how such typological theories can be tested using GBTM; that is explained only very vaguely by the advocates of the methodology. I think (but I am not entirely sure) that in this context “testing a theory” only means finding results that are consistent with a given theory. I think this is a generous use of the term “test”. I prefer to reserve the word “test” for situations where something is ruled out – or when using methods that at least in principle would be able to rule something out. In other words: if the findings are consistent with a theory but also consistent with one or several competing (or non-competing) theories, this is at best weak evidence for either theory. (This holds regardless of the methods used.) It is good that a theory is consistent with the empirical findings, but that is far from enough. I know of no published criminological study using GBTM that provides a test of typological theories in this stricter sense of the term. So far, it seems to me that the advocates of GBTM have not been clear on this issue. Some clarification would be in order.

The post Testing typological theories using GBTM? appeared on The Grumpy Criminologist 2016-07-01 12:00:36 by Torbjørn.

An update on group-based trajectory modeling in criminology

In a special issue of the Journal of Research in Crime and Delinquency on criminal career research, Daniel Nagin wrote an essay about the contribution of group-based trajectory modeling (GBTM). Appropriately, he also refers to the controversies about the applications of this methodology, where he contends that all earlier critique is based on just a couple of misunderstandings. I am of course honored that my own critique is found important enough to be mentioned (although as one of those having misunderstood the point). I suppose that means I have made some sort of impact. It would have been nice, though, if my actual arguments had been met with clear, rational arguments instead of just being dismissed. In fact, other advocates of GBTM have actually responded to my work, but without pointing out any mistakes on my part.

It is worth pointing out that Nagin also refers to another important critique of GBTM by Daniel J. Bauer which, it seems, is according to Nagin based on the same misconceptions. (Other critics could have been mentioned as well.) To my knowledge, no one has pointed out mistakes in the arguments made by Bauer. On the contrary, Nagin and Odgers (p. 118) have previously acknowledged the importance of a simulation study by Bauer and Curran:

Their work serves as a useful caution against the quixotic quest to identify the true number of groups in either GMM or GBTM analyses. Perhaps most importantly, this work reinforces the need to move away from interpretations of trajectory groups as literally distinct entities.

So, Nagin has previously agreed that the groups have been interpreted as distinct entities, at least by some, and that we should move away from such interpretations. Yet, reading his recent essay, one gets the impression that those criticizing such interpretations have simply misunderstood the point. This seems like a contradiction to me.

I do not mind the disagreement, but it would have moved the academic debate forward if those accusing others of being misguided would engage with the actual arguments or point out errors in the premises. I am still waiting for someone to point out the mistakes in my article on GBTM.

P.S. I have never seen any of the advocates of GBTM criticize any interpretation of GBTM.

P.P.S. Actually, Brame et al did point out a mistake of mine, with which I agree. I had written that Moffitt’s taxonomy was “spurred” by GBTM. That was clearly the wrong word, and I can only blame my bad English as a non-native speaker. I should have written that the popularity of the taxonomic approach was “fueled” by the development of GBTM. Not a major point, though.

The post An update on group-based trajectory modeling in criminology appeared on The Grumpy Criminologist 2016-06-30 09:06:25 by Torbjørn.