-- AmitFriedman - 04 Aug 2009

TRYING DIFFERENT STAT ROI WEIGHTING APPROACHES TO REDUCE THE N80 FOR AD AND MCI

ATROPHY WEIGHTING APPROACHES:

One approach we tried is to use the atrophy rates themselves as weights. The intuition is that voxels with greater atrophy rates show greater tissue loss and degeneration. We can therefore argue that the detrimental effects of the disease are more pronounced at these voxels, and that they should be given greater importance. This approach opens up a variety of possibilities.

We can choose to use contraction voxels (<256) and expansion voxels (>256) separately, and give two different numerical summaries for each Jacobian map. However, the drug industry dislikes this approach. They reasonably argue that it is not legitimate to look at only part of our data, if nothing else because different subjects have different regions of contraction and expansion, and we cannot allow our method to be so subject specific. Indeed, after running this method on our MCI subjects, we learned that one of the subjects does not have any expansion voxels in the stat ROI at all, so we cannot even apply this algorithm to that subject. Nonetheless, we still tested this method to see what results we get.

Alternatively, we can use all of the voxel values to compute the numerical summary. If we suspected that expansion voxels in the ROI actually represented expansion of brain tissue, then the former approach (use only contraction voxels for the numerical summary) might have had a better logical foundation. However, we know that the expansion voxels in the ROI most likely represent ventricular expansion, which is correlated with tissue contraction. Therefore separating contraction and expansion voxels might not be the best approach. Indeed, the latter method, in which we use all voxel values, performed much better than the former one. Note however that if we take all voxel values and average them, many of the contraction and expansion voxels will cancel out and give us an average that is closer to 256. This is undesirable for two reasons. First, when terms cancel, we are essentially losing data. Since both contraction and expansion voxels carry information about tissue loss (remember that the expansion is really ventricular expansion), we should try to avoid this. Second, our formula for n80 is inversely proportional to the square of (256 – average of all numerical summaries). The larger the difference between the average of all numerical summaries and 256, the larger the denominator, and the smaller the n80. Cancellation gives a more “moderate” result, closer to 256, which shrinks the denominator and tends to increase our n80. We can overcome this problem by reflecting about 256 either all values greater than 256 or all values smaller than 256, so that the resulting voxel values are all smaller than or equal to 256, or all greater than or equal to 256. Since the n80 formula only takes into account the distance of the average of our numerical summaries from 256, it does not matter in which direction we reflect.
I chose to reflect all the voxel values down, so that all values are smaller than or equal to 256. For example, the value 260 is reflected about 256 to become 252. This gives us a numerical summary that is smaller than 256. Note that this downward reflection aligns our voxel values: they are no longer centered about 256, but are instead ordered on a single scale. Values much smaller than 256 stand for greater atrophy, whereas values closer to 256 (but still smaller) stand for less extreme atrophy. The value 256 is mapped to itself and still represents no change. Just reflecting all the voxel values about 256 and then taking the straight average across the stat ROI improved our previous n80 for AD by 12 (48 -> 36), and our n80 for MCI by 29 (85 -> 56). This should not be a surprise: once all the values are reflected about 256, there is no voxel-wise cancellation to pull the numerical summary closer to 256 (as discussed above), so the summary lies farther from 256. In addition, by reflecting down we decrease the voxel-wise variance, because the values are no longer spread above and below 256 but lie on only one side of it. This does not necessarily decrease the variance between the numerical summaries; however, it is certainly possible, given that all the numerical summaries now lie on the same side of 256 (below it if we reflect down, above it if we reflect up), which is not the case if we do not reflect the voxel values.
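The downward reflection and the straight average can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the five voxel values are made up, and the only facts taken from the text are that 256 encodes no change and that 260 reflects to 252.

```python
BASELINE = 256  # Jacobian value that encodes "no change" in these maps


def reflect_down(values, baseline=BASELINE):
    """Reflect every value above the baseline to the same distance below it."""
    return [baseline - abs(v - baseline) for v in values]


def plain_average(values):
    """Straight (unweighted) average over the stat ROI."""
    return sum(values) / len(values)


# Hypothetical stat-ROI voxel values for one subject (made up for illustration).
roi = [248, 252, 260, 264, 256]

# Without reflection, contraction and expansion voxels cancel.
raw_summary = plain_average(roi)

# With reflection, every voxel pulls the summary in the same direction.
reflected_summary = plain_average(reflect_down(roi))

print(raw_summary, reflected_summary)
```

For this made-up ROI the unreflected average lands exactly on 256, while the reflected average is below 256, which is the cancellation effect described above.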

The fact that our voxels are centered about 256 also affects our weights. If we use a weighting approach that is directly proportional to the voxel values, then larger values (above 256) are assigned more weight. If our weighting approach is inversely proportional to the voxel values, then smaller values (less than 256) are assigned more weight. Note however that this type of bias can only occur when we use contraction and expansion voxels together; when we separate them, they are already aligned, so no bias can occur. We can eliminate this bias with the same reflection approach. We can either make all voxel values smaller than or equal to 256, or all greater than or equal to 256 (align them), or we can use only the distance from 256 as the weight, which ignores whether the value is bigger or smaller than 256. I will elaborate on this further when I describe the actual methods in the next several paragraphs.

If we want to use the atrophy values as weights, we can try a few different methods. We can use the actual values themselves, we can use their reciprocals, we can use their difference from 256, their difference from some max amount like the maximum atrophy rate in the whole Jacobian map, the maximum atrophy rate in the ROI, etc. We can also raise the weights to different powers and see how that affects the numerical summary.

USING THE ACTUAL VOXEL VALUES OR THEIR RECIPROCALS AS WEIGHTS

If we choose to separate the expansion (>256) and contraction (<256) values from each other, then our method outputs two numerical summaries. We want smaller contraction values and larger “expansion” values to get greater weights in our numerical summaries: the farther a value is from 256 (above or below), the more weight we give it. Therefore we tried using the reciprocals of the contraction values as weights for the numerical summary of all the voxel values smaller than 256, and the actual expansion values as weights for the numerical summary of all the voxel values greater than 256. We also tried (just for the purpose of exploration) giving the contraction values weights equal to the contraction values after they are reflected about 256 (so values smaller than 256 are mapped to values greater than 256). Similarly, we gave the expansion values weights equal to the reciprocals of the expansion values reflected about 256 (values greater than 256 are mapped to values less than 256, and then we take their reciprocals). We did not expect this to make a big difference in the end result, because mathematically it matters little whether we use the reciprocals of the contraction values or their reflected values as weights, or whether we use the expansion values or the reciprocals of their reflected values. In any case, we use reciprocals for values smaller than 256 and the actual values for values larger than 256 (remember, the distance from 256 is what counts).

If we do not want to separate contraction and expansion voxels, we can apply these weighting approaches (reciprocals of voxel values, actual voxel values) to all of the voxels together. We can either take the voxel values as they are, noting of course that the expansion and contraction values will cancel each other out, or we can reflect all of them to be smaller or larger than 256. Another factor we can vary when we use all voxel values together is whether or not the weights are biased towards contraction or expansion voxels. If we use the reciprocals of the voxel values as weights, then the weights are inversely proportional to the voxel values, so smaller voxel values get greater weight (in general, contraction values will be more important than expansion values). If we use the actual voxel values as weights, then the weights are directly proportional to the voxel values, so larger voxel values get greater weight (in general, expansion values will be more important than contraction values). As I explained above, if we do not want our method to be biased, we should first reflect all voxel values about 256 (align them) and only then use the reciprocals or the actual values as weights. This way there is no bias towards contraction or expansion values; it is the values far from 256 that get more weight.
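The bias described above is easy to see on a toy example. The sketch below is an assumption-laden illustration, not the code that produced the results: the helper names, the power parameter, and the two-voxel ROI are all hypothetical. It takes one contraction voxel and one expansion voxel at the same distance from 256 and compares the weights each scheme assigns.

```python
BASELINE = 256  # Jacobian value that encodes "no change"


def reflect_down(values, baseline=BASELINE):
    """Reflect every value above the baseline to the same distance below it."""
    return [baseline - abs(v - baseline) for v in values]


def value_weights(values, power=1):
    """Weights directly proportional to the voxel values (raised to a power)."""
    return [v ** power for v in values]


def reciprocal_weights(values, power=1):
    """Weights inversely proportional to the voxel values (raised to a power)."""
    return [(1.0 / v) ** power for v in values]


def weighted_summary(values, weights):
    """Weighted average of the values, with weights normalized by their sum."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


# Hypothetical ROI: one contraction voxel and one expansion voxel,
# both 16 units away from 256.
roi = [240, 272]

# Biased schemes: weights computed from the unreflected values.
toward_expansion = weighted_summary(roi, value_weights(roi))
toward_contraction = weighted_summary(roi, reciprocal_weights(roi))

# Unbiased scheme: reflect first, then take reciprocals as weights.
aligned = reflect_down(roi)
unbiased_weights = reciprocal_weights(aligned)

print(toward_expansion, toward_contraction, unbiased_weights)
```

With the unreflected values as weights the summary is pulled above 256, with their reciprocals it is pulled below 256, and after reflection both voxels receive the same weight. Passing the reflected values together with weights computed from the unreflected ones, as in `weighted_summary(reflect_down(roi), value_weights(roi, power))`, would correspond to the "no cancellation, biased towards expansion" variant.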

Pretty much all of the methods improved on the previous n80 results obtained with the normal average over the stat ROI. The best of this set of methods for AD was the one where we use all voxel values without letting them cancel (reflect them up or down about 256) and use the actual voxel values as weights (without reflecting the weights), which gives a bias towards expansion voxels. This method gave the third best results for AD overall. Interestingly, almost no methods improved the n80 for greater powers of the weights, but this method's performance improved as we raised the weights to greater and greater powers. The best result for AD was obtained empirically with power = 18. It produced n80 = 31, n90 = 41 (for AD). The 18th power gave the smallest number for n80/90. However, since we round n80/90 up to get the number of people required for the drug trial (we can't have fractions of people), the 17th power is also sufficient to yield n80 = 31 and n90 = 41 (larger values than the 18th power, but still below 31 and 41, so we round up to the same results).

This method yielded the fourth best results for MCI (second best out of this set of methods) with n80 = 40, n90 = 53. The 25th power was also sufficient. The best of this set of methods for MCI (third best overall) is the method where we use only expansion voxels, with the actual values as weights (unbiased weights, because we are using contraction and expansion voxels separately). This method yielded n80 = 37, n90 = 49. Note that this method gave a smaller n80/90 for MCI than it did for AD (n80 = 52, n90 = 70). We expect the MCI patients to have an average of the numerical summaries that is closer to 256, because they have more moderate atrophy rates, which by itself would increase their n80. However, they can also have a much smaller standard deviation, so their n80 is not necessarily larger. That is the situation with this method.
In any case, we would probably not use this method because the industry frowns upon methods that use expansion and contraction voxels separately. Furthermore, MCI patient MCI_033_S_1309_MIs6L8.img does not even have any expansion voxels, and so this method is pretty much invalid. Finally, note that this set of methods did not perform as well as the set of methods to be discussed next.

USING THE DISTANCE FROM 256/MAX VALUE AS WEIGHT

If we assume that all the voxels in the ROI give us atrophy rates for tissue loss, whether the values are greater or smaller than 256, then we can take the difference between the voxel value and 256 as an indicator of how severe the atrophy at that voxel is. Essentially, reflecting all the values so they are smaller than 256 and taking their reciprocals as weights, or reflecting all values so they are bigger than 256 and taking those reflected values as weights, is the same concept as taking the distance from 256 itself as the weight. As before, we can separate our contraction and expansion values, use them together and let them cancel each other out, or reflect them about 256 so they are all smaller than (or equal to) 256 or all greater than (or equal to) 256. Note however that this method differs from the previous one in that it cannot have biased weights. In the previous method, we could let our weights be biased by taking the voxel values as they are, which results in a bias toward contraction values (when we use the reciprocals) or toward expansion values (when we use the actual values). We eliminated this bias by letting the weights be the aligned (reflected) values. For this method, however, we use the (absolute) distance from 256 as our weight. It does not matter whether a given voxel value is larger or smaller than 256, so this method is unbiased by default.
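A minimal sketch of the distance-from-256 weighting, applied to reflected voxel values, might look as follows. The function names, the power parameter, the handling of an ROI with no change at all, and the example voxel values are assumptions made for illustration only.

```python
BASELINE = 256  # Jacobian value that encodes "no change"


def reflect_down(values, baseline=BASELINE):
    """Reflect every value above the baseline to the same distance below it."""
    return [baseline - abs(v - baseline) for v in values]


def distance_weights(values, baseline=BASELINE, power=1):
    """Weight each voxel by its absolute distance from the baseline."""
    return [abs(v - baseline) ** power for v in values]


def distance_weighted_summary(values, baseline=BASELINE, power=1):
    """Weighted average of the reflected values, using distance weights."""
    reflected = reflect_down(values, baseline)
    weights = distance_weights(values, baseline, power)
    total = sum(weights)
    if total == 0:
        # Assumption: an ROI in which every voxel equals 256 summarizes to 256.
        return float(baseline)
    return sum(w * r for w, r in zip(weights, reflected)) / total


# Hypothetical stat-ROI voxel values for one subject (made up for illustration).
roi = [248, 252, 260, 264, 256]

plain = sum(reflect_down(roi)) / len(roi)
weighted = distance_weighted_summary(roi)

print(plain, weighted)
```

Because the weight depends only on the absolute distance, a voxel at 264 and a voxel at 248 receive the same weight, and for this made-up ROI the weighted summary lies farther from 256 than the plain reflected average.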

We also tried reflecting the voxel values down about 256 and then subtracting them from some max amount, such as the largest atrophy rate in the ROI or the largest atrophy rate in the whole Jacobian map. These approaches did not perform as well as the one in which we find the distance from 256. Note that finding the distance from 256 is equivalent to reflecting down about 256 and then subtracting from 256. The max-value methods have the disadvantage that each Jacobian map has a different max value, so it can be problematic to compare numerical summaries of different Jacobian maps. For example, consider two maps, one with an incredibly large max value and one with a more moderate max value, and examine the weights assigned to two arbitrary voxels in each map. In the first map, the max value is so large that the two voxels' distances from it dwarf the difference in atrophic rates between them, so they are assigned approximately the same weight. In contrast, in the second map the distance from the max value is not that big, so the difference in atrophic rates between the two voxels counts for more and they are assigned noticeably different weights. This shows that the method assigns different weights depending on the max value. On the other hand, the method in which we always subtract from 256 is a consistent approach that always assigns the same weights to equivalent voxels in different maps (holding constant the sum of all the weights, which we divide by in order to normalize our weights). Hence it is more intuitive to compare numerical summaries of different Jacobian maps using this approach.
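The two-map example above can be checked numerically. In this sketch the function names, the two voxel values, and the reference values 260 and 1000 are all hypothetical; the weight is taken to be the reference value minus the reflected voxel value, which is my reading of "reflect down and subtract from the max".

```python
BASELINE = 256  # Jacobian value that encodes "no change"


def reflect_down(values, baseline=BASELINE):
    """Reflect every value above the baseline to the same distance below it."""
    return [baseline - abs(v - baseline) for v in values]


def distance_from_reference_weights(values, reference):
    """Normalized weights: reference minus the reflected voxel value."""
    raw = [reference - r for r in reflect_down(values)]
    total = sum(raw)
    return [w / total for w in raw]


# Two hypothetical voxels with different atrophic rates.
voxels = [240, 250]

# reference = 256 is the distance-from-256 method; 260 and 1000 stand in for
# a moderate and an incredibly large max value (both made up).
for reference in (256, 260, 1000):
    w = distance_from_reference_weights(voxels, reference)
    print(reference, [round(x, 3) for x in w], round(w[0] / w[1], 3))
```

The ratio between the two weights is 16/6 when subtracting from 256, 20/10 for a reference of 260, and 760/750 for a reference of 1000, so the larger the reference value, the closer the two voxels come to receiving the same weight.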

The method where we use the distance from the max value in the whole Jacobian map did not do that well. In essence, there isn't an important mathematical difference between subtracting from the max value of the whole map and subtracting from the max value of the ROI. However, subtracting from the max of the whole map is clumsier: we are only dealing with values in the ROI anyway, so why subtract them from the max of the whole map? It is also possible that this method didn't do as well because the max value in the whole map is equal to or greater than the max value in the ROI. As discussed in the previous paragraph, since the distance from the max value of the whole map is equal to or greater than the distance from the max value of the ROI, this method will tend to downplay voxel-wise differences and assign approximately the same weights to voxels with similar atrophic rates (more moderate weighting). The max ROI method does not downplay the atrophic rate difference between voxels as much, because the distance from the max in the ROI is not as big. The same principle applies to the difference between the method that takes the distance from 256 and the method that takes the distance from the max ROI. The distance from 256 is smaller than the distance from the max ROI, so the distance from 256 method gives more importance to the voxel-wise atrophy differences than the distance from max ROI method.

The method in which we subtract from the max value in the ROI performed much better than the previous method, and its performance improved for greater powers of the weights. The 15th power was optimal for AD, yielding n80 = 26 and n90 = 35, the second best performance overall. The 14th power was also sufficient to yield the same results (larger n80/90, but still below 26 and 35). This method performs almost as well as our best method for AD, but is more computationally costly. Again, there isn't a big mathematical difference between subtracting from 256 and subtracting from the max in the ROI. However, the 256 method is consistent across subjects, which we cannot say about the max value approach. For MCI, just like for AD, this method's performance improved for greater powers of the weights. However, we did not find one best power that outperforms the others: as we increase the power, the n80/90 keeps going down. I stopped computing the n80/90 at the 65th power, with n80 = 34, n90 = 45. The 42nd power was also sufficient to yield the same results (33 < n80 < 34 and 44 < n90 < 45). This is actually the best result for MCI out of all the methods. It even outperforms the distance from 256 approach.

The fact that the n80/90 keep improving for greater powers is worth discussing. The n80 will decrease for smaller variance of the numerical summaries and average of the numerical summaries that is farther away from 256. We can argue that we can just use the smallest voxel value in the stat ROI of each Jacobian map for its numerical summary. This will definitely move the average of the numerical summaries farther away from 256. However this can also potentially increase our variance if the different Jacobian maps have min values that are very spread apart from each other, thus increasing our n80. We can manipulate the numerical summaries so that we’ll get a smaller average of the numerical summaries, but we cannot really control the resulting standard deviation. As we increase the powers of the weights for a certain method, we are basically giving more weights to the most important voxel values (based on how that method determines what is an important voxel). For example, for the distance from 256 or the distance from the max ROI the smallest value gets the largest weight. Thus increasing the powers of the weights will give the smallest voxel value more and more weight compared to the other voxels. Therefore if we find that the nth power of the weights gives us the best n80 results, that means that the nth power gave the optimal possible ratio between the variance and average of the numerical summaries so that the n80/90 turned out to be the smallest. However if there is no one single optimal power, and n80/90 just keep decreasing for greater powers, that means that the given method pretty much wants us to only use the most important voxel value for each Jacobian map for the numerical summary. Apparently, for the distance from max ROI method in the MCI, we should only use the smallest voxel value (or a group of the smallest voxel values) from each map. 
This is apparently OK because the variance of the MCI subjects does not increase by much as we assign more weight to the smaller voxel values. For the distance from 256 method, on the other hand, the variance increases faster than the average moves away from 256 as we assign more weight to the smaller voxel values, so the n80 actually increased for powers larger than 1. Note however that it would not be wise to use only one voxel for a numerical summary. The effect of a drug on one single voxel (or just a few of them) is not a reliable measure of how much the progress of the disease has slowed. That specific voxel might have a smaller atrophic rate after the drug treatment while, at the same time, the atrophic rates in the rest of the voxels have worsened. Thus, we want to use information from as many voxels as we can in order to determine whether the drug was helpful or not.
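The trade-off between the variance and the average of the numerical summaries can be explored with a small power sweep. The sketch below rests on several assumptions that are not stated in these notes: it uses the usual two-sample sample-size formula n = 2 * sd^2 * (z_(1-alpha/2) + z_power)^2 / (reduction * delta)^2, with a two-sided alpha of 0.05 and a 25% reduction in mean change as the effect to detect, it uses the sample standard deviation, and the three subjects are made up. The notes only say that n80 is inversely proportional to the square of (256 – average of all numerical summaries) and that the result is rounded up.

```python
import math
from statistics import NormalDist, mean, stdev

BASELINE = 256  # Jacobian value that encodes "no change"


def sample_size(summaries, power=0.80, alpha=0.05, reduction=0.25):
    """ASSUMED formula: n = 2*sd^2*(z_(1-alpha/2) + z_power)^2 / (reduction*delta)^2."""
    delta = BASELINE - mean(summaries)   # distance of the mean summary from 256
    sd = stdev(summaries)                # spread of the summaries across subjects
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(2 * sd ** 2 * z ** 2 / (reduction * delta) ** 2)


def summary(roi, p):
    """Distance-from-256 weights raised to the power p, on reflected values."""
    reflected = [BASELINE - abs(v - BASELINE) for v in roi]
    weights = [abs(v - BASELINE) ** p for v in roi]
    total = sum(weights)
    if total == 0:
        # Assumption: an ROI with no change at all summarizes to 256.
        return float(BASELINE)
    return sum(w * r for w, r in zip(weights, reflected)) / total


# Hypothetical stat-ROI voxel values for three subjects (made up).
subjects = [
    [248, 252, 260, 264, 256],
    [244, 250, 258, 270, 255],
    [250, 254, 262, 259, 240],
]

# Power 0 is the plain reflected average; larger powers concentrate the
# weight on the voxels farthest from 256.
for p in (0, 1, 2, 5):
    summaries = [summary(roi, p) for roi in subjects]
    print(p, sample_size(summaries), sample_size(summaries, power=0.90))
```

Raising the power moves every summary toward the most extreme voxel of its map, which increases the distance from 256 but can also increase the spread between subjects; whether n80 goes down depends on which effect wins for the data at hand.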

The distance from 256 approach (for our weights) yielded the best results overall for AD when we used all voxel values together and did not let them cancel each other out (reflected them about 256). Values much smaller than 256 get greater weights, and values closer to 256 get smaller weights. Note again that after the reflection, values which were previously very large become very small, so they do not lose any significance. Raising the weights to greater powers (greater than the first power) did not improve our results. This is a very satisfying result because, even though other methods perform almost as well as this one for AD, they require computing greater powers of the weights. This method does not raise the weights to a power and is therefore much faster to compute. For the AD group, we got n80 = 25, n90 = 34 (best overall), and for MCI we got n80 = 36, n90 = 49 (second best overall, right after the distance from max ROI approach). This is a significant improvement compared to ADNI's second stage method of using the average over the stat ROI. The n80 for AD is about half of what it was initially (48 -> 25), and the n80 for MCI is less than half of what it was initially (85 -> 36). Note that although this method has only the second best performance for MCI overall, it is much faster to compute, is consistent across subjects (we always subtract from the value 256, which does not change), uses all values instead of just expansion or contraction, and strives to use all the voxels for the numerical summary (whereas the distance from the max ROI method strives to use only one or a few of the voxels with the smallest values).
Furthermore, one can say that this method is by construction perfect to give a very small n80/90 because the n80/90 will decrease for an average of the numerical summaries that is farther away from 256; and this method gives greater weight to voxel values that are farther away from 256. Therefore we crown this method as the best one.

Attachment: RESULTS.doc (2776.5 K, 07 Aug 2009 - 21:44, AmitFriedman). Comment: Final Results