Roundheads and Cavaliers in drug discovery

OK, the title is probably not very accurate and I am sure that those more historically literate than myself will highlight why this is a bad analogy. It's also very UK-centric, an indulgence for which I apologise. However, I think it is a useful comparator to explain what I think is wrong with many of the bandwagons that go zooming by if you just wait around long enough in a drug discovery environment. It's not so much the roundheads' distinctive look or political grouping that is the source of the analogy I wish to draw but their puritan roots. In particular, it is the puritan tendency to tell people what they should NOT do, and their propensity for banning things, which is the basis of the comparison I want to make. Notably, they undid themselves by (amongst other things) banning the various feasts and festivities and the vividly decorated public buildings that were some of the few sources of gaiety for many of the populace.

The comparison that I wish to make, and which I think therefore makes me a cavalier, is that there are too many people trying to tell drug discoverers what NOT to do but without providing any particularly useful guide to what they should do instead. I further speculate that, like that of the roundheads, this approach is wont to sap the joy out of drug discovery for many and ultimately is unlikely to spur the kind of creativity that we cavaliers think essential for success in this field.

Compound related metrics

The first puritanical approach that I wish to highlight is that of the various drug discovery metrics and whatnot that sometimes get restyled as rules (e.g. Lipinski’s rules, the rule of 3, etc.). Other such metrics include ligand efficiency and lipophilic efficiency and the myriad related functions. My former manager, Pete Kenny, has critiqued many of these from a scientific perspective and others have challenged their rigour, while their supporters have argued for their utility as well as their rigour. I should acknowledge that I have personally found lipophilic efficiency a useful guide for contextualizing compounds relative to one another and a thought-provoking concept. However, it is a very poor tool for suggesting what to do next. It is a much better tool for rapping me across the knuckles for daring to think about suggesting more lipophilic compounds.
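For readers less familiar with these metrics, here is a minimal sketch of how two of them are commonly calculated. It assumes the usual textbook definitions (lipophilic efficiency as pIC50 minus logP, and ligand efficiency as roughly 1.37 × pIC50 divided by the heavy atom count, in kcal/mol per heavy atom near room temperature). The function names and the two compounds are purely hypothetical and are not taken from any real project.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 (-log10 of the molar IC50)."""
    return 9.0 - math.log10(ic50_nm)


def lipophilic_efficiency(pic50: float, logp: float) -> float:
    """Lipophilic efficiency (LipE): potency minus lipophilicity."""
    return pic50 - logp


def ligand_efficiency(pic50: float, heavy_atoms: int) -> float:
    """Approximate ligand efficiency in kcal/mol per heavy atom (about 298 K)."""
    return 1.37 * pic50 / heavy_atoms


# Two made-up compounds: B is ten times more potent than A but more lipophilic.
compounds = {
    "A": {"ic50_nm": 100.0, "logp": 3.0, "heavy_atoms": 25},
    "B": {"ic50_nm": 10.0, "logp": 4.5, "heavy_atoms": 28},
}

for name, c in compounds.items():
    pic50 = pic50_from_ic50_nm(c["ic50_nm"])
    lipe = lipophilic_efficiency(pic50, c["logp"])
    le = ligand_efficiency(pic50, c["heavy_atoms"])
    print(f"{name}: pIC50={pic50:.2f}  LipE={lipe:.2f}  LE={le:.2f}")
```

With these invented numbers, compound B gains a full log unit of potency yet ends up with a lower lipophilic efficiency than A, because its logP rose by 1.5. The metric ranks what has already been made; it does not say what to make next.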

Druggability (protein related metrics)

In my research group, we have most enjoyable group meetings on a Friday afternoon at which, from time to time, we discuss journal articles. This week’s article is a perspective by Kozakov et al. describing druggability. I was once more struck by the roundhead tendency of the approach. Targets can of course be undruggable, but deluding ourselves into thinking that our understanding of biology and chemistry is so complete that we can predict this in advance using only the structure of the protein involved is not just puritanical but rather presumptuous too. The authors do not help themselves by mentioning that some druggability approaches consider HMG-CoA reductase to be undruggable. My main problem with this concept is that it tells you only what NOT to do. While I appreciate that drug discovery is an enterprise in which the resources available are unlikely ever to be great enough to take on all possible targets and approaches, this seems to allow no role for curiosity and the human delight in exploration. Indeed, if this is really a tactic for dressing up a decision about resourcing as science then we really should avoid that: a resourcing decision is a resourcing decision and should not be disguised as something else.

Forbidden substructures

I am aware that so far, you could read this article as me trying to tell you not to do things that tell you what not to do. So let me try to provide an illustration of what I hope is a more constructive (the word cavalier in this context might make me sound rash) approach to tackling roundheadism. I am sure many people working in drug discovery have come across those who would ban certain substructures, sometimes with good reason, but mostly based on extrapolating one or two bad actors to a whole class. I have heard tell of the banning of nitro groups and, of course, of aromatic amines. I was particularly vexed by the latter because I had been led to believe that the problem with aromatic amines is mostly a chemical one: they are transformed biochemically into reactive species but then undergo a purely chemical reaction with genetic material. I thought I understood chemistry and so presumed that the problem could be tackled logically. I think we demonstrated in a couple of examples in real drug discovery projects that this is correct and that you can find examples that retain all the “good” properties but which are significantly less risky from a DNA reactivity perspective (I put it no more strongly than that). I would far rather hear about interesting new ways to make logical (or illogical) progress in drug discovery than hear new ways of telling people what not to do.

I think I will leave it at those three examples for now but expect to feel compelled to rant about further roundheadism in the not too distant future.


Has medicinal chemistry entered the big data space?

Iva recently attended the London Innovation Society’s Big Data Analysis Innovation Awards at which she was selected to present a poster. This has prompted us to ponder whether our recent collaborative work with MedChemica is genuinely “Big Data” or just an analysis that happens to have more data than is normal in the field (Medicinal Chemistry). Moreover, what (if anything) can we learn from the leaps forward in big data analysis taking place in other sectors? A recently published article considers how genomics might compare with astronomy, YouTube and Twitter (if nothing else, we enjoy the juxtaposition of one of mankind’s most primordial obsessions with the obsessions into which we are now regressing). In terms of sheer scale, medicinal chemistry seems to still be some way off from having the “zetta” (10 to the power of 21) scale data attributed to genomics or astronomy. Depending on how inclusive one wishes to be, it may compare with the fractions of a billion tweets per year. My back-of-the-envelope guess is that the global medicinal chemistry effort might add some hundreds of millions of data points per year (an HTS may be of order 1 million but few organisations can undertake them; individual compound testing efforts within large companies may add hundreds or thousands of data points per active research project, of which there may be some hundreds). Recent efforts to make and test encoded libraries with billions of compounds in them probably don’t yet add one data point per compound so are unlikely to shift this in the near term. Indeed, it is not clear whether the number of medicinal chemistry data points being generated per year is currently increasing or decreasing. It is a thought-provoking contrast that the four-headed beast that is predicted for genomics (data acquisition, storage, distribution and analysis) remains barely relevant to medicinal chemistry: the data instead remain divided amongst a stack of individual companies around the globe.
Databases like ChEMBL (13 million data points) surely represent only a small, but not insignificant, fraction of the medicinal chemistry dataset. Others have recently speculated about the impact big data will have on medicinal chemists. Two aspects that we are particularly interested in are training and culture.

Unless things have changed radically in the last five years, most medicinal chemists come from a background in synthetic organic chemistry. As has been noted recently, this is the discipline of the discrete, the precise and of worrying about how to make things. Medicinal chemistry, on the other hand, deals with the “continuous” properties of biology, which can be measured with much less precision and reproducibility, and should be concerned with what to make (how to make it only kicks in afterwards). Does this training provide the best background to deal with medicinal chemistry in the big data era? What role is there for statisticians, analysts and mathematicians? Particularly in the UK, can we start to bring back some of the brilliant minds that have been lured into the City to do just this sort of analysis? Furthermore, ours is a culture imbued with the beauty of synthesis (a penchant that I still share, but these days more as a guilty indulgence than anything else) and with caring for individual molecules (I have chosen the verb “to care” with some care: “the process of protecting someone or something and providing what that person or thing needs” describes one aspect of the problem rather well). It is hard not to have your head turned by any molecule (or other thing) in which you have invested many days of hard work, but making the right thing may require just that.
