Hitting difficult targets

In the 1990s, high-throughput screening (HTS) was touted as the future of hit generation. Through the decade, combinatorial chemistry, automation and the exploration of physical chemical space, allied to a focus on in vitro biology, led to the rise of a more probabilistic approach, and HTS became routine.

And whilst the biggest source of molecules entering clinical trials remains HTS, followed by knowledge-based approaches, it’s fair to say that HTS has not had the predicted impact. There has been a steady flow of compounds from screens that will never make drugs because their properties are sub-optimal, alongside a recognition that the targets being addressed today are more difficult to prosecute than those of two decades ago.

For these more challenging targets, other approaches are beginning to show benefit, and knowledge-based approaches are attracting increasing interest. So, while HTS remains a great tool, the number of other tools available to find chemical matter is growing.

Beyond HTS

Whilst HTS allows the investigation of perhaps a million or so compounds, more expansive screening has been enabled by the evolution of DNA-encoded library (DEL) screening. The ability to screen several billion compounds in a single tube against drug targets of interest is a compelling proposition, but it does raise its own challenges – particularly around the nature of the hits.

These hits can be difficult to translate into leads with acceptable properties. But where DEL screens do particularly add value is for complex targets with unclear biological readouts or where there are no pre-existing ligands – areas where the more immediately tractable approaches of HTS and knowledge-based hit finding are perhaps not applicable.

Whilst we often focus on the molecular properties of our hits, it’s worth noting that moving outside traditional HTS hits and rule-of-five (Ro5) space does not in itself preclude drug-likeness. This has been demonstrated by the growing interest in, and clinical progression of, natural product-like macrocycles as valuable hit matter against challenging targets, including protein-protein interactions. These approaches have been expanded further through genetic techniques, using phage display to generate vast, diverse libraries of cyclic peptides that can rapidly yield hits and probes against emerging biological targets. But, like DEL hits, these may require significant optimisation to convert them from biologically interesting probe molecules into ones useful as human medicines.
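As a rough illustration of what sitting inside or outside Ro5 space means in practice, the sketch below counts violations of the four standard Ro5 thresholds (molecular weight over 500 Da, calculated logP over 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). The `Descriptors` class, the function name and both example molecules are hypothetical, with descriptor values invented purely for illustration; in a real workflow they would come from a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one compound (illustrative container)."""
    mol_weight: float        # molecular weight in Da
    logp: float              # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def ro5_violations(d: Descriptors) -> list[str]:
    """Return the Ro5 criteria that a compound breaks."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if d.logp > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("more than 5 H-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations


# Hypothetical descriptor values, chosen only to show both outcomes.
typical_hts_hit = Descriptors(mol_weight=350.0, logp=2.5,
                              h_bond_donors=2, h_bond_acceptors=5)
macrocycle_like = Descriptors(mol_weight=1100.0, logp=4.0,
                              h_bond_donors=6, h_bond_acceptors=14)

for name, d in [("typical HTS hit", typical_hts_hit),
                ("macrocycle-like hit", macrocycle_like)]:
    broken = ro5_violations(d)
    print(f"{name}: {len(broken)} violation(s) {broken}")
```

In line with the point made above, a violation count of this kind is better treated as a prompt for closer inspection than as a hard filter, since macrocycles that break several criteria have still progressed clinically.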

At the other end of the size spectrum from macrocyclic compounds and DEL hits, fragments can be very useful in determining whether a protein has druggable binding pockets. Fragment hits will need to be evolved into something more drug-like, and this process is greatly aided by structural information on how the fragment binds the target, which directs the medicinal chemist’s thinking. Fragment screens have the advantage that fewer molecules need to be screened to encompass similar diversity, and they may help to identify dynamic or transient pockets that are not captured by more traditional structure-based methods.

Indeed, cryptic pockets that only appear when a drug binds to them are of increasing interest, but their dynamic nature means that currently available technologies are less successful at identifying hits for them. Some companies claim to be able to follow protein conformational changes computationally and so identify novel binding regions; established methods such as chemoinformatics and ligand-based screening are now being joined by AI/ML techniques that might be able to provide more dynamic structural information. While AI companies have invested heavily in the area, there is as yet little evidence of impact in the public domain.

Recently, these computational techniques have returned to the fore, and virtual screening is being relied upon more and more, particularly with access to ultra-large virtual libraries. Is the big promise for the next 20 years the ability to use artificial intelligence (AI) and machine learning (ML) to query these ultra-large virtual libraries? A common tactic today is to use ligand-based 3D shape matching with a virtual library, such as Enamine’s REAL database, which now includes nearly two billion molecules with potentially drug-like properties. If the active site (or other binding pocket) can be identified, it should be possible to use computational chemistry in a more directed way to find a high-quality hit. This, however, remains something of an art and is no guarantee of success.

And even though AI/ML specialists can now screen ultra-large libraries very quickly, many approaches focus primarily on potency, and optimising drug-like properties remains a challenge for many systems. While a low-affinity ligand can often be evolved into a nanomolar one, far less attention is paid to druggability and pharmacokinetic profiles at this early stage. The ability to rapidly assess hit matter through commercial “analogue by catalogue”, discarding poor quality hits and prioritising promising ones, can significantly accelerate progress. There is now much more of a focus among library companies on providing molecules that are evolvable, and some, like Enamine, have materials on the shelf ready to make and evolve focussed compound libraries quickly. However, large physical libraries rarely exist these days, aside from the large cocktails of compounds in DNA-encoded and phage display libraries.

Designing better libraries

Regardless of hit finding technique, one concern with existing libraries is that if they are all based on precedented chemistry, how useful will they be for unprecedented targets? The shape of a new binding pocket may be unknown, yet little attention is paid to shape diversity within a typical screening collection. Diversity in the library, whether real or virtual, is key. An example of a modern approach to library creation is Sygnature’s own internal virtual library of about 17 million compounds; while relatively small, it is based on diverse and proprietary drug-like scaffolds, and on three-dimensional shape diversity. These are increasingly being recognised as key factors in library creation. Similarly, Sygnature’s fragment library is designed to be diverse, with useful chemical functionality on the fragments to make the hit-to-lead process faster and more efficient.
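One generic way to build diversity into a compound selection is greedy MaxMin-style picking: start from one compound, then repeatedly add the compound that is furthest from everything already chosen. The sketch below is a minimal, toolkit-free illustration and does not describe how any particular library was designed. The fingerprints are toy sets of feature identifiers invented for the example, and Tanimoto distance on those sets stands in for whatever measure is actually used; a 3D shape-based distance could be substituted in the same loop.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def maxmin_pick(fingerprints: list, n_pick: int, seed_index: int = 0) -> list:
    """Greedily pick indices so each new pick is far from all previous picks."""
    if not fingerprints or n_pick <= 0:
        return []
    picked = [seed_index]
    # Distance from every compound to its nearest already-picked compound.
    min_dist = [1.0 - tanimoto(fp, fingerprints[seed_index])
                for fp in fingerprints]
    while len(picked) < min(n_pick, len(fingerprints)):
        candidates = [i for i in range(len(fingerprints)) if i not in picked]
        # Next pick: the candidate whose nearest picked neighbour is furthest away.
        next_index = max(candidates, key=lambda i: min_dist[i])
        picked.append(next_index)
        for i, fp in enumerate(fingerprints):
            dist = 1.0 - tanimoto(fp, fingerprints[next_index])
            min_dist[i] = min(min_dist[i], dist)
    return picked


# Toy feature sets (hypothetical): two pairs of near-duplicates and one outlier.
toy_fingerprints = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({10, 11, 12}),
    frozenset({10, 11, 13}),
    frozenset({20, 21}),
]

selection = maxmin_pick(toy_fingerprints, n_pick=3)
print(selection)
```

On this toy set the selection takes one member from each cluster and skips the near-duplicates, which is the behaviour a diversity-led library design is aiming for.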

And that efficiency of prosecution is of critical importance. Regardless of how a hit is generated, the identification of those with properties suitable for optimisation, and ultimately development, remains challenging, and any screening approach might ultimately only lead to probe molecules that cannot be prosecuted into drugs.

In some respects, finding a hit is the easy part. The conversion of that hit into a molecule with all the requisite properties to become a drug remains the greater challenge. If a screen provides a hit set of, say, 20 compounds, each of which has two or three vectors that require exploration, that’s a significant synthetic task to fully explore, unless the hits can be further prioritised in some way. This consideration of the underlying library design, and its future “workability” for expedient exploitation into drug candidate molecules, is of critical importance to success. Ideally, the library components will sit within larger, accessible commercial space so that hit follow-up can be undertaken rapidly and effectively, with access to commercially available elaborated derivatives as well as bulk scaffolds and intermediates for internal synthetic efforts.
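To put rough numbers on that synthetic task, the sketch below counts the analogues needed to explore a hit set. The 20 hits and two or three vectors come from the example above; the figure of 10 substituents tried per vector is purely an assumption for illustration, as is the function name. Two strategies are compared: varying one vector at a time, and making every combination of substituents across all vectors.

```python
def analogue_count(n_hits: int, vectors_per_hit: int,
                   substituents_per_vector: int,
                   combinatorial: bool = False) -> int:
    """Estimate how many analogues are needed to explore a hit set."""
    if combinatorial:
        # Every combination of substituents across all vectors of a hit.
        per_hit = substituents_per_vector ** vectors_per_hit
    else:
        # One vector varied at a time, the others held fixed.
        per_hit = vectors_per_hit * substituents_per_vector
    return n_hits * per_hit


N_HITS = 20            # from the example in the text
SUBSTITUENTS = 10      # assumed number of substituents tried per vector

for vectors in (2, 3):
    one_at_a_time = analogue_count(N_HITS, vectors, SUBSTITUENTS)
    full_matrix = analogue_count(N_HITS, vectors, SUBSTITUENTS,
                                 combinatorial=True)
    print(f"{vectors} vectors: {one_at_a_time} analogues one at a time, "
          f"{full_matrix} for the full matrix")
```

Under these assumptions, even the one-at-a-time strategy needs 400 to 600 compounds, and the full matrix runs to 2,000 to 20,000, which is why prioritising hits before committing synthetic resource matters so much.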

Most importantly, the nature of the target should always dictate the strategy applied to lead finding. The ability to access different compounds and novel chemical space is already having an impact, whether through more traditional HTS, virtual screening, or the creation of large macrocycles. As an overall process, it is still somewhat hit and miss, but with an educated and informed choice of hit finding approach, and perhaps a little bit of luck, hits (and eventually drugs) for these emerging, challenging targets can indeed be delivered.

This article summarises the vibrant discussions during a recent roundtable event hosted by Sygnature Discovery and chaired by Dr Mark Ashwell, Consultant at Ashwell Consulting Group. We hope you found it thought provoking and stimulating. We love to connect and interact with scientists across the globe, to discuss all aspects of drug discovery science and to help drive forward the progression of experimental therapies toward the clinic and patient benefit. If you’d like to strike up a conversation, or if we can help you accelerate your own projects toward patients, we’d love to hear from you. Please reach out using any of the contact forms.
