The haploid human genome contains approximately 20,000 protein-coding genes, significantly fewer than had been anticipated. Protein-coding sequences account for only a very small fraction of the genome (approximately 1.5%), and the rest is associated with non-coding RNA molecules, regulatory DNA sequences, LINEs, SINEs, introns, and sequences for which as yet no function has been elucidated.
In other words, the considerable observable differences between humans and chimps may be due as much or more to genome-level variation in the number, function and expression of genes as to DNA sequence changes in shared genes. Indeed, even within humans, there has been found to be a previously unappreciated amount of copy number variation (CNV), which can make up as much as 5 to 15% of the human genome. That is, the genomes of two humans could differ by up to roughly 500,000,000 base pairs of DNA, some of it in active genes, some in inactivated genes, and some in genes active at different levels. The full significance of this finding remains to be seen. On average, a typical human protein-coding gene differs from its chimpanzee ortholog by only two amino acid substitutions; nearly one third of human genes have exactly the same protein translation as their chimpanzee orthologs. A major difference between the two genomes is human chromosome 2, which is equivalent to a fusion product of chimpanzee chromosomes 12 and 13 (later renamed to chromosomes 2A and 2B, respectively).
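The base-pair figure for copy number variation can be checked with simple arithmetic. The sketch below assumes a haploid genome size of about 3.1 billion base pairs, a commonly quoted approximation that does not appear in the text above; the constant and function names are illustrative only.

```python
# Assumed approximate haploid human genome size in base pairs.
# This value is an assumption for illustration, not taken from the text.
ASSUMED_GENOME_BP = 3_100_000_000


def cnv_span_bp(percent, genome_bp=ASSUMED_GENOME_BP):
    """Return the number of base pairs covered by `percent` of the genome.

    Integer arithmetic is used so the result is exact for whole percents.
    """
    return genome_bp * percent // 100


low = cnv_span_bp(5)    # lower bound of the 5 to 15% range
high = cnv_span_bp(15)  # upper bound of the 5 to 15% range
print(f"CNV may span roughly {low:,} to {high:,} base pairs")
```

Under this assumed genome size, the upper bound comes out a little under 500 million base pairs, which is consistent in order of magnitude with the figure quoted above.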
In addition, drug development is required to establish the physicochemical properties of the new chemical entity (NCE): its chemical makeup, stability, and solubility. The manufacturing process is optimized so that a compound first made at the bench on a milligram scale by a medicinal chemist can be manufactured on the kilogram and then the ton scale. The compound is further examined for its suitability to be made into capsules, tablets, aerosols, or intramuscular, subcutaneous, or intravenous injectable formulations. Together these processes are known in preclinical development as Chemistry, Manufacturing and Control (CMC).
In the US, the FDA can audit the files of local site investigators after they have finished participating in a study, to see whether they correctly followed study procedures. This audit may be random or for cause (because the investigator is suspected of submitting fraudulent data). Avoiding an audit is an incentive for investigators to follow study procedures.
Within the field of organic chemistry, the definition of natural products is usually restricted to mean purified organic compounds isolated from natural sources that are produced by the pathways of primary or secondary metabolism. Within the field of medicinal chemistry, the definition is often further restricted to secondary metabolites. Secondary metabolites are not essential for survival, but nevertheless provide the organisms that produce them with an evolutionary advantage. Many natural products are cytotoxic and have been selected and optimized through evolution for use as "chemical warfare" agents against prey, predators, and competing organisms.
One historical misconception regarding noncoding RNAs (ncRNAs) is that they lack critical genetic information or function. Rather, ncRNAs are often critical elements in gene regulation and expression. Noncoding RNA also contributes to epigenetics, transcription, RNA splicing, and the translational machinery. The role of RNA in genetic regulation and disease offers a new, largely unexplored level of genomic complexity.
Discovery is the identification of novel active chemical compounds, often called "hits", which are typically found by assaying compounds for a desired biological activity. Initial hits can come from repurposing existing agents toward new pathologic processes, and from observations of biologic effects of new or existing natural products from bacteria, fungi, plants, etc. In addition, hits routinely originate from structural observations of small molecule "fragments" bound to therapeutic targets (enzymes, receptors, etc.), where the fragments serve as starting points to develop more chemically complex forms by synthesis. Finally, hits regularly originate from en masse testing of chemical compounds against biological targets, where the compounds may be from novel synthetic chemical libraries known to have particular properties (kinase inhibitory activity, diversity or drug-likeness, etc.), or from historic chemical compound collections or libraries created through combinatorial chemistry. While a number of approaches toward the identification and development of hits exist, the most successful techniques are based on chemical and biological intuition developed in team environments through years of rigorous practice aimed solely at discovering new therapeutic agents.
In such studies, multiple experimental treatments are tested in a single trial. Genetic testing enables researchers to group patients according to their genetic profile, deliver drugs based on that profile to that group and compare the results. Multiple companies can participate, each bringing a different drug. The first such approach targets squamous cell cancer, which includes varying genetic disruptions from patient to patient. Amgen, AstraZeneca and Pfizer are involved, the first time they have worked together in a late-stage trial. Patients whose genomic profiles do not match any of the trial drugs receive a drug designed to stimulate the immune system to attack cancer.
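The grouping logic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the biomarker names, arm names, and the first-match rule are all hypothetical and are not taken from any actual trial protocol, which may well randomize among matching arms or apply other eligibility criteria.

```python
from collections import defaultdict

# Hypothetical mapping from a genomic biomarker to the study arm whose
# drug targets it. All names are placeholders, not real trial data.
ARM_BY_BIOMARKER = {
    "biomarker_1": "arm_drug_A",
    "biomarker_2": "arm_drug_B",
    "biomarker_3": "arm_drug_C",
}

# Patients matching no targeted arm receive the immune-stimulating drug.
DEFAULT_ARM = "arm_immunotherapy"


def assign_arm(profile):
    """Return the arm for one patient's set of detected biomarkers.

    Assumption: the first matching biomarker (in mapping order) wins.
    """
    for biomarker, arm in ARM_BY_BIOMARKER.items():
        if biomarker in profile:
            return arm
    return DEFAULT_ARM


def group_patients(profiles):
    """Group patient IDs by assigned arm so results can be compared."""
    groups = defaultdict(list)
    for patient_id, profile in profiles.items():
        groups[assign_arm(profile)].append(patient_id)
    return dict(groups)


# Example with made-up patients.
example = {
    "p1": {"biomarker_2"},
    "p2": set(),
    "p3": {"biomarker_1", "biomarker_3"},
}
print(group_patients(example))
```

Keeping the mapping in a single table makes it straightforward to add or retire an arm as sponsors join or leave the trial without touching the assignment logic.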
The last decade has seen a proliferation of information technology use in the planning and conduct of clinical trials. Clinical trial management systems are often used by research sponsors or CROs to help plan and manage the operational aspects of a clinical trial, particularly with respect to investigational sites. Advanced analytics for identifying researchers and research sites with expertise in a given area utilize public and private information about ongoing research. Web-based electronic data capture (EDC) and clinical data management systems are used in a majority of clinical trials to collect case report data from sites, manage its quality and prepare it for analysis. Interactive voice response systems (IVRS) are used by sites to register the enrollment of patients by phone and to allocate patients to a particular treatment arm, although phones are increasingly being replaced with web-based tools (IWRS), which are sometimes part of the EDC system. Patient-reported outcome measures are increasingly collected using hand-held, sometimes wireless, ePRO (or eDiary) devices. Statistical software is used to analyze the collected data and prepare it for regulatory submission. Access to many of these applications is increasingly aggregated in web-based clinical trial portals. In 2011, the FDA approved a Phase 1 trial that used telemonitoring, also known as remote patient monitoring, to collect biometric data in patients’ homes and transmit it electronically to the trial database. This technology provides many more data points and is far more convenient for patients, because they have fewer visits to trial sites.
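As an illustration of how an enrollment system might allocate patients to treatment arms, the sketch below uses permuted-block randomization. This technique is chosen here for illustration and is not described in the text above; the function name, default block size, and arm labels are assumptions, and real systems add stratification, audit trails, and concealment controls.

```python
import random


def permuted_block_schedule(n_patients, arms=("A", "B"), block_size=4, seed=0):
    """Return a list of arm labels, one per enrolled patient, in order.

    Each block contains every arm equally often in a shuffled order, so
    arm sizes stay balanced throughout enrollment. The seed is fixed
    only to make this illustrative schedule reproducible.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    schedule = []
    while len(schedule) < n_patients:
        block = list(arms) * per_arm  # equal count of each arm
        rng.shuffle(block)            # random order within the block
        schedule.extend(block)
    return schedule[:n_patients]


# The k-th patient to enroll receives the k-th entry of the schedule.
schedule = permuted_block_schedule(8)
print(schedule)
```

Balancing within small blocks keeps the arms close in size even if enrollment stops early, at the cost of making the last assignment in a block predictable to anyone who knows the block size.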
In 1990, the two major funding agencies, DOE and NIH, developed a memorandum of understanding to coordinate plans and set the clock for the initiation of the Project to 1990. At that time, David Galas was Director of the renamed "Office of Biological and Environmental Research" in the U.S. Department of Energy’s Office of Science, and James Watson headed the NIH Genome Program. In 1993, Aristides Patrinos succeeded Galas and Francis Collins succeeded James Watson, assuming the role of overall Project Head as Director of the U.S. National Institutes of Health (NIH) National Human Genome Research Institute. A working draft of the genome was announced in 2000 and a complete one in 2003, with more detailed analysis still being published.