1) The Startup Phase / Review and Finalization of Study Documents
The startup phase consists of activities such as CRF creation and design, database design and testing, edit check preparation, and User Acceptance Testing (UAT), along with the preparation of documents such as the Data Management Plan, CRF Completion Guidelines, Data Entry Guidelines, and Data Validation Plan.
During this review, the CDM staff determines the data items to be collected and the frequency of their collection with respect to the visit schedule.
The Data Management Plan (DMP) then outlines the CDM procedures to be followed during the trial and serves as a guide on how to handle the data under foreseeable circumstances.
The DMP describes the database design, data entry and data tracking guidelines, quality control measures, SAE reconciliation guidelines, discrepancy management, data transfer/extraction, and database locking guidelines. A Data Validation Plan (DVP), prepared together with the DMP, specifies the edit checks to be applied to the data.
A Case Report Form (CRF) is designed by the CDM team, as this is the first step in translating the protocol-specific activities into data being generated. The data fields should be clearly defined and consistent throughout, and the type of data to be entered should be evident from the CRF.
Similarly, the units in which measurements are to be made should also be mentioned next to the data field. The CRF should be concise, self-explanatory, and user-friendly (unless you are the one entering data into the CRF).
For questions with discrete response options (for example, the variable sex having male and female as responses), all possible options should be coded appropriately.
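As a minimal sketch of what such coding can look like, the snippet below maps a discrete CRF response onto a numeric code; the codelist and function name are hypothetical, not taken from any specific CDM system.

```python
# Hypothetical codelist for a discrete CRF question (codes are illustrative only).
SEX_CODES = {1: "Male", 2: "Female"}

def encode_response(verbatim: str, codelist: dict[int, str]) -> int:
    """Return the numeric code for a verbatim response, or raise if it is not a valid option."""
    for code, label in codelist.items():
        if label.lower() == verbatim.strip().lower():
            return code
    raise ValueError(f"'{verbatim}' is not a valid option for this field")

print(encode_response("female", SEX_CODES))  # -> 2
```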
2) Database Designing
Databases are clinical software applications built to facilitate CDM tasks in carrying out multiple studies. Generally, these tools have built-in compliance with regulatory requirements and are easy to use.
“System validation” is conducted to ensure data correctness; in this step, system specifications, user requirements, and regulatory compliance are evaluated before implementation. Study details such as objectives, intervals, visits, investigators, sites, and patients are defined in the database, and CRF layouts are designed for data entry.
These entry screens are tested with dummy data before they are released for live data capture.
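The sketch below illustrates, under the assumption of a hypothetical in-house tool, how study details and CRF entry-screen fields might be represented and then exercised with dummy data; all class names, field names, and values are invented for illustration.

```python
# Hypothetical representation of study details and CRF entry-screen fields.
from dataclasses import dataclass, field

@dataclass
class CRFField:
    name: str
    dtype: type   # expected data type on the entry screen
    unit: str = ""

@dataclass
class StudyDefinition:
    objective: str
    visits: list[str]
    sites: list[str]
    crf_fields: list[CRFField] = field(default_factory=list)

study = StudyDefinition(
    objective="Assess efficacy of drug X versus placebo",  # illustrative only
    visits=["Screening", "Week 4", "Week 8"],
    sites=["Site 01", "Site 02"],
    crf_fields=[CRFField("age", int, "years"), CRFField("weight", float, "kg")],
)

# Entry screens are exercised with dummy data before live data capture begins.
dummy_record = {"age": 42, "weight": 70.5}
for f in study.crf_fields:
    assert isinstance(dummy_record[f.name], f.dtype), f"{f.name}: wrong type"
print("Dummy entry passed type checks")
```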
3) Data Collection
Data collection is done using the CRF, which may exist in the form of a paper or an electronic version. The CRF is annotated with coded terms that indicate where the data collected for each question are to be stored in the database.
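A minimal illustration of such an annotation is shown below: each CRF question is mapped to the coded variable under which its responses are stored. The question texts and variable names here are placeholders, not taken from any real annotated CRF.

```python
# Hypothetical CRF annotation: question text -> coded database field.
CRF_ANNOTATIONS = {
    "Date of birth": "DM.BRTHDTC",
    "Systolic blood pressure (mmHg)": "VS.SYSBP",
    "Any adverse event since last visit?": "AE.AEYN",
}

def storage_target(question: str) -> str:
    """Return the database field where responses to a CRF question are stored."""
    return CRF_ANNOTATIONS[question]

print(storage_target("Systolic blood pressure (mmHg)"))  # -> VS.SYSBP
```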
The main objectives behind CRF development are preserving and maintaining the quality and integrity of data. CRF design should be standardized to address the needs of all users, such as investigators, site coordinators, study monitors, data entry personnel, medical coders, and statisticians. Data should be organized in a format that facilitates and simplifies data analysis.
Collecting large amounts of data wastes resources on collection and processing, and in many circumstances the data will not be used for analysis. Apart from that, standard guidelines should be followed while designing the CRF.
A CRF completion manual should be provided to the site personnel to promote accurate data entry. These measures result in fewer queries and improved data integrity. It is recommended to establish and maintain a library of templates of standard CRF modules, as they are time-saving and cost-effective.
Today, as pharmaceutical companies attempt to shorten the drug development process by speeding up the processes involved, many are choosing e-CRF options (also called remote data entry).
4) Data Validation
Data validation is the process of testing the validity of data against the protocol specifications. To verify the correctness of the entered data, edit check programs that detect discrepancies in the data are written and incorporated into the database.
These programs are created in accordance with the logical conditions specified in the DVP. The edit check programs are initially tested with dummy data containing deliberate errors. A data point that fails a validation check is defined as a discrepancy. Discrepancies may be due to inconsistent data, missing data, out-of-range values, or protocol violations.
In e-CRF-based studies, the data validation process is run frequently to identify discrepancies. These discrepancies are then resolved by the investigators after logging into the system.
Ongoing quality control of data processing is undertaken at regular intervals during CDM. For instance, if the inclusion criteria specify that the age of the patient should be between 18 and 65 years (both inclusive), an edit check program is written for two conditions, viz. age <18 and age >65. If either condition is TRUE for a patient, a discrepancy is generated. These discrepancies are highlighted in the system, and Data Clarification Forms (DCFs) may be generated. DCFs are documents containing queries pertaining to the identified discrepancies.
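The sketch below shows what an edit check program for the age example above could look like; the record layout, field names, and query texts are assumptions for illustration, not a specific CDMS interface.

```python
# Hypothetical edit check for the inclusion criterion: age between 18 and 65, inclusive.
def age_edit_check(record: dict) -> list[dict]:
    """Return discrepancies (query texts) raised by the age range check."""
    discrepancies = []
    age = record.get("age")
    if age is None:
        discrepancies.append({"field": "age", "query": "Age is missing"})
    elif age < 18 or age > 65:
        discrepancies.append(
            {"field": "age", "query": f"Age {age} outside inclusion range 18-65"}
        )
    return discrepancies

# Tested first with dummy data containing deliberate errors:
for rec in [{"subject": "001", "age": 34}, {"subject": "002", "age": 70}, {"subject": "003"}]:
    for d in age_edit_check(rec):
        print(rec.get("subject"), d["query"])  # flagged records would feed a DCF
```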
5) Discrepancy Management
It is also called query resolution.
Discrepancy management includes reviewing discrepancies, investigating the reason behind them, and resolving them with documentary proof or declaring them irresolvable. Discrepancy management helps in cleaning the data and gathering enough evidence for the deviations observed in it.
Most CDMS include a discrepancy database in which all discrepancies are recorded and stored with an audit trail.
The investigators write the resolution, or an explanation of the circumstances that led to the discrepancy in the data. When a resolution is provided by the investigator, the database is updated accordingly. In the case of e-CRFs, the investigator can access the discrepancies flagged to him and provide resolutions online.
The CDM team reviews all discrepancies at regular intervals to ensure that they have been resolved. The resolved data discrepancies are recorded as ‘closed’. This means that those validation failures are no longer considered active, and subsequent validation runs on the same data will not create a discrepancy for the same data point.
However, closure of discrepancies is not always possible. In some cases, the investigator will not be able to provide a resolution for the discrepancy. Such discrepancies are considered ‘irresolvable’ and are updated accordingly in the discrepancy database.
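As a minimal sketch of the lifecycle just described, the class below keeps a discrepancy record with an audit trail and the transitions open → closed or open → irresolvable; the class, method names, and example data are hypothetical.

```python
# Hypothetical discrepancy record with an audit trail and status transitions.
from datetime import datetime, timezone

class Discrepancy:
    def __init__(self, subject: str, field: str, query: str):
        self.subject, self.field, self.query = subject, field, query
        self.status = "open"
        self.audit_trail = [(datetime.now(timezone.utc), "opened", query)]

    def _log(self, action: str, note: str) -> None:
        self.audit_trail.append((datetime.now(timezone.utc), action, note))

    def close(self, resolution: str) -> None:
        """Investigator provided a resolution; the query is closed and logged."""
        self.status = "closed"
        self._log("closed", resolution)

    def mark_irresolvable(self, reason: str) -> None:
        """No resolution could be provided; keep the query on file as irresolvable."""
        self.status = "irresolvable"
        self._log("irresolvable", reason)

d = Discrepancy("002", "age", "Age 70 outside inclusion range 18-65")
d.close("Age corrected to 60 per source document")
print(d.status, len(d.audit_trail))  # -> closed 2
```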
Discrepancy management is the most critical activity in the CDM process. As it is vital to cleaning the data, utmost attention must be paid while handling discrepancies.
6) Medical Coding
Medical coding helps in identifying and properly classifying the medical terminologies associated with the clinical trial.
Medical dictionaries available online are used to code the events. Technically, this activity requires knowledge of medical terminology, an understanding of disease entities and the drugs used, and a basic knowledge of the pathological processes involved. Functionally, it also requires knowledge of the structure of electronic medical dictionaries and the hierarchy of classifications available in them.
Adverse events occurring during the study, previously and concomitantly administered medications, and pre-existing or coexisting illnesses are coded using the available medical dictionaries. Commonly, the WHO Drug Dictionary Enhanced (WHO-DDE) is used for coding medications, while the Medical Dictionary for Regulatory Activities (MedDRA) is used for coding adverse events and other illnesses. These dictionaries contain the respective classifications of adverse events and drugs in proper classes. Other dictionaries are also available for use in data management (e.g., WHO-ART is a dictionary that deals with adverse reaction terminology).
Some pharmaceutical companies use customized dictionaries to suit their needs and to comply with their standard operating procedures. Medical coding also helps classify the medical terminology reported on the CRF into standard dictionary terms, achieving data consistency and avoiding unnecessary duplication. For example, investigators may use different terms for the same adverse event, but it is important to code them all to a single standard code and maintain uniformity in the process.
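The snippet below sketches this uniformity idea: several verbatim terms for the same adverse event all map to one preferred term and code. The dictionary entries and the code value are invented placeholders, not real MedDRA or WHO-DDE content.

```python
# Hypothetical synonym dictionary mapping verbatim terms to one preferred term and code.
SYNONYM_TO_PREFERRED = {
    "headache": ("Headache", "PT000123"),
    "head ache": ("Headache", "PT000123"),
    "cephalalgia": ("Headache", "PT000123"),
}

def code_adverse_event(verbatim: str) -> tuple[str, str]:
    """Map a verbatim term to (preferred term, code); unknown terms go to manual review."""
    key = verbatim.strip().lower()
    if key not in SYNONYM_TO_PREFERRED:
        raise LookupError(f"'{verbatim}' not in dictionary; route to manual coding review")
    return SYNONYM_TO_PREFERRED[key]

print(code_adverse_event("Cephalalgia"))  # -> ('Headache', 'PT000123')
```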
The proper coding and classification of adverse events and medications is crucial, as incorrect coding may mask safety issues or highlight the wrong safety concerns related to the drug.
7) Database Locking
After a proper quality check and assurance, the final data validation is run.
All data management activities should be completed before the database lock. To ensure this, a pre-lock checklist is used and the completion of all activities is confirmed.
Once clearance for locking is received from all stakeholders, the database is locked and clean data are extracted for statistical analysis. Generally, no modification of the database is possible after this point. However, in case of a critical issue or for other important operational reasons, privileged users can modify the data even after the database is locked. Updating a locked database requires sufficient justification, suitable documentation, and an audit trail.
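A minimal sketch of that locking behaviour is given below: ordinary edits are rejected after the lock, while a privileged user with a documented justification can still update data, and every action is written to an audit trail. The class and method names are hypothetical, not a real CDMS API.

```python
# Hypothetical study database with lock, privileged post-lock updates, and an audit trail.
from datetime import datetime, timezone

class StudyDatabase:
    def __init__(self):
        self.records: dict[str, dict] = {}
        self.locked = False
        self.audit_trail: list[tuple] = []

    def lock(self) -> None:
        self.locked = True
        self.audit_trail.append((datetime.now(timezone.utc), "system", "database locked"))

    def update(self, subject: str, field: str, value, user: str,
               privileged: bool = False, justification: str = "") -> None:
        if self.locked and not (privileged and justification):
            raise PermissionError("Database is locked; privileged access and justification required")
        self.records.setdefault(subject, {})[field] = value
        self.audit_trail.append(
            (datetime.now(timezone.utc), user, f"set {subject}.{field}={value}; {justification}")
        )

db = StudyDatabase()
db.update("001", "age", 34, user="data_entry")
db.lock()
db.update("001", "age", 35, user="dm_lead", privileged=True,
          justification="critical correction approved post-lock")
```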
Data extraction is done from the final database after locking. This is followed by its archival.