Data Management and Biostatistics Assignment Brief


The dataset for this assignment can be found in the Excel file prostate_cancer_dataset.xlsx (on Blackboard). The Excel file contains two sheets: the first sheet (‘DATASET’) contains the dataset you will need to complete this assignment, and the second sheet (‘CODEBOOK’) contains a description of each variable in the dataset.

This data was collected as part of a randomized controlled trial to assess the effect of treatment with diethylstilbestrol (or DES, a synthetic non-steroidal estrogen) on survival in stage 3 and stage 4 prostate cancer patients.

Patients with stage 3 or stage 4 prostate cancer were enrolled and randomly assigned to one of four treatment arms:

  • Placebo
  • 0.2 mg estrogen
  • 1.0 mg estrogen
  • 5.0 mg estrogen

Several baseline characteristics were measured, including stage of prostate cancer, age at diagnosis, weight index, performance rating (a measure of quality of life), whether the patient had a history of cardiac disease, systolic and diastolic blood pressure, electrocardiogram result, serum hemoglobin, size of primary tumor, a combined index of stage and histological grade, and whether or not bone metastases had occurred.
Each patient was followed up, and survival status and survival time (in months) were recorded.

A. Import the dataset into SPSS. Define the attributes of the variables in the dataset (in the ‘Variable View’ in SPSS). Note that you will be assessed on whether or not you correctly define the variable attributes.

Your final version of this SPSS dataset should be submitted as part of this assignment, so you will need to save the dataset and any modifications you make to it. Save the dataset as: yourname_dataset.sav

Descriptive statistics

A. Patients were randomly assigned to treatment arms to achieve balance across the treatment arms with respect to baseline characteristics. To investigate whether randomization was successful, construct a table to compare baseline characteristics across the four treatment arms. Include summary information on the following baseline characteristics:

  • number in each treatment arm
  • age at diagnosis
  • stage of prostate cancer
  • weight index
  • history of cardiac disease
  • systolic blood pressure
  • diastolic blood pressure
  • serum hemoglobin
  • size of primary tumor
  • serum prostatic acid phosphatase
  • bone metastases

Note: you should present all the results in one table. If you are unsure how to present the data, look up some published clinical trials to see how baseline characteristics are usually presented. Provide a brief discussion of whether the baseline characteristics appear to be balanced across treatment arms.
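As a rough illustration of what such a table summarises, the sketch below computes the number of patients, mean (SD) of a continuous variable and n (%) of a categorical variable within each arm. It is written in Python rather than SPSS, and the records, field order and the function name summarise_by_arm are hypothetical; none of the values come from the assignment dataset.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records: (treatment arm, age at diagnosis, bone metastases 0/1).
# These values are invented for illustration only.
records = [
    ("Placebo", 71, 0), ("Placebo", 75, 1), ("Placebo", 68, 0),
    ("0.2 mg", 73, 0), ("0.2 mg", 69, 0), ("0.2 mg", 77, 1),
]

def summarise_by_arm(rows):
    """Return per-arm n, mean and SD of age, and n (%) with bone metastases."""
    by_arm = defaultdict(list)
    for arm, age, bone_mets in rows:
        by_arm[arm].append((age, bone_mets))

    summary = {}
    for arm, values in by_arm.items():
        ages = [age for age, _ in values]
        mets_count = sum(flag for _, flag in values)
        summary[arm] = {
            "n": len(values),
            "age_mean": mean(ages),
            # The sample SD needs at least two observations.
            "age_sd": stdev(ages) if len(ages) > 1 else float("nan"),
            "mets_n": mets_count,
            "mets_pct": 100 * mets_count / len(values),
        }
    return summary

for arm, s in summarise_by_arm(records).items():
    print(f"{arm}: n={s['n']}, age {s['age_mean']:.1f} ({s['age_sd']:.1f}), "
          f"bone metastases {s['mets_n']} ({s['mets_pct']:.1f}%)")
```

Continuous variables are usually reported as mean (SD) or median (IQR), and categorical variables as n (%), with one column per treatment arm.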

B. How many, and what percentage, of men in the sample are aged >75? Show how you obtained the answer to this question.
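The arithmetic behind this question can be sketched as follows. This is a minimal Python illustration, not the SPSS procedure itself; the function name count_and_percent_over and the list of ages are hypothetical, and the choice to use non-missing ages as the denominator is an assumption you should state in your answer.

```python
def count_and_percent_over(ages, threshold=75):
    """Count ages strictly greater than threshold and express the count
    as a percentage of the non-missing ages (None marks a missing value)."""
    valid = [age for age in ages if age is not None]
    if not valid:
        raise ValueError("no non-missing ages to summarise")
    count = sum(1 for age in valid if age > threshold)
    return count, 100 * count / len(valid)

# Hypothetical ages, invented for illustration only.
ages = [71, 75, 68, 76, 80, None, 73, 77]
count, percent = count_and_percent_over(ages)
print(f"{count} men aged >75 ({percent:.1f}% of non-missing ages)")
```

Note that the comparison is strict, so a patient aged exactly 75 is not counted. In SPSS, an analogous approach would be to recode age into a new two-category variable and then request a frequency table for it.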

Inferential analysis

A. Test the hypothesis that the proportion of survivors differs for patients treated with estrogen (combining those on doses of 0.2, 1.0 or 5.0 mg) compared to those not treated with estrogen (placebo), using the 7-step process for hypothesis testing.
B. Describe and evaluate the assumptions of the hypothesis test you applied in part A.
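The brief does not prescribe a particular test. One common choice for comparing two proportions is Pearson's chi-square test on a 2x2 table, and the sketch below shows that calculation together with the expected cell counts that are typically inspected when evaluating its assumptions. The function name chi_square_2x2 and the counts are hypothetical and are not taken from the dataset; no continuity correction is applied.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction)
    for the table [[a, b], [c, d]], e.g. rows = estrogen / placebo and
    columns = alive / dead. Returns (statistic, p_value, expected_counts)."""
    observed = ((a, b), (c, d))
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    expected = [[row_totals[i] * col_totals[j] / n for j in range(2)]
                for i in range(2)]
    if any(e == 0 for row in expected for e in row):
        raise ValueError("an expected count is zero; the test is undefined")
    statistic = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
                    for i in range(2) for j in range(2))
    # For 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value, expected

# Hypothetical counts, invented for illustration only.
statistic, p_value, expected = chi_square_2x2(40, 60, 30, 70)
print(f"chi-square = {statistic:.3f}, p = {p_value:.3f}")
print("smallest expected count:", min(min(row) for row in expected))
```

A frequently quoted rule of thumb is that every expected count should be at least 5; when it is not, an exact test is often preferred. Independence of observations is a design assumption and cannot be checked from the table itself.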

You have been asked to design a clinical trial to evaluate the efficacy of a new biologic called ABCDE in patients with active rheumatoid arthritis. Patients who have never before received biologic therapy will be randomized to two groups (placebo control or 1 infusion per month of ABCDE).

Patients will be stratified based on severity of disease. Patients will be treated for 3 months, with a final follow-up visit 3 months after the last treatment.
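One way stratification and randomization can be combined is permuted-block randomization within each stratum, sketched below. The brief does not specify the randomization method, so the block size, the stratum labels and the function name stratified_block_randomisation are all assumptions made for illustration.

```python
import random

def stratified_block_randomisation(strata_sizes, block_size=4, seed=0):
    """Return {stratum: [arm, ...]} using permuted blocks within each stratum.

    strata_sizes maps a stratum label (e.g. a severity category) to the
    number of patients to allocate in that stratum."""
    if block_size <= 0 or block_size % 2 != 0:
        raise ValueError("block_size must be a positive even number")
    rng = random.Random(seed)  # fixed seed so the list is reproducible
    arms = ("placebo", "ABCDE")
    allocation = {}
    for stratum, n_patients in strata_sizes.items():
        sequence = []
        while len(sequence) < n_patients:
            # Each block holds equal numbers of both arms in random order.
            block = list(arms) * (block_size // 2)
            rng.shuffle(block)
            sequence.extend(block)
        allocation[stratum] = sequence[:n_patients]
    return allocation

# Hypothetical strata and sizes, invented for illustration only.
schedule = stratified_block_randomisation({"moderate": 8, "severe": 6})
for stratum, sequence in schedule.items():
    print(stratum, sequence)
```

Blocking within strata keeps the two arms close to balanced inside each severity category, which simple randomization does not guarantee in a small trial.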
There are 3 elements to this assignment:

  • DM Activities Summary (1000 words maximum)
    Describe how the DM team will support the project including
    • Data to be collected, including endpoints
    • Standards
    • Software / hardware you propose using
    • Data Collection Methods
    • Post collection data handling
  • Case Report Form
    • Measurements / data points you would need to collect to test the hypothesis
  • Database Design
    • Diagram showing the layout of the database you would build
      • Groups in database
      • Relationships between groups
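To make "groups" and "relationships between groups" concrete, the sketch below builds a very small relational layout in an in-memory SQLite database. It is one possible layout under stated assumptions, not a required design: the table names (patient, visit, assessment), their columns and the function name build_database are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one patient has many visits, one visit has many assessments.
SCHEMA = """
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    severity   TEXT NOT NULL,  -- stratification factor
    arm        TEXT NOT NULL CHECK (arm IN ('placebo', 'ABCDE'))
);
CREATE TABLE visit (
    visit_id    INTEGER PRIMARY KEY,
    patient_id  INTEGER NOT NULL REFERENCES patient(patient_id),
    visit_month INTEGER NOT NULL
);
CREATE TABLE assessment (
    assessment_id INTEGER PRIMARY KEY,
    visit_id      INTEGER NOT NULL REFERENCES visit(visit_id),
    measure       TEXT NOT NULL,
    value         REAL
);
"""

def build_database():
    """Create the example tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

conn = build_database()
conn.execute("INSERT INTO patient VALUES (1, 'moderate', 'placebo')")
conn.execute("INSERT INTO visit VALUES (1, 1, 0)")
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

In a diagram, each table would be drawn as a group, and each REFERENCES clause as a one-to-many relationship between two groups.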