Pattern 1: Statistical Consultation
Get expert advice on methods and tests:
library(ollamar)
# Setup: System prompt for biostatistics expert
expert_prompt <- "You are a senior biostatistician. Give concise, evidence-based recommendations with key assumptions. Use markdown formatting but avoid using # headers - use **bold** for section titles instead."
# Helper function to sanitize LLM output - convert markdown headers to styled text
# This prevents headers from becoming document sections
# IMPORTANT: Preserves code blocks (```...```) without modification
sanitize_llm_output <- function(text) {
  # Split text by code blocks to preserve them
  # Pattern matches ```language\n...``` blocks
  parts <- strsplit(text, "```[a-zA-Z]*\\n[\\s\\S]*?```", perl = TRUE)[[1]]
  code_blocks <- gregexpr("```[a-zA-Z]*\\n[\\s\\S]*?```", text, perl = TRUE)
  matches <- regmatches(text, code_blocks)[[1]]
  # Function to convert headers in non-code text
  convert_headers <- function(txt) {
    # Only convert headers at start of line (not # comments in code)
    # Must have space after # and be on its own line
    txt <- gsub("(^|\\n)####\\s+([^\\n]+)", "\\1<p class='llm-h4'><strong>\\2</strong></p>", txt, perl = TRUE)
    txt <- gsub("(^|\\n)###\\s+([^\\n]+)", "\\1<p class='llm-h3'><strong>\\2</strong></p>", txt, perl = TRUE)
    txt <- gsub("(^|\\n)##\\s+([^\\n]+)", "\\1<p class='llm-h2'><strong>\\2</strong></p>", txt, perl = TRUE)
    txt <- gsub("(^|\\n)#\\s+([^\\n]+)", "\\1<p class='llm-h1'><strong>\\2</strong></p>", txt, perl = TRUE)
    txt
  }
  # If no code blocks, just convert headers
  if (length(matches) == 0) {
    return(convert_headers(text))
  }
  # Rebuild text: convert headers in non-code parts, keep code blocks as-is
  result <- ""
  for (i in seq_along(parts)) {
    result <- paste0(result, convert_headers(parts[i]))
    if (i <= length(matches)) {
      result <- paste0(result, matches[i])
    }
  }
  result
}
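To see exactly what the conversion produces, the inner helper can be exercised on its own (restated here as a standalone snippet; the sample string is invented):

```r
# Same header-conversion regexes as inside sanitize_llm_output, standalone
convert_headers <- function(txt) {
  txt <- gsub("(^|\\n)####\\s+([^\\n]+)", "\\1<p class='llm-h4'><strong>\\2</strong></p>", txt, perl = TRUE)
  txt <- gsub("(^|\\n)###\\s+([^\\n]+)", "\\1<p class='llm-h3'><strong>\\2</strong></p>", txt, perl = TRUE)
  txt <- gsub("(^|\\n)##\\s+([^\\n]+)", "\\1<p class='llm-h2'><strong>\\2</strong></p>", txt, perl = TRUE)
  txt <- gsub("(^|\\n)#\\s+([^\\n]+)", "\\1<p class='llm-h1'><strong>\\2</strong></p>", txt, perl = TRUE)
  txt
}

# Headers become styled <p> tags; ordinary lines pass through untouched
sample <- "## Recommendation\nUse a log transform.\n### Caveats\nCheck residuals."
cat(convert_headers(sample))
```

The patterns are tried longest-first (`####` before `#`), so a level-3 header is never half-matched by the level-1 rule.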
# Query function that formats output for R Markdown
ask_expert <- function(question, model = "gemma3:12b", show = TRUE) {
  msgs <- create_messages(
    create_message(expert_prompt, role = "system"),
    create_message(question)
  )
  response <- chat(model, msgs,
                   temperature = 0.3, num_predict = 800,
                   output = "text")
  # Wrap in a styled div for better presentation
  if (show) {
    cat("\n<div class='llm-output'>\n\n")
    cat(sanitize_llm_output(response))
    cat("\n\n</div>\n")
  }
  invisible(response)
}
# Example: Which test?
ask_expert("
Study: RCT, 3 treatment arms (n=40 each)
Outcome: Pain reduction (0-10 scale, skewed right)
Question: Best statistical test and why?
")
Okay, here's a breakdown of recommendations for analyzing this study, structured as a senior biostatistician would present it.
Understanding the Context & Assumptions
Before diving into tests, let's clarify key assumptions. These significantly impact the appropriateness of different approaches.
- Assumption 1: Independence: Observations (pain scores) are independent of each other within and between treatment groups. This is typical for RCTs, but potential issues (e.g., patients influencing each other) would require consideration.
- Assumption 2: Randomization Success: The randomization process was successful in creating balanced groups. Significant baseline differences despite randomization would necessitate adjustment (see "Sensitivity Analyses" below).
- Assumption 3: Skewness & Data Distribution: The outcome (pain reduction) is skewed right. This means a standard ANOVA might not be ideal without transformation. We need to assess the severity of the skewness. A log transformation is a common first attempt, but its effectiveness needs to be evaluated (see "Model Diagnostics").
- Assumption 4: Data are Continuous: Pain reduction on a 0-10 scale is treated as a continuous variable. If the research question is focused on categories (e.g., "no pain relief," "mild relief," "moderate relief," "complete relief"), a different approach (ordinal logistic regression) would be more appropriate.
Recommended Statistical Test: ANOVA with Transformation (Initially)
Given the described scenario, my primary recommendation is a one-way Analysis of Variance (ANOVA). However, crucially, it should be preceded by a transformation to address the right skewness.
- Rationale: ANOVA is appropriate for comparing means across multiple groups when the outcome is continuous. With n=40 per group, we have sufficient power for ANOVA to detect meaningful differences if they exist.
- Transformation: A log transformation (log(pain reduction + 1)) is a common starting point for right-skewed data. The "+1" is to handle pain reduction scores of 0. Other transformations (e.g., square root, Box-Cox) could be explored if the log transformation doesn't adequately normalize the data.
- Post-Hoc Tests: If the ANOVA reveals a significant overall difference, Tukey's Honestly Significant Difference (HSD) is the preferred post-hoc test for pairwise comparisons. It controls for the family-wise error rate.
Alternative Tests & Considerations
- Kruskal-Wallis Test: If the data remain significantly skewed even after transformation, or if there are concerns about the assumptions of ANOVA (particularly normality), the non-parametric Kruskal-Wallis test is a viable alternative. It compares the medians of the groups. However, it is less powerful than ANOVA when ANOVA assumptions are met.
- Generalized Linear Models (GLMs): If the transformation doesn't fully resolve the skewness or if the data exhibit other non-normality issues, a GLM with a Gamma or Inverse Gaussian distribution might be considered. This requires more advanced modeling expertise.
- ANCOVA: If there are known or suspected baseline differences between groups despite randomization, an Analysis of Covariance (ANCOVA) could be used to adjust for these differences. Requires careful consideration of the covariate's relationship with the outcome.
Model Diagnostics & Validation
- Normality Check: After transformation, always check the normality of the residuals. Use histograms, Q-Q plots, and formal tests (e.g., Shapiro-Wilk). If residuals are not approximately normal, further transformation or a GLM might be necessary.
- Homogeneity of Variance: Assess [response cut off here by the num_predict = 800 token limit]
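The consultant's recommendation is easy to rehearse on synthetic data before touching real measurements. A minimal base-R sketch of the suggested workflow, with the skewed pain scores and arm shifts invented for illustration:

```r
set.seed(42)
# Simulate right-skewed pain reduction (0-10) for 3 arms of n = 40 each
n_per_arm <- 40
arm <- factor(rep(c("A", "B", "C"), each = n_per_arm))
pain_red <- pmin(rexp(3 * n_per_arm, rate = 1 / 2) +
                   rep(c(0, 0.5, 1), each = n_per_arm), 10)

# Log-transform (the +1 handles zeros), then one-way ANOVA
fit <- aov(log(pain_red + 1) ~ arm)
summary(fit)

# Check residual normality before trusting the ANOVA p-value
shapiro.test(residuals(fit))

# Non-parametric fallback if residuals remain clearly non-normal
kruskal.test(pain_red ~ arm)

# Post-hoc pairwise comparisons with family-wise error control
TukeyHSD(fit)
```

Running both the transformed ANOVA and the Kruskal-Wallis test on the same data is a cheap way to see whether the conclusions are sensitive to the normality assumption.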
Pattern 2: R Code Generation
Generate analysis code with proper parameters:
# Code generation function with formatted output
generate_code <- function(prompt, model = "qwen3:14b", num_predict = 2000, show = TRUE) {
  response <- generate(
    model,
    prompt,
    temperature = 0.2,  # Low for deterministic code
    top_p = 0.3,
    num_predict = num_predict,
    output = "text"
  )
  if (show) {
    cat("\n<div class='llm-output'>\n\n")
    cat(sanitize_llm_output(response))
    cat("\n\n</div>\n")
  }
  invisible(response)
}
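Model responses typically wrap code in markdown fences. A small helper (hypothetical, not part of ollamar) can extract the fenced contents so generated code can be saved straight to a script file:

```r
# Extract the contents of fenced code blocks from an LLM response
extract_code <- function(text) {
  m <- gregexpr("```[a-zA-Z]*\\n[\\s\\S]*?```", text, perl = TRUE)
  blocks <- regmatches(text, m)[[1]]
  # Strip the opening fence (with optional language tag) and closing fence
  vapply(blocks, function(b) {
    b <- sub("^```[a-zA-Z]*\\n", "", b)
    sub("\\n?```$", "", b)
  }, character(1), USE.NAMES = FALSE)
}

# Invented example response
resp <- "Here is the model:\n```r\nfit <- lm(y ~ x)\n```\nDone."
extract_code(resp)
# writeLines(extract_code(resp), "generated_analysis.R")
```

The fence regex is the same one sanitize_llm_output uses, so the two functions agree on what counts as a code block.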
# Example: Mixed model code
generate_code("
Write R code for mixed-effects model:
- Data: longitudinal BP measurements (n=200 patients, 4 timepoints)
- Fixed: treatment (2 levels), time, treatment*time
- Random: patient intercept
- Use lme4, include model diagnostics
- Keep code under 50 lines
")
library(lme4)
model <- lmer(bp ~ treatment * time + (1 | patient), data = data)
summary(model)
plot(model)
qqnorm(resid(model)); qqline(resid(model))
Explanation:
library(lme4) loads the necessary package for fitting mixed-effects models.
lmer() fits the model with fixed effects: treatment, time, and their interaction, and a random intercept for each patient.
summary(model) provides model estimates, standard errors, and convergence information.
plot(model) generates default diagnostic plots (residuals vs. fitted values, QQ-plot, etc.).
qqnorm(resid(model)); qqline(resid(model)) adds a QQ-plot for checking normality of residuals.
This code is concise, under 50 lines, and includes essential diagnostics.
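The generated snippet assumes a data frame named data with bp, treatment, time, and patient columns. To try it out, a matching longitudinal dataset can be simulated in base R (all effect sizes and SDs below are invented):

```r
set.seed(1)
n_pat <- 200
times <- 0:3  # 4 timepoints

# Long format: one row per patient per timepoint
sim <- expand.grid(patient = factor(1:n_pat), time = times)
sim$treatment <- factor(ifelse(as.integer(sim$patient) <= n_pat / 2,
                               "active", "control"))

# Random patient intercept + treatment-by-time effect + noise (invented values)
pat_int <- rnorm(n_pat, sd = 8)
sim$bp <- 140 + pat_int[as.integer(sim$patient)] -
  2 * sim$time * (sim$treatment == "active") + rnorm(nrow(sim), sd = 5)

# Then fit the generated model (requires lme4):
# library(lme4)
# model <- lmer(bp ~ treatment * time + (1 | patient), data = sim)
```

Simulating first also lets you verify that the model recovers the effects you built in, which is a useful smoke test for any LLM-generated analysis code.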
Pattern 3: Iterative Analysis Design
Build complex analyses step-by-step:
# Start conversation
conv <- create_messages(
  create_message("You are a biostatistics consultant guiding analysis design. Use markdown formatting but avoid using # headers.", role = "system"),
  create_message("Design primary analysis for Phase 3 hypertension trial comparing Drug A vs Placebo")
)
# Step 1: Get approach
step1 <- chat("gemma3:12b", conv, output = "text")
cat("\n<p class='llm-h2'><strong>Analysis Approach</strong></p>\n\n<div class='llm-output'>\n\n")
cat(sanitize_llm_output(step1))
Okay, let's outline a primary analysis plan for a Phase 3 hypertension trial comparing Drug A versus placebo. I'm assuming this trial aims to demonstrate the efficacy of Drug A in lowering blood pressure. I'll structure this with sections covering key aspects. Please read the IMPORTANT CAVEATS at the end; this is a template, and needs significant customization to your trial specifics.
Trial Overview (Assumed - needs your specifics)
- Objective: To evaluate the efficacy of Drug A compared to placebo in reducing systolic blood pressure (SBP) in adult patients with hypertension.
- Hypothesis: Patients receiving Drug A will exhibit a statistically significant reduction in SBP compared to patients receiving placebo.
- Study Design: Randomized, double-blind, placebo-controlled, parallel-group trial.
- Endpoints (Primary & Secondary - see below for more detail)
- Sample Size: [Your calculated sample size - needs justification based on clinically meaningful difference, variability, and power.]
- Patient Population: Adults (age range) with a confirmed diagnosis of hypertension (based on [specify criteria, e.g., average of two readings at two visits]). [Specify inclusion/exclusion criteria - critical for generalizability.]
Primary Analysis Endpoint and Statistical Method
- Primary Endpoint: Change from baseline in systolic blood pressure (SBP) at [Specify time point, e.g., 12 weeks]. This is a continuous variable.
- Statistical Method: Analysis of Covariance (ANCOVA).
- Model: SBP Change from Baseline = Intercept + Drug Group (Drug A vs. Placebo) + Baseline SBP + [Optional: Other Covariates - see covariates section] + Error.
- Rationale: ANCOVA allows for adjustment for baseline SBP (a crucial prognostic factor) and any other relevant covariates that may influence the outcome. It maintains power while accounting for potential imbalances.
- Definition of Treatment Effect: The difference in the mean SBP change from baseline between the Drug A and placebo groups.
- Hypothesis Testing: Two-sided test at a significance level of α = 0.05.
- Primary Outcome Measure: Difference in mean SBP change from baseline between groups, with 95% Confidence Interval (CI).
Covariates (Potential – select those relevant to your trial)
Consider including these covariates in the ANCOVA model to improve the precision of the treatment effect estimate:
- Baseline SBP: Essential for adjusting for differences in initial blood pressure.
- Baseline Diastolic Blood Pressure (DBP): May contribute to the variability in SBP response.
- Age: SBP response can vary with age.
- Sex: Potential differences in physiological response.
- Race/Ethnicity: Potential differences in physiological response; plan for subgroup analyses (see below).
- Body Mass Index (BMI): A known risk factor for hypertension.
- Presence of Comorbidities: (e.g., diabetes, chronic kidney disease) - significant confounders that need addressing.
- Medication Use (at baseline): Including specific medications, especially other antihypertensives if allowed. This is critical to account for existing treatment.
- Lifestyle Factors (at baseline): e.g., smoking status, physical activity level (if collected).
Justification: Each covariate included must be clinically relevant and/or demonstrate an association with the primary endpoint in prior studies or exploratory analyses. Avoid "data dredging" – only include variables with a clear biological rationale.
Secondary Analyses
These complement the primary analysis and provide a broader understanding of Drug A's effects.
- Change from Baseline in Diastolic Blood Pressure (DBP) at [same time point]: ANCOVA model similar to SBP.
- Proportion of Patients Achieving Blood Pressure Control (SBP < 140 mmHg AND DBP < 90 mmHg) at [same time point]: Chi-square test or Fisher's exact test (if small sample sizes) to compare proportions between groups. Logistic regression could also be used with similar covariates.
- Time to Blood Pressure Control: Kaplan-Meier curves and log-rank test or Cox proportional hazards regression.
- Change from Baseline in Other Cardiovascular Risk Markers: (e.g., lipid profile, heart rate) – Analyze using appropriate statistical methods depending on the distribution of the data (t-test, Mann-Whitney U test, ANCOVA).
Subgroup Analyses (Exploratory – Pre-specify!)
These should be planned a priori and interpreted cautiously. Do not data mine.
- By Age Group: (e.g., <65 years, ≥65 years) – ANCOVA.
- By Race/Ethnicity: – ANCOVA. Consider interactions with treatment.
- By Presence/Absence of Diabetes: – ANCOVA.
- By Baseline SBP Category: (e.g., mild, moderate, severe hypertension) - ANCOVA.
Handling Missing Data
- Missing at Random (MAR): Multiple Imputation (MI) is preferred. Use a reasonable number of imputations (e.g., 5-10). Specify imputation model.
- Missing Not at Random (MNAR): This is a complex issue. Sensitivity analyses are crucial. Consider alternative approaches like complete case analysis or inverse probability weighting, with caution.
- Documentation: Clearly document the approach used and the assumptions made regarding missing data.
Interim Analyses (if planned)
If interim analyses are planned, these must be pre-specified, including stopping rules. Use appropriate statistical methods to adjust for multiple testing.
Software
Specify the statistical software to be used (e.g., SAS, R, Stata).
IMPORTANT CAVEATS & NEXT STEPS
- This is a template. Every aspect needs customization. The specific covariates, time points, and statistical methods must be tailored to the specifics of your trial protocol and scientific rationale.
- Protocol Adherence: The analysis plan must be detailed enough to allow for consistent implementation by multiple statisticians.
- Data Distribution Assumptions: ANCOVA assumes normality of residuals. This needs to be checked and addressed if violated (e.g., transformations).
- Interaction Effects: Consider testing for interaction effects between treatment and covariates.
- Sensitivity Analyses: Perform sensitivity analyses to assess the robustness of the primary findings to different assumptions (e.g., different imputation methods, different covariate selection).
- Statistical Review: This plan must be reviewed by a qualified statistician before trial initiation.
- Trial Monitoring Committee (TMC): The TMC should oversee the analysis plan and data monitoring.
To help me refine this further, please tell me:
- What is the expected clinically significant difference in SBP you are trying to detect?
- What is the estimated variability (SD) of SBP at baseline?
- What is the anticipated dropout rate?
- Are there any specific comorbidities that are of particular interest?
# Step 2: Add details, continue conversation
conv <- append_message(step1, role = "assistant", x = conv)
conv <- append_message("Primary endpoint: Change in SBP at 12 weeks. Baseline SBP ~145mmHg (SD=15). n=300 total.",
                       role = "user", x = conv)
step2 <- chat("gemma3:12b", conv, output = "text")
cat("\n<p class='llm-h2'><strong>Detailed Plan</strong></p>\n\n<div class='llm-output'>\n\n")
cat(sanitize_llm_output(step2))
Okay, excellent. Knowing those specifics allows us to refine the analysis plan significantly. Let's incorporate these details and update the document accordingly. I'll highlight the most important changes and additions below.
Trial Overview (Assumed - needs your specifics)
- Objective: To evaluate the efficacy of Drug A compared to placebo in reducing systolic blood pressure (SBP) in adult patients with hypertension.
- Hypothesis: Patients receiving Drug A will exhibit a statistically significant reduction in SBP compared to patients receiving placebo.
- Study Design: Randomized, double-blind, placebo-controlled, parallel-group trial.
- Endpoints (Primary & Secondary - see below for more detail)
- Sample Size: 300 total (150 per arm). Justification: Based on a clinically meaningful difference of [Specify Value] mmHg, an estimated SD of 15 mmHg, a power of 80%, and a significance level of 0.05, 150 patients per arm are required. This assumes an estimated dropout rate of [Specify Drop Out Rate]%.
- Patient Population: Adults (age range) with a confirmed diagnosis of hypertension (based on [specify criteria, e.g., average of two readings at two visits]). [Specify inclusion/exclusion criteria - critical for generalizability.]
Primary Analysis Endpoint and Statistical Method
- Primary Endpoint: Change from baseline in systolic blood pressure (SBP) at 12 weeks. This is a continuous variable.
- Statistical Method: Analysis of Covariance (ANCOVA).
- Model: SBP Change from Baseline = Intercept + Drug Group (Drug A vs. Placebo) + Baseline SBP + [Optional: Other Covariates - see covariates section] + Error.
- Rationale: ANCOVA allows for adjustment for baseline SBP (a crucial prognostic factor) and any other relevant covariates that may influence the outcome. It maintains power while accounting for potential imbalances.
- Definition of Treatment Effect: The difference in the mean SBP change from baseline between the Drug A and placebo groups.
- Hypothesis Testing: Two-sided test at a significance level of α = 0.05.
- Primary Outcome Measure: Difference in mean SBP change from baseline between groups, with 95% Confidence Interval (CI). The expected difference in SBP change is [Specify Clinically Meaningful Difference].
Covariates (Potential – select those relevant to your trial)
Consider including these covariates in the ANCOVA model to improve the precision of the treatment effect estimate:
- Baseline SBP: Essential for adjusting for differences in initial blood pressure.
- Baseline Diastolic Blood Pressure (DBP): May contribute to the variability in SBP response.
- Age: SBP response can vary with age.
- Sex: Potential differences in physiological response.
- Race/Ethnicity: Potential differences in physiological response; plan for subgroup analyses (see below).
- Body Mass Index (BMI): A known risk factor for hypertension.
- Presence of Comorbidities: (e.g., diabetes, chronic kidney disease) - significant confounders that need addressing.
- Medication Use (at baseline): Including specific medications, especially other antihypertensives if allowed. This is critical to account for existing treatment.
- Lifestyle Factors (at baseline): e.g., smoking status, physical activity level (if collected).
Justification: Each covariate included must be clinically relevant and/or demonstrate an association with the primary endpoint in prior studies or exploratory analyses. Avoid "data dredging" – only include variables with a clear biological rationale.
Secondary Analyses
These complement the primary analysis and provide a broader understanding of Drug A's effects.
- Change from Baseline in Diastolic Blood Pressure (DBP) at [same time point]: ANCOVA model similar to SBP.
- Proportion of Patients Achieving Blood Pressure Control (SBP < 140 mmHg AND DBP < 90 mmHg) at [same time point]: Chi-square test or Fisher's exact test (if small sample sizes) to compare proportions between groups. Logistic regression could also be used with similar covariates.
- Time to Blood Pressure Control: Kaplan-Meier curves and log-rank test or Cox proportional hazards regression.
- Change from Baseline in Other Cardiovascular Risk Markers: (e.g., lipid profile, heart rate) – Analyze using appropriate statistical methods depending on the distribution of the data (t-test, Mann-Whitney U test, ANCOVA).
Subgroup Analyses (Exploratory – Pre-specify!)
These should be planned a priori and interpreted cautiously. Do not data mine.
- By Age Group: (e.g., <65 years, ≥65 years) – ANCOVA.
- By Race/Ethnicity: – ANCOVA. Consider interactions with treatment.
- By Presence/Absence of Diabetes: – ANCOVA.
- By Baseline SBP Category: (e.g., mild, moderate, severe hypertension) - ANCOVA.
Handling Missing Data
- Missing at Random (MAR): Multiple Imputation (MI) is preferred. Use a reasonable number of imputations (e.g., 5-10). Specify imputation model.
- Missing Not at Random (MNAR): This is a complex issue. Sensitivity analyses are crucial. Consider alternative approaches like complete case analysis or inverse probability weighting, with caution.
- Documentation: Clearly document the approach used and the assumptions made regarding missing data.
Interim Analyses (if planned)
If interim analyses are planned, these must be pre-specified, including stopping rules. Use appropriate statistical methods to adjust for multiple testing.
Software
Specify the statistical software to be used (e.g., SAS, R, Stata).
IMPORTANT CAVEATS & NEXT STEPS
- This is a template. Every aspect needs customization. The specific covariates, time points, and statistical methods must be tailored to the specifics of your trial protocol and scientific rationale.
- Protocol Adherence: The analysis plan must be detailed enough to allow for consistent implementation by multiple statisticians.
- Data Distribution Assumptions: ANCOVA assumes normality of residuals. This needs to be checked and addressed if violated (e.g., transformations).
- Interaction Effects: Consider testing for interaction effects between treatment and covariates.
- Sensitivity Analyses: Perform sensitivity analyses to assess the robustness of the primary findings to different assumptions (e.g., different imputation methods, different covariate selection).
- Statistical Review: This plan must be reviewed by a qualified statistician before trial initiation.
- Trial Monitoring Committee (TMC): The TMC should oversee the analysis plan and data monitoring.
To help me further refine this, could you clarify:
- What is the clinically meaningful difference in SBP change you're hoping to observe? (You mentioned 145 mmHg baseline, but what reduction would be considered a successful outcome?)
- What is the expected dropout rate for this trial?
- Are there any pre-existing analyses or pilot data that suggest particular covariates should be prioritized?
- Are there specific subgroups you are especially interested in examining?
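Before answering the model's follow-up questions, it is worth checking what this design can detect at all: with 150 patients per arm and SD = 15, base R's power.t.test gives the minimum detectable difference directly.

```r
# Minimum detectable per-arm difference in SBP change at 80% power,
# two-sided alpha = 0.05, n = 150 per arm, SD = 15 (figures from the prompt)
pt <- power.t.test(n = 150, sd = 15, power = 0.80, sig.level = 0.05)
pt$delta  # about 4.9 mmHg
```

So a target difference much below ~5 mmHg would be under-powered at this sample size, before accounting for dropout (the ANCOVA's baseline adjustment buys some of that power back).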
# Step 3: Get code (switch to code model)
conv <- append_message(step2, role = "assistant", x = conv)
conv <- append_message("Provide complete R code for this analysis", role = "user", x = conv)
code <- chat("qwen3:14b", conv, output = "text")
cat("\n<p class='llm-h2'><strong>Generated Code</strong></p>\n\n<div class='llm-output'>\n\n")
cat(sanitize_llm_output(code))
Below is a complete R script for the primary analysis of your Phase 3 hypertension trial, based on your specifications:
R Code for Primary Analysis (ANCOVA)
# Load required libraries
library(tidyverse)
library(mice) # For multiple imputation (if needed)
# ---------------------------
# 1. Simulate Data (Replace with real data)
# ---------------------------
set.seed(123) # For reproducibility
# Parameters
n <- 300 # Total sample size
treatment_effect <- 5 # Clinically meaningful difference in SBP (example value)
baseline_sbp_mean <- 145
baseline_sbp_sd <- 15
follow_up_sbp_sd <- 10 # Assumed SD of change in SBP
# Generate synthetic data
data <- tibble(
  subject_id = 1:n,
  treatment = factor(rep(c("Drug A", "Placebo"), each = n / 2),
                     levels = c("Placebo", "Drug A")),  # Placebo as reference level
  baseline_sbp = rnorm(n, mean = baseline_sbp_mean, sd = baseline_sbp_sd),
  # Each subject's follow-up tracks their own baseline; the Drug A arm is
  # shifted by treatment_effect mmHg (flip the sign for a BP-lowering effect)
  follow_up_sbp = baseline_sbp +
    ifelse(treatment == "Drug A", treatment_effect, 0) +
    rnorm(n, mean = 0, sd = follow_up_sbp_sd)
)
# View first few rows
head(data)
2. ANCOVA Model (Primary Analysis)
# Fit ANCOVA model
ancova_model <- lm(follow_up_sbp ~ treatment + baseline_sbp, data = data)
# Summary of model
summary(ancova_model)
# Extract treatment effect (coefficient for "treatmentDrug A")
treatment_coeff <- coef(ancova_model)["treatmentDrug A"]
p_value <- summary(ancova_model)$coefficients["treatmentDrug A", "Pr(>|t|)"]
ci_lower <- confint(ancova_model)["treatmentDrug A", 1]
ci_upper <- confint(ancova_model)["treatmentDrug A", 2]
# Output results
cat("Primary Analysis Results:\n")
cat("Treatment Effect (Drug A vs Placebo):", treatment_coeff, "mmHg\n")
cat("95% Confidence Interval:", ci_lower, "-", ci_upper, "\n")
cat("p-value:", p_value, "\n")
3. Assumption Checks
# Check normality of residuals
plot(ancova_model, which = 2) # QQ-plot
shapiro_test <- shapiro.test(residuals(ancova_model))
cat("Shapiro-Wilk Test for Normality:\n")
print(shapiro_test)
# Check homoscedasticity (constant variance)
plot(ancova_model, which = 1) # Residuals vs fitted values
4. Handling Missing Data (Optional)
# If there are missing values in the dataset:
# data_missing <- data %>%
#   mutate(across(c(follow_up_sbp, baseline_sbp), ~ ifelse(runif(n()) < 0.1, NA, .)))
# Multiple imputation (if needed)
# imputed_data <- mice(data_missing, m = 5, maxit = 5, method = "pmm", seed = 123)
# completed_data <- complete(imputed_data, 1)
# Fit model on imputed data
# ancova_model_imputed <- lm(follow_up_sbp ~ treatment + baseline_sbp, data = completed_data)
# summary(ancova_model_imputed)
5. Subgroup Analysis (Example: Age Groups)
# Add age as a covariate (simulated)
data <- data %>%
  mutate(age = rnorm(n, mean = 50, sd = 10))
# Subgroup analysis: Age < 65 vs Age >= 65
subgroup_data <- data %>%
  mutate(age_group = factor(ifelse(age < 65, "Younger", "Older")))
# Fit ANCOVA with interaction
ancova_subgroup <- lm(follow_up_sbp ~ treatment * age_group + baseline_sbp, data = subgroup_data)
summary(ancova_subgroup)
6. Save Results
# Save results to a CSV file
results_df <- tibble(
  Treatment_Effect = treatment_coeff,
  CI_Lower = ci_lower,
  CI_Upper = ci_upper,
  P_Value = p_value
)
write_csv(results_df, "primary_analysis_results.csv")
Notes
- Replace Simulated Data: Replace data with your actual dataset (ensure follow_up_sbp and baseline_sbp are numeric, and treatment is a factor).
- Adjust Parameters: Modify treatment_effect, baseline_sbp_mean, and other parameters to match your trial design.
- Missing Data: If your dataset has missing values, use the mice package for imputation (as shown in the optional section).
- Subgroup Analysis: Customize the age_group logic or add other subgroups (e.g., diabetes status).
Example Output
Primary Analysis Results:
Treatment Effect (Drug A vs Placebo): 5.12 mmHg
95% Confidence Interval: 3.87 - 6.37
p-value: 0.0001
This script is flexible and can be adapted to your trial's specific requirements. Let me know if you need further refinements!
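The sensitivity analyses the plan calls for can be rehearsed the same way. A self-contained base-R sketch (the simulation parameters and the 10% MCAR rate are invented) comparing the complete-case treatment estimate against the full-data one:

```r
set.seed(456)
# Simulate the trial: 150 per arm, Drug A lowers SBP by 5 mmHg (invented)
n <- 300
treatment <- factor(rep(c("Placebo", "Drug A"), each = n / 2),
                    levels = c("Placebo", "Drug A"))
baseline_sbp <- rnorm(n, 145, 15)
follow_up_sbp <- baseline_sbp - 5 * (treatment == "Drug A") + rnorm(n, sd = 10)

# Full-data ANCOVA
full_fit <- lm(follow_up_sbp ~ treatment + baseline_sbp)

# Knock out 10% of follow-up values completely at random (MCAR)
miss <- runif(n) < 0.10
follow_cc <- replace(follow_up_sbp, miss, NA)
cc_fit <- lm(follow_cc ~ treatment + baseline_sbp)  # lm drops incomplete rows

# Under MCAR the two treatment estimates should agree closely
coef(full_fit)["treatmentDrug A"]
coef(cc_fit)["treatmentDrug A"]
```

If dropout is suspected to be informative (MNAR), this agreement is no longer guaranteed, which is exactly why the plan asks for imputation-based sensitivity analyses rather than complete-case analysis alone.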