Free and Open-Source Meta-Analysis (FOSMA)

A browser-based meta-analysis tool. No installation, no server, no dependencies beyond a modern browser. All computation runs locally in JavaScript — data never leaves the machine.

Effect types

Category	Measures
Continuous — two groups	Mean Difference (MD), Hedges' g (SMD), SMD heteroscedastic (SMDH), Ratio of Means (ROM)
Continuous — paired	Mean Difference Paired (MD_paired), Standardised Mean Change / pre-SD (SMD_paired), Standardised Mean Change / change-score SD (SMCC)
Continuous — single group	One-sample SMD (SMD1), One-sample SMD heteroscedastic (SMD1H), Mean raw (MN), Mean log (MNLN)
Variability	Coefficient of Variation Ratio (CVR), Variability Ratio (VR)
Binary outcomes	Odds Ratio (OR), Risk Ratio (RR), Risk Difference (RD), Arcsine-transformed Risk Difference (AS), Yule's Q (YUQ), Yule's Y (YUY), Generalised Odds Ratio — ordinal (GOR)
Correlations	Pearson r (COR), Bias-corrected r (UCOR), Fisher's z (ZCOR), Partial r (PCOR), Partial Fisher's z (ZPCOR), Point-biserial (RPB), Biserial (RBIS), R² (R2), Fisher-z R² (ZR2), Phi (PHI), Tetrachoric (RTET)
Proportions	Raw (PR), Log (PLN), Logit (PLO), Arcsine (PAS), Freeman-Tukey double arcsine (PFT)
Time-to-event / Rates	Hazard Ratio (HR), Incidence Rate Ratio (IRR), Incidence Rate Difference (IRD), Incidence Rate Difference sqrt (IRSD), Incidence Rate log (IR)
Reliability	Cronbach's α raw (ARAW), log-transformed (ABT), cube-root-transformed (AHW)
Generic	Pre-computed y_i / v_i (GENERIC)

Heterogeneity

τ² estimators: REML (default), DerSimonian–Laird (DL), Paule–Mandel (PM), Empirical Bayes (EB), Paule–Mandel Median (PMM), Generalised Q Median (GENQM), Maximum Likelihood, Hunter–Schmidt, Hedges, Sidik–Jonkman, Generalised Q (GENQ), Iterated DL (DLIT), Hunter–Schmidt corrected (HSk), Square-root GENQ (SQGENQ), EBLUP (= REML) — 15 options
Pooling methods: Inverse-variance (default; RE and FE); Mantel–Haenszel (OR, RR, RD) and Peto one-step (OR only) — fixed-effects pooling that operates directly on cell counts and handles single-zero cells without a continuity correction
Common language effect size (CLES) — displayed alongside the RE pooled estimate for standardised mean difference types (SMD, SMDH, SMD_paired, SMD1, SMD1H, SMCC); CLES = Φ(d / √2), the probability that a randomly drawn score from group 1 exceeds a randomly drawn score from group 2; the 95% CI is obtained by transforming the limits of the RE CI (McGraw & Wong, 1992)
CI methods: Normal/Wald, Knapp–Hartung, t-distribution, Profile Likelihood (REML or ML)
Statistics: Cochran's Q, I², H², τ², 95% prediction interval (Higgins et al., 2009)
Confidence intervals on heterogeneity: Profile-likelihood CIs for τ², I², H²
Profile likelihood plot for τ²: full likelihood surface with LRT-based 95% CI; available when τ² estimator is ML or REML
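
As a rough illustration of how the statistics listed above relate to one another, the sketch below computes Cochran's Q, the DerSimonian-Laird τ², I², H², and the inverse-variance RE pooled estimate. The function name `dlSummary` and its return shape are hypothetical and are not FOSMA's internal API. I² is computed here as (Q − df) / Q, which is an assumption about the definition in use.

```javascript
// Minimal sketch (hypothetical helper, not FOSMA's internal code).
// y: effect sizes, v: sampling variances (same length, k >= 2).
function dlSummary(y, v) {
  const k = y.length;
  const df = k - 1;
  const w = v.map((vi) => 1 / vi); // fixed-effect (inverse-variance) weights
  const sumW = w.reduce((a, b) => a + b, 0);
  const muFE = w.reduce((a, wi, i) => a + wi * y[i], 0) / sumW;

  // Cochran's Q: weighted squared deviations from the FE estimate
  const Q = w.reduce((a, wi, i) => a + wi * (y[i] - muFE) * (y[i] - muFE), 0);

  // DerSimonian-Laird moment estimator, truncated at zero
  const c = sumW - w.reduce((a, wi) => a + wi * wi, 0) / sumW;
  const tau2 = Math.max(0, (Q - df) / c);

  // Assumed definitions: I2 = (Q - df) / Q as a percentage, H2 = Q / df
  const I2 = Q > 0 ? Math.max(0, (Q - df) / Q) * 100 : 0;
  const H2 = Q / df;

  // Random-effects weights add tau2 to every sampling variance
  const wRE = v.map((vi) => 1 / (vi + tau2));
  const sumWRE = wRE.reduce((a, b) => a + b, 0);
  const muRE = wRE.reduce((a, wi, i) => a + wi * y[i], 0) / sumWRE;
  const seRE = Math.sqrt(1 / sumWRE);

  return { Q, df, tau2, I2, H2, muFE, muRE, seRE };
}

const summary = dlSummary([0, 0.3, 0.6], [0.04, 0.04, 0.04]);
console.log(summary);
```

The other estimators in the list (REML, PM, and so on) differ only in how τ² is obtained; the RE weighting step is the same once τ² is known.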

Publication bias

Egger's regression — intercept test for funnel plot asymmetry
Begg's rank correlation — Kendall's τ_b with tie correction
FAT-PET / PET-PEESE — funnel asymmetry test and precision-effect test; when FAT detects bias (p < .10) the two-stage PET-PEESE correction is applied and the PEESE intercept is highlighted as the corrected effect estimate; the PEESE regression line is overlaid on the contour-enhanced funnel plot (Stanley & Doucouliagos, 2014)
Harbord's test — score-based Egger variant for binary OR studies; avoids inflated Type I error when effect size and SE share cell-count information (Harbord et al., 2006)
Peters' test — WLS regression on 1/N; works with any effect type where total N is available; preferred over Egger for OR/RR (Peters et al., 2006)
Deeks' test — funnel-asymmetry test for diagnostic accuracy (DOR) studies using effective sample size as the precision surrogate (Deeks et al., 2005)
Rücker's test — arcsine-transformation Egger variant for binary outcomes with better-controlled Type I error (Rücker et al., 2008)
Fail-safe N — Rosenthal and Orwin estimators
Test of Excess Significance (TES) — compares the observed number of significant results (O) against the expected (E = Σ power_i) given per-study power to detect the pooled effect; χ² = (O − E)² / [E(1 − E/k)]; p < .10 flags excess significance (Ioannidis & Trikalinos, 2007)
WAAP-WLS — Weighted Average of Adequately Powered studies; restricts pooling to studies with ≥ 80% power to detect the fixed-effect estimate; if no study qualifies, falls back to the full WLS estimate; a WAAP near zero with a large RE estimate suggests publication bias is inflating the pooled effect (Stanley & Doucouliagos, 2015)
Henmi-Copas bias-robust CI — confidence interval robust to publication bias; always uses DL τ² and fixed-effect weights; centred on the FE estimate with half-width u₀ × SE, where u₀ = SDR × t₀ and t₀ is solved by numerical integration over the conditional distribution of Q given R; wider than the standard RE CI when small-study effects are present (Henmi & Copas, 2010)
Trim-and-fill (L0, R0, Q0 estimators) — imputes missing studies and reports the adjusted pooled estimate; estimator selectable in the UI
Funnel plot — standard or contour-enhanced (α = .10, .05, .01 regions)
Selection model (Vevea–Hedges) — ω-weighted likelihood model; MLE mode (k ≥ 8) estimates selection weights jointly with μ and τ²; fixed-ω sensitivity presets (Mild / Moderate / Severe, Vevea & Woods 2005) available from k ≥ 3
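
To make the Test of Excess Significance formula above concrete, here is a minimal sketch. It assumes a two-sided z-test at α = .05 for per-study power and uses a polynomial approximation to the normal CDF; the helper names (`normCdf`, `studyPower`, `tesChi2`, `tes`) are hypothetical and do not reflect FOSMA's actual code.

```javascript
// Standard normal CDF via the Abramowitz-Stegun 7.1.26 erf approximation
// (absolute error around 1.5e-7), which is adequate for a sketch.
function normCdf(x) {
  const z = Math.abs(x) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * z);
  const poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
    t * (-1.453152027 + t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-z * z);
  return x >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Power of a two-sided z-test to detect `theta` given standard error `se`
// (assumption: alpha = .05, so zCrit = 1.959964).
function studyPower(theta, se, zCrit = 1.959964) {
  const ncp = Math.abs(theta) / se;
  return 1 - normCdf(zCrit - ncp) + normCdf(-zCrit - ncp);
}

// chi2 = (O - E)^2 / [E (1 - E/k)], as given in the text
function tesChi2(O, E, k) {
  return ((O - E) * (O - E)) / (E * (1 - E / k));
}

// y: effect sizes, se: standard errors, theta: pooled effect to power against
function tes(y, se, theta) {
  const k = y.length;
  const O = y.filter((yi, i) => Math.abs(yi / se[i]) > 1.959964).length;
  const E = se.reduce((a, s) => a + studyPower(theta, s), 0);
  return { O, E, k, chi2: tesChi2(O, E, k) };
}

console.log(tes([0.5, 0.1], [0.1, 0.1], 0.3));
```

The resulting χ² is referred to a chi-square distribution with 1 degree of freedom; the p-value step is omitted here.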

P-value analyses

P-curve — distribution of significant p-values; tests for evidential value and right-skew (Simonsohn et al., 2014)
P-uniform* — publication-bias-corrected effect size estimate using the p-value distribution (van Assen et al., 2015)

Sensitivity and influence

Leave-one-out — flags studies whose omission would flip statistical significance
Influence diagnostics — Cook's distance, DFBETA, DFFITS, covariance ratio, hat values, standardised residuals, Δτ²
Influence plot — per-study leverage and Cook's distance visualised as a bubble chart
BLUPs — Empirical Bayes shrunken study estimates with CIs; shows shrinkage toward μ̂ (available when τ² > 0)
Baujat plot — heterogeneity contribution vs. overall influence; identifies studies that simultaneously inflate Q and shift the pooled estimate
Normal Q-Q plot — normal probability plot of standardised residuals; assesses the normality assumption of the RE distribution
Radial (Galbraith) plot — precision (1/seᵢ) vs. standardised effect (yᵢ/seᵢ); regression line through the origin has slope equal to the FE pooled estimate; dashed ±2 band flags outliers; right axis shows the effect-size scale
Estimator comparison — runs all τ² estimators side-by-side on the current dataset
Cumulative meta-analysis — adds studies in user-selected order (input order, precision, or effect size)
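
The leave-one-out flagging rule can be sketched as follows. For brevity this example pools with fixed-effect inverse-variance weights and a Wald z-test at α = .05; both choices are simplifying assumptions, and `poolFE` and `leaveOneOut` are hypothetical names rather than FOSMA functions.

```javascript
// Fixed-effect inverse-variance pooling with a two-sided Wald test
function poolFE(y, v) {
  const w = v.map((vi) => 1 / vi);
  const sumW = w.reduce((a, b) => a + b, 0);
  const est = w.reduce((a, wi, i) => a + wi * y[i], 0) / sumW;
  const se = Math.sqrt(1 / sumW);
  return { est, se, significant: Math.abs(est / se) > 1.959964 };
}

// Refit with each study removed and flag fits whose significance differs
// from that of the full-data fit.
function leaveOneOut(y, v) {
  const full = poolFE(y, v);
  return y.map((unused, omit) => {
    const keep = (value, i) => i !== omit;
    const res = poolFE(y.filter(keep), v.filter(keep));
    return {
      omitted: omit,
      est: res.est,
      se: res.se,
      flips: res.significant !== full.significant,
    };
  });
}

console.log(leaveOneOut([0.5, 0.1, 0.1], [0.01, 0.04, 0.04]));
```

In this toy dataset only the first study is flagged: removing it drops the pooled z from about 4.5 to about 0.7.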

Meta-regression

Continuous and categorical moderators; multiple moderators may be added simultaneously. Results include coefficients, standard errors, z/t statistics, p-values, R² (proportion of heterogeneity explained), and model-fit indices AIC, BIC, and log-likelihood (ML and REML conventions; matches metafor's AIC.rma() / BIC.rma()). Bubble plots are generated per continuous moderator.

When two or more moderators are present, the per-moderator tests panel shows both a Wald QM statistic and a Likelihood Ratio Test (LRT) for each moderator. LRT = 2·(LL_ML,full − LL_ML,reduced) ~ χ²(df); the reduced model omits that moderator's columns. LRT always uses ML estimation internally regardless of the τ² method selected, since REML log-likelihoods cannot be compared across different fixed-effect structures (Verbeke & Molenberghs 2000).

Non-linear transforms for continuous moderators are available via the transform dropdown when adding a moderator:

Linear (default) — standard single-column predictor
Poly ² — adds x² to the design matrix (quadratic curve)
Poly ³ — adds x² and x³ (cubic curve)
RCS (3–5 knots) — restricted cubic spline; knots placed at Harrell's recommended percentiles (10/50/90 for 3 knots, etc.). Produces a smooth curve that is linear outside the outermost knots. The per-moderator Wald test covers all spline columns jointly. Equivalent to constructing the basis manually in metafor and passing it via mods (Harrell, 2015).
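
For readers who want to see what the RCS transform adds to the design matrix, the sketch below builds the restricted cubic spline columns for a single moderator value using the truncated-power form described by Harrell. Knots are passed in directly (percentile selection is not shown), the division by the squared knot range is an assumed normalisation, and `rcsBasis` is a hypothetical name.

```javascript
// Truncated cube: max(u, 0)^3
const cubePlus = (u) => (u > 0 ? u * u * u : 0);

// Returns [x, s_1(x), ..., s_{K-2}(x)] for K >= 3 sorted knots.
// Each s_j is constructed so that its cubic and quadratic terms cancel
// beyond the last knot, leaving the fitted curve linear in the tails.
function rcsBasis(x, knots) {
  const K = knots.length;
  const tLast = knots[K - 1];
  const tPrev = knots[K - 2];
  const range = tLast - knots[0];
  const scale = range * range; // assumed normalisation
  const cols = [x];
  for (let j = 0; j < K - 2; j++) {
    const tj = knots[j];
    const term =
      cubePlus(x - tj) -
      (cubePlus(x - tPrev) * (tLast - tj)) / (tLast - tPrev) +
      (cubePlus(x - tLast) * (tPrev - tj)) / (tLast - tPrev);
    cols.push(term / scale);
  }
  return cols;
}

console.log(rcsBasis(4, [1, 2, 3])); // linear column plus one spline column
```

With K knots the moderator contributes K − 1 columns in total, which is why the per-moderator Wald test is a joint test.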

Multiple comparison correction — when m moderators are tested simultaneously, Bonferroni (p_adj = min(1, m·p)) or Holm (step-down, uniformly more powerful than Bonferroni) correction is applied to the per-moderator omnibus QM p-values. Adjusted p-values appear alongside raw values in the per-moderator tests table. Matches R's p.adjust(method="bonferroni"/"holm").
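
The two corrections can be sketched in a few lines. This is an illustrative implementation under the definitions stated above, with hypothetical function names; it is not taken from FOSMA's source.

```javascript
// Bonferroni: p_adj = min(1, m * p)
function bonferroni(p) {
  const m = p.length;
  return p.map((pi) => Math.min(1, m * pi));
}

// Holm step-down: sort ascending, multiply the i-th smallest p-value by
// (m - i + 1), then enforce monotonicity with a running maximum.
function holm(p) {
  const m = p.length;
  const order = p.map((unused, i) => i).sort((a, b) => p[a] - p[b]);
  const adj = new Array(m);
  let running = 0;
  order.forEach((idx, rank) => {
    running = Math.max(running, Math.min(1, (m - rank) * p[idx]));
    adj[idx] = running; // written back in the original order
  });
  return adj;
}

console.log(bonferroni([0.01, 0.04, 0.03]));
console.log(holm([0.01, 0.04, 0.03]));
```

For the p-values 0.01, 0.04, 0.03 the Bonferroni-adjusted values are 0.03, 0.12, 0.09 and the Holm-adjusted values are 0.03, 0.06, 0.06, so Holm is never larger than Bonferroni.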

Custom contrasts — after running a meta-regression, expand the Custom contrasts panel to test any linear combination of coefficients: H₀: L·β = 0, where L is a weight vector you supply. SE = √(L′VL) using the full variance–covariance matrix. Typical use: set +1 and −1 weights on two categorical levels to directly compare them.
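
A minimal sketch of the contrast computation, assuming the coefficient vector and its variance-covariance matrix are available as plain arrays. The function name `contrast` and the example numbers are illustrative only.

```javascript
// L: weight vector, beta: coefficient estimates, V: variance-covariance
// matrix of beta (array of rows). Returns the estimate L'beta, its
// standard error sqrt(L'VL), and the Wald z statistic.
function contrast(L, beta, V) {
  const est = L.reduce((a, li, i) => a + li * beta[i], 0);
  const variance = L.reduce(
    (a, li, i) => a + li * V[i].reduce((b, vij, j) => b + vij * L[j], 0),
    0
  );
  const se = Math.sqrt(variance);
  return { est, se, z: est / se };
}

// Compare two categorical levels (coefficients 2 and 3), ignoring the intercept
const beta = [1.0, 0.5, 0.2];
const V = [
  [0.04, 0.0, 0.0],
  [0.0, 0.02, 0.005],
  [0.0, 0.005, 0.03],
];
console.log(contrast([0, 1, -1], beta, V));
```

Note that the covariance term matters: here the variance is 0.02 + 0.03 − 2(0.005) = 0.04, not the sum of the two coefficient variances.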

Location-scale model — add scale moderators (log τ² = Zγ) to model heterogeneity simultaneously with mean effects. Each study gets its own τ̂²ᵢ = exp(Zᵢγ̂). Estimated by ML via profile likelihood (BFGS optimizer). Output includes separate coefficient tables for location (β) and scale (γ), a Wald test per moderator, and a likelihood ratio test comparing the full scale model against the intercept-only scale model. Equivalent to rma(..., scale = ~ ..., method = "ML") in metafor (Viechtbauer, 2021).

Subgroup analysis

Studies are assigned to named groups via the Group column. Pooled estimates are reported within each subgroup alongside Q_between, degrees of freedom, and the between-group p-value.
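
The between-group statistic can be illustrated as follows. This sketch assumes fixed-effect pooling within each subgroup, so Q_between is the weighted sum of squared deviations of the subgroup estimates from the overall estimate; whether FOSMA uses fixed-effect or random-effects weights at this step is not stated here, and `subgroupQ` is a hypothetical name.

```javascript
// y: effect sizes, v: sampling variances, group: subgroup label per study
function subgroupQ(y, v, group) {
  const byGroup = new Map();
  y.forEach((yi, i) => {
    const g = byGroup.get(group[i]) || { sumW: 0, sumWY: 0 };
    g.sumW += 1 / v[i];
    g.sumWY += yi / v[i];
    byGroup.set(group[i], g);
  });

  // Overall inverse-variance estimate across all studies
  let totalW = 0;
  let totalWY = 0;
  for (const g of byGroup.values()) {
    totalW += g.sumW;
    totalWY += g.sumWY;
  }
  const overall = totalWY / totalW;

  // Q_between = sum over groups of W_g * (estimate_g - overall)^2
  const estimates = {};
  let qBetween = 0;
  for (const [name, g] of byGroup) {
    const est = g.sumWY / g.sumW;
    estimates[name] = est;
    qBetween += g.sumW * (est - overall) * (est - overall);
  }
  return { estimates, overall, qBetween, df: byGroup.size - 1 };
}

console.log(subgroupQ([0.2, 0.2, 0.6], [0.1, 0.1, 0.05], ["A", "A", "B"]));
```

Q_between is referred to a chi-square distribution with (number of groups − 1) degrees of freedom to obtain the between-group p-value.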

Risk of bias

User-defined RoB domains with Low / Some concerns / High / Not reported ratings per study. Visualised as a per-study traffic light grid and a per-domain summary bar chart.

Bayesian meta-analysis

Conjugate normal-normal random-effects model fit via grid approximation (300 points over τ) — no MCMC, no external libraries. Prior on μ: N(μ₀, σ_μ²); prior on τ: HalfNormal(σ_τ). Because the prior on μ is conjugate given τ, the marginal posterior of μ is an analytic mixture of normals. Reports posterior mean and 95% credible interval for μ (overall effect) and τ (heterogeneity SD), plus posterior density plots for both parameters. With diffuse priors, the results closely approximate the REML random-effects estimate.
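
The grid approach described above can be sketched as follows. For each τ on the grid, the conditional posterior of μ is normal, and the grid weight for τ is proportional to the half-normal prior times the likelihood with μ integrated out. The grid upper bound (`tauMax`, defaulting to five times σ_τ), the option names, and the function name `bayesGrid` are assumptions made for this example.

```javascript
function bayesGrid(y, v, { mu0 = 0, sdMu = 1, sdTau = 0.5, nGrid = 300, tauMax = 5 * sdTau } = {}) {
  const taus = [];
  const logPost = [];
  const condMean = [];
  const condVar = [];
  const priorPrec = 1 / (sdMu * sdMu);

  for (let j = 0; j < nGrid; j++) {
    const tau = (tauMax * j) / (nGrid - 1);
    const w = v.map((vi) => 1 / (vi + tau * tau));
    const sumW = w.reduce((a, b) => a + b, 0);
    const sumWY = w.reduce((a, wi, i) => a + wi * y[i], 0);
    const sumWY2 = w.reduce((a, wi, i) => a + wi * y[i] * y[i], 0);

    // Conditional posterior of mu given tau: N(m, 1 / prec)
    const prec = sumW + priorPrec;
    const m = (sumWY + mu0 * priorPrec) / prec;

    // Log marginal likelihood of y given tau (mu integrated out),
    // dropping terms that do not depend on tau
    const logMarg =
      0.5 * w.reduce((a, wi) => a + Math.log(wi), 0) -
      0.5 * Math.log(prec) -
      0.5 * (sumWY2 + mu0 * mu0 * priorPrec - prec * m * m);

    // Half-normal prior on tau, up to a constant
    const logPrior = -(tau * tau) / (2 * sdTau * sdTau);

    taus.push(tau);
    condMean.push(m);
    condVar.push(1 / prec);
    logPost.push(logMarg + logPrior);
  }

  // Normalise on the log scale for numerical stability
  const maxLog = Math.max(...logPost);
  const unnorm = logPost.map((lp) => Math.exp(lp - maxLog));
  const total = unnorm.reduce((a, b) => a + b, 0);
  const weights = unnorm.map((u) => u / total);

  // Posterior means: mixture over the grid
  const muMean = weights.reduce((a, wj, j) => a + wj * condMean[j], 0);
  const tauMean = weights.reduce((a, wj, j) => a + wj * taus[j], 0);
  return { taus, weights, condMean, condVar, muMean, tauMean };
}

const fit = bayesGrid([0.1, 0.4, 0.5], [0.04, 0.05, 0.06]);
console.log(fit.muMean, fit.tauMean);
```

The returned `weights`, `condMean`, and `condVar` together define the mixture of normals for μ, from which credible intervals and densities can be read off.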

Bayes Factor BF₁₀ — tests H₁: μ ≠ 0 vs H₀: μ = 0 via the Savage-Dickey density ratio: BF₁₀ = prior density(0) / posterior density(0). Reported with log(BF₁₀) and a Jeffreys (1961) verbal interpretation (Anecdotal / Moderate / Strong / Very strong / Decisive).
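
A minimal sketch of the Savage-Dickey ratio when the posterior of μ is a finite mixture of normals, as it is under a grid approximation over τ. The inputs (mixture weights, component means and SDs) and the function names are assumptions for illustration.

```javascript
// Normal density
function normPdf(x, mean, sd) {
  const z = (x - mean) / sd;
  return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2 * Math.PI));
}

// BF10 = prior density at 0 / posterior density at 0.
// weights, means, sds: components of the mixture-of-normals posterior for mu.
// mu0, sdMu: mean and SD of the normal prior on mu.
function savageDickeyBF10(weights, means, sds, mu0, sdMu) {
  const prior0 = normPdf(0, mu0, sdMu);
  const post0 = weights.reduce(
    (a, wj, j) => a + wj * normPdf(0, means[j], sds[j]),
    0
  );
  return prior0 / post0;
}

// Posterior concentrated away from zero gives BF10 > 1
const bf10 = savageDickeyBF10([1], [2], [1], 0, 1);
console.log(bf10, Math.log(bf10));
```

When the posterior equals the prior the ratio is exactly 1, meaning the data carry no information about whether μ = 0.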

Prior sensitivity analysis — loops the Bayesian model over a 3 × 3 grid of (σ_μ, σ_τ) pairs ({0.5, 1, 2} × {0.25, 0.5, 1}, nine combinations) and tabulates the posterior mean, credible interval, and BF₁₀ for each. Triggered by the Prior Sensitivity button. Robust conclusions are stable across the grid; results that change materially across the grid indicate that the posterior is strongly influenced by the prior and should be reported with caution.

Dependent effect sizes

When a primary study contributes multiple effect sizes (different outcomes, subgroups, or time points), assigning a Cluster ID activates three complementary analyses in the results panel:

Method	What changes	User parameter
Cluster-robust SE	SE only (point estimate unchanged); sandwich CR1 correction on the RE estimate	—
RVE	Separate WLS estimator with a working covariance model; CR1 sandwich SE	ρ — within-cluster correlation (default 0.80)
Three-level	Explicit decomposition into σ²_within and σ²_between; REML via BFGS; decomposed I²	—

These three analyses are not available with the Mantel–Haenszel or Peto pooling methods. They are based on Hedges, Tipton & Johnson (2010) and Van den Noortgate et al. (2013).
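
To illustrate the first row of the table, the sketch below computes a cluster-robust (sandwich) standard error for an intercept-only pooled estimate. The weights are taken as given (for example RE weights), the CR1 small-sample factor is assumed to be G / (G − 1) for G clusters, and `clusterRobustSE` is a hypothetical name.

```javascript
// y: effect sizes, w: pooling weights, cluster: cluster ID per effect size
function clusterRobustSE(y, w, cluster) {
  const sumW = w.reduce((a, b) => a + b, 0);
  const est = w.reduce((a, wi, i) => a + wi * y[i], 0) / sumW;

  // Sum weighted residuals within each cluster
  const scores = new Map();
  y.forEach((yi, i) => {
    const prev = scores.get(cluster[i]) || 0;
    scores.set(cluster[i], prev + w[i] * (yi - est));
  });

  // "Meat" of the sandwich: squared cluster-level score sums
  const G = scores.size;
  let meat = 0;
  for (const s of scores.values()) meat += s * s;

  // CR1 correction (assumed factor G / (G - 1)); point estimate is unchanged
  const se = Math.sqrt((G / (G - 1)) * meat) / sumW;
  return { est, se, clusters: G };
}

console.log(clusterRobustSE([0.2, 0.4, 0.6, 0.8], [1, 1, 1, 1], ["s1", "s1", "s2", "s2"]));
```

Residuals that share a sign within a cluster accumulate before squaring, which is how within-cluster dependence widens the SE.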

Plots

All plots export as SVG, PNG, or TIFF.

Plot	Description
Forest plot	Study CIs + pooled diamond(s). Toggle FE / RE / both. Four visual themes. Paginated.
Funnel plot	Effect vs. SE with Egger regression line. Standard or contour-enhanced (p-value regions).
Influence plot	Per-study leverage and Cook's distance visualised as a bubble chart.
BLUPs	Dual caterpillar: observed y_i (gray) vs. shrunken BLUP (accent) per study. Shrinkage lines and hover tooltips. Only shown when τ² > 0.
Baujat plot	Heterogeneity contribution vs. overall influence; quadrant guides at the mean.
Normal Q-Q plot	Normal probability plot of standardised residuals z_i = (y_i − μ̂) / √(v_i + τ²). Reference line through Q1/Q3. Orange points: |z| > 2.
Radial (Galbraith) plot	Precision (1/seᵢ) vs. standardised effect (yᵢ/seᵢ). Solid line through origin with slope = FE pooled estimate; dashed ±2 band. Orange points: outliers (|yᵢ/seᵢ − θ·xᵢ| > 2). Right axis shows effect-size scale.
Cumulative forest plot	Cumulative pooled estimate as studies are added in sequence. Paginated.
Cumulative funnel plot	Funnel at each cumulative step; slider-controlled.
Orchard plot	Effect estimates as precision-sized dots with RE diamond and prediction interval.
Caterpillar plot	Studies sorted by effect size with 95% CIs; group colour-coding. Paginated.
P-curve	Distribution of significant p-values with right-skew and flatness tests.
P-uniform*	Publication-bias-corrected effect size estimate.
RoB traffic light	Per-study, per-domain ratings as a colour-coded grid.
RoB summary	Stacked bar chart of domain-level rating distributions.
Bubble plots	Meta-regression fit per continuous moderator, bubbles sized by weight.
GOSH plot	Graphical display of study heterogeneity: I² and μ̂ for every non-empty subset of studies. Reveals influential studies and heterogeneity patterns invisible to leave-one-out. Subsets are enumerated exactly for k ≤ 15 and randomly sampled for 15 < k ≤ 30.
Profile likelihood (τ²)	Profile log-likelihood curve for τ² with a 95% CI from likelihood-ratio inversion (LRT). Available for ML and REML only. x-axis toggles between τ² and τ.
Bayesian posterior (μ)	Marginal posterior density of the overall effect μ with 95% credible interval shaded.
Bayesian posterior (τ)	Marginal posterior density of the heterogeneity SD τ with 95% credible interval shaded.

Data input

Manual entry — inline editable table with per-field validation and error highlighting; rows can be reordered by drag-and-drop or Alt+↑ / Alt+↓
CSV import — auto-detects delimiter and effect type from column headers; preview panel with column-mapping controls before committing
Session save / load — full application state (data, settings, moderators, RoB ratings) serialised to JSON
Auto-save — drafts written to localStorage; a recovery banner appears on reload if unsaved changes exist
Cluster ID column — optional study identifier for dependent effect sizes (e.g. multiple outcomes or subgroups from the same trial); activates cluster-robust SE, RVE, and three-level meta-analysis sections in the results panel

CSV column names match the input fields for each effect type (e.g. m1, sd1, n1, m2, sd2, n2 for MD; a, b, c, d for OR). A label column is optional but recommended. A group column assigns studies to subgroups.
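
As an illustration of how such columns map to an effect size and its sampling variance, here is a minimal sketch for MD and log OR. The 2×2 cell layout (a, b = events and non-events in group 1; c, d = events and non-events in group 2) is an assumption, the naive comma split ignores quoting, and all function names are hypothetical.

```javascript
// Parse one CSV data line into an object keyed by header name.
// Values that parse as numbers are converted; others stay as strings.
function parseRow(headerLine, dataLine, delimiter = ",") {
  const names = headerLine.split(delimiter).map((s) => s.trim());
  const cells = dataLine.split(delimiter).map((s) => s.trim());
  const row = {};
  names.forEach((name, i) => {
    const num = Number(cells[i]);
    row[name] = cells[i] !== "" && !Number.isNaN(num) ? num : cells[i];
  });
  return row;
}

// Mean difference: y = m1 - m2, v = sd1^2/n1 + sd2^2/n2
function mdFromRow(r) {
  return {
    y: r.m1 - r.m2,
    v: (r.sd1 * r.sd1) / r.n1 + (r.sd2 * r.sd2) / r.n2,
  };
}

// Log odds ratio: y = log(ad / bc), v = 1/a + 1/b + 1/c + 1/d
// (assumed cell layout; zero cells are not handled in this sketch)
function logOrFromRow(r) {
  return {
    y: Math.log((r.a * r.d) / (r.b * r.c)),
    v: 1 / r.a + 1 / r.b + 1 / r.c + 1 / r.d,
  };
}

const row = parseRow("label,m1,sd1,n1,m2,sd2,n2", "Trial A,10,2,20,8,2,20");
console.log(row.label, mdFromRow(row));
```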

Export

HTML report — self-contained document with all results tables and plots as inline SVG
Word (.docx) — exports all results tables and plots to a Word document via OOXML/JSZip; no server required
PDF — via the browser's print dialog
SVG / PNG / TIFF — individual plot export from each plot's toolbar
All tables use APA 7th edition style — no vertical lines, merged CI columns, Note paragraphs

Statistical references

Akaike H (1974). A new look at the statistical model identification. IEEE Trans Autom Control, 19(6), 716–723.
Blom G (1958). Statistical Estimates and Transformed Beta-Variables. Wiley.
Bonett DG (2002). Sample size requirements for estimating intraclass correlations with desired precision. Stat Med, 21(9), 1331–1335.
Borenstein M, Hedges LV, Higgins JPT, Rothstein HR (2009). Introduction to Meta-Analysis. Wiley.
Burnham KP, Anderson DR (2002). Model Selection and Multimodel Inference (2nd ed.). Springer.
Deeks JJ, Macaskill P, Irwig L (2005). The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J Clin Epidemiol, 58(9), 882–893.
DerSimonian R, Laird N (1986). Meta-analysis in clinical trials. Controlled Clinical Trials, 7, 177–188.
Feldt LS (1965). The approximate sampling distribution of Kuder-Richardson reliability coefficient twenty. Psychometrika, 30(3), 357–370.
Galbraith RF (1988). Graphical display of estimates having differing standard errors. Technometrics, 30(3), 271–281.
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013). Bayesian Data Analysis (3rd ed.). CRC Press.
Hakstian AR, Whalen TE (1976). A k-sample significance test for independent alpha coefficients. Psychometrika, 41(2), 219–231.
Harbord RM, Egger M, Sterne JAC (2006). A modified test for small-study effects in meta-analyses of controlled trials with binary endpoints. Stat Med, 25(20), 3443–3457.
Harville DA (1977). Maximum likelihood approaches to variance component estimation and to related problems. J Am Stat Assoc, 72(358), 320–338.
Hedges LV, Olkin I (1985). Statistical Methods for Meta-Analysis. Academic Press.
Hedges LV, Tipton E, Johnson MC (2010). Robust variance estimation in meta-regression with dependent effect size estimates. Res Synth Methods, 1, 39–65.
Henmi M, Copas JB (2010). Confidence intervals for random effects meta-analysis and robustness to publication bias. Stat Med, 29(29), 2969–2983.
Higgins JPT, Thompson SG, Spiegelhalter DJ (2009). A re-evaluation of random-effects meta-analysis. J R Stat Soc A, 172, 137–159.
Holm S (1979). A simple sequentially rejective multiple test procedure. Scand J Stat, 6(2), 65–70.
Ioannidis JPA, Trikalinos TA (2007). An exploratory test for an excess of significant findings. Clin Trials, 4(3), 245–253.
Jeffreys H (1961). Theory of Probability (3rd ed.). Oxford University Press.
Knapp G, Hartung J (2003). Improved tests for a random effects meta-regression with a single covariate. Stat Med, 22, 2693–2710.
Kraemer HC (1975). On estimation and hypothesis testing problems for correlation coefficients. Psychometrika, 40(4), 473–485.
Mantel N, Haenszel W (1959). Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst, 22, 719–748.
McGraw KO, Wong SP (1992). A common language effect size statistic. Psychol Bull, 111(2), 361–365.
Morris CN (1983). Parametric empirical Bayes inference: Theory and applications. J Am Stat Assoc, 78(381), 47–55.
Morris SB (2008). Estimating effect sizes from pretest-posttest-control group designs. Org Res Methods, 11, 364–386.
Olkin I, Pratt JW (1958). Unbiased estimation of certain correlation coefficients. Ann Math Stat, 29(1), 201–211.
Olkin I, Dahabreh IJ, Trikalinos TA (2012). GOSH — a graphical display of study heterogeneity. Res Synth Methods, 3(3), 214–223.
Paule RC, Mandel J (1982). Consensus values and weighting factors. J Res Natl Bur Stand, 87, 377–385.
Peters JL, Sutton AJ, Jones DR, Abrams KR, Rushton L (2006). Comparison of two methods to detect publication bias in meta-analysis. JAMA, 295(6), 676–680.
Peto R, Pike MC, Armitage P, et al. (1976). Design and analysis of randomized clinical trials requiring prolonged observation of each patient. Br J Cancer, 34, 585–612.
Rücker G, Schwarzer G, Carpenter J (2008). Arcsine test for publication bias in meta-analyses with binary outcomes. Stat Med, 27(19), 4450–4465.
Schwarz G (1978). Estimating the dimension of a model. Ann Stat, 6(2), 461–464.
Simonsohn U, Nelson LD, Simmons JP (2014). P-curve: A key to the file-drawer. J Exp Psychol Gen, 143(2), 534–547.
Stanley TD, Doucouliagos H (2014). Meta-regression approximations to reduce publication selection bias. Res Synth Methods, 5(1), 60–78.
Stanley TD, Doucouliagos H (2015). Neither fixed nor random: Weighted least squares meta-regression. Res Synth Methods, 6(1), 67–87.
van Assen MALM, van Aert RCM, Wicherts JM (2015). Meta-analysis using effect size distributions of only statistically significant studies. Psychol Methods, 20(3), 293–309.
Van den Noortgate W, López-López JA, Marín-Martínez F, Sánchez-Meca J (2013). Three-level meta-analysis of dependent effect sizes. Behav Res Methods, 45(2), 576–594.
Vevea JL, Hedges LV (1995). A general linear model for estimating effect size in the presence of publication bias. Psychometrika, 60(3), 419–435.
Vevea JL, Woods CM (2005). Publication bias in research synthesis: Sensitivity analysis using a priori weight functions. Psychol Methods, 10(4), 428–443.
Viechtbauer W (2005). Bias and efficiency of meta-analytic variance estimators in the random-effects model. J Educ Behav Stat, 30, 261–293.
Viechtbauer W (2007). Confidence intervals for the amount of heterogeneity in meta-analysis. Stat Med, 26(1), 37–52.
Viechtbauer W (2010). Conducting meta-analyses in R with the metafor package. J Stat Softw, 36(3), 1–48.
Viechtbauer W (2021). Location-scale models for meta-analytic data. Res Synth Methods, 12(5), 567–583.
Viechtbauer W, Cheung MWL (2010). Outlier and influence diagnostics for meta-analysis. Res Synth Methods, 1(2), 112–125.
Wagenmakers EJ, Lodewyckx T, Kuriyal H, Grasman R (2010). Bayesian hypothesis testing for psychologists: A tutorial on the Savage-Dickey method. Cogn Psychol, 60(3), 158–189.
Yule GU (1900). On the association of attributes in statistics. Phil Trans R Soc Lond A, 194, 257–319.
Yule GU (1912). On the methods of measuring association between two attributes. J R Stat Soc, 75(6), 579–642.