$\begingroup$

I am trying to fit a Gamma GLM to a dataset. The main issue is that the DHARMa residual plots flag quantile deviations, although no other problems are detected. I have read similar posts, but none gave a conclusive answer.

My main question is whether the model is still trustworthy when the quantile fit is poor but the other diagnostics look fine.

Please find a reproducible example below.

library(DHARMa)
library(car)

datos <- data.frame(
  Time = c(20.79, 387.69, 11.55, 25.63, 12.11, 34.74, 30.76, 11.69,
           11.55, 1522.86, 1742.24, 39.84, 24.26),
  Ratio = c(1.56, 2.02, 1.49, 1.55, 1.26, 1.77, 2.76, 0.86, 1.40,
            1.84, 1.21, 2.02, 2.33),
  Type = as.factor(c("NWB", "NWB", "NWB", "NWB", "NWB",
                     "WB", "WB", "WB", "WB", "WB", "WB", "WB", "WB"))
)

datos

m1 <- glm(Time ~ Type + Ratio, family = Gamma(link = "log"), data = datos,
          control = list(maxit = 100))
Anova(m1)

residh <- simulateResiduals(m1)

plotResiduals(residh, form = datos$Ratio)
plotResiduals(residh, form = datos$Type)

testDispersion(residh)
testUniformity(residh)
$\endgroup$

1 Answer

$\begingroup$

I assume this is the complete data. Personally, I wouldn't trust any model with this many parameters (three coefficients plus a dispersion) that was fit to only 13 observations.

I suggest you do the very basic step of plotting the data and the predictions:

library(ggplot2)
ggplot(datos, aes(y = Time, x = Ratio, color = Type)) +
  geom_point() +
  stat_function(aes(color = "WB"), fun = \(x) predict(m1, newdata = data.frame(Ratio = x, Type = factor("WB")), type = "response")) +
  stat_function(aes(color = "NWB"), fun = \(x) predict(m1, newdata = data.frame(Ratio = x, Type = factor("NWB")), type = "response"))

[Figure: plot showing data and predicted values from the GLM]

I would not trust a model that is influenced this much by only three observations (the three very large times).
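
One way to quantify that influence is to look at Cook's distances and to refit the model without the extreme points. The following is only a minimal sketch, not part of the analysis above: it assumes the three influential observations are those with `Time` above 300 (a cutoff picked here purely for illustration), and `large` and `m1_drop` are hypothetical names.

```r
# Data and model copied from the question
datos <- data.frame(
  Time = c(20.79, 387.69, 11.55, 25.63, 12.11, 34.74, 30.76, 11.69,
           11.55, 1522.86, 1742.24, 39.84, 24.26),
  Ratio = c(1.56, 2.02, 1.49, 1.55, 1.26, 1.77, 2.76, 0.86, 1.40,
            1.84, 1.21, 2.02, 2.33),
  Type = as.factor(c("NWB", "NWB", "NWB", "NWB", "NWB",
                     "WB", "WB", "WB", "WB", "WB", "WB", "WB", "WB"))
)
m1 <- glm(Time ~ Type + Ratio, family = Gamma(link = "log"), data = datos,
          control = list(maxit = 100))

# Cook's distance for each observation (larger = more influential)
cd <- cooks.distance(m1)
round(cd, 2)

# Assumption: the influential points are the times above 300
large <- datos$Time > 300

# Refit the same model without those observations
m1_drop <- update(m1, data = datos[!large, ])

# Side-by-side coefficients: full data vs. reduced data
cbind(all = coef(m1), without_large = coef(m1_drop))
```

If the coefficients change substantially between the two columns, the conclusions rest mostly on those few points rather than on the bulk of the data.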


You should also look a bit more into the estimated dispersion / shape:

library(MASS)
gamma.shape(m1)
#Alpha: 0.4182119
#SE:    0.1332239
gamma.dispersion(m1)
#[1] 2.391132

This maximum-likelihood estimate is better than the Pearson-based estimate that summary etc. use by default. You can pass it to anova.glm but not to car::Anova:

anova(m1, dispersion = gamma.dispersion(m1))
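
To see how much the choice of dispersion estimate matters, the two estimates can be compared directly, and the maximum-likelihood one can also be passed to `summary`. This is a hedged sketch rather than a recommended workflow; `disp_pearson` and `disp_ml` are names introduced here for illustration.

```r
library(MASS)

# Data and model copied from the question
datos <- data.frame(
  Time = c(20.79, 387.69, 11.55, 25.63, 12.11, 34.74, 30.76, 11.69,
           11.55, 1522.86, 1742.24, 39.84, 24.26),
  Ratio = c(1.56, 2.02, 1.49, 1.55, 1.26, 1.77, 2.76, 0.86, 1.40,
            1.84, 1.21, 2.02, 2.33),
  Type = as.factor(c("NWB", "NWB", "NWB", "NWB", "NWB",
                     "WB", "WB", "WB", "WB", "WB", "WB", "WB", "WB"))
)
m1 <- glm(Time ~ Type + Ratio, family = Gamma(link = "log"), data = datos,
          control = list(maxit = 100))

# Default estimate used by summary.glm (Pearson chi-square / residual df)
disp_pearson <- summary(m1)$dispersion

# Maximum-likelihood estimate from MASS (1 / estimated shape)
disp_ml <- gamma.dispersion(m1)

c(pearson = disp_pearson, ml = disp_ml)

# Standard errors and tests recomputed with the ML dispersion
summary(m1, dispersion = disp_ml)
```

A large gap between the two values is itself a sign that the dispersion is poorly determined by so few observations.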

If you had some strong priors for the parameters, you could maybe salvage this with a Bayesian model, e.g., with package brms. Otherwise, I would tell you to get more data.

$\endgroup$
  • $\begingroup$ "more data": With how skewed this data is, I'd suspect you'd need hundreds of data points for an informative analysis. $\endgroup$ Commented yesterday
  • $\begingroup$ @LukasLohse Agreed. I would also spend some effort to investigate if those large times could be explained by another predictor. $\endgroup$ – Roland Commented 23 hours ago
