aula_5 – <i class='bi bi-graph-up'></i> FIP 606

O que aprendemos?

1. Importação e Preparação de Dados

Importar dados diretamente do Google Sheets com gsheet
Converter variáveis para os tipos corretos: fatores e numéricos

2. Ajuste de Modelos Lineares e Generalizados

Ajustar modelos lineares generalizados (GLMs) para dados experimentais
Realizar análise de variância (ANOVA) em modelos GLMs
Avaliar diagnóstico de resíduos com o pacote DHARMa

3. Comparação de Médias e Estimativas

Estimar e comparar médias marginais com emmeans
Realizar comparações múltiplas e visualização de médias

4. Análise de Correlação e Visualização

Avaliar correlações entre variáveis com testes e gráficos
Visualizar dados e resultados com ggplot2

5. Modelos Lineares Mistos

Construir e interpretar modelos lineares mistos generalizados para dados com estrutura hierárquica
Avaliar e interpretar resultados para experimentos com blocos e interações

6. Visualização Avançada

Criar gráficos de dispersão, boxplots e combinar múltiplos gráficos para análise comparativa

Parte 1: Experimento de Campo (Fertilizantes)

Importação e preparação dos dados

library(gsheet)
library(dplyr)


Anexando pacote: 'dplyr'

Os seguintes objetos são mascarados por 'package:stats':

    filter, lag

Os seguintes objetos são mascarados por 'package:base':

    intersect, setdiff, setequal, union

library(ggplot2)
library(DHARMa)

This is DHARMa 0.4.7. For overview type '?DHARMa'. For recent changes, type news(package = 'DHARMa')

library(emmeans)

Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'

library(multcomp)

Carregando pacotes exigidos: mvtnorm

Carregando pacotes exigidos: survival

Carregando pacotes exigidos: TH.data

Carregando pacotes exigidos: MASS


Anexando pacote: 'MASS'

O seguinte objeto é mascarado por 'package:dplyr':

    select


Anexando pacote: 'TH.data'

O seguinte objeto é mascarado por 'package:MASS':

    geyser

library(agricolae)

campo <- gsheet2tbl("https://docs.google.com/spreadsheets/d/1bq2N19DcZdtax2fQW9OHSGMR0X2__Z9T/edit?gid=866852711#gid=866852711")

campo <- campo |> 
  mutate(
    TRAT = factor(TRAT),
    BLOCO = factor(BLOCO),
    FER = as.numeric(FER)
  )

ggplot(campo, aes(x = TRAT, y = PROD)) + 
  geom_jitter(width = 0.1) + 
  stat_summary(fun.data = "mean_cl_boot", colour = "red")

Ajuste do modelo linear e ANOVA

m_campo <- lm(log(FER) ~ BLOCO + TRAT, data = campo)
anova(m_campo)

Analysis of Variance Table

Response: log(FER)
          Df  Sum Sq Mean Sq F value    Pr(>F)    
BLOCO      3  0.2064 0.06880  1.7961    0.1788    
TRAT       7 11.5210 1.64585 42.9665 4.838e-11 ***
Residuals 21  0.8044 0.03831                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(simulateResiduals(m_campo))

Estimativa e comparação de médias

means_campo <- emmeans(m_campo, ~ TRAT, type = "response")
plot(means_campo)

means_campo

 TRAT response    SE df lower.CL upper.CL
 1       20.02 1.960 21    16.33    24.54
 2        5.68 0.556 21     4.63     6.96
 3        3.81 0.373 21     3.11     4.67
 4        3.08 0.301 21     2.51     3.78
 5        3.24 0.317 21     2.64     3.97
 6        2.98 0.292 21     2.43     3.65
 7        3.37 0.330 21     2.75     4.13
 8        3.48 0.341 21     2.84     4.27

Results are averaged over the levels of: BLOCO 
Confidence level used: 0.95 
Intervals are back-transformed from the log scale

cv.model(m_campo)

[1] 13.13068

cld(means_campo)

 TRAT response    SE df lower.CL upper.CL .group
 6        2.98 0.292 21     2.43     3.65  1    
 4        3.08 0.301 21     2.51     3.78  1    
 5        3.24 0.317 21     2.64     3.97  1    
 7        3.37 0.330 21     2.75     4.13  1    
 8        3.48 0.341 21     2.84     4.27  1    
 3        3.81 0.373 21     3.11     4.67  12   
 2        5.68 0.556 21     4.63     6.96   2   
 1       20.02 1.960 21    16.33    24.54    3  

Results are averaged over the levels of: BLOCO 
Confidence level used: 0.95 
Intervals are back-transformed from the log scale 
P value adjustment: tukey method for comparing a family of 8 estimates 
Tests are performed on the log scale 
significance level used: alpha = 0.05 
NOTE: If two or more means share the same grouping symbol,
      then we cannot show them to be different.
      But we also did not show them to be the same.

pwpp(means_campo)

pwpm(means_campo)

        1       2       3       4       5       6       7       8
1 [20.02]  <.0001  <.0001  <.0001  <.0001  <.0001  <.0001  <.0001
2   3.525 [ 5.68]  0.1252  0.0048  0.0110  0.0028  0.0204  0.0343
3   5.259   1.492 [ 3.81]  0.7832  0.9335  0.6440  0.9843  0.9976
4   6.500   1.844   1.236 [ 3.08]  0.9999  1.0000  0.9976  0.9842
5   6.178   1.753   1.175   0.951 [ 3.24]  0.9984  1.0000  0.9994
6   6.721   1.906   1.278   1.034   1.088 [ 2.98]  0.9842  0.9431
7   5.945   1.686   1.130   0.915   0.962   0.885 [ 3.37]  1.0000
8   5.750   1.631   1.093   0.885   0.931   0.856   0.967 [ 3.48]

Row and column labels: TRAT
Upper triangle: P values   null = 1  adjust = "tukey"
Diagonal: [Estimates] (response)   type = "response"
Lower triangle: Comparisons (ratio)   earlier vs. later

Correlações e visualização

cor(campo$FER, campo$PROD)

[1] -0.6258321

cor.test(campo$FER, campo$DFC)


    Pearson's product-moment correlation

data:  campo$FER and campo$DFC
t = 14.049, df = 30, p-value = 9.864e-15
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.8635525 0.9664228
sample estimates:
      cor 
0.9316978

campo |> 
  ggplot(aes(FER, DFC)) +
  geom_point() +
  geom_smooth(method = "lm")

`geom_smooth()` using formula = 'y ~ x'

Parte 2: Experimento com Milho

Importação e visualização

library(lme4)

Carregando pacotes exigidos: Matrix

library(car)

Carregando pacotes exigidos: carData


Anexando pacote: 'car'

O seguinte objeto é mascarado por 'package:dplyr':

    recode

library(DHARMa)
library(multcomp)
library(emmeans)

milho <- gsheet2tbl("https://docs.google.com/spreadsheets/d/1bq2N19DcZdtax2fQW9OHSGMR0X2__Z9T/edit?gid=1345524759#gid=1345524759")

milho |> 
  ggplot(aes(hybrid, index, color = method)) +
  geom_jitter(width = 0.1) +
  coord_flip()

Modelo linear misto com interação

milho <- milho |> mutate(hybrid_block = interaction(hybrid, block))

m_milho <- lmer(index ~ hybrid * method + (1 | hybrid_block), data = milho)
Anova(m_milho)

Analysis of Deviance Table (Type II Wald chisquare tests)

Response: index
                Chisq Df Pr(>Chisq)   
hybrid        11.4239  5    0.04359 * 
method         4.6964  1    0.03023 * 
hybrid:method 15.8062  5    0.00742 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(simulateResiduals(m_milho))

Modelo misto para yield

m_milho2 <- lmer(yield ~ hybrid * method + (1 | block:hybrid_block), data = milho)
Anova(m_milho2)

Analysis of Deviance Table (Type II Wald chisquare tests)

Response: yield
                Chisq Df Pr(>Chisq)    
hybrid        22.5966  5  0.0004031 ***
method         0.1052  1  0.7456932    
hybrid:method 25.9302  5  9.206e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(simulateResiduals(m_milho2))

media_milho2 <- emmeans(m_milho2, ~ method | hybrid)
cld(media_milho2, Letters = letters)

hybrid = 30F53 HX:
 method emmean  SE   df lower.CL upper.CL .group
 silk     9988 798 21.1     8328    11647  a    
 pin     11208 798 21.1     9548    12867   b   

hybrid = 30F53 YH:
 method emmean  SE   df lower.CL upper.CL .group
 silk     9211 798 21.1     7552    10870  a    
 pin      9408 798 21.1     7748    11067  a    

hybrid = 30K64:
 method emmean  SE   df lower.CL upper.CL .group
 silk    10361 798 21.1     8702    12020  a    
 pin     11675 798 21.1    10016    13334   b   

hybrid = 30S31H:
 method emmean  SE   df lower.CL upper.CL .group
 pin      8118 798 21.1     6459     9777  a    
 silk     9185 798 21.1     7526    10844   b   

hybrid = 30S31YH:
 method emmean  SE   df lower.CL upper.CL .group
 pin      7836 798 21.1     6177     9495  a    
 silk     8277 798 21.1     6618     9936  a    

hybrid = BG7049H:
 method emmean  SE   df lower.CL upper.CL .group
 pin     11970 798 21.1    10311    13629  a    
 silk    12833 798 21.1    11174    14492  a    

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
significance level used: alpha = 0.05 
NOTE: If two or more means share the same grouping symbol,
      then we cannot show them to be different.
      But we also did not show them to be the same.

Correlação entre índice e produtividade

milho |> 
  ggplot(aes(index, yield)) +
  geom_point() +
  geom_smooth(method = "lm")

`geom_smooth()` using formula = 'y ~ x'

cor1 <- cor(milho$index, milho$yield)
cor1^2 * 100

[1] 6.323713

cor.test(milho$index, milho$yield)


    Pearson's product-moment correlation

data:  milho$index and milho$yield
t = -1.7622, df = 46, p-value = 0.08468
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.49988704  0.03517829
sample estimates:
       cor 
-0.2514699