[R] R Style Guide by Hadley Wickham

This post is a translation and summary of 'The tidyverse style guide' by Hadley Wickham.

Lists
Intro - 0. Welcome
Analysis - 1. Files
Analysis - 2. Syntax (1)
Analysis - 2. Syntax (2)
Analysis - 3. Functions
Analysis - 4. Pipes
Analysis - 5. ggplot2
Packages - 6. Files
Packages - 7. Documentation
Packages - 8. Tests
Packages - 9. Error messages
Packages - 10. News
Packages - 11. Git/GitHub

5. ggplot2

5.1 Introduction

The coding style for +, which separates ggplot2 layers, is very similar to the style for %>% in pipelines.

5.2 Whitespace

+ should always have a space before it and should be followed by a new line. This holds even if the plot has only two layers. After the first step, each line should be indented by two spaces.

If you build a ggplot at the end of a dplyr pipeline, there should be only one level of indentation.

# Good
iris %>%
  filter(Species == "setosa") %>%
  ggplot(aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point()

# Bad
iris %>%
  filter(Species == "setosa") %>%
  ggplot(aes(x = Sepal.Width, y = Sepal.Length)) +
    geom_point()

# Bad
iris %>%
  filter(Species == "setosa") %>%
  ggplot(aes(x = Sepal.Width, y = Sepal.Length)) + geom_point()
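
One practical reason the + belongs at the end of a line rather than at the start of the next one is how R's parser treats line breaks. The sketch below only parses two strings and never evaluates them, so ggplot2 does not need to be loaded; the names `trailing_plus`, `leading_plus`, and `df` are made up for illustration.

```r
# `+` at the end of a line: the expression is incomplete, so R keeps reading
# and the two lines form a single expression.
trailing_plus <- "ggplot(df, aes(x, y)) +\n  geom_point()"

# `+` at the start of the next line: the first line is already a complete
# expression, so R parses the second line as a separate unary-plus call.
leading_plus <- "ggplot(df, aes(x, y))\n  + geom_point()"

n_trailing <- length(parse(text = trailing_plus))
n_leading <- length(parse(text = leading_plus))

print(c(trailing = n_trailing, leading = n_leading))
```

With a trailing + the parser returns one expression; with a leading + it returns two, which means the layer would never be added to the plot.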

5.3 Long lines

If the arguments to a ggplot2 layer don't all fit on one line, put each argument on its own line and indent.

# Good
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, color = Species)) +
  geom_point() +
  labs(
    x = "Sepal width, in cm",
    y = "Sepal length, in cm",
    title = "Sepal length vs. width of irises"
  )

# Bad
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, color = Species)) +
  geom_point() +
  labs(x = "Sepal width, in cm", y = "Sepal length, in cm", title = "Sepal length vs. width of irises")
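
The same rule applies to any layer, not only labs(). As a rough sketch, assuming ggplot2 is installed, here is a geom whose arguments do not fit on one line; the particular size, alpha, and jitter values are arbitrary choices for illustration, not recommendations from the guide.

```r
library(ggplot2)

# Each argument of the long layer sits on its own line, indented two spaces
# past the call, and the closing parenthesis lines up with the layer name.
p <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, color = Species)) +
  geom_point(
    size = 2,
    alpha = 0.6,
    position = position_jitter(width = 0.05, height = 0.05)
  ) +
  labs(
    x = "Sepal width, in cm",
    y = "Sepal length, in cm"
  )
```

Short layers can stay on a single line; only the ones that overflow need to be broken up this way.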

ggplot2 lets you perform data manipulation, such as filtering or slicing, inside the data argument. Doing that manipulation in a pipeline before you start plotting gives cleaner code.

# Good
iris %>%
  filter(Species == "setosa") %>%
  ggplot(aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point()

# Bad
ggplot(filter(iris, Species == "setosa"), aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point()
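
The two versions above differ only in readability. A minimal check, assuming dplyr and ggplot2 are installed (the variable names `piped` and `nested` are made up here), shows that both plots end up holding the same filtered data:

```r
library(dplyr)
library(ggplot2)

# Preferred style: manipulate the data in a pipeline, then plot.
piped <- iris %>%
  filter(Species == "setosa") %>%
  ggplot(aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point()

# Discouraged style: manipulate the data inside the data argument.
nested <- ggplot(filter(iris, Species == "setosa"), aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point()

# Both plot objects carry the same rows, so the choice is purely stylistic.
same_data <- identical(piped$data, nested$data)
n_rows <- nrow(piped$data)
```

This works because %>% binds more tightly than +, so the whole pipeline is evaluated into a plot object before the first layer is added.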