---
title: "Accessing GitHub Data"
output: 
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
vignette: >
  %\VignetteIndexEntry{Accessing GitHub Data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
options(rmarkdown.html_vignette.check_title = FALSE)

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Description

The original accessions for the data sets included in `datamuseum`
are also available on GitHub. They come from the Global Biodiversity
Information Facility (GBIF), Invert-E-Base (InvBase), the Biological
Information System for Marine Life (BISMAL), the Ocean Biodiversity
Information System (OBIS), and one data set obtained by direct request from
the National Museum of Nature and Science, Japan (NSMT).

This workflow demonstrates how to access the GBIF-sourced data directly
from the GitHub repository for `datamuseum`.

```{r packages, eval = FALSE}
library(datamuseum)
```

Because of their size, the files are stored in a .zip archive.
Luckily, R can download the archive and unzip its contents
directly from a GitHub link!

```{r access-data, eval = FALSE}

rawzip <- tempfile()

download.file("https://github.com/btorgovitsky00/datamuseum/raw/master/data-raw.zip", rawzip, 
  mode = "wb")

temp <- tempdir()

unzip(rawzip, exdir = temp)

```
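
Before reading any files, it can help to confirm what the archive actually
contained. The chunk below is a minimal sketch, not part of `datamuseum`:
`list_extracted_csvs()` is a hypothetical helper name, and the scratch
directory with its placeholder file exists only for illustration.

```{r list-extracted, eval = FALSE}
# Hypothetical helper (not part of datamuseum): list CSV files under a directory.
list_extracted_csvs <- function(dir) {
  list.files(dir, pattern = "\\.csv$", recursive = TRUE)
}

# Illustration on a scratch directory holding one empty placeholder file.
scratch <- file.path(tempdir(), "example-extract")
dir.create(file.path(scratch, "data-raw"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(scratch, "data-raw", "example_raw.csv"))
list_extracted_csvs(scratch)

# With the directory passed to unzip() above, the call would be:
# list_extracted_csvs(temp)
```

Paths are returned relative to the directory that was searched, so they can be
passed to `file.path()` together with that directory.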

Each data set in `datamuseum` has two associated parent files:
the actual original accession from the respective repository
(denoted as "raw"), and a version with some columns removed
for improved visibility ("trim").

```{r GBIF, eval = FALSE}

# Raw original data

GBIF_raw <- read.csv(file.path(temp, "data-raw", "GBIF_Octopodoidea_raw.csv")) # 88256 observations


# Trimmed original data

GBIF_trim <- read.csv(file.path(temp, "data-raw", "GBIF_Octopodoidea_trim.csv")) # 88256 observations

```
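
To see which columns were removed from a "trim" file, the column names of
the two versions can be compared. This is a rough sketch under stated
assumptions: `dropped_columns()` is a hypothetical helper, and the toy data
frames stand in for the real files, whose column names are not shown here.

```{r compare-columns, eval = FALSE}
# Hypothetical helper: names of columns present in `raw` but absent from `trim`.
dropped_columns <- function(raw, trim) {
  setdiff(names(raw), names(trim))
}

# Toy stand-ins for a raw/trim pair (illustrative values only).
raw_toy <- data.frame(
  id = 1:2,
  species = c("a", "b"),
  remarks = c("x", "y")
)
trim_toy <- raw_toy[, c("id", "species")]

dropped_columns(raw_toy, trim_toy) # returns "remarks"
```

The same function can be applied to the two data frames read from the raw and
trimmed CSV files.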

The GBIF data sets were obtained and refined from the following occurrence download:

> Global Biodiversity Information Facility (GBIF). GBIF.org (30 March 2026)
> GBIF Occurrence Download. <https://www.gbif.org>.
> doi: [10.15468/dl.2379hj](https://doi.org/10.15468/dl.2379hj)
