# go-getter

[![Build Status](http://img.shields.io/travis/hashicorp/go-getter.svg?style=flat-square)][travis]
[![Build status](https://ci.appveyor.com/api/projects/status/ulq3qr43n62croyq/branch/master?svg=true)][appveyor]
[![Go Documentation](http://img.shields.io/badge/go-documentation-blue.svg?style=flat-square)][godocs]

[travis]: http://travis-ci.org/hashicorp/go-getter
[godocs]: http://godoc.org/github.com/hashicorp/go-getter
[appveyor]: https://ci.appveyor.com/project/hashicorp/go-getter/branch/master

go-getter is a library for Go (golang) for downloading files or directories
from various sources using a URL as the primary form of input.

The strength of this library is its flexibility: it can download
from a number of different sources (file paths, Git, HTTP, Mercurial, etc.)
using a single string as input. This relieves the implementer of the burden
of knowing how to download from each kind of source.

The concept of a _detector_ automatically turns invalid URLs into proper
URLs. For example, "github.com/hashicorp/go-getter" would turn into a
Git URL, and "./foo" would turn into a file URL. Detectors are extensible.

This library is used by [Terraform](https://terraform.io) for
downloading modules and [Nomad](https://nomadproject.io) for downloading
binaries.

## Installation and Usage

Package documentation can be found on
[GoDoc](http://godoc.org/github.com/hashicorp/go-getter).

Installation can be done with a normal `go get`:

```
$ go get github.com/hashicorp/go-getter
```

go-getter also has a command you can use to test URL strings:

```
$ go install github.com/hashicorp/go-getter/cmd/go-getter
...

$ go-getter github.com/foo/bar ./foo
...
```

The command is useful for verifying URL structures.

## URL Format

go-getter uses a single string URL as input to download from a variety of
protocols. go-getter has various "tricks" with this URL to do certain things.
This section documents the URL format.

### Supported Protocols and Detectors

**Protocols** are used to download files/directories using a specific
mechanism. Example protocols are Git and HTTP.

**Detectors** are used to transform a valid or invalid URL into another
URL if it matches a certain pattern. Example: "github.com/user/repo" is
automatically transformed into a fully valid Git URL. This allows go-getter
to be very user friendly.

go-getter supports the following protocols out of the box. Additional protocols
can be added at runtime by implementing the `Getter` interface.

  * Local files
  * Git
  * Mercurial
  * HTTP
  * Amazon S3
  * Google Cloud Storage (GCS)

In addition to the above protocols, go-getter has what are called "detectors."
These take a URL and attempt to automatically choose the best protocol for
it, which might even involve changing the protocol. The following detection
is built in by default:

  * File paths such as "./foo" are automatically changed to absolute
    file URLs.
  * GitHub URLs, such as "github.com/mitchellh/vagrant", are automatically
    changed to Git protocol over HTTP.
  * BitBucket URLs, such as "bitbucket.org/mitchellh/vagrant", are automatically
    changed to a Git or Mercurial protocol using the BitBucket API.

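The detection rules above can be illustrated with a small standalone sketch. It does not use go-getter itself: the `detect` function, its signature, and the exact output strings are assumptions chosen for illustration, and it covers only the file path and GitHub cases (the BitBucket case needs an API call and is left out). Paths are joined with forward slashes only, which is a simplification for Unix-like systems.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// detect is a hypothetical helper that rewrites a shorthand source into an
// explicit URL. pwd is the directory that relative paths are resolved against.
// The second return value reports whether any rule matched.
func detect(src, pwd string) (string, bool) {
	switch {
	case strings.HasPrefix(src, "github.com/"):
		// Expect at least "github.com/<user>/<repo>".
		parts := strings.Split(src, "/")
		if len(parts) < 3 || parts[1] == "" || parts[2] == "" {
			return "", false
		}
		repo := strings.TrimSuffix(parts[2], ".git")
		// Force the Git protocol over an HTTP(S) transport.
		return "git::https://github.com/" + parts[1] + "/" + repo + ".git", true
	case strings.HasPrefix(src, "./"), strings.HasPrefix(src, "../"):
		// Relative file path: make it absolute, then turn it into a file URL.
		return "file://" + path.Join(pwd, src), true
	case strings.HasPrefix(src, "/"):
		// Already absolute: only the scheme is added.
		return "file://" + path.Clean(src), true
	}
	return "", false
}

func main() {
	for _, src := range []string{"./foo", "github.com/mitchellh/vagrant", "example.com/x"} {
		out, ok := detect(src, "/home/user")
		fmt.Printf("%-32s matched=%-5v result=%s\n", src, ok, out)
	}
}
```

A source that matches no rule is returned unmatched, which mirrors the idea that detectors only rewrite inputs that fit a known pattern.
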
### Forced Protocol

In some cases, the protocol to use is ambiguous depending on the source
URL. For example, "http://github.com/mitchellh/vagrant.git" could reference
an HTTP URL or a Git URL. Forced protocol syntax is used to disambiguate this
URL.

A protocol can be forced by prefixing the URL with the protocol followed
by double colons. For example: `git::http://github.com/mitchellh/vagrant.git`
would download the given HTTP URL using the Git protocol.

Forced protocols will also override any detectors.

In the absence of a forced protocol, detectors may be run on the URL, transforming
the protocol anyway. The above example would have used the Git protocol either
way since the Git detector would have detected it was a GitHub URL.

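As a rough illustration of the `protocol::url` syntax, the sketch below separates a forced protocol from the rest of the source string using only the standard library. The `splitForced` name and the regular expression are assumptions made for this example, not go-getter's public API.

```go
package main

import (
	"fmt"
	"regexp"
)

// forcedRe matches "<protocol>::<rest>", e.g. "git::http://host/repo.git".
// The pattern is an assumption for this sketch.
var forcedRe = regexp.MustCompile(`^([A-Za-z0-9]+)::(.+)$`)

// splitForced returns the forced protocol (empty if there is none) and the
// remaining source string.
func splitForced(src string) (forced, rest string) {
	if m := forcedRe.FindStringSubmatch(src); m != nil {
		return m[1], m[2]
	}
	return "", src
}

func main() {
	for _, src := range []string{
		"git::http://github.com/mitchellh/vagrant.git",
		"http://github.com/mitchellh/vagrant.git",
	} {
		forced, rest := splitForced(src)
		fmt.Printf("forced=%q rest=%q\n", forced, rest)
	}
}
```

Because a plain scheme is followed by `://` rather than `::`, an ordinary URL passes through unchanged and is left for the detectors to classify.
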
### Protocol-Specific Options

Each protocol can support protocol-specific options to configure that
protocol. For example, the `git` protocol supports specifying a `ref`
query parameter that tells it what ref to check out for that Git
repository.

The options are specified as query parameters on the URL (or URL-like string)
given to go-getter. Using the Git example above, the URL below is a valid
input to go-getter:

    github.com/hashicorp/go-getter?ref=abcd1234

The protocol-specific options are documented below the URL format
section. But because they are part of the URL, we point them out here so
you know they exist.

### Subdirectories

If you want to download only a specific subdirectory from a downloaded
directory, you can specify a subdirectory after a double-slash `//`.
go-getter will first download the URL specified _before_ the double-slash
(as if you didn't specify a double-slash), but will then copy the
path after the double slash into the target directory.

For example, if you're downloading this GitHub repository, but you only
want to download the `test-fixtures` directory, you can do the following:

```
https://github.com/hashicorp/go-getter.git//test-fixtures
```

If you downloaded this to the `/tmp` directory, then the file
`/tmp/archive.gz` would exist. Notice that this file is in the `test-fixtures`
directory in this repository, but because we specified a subdirectory,
go-getter automatically copied only that directory's contents.

Subdirectory paths may also use filesystem glob patterns.
The path must match _exactly one_ entry or go-getter will return an error.
This is useful if you're not sure of the exact directory name but it follows
a predictable naming structure.

For example, the following URL would also work:

```
https://github.com/hashicorp/go-getter.git//test-*
```

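To make the double-slash rule concrete, here is a minimal sketch of how a source string could be split into the part to download and the subdirectory to copy. The `splitSubdir` function is a hypothetical helper written for this example. It assumes that the `//` right after a scheme is never a separator and that any query string stays with the download source. Glob expansion is not covered.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSubdir separates "source//subdir" into its two halves.
// The "//" that follows a scheme ("https://") is not treated as a separator.
func splitSubdir(src string) (source, subdir string) {
	// Skip past the scheme separator, if any.
	offset := 0
	if i := strings.Index(src, "://"); i >= 0 {
		offset = i + len("://")
	}

	// Set the query string aside so it stays attached to the source.
	query := ""
	if i := strings.Index(src[offset:], "?"); i >= 0 {
		query = src[offset+i:]
		src = src[:offset+i]
	}

	// Look for the first "//" after the scheme.
	i := strings.Index(src[offset:], "//")
	if i < 0 {
		return src + query, ""
	}
	i += offset
	return src[:i] + query, src[i+2:]
}

func main() {
	for _, src := range []string{
		"https://github.com/hashicorp/go-getter.git//test-fixtures",
		"https://github.com/hashicorp/go-getter.git//test-fixtures?ref=abcd1234",
		"github.com/hashicorp/go-getter",
	} {
		source, subdir := splitSubdir(src)
		fmt.Printf("source=%q subdir=%q\n", source, subdir)
	}
}
```
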
### Checksumming

For file downloads of any protocol, go-getter can automatically verify
a checksum for you. Note that checksumming only works for downloading files,
not directories, but checksumming will work for any protocol.

To checksum a file, append a `checksum` query parameter to the URL. go-getter
will parse out this query parameter automatically and use it to verify the
checksum. The parameter value can be in the format of `type:value` or just
`value`, where `type` is "md5", "sha1", "sha256", "sha512" or "file". The
`value` should be the actual checksum value, or a download URL for "file". When
the `type` part is omitted, the type will be guessed based on the length of the
checksum string. Examples:

```
./foo.txt?checksum=md5:b7d96c89d09d9e204f5fedc4d5d55b21
```

```
./foo.txt?checksum=b7d96c89d09d9e204f5fedc4d5d55b21
```

```
./foo.txt?checksum=file:./foo.txt.sha256sum
```

When checksumming from a file (for example, with `checksum=file:url`), go-getter
will get the file linked in the URL after `file:` using the same configuration. For
example, in `file:http://releases.ubuntu.com/cosmic/MD5SUMS` go-getter will
download a checksum file from that URL using the HTTP protocol.
All protocols supported by go-getter can be used. The checksum file will be
downloaded to a temporary file and then parsed. The destination of the temporary
file can be changed by setting system-specific environment variables: `TMPDIR`
on Unix; `TMP`, `TEMP` or `USERPROFILE` on Windows. Read the godoc of
[os.TempDir](https://golang.org/pkg/os/#TempDir) for more information on the
temporary directory selection. Checksum files are expected to be in BSD or GNU
style. Once go-getter is done with the checksum file, it is deleted.

The checksum query parameter is never sent to the backend protocol
implementation. It is used at a higher level by go-getter itself.

If the destination file exists and the checksums match, the download
will be skipped.

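The `type:value` format and the length-based guess can be sketched as follows. This is a standalone illustration rather than go-getter's own code: `parseChecksum` is a hypothetical name, and the sketch assumes the bare value is hex-encoded so that the digest sizes from the standard `crypto` packages (16, 20, 32 and 64 bytes) identify the type.

```go
package main

import (
	"crypto/md5"
	"crypto/sha1"
	"crypto/sha256"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"strings"
)

// parseChecksum splits a "type:value" or bare "value" checksum parameter.
// When the type is omitted, it is guessed from the decoded digest length.
func parseChecksum(param string) (typ, value string, err error) {
	// An explicit type (including "file") is everything before the first colon.
	if i := strings.Index(param, ":"); i >= 0 {
		return param[:i], param[i+1:], nil
	}

	raw, err := hex.DecodeString(param)
	if err != nil {
		return "", "", fmt.Errorf("invalid hex checksum: %w", err)
	}
	switch len(raw) {
	case md5.Size:
		return "md5", param, nil
	case sha1.Size:
		return "sha1", param, nil
	case sha256.Size:
		return "sha256", param, nil
	case sha512.Size:
		return "sha512", param, nil
	}
	return "", "", fmt.Errorf("cannot guess checksum type from %d bytes", len(raw))
}

func main() {
	for _, p := range []string{
		"md5:b7d96c89d09d9e204f5fedc4d5d55b21",
		"b7d96c89d09d9e204f5fedc4d5d55b21",
		"file:./foo.txt.sha256sum",
	} {
		typ, value, err := parseChecksum(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("type=%s value=%s\n", typ, value)
	}
}
```

Splitting on the first colon keeps a `file:` value such as `http://releases.ubuntu.com/cosmic/MD5SUMS` intact, since the later colons belong to the URL.
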
### Unarchiving

go-getter will automatically unarchive files into a file or directory
based on the extension of the file being requested (over any protocol).
This works for both file and directory downloads.

go-getter looks for an `archive` query parameter to specify the format of
the archive. If this isn't specified, go-getter will use the extension of
the path to see if it appears archived. Unarchiving can be explicitly
disabled by setting the `archive` query parameter to `false`.

The following archive formats are supported:

  * `tar.gz` and `tgz`
  * `tar.bz2` and `tbz2`
  * `tar.xz` and `txz`
  * `zip`
  * `gz`
  * `bz2`
  * `xz`

An example URL is shown below:

```
./foo.zip
```

This will automatically be inferred to be a ZIP file and will be extracted.
You can also be explicit about the archive type:

```
./some/other/path?archive=zip
```

And finally, you can disable unarchiving completely:

```
./some/path?archive=false
```

You can combine unarchiving with the other features of go-getter such
as checksumming. The special `archive` query parameter will be removed
from the URL before going to the final protocol downloader.

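The selection order described above (explicit `archive` parameter first, then the path extension) can be sketched with the standard library alone. The `archiveFormat` function and the `formats` list are hypothetical names for this example, and only the literal value `false` is treated as "disabled" here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// formats lists the supported extensions, longest first, so that
// "tar.gz" is matched before "gz".
var formats = []string{
	"tar.bz2", "tar.gz", "tar.xz",
	"tbz2", "tgz", "txz", "zip",
	"bz2", "gz", "xz",
}

// archiveFormat returns the archive format to use for src, or "" when the
// download should not be unarchived.
func archiveFormat(src string) (string, error) {
	u, err := url.Parse(src)
	if err != nil {
		return "", err
	}

	// An explicit query parameter wins over the file extension.
	if v := u.Query().Get("archive"); v != "" {
		if v == "false" {
			return "", nil
		}
		return v, nil
	}

	// Otherwise infer the format from the end of the path.
	for _, f := range formats {
		if strings.HasSuffix(u.Path, "."+f) {
			return f, nil
		}
	}
	return "", nil
}

func main() {
	for _, src := range []string{
		"./foo.zip",
		"./some/other/path?archive=zip",
		"./some/path?archive=false",
		"./release.tar.gz",
	} {
		format, err := archiveFormat(src)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-32s format=%q\n", src, format)
	}
}
```
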
## Protocol-Specific Options

This section documents the protocol-specific options that can be specified for
go-getter. These options should be appended to the input as normal query
parameters ([HTTP headers](#headers) are an exception to this, however).
Depending on the usage of go-getter, applications may provide alternate ways of
inputting options. For example, [Nomad](https://www.nomadproject.io) provides a
nice options block for specifying options rather than in the URL.

### General (All Protocols)

The options below are available to all protocols:

  * `archive` - The archive format to use to unarchive this file, or "" (empty
    string) to disable unarchiving. For more details, see the complete section
    on archive support above.

  * `checksum` - Checksum to verify the downloaded file or archive. See
    the entire section on checksumming above for format and more details.

  * `filename` - When in file download mode, allows specifying the name of the
    downloaded file on disk. Has no effect in directory mode.

### Local Files (`file`)

None

### Git (`git`)

  * `ref` - The Git ref to check out. This is a ref, so it can point to
    a commit SHA, a branch name, etc. If it is a named ref such as a branch
    name, go-getter will update it to the latest on each get.

  * `sshkey` - An SSH private key to use during clones. The provided key must
    be a base64-encoded string. For example, to generate a suitable `sshkey`
    from a private key file on disk, you would run `base64 -w0 <file>`.

    **Note**: Git 2.3+ is required to use this feature.

  * `depth` - The Git clone depth. The provided number specifies the last `n`
    revisions to clone from the repository.

The `git` getter accepts both URL-style SSH addresses like
`git::ssh://git@example.com/foo/bar`, and "scp-style" addresses like
`git::git@example.com/foo/bar`. In the latter case, omitting the `git::`
force prefix is allowed if the username prefix is exactly `git@`.

The "scp-style" addresses _cannot_ be used in conjunction with the `ssh://`
scheme prefix, because in that case the colon is used to mark an optional
port number to connect on, rather than to delimit the path from the host.

### Mercurial (`hg`)

  * `rev` - The Mercurial revision to check out.

### HTTP (`http`)

#### Basic Authentication

To use HTTP basic authentication with go-getter, simply prepend `username:password@` to the
hostname in the URL such as `https://Aladdin:OpenSesame@www.example.com/index.html`. Any special
characters in the username and password must be URL encoded.

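One way to get the encoding right is to let Go's `net/url` package build the credential part of the URL. The sketch below is an illustration under that assumption: `withBasicAuth` is a hypothetical helper, and the password containing `@` is made up to show the escaping.

```go
package main

import (
	"fmt"
	"net/url"
)

// withBasicAuth returns rawURL with the given credentials embedded.
// url.UserPassword takes care of percent-encoding special characters.
func withBasicAuth(rawURL, user, pass string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	u.User = url.UserPassword(user, pass)
	return u.String(), nil
}

func main() {
	// The "@" in the password would otherwise be read as the end of the
	// credentials, so it has to be escaped.
	src, err := withBasicAuth("https://www.example.com/index.html", "Aladdin", "Open@Sesame")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(src)
}
```

The resulting string can then be passed to go-getter like any other HTTP source.
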
#### Headers

Optional request headers can be added by supplying them in a custom
[`HttpGetter`](https://godoc.org/github.com/hashicorp/go-getter#HttpGetter)
(_not_ as query parameters like most other options). These headers will be sent
out on every request the getter in question makes.

### S3 (`s3`)

S3 takes various access configurations in the URL. It will also read these
from the standard AWS environment variables if they're set; if the query
parameters are present, they take priority. S3-compliant servers like Minio
are also supported.

  * `aws_access_key_id` - AWS access key.
  * `aws_access_key_secret` - AWS access key secret.
  * `aws_access_token` - AWS access token if this is being used.

#### Using IAM Instance Profiles with S3

If you use go-getter and want to use an EC2 IAM Instance Profile to avoid
using credentials, then just omit these and the profile, if available, will
be used automatically.

#### Using S3 with Minio

If you use go-getter with Minio, the following options apply:

  * `aws_access_key_id` (required) - Minio access key.
  * `aws_access_key_secret` (required) - Minio access key secret.
  * `region` (optional - defaults to us-east-1) - Region identifier to use.
  * `version` (optional - defaults to Minio default) - Configuration file format.

#### S3 Bucket Examples

S3 has several addressing schemes used to reference your bucket. These are
listed here: http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html#access-bucket-intro

Some examples for these addressing schemes:
- s3::https://s3.amazonaws.com/bucket/foo
- s3::https://s3-eu-west-1.amazonaws.com/bucket/foo
- bucket.s3.amazonaws.com/foo
- bucket.s3-eu-west-1.amazonaws.com/foo/bar
- "s3::http://127.0.0.1:9000/test-bucket/hello.txt?aws_access_key_id=KEYID&aws_access_key_secret=SECRETKEY&region=us-east-2"

### GCS (`gcs`)

#### GCS Authentication

To access GCS, authentication credentials must be provided. More information can be found [here](https://cloud.google.com/docs/authentication/getting-started).

#### GCS Bucket Examples

- gcs::https://www.googleapis.com/storage/v1/bucket
- gcs::https://www.googleapis.com/storage/v1/bucket/foo.zip
- www.googleapis.com/storage/v1/bucket/foo