.. README.txt @ 56:6ebd2d10fc03 (default tip); Jeff Hammel <jhammel@mozilla.com>; Fri, 02 Dec 2011 17:41:12 -0800

fetch
=====

fetch stuff from the interwebs

`fetch.py <http://k0s.org/mozilla/hg/fetch/raw-file/tip/fetch.py>`_ is
a single-file python module bundled as a
`package <http://k0s.org/mozilla/hg/fetch/>`_ for easy installation
and python importing. The purpose of ``fetch`` is to mirror remote
resources (URLs) to a local filesystem so that dependencies kept as
local copies can be synchronized and updated.


Format
------

``fetch`` fetches from a manifest of the format::

    [URL] [Destination] [Type]

A URL can contain a hash tag (e.g. http://example.com/foo#bar/fleem).
The part after the ``#`` is used to extract a subdirectory from a
multi-directory resource.

The ``Type`` of the resource is used to dispatch to the included
Fetchers that take care of fetching the object.

Manifests are used so that a number of resources may be fetched in a
single ``fetch`` run.

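The manifest format above can be read with a few lines of python. The
following is a minimal sketch and not the parser in ``fetch.py``: the
name ``parse_manifest_line`` and the destination ``tmp/fleem`` are made
up for illustration, and it assumes whitespace-separated columns and
python 3's ``urllib.parse``.

```python
from urllib.parse import urldefrag


def parse_manifest_line(line):
    """Split a '[URL] [Destination] [Type]' line into its parts.

    Hypothetical helper for illustration; not part of fetch.py.
    Returns (url, subpath, destination, type), where subpath is the
    text after '#' in the URL, or None if there is none.
    """
    columns = line.split()
    if len(columns) != 3:
        raise ValueError("expected 3 columns, got %d: %r" % (len(columns), line))
    url, destination, fetch_type = columns
    # urldefrag separates 'http://example.com/foo#bar/fleem' into
    # the base URL and the fragment after '#'
    base_url, subpath = urldefrag(url)
    return base_url, subpath or None, destination, fetch_type


print(parse_manifest_line("http://example.com/foo#bar/fleem tmp/fleem hg"))
```

Keeping the subpath separate from the base URL lets a fetcher retrieve
the whole resource first and pick out the subdirectory afterwards.
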

Example
-------

After you check out the `repository <http://k0s.org/mozilla/hg/fetch>`_
and run ``python setup.py develop``, you should be able to run
``fetch`` on the example manifest::

    fetch example.txt

This will create a ``tmp`` directory relative to the manifest and pull
down several resources to it.


Fetchers
--------

``fetch`` includes several objects for fetching resources::

    file : fetch a single file
    tar : fetch and extract a tarball
    hg : checkout a mercurial repository
    git : checkout a git repository

The ``file`` fetcher cannot have a hash tag subpath since it is a single
resource.

Though ``fetch`` has a set of fetchers included, you can pass an
arbitrary list into ``fetch.Fetch``'s constructor.

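Dispatching on ``Type`` with a caller-supplied list of fetchers might
look like the sketch below. The class names, the ``type`` attribute,
and ``make_dispatcher`` are hypothetical; the real interface of
``fetch.Fetch`` is not shown in this README and may differ.

```python
class FileFetcher(object):
    """Hypothetical fetcher for single files."""
    type = 'file'

    def __init__(self, url):
        self.url = url

    def __call__(self, dest):
        # a real fetcher would download here; this one only reports
        return "file: %s -> %s" % (self.url, dest)


class TarFetcher(FileFetcher):
    """Hypothetical fetcher for tarballs."""
    type = 'tar'

    def __call__(self, dest):
        return "tar: %s -> extract into %s" % (self.url, dest)


def make_dispatcher(fetchers):
    """Build a function that picks a fetcher class by its type name."""
    registry = dict((fetcher.type, fetcher) for fetcher in fetchers)

    def dispatch(url, dest, fetch_type):
        if fetch_type not in registry:
            raise KeyError("no fetcher for type %r" % fetch_type)
        return registry[fetch_type](url)(dest)

    return dispatch


dispatch = make_dispatcher([FileFetcher, TarFetcher])
print(dispatch("http://example.com/foo.tar.gz", "tmp/foo", "tar"))
```

Because the registry is built from whatever list is passed in, adding a
new resource type in this sketch only requires another class with a
distinct ``type`` name.
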

Version Control
---------------

The ``hg`` and the ``git`` fetchers fetch from version control systems and
have additional options. The only current option to the constructor
is ``export``, which is by default True. If ``export`` is True, then
the repository will be exported into a non-versioned structure. If a
subpath is specified with a ``#`` in the URL, the repository will also
be exported.

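What exporting amounts to can be sketched as follows, assuming a
checkout already exists on disk. This is an illustration and not the
code in ``fetch.py``: ``export_checkout`` is a hypothetical name, and
the sketch only copies files while leaving out the ``.hg`` or ``.git``
metadata directory.

```python
import os
import shutil
import tempfile


def export_checkout(checkout, dest, subpath=None, vcs_dirs=('.hg', '.git')):
    """Copy a checkout (or one subpath of it) to dest without VCS metadata.

    Hypothetical helper; dest must not exist yet.
    """
    source = os.path.join(checkout, subpath) if subpath else checkout
    # ignore_patterns skips the metadata directories while copying
    shutil.copytree(source, dest, ignore=shutil.ignore_patterns(*vcs_dirs))
    return dest


# demonstrate on a fake checkout built in a temporary directory
root = tempfile.mkdtemp()
checkout = os.path.join(root, 'checkout')
os.makedirs(os.path.join(checkout, '.hg'))
os.makedirs(os.path.join(checkout, 'bar', 'fleem'))
with open(os.path.join(checkout, 'bar', 'fleem', 'README.txt'), 'w') as f:
    f.write('hello\n')

exported = export_checkout(checkout, os.path.join(root, 'exported'))
print(sorted(os.listdir(exported)))
shutil.rmtree(root)
```

Passing ``subpath='bar/fleem'`` in this sketch would copy only that
subdirectory, which mirrors the ``#`` subpath case described above.
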

TODO
----

* use `manifestparser <https://github.com/mozilla/mozbase/blob/master/manifestdestiny/manifestparser.py>`_
  ``.ini`` files versus another manifest
  format: when I started work on ``fetch``, I thought a
  domain-specific manifest would be a big win. But, now, maybe a
  ``.ini`` type manifest looks about the same, and is something that
  is already used. The switch internally wouldn't be that bad, but
  if ``fetch.py`` is used as a single file, it cannot have
  "traditional" python dependencies. Since ``manifestparser.py`` is
  also a single file, and ``fetch`` is only usable with internet
  access anyway, maybe the
  `require <http://k0s.org/hg/config/file/tip/python/require.py>`_
  pattern could be used for this purpose

* clobber: generally, you will want the existing resource to be
  clobbered, avoiding renames regarding upstream dependencies

* outputting only subpaths: often, you will not want to fetch from the
  whole manifest, only from certain subpaths of the manifest. You
  should be able to output a subset of what is to be mirrored based
  on destination subpaths. The CLI option ``--dest`` is intended for
  this purpose but currently unused.

* fetcher options: currently ``read_manifests`` parses an otherwise
  unused column, when present, from a string like
  ``foo=one,bar=two`` into an ``options`` dict like
  ``{'foo': 'one', 'bar': 'two'}``. This hasn't been needed yet and
  is unused. If we want to have resource-specific options, we should
  use this and make it work. Otherwise it can be deleted.

* python package fetcher: often you will want to fetch a python
  package as a resource. This would essentially fetch the object
  (using another fetcher) and take the (untarred) result of
  ``python setup.py sdist`` as a resource. This will strip out files
  that aren't part of the python package. Unknowns include how to
  specify the sub-fetcher. You could also include other
  domain-specific fetchers as needed.

* note python 2.6+ specifics: ``fetch`` currently uses at least
  ``os.path.relpath``, which was added in python 2.6, and perhaps
  other 2.6+isms as well. These should at least be documented and
  checked for if not obviated. One reimplementation of ``relpath`` is at
  https://github.com/mozilla/mozbase/blob/master/manifestdestiny/manifestparser.py#L66

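The conversion described in the fetcher options item above can be
sketched in a few lines. ``parse_options`` is a hypothetical name, and
this is not necessarily how ``read_manifests`` does it.

```python
def parse_options(text):
    """Turn 'foo=one,bar=two' into {'foo': 'one', 'bar': 'two'}."""
    options = {}
    for item in text.split(','):
        item = item.strip()
        if not item:
            continue  # tolerate empty items such as a trailing comma
        if '=' not in item:
            raise ValueError("expected key=value, got %r" % item)
        # split on the first '=' only, so values may contain '='
        key, value = item.split('=', 1)
        options[key.strip()] = value.strip()
    return options


print(parse_options('foo=one,bar=two'))
```

All values stay strings in this sketch; converting something like
``export=False`` to a boolean would be up to the fetcher that receives it.
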

Unsolved Problems
-----------------

A common story for ``fetch`` is mirroring files into a VCS repository
because the remote resources are needed as part of the repository and
there is no better way to retrieve and/or update them. However, what
do you do if the mirrored copies of these resources are altered? In an
ideal ecosystem, the fixes would be automatically triaged and triggered
for upstream inclusion, or the diffs from the upstream would be kept as
local modifications (although vendor branches, etc, are more suitable
for the latter class of problems, and are in general discouraged when a
less intrusive system of consuming upstream dependencies is available).

----

Jeff Hammel
http://k0s.org/mozilla/hg/fetch