comparison globalneighbors/neighbors.py @ 22:e69cb496324e
we have a data dump
author | Jeff Hammel <k0scist@gmail.com> |
---|---|
date | Sun, 25 Jun 2017 17:45:19 -0700 |
parents | 2fef925fbf37 |
children | 6891c5523b69 |
comparison
21:22c384fe954d | 22:e69cb496324e |
---|---|
1 """ | 1 """ |
2 read neighbors file; | 2 read neighbors file; |
3 this should be in the form of: | 3 this should be in the form of: |
4 | 4 |
5 `{geoid: [(geoid_closest_neighbor, distance), | 5 `geoid [(geoid_closest_neighbor, distance), (geoid_2nd_closest_neighbor, distance), ...]` |
6 (geoid_2nd_closest_neighbor, distance), | 6 |
7 ...] | 7 *PER LINE* this format was chosen because it is easier to |
8 }` | 8 iteratively read and write vs JSON. |
 | 9 |
 | 10 While CSV could be made to fit this model, because |
 | 11 there are both distances and geo IDs as pairs, it is not |
 | 12 the most natural fit. So we'll settle for our own data model. |
 | 13 No, it's not the best, but so be it (for now). |
9 """ | 14 """ |
10 | 15 |
11 import json | 16 import json |
12 import os | 17 import os |
13 | 18 |
19 with open(f) as _f: | 24 with open(f) as _f: |
20 return read_neighbors_file(f) | 25 return read_neighbors_file(f) |
21 | 26 |
22 retval = {} | 27 retval = {} |
23 for line in f: | 28 for line in f: |
24 data = json.loads(line) | 29 key, value = line.split(None, 1) |
25 retval.update(data) | 30 key = int(key) |
 | 31 data = json.loads(value) |
 | 32 data = [tuple(item) for item in data] |
 | 33 retval[key] = data |
26 return retval | 34 return retval |
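
To illustrate the per-line format introduced in revision 22, here is a minimal sketch of reading it back and writing it out. It is not the code in `neighbors.py`. The helper names `parse_neighbors_line`, `read_neighbors`, and `write_neighbors` are hypothetical, the sample geo IDs and distances are made up, and it assumes each line stores the neighbor list as a JSON array of `[geoid, distance]` pairs, since the parser calls `json.loads` on everything after the key.

```python
import io
import json


def parse_neighbors_line(line):
    """Parse one line of the form `geoid [[neighbor_geoid, distance], ...]`."""
    # Split on the first run of whitespace: the key, then the JSON payload.
    key, value = line.split(None, 1)
    # JSON has no tuple type, so convert each [geoid, distance] list to a tuple.
    return int(key), [tuple(item) for item in json.loads(value)]


def read_neighbors(lines):
    """Build {geoid: [(neighbor_geoid, distance), ...]} from an iterable of lines."""
    retval = {}
    for line in lines:
        if not line.strip():
            continue  # skipping blank lines is an addition of this sketch
        key, neighbors = parse_neighbors_line(line)
        retval[key] = neighbors
    return retval


def write_neighbors(neighbors, fp):
    """Hypothetical inverse of read_neighbors: write one geoid per line."""
    for geoid, pairs in neighbors.items():
        fp.write("{} {}\n".format(geoid, json.dumps([list(p) for p in pairs])))


# Made-up sample data: geoid 1 has two neighbors, geoid 2 has one.
sample = io.StringIO("1 [[2, 7.5], [3, 12.25]]\n2 [[1, 7.5]]\n")
neighbors = read_neighbors(sample)

# Round trip: write the mapping out and read it back.
buffer = io.StringIO()
write_neighbors(neighbors, buffer)
round_tripped = read_neighbors(io.StringIO(buffer.getvalue()))
```

Because each line is independent, a reader can process the file one record at a time and a writer can append records as they are computed, which matches the docstring's stated reason for preferring this layout over a single JSON object.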