254 Commits

Author SHA1 Message Date
Ed Chalstrey
814203952a change column type 2022-04-14 09:28:34 +01:00
Ed Chalstrey
bc0aca8d56 doc fix 2022-04-14 09:28:34 +01:00
Ed Chalstrey
7ed8d8defe add warning 2022-04-14 09:28:34 +01:00
Ed Chalstrey
0048f0ae73 Updating the buildings table with coordinates SQL 2022-04-14 09:28:34 +01:00
Ed Chalstrey
66138855a3 find the converted csvs 2022-04-14 09:28:34 +01:00
Ed Chalstrey
709be9ef70 separate python step since memory failure is possible 2022-04-14 09:28:34 +01:00
Ed Chalstrey
08d4610336 remove index not needed 2022-04-14 09:28:34 +01:00
Ed Chalstrey
d100f12991 add coordinates step 2022-04-14 09:28:34 +01:00
Ed Chalstrey
96707f1d73 spelling 2022-04-14 09:28:34 +01:00
Ed Chalstrey
7faf79aa9e create new csv instead of using old 2022-04-14 09:28:34 +01:00
Ed Chalstrey
0e3c754075 switch to cython 2022-04-14 09:28:34 +01:00
Ed Chalstrey
5e89bc9dac add conversion to latlon step 2022-04-14 09:28:34 +01:00
Ed Chalstrey
c1e51531f7 update instructions for python 2022-04-14 09:28:34 +01:00
Ed Chalstrey
6593f9d249 add python script to convert to lat lon 2022-04-14 09:28:34 +01:00
Ed Chalstrey
faa6ddbe4d add creation of the_geom column 2022-04-14 09:28:34 +01:00
Ed Chalstrey
239b1db0f0 add missing column 2022-04-14 09:28:34 +01:00
Ed Chalstrey
22f3773632 create load coordinates script 2022-04-14 09:28:34 +01:00
Ed Chalstrey
9242b22bb5 add OS TOID download info 2022-04-14 09:28:34 +01:00
Ed Chalstrey
75e6382897 load new geometries 2022-04-14 09:28:34 +01:00
Ed Chalstrey
a635655995 explain "ETL" 2022-04-01 11:23:24 +01:00
Ed Chalstrey
667110537f clarify 2022-04-01 10:50:34 +01:00
Ed Chalstrey
efc49bd2b9 fix link 2022-04-01 10:37:56 +01:00
Ed Chalstrey
aba46dc95d add contents 2022-04-01 10:32:53 +01:00
Ed Chalstrey
b470591420 rearrange sections 2022-04-01 10:20:18 +01:00
Ed Chalstrey
a3d2537e22 Revert "create script to load new geometries"
This reverts commit d6ca8852d4.
2022-04-01 10:05:11 +01:00
Ed Chalstrey
d6ca8852d4 create script to load new geometries 2022-03-31 16:27:36 +01:00
Ed Chalstrey
5b2029a4e3 remove bad comments 2022-03-31 15:42:06 +01:00
Ed Chalstrey
7bd78bf03a fix filtering script 2022-03-29 15:41:00 +01:00
Ed Chalstrey
2163dc5812 add echos to load geometries 2022-03-29 14:48:27 +01:00
Ed Chalstrey
0e01971b4a add echos to extraction step 2022-03-29 14:46:18 +01:00
Ed Chalstrey
4a8d79a54a add echos to filtration step 2022-03-29 14:42:34 +01:00
Ed Chalstrey
3f07a9b081 remove print statement 2022-03-29 14:37:42 +01:00
Ed Chalstrey
f1946ea35b flake8 2022-03-29 10:33:47 +01:00
Ed Chalstrey
f49e378fe3 fix filter_mastermap 2022-03-29 10:32:13 +01:00
Ed Chalstrey
6602fc916a comment out step temporarily 2022-03-28 15:23:40 +01:00
Ed Chalstrey
4cd14ffd77 remove sudo 2022-03-28 15:22:04 +01:00
Ed Chalstrey
c85b321a60 revert changes to env vars 2022-03-28 15:06:23 +01:00
Ed Chalstrey
7bffa44502 use env vars properly? 2022-03-28 14:42:29 +01:00
Ed Chalstrey
bb964f55c9 add sudo 2022-03-28 14:18:10 +01:00
Ed Chalstrey
405f9558ad gather stored vars for postgres 2022-03-28 14:17:55 +01:00
Ed Chalstrey
da89c9b63b clarify comment 2022-03-28 14:02:47 +01:00
Ed Chalstrey
5f9aaf7f0b tidy 2022-03-28 13:54:00 +01:00
Ed Chalstrey
d060ebf4bb add comment 2022-03-28 13:42:08 +01:00
Ed Chalstrey
d7b4f9eeb5 only convert gml files with 5690395 to csv 2022-03-24 11:41:56 +00:00
Ed Chalstrey
8398bcea3a remove python environment not needed 2022-03-24 09:51:41 +00:00
Ed Chalstrey
cf30f71d0a remove bad comments 2022-03-18 16:52:41 +00:00
Ed Chalstrey
27623cf3d4 update filter step explanation 2022-03-18 16:49:46 +00:00
Ed Chalstrey
db01442d9b flake8 2022-03-18 16:18:44 +00:00
Ed Chalstrey
3f281a5e73 handle missing descriptiveGroup 2022-03-18 15:44:19 +00:00
Ed Chalstrey
b875e27fb8 missing colon 2022-03-18 14:48:19 +00:00
Ed Chalstrey
f6cb0c488e add test for filtering mastermap (not working) 2022-03-18 14:34:06 +00:00
Ed Chalstrey
103300f969 remove not needed non-buildings toid code 2022-03-18 13:36:06 +00:00
Ed Chalstrey
7ec09a7123 add screenshot image 2022-03-18 11:27:05 +00:00
Ed Chalstrey
dfe48da577 tidy script 2022-03-18 11:08:33 +00:00
Ed Chalstrey
3653e30362 remove addressbase from all steps and reorder readme 2022-03-18 11:07:15 +00:00
Ed Chalstrey
f55ce63d84 add drop_outside_limit to instructions 2022-03-18 10:49:41 +00:00
Ed Chalstrey
9e4224f51c add sudo cmds and comment 2022-03-18 10:46:28 +00:00
Ed Chalstrey
0e35a7cca2 mastermap filtering without using addressbase 2022-03-17 15:43:23 +00:00
Ed Chalstrey
d822dfaaec remove addressbase stuff 2022-03-17 15:02:51 +00:00
Ed Chalstrey
dafe5de278 comment out addressbase stuff 2022-03-17 14:47:08 +00:00
Ed Chalstrey
8b8f6622f5 temp python path edit 2022-03-17 14:41:54 +00:00
Ed Chalstrey
8fccec6827 temp python path edit 2022-03-14 09:39:29 +00:00
Ed Chalstrey
7e8817d2c5 make scripts executable 2022-03-11 13:28:23 +00:00
Ed Chalstrey
6916e4ca59 remove use of addressbase 2022-03-11 11:46:00 +00:00
Ed Chalstrey
7f3693355e add comments and update how to run 2022-03-10 14:02:00 +00:00
Ed Chalstrey
5b06b12f98 temp hard code python path 2022-03-10 13:51:05 +00:00
Ed Chalstrey
5796df69a1 update python path in script 2022-03-10 13:47:43 +00:00
Ed Chalstrey
4f58ea64f4 rearrange 2022-03-10 13:40:30 +00:00
Ed Chalstrey
8050419e71 add error 2022-03-10 12:02:26 +00:00
Ed Chalstrey
26090e03b0 update mastermap instructions 2022-03-10 12:01:41 +00:00
Ed Chalstrey
3366b67560 Revert "unzip stage"
This reverts commit b773cec0e8.
2022-03-10 12:00:37 +00:00
Ed Chalstrey
c1cf06dca9 update AddressBase instructions 2022-03-10 11:55:35 +00:00
Ed Chalstrey
e66e375cb8 add note 2022-03-10 11:46:34 +00:00
Ed Chalstrey
5810aafc1e update how to run migrations 2022-03-10 10:44:31 +00:00
Ed Chalstrey
5fb4f91e5c source python env 2022-03-10 10:40:38 +00:00
Ed Chalstrey
4da6c96181 tidy 2022-03-10 10:39:06 +00:00
Ed Chalstrey
c029976741 todo 2022-03-10 10:06:44 +00:00
Ed Chalstrey
b773cec0e8 unzip stage 2022-03-10 09:59:22 +00:00
Ed Chalstrey
b85a2bf865 add sudo cmds 2022-03-10 09:53:12 +00:00
Ed Chalstrey
490307e9c5 clarify 2022-03-10 09:46:12 +00:00
Ed Chalstrey
d58b0d35fb remove duplicate info 2022-03-10 09:35:04 +00:00
Ed Chalstrey
269f24b946 remove section duplicated elsewhere 2022-03-10 09:33:55 +00:00
Ed Chalstrey
bf75d6f9ed make data available 2022-03-09 13:38:57 +00:00
Ed Chalstrey
732bae1f20 clarify 2022-03-09 13:33:26 +00:00
Ed Chalstrey
2dae59e540 data downloading 2022-03-09 13:24:27 +00:00
Ed Chalstrey
48fd7ec67f clarify pre-requisites and link to doc 2022-03-09 11:48:45 +00:00
Ed Chalstrey
99056aac4b remove docker stuff - new branch: dockerise-colouring-london 2022-02-14 09:48:59 +00:00
Ed Chalstrey
226c9a1a5f add etl files with specific db name 2022-02-07 13:45:55 +00:00
Ed Chalstrey
97caaa5f06 bump shapely version 2022-02-03 14:51:48 +00:00
Ed Chalstrey
df77db2854 bump osmnx version 2022-02-03 14:46:49 +00:00
Maciej Ziarkowski
fef3a6532c Merge branch 'master' of github.com:tomalrussell/colouring-london 2020-06-18 13:00:53 +01:00
Maciej Ziarkowski
da52fa4971 Fix typo in etl requirements.txt 2020-06-18 13:00:39 +01:00
Tom Russell
150ef00ca9 Update etl/migrations docs with a little more on prerequisites 2020-06-18 10:32:35 +01:00
Tom Russell
4eb5961af5 Fix use of osmnx to work with v0.14 2020-06-18 10:31:34 +01:00
Maciej Ziarkowski
c997654545 Add debug, no overwrite flags 2020-06-16 16:16:46 +01:00
Maciej Ziarkowski
e2b94cfe2e Move to argparse for command line options 2020-06-16 13:27:08 +01:00
Maciej Ziarkowski
cd74ff6f32 Add retrying logic 2020-06-16 13:24:43 +01:00
Tom Russell
5d8a0dd42b Merge branch 'master' into features/migrations_sustainability 2020-04-09 10:47:55 +01:00
Tom Russell
055ed426b4 Merge branch 'master' into fix/upload 2020-04-09 10:19:13 +01:00
dominic
bdd3462e99 Update load_csv_to_staging.py
Updates to bring staging script into line with main upload script
- Script was not working
2020-02-27 12:04:09 +00:00
dom_ucl_mb
5ca6a6f7fe WIP for #405 2020-02-13 17:22:16 +00:00
Maciej Ziarkowski
82a50d77d6 Allow specifying JSON columns for CSV bulk import 2019-12-10 17:25:55 +00:00
Maciej Ziarkowski
26ca7f8873 Accept CSV with building_id for API data import 2019-12-10 17:16:26 +00:00
Tom Russell
f3b9be39bf Update load_csv python script
- use building_id if present in CSV
- print DEBUG even if response from API is 200
- handle empty-string sust_dec (which fails because empty-string isn't a valid
  enum value as defined in 011.sustainability.up.sql) by deleting from data
  if present
2019-11-21 13:17:17 +00:00
Dominic H
1d372823ed Adjusts guide notes no code change 2019-11-08 12:36:54 +00:00
Dominic H
e00d912686 Enable python upload to staging
- Based on load_csv.py with edit for staging
2019-11-08 12:31:40 +00:00
Dominic H
9943bdd9a5 Test load to staging
- Copy of main load_csv.py with edit to get round ssh error
2019-10-02 16:35:56 +01:00
Tom Russell
46ac6c7a40 Fix load_csv script - tested against localhost 2019-10-02 09:43:12 +01:00
Tom Russell
c055bea38b Add generic CSV upload script (rename load_data to load_shapefile) 2019-09-30 10:39:43 +01:00
Maciej Ziarkowski
2c9b5ea3d8 Modify routes, refactor API structure 2019-08-14 14:05:49 +01:00
Tom Russell
482ab5060c Set postcode zoom and class 2019-02-11 09:07:26 +00:00
Tom Russell
731f299a18 Load open postcode data 2019-02-05 13:36:43 +00:00
Tom Russell
42f72bea9c Add script to upload conservation areas 2019-01-20 15:29:54 +00:00
Tom Russell
d72dd90351 Comment sections in etl run script 2018-10-21 20:47:31 +01:00
Tom Russell
3441bf88e2 Update load_data to use API 2018-10-20 18:37:02 +01:00
Tom Russell
4415b21af5 Ensure index exists for uprn link 2018-10-04 21:13:25 +01:00
Tom Russell
a10e4bf5c6 Update run_all etl script 2018-10-04 19:01:40 +01:00
Tom Russell
30086766db Boundary file not needed in initial extraction 2018-10-04 19:01:17 +01:00
Tom Russell
20e4a73e73 Skip altering foreign key restrictions 2018-10-04 19:00:56 +01:00
Tom Russell
4c62da548b Drop outside limit of boundary 2018-10-04 18:59:53 +01:00
Tom Russell
b180602a5b Uncomment copy-uprns block 2018-10-03 20:10:47 +01:00
Tom Russell
b73fb7118e Fix sed quoting 2018-10-03 20:10:27 +01:00
Tom Russell
f06b820d19 Remove clipsrc for speedup 2018-10-03 20:10:16 +01:00
Tom Russell
4696e3e079 Update etl to load UPRNs to table 2018-10-02 21:12:46 +01:00
Tom Russell
79724cc449 Copy from stdin (cat user-accessible file) when loading geometries 2018-09-30 21:23:19 +01:00
Tom Russell
2a1902f6ce Update ETL docs 2018-09-29 18:29:57 +01:00
Tom Russell
73aa3df290 Clip OSMM to GLA on extraction 2018-09-27 21:37:47 +01:00
Tom Russell
322a976f7a Fix get_test_polygons script 2018-09-25 22:01:09 +01:00
Tom Russell
342167f9c9 Update UPRN-load script 2018-09-25 21:47:58 +01:00
Tom Russell
d9797385d9 Use default pool size (CPU count) in ETL 2018-09-25 21:47:29 +01:00
Tom Russell
3f9c9f3221 Split indexing further, UPRN requires bigint 2018-09-25 21:46:22 +01:00
Tom Russell
bddd7e769f Rename etl scripts 2018-09-25 20:46:16 +01:00
Tom Russell
c6b3d3d5ca Extract and load addressbase/mastermap using id, gml, parallel 2018-09-25 19:20:41 +01:00
Tom Russell
181e850225 Parallel extract/filter OS data 2018-09-21 11:10:39 +01:00
Tom Russell
f6f7cc1341 Save test polygons projected 2018-09-10 10:44:09 +01:00
Tom Russell
204740e46e Sketch matching data by best-intersection 2018-09-09 11:58:50 +01:00
Tom Russell
5695f2dc9d Record python packages for etl 2018-09-09 11:34:37 +01:00
Tom Russell
54633d9e04 Create building-per-geometry 2018-09-09 11:32:27 +01:00
Tom Russell
3711fc5f80 Load test polygons 2018-09-09 11:32:12 +01:00
Tom Russell
981d3608a2 Mappings for Camden and Fitzrovia CASA data 2018-08-01 17:06:00 +01:00
Tom Russell
69bb6c5790 Provide transform when joining data 2018-08-01 17:05:02 +01:00
Tom Russell
847eb75a70 Fix loading buildings 2018-08-01 15:48:31 +01:00
Tom Russell
990e4241cf Sketch generic update from shapes 2018-08-01 14:17:10 +01:00
Tom Russell
1c30f4d436 Skip trying to join buildings/geometries up front 2018-08-01 13:50:08 +01:00
Tom Russell
c7b322f214 Skip duplicate geometries 2018-08-01 13:49:16 +01:00
Tom Russell
f6c8323bfa Load buildings from CSV 2018-08-01 13:49:03 +01:00
Tom Russell
cd65a5aaab Load from geojson to db 2018-08-01 13:12:56 +01:00
Tom Russell
39160d8a09 Join each polygon to UPRNs if possible 2018-07-27 17:47:22 +01:00
Tom Russell
92b81928ca Rename etl scripts to reflect stages 2018-07-27 15:48:00 +01:00
Tom Russell
105271a1e8 Sketch loading Addressbase buildings to database
- TODO handle intersection with geometries
2018-07-17 09:10:24 +01:00