Hello! I am semi-new to Python but have had some success with web scraping in the past. This idea is definitely beyond my current skill level.
There is an interactive map on this website ( https://www4.sii.cl/mapasui/internet/#/contenido/index.html ) that I want to scrape data from. The info I want is parcel-level data (called “PREDIOS”) for a comuna named San Antonio (5401). I already have a shapefile for the properties I need, but it is not connected to any cadastral information such as property number, value, or materials. The issue is that I can’t figure out how to query the WMS proxy service to call getInfoPredio or getFeatureInfoRequest (both can be found in the mapascontroller.js element).
All I want is the property number WITH an X-Y coordinate, though a shapefile would also be great. The PREDIO number is “3 or 4 digits representing the block number”-“1 to 3 digits representing the property number” (so something like XXX-X, XXX-XX, or XXXX-XXX). There are maybe 50 exceptions where the property number has 5 digits. I’d be happy just to get the centroid of each polygon together with its property number. The problem is that I am not sure how easy it will be to get the info from the JSON without somehow having a function ‘click’ at various coordinates to get the Leaflet popup response that includes a coordinate.
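To make the numbering format concrete, here is a minimal sketch of a pattern the PREDIO numbers should match. It assumes the 5-digit exceptions can occur with either block length, and the names `PREDIO_RE` and `parse_predio` are just illustrative, not anything from the site.

```python
import re

# Assumed format: a block number of 3-4 digits, a hyphen, then a property
# number of 1-3 digits (or 5 digits for the rare exceptions).
PREDIO_RE = re.compile(r"^(\d{3,4})-(\d{1,3}|\d{5})$")


def parse_predio(text):
    """Return (block, property) as strings, or None if text does not match."""
    match = PREDIO_RE.match(text.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Something like this could be used to sanity-check whatever strings come back from the service before joining them to the shapefile.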
I have considered a few options, including:
– Does it make sense to somehow query the site through a series of X,Y coordinates corresponding to the centroids of the various properties?
– Can anyone better than me at Python take a look to see if I am totally barking up the wrong tree?
– I wrote a little bit of code that seemed to make something happen using url='https://www4.sii.cl/mapasui/services/data/mapasFacadeService/getFeatureInfo', only to end in the error “JSONDecodeError: Expecting value: line 1 column 1 (char 0)”.
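For the first option, the centroids themselves could be computed with the standard shoelace formula. This is only a sketch: it assumes each parcel is a simple (non-self-intersecting) ring of (x, y) tuples without holes, and the function name `polygon_centroid` is made up for illustration. Which coordinate system the service expects is something I have not confirmed.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    points = list(ring)
    # Shapefile rings usually repeat the first vertex at the end; drop it.
    if len(points) > 1 and points[0] == points[-1]:
        points = points[:-1]
    if len(points) < 3:
        raise ValueError("a polygon needs at least 3 distinct vertices")

    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for i in range(len(points)):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % len(points)]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * area), and 6 * area == 3 * twice_area.
    return cx / (3 * twice_area), cy / (3 * twice_area)
```

Each centroid could then be sent as the point of one query, with the returned property number stored next to it.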
I just feel like I’m missing something. Code below:
import csv

import requests

url = 'https://www4.sii.cl/mapasui/services/data/mapasFacadeService/getFeatureInfo'


def get_data_from_api_3():
    # Headers copied from the browser's network tab for the map page
    headers = {
        'Accept': 'application/json, text/plain, */*',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'en-US,en;q=0.9,es;q=0.8',
        'Connection': 'keep-alive',
        'Content-Type': 'application/json',
        'Cookie': 'JSESSIONID=678699A651D2527DC8785878758D252B.mrs15; dtCookie=v_4_srv_20_sn_825696C3D10F38AB4F7E5F241F632D9D_perc_100000_ol_0_mul_1_app-3Ae1a4760ba607c5ca_1_app-3Aea7c4b59f27d43eb_0; rxVisitor=165056858722955CVBPA9VNKCGHEVAJE8E7T9KBACLNQR; AMCVS_673031365C06A5620A495CFC%40AdobeOrg=1; s_cc=true; AAGSID=678699A651D2527DC8785878758D252B.mrs15; AMCV_673031365C06A5620A495CFC%40AdobeOrg=281789898%7CMCIDTS%7C19213%7CMCMID%7C11303111576103507753606040005210109685%7CMCAAMLH-1660590196%7C4%7CMCAAMB-1660590196%7C6G1ynYcLPuiQxYZrsz_pkqfLG9yMXBpb2zX5dvJdYQJzPXImdj0y%7CMCOPTOUT-1659992596s%7CNONE%7CMCSYNCSOP%7C411-19214%7CvVersion%7C4.1.0; dtSa=true%7CC%7C-1%7Cfa%20fa-key%20menubtn%7C-%7C1659985402556%7C385395768_218%7Chttps%3A%2F%2Fwww4.sii.cl%2Fmapasui%2Finternet%2F%7C%7C%7C%2Fcontenido%2Findex.html%7C%7C%2Fmapasui%2Finternet%2F%7C1659985395566%7C%7Ci1%5Esk0%5Esh0%5Est1; dtLatC=7; s_sq=siiprd%3D%2526c.%2526a.%2526activitymap.%2526page%253Dhttps%25253A%25252F%25252Fwww4.sii.cl%25252Fmapasui%25252Finternet%25252F%252523%25252Fcontenido%25252Findex.html%2526link%253DP%252520Predios%2526region%253Dcatalogo-comuna-5401%2526.activitymap%2526.a%2526.c%2526pid%253Dhttps%25253A%25252F%25252Fwww4.sii.cl%25252Fmapasui%25252Finternet%25252F%252523%25252Fcontenido%25252Findex.html%2526oid%253Djavascript%25253Avoid%2525280%252529%25253B%2526ot%253DA; rxvt=1659987230960|1659983372990; dtPC=20$385405520_813h42vHKAPAKVSHQWGMNSFCOEUFFFMPVMJUSMF-0e0',
        'DNT': '1',
        'Host': 'www4.sii.cl',
        'Origin': 'https://www4.sii.cl',
        'Referer': 'https://www4.sii.cl/mapasui/internet/',
        'sec-ch-ua': '".Not/A)Brand";v="99", "Google Chrome";v="103", "Chromium";v="103"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"Windows"',
        'Sec-Fetch-Dest': 'empty',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Site': 'same-origin',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
        'x-dtpc': '20$385405520_813h42vHKAPAKVSHQWGMNSFCOEUFFFMPVMJUSMF-0e0',
    }
    params = (('comuna', '5401'), ('predio', ''))
    response = requests.get(url, headers=headers, params=params)
    # This is the call that raises the JSONDecodeError
    return response.json()


try:
    sii_data = get_data_from_api_3()
    jsondata = sii_data['predioPublicado']
    with open('jsonoutput.csv', 'w', newline='') as data_file:
        csv_writer = csv.writer(data_file)
        count = 0
        for data in jsondata:
            if count == 0:
                # Write the column names once, before the first row
                header = data['predioPublicado'].keys()
                csv_writer.writerow(header)
                count += 1
            csv_writer.writerow(data['predioPublicado'].values())
except requests.exceptions.ConnectionError:
    print('error')
    exit()
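Since “Expecting value: line 1 column 1 (char 0)” means the response body was empty or was not JSON at all (an HTML error page, for example), one way to narrow it down is to look at what actually came back before parsing it. Below is a small sketch for that; the helper name `summarize_body` and its return shape are hypothetical, and it only uses the standard library so it can be tried without hitting the site.

```python
import json


def summarize_body(status_code, content_type, text, preview=200):
    """Describe a response body and report whether it parses as JSON."""
    summary = {
        "status": status_code,
        "content_type": content_type,
        "length": len(text),
        "preview": text[:preview],  # first characters, enough to spot HTML
    }
    try:
        summary["json"] = json.loads(text)
        summary["is_json"] = True
    except json.JSONDecodeError:
        # Empty or non-JSON bodies end up here instead of crashing.
        summary["json"] = None
        summary["is_json"] = False
    return summary
```

With requests, this would be called as summarize_body(response.status_code, response.headers.get('Content-Type', ''), response.text) in place of response.json(), and the preview printed to see what the server really sent.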
submitted by /u/hoomalooma to r/learnpython