Jonathan

asked on

I would like to export the data behind the points on this SVG map to a .csv file

Hello, I would like to find the ids of the points shown on this SVG OpenLayers map, so that I can rebuild the URL for each point with the correct id.
Could you help me and tell me where I can find this data? It should already be downloaded, am I right?
Could you also tell me which technology I should use, or which files I need to edit?

https://urbanisme.irisnet.be/permis?set_language=fr

When I download the whole page, I find some WMS files in the folder (attached here). Do you think I can get the data from them?

Thanks
David Favor

Here is what you will need to do to implement a solution.

1) When an SVG image is built, the final constructed SVG data can either be in a static file or dynamically created.

For static SVG images, just download the file + operate on the file with command line tools.

2) In your case, this page's SVG image appears to be dynamically generated, so continue reading...

3) The page you provided does not contain any SVG images as standalone assets. Rather, the entire page appears to make calls to an API to generate the page, so...

4) Your solution will be to use http://phantomjs.org/ (a headless WebKit browser) to generate the DOM + then save the DOM locally + operate on the saved DOM with command line tools.

This means completing your project will likely take a long time.
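As a rough illustration of the last step (operating on the saved DOM), here is a minimal Python sketch that pulls point ids out of a saved page and writes them as CSV. It assumes, purely for illustration, that each map point is rendered as a `<circle>` element with `id`, `cx` and `cy` attributes; the real page may use different elements or attributes, so inspect the saved DOM first. The names `PointCollector`, `points_to_csv` and `SAMPLE_DOM` are made up for this example.

```python
import csv
import io
from html.parser import HTMLParser


class PointCollector(HTMLParser):
    """Collect (id, cx, cy) from every <circle> element that has an id."""

    def __init__(self):
        super().__init__()
        self.points = []

    def handle_starttag(self, tag, attrs):
        # Assumption: each map point is a <circle> carrying an id attribute.
        if tag == "circle":
            attributes = dict(attrs)
            if "id" in attributes:
                self.points.append(
                    (attributes["id"], attributes.get("cx", ""), attributes.get("cy", ""))
                )


def points_to_csv(dom_text):
    """Return CSV text with one row per point found in the saved DOM."""
    parser = PointCollector()
    parser.feed(dom_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "cx", "cy"])
    writer.writerows(parser.points)
    return out.getvalue()


# Hypothetical stand-in for a DOM saved by the headless browser.
SAMPLE_DOM = (
    '<html><body><svg>'
    '<circle id="p1" cx="10" cy="20"/>'
    '<circle id="p2" cx="30" cy="40"></circle>'
    '<circle cx="5" cy="5"/>'  # no id, so it is skipped
    '</svg></body></html>'
)

if __name__ == "__main__":
    print(points_to_csv(SAMPLE_DOM))
```

To use it on a real saved page, read the file into a string and pass it to `points_to_csv`, then write the result to a `.csv` file.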

Be sure to build in test code, to visit known pages + scrape data + decode data + verify page format is as expected... because... anytime this page's DOM changes, then your code will have to be rewritten/fixed.

All scrapers suffer from this problem. They only work until the page layout changes; then a rewrite/retool is required.

So this will be an ongoing project to maintain.
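A format check of the kind described above could look roughly like the following Python sketch. It only verifies that a saved page still contains the elements the scraper relies on. The tag names `svg` and `circle` are assumptions for illustration, not something confirmed about this page, and `TagCounter` and `check_page_format` are made-up names.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count how many times each tag opens in a document."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        self.counts[tag] = self.counts.get(tag, 0) + 1


def check_page_format(dom_text, required_tags=("svg", "circle")):
    """Return the required tags that are missing from the saved DOM.

    An empty list means the page still has the expected shape.
    The default tag names are assumptions; adjust them to the real page.
    """
    counter = TagCounter()
    counter.feed(dom_text)
    return [tag for tag in required_tags if counter.counts.get(tag, 0) == 0]


if __name__ == "__main__":
    missing = check_page_format("<html><body><svg><circle/></svg></body></html>")
    if missing:
        raise SystemExit("Page layout changed, missing tags: " + ", ".join(missing))
    print("Page format looks as expected")
```

Running a check like this against a few known pages before each scrape makes a layout change show up as an explicit failure instead of silently empty data.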
Jonathan

ASKER

Wow, thanks a lot for your solution! Very good answer.
I'll take a Udemy course to learn how to use PhantomJS if there is no other way.

But tell me one thing. There is another way to scrape it: just incrementing the id in a URL of this kind:
https://urbanisme.irisnet.be/permis/public?id=785522. The problem is that only 13,000 ids refer to a real page, and incrementing a 6-digit number can take up to 1,000,000 requests, of which roughly 987,000 would return a blank page.

So I just need the URLs linked to those points (the same problem, I guess). Or should I try the million requests?
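For reference, the id-incrementing approach could be sketched in Python as below. This is only an illustration under stated assumptions: it assumes an unused id returns an empty (or whitespace-only) body or an HTTP error, which would need to be checked against the real site, and the helper names `fetch`, `is_blank` and `find_valid_ids` are made up. At one request per second, 1,000,000 requests take about 11.6 days, so the delay and the id range matter.

```python
import time
import urllib.error
import urllib.request

# URL pattern taken from the question; the id is filled in per request.
BASE_URL = "https://urbanisme.irisnet.be/permis/public?id={}"


def fetch(url):
    """Download one page and return its body as text ("" on an HTTP error)."""
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError:
        return ""


def is_blank(body):
    # Assumption: an unused id yields an empty or whitespace-only body.
    return not body.strip()


def find_valid_ids(ids, fetch_page=fetch, delay=1.0):
    """Return the ids whose page is not blank, pausing between requests."""
    found = []
    for permit_id in ids:
        body = fetch_page(BASE_URL.format(permit_id))
        if not is_blank(body):
            found.append(permit_id)
        time.sleep(delay)  # be gentle with the server
    return found


if __name__ == "__main__":
    # Small range around the id from the question, not the full million.
    print(find_valid_ids(range(785520, 785525)))
```

The `fetch_page` parameter lets you swap in a fake downloader to try the logic without sending any requests.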