I would like to export the data from the points in this SVG map to a .csv file

Jonathan
Hello, I would like to find the ids of the points shown on this SVG OpenLayers map, so that I can rebuild the URL linked to each point with the correct id.
Could you help me and tell me where I can find this data? It should already be downloaded, am I right?
Maybe you could tell me which technology I should use, or which files I need to edit.

https://urbanisme.irisnet.be/permis?set_language=fr

When I download the whole page, I find some WMS files in the folder (attached here). Do you think I can get the data from them?

Thanks
David Favor, Fractional CTO
Distinguished Expert 2018

Commented:
Things you'll need to do to implement a solution.

1) When an SVG image is built, the final constructed SVG data can either be in a static file or dynamically created.

For static SVG images, just download the file + operate on the file with command line tools.
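For the static case, here is a minimal sketch in Python of what "operate on the file" could look like. It assumes the points are drawn as `<circle>` elements carrying `id`, `cx` and `cy` attributes; that layout is an assumption, and a real file may use `<path>` or `<use>` elements instead. The sample SVG and its ids are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard SVG namespace, in the {uri}tag form ElementTree expects.
SVG_NS = "{http://www.w3.org/2000/svg}"


def svg_points_to_rows(svg_text):
    """Return (id, cx, cy) for every <circle> that has an id attribute."""
    root = ET.fromstring(svg_text)
    rows = []
    for circle in root.iter(SVG_NS + "circle"):
        point_id = circle.get("id")
        if point_id is None:
            continue  # skip decorative circles without an id
        rows.append((point_id, circle.get("cx", ""), circle.get("cy", "")))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "cx", "cy"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample; a real run would read the downloaded .svg file instead.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <circle id="point_785522" cx="10" cy="20" r="3"/>
  <circle cx="5" cy="5" r="3"/>
  <circle id="point_785523" cx="30" cy="40" r="3"/>
</svg>"""

if __name__ == "__main__":
    print(rows_to_csv(svg_points_to_rows(SAMPLE_SVG)), end="")
```

The CSV text can be written straight to a file; which attributes are worth exporting depends on what the real SVG actually contains.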

2) In your case, this page's SVG image appears to be dynamically generated, so continue reading...

3) The page you provided does not contain any SVG images as standalone assets; rather, the entire page appears to make calls to an API to generate its content... so...

4) Your solution will be to use http://phantomjs.org/ (a headless WebKit browser) to generate the DOM + then save the DOM locally + operate on the saved DOM with command line tools.

This means completing your project will likely take a very long time.

Be sure to build in test code to visit known pages + scrape data + decode data + verify that the page format is as expected... because... any time this page's DOM changes, your code will have to be rewritten/fixed.

All scrapers suffer from this problem. They only work until the page layout changes; then a rewrite/retool is required.
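One possible shape for that test code, sketched in Python with only the standard library: it checks that a saved DOM snapshot still contains the markers the scraper depends on. The marker names used here (an `svg` tag and an element with `id="map"`) and both sample snapshots are placeholders, not taken from the real page.

```python
from html.parser import HTMLParser


class MarkerCollector(HTMLParser):
    """Record every tag name and every id attribute seen in the document."""

    def __init__(self):
        super().__init__()
        self.tags = set()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; self-closing tags also land here.
        self.tags.add(tag)
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def missing_markers(html_text, required_tags, required_ids):
    """Return a sorted list of expected markers absent from the snapshot."""
    collector = MarkerCollector()
    collector.feed(html_text)
    collector.close()
    missing = [f"<{tag}>" for tag in required_tags if tag not in collector.tags]
    missing += [f"#{i}" for i in required_ids if i not in collector.ids]
    return sorted(missing)


# Hypothetical snapshots: one in the expected format, one after a redesign.
GOOD_DOM = '<html><body><div id="map"><svg><circle id="p1"/></svg></div></body></html>'
CHANGED_DOM = '<html><body><div id="carte"><canvas></canvas></div></body></html>'

if __name__ == "__main__":
    for name, dom in (("good", GOOD_DOM), ("changed", CHANGED_DOM)):
        problems = missing_markers(dom, ["svg"], ["map"])
        print(name, "OK" if not problems else f"format changed, missing: {problems}")
```

Running a check like this before each scrape makes the scraper stop with a clear message when the layout changes, instead of silently writing wrong data.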

So this will be an ongoing project to maintain.

Author

Commented:
Wow, thanks a lot for your solution! Very good answer.
I'll take a Udemy course to learn how to use PhantomJS if there is no other way.

But tell me one thing. There is another way to scrape it: just incrementing the id in a URL of this kind,
https://urbanisme.irisnet.be/permis/public?id=785522. The problem is that only 13,000 ids refer to an actual page, and incrementing a 6-digit number can take up to 1,000,000 requests, even though 987,000 of them lead to a blank page.

So I just need the URLs linked to those points (same problem, I guess). Or should I try the million requests?
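To put the million-request idea in numbers, here is a rough Python sketch of the id-incrementing approach. The URL pattern is the one above; everything else is an assumption: that an unused id returns an empty body (a real "blank" page may still contain HTML and need a different check), that an HTTP error means no page, and that a one-second pause between requests is acceptable. The `fake_fetch` helper is hypothetical and exists only so the loop can be tried without touching the real site.

```python
import time
import urllib.error
import urllib.request

BASE_URL = "https://urbanisme.irisnet.be/permis/public?id={}"


def fetch(url):
    """Download a page body as text; treat HTTP errors as 'no page'."""
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError:
        return ""


def is_blank(body):
    """Assumption: an unused id yields a body with no visible content."""
    return not body.strip()


def find_valid_ids(first_id, last_id, fetch_page=fetch, delay_seconds=1.0):
    """Probe each id in [first_id, last_id] and return those with content."""
    valid = []
    for candidate in range(first_id, last_id + 1):
        if not is_blank(fetch_page(BASE_URL.format(candidate))):
            valid.append(candidate)
        time.sleep(delay_seconds)  # pause so the server is not hammered
    return valid


def estimated_days(request_count, delay_seconds=1.0):
    """Lower bound on run time: the pauses alone, ignoring download time."""
    return request_count * delay_seconds / 86400


def fake_fetch(url):
    """Hypothetical stand-in for the network: only id 785522 has content."""
    return "<html>permit</html>" if url.endswith("=785522") else "   "


if __name__ == "__main__":
    print(find_valid_ids(785520, 785524, fetch_page=fake_fetch, delay_seconds=0))
    print(round(estimated_days(1_000_000), 1), "days at one request per second")
```

At one request per second the pauses alone add up to about 11.6 days for a million ids, so narrowing the id range first would cut the run time considerably.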
